- resizing to the same size
- removing metadata
This results in much faster transfers (often 10x less bandwidth used for mobile uploads) and reduces server load by "farming out" the work to the clients.
# Edit: On Keeping Full Resolution Images
Some people mention that keeping the original highest-resolution images is important. I don't think that is true for most applications.
Most apps don't need high-resolution history as much as current, live engagement, so older photos being smaller isn't a big deal. As technology moves on, you simply start allowing higher-res uploads. YouTube, Facebook, and others have done this just fine, as the older stuff is replaced with the new/current/now() content.
In fact, even our highest-resolution images will still look low-quality in the future. Pick a good max size for your site (4K?) and resize everything down to that. In a year, bump it up to 6K, then 10K, etc...
Keeping costs low has its benefits, especially for us startups. Now if you have massive collateral, then knock yourself out.
1) Although the site serves up images at 1024 pixels (or whatever) today, in the future they may want larger images. When everyone is rocking 10K monitors and 6K phone displays, those small images are going to look pretty bad.
2) The original image has some metadata that they want to keep (geolocation, etc).
3) They think they can do a better and more consistent job resizing than the various browsers, which is probably true.
Regardless, there are still better filters than bilinear, e.g. Lanczos, which I'm pretty sure none of the browsers use.
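For illustration, a minimal Go sketch comparing the two filters, using the disintegration/imaging library (an assumed dependency; filenames are hypothetical):

```go
package main

import (
	"log"

	"github.com/disintegration/imaging"
)

func main() {
	src, err := imaging.Open("input.jpg")
	if err != nil {
		log.Fatal(err)
	}
	// Same target width (height 0 preserves the aspect ratio), two different filters.
	bilinear := imaging.Resize(src, 1024, 0, imaging.Linear) // roughly what browsers do
	lanczos := imaging.Resize(src, 1024, 0, imaging.Lanczos) // sharper, less aliasing
	if err := imaging.Save(bilinear, "out-bilinear.jpg"); err != nil {
		log.Fatal(err)
	}
	if err := imaging.Save(lanczos, "out-lanczos.jpg"); err != nil {
		log.Fatal(err)
	}
}
```

Downscale a detailed photo with both and the difference around fine edges is obvious.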
Isn't EXIF data something you should strip out?
You know, a thumbnail with a cute kitten but something completely different after you click on it ;)
As an aside, I wish the "share" button would share a lower-resolution image instead. I don't mind storing the full-quality picture, but handling a 10 MB image is seriously silly.
Might be a fair presumption today, but it might not be in the future with HiDPI screens, VR, etc. For the relatively small storage costs, it'd be better to keep the original; then you can programmatically work from there.
"drawImage() will ignore all EXIF metadata in images, including the Orientation. This behavior is espacially troublesome on iOS devices. You should detect the Orientation yourself and use rotate() to make it right.
If the origin of the image is the client and you got the client-side resize wrong, then you might introduce artifacts when trying to fix it on the server because of the data loss. Also, if clients are mobile, you might want to optimize for the clients' battery life rather than for computing time on the server.
> Some people mention having original highest-resolution images are important. I don't think that is true for most applications.
It is true for every application when the next generation of displays hits the market. The question is not the long-term usability of our current low-res images, but just the migration to the next step. The moment Acorn announces their new APhone and has a million handsets sold by tomorrow, you want your service to deliver at least viewable images. It's not always the app that sets the bar; sometimes it is the device.
Edit: As someone who regularly travels to rather remote places on this planet, I'm grateful for every app that does not put the burden on the client. My battery packs only last so long.
Also, as I understood it, they don't save preview images; they generate them when needed. So what you are suggesting requires a lot of disk space to keep thumbnails that might never be needed later.
And if you don't have millions of uploads per day, then it makes no sense to try to save a few seconds of CPU time by unnecessarily complicating the system. Most languages already have libraries for resizing images (see the sketch below).
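For example, in Go the extended standard library already covers it; a minimal sketch (filenames hypothetical):

```go
package main

import (
	"image"
	_ "image/jpeg" // register the JPEG decoder
	"image/png"
	"os"

	"golang.org/x/image/draw"
)

func main() {
	f, err := os.Open("input.jpg")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	src, _, err := image.Decode(f)
	if err != nil {
		panic(err)
	}
	// CatmullRom is a high-quality kernel shipped with golang.org/x/image.
	dst := image.NewRGBA(image.Rect(0, 0, 320, 240))
	draw.CatmullRom.Scale(dst, dst.Bounds(), src, src.Bounds(), draw.Src, nil)

	out, err := os.Create("thumb.png")
	if err != nil {
		panic(err)
	}
	defer out.Close()
	png.Encode(out, dst)
}
```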
"How Discord Resizes 150M Images Every Day for Free"
But it also means you need to know how you will display the image when you save it. Layouts change, screens change; how do you anticipate the future dimensions/resolution you will need out of the original?
Image file formats are very, very complicated; many are platform-specific, and some are covered by patents.
As an example of a common issue, another comment mentioned the rotation parameter: it's set by many cameras, but support for it is inconsistent.
Also -- there's page load time. If it's an intensive calculation, the overhead of sending the results over HTTP is still less than the in-browser calculation time.
Seemed like a lot of unnecessary work for them to reimplement a service from scratch, without gaining any major perf benefits over their existing one and without leaning on a well-known, well-built foundation.
The one thing these don't support, though, is smarter cropping that takes image contents into account, which takes enough CPU power to require preprocessing (sketched below).
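For reference, libvips (8.5+) ships a content-aware "smartcrop"; a sketch through the bimg binding (an assumed dependency, with hypothetical filenames):

```go
package main

import (
	"os"

	"github.com/h2non/bimg"
)

func main() {
	buf, err := os.ReadFile("input.jpg")
	if err != nil {
		panic(err)
	}
	// SmartCrop scores regions (edges/entropy) and keeps the most
	// "interesting" 400x400 window instead of a dumb center crop.
	out, err := bimg.NewImage(buf).SmartCrop(400, 400)
	if err != nil {
		panic(err)
	}
	if err := os.WriteFile("cropped.jpg", out, 0o644); err != nil {
		panic(err)
	}
}
```

That scoring pass is the CPU cost mentioned above, which is why you'd want to precompute it.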
You really have to run this kind of complex parsing in a disposable containerized environment to do it safely. Or do everything carefully, in a memory-safe language.
See the persistent, years-long trend where mobile devices and game consoles get exploited via some combination of libtiff and libpng.
It's the entire reason Python is generally fast enough: anything that's slow generally uses a C lib under the hood anyway.
However, there isn't much choice. Performance is very important in image processing, so much so that many libraries contain hand-written assembly. The article says that 90% of their processing power is dedicated to it. Using a safer language in a safe way could completely kill performance and significantly increase costs.
I also recommended a mitigation strategy for unsafe code. Complaining that security is too hard is the reason for the situation we find ourselves in as an industry.
Seems to vary wildly. For some, it's not that expensive.
How much indeed? When was the last time? Ah, yes, Equifax. What happened? Nothing.
If I were a betting person, I'd wager that it may end up somewhat like "rewrite it in Rust" cargo-culting.
Also, your life must be very stressful.
Additionally, image manipulation is inherently challenging - not even due to the actual manipulation of image pixel data, but due to the proliferation of complex image container formats, which require binary data manipulation and byte copying in performance-critical code. This is a minefield for secure programming practices because it puts performance and sanity checking at direct odds, as well as encouraging pointer and memory arithmetic and unsafe access.
Seems to me that there is no limit to available room. Well, I suppose we're capped by the collective capacity of local storage and storage service providers.
> We likely could have addressed this behavior in Image Proxy, but we had been experimenting with using more Go, and it seemed like a good place to try Go out.
At the heart of it, they were looking for opportunities to use more Go in their stack, and they deemed this situation a fit.
1. Static typing increasing confidence and velocity
2. Better developer-facing tooling increasing velocity
3. More employees knowledgeable about Go than Python
4. More enthusiasm (and therefore faster velocity) around Go development.
The blog post was about the engineering challenges they faced and how they solved them and I think it was a great write-up in that regard. The post wasn't about why they switched this service from Python to Go.
I'm the kind of hacker who, if a service runs out of memory every 2 hours, writes a crontab entry to restart it every hour, offset by X random minutes so the instances don't all restart at the same time. It gets a lot of eye rolls from the other engineers searching for perfection, but it tends to quickly produce services that are highly reliable.
And look, now the engineers who like Chaos Monkey don't even have to set that up. It's built in.
It looks like most of the savings were in switching from Pillow to OpenCV, something that Thumbor already does: https://github.com/thumbor/opencv-engine
1. Add profiling and telemetry to their Python code. Refactor the codebase based on insights from this.
2. Write a C<->Python interop for their image libraries.
I can't see the cost of #2 being any different from the cost they paid writing it in Go. As for #1, depending on how the code is structured, a rewrite may have taken less time than profiling spaghetti code. At that point, it depends on how much Go experience the team has.
Looks like I didn't scroll properly when I looked at that file. My bad :-/
Also, there are http://thumbor.org and https://imageresizing.net if you want a library to host yourself; they're already very fast and well tested. Put them in a Docker container on a Kubernetes cluster and it's all done in an hour.
* Previews (images, gifs, and videos)
Previews going down would be a pretty big deal for my communities based on the way we use the platform.
Also, it is totally core to what they do. Images are a huge part of the Discord UX.
If they're going to spend $60k/year on instances, the dev time definitely wasn't worth it for this. They just wanted to use that language; this is a NIH situation, not really an engineering priority.
Unfortunately the post seems to have disappeared from the internet (it was probably around 6 years ago), so here are some other teasers:
Disclaimer: not affiliated with Ceph apart from being a happy sysadmin.
Talk is from Lua workshop 2017. Relevant content begins at 15m40s.
I have built an image resizing service around this with Go and libvips. With a Go libvips binding and s3gof3r, you can load S3 images directly into a buffer, pass them to libvips, and serve the result without ever writing to disk. Basically, you can use edge functions with the above Go service as your origin.
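A stripped-down sketch of that pipeline, assuming the bimg libvips binding and the s3gof3r client (bucket name, route, and target size are all hypothetical):

```go
package main

import (
	"io"
	"net/http"

	"github.com/h2non/bimg"
	"github.com/rlmcpherson/s3gof3r"
)

func main() {
	keys, err := s3gof3r.EnvKeys() // AWS credentials from the environment
	if err != nil {
		panic(err)
	}
	bucket := s3gof3r.New("", keys).Bucket("my-images")

	http.HandleFunc("/thumb/", func(w http.ResponseWriter, r *http.Request) {
		key := r.URL.Path[len("/thumb/"):]
		rc, _, err := bucket.GetReader(key, nil) // stream straight from S3
		if err != nil {
			http.Error(w, "not found", http.StatusNotFound)
			return
		}
		defer rc.Close()
		buf, err := io.ReadAll(rc) // into a buffer, never to disk
		if err != nil {
			http.Error(w, "upstream read failed", http.StatusBadGateway)
			return
		}
		out, err := bimg.NewImage(buf).Resize(800, 600)
		if err != nil {
			http.Error(w, "resize failed", http.StatusInternalServerError)
			return
		}
		// Assumes JPEG input; real code should derive the type from the detected format.
		w.Header().Set("Content-Type", "image/jpeg")
		w.Write(out)
	})
	http.ListenAndServe(":8080", nil)
}
```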
Don't need anything fancy. Just w=? and h=? would be great; developers can handle the DPI stuff with srcset tags.
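Parsing that interface is trivial; a hypothetical Go fragment, with a clamp so callers can't request absurd output sizes:

```go
package imgquery

import (
	"net/http"
	"strconv"
)

// parseDims reads ?w=&h= from the query string; 0 means "unspecified"
// (i.e., keep the aspect ratio implied by the other dimension).
func parseDims(r *http.Request) (w, h int) {
	w, _ = strconv.Atoi(r.URL.Query().Get("w"))
	h, _ = strconv.Atoi(r.URL.Query().Get("h"))
	const max = 4096 // guard against resize-bomb requests
	clamp := func(v int) int {
		if v < 0 {
			return 0
		}
		if v > max {
			return max
		}
		return v
	}
	return clamp(w), clamp(h)
}
```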
PCI Express is ~100 Gbit/s, much faster than any network interface. Internally, a GPU can resize these images an order of magnitude faster than that; see the fill-rate columns in the GPU spec.
Since GPU hardware has become commonplace, there's definitely a lot more attention on using it in the server space, and I think it'll become common in the next few years. But that has a migration cost for early adopters, since you're relying on less mature projects for critical functions. Internet-facing image processing has a bunch of tedious but important work handling format variations and errors (it'll be reported as a bug in your software if the image opens in a browser and/or Photoshop), making sure that you handle gamma/colorspace consistently, etc.
If you're trying to get a production-ready server out the door, it's really tempting not to deal with any of that once you hit the point where it's fast enough that engineering time costs more than the server savings.
GPUs can do that, too: http://fastcompression.com/products/jpeg/cuda-jpeg.htm
> you now need to make sure that all of your servers have GPUs available
OP is running on Google's cloud: "n1-standard-16 host type, peaking at 12 instances on a typical day." That instance costs $0.76/hour. Adding an NVIDIA Tesla K80 is $0.70/hour extra.
> it's really tempting not to deal with any of that
Yeah, that's understandable. But the original article dealt with a lot of strange technologies to get the performance they wanted. And they ended up much slower, performance-wise, than what's possible with a GPU.
> GPUs can do that, too: http://fastcompression.com/products/jpeg/cuda-jpeg.htm
Agreed - but for how many different formats, and how well do those implementations support all of the various format options for things like bit depth or palettes, compression variants, etc.? That's not just things like compliance testing – itself a big problem – but also handling all of the slightly non-compliant data in the wild which users will inevitably expect to work.
(I'm somewhat biased having spent time dealing with JPEG 2000 imagery where various lapses on the standards side meant that it's still common to find images which don't display correctly in one or more implementations but are silently reported as correct in others)
Again, I'm not arguing that doing this on a GPU isn't a good idea — the hardware has become common enough that it's reasonable to assume availability for anyone who cares — but just that there's significant overhead cost for anyone who needs to handle images from unconstrained sources. It'll happen but this kind of thing always takes longer than it seems like it should.
Flickr is doing just that, and they’ve been using GPUs for more than 2 years already:
> It'll happen but this kind of thing always takes longer than it seems like it should.
I think the main reason for that is lazy software developers reluctant to learn new stuff.
No kernels are available _out of the box_. You code a pixel shader and implement any kernel, or any other resizing method besides kernels: https://stackoverflow.com/a/42179924/126995
> that only handles resizing, not compressing/decompressing
In my previous comment there's a link to a commercially available JPEG codec, 100% compliant with the JPEG Baseline standard, that does both compression and decompression.
I don’t disagree but this is very subjective.
You don’t need to invent anything, you only need to carefully implement a well known approach, e.g. this one: https://developer.nvidia.com/gpugems/GPUGems/gpugems_ch24.ht...
Also there’re third party libraries for that, e.g. here’s one from the same company who do JPEG codec: http://fastcompression.com/products/resizer/gpu-resizer.htm
> what about PNG, GIF, and WEBP?
As far as I understand, your goal was to cut server costs, right?
I assume the majority of pictures on the Internet are JPEGs. If you process them on the GPU, that leaves the 16 virtual CPUs you've already paid for sitting idle while they wait for the GPU to finish the job; those can handle the other formats. No need to do everything on the GPU.
P.S. Some other people already implemented what I’m telling you: http://code.flickr.net/2015/06/25/real-time-resizing-of-flic...
E.g., instead of this:
we can create an alias like octo, and the URL will become this:
This involves transferring, encrypting, compressing, and checksumming terabytes of data an hour (per node). While not exactly resizing images, I would imagine the computational power is on par with the service described. The entire system has about 4 to 8 PB in it right now, as backups are pruned (based on what people will pay for storage).
My software has a ton of space to grow and become better, but I think a better story would have been how Discord handles 150M images an hour. If anything, the bandwidth for acquiring the source image is what I would consider the largest problem, not the CPU time to resize. In fact, as long as your resize code is slightly faster than the download, streaming it in and out would put your bottleneck entirely on bandwidth.
I will also note I am not a fan of libraries :p but that is not what this is about.
Also kudos to you, somebody criticized your post and you had the best response one could have. Inquiring minds are awesome.
Where I work, we have single nodes processing near that much data an hour -- these are beefy systems, though.