Web Graphics Done Right (evilmartians.com)
325 points by progapandist 3 months ago | 65 comments

These guys (https://evilmartians.com/) sure understand a thing or two about images and image compression. I have used their imgproxy[1] (a Golang image resizer) for a long time, and it has been solid from the very beginning.

[1] https://github.com/imgproxy/imgproxy

I recommend Thumbor if you want the advanced manipulation and smart cropping features common in the hosted services: https://github.com/thumbor/thumbor

So I work on real estate websites that typically consume 50-80GB of disk for images, so articles such as this are really interesting to me. I'm using PHP7, BTW...

Currently, I'm converting all the jpegs to webp at quality 55 at import time, and I love both the quality and the sizes of the files; so far I couldn't be happier with the results. I managed to shave some 20GB off disk usage as well.

The 800lb gorilla is, however, Safari. We all know that Safari does not currently support webp and due to internal politics, may never do so.

So the first "solution" is to keep copies of both formats and use the <picture> tag. This just won't do; I have zero interest in doubling my already prodigal disk usage.

There are some rather complicated open-source libs that will take a jpg and convert it on the fly to webp... I want to do it the other way around.

One thing I hate is "kitchen sink" libs that do SO much that you are often left confused about how to do the specific thing you want. So I am in the process of rolling my own solution: using GD to convert on demand for the ~11% of requests that come from Safari, and using nginx caching to keep those converted images ready to go.
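For what it's worth, that flow fits in a few lines. Here's a sketch in Python for illustration (the cache directory is made up, and `convert_webp_to_jpeg` is a stub; in PHP/GD the real work would be `imagecreatefromwebp` plus `imagejpeg`):

```python
import hashlib
from pathlib import Path

CACHE_DIR = Path("/tmp/jpg-fallbacks")  # hypothetical cache location

def wants_webp(accept_header: str) -> bool:
    # Browsers that can decode webp say so explicitly in the Accept header.
    return "image/webp" in accept_header

def convert_webp_to_jpeg(src: str, dst: Path) -> None:
    # Stub for illustration; real code would call GD, Imagick, or dwebp.
    dst.write_bytes(b"\xff\xd8\xff\xd9")  # minimal JPEG marker pair

def serve_image(webp_path: str, accept_header: str) -> str:
    """Return the stored webp, or a cached on-demand jpeg fallback."""
    if wants_webp(accept_header):
        return webp_path
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    cached = CACHE_DIR / (hashlib.sha1(webp_path.encode()).hexdigest() + ".jpg")
    if not cached.exists():  # convert once; afterwards the cache serves it
        convert_webp_to_jpeg(webp_path, cached)
    return str(cached)
```

With nginx caching in front, the conversion cost is paid once per image rather than once per request.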

I would like to hear from anyone with better solutions, if possible, since as you can see, I haven't yet implemented my ideas.


I don't have a solution, but you may find this of interest: I was snooping around on AirBnB with my network analyzer. (I like to quasi-reverse-engineer sites). I noticed that if you hit the site with Chrome/FF, you get served webp images; if you hit the site with IE, you get served JPEGs.

It would be really great if there were an on-the-fly conversion library -- basically, reference the image once and let the runtime deliver the right image type. What I don't like about that is that I generally shy away from doing any sort of "if then" based on browsers' user-agent string. Edit/Update: I don't like it because it got us all into a lot of trouble during the Browser Wars[0] and is now generally considered bad practice.[1]

This might get me some downvotes... but hold my beer and watch this: I wonder if there is a JavaScript solution? Maybe you serve up the webp images, always, and write your own jQuery plug-in that converts the webp to jpeg in memory and then writes it to an HTML5 canvas, but only if it's Safari/IE?

[0] https://youtu.be/yRkQOw1uRrw?t=164

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Browser_de...

> What I don't like about that is that I generally shy away from doing any sort of "if then" based on browsers' user-agent-string.

Can't you just base it on the "Accept" header? That's the whole reason it exists, after all.

I think this is the right answer and semantically better than parsing user agents, guessing, or other magic.
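A minimal sketch of that check (Python for illustration). Note it only counts an explicit image/webp entry as support; a wildcard like */* technically matches too, but browsers that decode webp advertise it explicitly:

```python
def accepts_webp(accept: str) -> bool:
    """True if the Accept header lists image/webp with a nonzero q-value."""
    for item in accept.split(","):
        media, _, params = item.strip().partition(";")
        if media.strip().lower() != "image/webp":
            continue
        q = 1.0  # q defaults to 1 when absent
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        return q > 0
    return False
```

If you negotiate like this server-side, remember to add `Vary: Accept` to the response so intermediate caches keep the webp and jpeg variants apart.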

I've never used it, but there does appear to be something like that, called webp-hero.


Wouldn't it be possible to combine the approaches? Say, using a <picture> tag where the webp is requested when supported, and if not, a jpg is requested, which is generated from the webp on the backend.

It's been quite a few years now, but back in the day I had success leveraging `mod_pagespeed` in Apache, to rewrite server-generated markup on the fly, and to respond to raster image requests with webp assets for browsers that supported them. For other similarly conditional responses I relied on a proper WURFL integration -- not just shallow UA-sniffing -- but IIRC mod_pagespeed handled the webp conditionality on its own. Which is to say, I think mod_pagespeed was able to generate and cache a webp version of my jpg assets, and it would simply serve them in response to requests that included the relevant value in their "Accept" HTTP headers.

If you go that route, be aware that HTTP caching mechanics can get tricky, and some CDN service providers (deliberately and self-interestedly) violate the RFCs (e.g. refuse to cache anything with a "Vary" response header at all -- looking at you, Akamai)...

In retrospect, it was great fun to effectively piece together a web performance optimization service layer to support and accelerate the heck out of a traditional 3-tier web application... but doing it all by hand, these days, might not have great ROI.

I haven't dealt with this volume of images, but Cloudinary will auto-convert and serve the correct image format.

It's hard to give advice without knowing more detail but I would suggest putting a CDN (like cloudfront) in front of it to reduce bandwidth costs, and potentially storing the images themselves in S3 instead of on your server to reduce storage costs.

Other things like making thumbnails of all the images can also help reduce bandwidth but depends on how the site is structured.

AWS outbound bandwidth costs are insanely high compared to most alternatives, so this very rarely makes any sense unless your images are only very rarely accessed.

Being in a luxury market, is 20GB (or even 60GB for that matter) more disk space really considered too expensive, especially given the fact that it's going to need engineering effort? Or maybe you're talking traffic, but that isn't so costly either.

> When you are reading a text on your phone, you hold the screen at a certain distance from your eyeballs, but when you look at a computer screen—you tend to be further from it to read comfortably. The larger the distance between a retina and the Retina (pun intended), the larger the size of a “CSS pixel” will be.

That is terribly wrong. Actually it's just the opposite: the further you move your eyes from the screen, the smaller that "CSS pixel" appears.

Yet the whole idea of that "CSS pixel" is terribly flawed:

> The trickiest one is a so-called CSS pixel, which is described as a unit of length, which roughly corresponds to the width or height of a single dot that can be comfortably seen by the human eye without strain.

"Comfortably seen" is like the "average temperature of patients in a hospital": it carries no meaningful value for me in particular.

A CSS pixel is exactly this: a 1/96-inch square measured on the surface of the screen.

> the further you move your eyes from the screen, the smaller that "CSS pixel" appears.

You are misreading it. The pixel gets physically larger so that you will see it at the same size.

A CSS pixel is a 1/96 inch square on the surface of a typical desktop screen. (It wasn't adjusted to the recent trend of screen growth, so that number is getting more wrong with time.) Different media are expected to change that size so that your page renders the same way.

Is a desktop screen different than a phone screen? In the physical world, 1/96 inch is 1/96 inch. The notion of HiDPI abstracts for the fact that their pixel pitch is much finer than a typical display, usually by 2-4x. Otherwise, most displays fudge for the fact that they're not quite 96ppi, since 100 or 94 isn't vastly different.

Anyways, this is a good thing. It shouldn't really adjust with screen growth, but it's important to note that it's an abstract dimension (so thinking in terms of PPI/DPI with CSS pixels is unwise). It allows you to be agnostic to devices, and respect the user's preference (if possible) for how close to the physical display they are.

100 ppi for a Full HD 22" screen.

386 ppi for a Full HD 5.7" screen (i.e. my phone).
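The arithmetic behind those two numbers, for the curious:

```python
import math

def ppi(width_px: int, height_px: int, diagonal_inches: float) -> float:
    # Pixel density = diagonal resolution (in pixels) / diagonal size (in inches).
    return math.hypot(width_px, height_px) / diagonal_inches

print(round(ppi(1920, 1080, 22)))   # 22" Full HD monitor -> 100
print(round(ppi(1920, 1080, 5.7)))  # 5.7" Full HD phone  -> 386
```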

> Is a desktop screen different than a phone screen?

Hum... Yes. That's the entire point of my comment.

One is 90cm away, the other is 150cm away. The same viewing angle covers almost half the linear size on one screen as it does on the other.

There is an abbreviation for that - YMMV, right?

My monitor is closer to that usually. And that's the point. It makes absolutely no sense to measure viewing angles.

The only thing we can agree on is that the area of a finger's touch, measured on the screen surface, is roughly the same for all humans -- 1cm x 1cm or something like that.

So the size of your buttons should be measured in units of screen surface, not in some mysterious unit that depends on the distance orthogonal to the screen surface.

So: 1. CSS pixels must be treated as a logical length unit equal to 1/96 inch, and 2. browsers must ensure that 1in or 10cm is exactly that when measured with a physical ruler on the screen surface.

I, as a developer, must know that my 1cm button is exactly that on screen, and therefore clickable by a human.

Your mileage will vary. A lot.

But CSS sizes are defined by viewing angles, not by linear size. There was plenty of debate at the time it was standardized, and really, none of the options is good so they got one of the bad ones.

Yep, the initial idea of HTML was a one-dimensional tape of scrollable text: something with a given width but no height, literally just Text for reading...

Who would have thought that we would end up clicking <button>s with our literal fingers.

But how much dev time has already been wasted discovering that `px` stands for points in macOS terms (1/72 of an inch initially, now 1/96 of an inch) and is nowhere near real pixels...

Good article, but the author mentions "JPEG compression artifacts are so prominent that it practically spawned a new art form." and links to a glitch app.

Glitch art wasn't inspired by lossy JPEG compression, it was inspired by corrupted digital feeds and missing image data: https://en.wikipedia.org/wiki/Glitch_art

Thanks for the clarification! I still think it takes a lot from the JPEG artifacts, but I'm happy to change it if the author objects :)

Incorrect motion estimation data (e.g. datamoshing), Gibbs effect ringing, seams that come from JPEG-style macroblocks are all noticeable artifacts, along with noise from tape decks. They might not have started the genre, but they're certainly part of it now. I see nothing major to correct.

Pops and crackles to simulate vinyl record noise, too. Bands like The Bird and the Bee, and Death Cab for Cutie add those imperfections to digital tracks to get the sound they're looking for.

Thomas Ruff comes to mind as a prominent artist (photographer) working with JPEG artifacts.

That 3.5MB image can be squashed down further. It has way too much resolution for something that takes up a quarter of a monitor. An optimized 1080p-size image (possibly too much for desktop, might be OK for mobile hi-res screens) shouldn't be more than 300k.

That's right :) You can compress even more without losing much quality. 3.5MB is an example of where to start :)

I recently rebuilt the theme of my moderately-well-known WordPress-based site to use all vector graphics for the site UI. Further, I implemented SVG sprites so all relevant SVGs are embedded in the main HTML, eliminating the overhead of subsequent HTTP requests for those images. Along with a few other non-image-specific optimizations, that change made the site blazing fast, even when under heavy load (i.e., when linked from HN or /r/TodayILearned).

(I'll refrain from linking the site to avoid the appearance of naked self-promotion, but my username is a dead giveaway if you care to see it in action).

Did you use any tool to make the SVGs?

Mostly I used Adobe Illustrator, building up the shapes and exporting as CSV. I also did some hand-editing of the SVGs in Sublime for fine-tuning; SVGs are just XML, so it's not difficult to adjust colors or nudge nodes in a text editor.

Oops, I said 'CSV' above when I meant 'SVG'.

There's also Trimage for Linux users who want a magic, drag-n-drop GUI to do a lot of image compression. It's an alternative to Mac's ImageOptim.

There's too much filler in this article for me to understand what's being conveyed.

The article is geared towards front end engineers, clarifying how and when to compress images and which formats to choose. It stresses the importance of the image context more than the technicalities of the image. This context can also lead to the decision to choose other types of image, such as choosing a simplified drawing instead of a photo when the image context is icon-like.

Use svgs*

* when you can

I just went through a project where we deliberately converted all SVGs to PNGs (using an image CDN where we could request different transformed resolutions) and it resulted in MUCH better performance, both in tools like PageSpeed and in perceived performance, for 2 major reasons:

1. The PNGs tended to be MUCH smaller than the SVGs, and I think you'll find this likely whenever you have even a mildly complicated graphic (the article points this out).

2. There is an issue on Android devices where scrolling was absolutely awful when there were lots of large SVGs on the page. This completely went away when we switched to PNGs.

I think the best recommendation is to store the initial image either as an SVG or very high resolution raster image, but then use an on-the-fly transforming CDN so you can experiment and easily get the right image (size and format) where necessary.

SVG is an incredibly inefficient way to represent a line drawing, with all the coordinates written out as text, and not as deltas from the previous one. Flash's internal representation is much tighter, but not cool any more.

SVG can do relative coordinates fine: "m 10 10" (m dx dy) rather than "M 10 10" (M x y). But yes, they're text until you compress them.

Yes, but unless you round the floating point digits, you're going to use about as many bytes.
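A quick, synthetic illustration of that point (a random jagged path, not real artwork): relative `l` commands save a bit because the deltas stay small while absolute coordinates grow extra digits, but most of the savings only show up once you also round the coordinates:

```python
import random

random.seed(1)  # deterministic demo path
pts, x, y = [], 0.0, 0.0
for _ in range(200):
    x += random.uniform(0, 10)   # drifts right, so absolute x grows digits
    y += random.uniform(-5, 5)
    pts.append((x, y))

def abs_path(pts, fmt="{:.4f}"):
    # Absolute coordinates: "M x y L x y L x y ..."
    return "M " + " L ".join(f"{fmt.format(px)} {fmt.format(py)}" for px, py in pts)

def rel_path(pts, fmt="{:.4f}"):
    # Relative deltas: "M x y l dx dy l dx dy ..."
    d = [f"M {fmt.format(pts[0][0])} {fmt.format(pts[0][1])}"]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        d.append(f"l {fmt.format(x1 - x0)} {fmt.format(y1 - y0)}")
    return " ".join(d)

print(len(abs_path(pts)))                # absolute, 4 decimals
print(len(rel_path(pts)))                # relative, 4 decimals: somewhat smaller
print(len(rel_path(pts, fmt="{:.1f}")))  # relative + rounded: much smaller
```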

The parent is correct. At some point we gained a fear of binary formats, but binary formats are still the only way to transfer data efficiently.

Yes, flash had other problems. No, this doesn't make their point false.

Is that really true with gzipping?

Browsers still need to unzip and parse the SVG, and store the DOM representation of it.

Any examples you mind sharing?

One case where you probably can't is for image uploads. You might think it would be nice to allow a user to upload a svg for their avatar or something.

But SVG can embed javascript which can lead to XSS: https://hackerone.com/reports/148853

That's certainly something to be considered with user uploaded svg images. However it's reasonably straightforward to parse an svg and remove any script elements.
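A bare-bones version of that stripping using Python's standard library, for illustration only: a real sanitizer also has to worry about javascript: URLs in href/xlink:href, <foreignObject>, external references, and so on, which is why dedicated sanitizer libraries exist.

```python
import xml.etree.ElementTree as ET

def strip_scripts(svg_text: str) -> str:
    """Drop <script> elements and on* event-handler attributes from an SVG."""
    root = ET.fromstring(svg_text)
    for parent in root.iter():
        for child in list(parent):
            # Tags arrive namespaced, e.g. "{http://www.w3.org/2000/svg}script".
            if child.tag.rsplit("}", 1)[-1].lower() == "script":
                parent.remove(child)
    for el in root.iter():
        for attr in [a for a in el.attrib if a.lower().startswith("on")]:
            del el.attrib[attr]  # onclick, onload, onmouseover, ...
    return ET.tostring(root, encoding="unicode")
```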

You can safely use XSLT to subset SVG, and (for example) limit complexity. Scripts in SVGs do not load in img tags either.

No wonder scrolling is bad for us! Without line-tracking, scrolling kinda culls our attention span, patience and retention—all in one stroke. ;-)

I found the article very nicely written and do not think there was filler in there.

Using AV1 <video> tag with a single frame is an amazing hack. Thanks for sharing.

That's been known since talk of the HTML5 video tag started.

It's not so clever given the changing implementations of video prefetch logic, and the differing codec behaviour of one-frame videos on hardware decoders.

I know of a few Android devices which claim to be able to do 4K VP9 in hardware, but plainly hang upon playing VP9 from YouTube.

Does Android Chrome do JPEG decoding with hardware these days?

But it needs autoplay=true, which works less and less, even with "muted".

I would never use it though; I'm guessing it breaks all default browser image behavior: dragging an image from the browser to the desktop, right-click, accessibility, etc.

This is an extraordinarily well-written, informed and useful guide. Bookmarked for future reference.

In the last code example the image names are probably mixed up: huge-cupcake.webp should probably go with the (min-width: 800px) media query, not (max-width: 799px).

Thanks! Will fix it in an hour :)

What's the story with AV1 based image format to replace JPEG?

Also, is HEIF royalty free? Looks like AVIF is using it for container, which is surprising.

Looks like HEIF is not royalty-free? But it's already supported in Chrome and Safari, though not Firefox.



The HEIF container is just an ISOBMFF container, so it should be royalty-free.

HEIF has problems because of the HEVC codec, which was replaced in AVIF.

ISO requires the container format to be royalty free?

This describes the problem more or less but it does not give a single working solution. Image optimization now includes videos, scaling, format and color conversion, and simply compressing images at Quality 80 does not provide consistent visual quality [1].

[1] https://optimage.app/benchmark

There was a footer pop-up that, I think, was about cookies. Just blowing lightly on my mouse wheel made it disappear and I can't get it back. What did I agree to via bad UI?

The text reads "Humans! We come in peace and bring cookies. We also care about your privacy: if you want to know more or withdraw your consent, please see the Privacy Policy." and links to [0]. Like most websites, a lot of analytics is gathered here too, over which you have little control.

[0] https://evilmartians.com/privacy

A potential future option I only just learned about is JPEG XL (not XR!) which is proposed as a royalty-free successor to the JPEG we know. The draft is at https://arxiv.org/abs/1908.03565 and a little bit about what it has and why is at https://www.spiedigitallibrary.org/conference-proceedings-of... (but I haven't found nearly as much public detail on tools, reasoning behind choices, etc. as there is for e.g. AV1).

The most interesting bit of it is that it contains a JPEG1 recompressor that saves about 20% space but allows exact reconstruction of the original file. It uses more modern entropy coding and goes to more effort to predict coefficients than JPEG1. It has almost exactly the same gains as, and sounds a lot like, Dropbox's Lepton, described here: https://blogs.dropbox.com/tech/2016/07/lepton-image-compress... .

Seems like a big deal to plug seamlessly into all the JPEG-producing stuff that exists, without either doing a second lossy step (ick) or forking into two versions when you first compress the high-quality original. 20% off JPEG sizes is also a bigger deal than it may sound like; totally new codecs with slower encodes and a bucket of new tools only hit like ~50% of JPEG sizes. As Daala researcher Tim Terriberry once said, "JPEG is alien technology from the future." :)

For JPEG XL's native lossy compression, it has some tools reminiscent of AV1 and other recent codecs, e.g. variable-sized DCTs, an identity transform for fundamentally DCT-unfriendly content, a somewhat mysterious 4x4 "AFV" transform that's supposed to help encode diagonal lines (huh!), a post-filter to reduce ringing (that cleverly uses the quantization ranges as a constraint, like the Knusperli deblocker: https://github.com/google/knusperli ).

Interestingly it does not use spatial prediction in the style of the video codecs. A developer in a conference presentation mentioned that it's targeting relatively high qualities equivalent to ~2bpp JPEGs -- maybe spatial prediction just doesn't help as much at that level?

Don't know if AV1-based compression or JPEG XL will get wide adoption first, but either way we should actually have some substantial wins coming.

What does this mean:

> Have you considered HTTP/2 that supports multiplexing at a protocol level

With HTTP/2, multiple requests can simultaneously go through one connection. This allows multiple images to be fetched efficiently as separate requests. With HTTP/1.1 it was common to tile multiple images together into a sprite so they could be loaded in one request, then use CSS to display just the bits you wanted.
