Always cool to see new visual compression libraries hit the scene. That said I think the hardest part isn't the math of it, but the adoption of it.
Likely the format with the best chance of overthrowing the jpg/gif/png incumbents is AVIF. Since it's based on AV1, you'd get hardware acceleration for decoding/encoding once it starts becoming a standard, and browser support will be trivial to add once AV1 has wide support.
Compression-wise, AVIF performs at about the same level as FLIF (15-25% better than WebP, depending on the image), and is also royalty free. The leg up it has on FLIF is that the Alliance for Open Media[1] is behind it, a consortium of companies including: "Amazon, Apple, ARM, Cisco, Facebook, Google, IBM, Intel Corporation, Microsoft, Mozilla, Netflix, Nvidia, Samsung Electronics and Tencent."
I'm really excited for it and I hope it actually gets traction. It'd be lovely to have photos / screenshots / gifs all able to share a common format.
>Likely the format with the best chance of overthrowing the jpg/gif/png incumbents is AVIF
I used to think the same, but I now think JPEG XL is poised to be the 'winner' among next-gen image codecs. It's royalty free, offers great lossy and lossless compression that is said to beat the competition, and provides a perfect upgrade path for existing JPEGs, since it can losslessly recompress them into the JPEG XL format with a ~20% size decrease (courtesy of the PIK project).
It's slated for standardisation within a couple of weeks; it will be very interesting to see large-scale comparisons of this codec against the likes of AVIF and HEIF.
It may not be the best name, but it sure is better than the names of the projects it was based on: PIK and FUIF. At least if you speak Dutch. A "pik" is a penis and a "fuif" is a party, so the combination would be the "penis party" image codec. I prefer "JPEG XL".
The etymology of the name "XL" is as follows: JPEG has called all its new standards since j2k something that starts with an X: XR, XT, XS (S for speed, since it is very fast and ultra-low-latency), and now XL. The L is supposed to mean Long term, since the goal is to make something that can replace the legacy JPEG and last as long as it did.
There might be an explanation around it, but I find it not much better than Apple with their iPhone XS. I agree that XL is a bad name, and I don’t think that “pik fuif” would matter a lot (I’m also Dutch).
I hope that JPEG XL will be simpler than the competitors. If its compression ratio is similar to AVIF and it can do HDR, then I'm all for it!
AVIF (and its image sequences) seems to be fairly complicated. Here are a few comments [1] about it:
"Given all that I'm also beginning to understand why some folks want something simpler like webp2 :)"
"Despite authoring libavif (library, not the standard) and being a big fan of the AV1 codec, I do occasionally find HEIF/AVIF as a format to be a bit "it can do anything!", which is likely to lead to huge or fragmented/incomplete implementations."
Anyway, instead of adding FLIF, AVIF, BPG, and lots of similar image formats to web browsers, I think only one good format is enough, and JPEG XL might be it. Once something has been added to web browsers, it can't be removed.
Safari hasn't added support for WebP (which is fine; there's no need for WebP once AVIF/JPEG XL is out) and it hasn't added support for HEIF (which is weird, considering Apple uses it on iOS), but maybe they know there's no need to rush.
It is complementary to AVIF (which targets photo capture more than delivery over the web).
WebP2 is like WebP: keep it simple, with only useful and efficient features, no fancy stuff. And aim at very low bitrate, it's the current trend, favoring image size over pristine quality.
Progressive rendering uses ~3x more resources on the client side. So, instead, it's better to have efficient 300-byte thumbnails in the header.
The example you linked to is pretty telling, because not only do the BPG images decode more slowly than natively supported images, the javascript decoding approach apparently breaks the browser's (Firefox's) color management. I think native support is needed for newer codecs to be viable for more than simple demos.
Chrome appears to interpret the canvas as sRGB and to convert from that, but that means that images decoded that way are effectively limited to sRGB until the canvas API allows specifying other colorspaces.
In Firefox, these canvas images appear stretched into the full gamut of the monitor (oversaturated), even though I have a color profile and have full color management enabled in about:config.
I'd by far rather see several full, article-appropriately sized sideline images at only progression step 1 or 2 out of 5 or 6 than several empty boxes all showing spinners while the images load in.
Plus I'd love the ability to say "I'm on a low bandwidth, $$$ per megabyte network, stop loading any image after the first iteration until I indicate I want to see the full one" because you almost never need full-progression-loaded images. They just make things pretty. Having rudimentary images is often good enough to get the experience of the full article.
(whether that's news, or a tutorial, or even an educational resource. Load the text and images, and as someone who understands the text I'm reading, I can decide whether or not I need the rest of that image data after everything's already typeset and presentable, instead of having a DOM constantly reflow because it's loading in more and more images)
Progressive mode is better than a loading spinner in the same way that PWAs are better than a loading spinner: By getting mostly usable content in as little time as possible to the user you decrease perceived wait, you decrease time to interactive and you increase perceived loading speed (even though time to full load might be the same or slightly increased).
Progressive photos always irritate me. The photo comes on my screen all blurry and out of focus and I'm disappointed that the moment I thought I had captured didn't turn out. Then, 3 seconds later, the picture changes and gets slightly better. Then I'm hopeful, but disappointed again. Then i think, "maybe it's not loaded yet", so i wait and hope. Then 2 seconds later it changes again. Is it done? Is it loaded now? Will it get better? Is my computer just dog slow? How long should I wait before I assume it's as good as it's going to get?
I know it's a small thing and doesn't really matter, but I don't like progressive photos.
Edit: This is just one context. There are plenty of other contexts where progressive is very useful.
On the other hand, it also doesn't constantly change the DOM, moving you across the page because images above what you're reading are getting swapped in and now you're looking at the paragraph above the one you were reading.
The progressive modes in JPEG and JPEG XL are quite different, because the quality is so much better that your perception of it changes. Where progressive JPEGs are literally useless before they finish loading, JPEG XL provides decent quality.
In the common case that you don't actually care about the picture, you decrease actual wait, not just perceived. Progressive mode lets you ignore it before it's fully loaded.
True progressive mode actually gives you something usable before it's fully loaded, to the point where "not fully loading" can be entirely acceptable. If all images load "first iteration" first, you have a stable DOM that won't need to reflow in any way as each image goes through the rest of its progressive layers. And having a nice setting that says "only load up to X on slow networks" and "require image click to load past first iteration" would be a godsend, even on gigabit connections. If I could bring the size and load time for modern webpages back down to "maybe a few tens of KB" and "immediate" rather than "2.9MB and a second or two", I'd happily turn those options on.
How so? Progressive JPEGs load in multiple passes. The first pass trades fidelity for speed, successive passes add quality over time. Seems pretty much in line with what PWA is all about.
Imagine this: you set your browser to download only the first N bytes of each image, showing you a decent preview. If you want more detail on a particular image, you tap it, and the browser requests the next N bytes (or the rest of the file, if you prefer).
And to enable this, the site only needed to create one high-resolution image.
Seems like a victory for loading speeds, for low-bandwidth, and for content creation.
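A minimal sketch of what that client behaviour could look like, assuming a progressive format and a server that honours HTTP Range requests; the URL and the byte budget are placeholders, not anything FLIF-specific:

    import urllib.request

    # Hypothetical image URL and byte budget, purely for illustration.
    IMAGE_URL = "https://example.com/photo.flif"
    PREVIEW_BYTES = 64 * 1024  # the first 64 KiB is often enough for a rough preview

    def fetch_prefix(url, n_bytes):
        """Ask the server for only the first n_bytes of the image.

        A progressive codec can turn this prefix into a low-detail preview;
        a later request for "bytes=<n_bytes>-" would fetch the rest on demand.
        """
        req = urllib.request.Request(url, headers={"Range": f"bytes=0-{n_bytes - 1}"})
        with urllib.request.urlopen(req) as resp:
            # 206 Partial Content means the server honoured the range request;
            # 200 means it ignored it and is sending the whole file.
            return resp.status, resp.read(n_bytes)

    status, prefix = fetch_prefix(IMAGE_URL, PREVIEW_BYTES)
    print(status, len(prefix))

A second request for the remaining byte range would then upgrade the preview on demand.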
Agreed, but is it likely? Does any browser implement the even simpler feature "Show a placeholder for images. Tap the placeholder to load the actual image"?
Systems that render low res can download the prefix of the file and show an image, and stop there. Many modern image formats support this for precisely this reason. If you then want higher quality for some particular image, you can download more, not having to start over with a separate image file.
The talk is by the FLIF author. One of the big marketing points for FLIF is its progressive mode. Of course every other codec will be criticized for not having one.
>Interesting. Where have you seen that adoption will be swifter with JPEG XL instead of, say, AV1/AVIF?
Well, it's not finalized as of yet (though it is imminent), so rate of adoption is just pure guesswork at this stage. However, things I deem necessary for a new image codec to become the next 'de facto' standard are:
- royalty free
- major improvements over the current de facto standards
Both AVIF and JPEG XL tick these boxes; however, JPEG XL has another strong feature: it offers a lossless upgrade path for existing JPEGs, with significantly improved compression as a bonus.
Yes, it losslessly recompresses existing JPEGs into the JPEG XL format while also making the files ~20% smaller, the key point being lossless. Thus it is the 'perfect' upgrade path from JPEG, the current lossy standard, as you get better compression and no loss in quality when shifting to JPEG XL.
This being a 'killer feature' of course relies on JPEG XL being very competitive with AVIF in terms of lossy/lossless compression overall.
I'm assuming this is bidirectional? You can go back from XL to JPEG losslessly as well? If that's the case, I'm having trouble imagining a scenario where you're not correct; it'd be an utterly painless upgrade path.
>A lightweight lossless conversion process back to JPEG ensures compatibility with existing JPEG-only clients such as older generation phones and browsers. Thus it is easy to migrate to JPEG XL, because servers can store a single JPEG XL file to serve both JPEG and JPEG XL clients.
>You can go back from XL to jpeg losslessly as well
I don't think so, but I don't quite see the point unless you are thinking of using it as a way to archive JPEGs, and in that case there are programs specifically for that, like PackJPG, Lepton, etc.
Decompressing and recompressing a zip gives a lossless copy of the actual data, but there's no way to reconstruct the same exact zip you started with.
The same thing can be done with image data. For something like jpeg you can keep the coefficients but store them in a more compact form.
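To make the zip analogy concrete, here's a small self-contained sketch (standard library only): the archived data round-trips bit-exactly, but the archive bytes themselves generally don't.

    import hashlib
    import io
    import zipfile

    payload = b"the actual data we care about" * 1000

    def make_zip(data, level):
        # Different settings (or a different zip tool) produce different archive bytes.
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED, compresslevel=level) as zf:
            zf.writestr("data.bin", data)
        return buf.getvalue()

    zip_a = make_zip(payload, level=1)
    zip_b = make_zip(payload, level=9)

    # The archives themselves differ...
    print(hashlib.sha256(zip_a).digest() == hashlib.sha256(zip_b).digest())  # False

    # ...but the data inside is identical, bit for bit.
    data_a = zipfile.ZipFile(io.BytesIO(zip_a)).read("data.bin")
    data_b = zipfile.ZipFile(io.BytesIO(zip_b)).read("data.bin")
    print(data_a == data_b == payload)  # True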
For what it's worth JPEG XL claims that it's 'reversible', but I'm not sure if that means you get your original jpeg back, byte for byte, or you get an equivalent jpeg back.
> Decompressing and recompressing a zip gives a lossless copy of the actual data, but there's no way to reconstruct the same exact zip you started with.
I don't think getting the same JPEG is the goal here but getting a JPEG that decodes to the same image data.
Btw, if memory serves right, Google Photos deliberately gives you back the same pixels but not the same JPEG codestream bits under some circumstances. That's to make it harder to exploit vulnerabilities with carefully crafted files.
Is it possible to get a JPEG back without any loss when recompressing a JPEG to another JPEG (after discarding the original, ie JPEG -> bitmap -> JPEG)?
If you can go both directions, you can store the more efficient JPEG XL format but still have perfectly transparent support for clients that don't support JPEG XL.
If you can't produce the exact same original JPEG, then you can still have some issues during the global upgrade process -- e.g. your webserver's database of image hashes for deduplication has to be reconstructed.
A relatively minor problem to be sure, but AFAICT if JPEG XL does support this (which apparently it does), the upgrade process is really as pain-free as I could imagine. I can't really think of anything more you could ask for out of a new format: better compression plus backwards and forwards compatibility.
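For what it's worth, the deduplication concern has a workaround even if byte-exact reconstruction weren't available: hash the decoded pixels instead of the file bytes. A rough sketch, assuming the third-party Pillow package and hypothetical filenames:

    import hashlib
    from PIL import Image  # third-party Pillow

    def file_hash(path):
        # Hash of the raw file bytes: breaks if the codestream is ever rewritten.
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    def pixel_hash(path):
        # Hash of the decoded pixels: survives any transformation that is
        # lossless at the image-data level, even if the bytes on disk change.
        with Image.open(path) as img:
            return hashlib.sha256(img.convert("RGB").tobytes()).hexdigest()

    # Hypothetical filenames: the same photo before and after a round trip
    # through a newer format.
    print(file_hash("cat_original.jpg") == file_hash("cat_roundtripped.jpg"))    # may be False
    print(pixel_hash("cat_original.jpg") == pixel_hash("cat_roundtripped.jpg"))  # True if lossless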
Sounds like it will just take an existing JPEG and reduce the size, but not re-compress it - so even though the original JPEG is lossy, there will be no additional loss introduced, whereas another format not based on JPEG would require a re-encode pass that would lose additional information.
Does it have the feature where the file can be truncated to any size to form its own thumbnail? That would be an incredibly useful feature in so many applications.
Some wavelet formats (SPIHT and EZW e.g) have the property that truncating the lossless version at 10% of the size gives you the 10:1 lossy rendering of that picture. That property holds not just for 10% but for any fraction of the file.
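You can get a feel for this with an ordinary progressive JPEG and Pillow (third-party): hand the decoder only a prefix of the file and let it render whatever passes fit in it. The filename and the 10% cut-off are arbitrary, and a baseline (non-progressive) JPEG would instead just be missing its bottom portion:

    import io
    import os
    from PIL import Image, ImageFile  # third-party Pillow

    # Let Pillow render whatever it managed to decode instead of raising
    # an error when the data stops early.
    ImageFile.LOAD_TRUNCATED_IMAGES = True

    SOURCE = "photo_progressive.jpg"  # hypothetical progressive-mode JPEG

    with open(SOURCE, "rb") as f:
        prefix = f.read(os.path.getsize(SOURCE) // 10)  # keep only the first 10% of the bytes

    preview = Image.open(io.BytesIO(prefix))
    preview.load()                   # decodes the coarse passes that fit in the prefix
    preview.save("preview.png")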
It's not quite the same thing, but a very highly compressed high-res picture would probably look okay in thumbnail format.
Good news: the reference software (http://gitlab.com/wg1/jpeg-xl) can decode about 50 megapixels/s on a single Skylake core, with a 3.1-3.7x speedup on 4 cores.
Encoding throughput is configurable from ~1 MP/s to about 50 MP/s per core. That's somewhat slower than JPEG, but it can be faster with multiple cores.
> It'd be lovely to have photos / screenshots / gifs all able to share a common format.
Given the way GIF is used nowadays, the only reason it exists is that it lets you put a video in most of the places where you are only allowed to put pictures. For some weird reason many of those websites still won't let you upload an MP4 file, even though they auto-convert the GIFs you upload to MP4/AVC. From a technical point of view there is hardly a reason for GIF to be used at all.
Objectively worse are infinite CSS-animations with opacity:0 set on them that Blink can't figure out are indeed invisible and don't need repainting at 60fps. Cue me noticing my CPU is at 100% and 69°C after about 2 minutes...
(ThinkPad T400s are really bad at thermal management.)
Exactly what you are saying: FLIF is not new (2015) but is still not adopted in any browser I know.
I also believe this was due to some license choices in the past.
But in this time of responsive websites it is a great format!
Just download the amount of pixels you need to form a sharp image and then skip loading the rest.
What it will take for something to overthrow JPEG is for both Google and Apple to adopt it. If Android, Chrome, iOS, macOS and Safari all support it (and the phones shoot in it natively like the iPhone does with HEIF right now), it will take over.
Think of Google favoring sites with AVIF, JPEG XL, or whatever they settle on in their page rank. Within half a year 80% of websites would have converted.
Worth noting that it's not really "new" as such, being over 4 years old. It still doesn't seem to have hugely caught on...
That said, lossless codecs have it easier because you can freely transcode between them as the mood takes you, with no generation loss, so there's less lock-in. For example, FLAC is dominant in audio, but there are a variety of other smaller lossless formats which still see use. Nobody much minds.
Would this really be lovely? Isn't there some value in having 3 distinct formats (still picture, moving picture, moving picture with audio)? Each one is a subset of the next, but they are 3 different things that can easily be identified by what they (only potentially, but probably) are.
It's a win from the implementation side because you can support all three use cases with one format, reducing code duplication and at least hypothetically leveraging the same improvements across the three use cases with no additional effort.
The UX concern you're describing doesn't necessarily have to have anything to do with the implementation details themselves, as demonstrated by sites like imgur rebranding silent mp4/webms as "gifv". As long as the new format makes it reasonably easy to access data about its content (e.g., header field indicating animation/movement), there shouldn't be any issue with collapsing the use cases into a single implementation and simultaneously addressing the UX concern you mention.
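As a sketch of that idea, here's how a site could make the still-vs-animated call from metadata alone, using the third-party Pillow package (the filenames are hypothetical); any container that exposes a frame count or animation flag in its header allows the same trick:

    from PIL import Image  # third-party Pillow

    def describe(path):
        """Classify a file the way a site might before choosing its player UI."""
        with Image.open(path) as img:
            animated = getattr(img, "is_animated", False)
            frames = getattr(img, "n_frames", 1)
        if animated:
            return f"{path}: animated, {frames} frames -> show play/pause controls"
        return f"{path}: still image -> a plain <img> is fine"

    # Hypothetical files, purely for illustration.
    for name in ("screenshot.png", "reaction.gif", "clip.webp"):
        print(describe(name))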
It’s faster (especially to encode), and in my opinion, better tuned for high quality. H.265 can give a “usable” image in a relatively small file size, but it sometimes takes a surprising amount of additional bytes to get rid of the somewhat oversmoothed aspect, and that additional amount can vary a lot from one image to the next, so with the current implementations, you have little choice but to verify visually. At least, that’s my experience with the reference encoder. I haven’t had a chance to experiment with Apple’s implementation.
In contrast, with JPEG XL, I can simply use the high quality setting once per image and be done with it, trusting that I will get a faithful image at a reasonable size.
> Keep in mind that the author of FLIF works on the new FUIF (https://github.com/cloudinary/fuif), which will be part of JPEG XL. So FLIF will probably be deprecated soon. And since JPEG XL is also based on Google's PIK, there is a high probability that Google will support this new format in their Blink engine.
FLIF author here. I have been working on FUIF and JPEG XL the past two years. FUIF is based on FLIF but is better at lossy. JPEG XL is a combination of Google's PIK (~VarDCT mode) and FUIF (~Modular mode). You'll be able to mix both codecs for a single image, e.g. VarDCT for the photo parts, Modular for the non-photo parts and to encode the DC (1:8) in a super-progressive way.
I'm very excited about JPEG XL, it's a great codec that has all the technical ingredients to replace JPEG, PNG and GIF. We are close to finalizing the bitstream and standardizing it (ISO/IEC 18181). Now let's hope it will get adoption!
It looks like the creator (Jon Sneyers) has since (2019) made another image format more focused on lossy compression, FUIF[0], which itself has been subsumed by the JPEG XL format[1]. I hope the "JPEG" branding doesn't make folks think that JPEG XL isn't also a lossless format!
JXL is an interesting standard. Integrates a recompressor that shaves ~20% off .jpgs without further loss(!), FUIF for essentially-lossless images in a progressive/interlaced mode like this, a separate lossless format, and a new lossy format with variable-sized DCTs and a bunch of other new tools (other transforms for DCT-unfriendly content, postfilter for DCT artifacts, new colorspace, chroma-from-luma...). It's intended to be royalty-free.
I'm not sure if putting it all under one standard will make it hard to adopt, and/or if AVIF (based on AV1) will just become the new open advanced image format first. Interesting to see in any case.
I think the progressive loading feature of FLIF/FUIF/JXL would be really interesting for WebGL. WebGL games already have asynchronous resource loading and incremental image decoding/texture upload just by virtue of using the available APIs. One could get a pseudo texture streaming effect by default (although no unloading of textures...)
Most of them had expired. Wavelets are something that looks good in theory but in practice couldn't beat the millions of man-hours of work put into the standard DCT.
Yet, I think it's really hard to change what has stuck. The gains have to be enormous to warrant the hassle of trying to keep publishing in and supporting a new image format until it just works for everyone.
The reason we have PNG and JPEG is that they are, all in all, more than good enough. Yes, the dreaded "good enough" argument surfaces again stronger than ever. They are also easy to understand, i.e. use JPEG for lossy photos and PNG for pixel-exact graphics. But most importantly they both compress substantially in comparison to uncompressed images (like TIFF) and both have long ago reached the level of compression where improving compression is mostly about diminishing gains.
As there's less and less data left to compress further the compression ratio would need to go higher and higher for the new algorithm to make even a dent in JPEG or PNG in any practical sense.
Also, image compression algorithms try to solve a problem that has been made gradually less important each year by faster network connections. Improvements in image compression efficiency have been far outrun by improvements in network bandwidth over the last 20 years. The available memory and disk space have grown enormously as well.
For example, it's not much of a problem if a background image of a website compresses down to 500 KB rather than 400 KB when the web page itself is 10 MB and always takes 10 seconds to load regardless of which decade it is. If you could squeeze half a megabyte off the website's image data, the site wouldn't effectively be any faster (but maybe marginally so, allowing the publisher to add another half-a-megabyte of ads or other useless crap instead).
The reason we have jpeg is because png is not good enough for photos and people prefer the lossy compression of jpeg over using png. The reason other lossy formats are struggling is because they are still lossy. This promises to basically be good enough for just about anything. That sounds like a big promise but if true, there's very little stopping major browser implementing support for this. I'd say progressive decompression sounds like a nice feature to have for photo websites.
Compression is still majorly important on mobile. Mobile coverage is mostly not great except maybe in bigger cities where you get to share the coverage with millions of others. Also mobile providers still throttle connections, bill per GB, etc. So, it matters. E.g. Instagram adopting this could be a big deal. All the major companies are looking to cut bandwidth cost. That's also what's driving progress for video codecs. With 4K and 8K screens becoming more common, jpeg is maybe not good enough anymore.
File size matters for networks, not compression. Compressors have an interface where you specify desired file size and the program tries to produce a file of that size. With better compression algorithm the image will be just of a better quality, time to download and cost per GB will be the same.
> Compressors have an interface where you specify desired file size and the program tries to produce a file of that size.
That’s not really the case for JPEG XL, where the main parameter of the reference encoder is in fact a target quality. There is a setting to target a certain file size, but it just runs a search on the quality setting to use.
This algorithm is supposedly lossless and requires no fiddling with such settings. So a good enough non lossy algorithm could be quite disruptive. Of course with a lossy algorithm, if you throw enough detail away it's going to be smaller.
With the amount of images readily available on the internet with easily visible JPEG compression artifacts, and sometimes the difficulty of finding images without such, I would say there is still quite a lot to gain from better image formats.
The section "Works on any kind of image" is really misleading, as it mentions JPEG as a lossy format (alongside JPEG 2000) then says "FLIF beats anything else in all categories."
It really needs a giant caveat saying "lossless". I mean, that's still great and impressive, but it clearly doesn't erase the need for a user to switch formats as a lossless format is still not suitable for end users a lot of the time.
(It does have a lossy mode, detailed on another page, but they clearly show it doesn't have the same advantage over other formats there.)
It literally stands for "FLIF - Free Lossless Image Format", and the first sentence is "FLIF is a novel lossless image format which outperforms PNG, lossless WebP, lossless BPG, lossless JPEG2000, and lossless JPEG XR in terms of compression ratio."
Seems like they're doing a pretty decent job of communicating that it's lossless, to me.
It would be reasonable to interpret the shorter boast as making the very surprising claim that it beats anything else, including lossy JPEG, in all categories of performance, including compression ratio. As it turns out, they don't intend to claim that, because it's not true. It's probably impossible for a lossless file format to do that, even for commonly occurring images. (It's certainly possible for a lossless image format to do that for images that have a lot of redundancy in a form that JPEG isn't sophisticated enough to recognize.)
Surely nobody could interpret this as beating an arbitrarily lossy format. I could just erase the entire input, after all. One would have to incorporate quality into the metric. Because of that, it seems natural to assume they mean lossless.
If a given lossy format is inefficient enough, then it’s possible for a lossless format to beat it at compression. I initially interpreted the website as making the surprising claim that FLIF beats several popular lossy formats while itself being lossless. Most people understand “JPEG” to mean the popular lossy format, JPEG. The site could be a lot clearer that that they are actually comparing it to lossless JPEG, lossless WEBP, etc.
> It's probably impossible for a lossless file format to do that, even for commonly occurring images.
It's actually provably impossible using a simple counting argument. A lossy algorithm can conflate two non-identical images and encode them the same way while a lossless algorithm can't, so on average the output of a lossless algorithm is necessarily larger than a lossy one because it has to encode more possible outputs.
Yes, of course it's impossible to losslessly compress all possible images by even a single bit. But how predictable is the content of typical images? How much structure do they have? They certainly have a lot more structure than PNG or even JPEG exploits. Some of the “loss” of lossy JPEG is just random sensor noise, which places a lower bound on the size of a losslessly compressed image, but we have no idea how much.
Doesn't matter. Whatever you do to compress losslessly you can always do better if you're allowed to discard information. And the structure is part of the reason. For lossless compression you have to keep all the noise, all the invisible details. With lossy compression you're allowed to discard all that.
You should take more time before you start throwing out terms like "provably impossible". A sufficiently bad lossy algorithm can be worse than an algorithm like PNG in every single case. For example, a format based on PPM that can lossily reduce the file size by up to half.
> A sufficiently bad lossy algorithm can be worse than an algorithm like PNG in every single case.
Well, yeah, obviously if you want to shoot yourself in the foot that is always possible. But for any lossless algorithm it is trivial to derive a corresponding lossy algorithm that will compress better than the lossless one.
Fair enough. Nonetheless, a lossy compression algorithm by definition produces fewer possible outputs than possible inputs. A lossless compression algorithm, again by definition, produces exactly as many different outputs as there are possible inputs. Therefore, on average, unless the lossless algorithm has a perverse encoding that uses more bits than necessary to encode the possible outputs, no lossless algorithm can beat it on average. It is of course possible that a lossless algorithm might beat a lossy one on some select inputs. It's even possible that those inputs are the ones you are interested in, but in that case you are using the wrong lossy algorithm.
> Therefore, on average, unless the lossless algorithm has a perverse encoding that uses more bits than necessary to encode the possible outputs, no lossless algorithm can beat it on average
I think you mean "unless the lossy algorithm has a perverse encoding."
But your argument proves too much. It's true that no lossless algorithm can beat a lossy algorithm if averaged across all possible images, for precisely the reason you say. But for the same reason, no lossless algorithm can beat the "compression algorithm" of no compression at all, like BMP but without redundancy in the header, averaged across all possible images.
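To pin the counting argument down (this is the standard pigeonhole bound, nothing image-specific): if C is an injective code on n-bit inputs, then for any c >= 1,

    \[
    \bigl|\{\, x \in \{0,1\}^n : |C(x)| \le n - c \,\}\bigr|
    \;\le\; \sum_{k=0}^{n-c} 2^{k} \;<\; 2^{\,n-c+1},
    \]

so fewer than a 2^(1-c) fraction of all n-bit inputs can shrink by c or more bits; averaged over all possible images, no lossless code beats storing them raw.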
The unknown question — which I think we don't even have a good estimate of — is how much incompressible information is in the images we're interested in compressing. Some of that information is random noise, which lossy algorithms like JPEG can win by discarding. (Better algorithms than JPEG can even improve image quality that way.) But other parts of that information are the actual signal we want to preserve; as demonstrated by Deep Dream, JPEG doesn't have a particularly good representation of that information, but we have reason to believe that on average it is a very small amount of information, much smaller than JPEG files.
If it turns out that the actually random noise is smaller than the delta between the information-theoretic Kolmogorov complexity of the signal in the images we're interested in compressing and the size of typical JPEG files for them, then a lossless algorithm could beat JPEG, as typically applied, on size. Of course, a new lossy algorithm that uses the same model for the probability distribution of images, and simply discards the random noise rather than storing it as a lossless algorithm must do, would do better still.
We are seeing this in practice for audio with LPCNet, where lossy compression is able to produce speech that is more easily comprehensible than the original recording: https://people.xiph.org/~jm/demo/lpcnet_codec/
It seems unlikely to me that the ideal predictor residual, or random noise, in an average photo (among the photos we're interested in, not all possible Library-of-Borges photos) is less than the size of a JPEG of that photo. But we don't have a good estimate of the Kolmogorov complexity of typical photos, so I'm not comfortable making a definite statement that this is impossible.
For example, one unknown component is how random camera sensor noise is. Some of it is pure quantum randomness, but other parts are a question of different gains and offsets (quantum efficiencies × active sizes and dark currents) in different pixels. It might turn out that those gains and offsets are somewhat compressible because semiconductor fabrication errors are somewhat systematic. And if you have many images from the same camera, you have the same gains and offsets in all of the images. If your camera is a pushbroom-sensor type, it has the same gains and offsets in corresponding pixels of each row, and whether it does or not, there's probably a common gain factor per pixel line, related to using the same ADCs or being digitized at the same time. So if your lossless algorithm can model this "random noise" it may be able to cut it in half or more.
> I think you mean "unless the lossy algorithm has a perverse encoding."
That's right.
> It's true that no lossless algorithm can beat a lossy algorithm if averaged across all possible images
I concede the point. I thought that this argument applied to the average of any input set, not just the set of all possible images, but then I realized that it's actually pretty easy to come up with counterexamples e.g. N input images, which can be losslessly compressed to log2(N) bits, which will easily outperform jpg for small N. That's not a very realistic situation, but it does torpedo my "proof".
It's probably still true that no lossless format could possibly beat lossy JPEG for typical photos.
If I understand your example correctly, it only has the lossless algorithm beating JPEG by cheating: the lossless algorithm contains a dictionary of the possible highly compressible images in it, so the algorithm plus the compressed images still weighs more than libjpeg plus the JPEGs. But there are other cases where the lossless algorithm doesn't have to cheat in this way; for example, if you have a bunch of images generated from 5×5-pixel blocks of solid colors whose colors are generated by, say, a known 68-bit LFSR, you can code one of those images losslessly in about 69 bits, but JPEG as typically used will probably not compress them by an atypical amount. Similarly for images generated from the history of an arbitrary 1-D CA, or—to use a somewhat more realistic example—bilevel images of pixel-aligned monospaced text in a largish font neither of whose dimensions is a multiple of 8, say 11×17. All of these represent images with structure that can be straightforwardly extracted to find a lossless representation that is more than an order of magnitude smaller than the uncompressed representation, but where JPEG is not good at exploiting that structure.
If we're going to switch to a new format, it would preferably support and be good at everything, including animation, lossy, alpha, metadata and color profiles.
What the first graph shows me is that this format is slightly better than WebP in one aspect, ignoring half of WebP's other features. Considering WebP is already far ahead in terms of adoption, I'm happy sticking with that instead. It's natively supported in most browsers and a lot of graphics software, which is the real uphill battle for a new file format.
If you can download an arbitrarily-small amount of a lossless image and end up with a copy of reduced quality, would you even need a lossy image format for most browser use-cases?
The very obvious thing missing from the site is decode and encode benchmarks. It's very context-dependent, but if it had a long decode time, that could outweigh the bandwidth savings.
That's exactly what I thought. About ten years ago everyone was rushing to distribute large downloads in xz format. These days some have started to move away from it just because of how slow it is to compress and decompress.
Compression is only mildly faster or on par with xz, but decompression is (at similar ratios) vastly faster. Which really helps consumers of compressed blobs.
Yeah, that's a useful distinction. So it's excellent for ex. packages (hence https://www.archlinux.org/news/now-using-zstandard-instead-o...), but iffy for write-once-read-hopefully-never backups. (Although I've heard it suggested that this might flip again the moment you start test-restoring backups regularly)
Agreed. I think I'd take zstd over xz for WORN backups anyway, just because it's pretty reliable at detecting stream corruption. (Then again, I suggest generating par2 or other FEC against your compressed backups so that's not a problem.)
(I know you know this, as one of the principals; this is for other readers.)
And importantly, that 32-bit checksum is a pretty good checksum; a truncated XXH64(): https://tools.ietf.org/html/rfc8478#section-3 That's about as high-quality as you can expect from 32 bits. It's not, say, Fletcher-32.
Two good options here: Brotli gets within 0.6 % of the density of lzma, zstd within 5 %. Brotli is 5x faster than lzma and zstd 9x. I'm somewhat surprised that some people migrated from lzma to zstd, without considering brotli.
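Anyone curious can sanity-check those trade-offs on their own data; the sketch below assumes the third-party zstandard and brotli packages (lzma is in the standard library), and the exact ratios depend heavily on the corpus and the levels chosen:

    import lzma
    import time

    import brotli      # third-party: pip install brotli
    import zstandard   # third-party: pip install zstandard

    data = open("corpus.bin", "rb").read()   # hypothetical test corpus

    def bench(name, compress):
        t0 = time.perf_counter()
        out = compress(data)
        print(f"{name:>8}: {len(out):>12} bytes in {time.perf_counter() - t0:6.2f} s")

    bench("lzma",   lambda d: lzma.compress(d, preset=9))
    bench("brotli", lambda d: brotli.compress(d, quality=11))
    bench("zstd",   lambda d: zstandard.ZstdCompressor(level=19).compress(d))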
While weight is a factor in professional car racing, and while I've built a go-kart that is significantly lighter in weight than a professional racing car, it is not a particularly impressive achievement that I have managed to do so, and I wouldn't think to criticize a race team for the weight of their vehicle.
Good thing we’re discussing JavaScript library sizes and not cars in random contexts, then.
77kb gzipped is massive considering it’s only doing one thing. If I want my website to load in ~100 milliseconds or less (or even 1 second or less!), I absolutely do need to pay attention to all the libraries I add.
I can and do criticize software developers for making hideously bloated websites because they don’t pay attention to what they add. Not only are a lot of modern websites wasteful, they’re painful or outright useless on slow mobile connections—not a problem for software developers on fibre networks, beefy dev machines, etc.
So it turns out that the use case for a polyfill of a 77KB image decoder isn't particularly suited to a site you want to load in sub-100ms. Oddly, though, that's not the only use case in the world, and there are circumstances where saving ~30% on every image load turns out to be significantly more efficient than not loading a 77KB JS library.
In other news, I also chose to forgo adding a 15 lb. fuel pump to my go-kart, despite the fact that every NASCAR team in the world uses one, and my go-kart drives just fine. I should go tell the NASCAR teams that fuel pumps are a terrible idea. I clearly know something they don't.
I do know it's a niche use case. And I for sure wouldn't recommend that every man and his dog use the polyfill. I'm not even sure it's right for the scenario I'm thinking of. But I can definitely contrive a case where it is.
I haven't benchmarked this specific thing, because I've never used it. But I might, because it sounds like a fun thing to do.
There's a lot of images online that are 1MB. If you're, say, displaying a gallery of photos, you may have saved more in bandwidth using this library plus several smaller images.
> FLIF is based on MANIAC compression. MANIAC (Meta-Adaptive Near-zero Integer Arithmetic Coding) is an algorithm for entropy coding developed by Jon Sneyers and Pieter Wuille. It is a variant of CABAC (context-adaptive binary arithmetic coding), where instead of using a multi-dimensional array of quantized local image information, the contexts are nodes of decision trees which are dynamically learned at encode time.
I wonder if tANS [0] could be used instead of arithmetic coding for (presumably) higher performance. Although I guess the authors must be aware of ANS. Would be interesting to hear why it wasn't used instead.
They have a demo using a prototype browser polyfill so you could arguably use it right now at the cost of slightly higher CPU usage. Depending on your data connection it could even end up using less battery on mobile due to reduced radio usage.
I want to root for it, but I think the LGPL license ruins it as long as there are BSD- or MIT-licensed alternatives that are good enough. Firefox might implement it, but I think there's zero chance that Chromium or Safari add support.
I think that LGPL for the encoder is exactly the right choice. A format's strength is in uniform support; taking an MIT-licensed encoder and making an improved incompatible format won't be great for end users.
Not explicitly, but in practice, it does. It is often not viable to jump through the hoops required to do it. That is, if your lawyers will even let you try.
Having it as a dynamic library is not enough. It has to be a dynamic library that you can replace. This doesn't work when, for instance, distributing through many app stores.
They could conceivably buy a different license from the author, if the author were to be interested in that, and if the author is the only contributor and didn't derive their code from other LGPL code, etc, etc.
I use this to store archives of scanned documents. The last thing I want is to scan something only to later find some subtle image artifact corruption (remember that case of copy machines modifying numbers by swapping glyphs?). I store checksums and a static flif binary along with the archive. It's definitely overkill, but a huge win compared to stacks of paper sitting around.
My intuition was informed by choosing FLAC for my music collection ~15 years ago, and that working out fantastically. If a better format does come along, or if I change my mind, I can always transcode.
The issue with copy machines modifying glyphs isn't a problem with all algorithms; really, only that one. Instead of just discarding data like a lossy algorithm, it would notice similar sections of the image and make them the same.
Yeah, I'll admit that specific example wasn't the most relevant. Really I just want to be able to scan papers and then be confident enough to destroy them without having to scrutinize the output. Rather than committing to specific post-processing I settled on just keeping full masters of the 300dpi greyscale. Even at 5M/page, that's just 100GB for 20k pages.
I don't think PNG provided meaningful compression, due to the greyscale. If FLIF didn't exist, I certainly could have used PNG, for being nicer than PGM. But using FLIF seemed like a small compromise to pay for going lossless.
JPEG would have sufficed, but JPEG artifacts have always bugged me. I also considered JPEG2000 for a bit, which left me with a concern of how stable/future-proof are the actual implementations. Lossless is bit-perfect, so that concern is alleviated.
With the reference encoder licensed under LGPLv3, I doubt any browser team will be able to incorporate this work into their product. They would need to do a full clean room reimplementation simply to study it (since GPLv3 seems unacceptable to them, and LGPLv3 can’t coexist with GPLv2, and so forth and so on). It’s really unfortunate that the FLIF team chose such a restrictive license :(
EDIT: Their reference JS polyfill implementation is LGPLv3 as well, which may further harm adoption.
The decoder-only version of the reference implementation (libflif_dec) uses the Apache 2.0 license (specifically for this reason, I'd assume). Browsers shouldn't need to encode FLIF images very often, so decoder-only would be fine for that use case.
Not even the most restrictive copyleft license hampers evaluation. You are literally free to do whatever you want with the program on your own hardware. It's only when you start redistributing that the license kicks in.
What degree of rewriting would be necessary to neutralize the LGPLv3 license restrictions? Would I be sued if I used the same logic flow but handcrafted every statement from a flow diagram generated from the source code?
If I study the source and then make a single change to my own algorithm to incorporate the secret sauce in order to test its efficacy using my existing test suites, have I infected my own codebase with LGPLv3?
How can I test this code in the real world with real users if I’m not allowed to redistribute it? Would I be required to pay users as contractors to neutralize the redistribution objection?
Etc, etc.
EDIT: Neutralizing LGPLv3 would be necessary to combine this code with GPLv2 code and many other OSI-approved open source licenses, which is why that particular line of reasoning is interesting to me.
Your question makes no sense. If a license is so restrictive to you that you can’t deploy it to testers, why would you be wanting to evaluate it in the first place?
If you’re not distributing the result to other people, literally nothing you described matters at all.
As for integrating ideas, as long as you don’t copy actual lines of code, simply learning ideas and techniques from any OSS doesn’t cause license infection.
That’s not evaluating a codec though, is it? You’ve gone far beyond the scope of this thread.
A web browser doesn't need to "evaluate the efficacy" of an image codec. It either supports it or it doesn't, and in the former case it only needs to decode, which means it only needs to incorporate Apache 2.0 licensed code.
Like I said: they shouldn't need to encode images very often, and that doesn't seem like all that much of a common use case (nor would it preclude a browser from implementing decoding and just not supporting FLIF for encoding-dependent functionality like that).
It does not seem to be consensus to me, just speculation. And even if it were consensus, there is a way to link statically with LGPL and still respect the license.
Could you please provide supporting evidence for your view that it is consensus? Where is the documentation explaining your “static link” proposal and why it's been endorsed by lawyers as a permitted approach? What was the response from the GPL team when they were asked to comment on this circumvention of their license? Will they be patching GPLv3 to correct it?
> Could you please provide supporting evidence for your view that is a consensus?
Consensus is what is admitted by most actors, especially including the main interested actor: the FSF and the GNU project. And I do not remember reading any communication from the FSF or the GNU project saying it is forbidden.
Currently the license text is very explicit about what you can do and not do if you link statically.
> 0) Convey the Minimal Corresponding Source under the terms of this License, and the Corresponding Application Code in a form suitable for, and under terms that permit, the user to recombine or relink the Application with a modified version of the Linked Version to produce a modified Combined Work, in the manner specified by section 6 of the GNU GPL for conveying Corresponding Source.
Which means: you have to provide a way for your users to relink their application statically against a new version of the library. Meaning, you have to provide a tarball with your object (.o) files if someone asks for it.
So, for macOS — where static executables aren't viable — you would need to ship all pre-linking components and then let the user assemble them.
You could ship those elements in a signed DMG, but it would be up to the user to assemble them, and so the user would not be able to codesign them under your developer ID.
Since they would not be able to compile and run the app as your developer ID — it would have to be theirs, since they're the ones linking it — the app won't have access to any ID-linked features such as iCloud syncing, Apple push notifications services, and so forth.
So that would imply that it's outright impossible to use this "static linking" bypass on MacOS when code signing is enabled, since you can't falsify the original signature's private key in order to sign the executable and gain access to signature-enforced platform capabilities.
I guess it would probably work fine for Linux folks, and you could always sign it with your own ID, but this certainly is not "compatible with signed executables" — to answer the original question asked upthread.
Safari is by Apple, who seems to have decided to deprecate and replace every open source component that relicenses as v3 (such as OpenSSL and bash); their chances of allowing their team to come into contact with this work would be particularly poor as a result, not to mention their competing HEIF product.
Firefox is by Mozilla, and the browser team ships code under MPL2 (afaik) which permits dual-licensing with GPL2 - but not, as far as I can work out personally, either variant of GPLv3; is MPL2 permissible to dual-license with LGPLv3, or would they be required not to implement an encoder due to incompatibility? (Mozilla seems invested in the AV1 codec, so it’s safe to assume they would be interested in a lossless frame encoder with higher efficiency than lossy options.)
Chromium seems to accept any mix of BSD, (L)GPLv2, and (L)GPLv3 at a brief glance at their codebase, which is quite surprising. (I wonder if shipping Chrome Browser knowingly includes libxz under the GPLv3 terms. If so, that ought to have certain useful outcomes for forcing their hand on Widevine.)
None of these questions would be of any relevance with a less restrictive license, whether BSD or CC-BY or even GPLv2.
I don't think I agree; there are plenty of lossless video and audio formats that have had traction in the archival and library worlds. Sure, that's not nearly as big an audience as something like YouTube or Instagram, but it's not nothing.
I could totally see a universe where something like FLIF becomes a de-facto standard in image archiving, particularly for extremely large images. 33% smaller than PNG is a measurable savings in image archiving.
I forgot to mention, there are plenty of non-archival formats that get traction even with incomplete browser support. The MKV container format is only partially supported in browsers (with WebM), but it's still a popular format for any kind of home video.
I wonder how it compares with simply LZMA'ing (i.e. 7zip) a BMP. In my experience that has always been significantly smaller than PNG (which is itself a low bar --- deflate/zlib is a simple LZ+Huffman variant which is nowhere near the top of general-purpose lossless compression algorithms.)
Along the same lines, I suspect BMP+LZMA would likely be beaten by BMP+PPM or BMP+PAQ, the current extreme in general-purpose compression.
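A quick way to try the comparison yourself, assuming the third-party Pillow package and a hypothetical test image; the raw-pixels-plus-LZMA path stands in for "BMP+LZMA":

    import io
    import lzma
    from PIL import Image  # third-party Pillow

    img = Image.open("test_image.png").convert("RGB")   # hypothetical test image

    # PNG: per-row prediction filters + DEFLATE (32 KB window).
    png_buf = io.BytesIO()
    img.save(png_buf, format="PNG", optimize=True)

    # "BMP+LZMA" stand-in: the raw pixel bytes pushed through LZMA.
    lzma_size = len(lzma.compress(img.tobytes(), preset=9))

    print("PNG :", len(png_buf.getvalue()), "bytes")
    print("LZMA:", lzma_size, "bytes")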
It shouldn't be that surprising. PNG has a maximum window size of 32KB. That means you could use a small set of identical tiles to make an image, and PNG would have to store a new copy every row, because the previous row is out of range.
Sorry to ask a very amateurish question, but how is lossless compression of an image different from regular run-of-the-mill compression (zip, 7z)? Is there any sort of underlying pattern or feature unique to image data that is leveraged/exploited for lossless image compression?
Yes. If you examine the PNG format, it actually uses the pixels around a pixel to predict its value and compresses the difference, which is much closer to 0. It then uses zlib to compress, just like gzip.
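A toy illustration of that prediction step, assuming the third-party Pillow package and a hypothetical input image: apply PNG's "Sub" filter (each byte minus its left neighbour) by hand and compare how well zlib does with and without it.

    import zlib
    from PIL import Image  # third-party Pillow

    img = Image.open("photo.png").convert("L")   # hypothetical image, greyscale for brevity
    w, h = img.size
    pixels = list(img.tobytes())

    # PNG's "Sub" filter: predict each byte from its left neighbour and keep
    # only the difference, which clusters near zero on smooth images.
    filtered = bytearray()
    for y in range(h):
        prev = 0
        for v in pixels[y * w:(y + 1) * w]:
            filtered.append((v - prev) & 0xFF)
            prev = v

    print("raw bytes, zlib'd     :", len(zlib.compress(bytes(pixels), 9)))
    print("filtered bytes, zlib'd:", len(zlib.compress(bytes(filtered), 9)))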
Pretty cool, but where are the links to the issues on mozilla, chromium, webkit, and edge's bug trackers to add native support for it?
As an unencumbered open source technology, it should breeze through legal's OK pretty quickly, and getting it integrated could certainly take a bit of time, but should just be part of the FLIF roadmap itself, if the idea is to actually get this adopted.
You don't set out to come up with "one format to rule them all" and then not also go "by implementing libflif and sending patches upstream to all the open source browsers that we want to see it used in" =)
I'd be interested in a format that can replace TIFF, without the files being quite so enormous. However, it seems like all the FLIF tools are assuming you're coming from PNG-land.
TIFF is a very hairy animal, but I'm happy to report that JPEG XL will be able to offer most of the functionality TIFF is currently used for:
- multi-layer (overlays), with named layers
- high bit depth
- metadata (the JXL file format will support not just Exif/XMP but also JUMBF, which gives you all kinds of already-standardized metadata, e.g. for 360)
- arbitrary extra channels
All that with state-of-the-art compression, both lossy and lossless.
BTW, I know this because I recently embarked on some photo import/edit/management scripts and wondered the same thing you did: "Why isn't PNG a thing yet??" There are reasons. A few, from minor to major IMO:
TIFF acknowledges EXIF and IPTC as first class data. PNG added EXIF data as an official extension a couple of years ago and I know ExifTool does support it but I'd want to check all applications in a workflow for import/edit/export support before trusting it.
TIFF supports multiple pages (images) per file and also multilayer images (ala photo-editing).
TIFF supports various colour spaces like CMYK and LAB. AFAIK, PNG only supports RGB/RGBA so for image or print professionals, that could be a non-starter.
So I get why PNG can't warm photographers' hearts yet. Witness still the most common workflows: RAW->TIFF & RAW->JPEG.
TIFF is a container format that allows you to add other features into the file. A good example of this is GeoTIFF, which is like a grid data file that also happens to be viewable as an image.
This is something I wish was used on the web. Imagine instead of creating a high compression image for use on a web page and then having a link for the full res one, you could just say "Load this image at 60%" and if users right click and save it would download the entire image.
It sure looks impressive. I think it's important to remember that comparing it to a lossy format will show its disadvantages. For example on this demo:
Introducing better compression of images and animations would be another small step fighting climate change. Less data, less transfer, less energy consumption!
Not necessarily, the decoding can be stopped once you get enough usable information in regards to the usage of the image (display size ...) with the same source image. That's neat!
> A FLIF image can be loaded in different ‘variations’ from the same source file, by loading the file only partially. This makes it a very appropriate file format for responsive web design. Since there is only one file, the browser can start downloading the beginning of that file immediately, even before it knows exactly how much detail will be needed. The download or file read operations can be stopped as soon as sufficient detail is available, and if needed, it can be resumed when for whatever reason more detail is needed — e.g. the user zooms in or decides to print the page.
I have already converted all my JPEGs to WebP and configured my camera app to save directly to WebP. I hope one day I will be able to switch to FLIF the same way.
If only it could focus the progressive download on areas of the picture, so it details faces in the early part of the file, and background in the later part.
Firstly, it's not a "problem" when code is released under the GPL. In many cases this is the best way to protect user freedom.
Secondly, this is the Lesser GPL, which means that only modifications to the FLIF implementation itself have to be free. It can still be linked in proprietary programs as long as they don't make any modifications.
It can be linked in proprietary programs even if they do make modifications, they just need to release the modified source code (of the LGPL library). The trickier obstacle is that it is required to be able to replace the LGPL library with another version in the proprietary program. I.e. the LGPL library must be dynamically linked, or the linkable compiled object code for the rest of the proprietary program must be provided so the program can be relinked statically.
> Firstly, it's not a "problem" when code is released under the GPL. In many cases this is the best way to protect user freedom.
Ah, thanks for the correction.
I went through a legal headache years ago releasing some software, and our lawyers strongly urged against *GPL and pushed us to Apache because it would have severely limited our opportunity with Fortune 500 companies. Apparently many prohibit anything with the letters GPL in the license, even though the software was free and we were charging for services. It was a long head-spinning debate, and due to mounting legal fees we went with Apache.
> That means anything that touches it becomes LGPL, right?
Wrong, mostly.
Any changes you make to the encoder library itself would be LGPL, but if all you are doing is calling the library from other code then that is not an issue.
If you make a change to the library, even as part of a larger project, nothing else but that change is forced to be LGPL licensed. If your update is a bit of code you also use elsewhere, then as long as you own that code it does not force the elsewhere to be LGPL: while you are forced to LGPL the change to the library, there is no stipulation that you can't dual-license and use the same code under completely different terms outside the library.
I would appreciate a reference implementation in Rust. Or, if not intended for immediate linking, in something like OCaml or ATS. Clarity and correctness are important in a reference implementation, and they are harder to achieve using C.
Best practices and rules in C++ change on a daily basis as the language is still evolving. On the other hand, C is much more readable for many programmers and researchers, even those with only a little programming experience. Moreover, C is more portable, which helps the reference implementation be quickly adapted for production or used from other compatible languages.
[1] https://en.wikipedia.org/wiki/Alliance_for_Open_Media