What’s the best lossless image format? (siipo.la)
215 points by pmoriarty on June 7, 2022 | 164 comments



JPEG XL is the standard to beat them all. Like how Opus became the definitive codec for lossy compression for the web.

JPEG XL covers nearly every use case for images, even replacing GIFs with its animation support.

Existing JPEGs can also be re-encoded losslessly into the JPEG XL format/data structure and get significant file size savings. This backwards compatibility is very important in transitioning the web to JPEG XL seamlessly.

What I'm most excited about for JPEG XL is progressive decoding. I had been following the FLIF project for a while and was disappointed with its lack of adoption. But its future found itself in JPEG XL.

Without a doubt, 5 years from now JPEG XL will become like Opus and set the standard for next generation image formats.


> Without a doubt, 5 years from now JPEG XL will become like Opus and set the standard for next generation image formats.

I wrote a little X11 screenshot tool last year; one of the reasons I wrote it is that I wanted to save images as WebP rather than the PNG that most tools use, since it's quite a bit smaller. Not that I'm running out of disk space, but it just seemed nice; one of those "no, it's not really needed but it's fun and why not?" weekend projects.
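(Just to illustrate the save step, a rough sketch with Pillow rather than my actual tool; file names are placeholders.)

    # Sketch: re-save a captured screenshot as lossless WebP and as PNG,
    # then compare the file sizes. Assumes Pillow; "screenshot.png" is a placeholder.
    import os
    from PIL import Image

    img = Image.open("screenshot.png")
    img.save("screenshot.webp", lossless=True, method=6)  # lossless WebP, slowest/smallest method
    img.save("screenshot_out.png", optimize=True)         # optimized PNG for comparison

    for path in ("screenshot.webp", "screenshot_out.png"):
        print(path, os.path.getsize(path), "bytes")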

Anyhow, I discovered that loads of stuff doesn't support WebP. This was unexpected, as it's been around for ages and is supported by almost everything, but things like uploading to GitHub, Telegram, and many other services don't work. It was disappointing and I had to add PNG support just for that.

I fear that JPEG XL will be in a similar situation in the future, where loads of websites can and do use it for their images, but also loads of services don't support it so it's a lot less useful than it could/should be. I hope I'm wrong.


You could make a similar case for audio. GP described Opus as "the definitive codec for lossy compression", which made me do a bit of a double take. Opus is absolutely the technically superior audio format, but when you want to give users a file you'll be sure they can play, you give them an MP3.

I expect JPEG and PNG to become for images what MP3 is today. And I expect H.264 to become that for video. These formats are too ingrained, whatever their technical limitations.

Technologists consistently underestimate the power of backwards compatibility. Itanium was better than AMD64, DVORAK is better than QWERTY, and I'm sure Esperanto is lightyears ahead of the monstrosity that is English. But backwards compatibility always wins, because it fits what the world was already designed around.

-----

(To be fair to GP, they said Opus is the definitive choice for the web, which may be true for integrated in-browser playback. But in other contexts, well, Amazon sells MP3s, not Opus files.)


> Itanium was better than AMD64,

No, AMD64 is far superior; Itanium's benefits never materialized. And BOTH offer backwards compatibility, anyhow.

> DVORAK is better than QWERTY,

Not by enough to matter except in extreme circumstances. I say this as a Dvorak typist.

> I'm sure Esperanto is lightyears ahead of the monstrosity that is English.

It's not. It is needlessly complicated. "Basic," "Simple," or "Technical" English (which came along a few years after Esperanto) is far simpler and quicker to learn.

> But backwards compatibility always wins, because it fits what the world was already designed around.

Besides all the above being incorrect, examples like the rise of ARM serve as a counterpoint.


ARM enabled a class of devices that x86 outright didn't support at the time (and largely still does not). In my eyes, that's the exception to the rule: when something is not merely incrementally better, but actually a difference in kind. That's what it takes to supersede backwards compatibility.

I said in my last post that I expect h.264 to become the MP3 of video files—but if that doesn't happen, I suspect it will be due to the format's lack of HDR support. That's also a difference in kind—something every consumer will immediately notice, if high quality HDR displays become common in the future.


> ARM enabled a class of devices that x86 outright didn't support at the time (and largely still does not).

What time, and which class of devices? Certainly it's true of very low-power embedded devices, but those aren't of particular interest... Embedded devices can easily switch between architectures without end users even noticing.

If we are referring to ARM being used for more computationally intensive devices... Like smart phones, tablets, and servers, then x86 has always had competitive options.

The earliest couple of Nokia Communicators were x86-based, not ARM. Years before they switched to ARM, the National Semiconductor (later AMD) Geode processors debuted and were surely competitive with ARM, having power ratings as low as 0.7 watts. Blackberry phones came years later, and iPhone and Android devices much later yet. The OLPC project using Geode CPUs in their XO-1 certainly proves their viability.


> It's not. It is needlessly complicated. "Basic," "Simple," or "Technical" English (which came along a few years after Esperanto) is far simpler and quicker to learn.

I'm sure it's easy to say that when you are already an English speaker. I personally doubt it very much. Is there some data maybe that supports this point of view?


> I personally doubt it very much.

It might help if you specified exactly what it is you doubt...

It's a simple fact that Basic English has a shorter wordlist than Esperanto, simple rules, and being based on English it omits things like gender for objects.

While perhaps exaggerated, the claim was: "it would take seven years to learn English, seven months for Esperanto, and seven weeks for Basic English."

Some references to start with:

https://en.wikipedia.org/wiki/Esperanto#Criticism

https://englishharmony.com/english-is-easy/

https://academickids.com/encyclopedia/index.php/Basic_Englis...


The first doubt was around the idea that a language with the pronunciation background coming out of the English language will be "easier" to learn than a strictly phonetic one.

Just looking over the list of words in your last link, I'm pretty sure that any number of them would trip up someone who is a blank slate. For example, "who", "wheel" and "humour" are vastly inconsistent with regard to how they sound. Having to learn these idiosyncrasies, even for 900 words, would be - in my opinion - more difficult than the basic rules of Esperanto (and Ido, which seems even simpler).

I would imagine that coupled with a phonetic transliteration of the common sounds it would be easier, but then the language would probably become a different thing altogether.

I hope that clarifies a bit.


As a non-native speaker, I think it's "easier" in the sense that many non-native speakers already tend to have some exposure to English through music, films, and other media. For Esperanto, this exposure is basically 0 unless you're a geek and watched that William Shatner Esperanto film.

The amount of exposure varies based on a number of factors, but anything is better than nothing.


I see, that makes perfect sense. But I wouldn't call the language itself "easier", but rather the process of learning it. However, I don't think the parent meant it like that.


> DVORAK is better than QWERTY

Not if you're a programmer: punctuation, which is much more frequent in programming languages than English, is a lot harder to type.

I've heard Colemak is better for developers, but as a vim user, relearning touch typing AND my vim muscle memory doesn't appeal to me, so I haven't given it a serious try.


In vim wouldn’t you just remap everything so that the physical keys stay the same in normal mode, and only use an alternative layout in insert mode?


There's too much cognitive overhead if you do this. For example, to change a word we use `ciw`, which roughly means "Change Inner Word" as a mnemonic. There are quite a lot of similar keys with mnemonics, which would prolly make no sense if you keep the QWERTY keypresses on a Dvorak layout.


I find such mnemonics only useful for initially learning how vim works. After a while it's all just muscle memory and I don't even think about how to do stuff, I just do it.


I was at that point several times (using Neo2).

The problem is that there is too much to remap, basically you need every letter on the keyboard. And if you keep positions, mnemonics do not make sense anymore. "Change Inner Word" would become "JC," (on Dvorak).

If you keep the letters (instead of the positions) all commands in muscle memory do not work anymore. I tried to relearn those ... but in the end it is too much effort for too little gain.

These days, whenever I get fancy I just switch the keyboard layout to Neo2 when writing longer texts and switch back when going back to command mode. (I guess there is a way to automate that switch, that would be interesting now that I think of it).


> because it fits what the world was already designed around

There is this quote from Edward Bernays that I like

> “The great enemy of any attempt to change men's habits is inertia. Civilization is limited by inertia.”- Edward L. Bernays, Propaganda


> But backwards compatibility always wins, because it fits what the world was already designed around.

Early Itanium included an x86 unit. Intel tried everything it could to push it.

> and I'm sure Esperanto is lightyears ahead of the monstrosity that is English

A lot of the complexity in English comes from active use and adapting to new concepts and integrating foreign words. Hopefully Esperanto will never be in a position to suffer the same issue and join the ranks of Itanium instead.


I don't think you're right about H264 for video. Video is big enough that there is real money to be made in better formats.


In 2020, 90%+ of online video was still served as H.264. And then you have to factor in storage cost. Remember, network bandwidth cost continues to drop with a clear roadmap ahead, while storage cost reduction, for both NAND and HDD, has many roadblocks. Storing additional copies of video in a different codec can be costly depending on volume.


>but when you want to give users a file you'll be sure they can play, you give them an MP3.

Or AAC-LC, which is so much better than MP3 while having 99.9999% compatibility. (It is pretty hard to find something that supports MP3 and NOT AAC.)


> Itanium was better than AMD64

I thought the consensus was that Itanium turned out to be a significant mistake and represented a misestimate of how we could utilise it in software?


Yeah, axiolite's reply caused me to read up more on the history of Itanium, and I think that was a poor example on my part.


> You could make a similar case for audio

It wasn't really my intention to "make a case"; just an observation/lament/complaint.


Oops, that was just a figure of speech, I didn't mean anything by it!


Apple didn't support WebP for the longest time. A bunch of fairly major sites started using them. While they worked in the browser, when downloading the image macOS couldn't open it. The same went for the video format.

I just tried downloading a couple and they seem to work now, but I've grown to hate WebP. Even if they open today, I likely won't keep anything saved as WebP, because I won't trust it will work down the road, where jpg and png will likely work for decades.


Apple could force the industry forward if they added Opus, FLAC, and JPEG XL support everywhere. Ok, they’re often the ones holding everyone else back in these areas.


Steve Jobs was asked a question related to this many years ago [1]. Older Jobs was asked as well [2].

The trouble is that you may want Opus, FLAC, and JPEG XL, but someone else wants a different set, and someone else wants yet a different set... then in 3 years everyone changes their mind. There is value to some stability and consistency to these file formats so our media is durable and broadly supported over time.

I'm not sure Apple is the only one holding this stuff back. I just checked out a JPEG XL site [3] and it only shows partial support and it's hidden behind a feature flag. As long as this is the current state of things, Apple isn't holding back anything. It's simply not ready for prime time.

I get the desire for open source standards used for media, but they never seem to be as broadly supported, and for your average user, that's all they care about. "Can I view my image? Can I listen to my music?" If the answer is no, they don't care about anything else. Most users won't know if they are looking at a PNG or JPEG XL, or listening to an m4a file vs Opus.

[1] https://www.youtube.com/watch?v=48j493tfO-o [2] https://www.youtube.com/watch?v=XmRNIGqzuRI [3] https://jpegxl.io/#tutorials


It's not too wrong to say Safari is the new IE, being the anomaly among them all.


Telegram supports WebP in a very irritating way; you can simply copy-paste and it works perfectly, but it assumes it is a sticker, and so it is shown sticker-sized and without the ability to zoom.


Can you share any links here of your tool?


Sorry, I haven't published it. At this point it (mostly) "works for me" but there are a bunch of things that don't (yet) and there are a bunch of caveats and problems.

Also, it's really not that interesting of a project as such, but it was/is interesting for me anyway in the sense that I now know more about image processing and X11 ("outdated", allegedly, but works fine for me, so whatever).


Also ticks the HDR box!

As I comment elsewhere[1] it also has an experimental fast mode in tree that is crazy fast, using AVX2, NEON, or fallback.

Hopefully Microsoft patenting the thing someone else invented (ANS)[2] doesn't gum this all up.

[1] https://news.ycombinator.com/item?id=31658871

[2] https://news.ycombinator.com/item?id=31657383


JPEG XL in lossy mode models SDR and HDR images using the same model (absolute color XYB). This reduces tweaking and makes it easier technically to create mixed SDR/HDR composites, animations or overlays. All other coding systems that I know of use a different coding method for SDR and HDR, meaning that the quality settings and other encoding parameters are slightly different for these modes.


Everything about JPEG XL sounds great, except that it takes about 34 times longer to encode an image compared to PNG.

Edit: Apparently there's an experimental faster version: https://github.com/libjxl/libjxl/tree/main/experimental/fast...


Depends, are you optimizing the PNG? Because cjxl is a hell of a lot faster for a much smaller file than zopflipng.
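If you want to check this on your own images, a rough sketch that just shells out to both encoders (assumes cjxl and zopflipng are on PATH; file names are placeholders):

    # Sketch: time lossless JPEG XL (cjxl -d 0) against zopflipng on one input.
    # Assumes both tools are on PATH; "input.png" is a placeholder.
    import os, subprocess, time

    def run(cmd):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        return time.perf_counter() - start

    t_jxl = run(["cjxl", "-d", "0", "input.png", "out.jxl"])   # -d 0 = mathematically lossless
    t_zop = run(["zopflipng", "-y", "input.png", "out.png"])   # -y = overwrite the output file

    print(f"cjxl:      {t_jxl:.1f}s, {os.path.getsize('out.jxl')} bytes")
    print(f"zopflipng: {t_zop:.1f}s, {os.path.getsize('out.png')} bytes")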


zopflipng is a very aggressive form of optimization.


If it takes that much longer to encode, then I am guessing it takes a lot longer to decode as well, which I am going to guess rules it out completely for image sequences, film archiving, VFX work, high-end photography, etc.


That is usually not true.

The time spent by high quality encoders generally goes into trying different settings to find the optimal ones... decoding should be unaffected.


It may take less time actually since it can be parallelized. PNG's big problem is it's serial.


In fact, Apple introduced its own extension chunk to PNG to enable parallel decoding, by splitting IDAT into two halves, putting a zlib sync flush between them, and recording the offset to the second half [1]. This kind of split is built into JPEG XL.

[1] https://www.hackerfactor.com/blog/index.php?/archives/895-Co...


Unless nobody adopts it?

EXR seems like it does all of the high end pro features of JpegXL and more (including 32 bit color). JPEGXL can do some odd stuff like animated gif type images but that seems more like a web end user feature than something professionals would care about. I can see it replacing Jpeg / GIF but I don't see this becoming an industry standard anytime soon.


I kind of hope that happens. First of all, still charging for the spec is ridiculous. So most people really have no idea how (over)complicated or long it is, except by trawling through a C++ reference implementation. E.g. AVIF does this much, much better. Do we really need to support images with 4096 channels in a web browser? And the name is going to be super confusing; it's too close to JPEG.


I used to think this, but I've lost confidence over the last year. I literally work on the software that generates jpg files on Google's Pixel phones, and there seems to be no realistic path to JPEG XL adoption within the next 5 years.


Why not? What’s stopping the next Pixel phone producing JPEG XL images the same way the iPhone switched to HEIC images a few years back?


To first approximation, nothing. And it's possible that we will at some point in the near future. The problem is consuming it. You won't be able to view the photo on device until Android adds it, or until the gallery and camera apps add custom software decoders. You won't be able to share with anybody until Facebook/Whatsapp/etc add support. You won't be able to view online until the browsers add support....


How does it relate to JPEG2000?

I may soon have to recommend a lossy but potentially lossless image format for 3D medical images (with 16-bit channels); would JPEG XL fit the bill?


The future isn't here yet unfortunately.

I imagine the doctors would view these images in a multitude of ways (browsers, operating systems, image viewers) with varying levels of support. So encoding in a standard that is yet to be entirely finalized and adopted is a bad move as of now.

You'd be best off still going with JPEG2000 if that's what you prefer, or DICOM.


Oh, adoption does not have to be that good. It is for researchers right now, to exchange results. So the only requirement is that it is possible to use it with a somewhat mature lib in Python.


I absolutely recommend JXL. It is developed by the JPEG Group as a successor to JPEG2000 and performs better in nearly every metric (big exception currently being encoder speed). JPEG2000 isn't really widely supported, but does have many more libraries available.

So unless you expect library availability and/or encoder speed to hurt adoption, go for JXL.
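Since you mentioned Python: a rough sketch of what that could look like via the imagecodecs package, which wraps libjxl (the keyword arguments here are assumptions to verify against your installed version):

    # Sketch: losslessly encode/decode 16-bit slices of a 3D volume as JPEG XL
    # via the imagecodecs package (assumed to be built with libjxl support).
    import numpy as np
    import imagecodecs

    volume = np.random.randint(0, 2**16, size=(64, 512, 512), dtype=np.uint16)  # stand-in data

    # 'lossless=True' is an assumption; check the keyword name in your imagecodecs version.
    encoded = [imagecodecs.jpegxl_encode(sl, lossless=True) for sl in volume]
    decoded = np.stack([imagecodecs.jpegxl_decode(buf) for buf in encoded])

    assert np.array_equal(volume, decoded)  # verify the round trip is bit-exact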


Thanks I'll give it a try!


You should use NIfTI, which is the de facto standard for 3D medical imaging.

It's not compressed, but it handles the voxel-image space transform, which is far, far more important and quite complicated.

You can slap gzip on top of it and most libraries/FIJI will transparently decompress it.


Isn't JPEG patent-encumbered technology?


JPEG XL is a different technology than the original JPEG format. (JPEG is also the name of the organization that contributed to both formats.) JPEG legacy's patent expired in 2006 although lawsuits continued afterward. JPEG XL is royalty free.


The standard supported by a whopping 9 (!) apps! Sorry, but prevalence of support matters too, and PNG can't be beat there.


This is very misleading.

Some of those 9 "apps" include imagemagick, ffmpeg, exiftool, libvips, and Qt KImageFormats.

These libraries power a vast amount of software. Support is also in beta for Firefox and Chrome.


So if you are on linux, practically every app that deals with images will work. On windows and mac, it's probably less likely.


I wonder why companies with multi-billion dollar budgets for each OS release are such laggards at building in support for formats that presumably would not change much between each release.


For one thing, delaying support for basic features lets one ship them as hot features for future launches and updates.

Case in point, Apple making a fuss last Monday over finally allowing users to change file extensions in CandyCrushOS^H^H^H^H^H^H^H^H^H^H^H^H iPadOS 16, a move some 51 years after mv was introduced to Unix on which that operating system is ostensibly based.


"So if you are on linux, practically every app that deals with images will work."

Linux has way more than 9 apps that deal with images.

So if you are on linux, practically every app that deals with images will NOT work with JPEG XL.


Their point was that most Linux apps (and a lot of Windows/macOS apps, actually) that deal with images will do so via one of the above libraries, and therefore will handle JPEG XL without even knowing what it is.


Yeah, but what is the file extension?

If it is .jpx, maybe.

Otherwise, no.

(Only partially kidding, unfortunately. Outcomes sometimes hinge on trivial details.)


It's .jxl, apparently. The wiki says so and jpegxl.io's online converter also produces .jxl-s.


If the format is meant to be the next de facto standard, I wish they had made the name more distinguishable. People who don't fully understand it will abbreviate it to JPEG as well and definitely cause confusion.


I’ve read up about JPEG XL in the past but a question just occurred to me: is it just an image format or is it also a document format? Can it replace TIFF as a container for multi-page documents/images?


No, it's not a document format. JPEG XL won't replace TIFFs by and large because it has a different use case.

TIFFs are more in-line with a combination of FLAC/WAV from the audio world. It's meant for raw data used in production/manipulation.

As a professional you would work with TIFFs/RAW photo formats and then master it in JPEG XL for mass-consumption.


I’m aware that it isn’t meant to replace TIFF, but merely having document/multi-page support wouldn’t do that (or at least, not for most pro usages). There’s a severe dearth of consumer-friendly multi-page static image containers that don’t start with the letters P and D and end with F, and it would be great if someone in a position of influence would suggest such a format (especially if it really were just “multiple page support” and not “everything plus the kitchen sink masquerading as a multi-page image/document format”).


How well does JPEG XL deal with color spaces? I don't know much about them, but understand that not all color spaces are created equal, and data gets lost in conversions.


JPEG XL bitstreams always have a defined color space, optionally described by an embedded ICC profile. They are a part of bitstreams instead of containers [1] because they are considered necessary to reproduce the exact output. When converting from file formats with an ambiguous color space (e.g. PNG without color space information), cjxl assumes sRGB as far as I know; you can give any display color space to djxl.

[1] A JPEG XL file can be a minimal bitstream or an ISOBMFF container, where the latter can contain the actual bitstream and additional metadata like Exif. The bitstream itself should be enough for decoding.


Does lossless JPEG XL support 16-bit images?


JPEG XL supports 24-bit integers/32-bit floats per channel.


I gave cjxl a 16-bit depth PNG, told it to use lossless encoding, and it gave me a 16-bit depth JXL. Seems so.


> Opus became the definitive codec for lossy compression for the web.

never heard of Opus for images... must have been sleeping


:P I assumed anyone reading it would know Opus is an audio codec. I was comparing the paradigm of where I see image codecs going with a similar one that happened in audio.


They are also common in that they are effectively two different codecs combined.


If your dataset allows it, I would look at pngquant for "lossy" compression of PNG images by means of color quantization: https://pngquant.org/

This works absolute wonders if you are talking about images with a limited amount of colors and without many gradients. I use it to compress screenshots of websites, which, if you think about it, are mostly large blobs of uniform colors with hard edges. I also use it every time I want to include a screenshot from my desktop in some publication or email. The savings in filesize are too good to be true without any apparent loss of visual fidelity.
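A rough sketch of wiring pngquant into a script (assumes it's on PATH; the file names and quality range are placeholders):

    # Sketch: quantize a screenshot down to a 256-color palette with pngquant.
    # Assumes pngquant is on PATH; file names and the quality range are placeholders.
    import os, subprocess

    subprocess.run(
        ["pngquant", "--quality=65-90", "--force", "--output", "shot_small.png", "shot.png"],
        check=True,
    )
    print(os.path.getsize("shot.png"), "->", os.path.getsize("shot_small.png"), "bytes")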


Still hoping for vector desktop screenshots, now that modern GUIs are all vector-based.* 11 years on, gtk-vector-screenshot still stands alone, though we do now have a handful of tools for rendering webpages to SVGs.

*Add this to Category:Articles_with_obsolete_information: https://en.wikipedia.org/wiki/Vector-based_graphical_user_in...


Mac OS X had that for a while, years ago. Mostly I think it was misleading and aimed at print magazines.


I wonder how fpng[1] would compare - a lossless, speed-optimized PNG compressor. Oh! The author Rich screenshotted[2] a really nice deck[3] that also boasts fpnge[4] and fjxl[5], both of which are even faster; both use AVX2 (or NEON for FJXL).

Notably fjxl is in the reference libjxl tree. I'm not sure what, if any, constraints fast mode has, or what it takes to use it. The article uses cjxl, which does not run the fjxl fast mode, and further, fast mode was added after this article was written[6].

Other notes on jpeg-xl: it has progressive rendering. It has hdr!

The deck on FJXL & FPNGE is epic. Thanks, Quite OK Image (QOI) format, for lighting the fire under all of this!!

[1] https://github.com/richgel999/fpng

[2] https://twitter.com/richgel999/status/1485976101692358656

[3] https://www.lucaversari.it/FJXL_and_FPNGE.pdf

[4] https://github.com/veluca93/fpnge

[5] https://github.com/libjxl/libjxl/tree/main/experimental/fast...

[6] https://github.com/libjxl/libjxl/pull/1124


There is also a new experimental Fast Lossless mode in JPEG XL, but it was developed after this post was written

https://github.com/libjxl/libjxl/tree/main/experimental/fast...


> but it was developed after this post was written

The article literally compares it to others. It might not have been finalized at the time of writing, but it's not like the author was unaware of it.


I think you misunderstand pizza; the article compares JPEG XL, but it doesn't compare the new lossless coder implementation (fjxl/fast_lossless), which is a separate tool/binary from cjxl.


> What is the best lossless image format?

For which application? Otherwise you can only say "depends". Tiff is good if you want to save data fast to disk in Blender. EXR is great if you want 32bit color depth, etc.


> Tiff is good

Mind that TIFF is only a container format that can contain any arbitrary codec you like; it can be uncompressed pixels, RLE compressed, it can be a PNG, a JPEG XL, even lossy JPEG or WebP if you'd like.

Saying your image format is TIFF is roughly as useful and conveys as much information as saying your video format is MKV.


Good point. What I wanted to say is that what is "the best" depends a lot on the application. Maybe you don't care about compression/size, but about write speed. Maybe read speed is something you care about; maybe you do actually care about size, but compatibility with certain clients is even more important; maybe you care a lot about how long decompression takes, how much strain it puts on the CPU or the memory, etc.

So it is certainly not as simple as just going for the thing that produces the smallest file in some benchmark that may or may not fit your real-world data.


In that regard, LZW has been the standard lossless TIFF compression format I've seen.


Usually with TIFF people mean uncompressed.


> For which application?

For this article the context is for use in browsers, and the main metric being measured is compression ratio.


I think decompression times would be a relevant criterion, but they're surprisingly not there...


Yes please. Bandwidth is relatively cheap, when do we start optimising for compute instead? Would be great to have some kind of metric that would allow us to better understand efficiency (e.g. effort/energy/time) for the use-span of a given image.

... something that might tell us, given a 3s encode with available technology, at what number of over-the-wire loads / decodes does it become more efficient to use a cheaper codec? Slightly less relevant with lossless, but still.


I found that strange too. An image is encoded only once but decoded potentially millions of times.


Quite, and encoding speed, which is included, seems almost irrelevant


Encoding speed can be very important, especially if you are seeking to decode the resulting image as quickly as possible at all times. Certain image formats are wildly better real-time containers than others. Encoding speed is irrelevant unless it's very low or very high. Most people think of amortizing the encode against the decodes and don't even consider what would happen if you took it to the other extreme.

Did you know that, with the correct SIMD code, you can encode a 1080p bitmap into a 4:4:4 JPEG within 10 milliseconds on a typical consumer PC? Encoding times like this open up an entire dimension of new possibilities.

I do have a side project where I use libjpegturbo to synchronize a frame buffer between server & web client (i.e. 60 FPS real-time). HN may frown on using [m]jpeg for real-time applications today, but in my experience it has proven to be a very robust path. Bandwidth is really the only tradeoff (and it is a big one). This is the very last thing Netflix would want to do (short of sending RAW files to your TV), but it does provide some very compelling attributes in other areas such as streaming gaming.

Our networks are only getting faster over time. Intraframe video compression techniques are really elegant in their simplicity and resilience to motion-induced artifacts (as seen with x264, et. al.). They also can provide lower latency and easier recovery for clients (since every frame is a keyframe).

For real-time applications, I'd happily trade some extra bits per second to have a final result sooner. Especially if it looked better.
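For reference, a rough sketch of that kind of encode step using the PyTurboJPEG bindings to libjpeg-turbo (not my actual code; the keyword names are worth double-checking against the version you install):

    # Sketch: encode an in-memory frame buffer to 4:4:4 JPEG with libjpeg-turbo.
    # Assumes the PyTurboJPEG package; the frame here is random stand-in data.
    import time
    import numpy as np
    from turbojpeg import TurboJPEG, TJSAMP_444

    jpeg = TurboJPEG()
    frame = np.random.randint(0, 256, size=(1080, 1920, 3), dtype=np.uint8)  # BGR frame

    start = time.perf_counter()
    buf = jpeg.encode(frame, quality=90, jpeg_subsample=TJSAMP_444)
    print(f"{len(buf)} bytes in {(time.perf_counter() - start) * 1000:.1f} ms")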


Additionally, it should be mentioned how crazy imaging tech has gotten lately. USB 2 at 480 Mbit/s is at the physical limit, near the speed of light. This is why it easily fails in reality if you have the slightest disturbance. Faster buses like USB 3 basically use more lines, even if we still call it serial...

If you were to transfer raw image data (RGBA), what would you get for full HD images? Around 7 fps? I don't even want to see the bus for modern smartphone cameras that now seem to casually have around 100 megapixels...

So very fast "on-site" encoding is pretty much mandatory today...


What do you think about MJPEG versus RFB (the VNC protocol)? I know that in modern VNC implementations, at least unofficial ones, JPEG is one of the available encodings. Still, IIUC, the big difference is that with MJPEG, the encoder sends a complete JPEG image for each frame, while this doesn't need to be the case with RFB.


I didn't know RFB was a thing. I will be looking into this.

What I really want is to be able to send an initial jpeg keyframe to the client, and then use some application-level heuristic to determine if I should then send a delta-encoded frame (what to change based upon the prior) or another keyframe. For 90%+ of my frames, using a prior frame and indicating deltas would be far more efficient than sending a keyframe each time. Maintaining the previous frames on the server per-client is not a huge deal and would not impact latency if done properly.

I wonder if there is some way I can use jpeg itself to half-ass this sort of approach in the browser. Maybe... encode keyframes at ~80% quality and delta frames at 100% (still not guaranteed, I know). Maybe canvas compositing or simple buffer tricks with bitmaps could get ~95% of the way there. For my application, the pixels don't have to be perfect or accurate to any particular standards.
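For the heuristic, something like comparing consecutive frames in fixed-size tiles could work; a rough sketch of just the change-detection part (not a wire format):

    # Sketch: find which 64x64 tiles changed between two frames, as a heuristic
    # for "send a delta" vs "send a fresh keyframe". Frames are uint8 numpy arrays.
    import numpy as np

    TILE = 64

    def changed_tiles(prev, curr, threshold=0):
        h, w = curr.shape[:2]
        dirty = []
        for y in range(0, h, TILE):
            for x in range(0, w, TILE):
                a = prev[y:y + TILE, x:x + TILE].astype(np.int16)
                b = curr[y:y + TILE, x:x + TILE].astype(np.int16)
                if np.abs(a - b).max() > threshold:
                    dirty.append((x, y))
        return dirty

    # If most tiles come back dirty, sending a whole new keyframe is cheaper than deltas.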


In case you're interested, here's a permissively licensed server-side implementation of the protocol, including all major codecs, in C: https://github.com/any1/neatvnc


Why is encoding speed irrelevant?

I'm currently in the process of scanning and archiving lots of photographs as high-resolution 16-bit PNG's and encoding speed is definitely a significant influence on how long this process takes.


You missed the "almost".

The vast majority of use-cases and users barely care about encoding performance. Your use is in the tiny minority. That isn't to say that you don't have a use-case, just that because of how rarely encoding speed matters, it truly is "almost irrelevant" for 99.99% of users 99.99% of the time.

--------------

It definitely would be nice if JPEG-XL also had good encoding performance, though. There's an experimental fast encoder in development currently[1], although before that lands, it would be nice if your scanning tool supported a workflow somewhat like "scan in parallel with image encoding, present encoded images to user, and allow them to asynchronously go back and re-scan/encode/review pages that came out badly".

[1] https://github.com/libjxl/libjxl/tree/main/experimental/fast...


It’s common enough to create an image on the fly and send it to a single user to be viewed once.


Because it's a one-time cost and you can cheaply throw more CPU power at the problem, or just wait and then you're done.

If you're doing this at such a scale that encoding speed matters (like if you're Instagram) the major concern would likely be cost, which you're going to save in bandwidth every time an image is viewed.


"you can cheaply throw more CPU power at the problem, or just wait and then you're done"

How am I going to "throw more CPU power at the problem" when I'm using a scanner hooked up to my laptop? Should I buy myself a new, more powerful laptop? That doesn't sound cheap.

As for "just" waiting... my time is valuable, and I have better things to do with it than wait... and, unfortunately, the process of scanning and encoding needs a lot of babysitting from me, so I can't just dump a bunch of photos in my scanner and come back when it's all over if I want even half-decent results, not to mention archival quality images.

In this real-world scenario, a quicker encoding speed would save me a lot of valuable time, and there's no getting around it short of hiring a skilled and trusted person to do it for me.


As Tomovo already said above, your workflow would benefit from using an intermediate format that is faster to encode. You can then automate the conversion to the output format to occur unattended at a more convenient time.

The use of intermediate formats is a well proven technique in video editing where encoding real time to AVC/HEVC at high quality is not possible. Codecs like ProRes are used that are much easier to encode to at the expense of storage space.


Save uncompressed to a big cheap harddrive, batch compress to PNG overnight?
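Something along these lines would do the overnight pass, assuming Pillow and a folder of uncompressed TIFF scans (paths are placeholders):

    # Sketch: batch-convert uncompressed scans to maximum-compression PNG overnight.
    # Assumes Pillow; "scans/" holds uncompressed TIFFs and "archive/" is the target.
    from pathlib import Path
    from PIL import Image

    src, dst = Path("scans"), Path("archive")
    dst.mkdir(exist_ok=True)

    for tif in sorted(src.glob("*.tif")):
        out = dst / (tif.stem + ".png")
        Image.open(tif).save(out, compress_level=9)  # slowest but smallest zlib setting
        print(tif.name, "->", out.name)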


How many megabytes per second is your scanner feeding you? I'm very surprised that encoding a PNG is slower than scanning.


It's not slower than scanning, but I still spend a significant amount of time waiting for the maximum compression PNG encoding to complete.


Don't use maximum compression in this use case. Maximum (Zopfli-based?) is for those who will serve the image hundreds of times over the Internet while users are waiting for it. It is not cost effective or practical for storage-only use.


If it's not encoding in the background so it can be done within a second or so of the data finishing, then it's just programmed wrong, it's not an encoding speed problem.


Not a bad choice - all that really matters in those cases anyways is what your scanner is capable of


Can I ask why you chose PNG for your archive format?


It's lossless, offers compression, works with 16-bit images, and is well supported in a lot of software on all sorts of platforms.


Interesting, thanks. I'd heard of PNG being used for this before and thought people had lost the plot.

For context I started a neg scanning project a few years ago, ~150 rolls on an epson v850 (flatbed), and stuck with the default compressed tif after some surface research. There is little chance I would consider another processing step .. unless it was batched on a server somewhere! Very painstaking process, which has been on indefinite hold for a small while now (waiting for the AI to catch up).


I guess this would be a good time to plug a small project I worked on: https://github.com/rben01/collagen . I got tired of seeing how crappy text looked when placed on a JPEG (due to the compression artifacts) and how large some PNGs were because they had a small JPEG-like area, so I made a little CLI program that solves that problem. Rather than settle on any particular image format, it allows the user to provide a collection of inputs -- images (in any format your browser supports), SVGs, shapes, text -- and bundles them into a single SVG file (it base64 encodes the images). So if you want to scale a PNG and place it on top of a JPG, you... just do that, with SVG transforms and whatnot. It produces a single file and is a lot easier than writing out the SVG yourself, and is nearly as easy as opening up a tailor-made image editor and doing it there. And it's really, truly lossless, as it doesn't touch any of the input data.

In theory, you wouldn't even transmit the SVG directly; you'd zip up the components and send them across the network and then encode them into an SVG on the other end, which would mitigate some of the size increase due to base64 encoding.
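The core trick is small enough to sketch; embedding a raster image into an SVG as a base64 data URI looks roughly like this (a toy illustration of the idea, not Collagen's actual code):

    # Sketch: wrap an existing PNG losslessly inside an SVG via a base64 data URI,
    # with a text element composited on top. A toy version of the idea only.
    import base64
    from pathlib import Path

    png_b64 = base64.b64encode(Path("photo.png").read_bytes()).decode("ascii")
    svg = f"""<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"
         width="800" height="600">
      <image x="0" y="0" width="800" height="600"
             xlink:href="data:image/png;base64,{png_b64}"/>
      <text x="20" y="40" font-size="32" fill="white">Caption text</text>
    </svg>"""
    Path("bundle.svg").write_text(svg)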


I would love to see an estimation of how small real world images would get if we could calculate their Kolmogorov complexity.

Aka if we could find the shortest program that produces the image as output.

Some images obviously would become very small. For example an image that is all blue, a spiral, the flag of the United States, the Mandelbrot set ...

But what about the average portrait for example?
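As a toy illustration of the "all blue" case, the generating program is just a couple of lines, while even a losslessly compressed PNG of its output is far larger than the program text (this says nothing about computing Kolmogorov complexity in general; assumes Pillow):

    # Sketch: the "program" for an all-blue 1920x1080 image is essentially one line,
    # while even a losslessly compressed PNG of it is far larger than the program text.
    import os
    from PIL import Image

    img = Image.new("RGB", (1920, 1080), (0, 0, 255))
    img.save("blue.png", optimize=True)
    print("PNG size:", os.path.getsize("blue.png"), "bytes")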


You cannot calculate Kolmogorov complexity in all cases because that would involve solving the halting problem.

It is worth pointing out that certain formats like AVIF do sort of do this: they use a few different algorithms, see which combination of algo/block split removes the most entropy, and then compress _that_. In practice, this is pretty good for most non-noisy images.


Wouldn’t you want entropy to increase when you compress?


Entropy can be considered on a per-byte or per message basis, and it seems that people who are talking about data at rest (compressed pictures, archives, etc) almost always talk about the per-message numbers. I don't think I've ever seen Kolmogorov complexity applied to a bitstream. It's always been per message.

In which case entropy is exactly conserved, unless the encoding is lossy.
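A quick sketch of the per-byte view: computing the order-0 byte entropy of a file before and after zlib compression shows the compressed bytes looking much closer to uniform (higher entropy per byte), while for lossless coding the total information content is unchanged:

    # Sketch: order-0 Shannon entropy in bits per byte, before and after zlib.
    # "image.ppm" is a placeholder for any uncompressed input file.
    import math, zlib
    from collections import Counter
    from pathlib import Path

    def entropy_per_byte(data):
        counts = Counter(data)
        n = len(data)
        return -sum(c / n * math.log2(c / n) for c in counts.values())

    raw = Path("image.ppm").read_bytes()
    packed = zlib.compress(raw, 9)
    print(f"raw:        {entropy_per_byte(raw):.2f} bits/byte over {len(raw)} bytes")
    print(f"compressed: {entropy_per_byte(packed):.2f} bits/byte over {len(packed)} bytes")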


The term is a mess in computer science. The standalone definition almost works in reverse to the definition in thermodynamics. There is also entropy coding, which is compression through a set of prefixes, and entropy compression, which is about the termination of processes. I am sure there are many more...

The most practical approach is to read the meaning from the respective context and sometimes to ignore the sign...

That said, I believe the parent is correct. Compressed data certainly has a higher entropy. But any additional bit from an image or sound is rarely independent in the first place, as Shannon required for his definition of entropy. That they are not independent is precisely the attack vector of many forms of compression. For example, the base frequency of a JPG has more entropy, if we use the literal definition, than the higher frequencies that get removed because they lack entropy in an information-theoretic sense.


If you're doing some kind of "pre-compression" stage then you want to reduce entropy as much as possible so as to make the "real" compression more effective.


Yes I’ve often thought of this too. Also a lot of images are screenshots of text or UIs.


I wonder how much could be achieved just using a vector format with basic programming constructs like for loops.


If you want something that is going to be around for a while and has a LOT of features you probably do not need, then you are probably going to want to go with .DPX or .EXR, which are the standard for the VFX industry as well as film in general. On the consumer side, I guess Adobe .DNG might not be bad either, or for maximum compatibility maybe a TIFF.

For video I personally like Cineform, but both ProRes and DNxHD are more widely used.

EXR will get you full 32-bit float if needed.


How do EXR and JXL compare to each other, feature-wise?


I have never used JXL honestly - at no job I have had was it ever even considered. Given the JPEG in the name, I doubt any pro studio would consider it, given that the other options do everything needed. DPX is basically the older standard for film and EXR is the newer one.

edit: Just looked at the specs and JPEG XL is definitely marketed at least more as a web end-viewing file type. It does higher color depths (up to 16-bit), which is pretty good, but will not do 32-bit like EXR. Given that EXR is so widely used I would likely not consider anything else.


JPEG XL definitely supports 32-bit float samples, but cjxl is currently unable to read a 32-bit EXR [1] (compare with PFM [2] which is fully supported).

[1] https://github.com/libjxl/libjxl/issues/465

[2] http://www.pauldebevec.com/Research/HDR/PFM/


Got it. Well that is good at least.

Realistically I can see JPEGXL becoming a good DELIVERY format but I highly doubt any industries will be swapping to it for production use (or archiving) given there are very widely adopted formats and standards already that can do the same thing.

I'm almost tempted to say that trying to make ONE format that is meant for both (delivery and production or archiving, which are inherently opposites in terms of requirements) is a bad idea just from a marketing and ease of use standpoint. Sometimes it's nice to have two clearly separate technologies meant specifically for their purpose.


The default implementation has more than 1000× the dynamic range of the human eye or the best digital movie theaters. It would be possible to compress the highest quality photography with JPEG XL without introducing quality problems.

Likely you get 1/5th of the file size compared to EXR -- smaller file sizes can improve some web-based workflows -- for example when an advertising team is distributed, or partially in work-from-home mode.

Bringing the quality of professional photography to the masses is another thing. Today, mobile phones rarely have too much capacity and many users are fighting with their memory space getting full. There, uncompressed or poorly-compressed formats are not a realistic or cost-efficient option.


PNG vs JPG in a nutshell: https://i.stack.imgur.com/qJoKW.png


For web graphics, isn't a mix of PNG/JPEG good enough generally speaking?

I'd imagine the benefits of changing to something more modern/efficient aren't worth the cost/time involved other than in esoteric situations given that most end users have high bandwidth connections and bandwidth costs typically aren't too significant?


You might want to check your pagespeed insights to verify that. I noticed that our pagespeed score was very poor (due to google using new methods to calculate those scores). Adding webp versions of images reduced the size of those images significantly, and brought our pagespeed scores back to more reasonable values in google's eyes.


It’s unlikely that WebP was what helped there, but rather more lossy encoding (presuming it was JPEGs you were replacing). Yes, WebP is smaller for a given perception quality level, but it’s typically not that much smaller; you could have achieved results almost as good by reencoding to the same perception quality as you used on the WebPs but still as JPEGs.


One lossless png (a screenshot) was converted to (default web quality, which is lossy) webp, and visually it was identical. Original png was 151k, webp was 23k. Converting it to a jpg results in a size of 66k (equivalent quality to the webp).

So in this case webp was a third of the size of a jpg. Typically, webp will be about 30-50% smaller than an equivalent jpg, which is a pretty significant amount.
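For anyone who wants to reproduce that kind of comparison, a rough sketch with Pillow (the quality values are only illustrative and don't map 1:1 between formats):

    # Sketch: save the same screenshot as lossy WebP and JPEG at nominally similar
    # quality settings and compare file sizes. "screenshot.png" is a placeholder.
    import os
    from PIL import Image

    img = Image.open("screenshot.png").convert("RGB")
    img.save("shot.webp", quality=80)   # lossy WebP
    img.save("shot.jpg", quality=80)    # baseline JPEG

    for path in ("screenshot.png", "shot.webp", "shot.jpg"):
        print(path, os.path.getsize(path), "bytes")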


JPEG doesn't support transparency.


Yeah, images where lossy compression is fine but an alpha channel is needed are the one situation where switching to WebP (or AVIF, and soon JPEG XL) makes sense.


I'd say if you want to go on with your day painlessly, they are solid choices. If the website has large traffic for static assets, webp could make sense - the files that would otherwise be jpegs could be half the size, so half the bandwidth. I imagine that CDNs would do this for you though.


webp, webp2, and avif are absolutely worth using in modern web scenarios. It's even considered "poor practice" to serve legacy formats now.


I wouldn't say it's "poor practice" to serve legacy formats. If you're using a setup to allow for fallbacks, that's the best practice. There are more device types some developers need to consider than the latest version of Chrome.


That's why I put it in the quotes. Google will flag you for core web vitals issues if you're serving jpegs and pngs all over the place.

Many CDNs automatically serve the best format compatible with the useragent these days!

Also... the bit about the latest version of Chrome is pretty far off. WebP has been usable for 90+% (for my use cases, 99%) of the web for a LONG time. The only thing holding us back was safari 14.0 which is now not on my list of targets!


(2021)

I wonder how AVIF and JPEG XL would stack up today.


There's been discussion lately about whether or not JPEG XL adoption will be hindered by patents.

https://patents.google.com/patent/US11234023B2/

https://www.theregister.com/2022/02/17/microsoft_ans_patent/


Lossless was not a focus point for AVIF. There JPEG XL wins hands down, with something like a 50% difference. Also, AVIF can only do up to 12-bit lossless, whereas photography workflows benefit from 14+ bits.

On lossy JPEG XL wins by 17%: https://twitter.com/jonsneyers/status/1532771586381688833

"Overall for lossy, for the same visual quality, JPEG XL is ~17% smaller than AVIF, which is ~17% smaller than WebP, which is ~17% smaller than JPEG."


I want to see if they’re really lossless. I know for a fact that PNG is. But last I checked HEIF didn’t support lossless RGB images, only lossless YUV… So HEIF was unusable for bit-perfect reproduction of the original.


High quality? Less size?

Most importantly, Microsoft and Apple don't like the new image format. They are founding members of AOM, but they reject AVIF. Most browsers using Blink engines, including Chrome, support avif, but not Edge. Safari? Of course not. Safari is Apple's version of Internet Explorer. The fact that Apple devices cannot use webp efficiently is why many webmasters and service managers still do not adopt webp.


I was kinda sad QOI (https://github.com/phoboslab/qoi) wasn’t included in the comparisons.


QOI is kinda overrated, especially given the existence of fpng(e) and fjxl which operate within the limitation of existing formats and generally outperform QOI. QOI is a great demonstration that better prediction can make up for less efficient coding, but not much beyond that.


What, no one mentioned MNG[0] yet? :)

[0] http://www.libpng.org/pub/mng/spec


https://bellard.org/bpg/ Not sure if this fits your bill.


For raw data it would be Bitmap, as it provides easy access.

As a more obvious answer, I would still go with PNG and WebP since they are the most widely supported.


Very good article


JPEG 2000


SVG!


All graphics are images, but not all images are graphics


In my days we would have said, not all images are vector based - some are raster based.


Technically ..., but practically you are correct.


The best lossless format is the one you can decode back to the original image. When evaluating them, there is an implicit assumption that the decode step will happen a short time later, but that's not always true. Will there be JPEG XL decoders commonly available in 40 years? Will the code even compile then? As a thought experiment, try to find the lossless encodings from 40 years ago and see if they can be faithfully reproduced. (or even 20 years ago, remember JPEG2000?)

Framing best in terms of file size or encoding speed is a really use-specific framing, and not ideal for preservation.


Your concern is valid but misplaced; there is a reason that we have standards, after all. The Library of Congress maintains file format evaluations for the purpose of preservation, and there is a (pretty favorable) entry for JPEG XL [1] as well. Not yet in public, but I'm also working on the very first reimplementation of JPEG XL from scratch and reporting any spec bugs from ISO/IEC 18181 (there are plenty!), and I expect there will be more implementations in the future.

[1] https://www.loc.gov/preservation/digital/formats/fdd/fdd0005...


> or even 20 years ago, remember JPEG2000?

Open Source JPEG2000 libraries:

- Grok, last commit 9 hours ago: https://github.com/GrokImageCompression/grok

- JasPer, last commit 5 days ago: https://github.com/jasper-software/jasper

- OpenJPEG, last commit 8 days ago: https://github.com/uclouvain/openjpeg

I didn't try them, but I think you can probably still read your 20-year-old JPEG2000 files today without too many problems.


This seems like excessive handwringing about code rot. TIFF was introduced 35 years ago and is still well supported. JPEG2000 didn't become super widespread, but is still used in many places. Smart passports encode your passport photo in JPEG2000, for example.


Another example is in radiology. Many thousands of medical images are created and stored in lossless JPEG2000 every day with entire ecosystems of software to store and move them.


Also, in digital cinemas, films are JPEG2000 sequences.


Huh?


Exactly what they said. Movies in cinemas with digital projectors are distributed as sequences of JPEG 2000 images: https://en.wikipedia.org/wiki/Digital_Cinema_Package


The question isn’t whether any format from 35 years ago is still feasible to decode. The question is whether every format from back then is.


This is a good point - not just is it POSSIBLE to do, but is it easy?

Say you archive hours and hours and hours of footage and photographs in JPEG XL and then in 5 years the industry moves on to something completely different (maybe even worse in terms of specs), like the Betamax vs VHS thing. In 20 years it will surely be possible to decode those JPEG XL files, but if you just want to send one to a friend and they have to jump through hoops to view it, then it becomes a pretty big pain in the butt.




