"you really don't want compression artifacts in X-rays" You really don't want co...

kragen · on June 4, 2021

Compression artifacts in X-rays can easily kill people, either by requiring additional unnecessary X-rays (which both delay diagnosis and cause cancer) or by causing erroneous diagnoses; by comparison, the cost of the data storage thus saved is trivial. Compression artifacts in filtered photos of your cute pet turtle for Instagram are much less likely to kill people.

simondotau · on June 5, 2021

And this becomes increasingly true as compression methods get increasingly clever.

https://www.theregister.com/2013/08/06/xerox_copier_flaw_mea...

layoutIfNeeded · on June 5, 2021

Ah yes, the JBIG2 fiasco!

JBIG2 is a format for storing black and white documents in a highly compressed way. It works by detecting each letter in the document, and then replacing it with a pointer to the reference version of that letter, up to a certain threshold. Basically compression via OCR.

Of course, this means that when a distorted letter is too close to the reference version of another letter, it will get replaced with a clean version of that incorrect one. So even though a human could easily recognize that something was off with that letter in the original image, the JBIG2-compressed image has no such clue!
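To make the failure mode concrete, here is a toy sketch in Python. It is not the real JBIG2 codec: the 5x5 bitmaps, the Hamming-distance metric, the greedy first-match rule, and the names `hamming`, `compress` and `decompress` are all assumptions chosen for illustration.

```python
# Toy symbol-dictionary compressor. NOT the actual JBIG2 algorithm;
# it only illustrates the "replace with a similar-looking reference" idea.

SIX = (
    "01110",
    "10000",
    "11110",
    "10001",
    "01110",
)
EIGHT = (
    "01110",
    "10001",
    "01110",
    "10001",
    "01110",
)
# An eight whose upper-right pixel was lost to scan noise (hypothetical data).
NOISY_EIGHT = (
    "01110",
    "10000",
    "01110",
    "10001",
    "01110",
)


def hamming(a, b):
    """Count differing pixels between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def compress(glyphs, threshold):
    """Return (symbols, pointers): reference bitmaps plus one index per glyph."""
    symbols = []
    pointers = []
    for glyph in glyphs:
        for index, ref in enumerate(symbols):
            if hamming(glyph, ref) <= threshold:
                # Close enough: store only a pointer to the reference.
                pointers.append(index)
                break
        else:
            # Nothing similar yet: this glyph becomes a new reference.
            symbols.append(glyph)
            pointers.append(len(symbols) - 1)
    return symbols, pointers


def decompress(symbols, pointers):
    """Rebuild the page by pasting the reference bitmap for each pointer."""
    return [symbols[i] for i in pointers]


symbols, pointers = compress([SIX, NOISY_EIGHT], threshold=1)
page = decompress(symbols, pointers)
print(page[1] == SIX)  # the noisy eight came back as a clean six
```

With a threshold of one pixel, the noisy eight is within range of the six that was seen first, so the decompressed page contains two clean sixes and no trace of the original ambiguity.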

What’s really bad is that JBIG2 compression was built into certain Xerox machines that were used by archivists to digitize important documents for years until someone noticed the discrepancies. JBIG2 was promptly banned for archival purposes, but there might still be a ton of documents with these kinds of invisible errors in our archives! :-)

nextaccountic · on June 5, 2021

It would be so cool to add the OCR output as metadata. Text in internet images could be readily selectable and available to assistive technologies if images were OCR'd at creation time.

layoutIfNeeded · on June 5, 2021

PDF supports this use case by adding an invisible text layer on top of the raster content.

On the other hand, JBIG2 doesn’t actually do OCR. It only does template matching of similar-looking blocks of pixels. The compressor doesn’t try to understand which letter those pixels represent.

publicola1990 · on June 5, 2021

But aren't medical images interpreted by eye only? If so, compression artifacts that are not visible to the eye should not be a problem, should they?

Synaesthesia · on June 5, 2021

Artifacts can be visible. Also they can be destructive.

skywal_l · on June 5, 2021

A lot of algorithms are applied to medical images, as pre-processing for eye examination but also for automated analysis.

me_again · on June 4, 2021

IIRC there was a study which indicated oncologists' ability to detect tumors in X-ray images was degraded even with lossy compression ratios which didn't introduce obvious artifacts. I couldn't find it with a quick search though.

eco · on June 4, 2021

Sure, but you don't normally end up getting an unnecessary biopsy with most other image artifacts.

marton78 · on June 5, 2021

You don't want to do anything lossy in any medical product, otherwise you'd have to prove in the certification process that your lossy compression doesn't introduce any risks.

skywal_l · on June 5, 2021

For storage, yes, but lossy compression can still be useful, for example to improve the performance of your UI, as long as the user knows that the image displayed has been degraded.
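A minimal sketch of that split, assuming 8-bit grayscale samples held in a `bytes` object. The names `store_lossless`, `load_lossless`, `make_preview` and `Preview` are hypothetical, and dropping low-order bits is only a stand-in for whatever lossy codec a real viewer would use.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Preview:
    pixels: bytes
    degraded: bool  # the UI should show a visible "degraded" marker when True


def store_lossless(pixels: bytes) -> bytes:
    """Archive copy: zlib is lossless, so every sample survives a round trip."""
    return zlib.compress(pixels)


def load_lossless(blob: bytes) -> bytes:
    """Recover the exact original samples from the archive copy."""
    return zlib.decompress(blob)


def make_preview(pixels: bytes, keep_bits: int = 4) -> Preview:
    """Display copy: keep only the top `keep_bits` bits of each 8-bit sample."""
    mask = (0xFF << (8 - keep_bits)) & 0xFF
    return Preview(bytes(p & mask for p in pixels), degraded=True)


original = bytes(range(256))
archive = store_lossless(original)
preview = make_preview(original)

print(load_lossless(archive) == original)  # archive copy is bit-exact
print(preview.degraded)                    # preview always carries the flag
```

The point of the design is that the degraded flag travels with the preview itself, so the display code cannot show a lossy image without knowing it is lossy.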

clord · on June 4, 2021

Some images manage to communicate in spite of high compression. X-rays are an example of an image where the cost of misinterpretation is very high.