
Daala progress update - dbcooper
https://people.xiph.org/~xiphmont/demo/daala/update1.shtml
======
AceJohnny2
Interesting to see they're also applying Daala to still images. It's a very
different set of constraints and optimizations than video.

My first instinct was to wonder how it compared to Bellard's recently featured
BPG image format [1], which achieves impressive quality at low sizes. Turns
out you can check this by selecting "HEVC" in Xiph's image comparison, since
that's the core method employed by BPG.

[1] [http://bellard.org/bpg/](http://bellard.org/bpg/)

~~~
quink
And Daala is not fixed yet, so there's a good chance of further improvements
a-coming. And it's likely to be relatively clean of patents and may enable a
cross-everything royalty free implementation. But time will tell how MPEG LA
will feel about that.

Edit: I'm also wondering, just in case any experts are around... with Daala's
design seeming at first glance inherently more suitable for that kind of
thing, lifting filters, wavelets and all, does that mean that we're more
likely to see some hierarchical modulation/scalable video coding/bitrate
peeling sooner rather than later?

~~~
pcl
_it's likely to be relatively clean of patents and may enable a cross-
everything royalty free implementation. But time will tell how MPEG LA will
feel about that._

So that's the thing. I'm all for royalty-free encoders, but the only way to
know if an algorithm is really royalty-free is for it to be older than the
current maximum patent age. I believe that Daala is being designed to be
royalty-free, but that's not measurable in advance.

There are other good reasons to support Daala, and explicitly building
something to avoid royalty issues is certainly laudable (esp. compared to
explicitly doing the opposite), but you can't just go and guarantee that
something new is royalty-free, and it's frustrating to see the community make
exactly those sorts of statements.

~~~
phaemon
Where exactly did the community _guarantee_ that Daala was clean of patents?

~~~
keeperofdakeys
I don't know if there has been any concerted effort to find possible patents.
However, Daala is touted as patent-free because its implementation uses
different technologies from H264/HEVC, hence most/all of the MPEG-LA patents
shouldn't apply.

------
IvyMike
Charles Bloom has been discussing PVQ and Daala over on his blog. I am not an
expert nor have I fully understood these articles, but they definitely look
like interesting related reading.

[http://cbloomrants.blogspot.com/2014/12/12-16-14-daala-
pvq-e...](http://cbloomrants.blogspot.com/2014/12/12-16-14-daala-pvq-
emails.html)
[http://cbloomrants.blogspot.com/2014_12_01_archive.html](http://cbloomrants.blogspot.com/2014_12_01_archive.html)

------
aidenn0
I prefer the JPEG to the Daala in that one sample image; JPEG does have more
artifacts, but Daala seems to preserve less detail.

For low-to-medium bitrate video that might be the correct tradeoff though.

[edit] H264 and H265 look like successive incremental improvements over JPEG
in the direction I personally prefer; they reduce artifacts without losing
detail.

~~~
0x0
I think the sky looks much nicer in Daala, but the bushes look much more
blurred.

~~~
aidenn0
I agree; JPEG has more artifacts which is most noticeable in low-frequency
areas, but preserves more detail in high-frequency areas.

All of the non-JPEG codecs seem to have greatly reduced artifacts in the low-
frequency areas, but for some of them (VP8 in particular) it was at a
significant loss of detail.

~~~
nitrogen
Apart from JPEG's blockiness and VP8's complete loss of detail, it looks like
the main difference between codecs is which parts of the image have detail
preserved. Some codecs preserve subtle sky texture at the expense of the
trees, others do the opposite.

------
rikacomet
Just to add as feedback to that: I much prefer the JPEG render of two things in
particular in that sample image: first, the brownish stone in the lower-right
corner, and second, the car tracks. Regarding the car tracks, the rough look in
the JPEG render looks more natural to me. The stone, though, is only slightly
better in the JPEG. For the rest of it, Daala wins! I especially like the thin,
tall, leafless tree on the left.

~~~
indolering
FWIW, I know that there is some research showing that students (over time)
express a preference for MP3 compression artifacts. Something similar could be
at play here.

------
0x0
Applying an easing animation on the split screen comparison mouse drag was
probably not the most user friendly idea.

~~~
nitrogen
Is it actually an animation? I assumed it was lag caused by redrawing each
time the mouse moves and not ignoring queued mouse movements. Perhaps I
assumed incorrectly.

~~~
derf_
12:37:59 <+xiphmont> no, the lag is intentional.

12:38:11 <+xiphmont> it seems wrong... but exact tracking felt worse.
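
For anyone wondering what an intentional lag like this could look like, here is a minimal sketch of one common easing approach: on every frame the split position moves a fixed fraction of the remaining distance toward the mouse. This is not the demo page's actual code; `ease_toward`, `simulate`, and the `factor` value are hypothetical names and numbers chosen for illustration.

```python
def ease_toward(current, target, factor=0.25):
    """Move `current` a fixed fraction of the remaining distance toward `target`."""
    return current + (target - current) * factor


def simulate(start, target, frames, factor=0.25):
    """Return the eased position after each of `frames` animation frames."""
    positions = []
    pos = start
    for _ in range(frames):
        pos = ease_toward(pos, target, factor)
        positions.append(pos)
    return positions


if __name__ == "__main__":
    # The mouse jumps from x=0 to x=100; the split line trails behind it.
    print(simulate(0.0, 100.0, 3))  # [25.0, 43.75, 57.8125]
```

With exact tracking the position would simply be set to the target on every mouse event, so the two approaches differ only in that one assignment.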

------
bla2
Subjectively, on that image VP9 looks nicer than Daala. (And x265 looks nicer
than both.)

~~~
pornel
While in this case VP9/x265 may indeed be better (they seem to preserve
higher-frequency details than Daala), beware of "looks nicer" when judging
codecs in general. Codecs are not in the business of making nicely distorted
images (they're not Photoshop/Instagram). They're supposed to keep images _as
close to the original as possible_.

For example smooth images tend to be judged as "nice", but if the original had
noise then smoothness is a distortion caused by lack of detail and deblocking
blur covering it up. JPEG 2000 fell into the smoothness trap: it gave it
better PSNR and nice low-bitrate examples, but ultimately failed, because you
don't always want everything looking like plastic.
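
To make the PSNR point concrete, here is a toy sketch (my own illustration, not taken from any codec). It uses the standard PSNR definition, 10 * log10(peak^2 / MSE), on a made-up 1-D "noisy texture". The sample values and the names `psnr`, `smooth`, and `textured` are assumptions for the example.

```python
import math


def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(original) != len(distorted):
        raise ValueError("inputs must have the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / mse)


# Hypothetical original: mid-grey with fine alternating noise of +/- 8.
original = [136, 120] * 8
# Candidate A: all the noise smoothed away (the "plastic" look).
smooth = [128] * 16
# Candidate B: noise of the same strength kept, but not sample-aligned.
textured = [120, 136] * 8

if __name__ == "__main__":
    print(round(psnr(original, smooth), 1))    # about 30.1 dB
    print(round(psnr(original, textured), 1))  # about 24.0 dB
```

The smoothed candidate scores roughly 6 dB higher even though it has lost all of the texture, while the candidate that still looks noisy like the original is penalized for not matching sample by sample.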

~~~
andy_ppp
It would be interesting to build a codec with human preferences in mind; you
might end up with images that most people think look better than the original
at a fraction of the size. How you make those choices is very difficult,
though.

~~~
jallmann
Some video encoders already incorporate perceptual optimizations (eg, x264's
psy-rd and psy-trellis) that "look" better but lead to objectively worse
results with traditional image quality metrics.

Audio codecs, however, have been using psychoacoustic models for decades.
Frequencies outside the human hearing range are clipped, masked noises are
discarded, voice codecs emphasize the range of human speech, etc.

