
The Daala Video Codec: Research Update [pdf] - dochtman
https://people.xiph.org/~xiphmont/demo/daala/daala-vp9summit-20140606.pdf
======
sitkack
Xiph needs to start a public prototyping playground that extends unpatented
video and image codec research to create islands of techniques that are not
patentable. They don't have to work, just be an idea.

Like if I said, "I'd like to use cellular automata and voronoi segmentation to
do multiscale texture extraction and motion representation."

Someone else could possibly do a 500-1000 line python program that implemented
something like that.

Create a 1000 of those ideas, drop them into the public domain (we need
something like the GPL for ideas) so that there is a large body of techniques
and work that is unpatentable.

~~~
e12e
I'm afraid the existing body of software patents dictates not placing such
ideas in the public domain, but rather as a form of "patent left" foundation
that can force cross licensing access.

~~~
higherpurpose
Can you elaborate? Why would putting the technologies in the public domain not
work?

~~~
e12e
It would work for future patents, but I'm afraid it wouldn't work for future
_software_ -- the theory being that any non-trivial software will likely
infringe on any number of patents. A collection of patents might help force
cross-licensing, and so protect new (Free or not) software against existing
patents (the assumption being that also holders of non-trivial software
patents develop new software, and might infringe on one or more patents in the
collection).

------
kbaker
Monty links to his demo pages [1] at the end of the slides, with a good
introduction to the codec if you are new to Daala (like I was.) Also, these
demos are linked from the main xiph.org page as well [2]. There's more context
and explanation there than in the slide deck ... I assume close to what you
would get if you were actually at the talk. :)

[1]
[https://people.xiph.org/~xiphmont/demo/](https://people.xiph.org/~xiphmont/demo/)

[2] [http://xiph.org/daala/](http://xiph.org/daala/)

------
higherpurpose
This caught my eye:

> Building a new codec from scratch may cost less than licensing

Samsung is already participating in many open source projects: F2FS, Tizen,
they are part of the Linux Foundation, and I believe are even helping Mozilla
with Servo. So why not try to get them to commit to adopting Daala in all of
their devices as soon as it's stable and out, and perhaps even help with
funding a bit? There might be other companies out there willing to do it, too.
They need to reach out to them.

~~~
revelation
Why would they do that? It's Samsung. They probably already have a "covers
everything forever" license for the whole MPEG pool.

~~~
derf_
Something you may not know: Samsung had their own internal project to build a
royalty-free codec, but they gave up because it was "too hard" (which for a
large corporation usually means "costs too much"). The annual caps for H.264
are just low enough to discourage that kind of activity. The currently
proposed caps for HEVC are much higher, and that may change the equation for
some people.

------
cromwellian
There's no real guarantee that Daala won't be covered, at least partially, by
some submarine patents. I don't think you can work on a years-long R&D effort
and make that promise.

IMHO, you work on stuff with the assumption that if you're a big success,
someone's gonna sue you, and plan for it accordingly.

In the meantime, HEVC is going to be widely deployed eventually. VP8 and VP9
are valid alternatives for some, and I see no reason not to support them while
we wait years for a moonshot codec to save us. I have to wonder if there isn't
some NIH fear that if VP8/9 get any degree of success it would make it that
much harder to switch to Daala later.

That is, it's better to keep things in a "bad" state (H264/HEVC) because as
the situation gets worse, it would be easier to justify Daala adoption later.
Similar to how if you're waiting for Healthcare reform, and you want Single
Payer, supporting a partial solution (e.g. health exchanges) might mitigate
the worst pain, and make it harder to argue for Single Payer later.

~~~
derf_
> I have to wonder if there isn't some NIH fear that if VP8/9 get any degree
> of success it would make it that much harder to switch to Daala later.

As the person who

A) leads the Daala project, and

B) made the decision to ship VP9 (a conversation that went approximately like
this: My Boss: "Should we support VP9 in Firefox?" Me: "Yes. Duh."), and

C) has been fighting hard to make VP8 Mandatory To Implement for WebRTC...

I can tell you that nothing would make me happier than to see VP8 and VP9 be
wildly successful. Hell, I'd've been ecstatic if we'd successfully managed to
get H.264 Baseline made RF (there was an effort to do so a couple of years
ago: it failed by 2 votes). See also OpenH264.

~~~
cromwellian
That's very good to hear. I retract my speculative accusation given in the
absence of facts.

------
ZeroGravitas
At the end there is a request for niche areas that Daala could target. Here's
my crazy idea:

Mozilla is adding webrtc into the browser, and I'm sure the basic case of
video chat is being thought about. But another use case is screen sharing, and
in particular sharing a web page. How much better/faster could a video encode
be if you could feed it live information from the system that was drawing the
page? e.g. knowing that nothing has changed without having to compare one
picture to another, knowing that a certain area contains text, that another
area contains a gradient, that another area is animated with a repeating
animation or that the screen is being scrolled up/down at a certain speed, that
the repeating background is composed of a specific repeating png, and so on.

No idea if that's a valid idea, but it's what popped into my head on reading
the question.

~~~
derf_
With proper SIMD optimizations, the analysis to determine "nothing has
changed" is so ridiculously fast that it's hard to compete even with direct
XDamage output (or comparable things on other systems), whose data is not
really in the format that an encoder wants.
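
As a rough illustration of the "nothing has changed" analysis described above, here is a minimal sketch in plain Python. It is not Daala code and it is not SIMD: the function name `changed_blocks`, the 16x16 block size, and the one-byte-per-pixel row-major frame layout are all assumptions made for the example. The per-row slice comparison stands in for the memory compare that a native encoder would vectorize.

```python
def changed_blocks(prev, curr, width, height, block=16):
    """Return the set of (bx, by) block coordinates whose pixels differ.

    prev and curr are bytes objects holding one 8-bit sample per pixel
    in row-major order (an assumed layout for this sketch).
    """
    if len(prev) != width * height or len(curr) != width * height:
        raise ValueError("frame size does not match width * height")
    changed = set()
    for by in range(0, height, block):
        for bx in range(0, width, block):
            w = min(block, width - bx)  # clip blocks at the right edge
            for y in range(by, min(by + block, height)):
                start = y * width + bx
                # Slice comparison is a plain memory compare; a native
                # encoder would do this step with SIMD instructions.
                if prev[start:start + w] != curr[start:start + w]:
                    changed.add((bx // block, by // block))
                    break  # one differing row is enough for this block
    return changed


# Usage: a 32x32 frame where a single pixel changes.
width, height = 32, 32
frame_a = bytes(width * height)
frame_b = bytearray(frame_a)
frame_b[20 * width + 5] = 255  # pixel at x=5, y=20
print(changed_blocks(frame_a, bytes(frame_b), width, height))  # {(0, 1)}
```

Blocks absent from the returned set are the ones an encoder could skip without further analysis, which is the same information a damage-tracking API would supply.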

Not saying there's no gains here, but people have proposed this idea before,
and then given up on it after actually sitting down to implement it. It's also
mostly an encoder optimization, and thus doesn't have much influence on the
standard.

What's more interesting is adding special tools to the bitstream to represent
things like text, which do not compress well with typical block transforms.
This is certainly something we've spent some time thinking about, but there's
no code committed for it yet.

------
ZeroGravitas
Did anything else come out of this summit? I was trying to google info on it
the other day and it was so invisible on the web I was beginning to doubt my
recollection of the date that it was scheduled for.

It perhaps didn't help that someone just released a new gun model called VP9
which was filling up all the recent google results.

~~~
ZeroGravitas
I'm only 20% through, but I thought I'd comment on the first section of
"Don'ts". It's basically saying that Google's strategy with VP8/VP9 sucked. I
don't agree. They took on a massive task and have had some small successes. It
could have gone much worse. It reminds me a bit of those business books that
look at firms that succeeded and cargo-cult everything they did or didn't do.
But many of the factors that hold you back are random contingencies.

The big stumbling blocks for open web codecs are Microsoft/IE (on the desktop)
and Apple/Safari (on mobile). VP8/VP9 has failed miserably on this score. But
Opus, the amazing new audio codec developed by Xiph/Mozilla/IETF/etc. in the
manner suggested by the "Do's" is also notably absent from iOS and IE, (the
latter of which could be considered particularly galling since it was co-
developed with Microsoft subsidiary Skype).

Not that I think multiple approaches aren't a good thing, and you've got to
sell what you're doing. It just seems a bit negative when Google has VP8
shipping on Android (and Android seems like the only mobile OS likely to ship
Opus any time soon too).

~~~
zanny
The thing is MS and Apple won't negotiate on any of this. They have never been
involved with open standards and respectively pushed WMV and MOV formats
instead.

You have to move forward under the presumption that neither will ever
cooperate in a macroscopic sense, especially after Google caved and gave up
making youtube webm based.

You still have to make the best solutions, and speak why they are the best,
and hope everyone else will switch so that inevitably the dinosaurs do too.
Because you can't reason with a trex.

~~~
magicalist
> _especially after Google caved and gave up making youtube webm based_

They didn't remove h264 from Chrome, but when I watch videos on youtube, right
click -> stats for nerds it lists the video as 'video/webm; codecs="vp9"'.
That includes videos with ads, which didn't use to be the case.

~~~
ZeroGravitas
Any idea when this happened? I only noticed a few days ago and there doesn't
seem to be any official announcements or even blog posts about the switch.

I think they used to prompt you to install Flash if you didn't have it too,
which it didn't seem to do earlier this week, just played adverts and content
via HTML5/H.264 (in IE11) and HTML5/VP8 in Firefox.

After this the next step is probably to start delivering HTML5 in preference
to Flash, where possible. Wonder what the plan is for that transition?

~~~
magicalist
No, I hadn't noticed it until testing it for that comment (actually I didn't
notice an ad started playing for the first video I clicked on and even the ad
itself was webm), though I do feel like I've gotten almost all html5 videos
for a while now (I have click to play on for plugins, so the ones that are
ready to go stand out).

> _After this the next step is probably to start delivering HTML5 in
> preference to Flash, where possible. Wonder what the plan is for that
> transition?_

Notably, if you go to the html5 youtube page
([http://www.youtube.com/html5](http://www.youtube.com/html5)) in incognito
mode in Chrome, it actually says "The HTML5 player is currently used when
possible", so I assume this is the default now. That isn't the case in
Firefox, where the option to request the html5 player to be the default is
still there. Maybe it's a vp9 thing? I don't believe Firefox has included vp9
yet (and the player says: 'codecs="vp8.0, vorbis"')

~~~
ZeroGravitas
Ah that's interesting. I actually opted out of HTML5 to see what would happen,
but only in Firefox.

Firefox does have VP9, but it doesn't yet have the MSE (Media Source
Extension) support that YouTube requires to deliver VP9, though a semi-
functional version is available behind a preference in the nightly builds.

------
oflordal
It's a hard problem making something unique enough to avoid current patents
while still being a nice fit for a hw pipeline. Things like overlapping
transforms may end up creating new minimum values for the vertical context
needed.

New ITU/MPEG standards typically have large involvement from hw companies.

~~~
derf_
Memory bandwidth for the lapping isn't any worse than for a deblocking filter.
The difference is that it's not adaptive.

~~~
oflordal
HW decoders will not send the deblocking barrier to external memory
(typically). Instead you will buffer the required rows until you have decoded
the next line of macroblocks. As long as the filter margins are the same this
will work fine in itself but you will not have done the buffering at the same
place you did deblocking or any of the simpler overlap filters. In that way
you will force larger changes to your architecture. Similarly your transform
unit is likely not set up to handle this so it requires redesign. Getting too
many of these hurdles in can be the death of adoption for the CODEC.

------
khitchdee
Too bad they're not making this plain open source. They should copyleft all
the IP in this, which will foster innovation on the algo instead of just basing
this on customer needs. Open algos are much easier to extend.

~~~
lambda
What are you talking about? Daala is as open as you get.

~~~
khitchdee
As I read it, they plan to patent new stuff. See page 9, the third bullet for
example.

~~~
lambda
Read the fourth bullet: "Use a patent license that encourages adoption and
discourages defection."

It sounds like they are planning on a copyleft-style defensive patent
licensing system.

Right now, one of the big reasons that open codecs like VP8 and VP9 don't see
wider adoption is that there's a lot of FUD thrown around by people saying
"they might infringe H.264/HEVC patents, it's too risky to use them without a
patent licensing pool."

By filing patents of their own, and offering them under a free (libre) license
that has a clause causing it to be revoked if you engage in some other related
patent litigation, this adds pressure against anyone trying to use their
patents against someone using Daala.

Not sure how well this would work out in practice; I know that some people
have gotten patents under these kinds of defensive open patent licenses
before, but I don't know if they've ever been used defensively in practice
before.

~~~
khitchdee
So you're saying in practice copyleft and hence open source doesn't really
work, you have to patent and then license for free to protect your users. How
does this impact innovation based on the ideas you have patented?

~~~
derf_
We've talked about making a GPLv3 release to at least open up the patents _we_
control to all copyleft software. But there are important details to work
through, and like most FLOSS projects we have more things to do than people to
do them, so it hasn't happened yet. If someone needed it to happen it should
be a pretty easy conversation to have.

Keep in mind also, our goal is generally to stay out of court. See above about
having too much to do.

