
“Land initial Rust MP4 parser and unit tests” - steveklabnik
https://bugzilla.mozilla.org/show_bug.cgi?id=1175322
======
saidajigumi
Audio, video, and image codecs written in Rust seem like a fantastic early
opportunity to investigate Rust's utility for closing down related security
vulnerabilities.

Is anyone aware of work on fuzz testing an interesting Rust-based "attack
surface"[1]? I'm very interested to see what kinds of issues are/aren't turned
up in Rust vs. the usual C/C++ code for these libraries.

[1] By "attack surface", I'm thinking of traditional bits of code with high
exposure to untrusted data: codecs, HTTP parsers, etc.

~~~
vbezhenar
I don't think that production-quality codecs will be written in Rust. As far
as I am aware, codecs are usually very complex pieces of software and employ a
lot of hand-written assembly and very low-level C. Rust just doesn't offer
anything valuable in this area.

~~~
pjc50
A very complex piece of software is exactly the sort of thing you want to
avoid writing in C or assembly. The last person I know writing a codec
implementation did it by machine-translating the spec into a functional
program that _output_ assembler, C, python, Verilog etc as targets. That did
require handcrafting "leaf nodes" for things like matrix multiplication, but
they were small enough to verify.

~~~
haberman
Yes, I think of Rust as a promising target for parser generators. Thinking
about text parsing, if Bison generated Rust, then you'd have a memory-safe
parser that should be about as efficient as C. Something like this probably
already exists or is being worked on. Ideally existing code generators could
target Rust also to get memory safety.

There are some SIMD optimizations in parsers though that I don't know how
easily you could express in Rust. The quintessential example of this for text
parsing is Clang's optimization that uses SSE to skip over C++ comments 16
bytes at a time:

[https://github.com/llvm-mirror/clang/blob/61f6bf2c8a8e94c4fa...](https://github.com/llvm-mirror/clang/blob/61f6bf2c8a8e94c4faed77c5a836329edba91f53/lib/Lex/Lexer.cpp#L2299)

~~~
steveklabnik
I am far from an expert in this area, but you can do SIMD with Rust:
[https://github.com/huonw/simd](https://github.com/huonw/simd)

~~~
haberman
This looks like a good start (though experimental). If it supported something
like AltiVec's vec_any_eq() intrinsic on its u8x16 type, that would do the
trick. vec_any_eq() takes a vector and a value and returns true if any element
of the vector equals the value.

On x86 with SSE, this could generate a sequence of two instructions: pcmpeqb
(do 16 byte-wise compares) followed by pmovmskb (collect the 16 comparison
results into a single 16-bit mask). Then you'd get the same efficiency as what
Clang does (Clang searches 16 bytes at a time for a '/' character when skipping
over comments).
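
For illustration, here is a minimal sketch of that two-instruction idea using
the x86 intrinsics in current Rust's std::arch, not the huonw/simd crate linked
above. The names any_eq and find_byte are made up for this example, and the
fallback for non-x86_64 targets is an ordinary scalar loop, so treat it as a
sketch under those assumptions rather than what Clang or any crate actually
does:

```rust
/// Returns true if any of the 16 bytes in `block` equals `needle`.
/// (Hypothetical helper, playing the role of vec_any_eq on a u8x16.)
#[cfg(target_arch = "x86_64")]
fn any_eq(block: &[u8; 16], needle: u8) -> bool {
    use std::arch::x86_64::{
        __m128i, _mm_cmpeq_epi8, _mm_loadu_si128, _mm_movemask_epi8, _mm_set1_epi8,
    };
    // SAFETY: SSE2 is part of the x86_64 baseline, and the unaligned load
    // reads exactly the 16 bytes that `block` points to.
    unsafe {
        let v = _mm_loadu_si128(block.as_ptr() as *const __m128i);
        let n = _mm_set1_epi8(needle as i8);
        let eq = _mm_cmpeq_epi8(v, n); // pcmpeqb: 16 byte-wise compares
        _mm_movemask_epi8(eq) != 0 // pmovmskb: one mask bit per byte
    }
}

/// Scalar fallback for other targets (assumption: correctness only, no SIMD).
#[cfg(not(target_arch = "x86_64"))]
fn any_eq(block: &[u8; 16], needle: u8) -> bool {
    block.iter().any(|&b| b == needle)
}

/// Index of the first `needle` in `haystack`, testing 16 bytes at a time.
fn find_byte(haystack: &[u8], needle: u8) -> Option<usize> {
    let mut chunks = haystack.chunks_exact(16);
    let mut offset = 0;
    for chunk in &mut chunks {
        let mut block = [0u8; 16];
        block.copy_from_slice(chunk);
        if any_eq(&block, needle) {
            // A match is somewhere in this block; locate it with a short scan.
            return chunk.iter().position(|&b| b == needle).map(|i| offset + i);
        }
        offset += 16;
    }
    // Fewer than 16 bytes left: finish with a plain scan.
    chunks
        .remainder()
        .iter()
        .position(|&b| b == needle)
        .map(|i| offset + i)
}
```

Calling find_byte(source_bytes, b'/') would then skip comment text in 16-byte
steps and only fall back to a byte-by-byte scan inside the block that matched.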

~~~
jroesch
Huon is working on SIMD full time at Mozilla this summer as far as I know. We
should see more mature support materialize in the next couple of months for
sure.

------
scosman
Can anyone explain why an mp4 parsing lib in a specific language is #1 on HN? mp4
is just the container format (not the video stream). I'm not trying to
belittle, just curious if I'm missing context.

~~~
davmre
This is the first intrusion of Rust into the Firefox source tree.

The whole point of Rust and Servo development was that they would eventually
lead to safer, cleaner code in Mozilla's shipping web browsers. This code
itself doesn't seem significant, but it's a milestone in that it marks the
beginning of the payoff from that effort.

~~~
bsdetector
It also is vindication for making Rust easy to combine with other languages.
If you tried to do this with Go and its own custom stdlib, main(), GC,
threads, stacks it would be a nightmare (not to mention an extra 1 MiB to the
binary).

~~~
castell
Go and Rust have different purposes. You can think of Go as a leaner Java/C#.
And Rust as a safer C/C++. Although both Go and Rust are labeled as "system
programming languages" by some, there are different understandings of what
"system" exactly means in that context.

------
molsson
Is it possible to query the Mozilla bugtracker to see how many security issues
were fixed in the pre-Rust MP4 parser before it was replaced?

~~~
cpeterso
Most security bugs are private. Even after the bug has been fixed, users
running older versions of Firefox may still be affected so Mozilla doesn't
want to expose too much information about how to exploit the bug. The fix, of
course, is open source for everyone to examine.

~~~
irishcoffee
Mozilla doesn't open-source their bug metadata?!

Ah, it's cool, they're rational actors who don't crucify people for private
interests.

~~~
bzbarsky
Mozilla security bugs do get opened up eventually. Specifically, once not only
Mozilla but also various downstream distributors (linux distributions, etc)
have shipped a fix for it. Release cycles there vary, so there is typically a
gap of a month to a bit over a year (depending on whether the fix could be
backported to the previous ESR) between the fix shipping in Firefox and the
bug being fully disclosed.

That said, even after a security bug is open some information in it may remain
hidden. For example, weaponized exploits attached to bugs are generally kept
hidden even after the bug is opened.

------
alexnewman
Amazing, the future is now! How does it perform?

~~~
jewel
I imagine that using Rust for this is going to mainly provide security
benefits. Parsing the MP4 file isn't very CPU intensive; it's a
straightforward binary format. (Decoding the video stream is what is CPU
intensive, and this code doesn't do that.)
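
To give a rough sense of how simple the container layer is, here is a minimal
sketch of reading one MP4 (ISO base media file format) box header: a 32-bit
big-endian size, a four-byte type code, and an optional 64-bit size when the
32-bit field is 1. The names BoxHeader, be_u32 and read_box_header are invented
for this example and are not taken from the parser that landed in Firefox:

```rust
/// Header of one MP4 box. (Hypothetical type for illustration only.)
#[derive(Debug, PartialEq)]
struct BoxHeader {
    /// Four-character type code, e.g. b"ftyp" or b"moov".
    kind: [u8; 4],
    /// Total size of the box in bytes, header included.
    size: u64,
    /// Number of bytes occupied by the header itself (8 or 16).
    header_len: usize,
}

/// Reads a big-endian u32 at `at`, or returns None if the data is too short.
fn be_u32(data: &[u8], at: usize) -> Option<u32> {
    let b = data.get(at..at.checked_add(4)?)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Parses the box header at the start of `data`.
/// Every read is bounds-checked, so truncated input yields None, not a crash.
fn read_box_header(data: &[u8]) -> Option<BoxHeader> {
    let size32 = be_u32(data, 0)?;
    let k = data.get(4..8)?;
    let kind = [k[0], k[1], k[2], k[3]];
    let (size, header_len) = match size32 {
        // Size 0 means "this box extends to the end of the data".
        0 => (data.len() as u64, 8),
        // Size 1 means a 64-bit size follows the type code.
        1 => {
            let hi = u64::from(be_u32(data, 8)?);
            let lo = u64::from(be_u32(data, 12)?);
            ((hi << 32) | lo, 16)
        }
        n => (u64::from(n), 8),
    };
    // A box can never be smaller than its own header.
    if size < header_len as u64 {
        return None;
    }
    Some(BoxHeader { kind, size, header_len })
}
```

A full parser would loop over such headers, skip `size` bytes for boxes it does
not care about, and recurse into container boxes. The point of the sketch is
that malformed sizes or short input fall out as None through the checked
slicing, which is the kind of security benefit described above.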

~~~
izietto
It would be pretty exciting to see an audio or even a video decoder/encoder
written in Rust

~~~
JoshTriplett
Perhaps, but software codecs aren't nearly as interesting as hardware codecs.
Efficient high-resolution video decoding uses dedicated hardware these days.

It'd be interesting to see portions of libva rewritten using Rust, though.

~~~
jallmann
The only reason to implement a codec in hardware is speed or power
consumption. Software codecs are much more flexible with the parameters you
can tweak, and more tolerant of variances in the input such as buffer
under/overflows, etc.

In that respect, software codecs are much more interesting, but hardware does
allow you to get cutting-edge compression to market more quickly.

~~~
JoshTriplett
> The only reason to implement a codec in hardware is speed or power
> consumption.

Which are two of the most critical properties of a codec. When people look at
the battery life of a new platform, one of the common questions is "how many
hours of continuous video playback?". Or "how many hours of screen-off audio
playback?".

~~~
jallmann
> Which are two of the most critical properties of a codec.

If your target is mobile or embedded, yes. Fortunately, that's not the only
use case. Software codecs are a lot more interesting from the standpoint of
not being fixed function.

~~~
swah
FPGAs could change that equation :P

~~~
ris
Unlikely. They haven't so far in the last 20 years they've been around. And
the (painful) processing time required for "place & route" massively limits
the degree to which their dynamic nature can be exploited.

------
shmerl
So in the future Firefox / Servo will implement decoders in Rust too?

~~~
cpeterso
The decoders in Firefox now are third-party libraries (like Google's VP9), so
I don't think Mozilla will rewrite them in Rust. Perhaps Mozilla will write
the official Daala decoder in Rust?

~~~
TD-Linux
One of the Daala developers here. The official reference Daala codebase is
implemented in C89 - this is to ensure the widest possible compatibility with
all sorts of weird platforms. In addition, the tooling for assembly and
intrinsics is mature.

Of course, this is only the reference implementation - once the bitstream is
stable, it'd be great to try writing a decoder in Rust.

An easier starting point might be audio or image codecs, where speed is not as
critical and the formats are well defined. For example, here is a pure Rust
image codec library:
[https://github.com/PistonDevelopers/image](https://github.com/PistonDevelopers/image)

~~~
shmerl
By the way, do you know if Daala will be renamed to NetVC or not? I don't
really like these names like NetVC/HEVC which sound like some infections.
Maybe it can be kept as Daala? After all Opus wasn't named "NetAC".

~~~
TD-Linux
Opus's contributing codecs were called CELT and SILK. The IETF working group
name was just "CODEC". The name "Opus" was chosen at the end. I imagine a
similar procedure will happen in the NETVC working group.

~~~
shmerl
You mean you expect other major codecs to be merged with Daala and out of
respect for the contribution there will be a change of name?

~~~
TD-Linux
That's some of it, but also it's just to see if we can come up with a better
name than Daala. We also have to check trademarks, etc. There hasn't really
been any major discussion around the name yet on the netvc mailing list [1].
The first netvc working group meeting will be in July at IETF 93.

[1]
[https://mailarchive.ietf.org/arch/search/?email_list=video-c...](https://mailarchive.ietf.org/arch/search/?email_list=video-codec)

~~~
shmerl
Thanks for the pointer, I'll keep an eye on it. I think Daala is a good
sounding name, but if trademarks are a problem, that's a different story.

------
ExpiredLink
So Firefox is completely rewritten in Rust? Seriously?

Yes / No / Maybe / Don't know?

~~~
lmkg
The current plan is to write a new browser called Servo in Rust, which is not
intended to supplant Firefox. Servo is currently classified as a research
project, not product development, with the goal of exploring parallelism and
safety in browser engines. Small, isolated utilities written in Rust may end
up in Firefox.

(I didn't downvote you, btw. I'm guessing someone thought you were being
needlessly sensationalist.)

~~~
Brakenshire
I think the plan is that FirefoxOS and Firefox for Android will transition
over to Servo, presumably because of the major benefits of parallelism on
mobile. On desktop, as you say, it's just limited modules.

