There's a certain romance to the idea of a random video that looks like colorful noise actually crashing or exploiting your device. It's basically the closest thing to "snowcrash" that we have.
It was really bad when the best way to play a video on a website was through the goddamn Adobe Flash plugin - isn't it crazy, thinking back, that that was just normal for so long? [1] [2]
People lament the loss of all those flash games and stuff but some of those sites were SKETCHY, not to mention ad networks with unchecked .swf "creatives"...
Oh yeah, I remember the "flash super cookie" being a well-known marketing technique.
You know what's really crazy? Before the Snowden revelations in 2013, something like 70% of the web was served over plain http:// (including sites like Amazon and Facebook, IIRC - although checkout pages may have been secured). Let's Encrypt really did a great job of getting the Web past the starting line of basic security.
Oh yeah, ARP spoofing or just session-hijacking your way into people's accounts was way too easy back then. I seem to recall a browser plugin that did it [1]
Firesheep proliferated quite rapidly at my uni, even to extremely non technical people due to its sheer ease of use and the potential for comedy.
The worst people would usually do was leave a vaguely embarrassing status on your FB page, which was the usual prank if you left your computer unlocked anyway.
I'd hope for someone reverse engineering the brain (or a specific individual's brain), then figuring out exactly what incomprehensible colourful noise of a video to show you, and at the end you mysteriously know how to speak Cantonese.
Could a hack be possible by exploiting the GPU? Like making a 3D game scene that's actually encoding malware so when the GPU tries to render it you now have access to the system resources?
Don't know about a scene, but EC2-like access to virtual GPUs was rumored to be pretty dangerous, potentially even to the hardware itself (think something like changing the voltages via undocumented registers). The attack surface there is enormous. These were just rumors, though; maybe someone here knows better.
You can target co-processors in general, e.g., here [1], thus I assume people do hack GPUs.
Generally, the better we become at introducing mitigations, the more expensive attacks become, and attackers have bosses, budgets and deadlines. They will try to find other avenues to land on a target :-)
We do read about occasional vulnerabilities in phone GPUs, but I do have to wonder: wouldn't the compartmentalization and the difference in compute abilities between CPU and GPU inherently limit what a vulnerability in the GPU of a typical PC can actually exploit?
Compartmentalization might make chaining the exploit more difficult, but it's certainly not unheard of. There have been exploit chains in the past that manage to jump from the baseband to the main CPU, for example.
One of my favorite exploits [0] was from Project Zero where the chain began with a vulnerability in the Apple wireless stack (Broadcom? maybe, but that might be a different exploit I'm thinking of), and ended in arbitrary kernel RCE. In other words, it was "a wormable radio-proximity exploit which allows me to gain complete control over any iPhone in my vicinity."
More relevantly to your question, here's a writeup about exploiting a GPU. [1]
Generally, no. Often you'll see that GPUs have extensive access to the system's physical memory and vice versa, which makes exploiting either processor from the other fairly common due to buggy drivers.
This is a pretty bad way to make an announcement. The abstract mentions iOS, Firefox, VLC and multiple Android devices.
1. At least 2 of those entries are software that users CAN upgrade manually, but the fixed versions are buried deep within the text. If users can take action, that should be incredibly clear.
- I have been unable to quickly skim and find which version of Firefox has fixes for these.
- VLC has a fix for a use-after-free issue in 3.0.18, but I can't tell at a glance whether other issues still persist.
2. Do we know anything about whether this is exploited in the wild or not? I can't tell from the abstract nor the conclusion.
The technical information is valuable, but anything actionable for users to mitigate effects is really hard to find :(.
[Edit] Ok, the disclosure and ethics subsection does mention that Apple, Mozilla and VLC have fixed these bugs in their latest releases, and Google and MediaTek are aware of the problems.
Those are fair questions, but note that this is a research paper (and an excellent one at that), not a blog post meant for general audiences. The abstract and introduction focus on the higher-level contributions of the paper rather than on what it means for end users.
I fully agree it is a research paper and the subject matter is absolutely great! My complaint is that the paper hints in the abstract at what may be affected, that got picked up in the title, and it was then challenging to quickly find actionable details for either an end user or someone only mildly interested in the topic.
It also linked to a GitHub page, which is a great place to include the short non-research content. I think the GitHub page was a WIP at the time the article appeared on HN.
I know my attitude was critical, but I also tried to include that info in the message (even though I found the responsible disclosure subsection after posting my original comment). I also feel that info should be present on HN in one of the top comments for these kinds of articles (regardless of whether it's mine or someone else's).
> It also linked to a GitHub page, which is a great place to include the short non-research content. I think the GitHub page was a WIP at the time the article appeared on HN.
I see. When I made my reply, it already linked to the PDF. So you can't really be blamed, and neither can I. Such confusion is normal when a link changes.
> I know my attitude was critical, but I also tried to include that info in the message
The paper was written to suit an academic audience first, which includes people reading it 20 years later. Actionable advice for users is then in accompanying notices.
Also, papers usually get published months after they were submitted, so by then the security bugs are typically already fixed, and if you are on a somewhat up-to-date patch level you are safe. I say typically because some vendors might not be as quick to patch, or the bugs might not be as easy to fix (say, bugs in hardware or such).
The problem with a general-purpose fuzzer is that the H.264 format is complex: you'd end up with a lot of syntactically incorrect files, which decoders easily reject. H26Forge is a specialized fuzzer that produces syntactically correct but semantically incorrect files, and that's how it finds actual vulns before the heat death of the universe.
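To make the idea concrete, here's a rough sketch (the struct and field names are made up for illustration, not H26Forge's actual API): parse the bitstream into structured syntax elements, nudge values that stay within the grammar but outside what decoders expect, then re-encode.

```rust
// Hypothetical structured view of a sequence parameter set (SPS);
// field names follow the H.264 spec, but the types are illustrative.
struct SeqParamSet {
    pic_width_in_mbs_minus1: u32,
    pic_height_in_map_units_minus1: u32,
    log2_max_frame_num_minus4: u32,
}

// Push one field to an "interesting" but still encodable value.
fn mutate_sps(sps: &mut SeqParamSet, choice: u32) {
    match choice % 3 {
        0 => sps.pic_width_in_mbs_minus1 = (1 << 16) - 1, // absurdly wide frame
        1 => sps.pic_height_in_map_units_minus1 = 0,      // degenerate height
        _ => sps.log2_max_frame_num_minus4 = 12,          // at the spec's upper bound
    }
    // A real tool would now re-serialize the SPS with ue(v) coding so the
    // output still parses, and fix up any fields that depend on these values.
}
```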
Re Rust: the problem here is hardware acceleration, as far as I can tell. Even if we had a pure Rust H.264 decoder, you'd probably still want to use whatever decoder your hardware has, to use fewer resources overall. The drivers might be the place to look, and there's some progress on that front in Android for example, but as things stand, fuzzing like this is extremely valuable.
Isn't the whole claim to fame for AFL that it largely mitigates or avoids that problem by tracking branch coverage so it doesn't waste time permuting the input in ways that don't change the program behavior meaningfully?
AFL is great for most file formats (e.g. ELF), but probably not suitable for video formats like H.264, which uses complex encoding even for things as simple as frame width/height in the header (see things like ue(v) and CAVLC).
It will take ages for AFL to generate a valid H.264 NALU that isn't rejected outright.
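For a sense of why blind bit flipping struggles here: many H.264 header fields use Exp-Golomb ue(v) coding, where a value v is written as N leading zero bits followed by the (N+1)-bit binary form of v+1. A minimal sketch of that encoding (illustrative, not taken from AFL or H26Forge):

```rust
// Encode a value with Exp-Golomb ue(v), as used for many H.264 header
// fields: N leading zeros, then v + 1 in binary (N + 1 bits total).
// Flipping a single bit can change both the value and how many bits belong
// to it, so everything after that point shifts and re-parses differently.
fn encode_ue(v: u32, out: &mut Vec<bool>) {
    let code = u64::from(v) + 1;
    let bits = 64 - code.leading_zeros(); // significant bits of v + 1
    for _ in 0..bits - 1 {
        out.push(false); // leading zeros
    }
    for i in (0..bits).rev() {
        out.push((code >> i) & 1 == 1);
    }
}

fn main() {
    let mut bits = Vec::new();
    encode_ue(3, &mut bits);
    let s: String = bits.iter().map(|&b| if b { '1' } else { '0' }).collect();
    println!("{s}"); // prints "00100"
}
```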
How do you know *ahead of time* which mutation of the input will result in a new path through the code? You don't. What you can do is deduplicate candidate inputs for mutation based on the branches taken / the path.
AFL works by trying to modify bits and seeing what branches change direction. Arithmetic coding means this relationship desyncs almost instantly, so it’s hard to mutate into interesting test cases.
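In other words, coverage feedback drives the mutations, roughly like this stripped-down loop (the `run_target` stand-in and the single bit-flip mutation are simplifications; real AFL uses instrumented edge counters and many mutation stages):

```rust
use std::collections::HashSet;

// Stand-in for executing the instrumented decoder on an input and
// collecting the set of edges it hit. Not a real AFL API.
fn run_target(_input: &[u8]) -> HashSet<u64> {
    HashSet::new()
}

fn fuzz(seed: Vec<u8>, iterations: usize) {
    let mut corpus = vec![seed];
    let mut seen_edges: HashSet<u64> = HashSet::new();

    for i in 0..iterations {
        // Pick an input from the corpus and flip one bit.
        let mut input = corpus[i % corpus.len()].clone();
        let bit = i % (input.len() * 8);
        input[bit / 8] ^= 1u8 << (bit % 8);

        // Keep the mutant only if it reached code we haven't seen before.
        let coverage = run_target(&input);
        if coverage.iter().any(|e| !seen_edges.contains(e)) {
            seen_edges.extend(coverage);
            corpus.push(input);
        }
    }
}

fn main() {
    fuzz(vec![0u8; 16], 1_000);
}
```

With arithmetic-coded formats, the bit flip rarely maps to a clean change in behavior, so the loop gets little useful signal to work with.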
Yes, fuzzing does work on decoders. I can't remember how deep AFL managed to get, but I do remember a flurry of crash bugs against our decoder when somebody first tried it.
IMO any complicated protocol or format will be subject to crashing bugs because just verifying correct behavior is difficult.
To discover bugs, just build the xyz file yourself. People tend to use tools to generate content, and those tools generally don't make invalid content. That's a general problem with QA/verification.
I wonder if the iOS versions would have been tracked down if the researchers didn't have access to Corellium. I'm glad they did, since it sounds like a pretty nasty exploit you could trigger from almost any web page.
The iOS issues were found by directly playing generated videos on an actual iPhone with iOS 13.3. The kernel panics helped guide us on where to look in Ghidra. Corellium was helpful for kernel debugging, and testing newer versions of iOS. Without Corellium, kernel debugging may have been more painful.
You've probably never looked at the Bluetooth spec. It has grown so big that no one is able to figure it out completely, and you can't confidently say there are no bugs in your implementation, because reading it in full is nearly impossible. The CVE database has tons of Bluetooth-related bugs, and more are added every year.
I remember stories from 20 years ago about mplayer having security issues. These were supposedly used by people connected to the music industry to hack people who were downloading music illegally. It felt very much like scaremongering; I didn't hear much of a follow-up.
In this age of video and especially video on the web, the added complexity might have accounted for even more security issues in decoders.
I find it funny that now fuzzers are being written in rust, as if that translates to better quality bugs being found.
It looks to me like the effort would have been better spent writing a decoder in Rust. AFAIK Mozilla moved the mp4 parser (that's the container format often used around H.264) in Firefox to Rust, but their H.264 decoder is from ffmpeg (?). In the end the H.264 will most likely be decoded on the GPU using closed-source code from the hardware vendor.
Kudos to the team for doing smart fuzzing instead of just throwing garbage. Most fuzzing projects spend much less brain power, and usually get worse results.
The mp4 demuxer is indeed in Rust [0], and runs in the content process (= the process in which the web page is loaded).
We don't have a h264 decoder in our source tree, we use the platform's decoder (because of patents). It's very often in a separate, dedicated process, and when it's not, it's in the GPU process, because when hardware accelerated decoders are used, they're using more or less the same resources as the rendering code.
Those other processes run with the tightest sandbox possible (per process type, per platform, etc.), and don't have access to the web page.
On Linux, the platform decoder we're using is `libavcodec` from FFmpeg, but that's still in a separate process with a tight sandbox.
We're also doing something interesting, which is compiling libraries to WASM and then back to native code to get memory safety [1]. This is used when performance isn't critical (so not for codecs, but e.g. for a demuxer that we don't want to rewrite in Rust).
> now fuzzers are being written in rust, as if that translates to better quality bugs being found.
I'm not sure if you're being facetious, but this is a classic straw man fallacy[0]: you've constructed a nonsensical motivation for the authors' use of Rust, then argued against that motivation.
There is superficial similarity between (1) "the authors wrote the fuzzer in Rust" and (2) "the authors wrote the fuzzer in Rust because it translates to better-quality bug findings", but the paper's authors did not claim (2), nor would any reasonable person claim (2).
More likely is that Rust is a useful language for authoring fuzzers because it is fast and supports modern abstractions while eliminating multiple troublesome categories of bugs. Fast performance, zero-cost abstractions, automatic memory management, and concurrency safety are useful language properties regardless of their relevance to security bugs.
A Rust decoder was something discussed at the start, which is why we chose the language. As research goes, we primarily focused on just the H.264 syntax elements.
The Chromium folks are working on a Rust crate called cros-codecs [1] for VP8, VP9, and H.264 parameter set parsing, with VAAPI as a back-end.
> I find it funny that now fuzzers are being written in rust, as if that translates to better quality bugs being found.
I can't find any such claim in the article. It says the tools are written in Rust and Python. I don't see any claims that fuzzing with a Rust-based fuzzer produces "better quality bugs". That seems to be your own assumption projected onto the authors?
It's more likely that the authors wrote the tools in languages they're comfortable working with. It's not surprising that security researchers would be familiar with writing Rust.
You seem to be imputing motivation for using Rust, then criticizing them for making your assumption. If they had written a fuzzer in python, would you assume they used python because they had the mistaken idea that a slower program would produce better results?
It seems like a much more likely explanation is they know Rust well and it is their tool of choice. Another possibility is that a good fuzzer is non-trivial and they appreciate the compile-time checks offered by the language to avoid an entire class of errors.