Chrome: Heap buffer overflow in WebP (googleblog.com)
333 points by skilled 12 months ago | 262 comments



Within Google Chrome, WebP images are decoded by the renderer, and therefore any exploit can only get renderer code execution - which is inside a sandbox.

Since the renderer is very complex, there are lots of exploits discovered each year for it. However, when someone manages to get code execution in a renderer they don't get much more access than a webpage would normally get. Specifically, they can't see or leave files on the filesystem of your machine, nor even read cookies for other domains.


> when someone manages to get code execution in a renderer they don't get much more access than a webpage would normally get

Unless of course there is a sandbox break-out exploit to combine this with. So not an immediate priority, unless there is such an exploit in the wild that I don't know about, but important to patch as soon as not terribly inconvenient.


Well, this was exploited in the wild. Here's the actual commit for the bug fix in webp. https://chromium.googlesource.com/webm/libwebp/+/902bc91


Aye, TFA mentions that.

It doesn't say if it breached the sandbox (I'd expect another CVE if there is an active sandbox flaw also), or indeed if the exploit targeted Chrome specifically at all or the library more generally.


There are two different possible attack vectors.

1. First, once you "local root" within the sandboxed process what can you compromise?

a. Can the session cookies be stolen? (Think gmail inline attachment rendering a malicious webp image).

b. Can you launch attacks on other local network resources from within the compromised sandbox – read insecure filesystems, make TCP sockets to local resources, etc. – these are not prevented by the OS sandboxing capabilities that Chrome depends on.

2. Second, can you probe for vulnerabilities in the OS sandboxing that Chrome depends on to break out of the sandbox? On older unpatched OS versions this is definitely possible. These attack vectors may have been fixed in newer OS patches already, so you won't see a new CVE or patch for them from the OS vendors.


> b. Can you launch attacks on other local network resources from within the compromised sandbox – read insecure filesystems, make TCP sockets to local resources, etc. – these are not prevented by the OS sandboxing capabilities that Chrome depends on.

Huh? Chrome on Linux uses a network namespace to remove all networking from the renderer process, and then uses seccomp to forbid most system calls, and can very easily prevent e.g. opening new sockets with that.

https://chromium.googlesource.com/chromiumos/docs/+/HEAD/san...

https://blog.chromium.org/2012/11/a-safer-playground-for-you...

https://blog.cr0.org/2012/09/introducing-chromes-next-genera...
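
As a rough illustration of the mechanism (not Chrome's actual sandbox code, which uses its own seccomp-BPF policy layer rather than libseccomp, and is default-deny), here's a minimal sketch that blocks socket creation:

    /* Minimal libseccomp sketch: allow everything except socket(2), which is
       made to fail with EPERM. Chrome's real sandbox is far stricter; this
       only shows the kind of rule involved.
       Build with: cc seccomp_demo.c -lseccomp */
    #include <seccomp.h>
    #include <errno.h>
    #include <stdio.h>
    #include <sys/socket.h>

    int main(void) {
      scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_ALLOW);  /* default: allow */
      if (ctx == NULL) return 1;
      /* socket(2) now returns -1 with errno = EPERM instead of succeeding. */
      if (seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(socket), 0) < 0 ||
          seccomp_load(ctx) < 0) {
        seccomp_release(ctx);
        return 1;
      }
      seccomp_release(ctx);

      int fd = socket(AF_INET, SOCK_STREAM, 0);
      printf("socket() returned %d, errno %d\n", fd, errno);
      return 0;
    }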


Ad 2. You don’t need a vulnerability in OS sandboxing. You need a way to start communicating with the browser process, since the way renderer processes do I/O is they delegate it to the browser process by communicating over a socket. With a sufficiently advanced arbitrary code execution that’s possible without trying to find bugs in the kernel.


I'm not in security. What are the answers for 1a and 1b? Are they hypothetical? I assume not.

2, as they mentioned, is the separate CVE they suggested.


Here is a description of process and site isolation in Chrome:

https://chromium.googlesource.com/chromium/src/+/main/docs/p...


Feels like it must have, otherwise what would be the point? But perhaps the CVE will come from Microsoft or some other vendor, if it's Renderer -> Kernel exploit.


> what would be the point?

It was initially reported as an exploit in Apple's ImageIO library, which is not properly sandboxed on iOS. https://citizenlab.ca/2023/09/blastpass-nso-group-iphone-zer...


> otherwise what would be the point?

The exploit could be a PoC that at the moment does something noticeable but relatively innocuous. If there was a sandbox exploit that is already exploited along with this one, I'd expect this announcement to have been held back until when (or after) the other is announced.

ACE & RCE bugs should be taken this seriously even if another (as yet not known to exist) bug needs to be chained with it in order to do _real_ damage, because for all we know a black-hat out there could be sat on such a bug hoping no one else notices (and therefore it gets fixed) before something like this comes along to provide the other half of a really serious zero-day PitA.


> The exploit could be a PoC that at the moment does something noticeable but relatively innocuous.

Why would you exploit some DC lawyer with a POC?


Unrelated, what does "TFA" stand for? The only thing I can come up with is "the fucking article", which I really doubt is correct.


That is what it stands for. Inspired by RTFM, I assume. I learned these on Slashdot about 20 years ago ... they made sense in that context, and have carried on as familiar jargon ...


This is correct. An alternate reading is The Friendly Article. Either way it means "the article or page being discussed"


https://en.wiktionary.org/wiki/TFA

The Fucking Article

You can say "The Fine Article" if you're feeling nice. :)


I've always read it as The Featured Article


Same. As a shorthand reference for "the article that is under discussion" it doesn't need to convey the hostility/frustration that "RTFM" does.

Unless you're saying "Did you even read the TFA?!?" but that would violate the code of conduct around here.


Getting permissions on a single website user context is already enough considering the vast amount of user uploaded images on the modern internet. It's a bit like XSS on steroids, because it is not limited to one site or frontend.


The idea here would be that I'm visiting Reddit, someone uploads a malicious WebP, and now my Reddit account is owned, right?


Sites like Reddit load user uploaded content from different domains (redditmedia.com) so you should be safe in this case.


I'm not that familiar with Chrome's process isolation but I think that the renderer process that displays your Reddit.com tab would still render that image, even if it came from a different domain.

Chrome doesn't spin up another process to render the image and then transfer the pixels.


Wouldn’t the exploit code be able to bypass all those cross domain checks? As long as they are running in the same sandbox, right?


It should be noted that some sites (not sure about reddit) will re-encode images in various ways (sometimes to entirely different formats), possibly mitigating this.

Exploits aside, there are quite a few undesirable behaviours you can cause with media, such as the bouncing video file which changes its height every frame and makes content around it reflow.


TBH I just used Reddit as an example of "site with images"; I'm not actually concerned about my Reddit account, nor do I think it'd be a good target.


Well websites are stupid if they don't reencode user-uploaded images for precisely this reason...

Otherwise an evil client uploads a malicious webp image, which then gets hosted and 'shared' by the server to other users, who upon viewing said image get exploited and share more malicious images...


100%. Websites that accept complex media from users really are stupid. It's bad enough sanitizing other input, but then there are the added costs of transcoding and storing user content, all while exposing processes to unnecessary risk. I'd just pipe the media upload to Cloudflare and have them deal with it


Unless you mean the client should transcode before upload (which they can, mind), that mostly makes the server exploitable.


The server is usually processing the images anyway, at the very least to remove EXIF metadata, but also to resize and compress the images for different resolutions.


You'd set up the server so that it's okay if it gets exploited. For example, maybe its only job is decoding images.


If an exploit gives you remote code execution on the image transcoding server, you can simply change the transcode operation into a pass through. Hell, you could even start adding your payload to other users' uploads.


I said the decoder, not the encoder.


Then transcode in a different server.

Not transcoding user-uploaded files is borderline negligent.


Transcoding images is why all old pictures on the web are a blocky mess of unholy JPEG.


There are ways to transcode images in such a way that if they're downloaded and reuploaded to your service you don't lose any further quality in your re-upload.


As someone who shared a good deal of Spore creations over the internet, I disagree.

(The game would generate a "card" with a visual preview, but would stuff the XML encoding of the creation into some PNG metadata field so the image could be dragged onto someone else's game.)


Reencoding images doesn’t mean they’re not exploitable anymore.



I'm using lossless webp, it's much more efficient than png

(oops, wrong thread, meant to reply to https://news.ycombinator.com/item?id=37479576 "jpeg is good enough")


This security bug specifically happens in the WebP Lossless decoder ("VP8L").


Looks like you're in a position of not losing anything (other than a system and accounts due to a hack).


Is it Web scale?


I lol'd


Are you saying that the renderer can't send information back to the website?

Also, perhaps you turned off JavaScript, and now all of a sudden the website can still execute code?


This is why I'm more sympathetic to browser developers being slow to adopt new formats. WebP is a marginal advantage over JPEG (mostly transparency), it hasn't seen much success, and now that's translated into multiple high-priority security holes; we're all going to be spending the next month deploying patches to everything that links against libwebp.

That doesn’t mean we shouldn’t do new things but I think as developers we’re prone to underestimate the cost pretty heavily.


Firefox has a nice solution which gives an extra layer of sandboxing which can be applied to individual libraries rather than a whole process: https://hacks.mozilla.org/2021/12/webassembly-and-back-again...


Yes - not saying it’s impossible to improve or that we shouldn’t add new things to the web platform, only that it always costs more than we anticipate.

One example: that’s great for Firefox but that helps a format become more common, which increases the likelihood of tools integrating a library but not using a browser-grade sandbox. How many apps end up passing user uploads through something like ImageMagick, for example? (And how many years will it be before the last container image is updated?)


For clarity, you don't really need a "browser-grade sandbox" for this -- the wasm2c route that Firefox/RLBox are using is quite lightweight. The only runtime support necessary is basically malloc/free + a signal handler to recover from OOB accesses (and even the signal handler is optional if you can tolerate a runtime slowdown). It's usually pretty easy to compile and link wasm2c's output with an executable that can speak the C ABI.

(I've contributed to wasm2c and think it's a cool tool.)


Fair - I was really just thinking about the ton of things on my Linux systems which link against libwebp.


To be fair, issues in JPEG decoding libraries have also been used in the past as a vector for malware payloads. While the WebP ecosystem is far less mature, I am pretty sure that "older" format handling would also have its fair share of security issues.

But your reasoning is valid: it seems like only a few weeks ago netizens were arguing that JPEG XL should be adopted as fast as possible, and that for that to be possible the browser developers "only needed to include the reference decoder code into their codebase" at "very little cost".


> netizens were arguing that JPEG XL should be adopted as fast as possible, and that for that to be possible the browser developers "only needed to include the reference decoder code into their codebase" at "very little cost".

Because otherwise AVIF should not have made it into the codebase. High-profile C/C++ projects can't prevent all security bugs, but they can make them easier to find and harder to get in. AVIF and JPEG XL roughly have the same impact in this regard (written in C++, uses Google's standard convention, tested and fuzzed regularly, and so on).


AV1 video support was already a baseline since Youtube and other websites really want to use it, so AVIF didn't add any attack surface with the decoder. (this is also unlike WebP vs VP8, since WebP added a lossless mode which is quite literally the part this vuln is in)

HEIF container parsing was the additional attack surface added by AVIF, and while it's probably more complex than JPEG-XL's container alone, it's definitely less complex than a full JPEG-XL decoder.


> High-profile C/C++ projects can't prevent all security bugs, but they can make them easier to find and harder to get in. AVIF and JPEG XL roughly have the same impact in this regard (written in C++, uses Google's standard convention, tested and fuzzed regularly, and so on).

Isn’t all of that true of libwebp? I’m sympathetic to the argument that it’s a lot of work to replace C but I’ve been hearing that C/C++ will be safer with enough diligence and better tools since the 90s and it hasn’t happened yet.


Correct, hence "[they] can't prevent all security bugs". They are written in C/C++ primarily because their primary users (e.g. web browsers) will still use C/C++ anyway, so the interoperability is more important. This is changing thanks to Rust and other languages, but the pace is still much slower than we hope.


Then why is it C/C++? To be allowed into the browser, a new codec ought to be implemented in a memory-safe language.


Because browser vendors have already invested too much into the existing C/C++ code base. They can thus accept new code held to the same coding standard.


Those same vendors are using Rust and Swift now, which have comparable performance and solid interoperability. It seems like time for a policy saying new internet-facing code must be implemented in a safe language.


Browser vendors are ahead of you on this. They just need time to put it into practice.


Yes - I know both Mozilla and Chrome are doing this. The main thing I was thinking is that it’d be a lot more supportable if the response was, say, “we’ll add JPEG-XL only in Rust”.


To be clear, I’m not saying we shouldn’t add new formats - more that it’s one of those “take the developer’s estimates for how long it’ll take and double them, and then put a zero on the end” situations. I’m not sure WebP was worth it but AVIF definitely is and both of them do have one advantage over other formats if they can share code with VP8 and AV1, respectively, since browsers were definitely going to ship those.

What I wonder is whether this is saying there should be two tiers, where stuff like JPEG XL might be implemented in WASM for better sandboxing so browsers could more easily add support with less security risk and then possibly “promote” it to a core format if it proves successful and there’s enough performance win.


Except no one will want to ship an encoder/decoder that costs performance (network and CPU) as well as requiring JS. It would take a lot of images to break even on the cost of shipping the WASM, so these things would never get adopted.


But would these bugs have been discovered if the browsers had not adopted the new format? There are many other image formats whose libraries are most likely riddled with bugs, but nobody cares because they are not used by major software (certainly not the people capable of finding and exploiting these bugs, because of the poor time/reward ratio). Being unused for longer wouldn't make them any less buggy.


This vulnerability was found in Apple's ImageIO library as used in their iMessage application on iOS...

https://citizenlab.ca/2023/09/blastpass-nso-group-iphone-zer...


Found to be part of a Pegasus spyware installation exploit. This is from the company that sells spyware to dictators so that they can arrest & murder people critical of them…


I can confirm that. I talked about it in my last blog post.


The security holes are caused by C/C++ being unsafe languages to develop with, not by new image formats. If image (and other) encoders and decoders are not written in unsafe languages, it's unlikely they will introduce any such bugs.


That would definitely help, but it doesn’t eliminate the problem entirely (consider for example the attacks on hardware accelerators). I do think that’d be a good policy: new codecs have to be written in Rust or run in WASM.


It is a good idea. When WebAssembly becomes more widespread, it is likely we can drop fixed encoders altogether. Though at this stage the performance impact might still be too much to warrant anything like this.


So every website would ship with its own codecs? There aren't shared caches across domains. Why would anyone choose that much latency+bandwidth+compute (which would now require JS enabled) to use a new codec over the native ones? Certainly not the small–medium sites.


Because of the total saved bandwidth. Codec binaries are a few hundred kB at most, while high quality images are dozens of megabytes.


My JPEG XL decoder uses WASM: https://github.com/niutech/jxl.js


Nice! I did a quick proof of concept for JPEG 2000 a few years back but it really needed the sophistication of your multithreading and SIMD work.


It's also a far more complex format than JPEG, and complexity almost directly correlates with defect density.

That said, I also blame the culture of making code more complex than necessary, along with developers who barely understand the details.


So that slowness has bought nothing? You have a bad format with exploits in the wild while using that as an excuse to gate a great format. (Which the browsers could "slowly" RIIR instead, to address the safety issue directly.)


I don’t follow your argument - can you clarify?


Slow browser devs didn't stop this marginally better format from being added along with CVEs such as this, but it does prevent significantly better formats from being added. So being slow isn't the way to go; the safety issues should be addressed with safety tools like memory-safe languages and things like Wuffs instead of by slowing down progress.


First, I would submit that possibly people have learned from this - WebP is 12 years old and both the level of craft and the tools have improved since then. If you had suggested Rust back then you would have been proposing far more of a gamble than now.

Second, I don’t think this is preventing better formats due to safety conservatism - AVIF is at roughly 83% and climbing – but it does support the argument that the Chrome team has too much unilateral control over the web. I don’t think their opposition to JPEG-XL is solely based on the support cost.


Agree. This is one of the reasons it's better to go with older and more reliable JPEG for viewport streaming. An exploit chain would need to penetrate screen capture images to pass to the client. Browser zero days do occur and this is why it's important to have protection. For added protection people often use browser isolation, which specifically adds another layer of protection against such zero days.

If you're interested in an open source (AGPL-3.0) solution for zero trust browser isolation check out BrowserBox on GitHub. We are using JPEG (now WebP) for viewport pixels: https://github.com/dosyago/BrowserBoxPro

Technically, we did try using WebP due to its significant bandwidth gains. However, the compute overhead for encoding versus JPEG introduced unacceptable latency into our streaming pipeline, so for now, we're still against it. Security is an additional mark against the newer standard, as good as it is!


“JPEG is more secure than WebP” is the wrong takeaway here.


Is it? OK, what's the right takeaway to you?


Image decoding is complex


Hm. Ok but why is that the right takeaway?


Eh, I think websites adopt WebP for its smaller size more than anything else.


Yes but you have to pay a LOT in bandwidth for a <10% savings to be worth the cost of supporting an entire extra toolchain and dealing with the support issues (better now but it took a decade not to have “I right-clicked and now I can’t open it” reports from users). Google and Facebook serve that much but most people do not.


For some datacenters, that 10% saving would be worth the effort and could push back costly maintenance to increase egress bandwidth.

And I would argue that, besides Facebook, the end user right-clicking and saving the image to use in an inappropriate manner (downloading the image is not the issue; using it without permission would be copyright infringement) would be an issue for some of the websites hosting the image.


> For some datacenters, that 10% saving would be worth the effort and could push back costly maintenance to increase egress bandwidth.

No argument - my point was simply that very few sites on the web fall into that category.

> And I would argue that, besides Facebook, the end user right-clicking and saving the image to use in an inappropriate manner

That’s only true for a subset of sites, only to the extent that this wasn’t covered by fair use, and it came up enough that it was a common objection.


We use webp internally for storing very small images that are cropped out of larger images (think individual bugs on a big strip). Webp lets us get them small enough we can store the binary image data directly in postgres which was a lovely simplification.

(We evaluated it for storing a bunch of other stuff but didn't find it worth the compatibility and need to transcode problems)


From experience, in many cases it's 50% savings when done correctly, and it makes the app/website considerably faster on large images when you have 20-50 images to load on one page.


Interesting - I’ve never seen that much compared to mozjpeg and other optimized codecs. We also lazy-load large images.


> the cost of supporting an entire extra toolchain and dealing with the support issues

Why I love features like Fastly's Image Optimizer. No extra work on our end but we get the bandwidth savings https://www.fastly.com/products/image-optimization


It would be cool if an app could use user-provided "codecs" for all sorts of common (media) things. That way I can determine an acceptable risk myself, customized for my circumstances.

Maybe I'll use an open, safe but feature-incomplete WebP implementation because I don't care if three pixels are missing or the colors are not quite correct. Maybe I'll provide a NULL renderer because I just don't care.

I know this sounds stupid, but a man can wonder.


You can use WASM for image codecs, like my JPEG XL decoder: https://github.com/niutech/jxl.js

There is also WebCodecs API: https://developer.mozilla.org/en-US/docs/Web/API/WebCodecs_A...


Did you know the path for decoding JPEGs on Apple hardware goes through the kernel?


Do you have a reference for this? That seems like a huge security issue with zero benefit, since decoding can be done easily in userland.


Performance


You mean AppleJPEGDriver? Yes, but I'm pretty sure I learned about that from one of your older comments.


It also has lossless capability; that's a big one to me, since it can replace PNG and JPEG for 99% of use cases.


Yeah, I'm not saying it has _no_ value but I don't think it's enough so to be compelling - it's widespread enough that we're probably going to be supporting it for the next few decades but the time where it had an advantage will be what, half a decade?


It's much better from my experience.


I mean, transparency is nice but I think it is reasonable to ask how much value a format will add relative to the cost & risk. WebP has already peaked but for compatibility it’ll need to be supported for decades to come.


This will be in today's Firefox 117.0.1 and Fenix 117.1.0: https://hg.mozilla.org/releases/mozilla-release/rev/e245ca21...


FTR there is a WebP decoder implementation in safe Rust in the image crate: https://github.com/image-rs/image

It used to be quite incomplete for a long time, but work last year has implemented many webp features. Chromium now has a policy of allowing the use of Rust dependencies, so maybe Chromium could start adopting it?


If Chrome had a flag for "use the safe versions of XYZ libraries, even if they might be slower" I would absolutely flip that switch and take the performance hit.


Right, but for most users, Chrome's differentiator is that it's fast. I can't imagine them flipping the flag on by default - and if 99.99% of users aren't going to use it, it's probably not worth it to bloat the binary size either. Maybe someone will make a chromium patch with it that people can keep rebasing, though


> Chrome's differentiator is that it's fast.

I don't think that's the whole story. Chrome aggressively marketed itself around its sandboxing and security at a time when browser drive bys were huge. Anyone I recommended Chrome to, I did so because of security, not performance.

> if 99.99% of users aren't going to use it,

Ideally the crate would get closer and closer to the performance required to enable it entirely. I just meant as a temporary state.

Also, I think more than .01% of users would enable it if there were a checkbox on install like "We can make your browser a tiny bit slower but potentially increase its security".

> it's probably not worth it to bloat the binary size either.

eh, there's so much in `chrome://flags` I don't think that they mind too much? idk


It's crazy to me in 2023 we're still writing new C/C++ code for something with as enormous an attack surface as a web browser.


The original commit in question: https://github.com/webmproject/libwebp/commit/f75dfbf23d1df1...

The commit that fixes this bug: https://github.com/webmproject/libwebp/commit/902bc919033134...

The original commit optimizes a Huffman decoder. The decoder uses a well-known optimization: it reads N bits in advance and determines how many bits have to be actually consumed and which symbol should be decoded, or, if it's an N-bit prefix of multiple symbols, which table should be consulted for remaining bits.

The old version did use lookup tables for short symbols, but longer symbols needed a graph traversal. The new version improved this by using an array of lookup tables. Each entry contains (nbits, value) where `nbits` is # bits to be consumed and `value` is normally a symbol, but if `nbits` exceeds N `value` is interpreted as a table index and `nbits` is reinterpreted as the longest code length in that subtree. So each subsequent table should have `2^(nbits - N)` entries (the root table is always fixed to 2^N entries).
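
For illustration, roughly what that two-level lookup looks like; the names (HuffEntry, ROOT_BITS, the toy bit reader) are mine, not the actual libwebp identifiers:

    /* Sketch of the two-level table lookup described above. Illustrative
       only: names and layout are simplified, not the real libwebp code. */
    #include <stdint.h>
    #include <stddef.h>

    #define ROOT_BITS 8  /* "N" above: bits read in advance for the root table */

    typedef struct {
      uint8_t  nbits;  /* bits to consume; in a root entry, a value > ROOT_BITS
                          means `value` is a second-level table offset */
      uint16_t value;
    } HuffEntry;

    typedef struct {      /* toy LSB-first bit reader over a byte buffer */
      const uint8_t* buf;
      size_t len;
      size_t bit_pos;
    } BitReader;

    static uint32_t peek_bits(const BitReader* br, int n) {
      uint32_t v = 0;
      for (int i = 0; i < n; ++i) {
        size_t p = br->bit_pos + (size_t)i;
        uint32_t bit = (p / 8 < br->len) ? (br->buf[p / 8] >> (p % 8)) & 1u : 0u;
        v |= bit << i;
      }
      return v;
    }

    static void skip_bits(BitReader* br, int n) { br->bit_pos += (size_t)n; }

    static int decode_symbol(const HuffEntry* table, BitReader* br) {
      HuffEntry e = table[peek_bits(br, ROOT_BITS)];
      if (e.nbits > ROOT_BITS) {
        /* Long code: the second-level table for this root prefix needs
           2^(e.nbits - ROOT_BITS) entries; the total count of such entries is
           what a fixed-size allocation can under-estimate for a hostile tree. */
        skip_bits(br, ROOT_BITS);
        e = table[e.value + peek_bits(br, e.nbits - ROOT_BITS)];
      }
      skip_bits(br, e.nbits);  /* consume the (remaining) code bits */
      return e.value;
    }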

The new version calculated the maximum number of entries based on the number of symbols (kTableSize). Of course, the Huffman tree comes from an untrusted source and you can easily imagine the case where `nbits` is very big. VP8 Lossless specifically allows up to 15 bits, so the largest possible table has 2^N + 2^15 entries when every single LUT is mapped to its own secondary table, and doing this doesn't need that many symbols (you only need 16-N symbols for each table). Amusingly enough the code itself had a mode where only the table size is calculated (VP8LBuildHuffmanTable called with `root_table == NULL`), but somehow it went unused and the fixed maximum size was assumed instead. So if the Huffman tree was crafted in the way that maximizes the number of entries, it would overflow the allocation.

To be fair, I can see why this happened; the Huffman decoding step is one of the most computationally intensive parts of many compression formats and any small improvement matters. The Huffman decoder optimization described above is well known, but the longer-code case is commonly considered less important to optimize because longer codes should rarely appear in general. The original commit message refuted this, and the change was merged. I'm not even sure that a memory safe language could have prevented this; this is a rare case where you actively want to avoid the overflow check [1].

[1] I should note that the memory corruption occurs during the table construction, which is not a tight loop, so a partial overflow check would be very helpful. The actual fix never altered the `ReadSymbol` function at all! But the safety of the tight loop should still be justified, and a wrong justification can ruin everything.


> I'm not even sure that a memory safe language could have prevented this; this is a rare case where you actively want to avoid the overflow check.

This component should be written in WUFFS. If you were correct that the bounds check isn't needed, that's fine: WUFFS doesn't emit runtime bounds checks. If, as was the case here, the software is wrong because it has a bounds miss, that won't compile in WUFFS.

You might be thinking, "That's impossible" and if WUFFS was a general purpose programming language you'd be correct. Rice's Theorem, non-trivial semantic properties are Undecidable. Fortunately WUFFS isn't a general purpose language. The vast majority of software can't be written with WUFFS. But you can write image codecs.


I agree that Wuffs [1] would have been a very good alternative! ...if it can be made more general. AFAIK Wuffs is still very limited, in particular it never allows dynamic allocation. Many formats, including those supported by Wuffs the library, need dynamic allocation, so Wuffs code has to be glued with unverified non-Wuffs code [2]. This approach only works with simpler formats.

[1] https://github.com/google/wuffs/blob/main/doc/wuffs-the-lang...

[2] https://github.com/google/wuffs/blob/main/doc/note/memory-sa...


To me it looks like more generality is neither necessary nor desirable.

It's better that the codec is 100% safe but there's a tiny amount of unsafe glue code, than make the codec unsafe to add generality.

Google has smart people; finding the bug in 15 lines of glue code before it goes out the door is definitely easier than finding this C bug.

This is the same thing that powers Rust successfully, but taking it further.


> It's better that the codec is 100% safe but there's a tiny amount of unsafe glue code, [...]

I meant that many codecs cannot easily be made in such a way because memory allocation can occur here and there.


> many codecs cannot easily be made in such a way because memory allocation can occur here and there.

This is an extremely vague complaint and thus suspicious. Of course we could imagine an image format which decides to allow you to declaratively construct cloud servers which transmit XML and so it needs DRM to protect your cloud service credentials - but I claim (and I feel like most people will agree) the fact WUFFS can't do that is a good thing and we should not use this hypothetical "image" format aka massive security hole.

Try specifics. This is a WebP bug. For a WebP codec, where does it need memory allocation? My guess is it does allocations only during a table creation step, and after it figures out how big the final image is. So, twice, in specific parts of the code, like JPEG.


WebP Lossless is a comparably simple codec; it is basically a PNG with more transformation methods and custom entropy coding. So I believe it is probably more doable to rewrite in Wuffs than other codecs. The entirety of WebP has a lot more allocations though, a brief skim through libwebp shows various calls to malloc/calloc for, e.g. I/O buffering (src/dec/io_dec.c among others), alpha plane (src/dec/alpha_dec.c), color map and transformation (src/dec/vp8l_dec.c), gradient smoothing (src/utils/quant_levels_dec_utils.c) and so on.


I've written a WebP Lossy decoder in Go and I'm confident that I can write one in Wuffs too.


If you want to open an image file of unknown resolution, you might want the library to allocate and return a byte array matching the resolution of the image.

Oh, you could design a more complex API to avoid that - but I'm a programmer of very average ability; ask me to juggle too many buffers and structs and I'll probably introduce a buffer overflow by mistake.


> This is an extremely vague complaint and thus suspicious.

Look I have nothing against using WUFFS but I don’t think you’re in a strong position to determine what’s suspicious: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...


I mean, sure, I think you should solve this class of problems with WUFFS. I wrote that for Apple a day or two ago, I wrote it here about Google, I'd say it to Microsoft, I'd say it on a train, or a boat, I'd say it in the rain or with a goat.

It's not going to stop being true, and I don't get the impression most HN readers already knew this sort of work should be WUFFS, so I'm going to keep saying it.

You can pair me with lots of things, "tialaramex vaccine" works, "tialaramex oil", "tialaramex UTF8", "tialaramex Minecraft".


> memory allocation can occur here and there

What about HW decoders? For video formats (which are much more complex than image formats)? Such as MPEG, H.264, H.265, VP9, AV1 and similar. If memory allocation is needed here and there, I guess there is always a known maximum size for such an allocation in advance, written in the spec. Think of it at chip design time, or at software decoder compile time. How else would HW decoders even be possible?

Also: Hey Google, do you even fuzz-test? Your own stuff?


Looks pretty interesting. I wonder if the result produced by WUFFS-transpiled code could be used as a drop-in replacement.

Of course the issue with transpiled code (same for generated code) is that it makes debugging a lot more difficult.


Chrome's GIF rendering is already WUFFS, but no, this would not be a "drop-in" replacement because it needs glue. I assume the GIF decoding has glue but you can't "just" reuse it.


Wait, really? (Searches) Oh indeed [1]. I should note that the glue code is more than 1000 lines of C++ though. Still glad to see that it got into production.

[1] https://source.chromium.org/chromium/chromium/src/+/main:thi...


Wuffs and SkWuffsCodec.cpp author here. Wuffs' GIF decoder has shipped in Google Chrome since July 2021 (milestone M93).

A fair chunk out of that 1000 lines of glue code is adapting Wuffs' API to Skia's API. (Skia is the 2-D graphics library used by Chromium and many other projects). Specifically, GIFs can be animated, Wuffs' animation API is designed for sequential access and its state needs O(1) memory but Skia's SkCodec animation API allows random access and needs O(N) memory, where N is the number of animation frames.

Random access means that, after decoding frame 100, the SkCodec can rewind and produce frame 70 (by scanning backwards through its O(N) state to find the most recent I-Frame equivalent that's <= 70 and replaying forward from there).

Random access seems a bit of a weird feature to me, but Chrome/Skia's old GIF codec could do it, for whatever historical reasons, so the new Wuffs-backed one does too (even though it needed a chunk of glue code).


As pointed out by another commenter, WUFFS would not itself allocate memory, which is fine, but would need some additional help to perform that task


Thanks for your explanation.

I’m not a C programmer over the past decade+ and was never very good at it anyway. But I was thinking, based on your description, I agree that a bounds check would have caught the issue, but I’m also curious if an automated test could have been constructed to catch this sort of thing. I personally work with code where you could factor out some calculations to their own function and test them in isolation. Perhaps that would be tough to do here because of performance; I am really not sure.


I believe it is possible to construct a dedicated automated test, but normal fuzzing wouldn't be very effective in this case because you need some non-trivial logic to construct a Huffman tree with a very large LUT. It's like a condition where phi(input) equals a specific number [1]; you can construct an input by hand, but fuzzers generally have no idea how to approach that. I do think this was found by fuzzing, but it is entirely reasonable that fuzzers took a long time to trigger the actual memory corruption.

That said, this particular bug could have been avoided by careful coding; VP8LBuildHuffmanTable could have received the maximum size of `root_table`, just like snprintf. It still would have taken a long time to find, but at least it could never have been a security bug.

[1] https://math.stackexchange.com/questions/265397/inversion-of...
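
To make the snprintf analogy concrete, here's a toy sketch of that contract (not libwebp's actual API, and the size accounting below is deliberately simplified to one second-level table per long code, which is not the real VP8L sizing rule):

    /* Toy illustration of an snprintf-style contract for a table builder:
       it writes at most `cap` entries and returns how many entries the
       (possibly hostile) input actually requires, so the caller can reject
       the bitstream instead of overflowing a fixed-size buffer. */
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define ROOT_BITS 8

    static size_t build_table_checked(uint16_t* out, size_t cap,
                                      const uint8_t* code_lengths,
                                      int num_symbols) {
      size_t needed = (size_t)1 << ROOT_BITS;             /* root table */
      for (int i = 0; i < num_symbols; ++i) {
        if (code_lengths[i] > ROOT_BITS) {
          /* simplified sizing: one second-level table per long code */
          needed += (size_t)1 << (code_lengths[i] - ROOT_BITS);
        }
      }
      if (out != NULL) {
        /* placeholder for the real entry-filling logic, capped at `cap` */
        memset(out, 0, (needed < cap ? needed : cap) * sizeof(*out));
      }
      return needed;
    }

A caller with the fixed 2704-entry budget would then compare the returned count against the budget and reject the image when it is exceeded, which is in the same spirit as measuring first instead of assuming a fixed maximum.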


I guess my thinking was based off of this:

> Each entry contains (nbits, value) where `nbits` is # bits to be consumed and `value` is normally a symbol, but if `nbits` exceeds N `value` is interpreted as a table index and `nbits` is reinterpreted as the longest code length in that subtree. So each subsequent table should have `2^(nbits - N)` entries (the root table is always fixed to 2^N entries).

Given that, wouldn't it be only natural to construct a test where nbits exceeds N, so that you exercise the code where `value` was being interpreted as an index, and `nbits` as the longest code length in that subtree?

Hmm, I guess that this could have been done and perhaps was done (I took a quick glance at the code and didn't really grok it in a couple mins), and the issue would be that even if you did do what I suggest, you would likely not exceed the maximum fixed size with your crafted test, and thus not catch the bug.


After some more reading, I concluded that the authors did think about this possibility but got the code wrong.

I mentioned that there was a hard-coded maximum number of entries. This is derived from zlib's enough.c [1], which determines the maximum possible size of a 2-level table for a given number of symbols, N, and the maximum allowed bit length (here 15). I've verified that those numbers indeed come from this program:

    for ((color_cache_size = 0; color_cache_size < 12; ++color_cache_size)); do
        # Note that color_cache_size = 0 entirely disables the color cache, so no symbols
        ./enough $((256 + 24 + (($color_cache_size > 0) << $color_cache_size))) 8 15
    done
So in the worst case, there are 256 + 24 + 2^11 = 2328 symbols possible, and the maximum table size is 2704. (Caveat: I couldn't verify values for color_cache_size >= 8 with the current version of enough.c. Probably the value was calculated with an alternative implementation using bigints.) So this bug cannot be found by throwing randomly constructed (valid) Huffman trees at it!

But enough.c states that it covers "all possible valid and complete prefix codes". In other words, it assumes that the Huffman tree has already been verified to be correct. If that's not the case, it is very easy to construct much worse cases. For example `./enough 280 8 15` returns 654, which is possible with the following tree:

    Len  Code range                            #         Root entry         Overhead  #
    ---  ------------------------------------  ---       -----------------  --------  ---
    1    0                                     1         0xxxxxxx           0         128
    9    10000000:0 .. 11110110:1              238       10000000-11110110  2^1       119
         11110111:0                            1         11110111           2^2       1
    10   11110111:10 .. 11110111:11            2
         11111000:00 .. 11111110:11            28        11111000-11111110  2^2       7
         11111111:00 .. 11111111:10            3         11111111           2^7       1
    11   11111111:110                          1
    12   11111111:1110                         1
    13   11111111:11110                        1
    15   11111111:1111100 .. 11111111:1111111  4
But the following partial tree should be able to reach 768 entries:

    Len  Code range                            #         Root entry         Overhead  #
    ---  ------------------------------------  ---       -----------------  --------  ---
    9    00000000:0                            1         00000000           2^7       1
    10   00000000:10                           1
    11   00000000:110                          1
    12   00000000:1110                         1
    13   00000000:11110                        1
    14   00000000:111110                       1
    15   00000000:1111110 .. 00000000:1111111  2
         00000001:0000000 .. 00000010:1111111  256       00000001-00000010  2^7       2
         00000011:0000000 .. 00000011:0001111  1         00000011           2^7       1
So the real issue here is the lack of tree validation before the table construction, I believe. I'm surprised that this check was not yet implemented (I actually checked libwebp to make sure that I wasn't missing it). Given this blind spot, an automated test based on domain knowledge is likely useless for catching this bug.

[1] https://github.com/madler/zlib/blob/master/examples/enough.c


I’m quite surprised coverage-guided fuzzing didn’t find it almost immediately, to the point that I wonder if it had even been set up.



Also interesting: Shortly after there is another commit that is related to oss-fuzz: https://github.com/webmproject/libwebp/commit/95ea5226c87044...

This could mean Google optimized their fuzzers for libwebp after finding that bug and now they're finding more.


Seems to be reported by Apple and looks a lot like this security update: https://support.apple.com/en-us/HT213906


Reported by Apple “and the citizen lab at the UoT Munk School”, which is exactly who’s attributed on the page you link to. So yeah seems pretty likely, either Apple uses libwebp internally in ImageIO or they made a similar mistake.


Former


It is interesting to see how different the reactions are now than last week


There is a long history of vulnerabilities in image codecs. The actual image processing is nice linear code you could write in memory-safe FORTRAN IV but compression means you have a lot of variable length data structures, pointer chasing, etc... And the pressure to make it run fast.


Does this affect electron as well? If so, what versions?


It does, see [0]. Fun fact: Signal desktop, which uses Electron under the hood, is running without sandbox on Linux [1][2].

[0] https://github.com/electron/electron/pull/39824

[1] https://github.com/signalapp/Signal-Desktop/issues/5195

[2] https://github.com/signalapp/Signal-Desktop/pull/4381


To be fair, if signal didn't use electron, they would probably have used libwebp still, and then they would either have to implement sandboxing themselves, or they would be subject to the same vulnerability. Still, that they didn't prioritize this is sad.


If they didn’t use electron they’d probably not even have a Linux client, so there would be no problem!

But the lack of sandboxing when it’s available, for something like Signal, is kind of weird. I wonder what the reasoning was.


I just love the unnecessary jousting with the stupid stale bot in 5195


Is there a realistic path to exploiting this? From what I hear, heap-spraying on 64bit is no longer practical. Is there a predictable object in memory that could be overwritten?


It's being exploited in the wild, so yes. Heap Spraying is definitely still a thing for kernel exploitation on 64bit, idk about the primitives people use for v8 exploitation.


And people wonder why the Pentagon forces users to run browsers in a VM or print out websites to read as flattened PDFs


Heh, ah yes, pdf: the MOST sekure file format :-D

But seriously, depending on the attack vector in mind, using those "browser over VNC" technologies may address a lot of that risk if the objective is to read content without risking the local workstation running arbitrary code


I'm sure that in 2023, no mainstream software runs image decoders outside a sandbox right??


... and then people chain it with a sandbox escape.

This bug is the initial vector of last week's NSO Group zero-click exploit for iPhones: https://citizenlab.ca/2023/09/blastpass-nso-group-iphone-zer...


The sandbox escape is the 'bigger' issue really.

Applications parse crazy complex stuff to do everything they do, so obviously have a really big attack surface. Often the complexity is unavoidable - if you are a web browser, you cannot avoid parsing html for example.

However the sandbox is designed to have an attack surface as small as possible (and be configurable via permissions to have the bare minimum needed). The sandbox's interfaces with the rest of the system are fully controllable by Apple, so there is no need to be passing complex and dangerous legacy data structures across the boundary either.

Therefore, it should be the sandbox that is 'hardest' to break out of.


Your point that sandboxes reduce attack surface is good.

> so there is no need to be passing complex and dangerous legacy data structures across the boundary either.

lol, by the same logic there is no need to pass complex and dangerous legacy stuff to the browser to parse; just rewrite the world to be simpler.


Yep, not following best practices was banned late 2022 so thankfully your fave outsourced shop is only shipping High Quality and Secure Code (tm)

Oh wait, that whole LLM thing...


Right, so all attacks now go:

1. Send payload to a process (image in a browser say)

2. Use payload to get code execution

3. If your current process has all the access you need then go forth and conquer; else

4. Send a payload to another process, go to step 2

Sandboxes are mitigations and roadblocks that increase the complexity of an attack; they don't make attacks go away.


Will libwebp get a new release/tag soon?



The WebP and WebP2 libraries are a lot more tested than those of other formats. I tried the official JXL lib and the HEIC tools, and I was able to identify bugs among those tools.

I'm pretty sure that you can find a buffer overflow in these decoders.


WebP2 is highly experimental, so don't count on it being safe.

libwebp is relatively simple and clean C code, and has been around for over a decade. There are definitely scarier codec implementations than this.


It would be interesting to know how this vulnerability has been found.


https://citizenlab.ca/2023/09/blastpass-nso-group-iphone-zer...

> Last week, while checking the device of an individual employed by a Washington DC-based civil society organization with international offices, Citizen Lab found an actively exploited zero-click vulnerability being used to deliver NSO Group’s Pegasus mercenary spyware.

And since the initial bug appears to be in Google's webp library other programs are also vulnerable.


Okay, but I mean the people who initially found the vulnerability - whether they conducted manual code reviews, automated static analysis, or fuzzing.


Visit the link in the comment to which you're replying.


I think he is talking about NSO group, or whoever sold it to them. However, obviously that's unlikely to get revealed.


I have to imagine that NSO group just watches the commit log for "I optimized webp" lol it's free CVEs


All of the above.


It's 2023, surely this is not yet another bug related to memory unsafety that could be avoided if we'd stop writing critical code that deals with extremely complex untrusted input (media codecs) in memory unsafe languages?

Yep, of course it is: https://github.com/webmproject/libwebp/commit/902bc919033134...

I guess libwebp could be excused as it was started when there were no alternatives, but even for new projects today we're still committing the same mistake[1][2][3].

[1] -- https://code.videolan.org/videolan/dav1d

[2] -- https://github.com/AOMediaCodec/libavif

[3] -- https://github.com/AOMediaCodec/libiamf

Yep. Keep writing these in C; surely nothing will go wrong.


We all know that’s The Price Of Freedom. As they say, “give me exploitable trivial memory safety bugs or give me authoritarian dystopia” (https://news.ycombinator.com/item?id=27638940).


Dystopia? Given a choice, you'd rather trust a compiler than a human given the billions already lost to buffer overflows, read after free, and other memory related bugs, surely?


We don't have a very good mainstream [1] solution for critical and performance-hungry applications. I do think that they are limited in number (many applications are not that performance-hungry) and Rust is already good enough, but those applications need the proof for correctness so that more dangerous code---say, what would need `unsafe` in Rust---can be safely added. This particular bug could have been reasoned once you know where to look at, after all.

[1] In my definition, this translates to "as popular and user-friendly as sanitizers".


> those applications need the proof for correctness so that more dangerous code---say, what would need `unsafe` in Rust---can be safely added

There are actually already tools built for this very purpose in Rust (see Kani [1] for instance).

Formal verification has a serious scaling problem, so forming programs in such a way that there are a few performance-critical areas that use unsafe routines seems like the best route. I feel like Rust leans into this paradigm with `unsafe` blocks.

[1] - https://github.com/model-checking/kani


Yeah, I'm aware of Kani, but it is not optimal for many reasons, including but not limited to: it's hard to install in some cases, loop unrolling is mandatory and does not scale in general (this is a main issue with bounded model checkers, not just Kani), and it's hard to give hints for efficient verification and/or harness generation. And Kani makes use of CBMC, which is a model checker for C/C++, so Rust is not even special in this regard.


It should be possible with Ada/SPARK. It's focused on low level systems programming and compilable programs are free from run-time errors. In addition, the correctness of the program or key properties can be proven at compile-time.


> "as popular and user-friendly as sanitizers" solution for critical and performance-hungry applications.

Google just enabled Memory Tagging Extension (ARM's hardware-accelerated sanitizer) on the Pixels last year. Maybe sanitizing ISAs offer hope?


MTE is good to have, but it only reduces the probability of successful attacks (by a factor of 16). If the process is allowed to segfault multiple times attackers have a good chance to work around MTE.


That's a general limitation with tagged pointers, I believe? Have you come across demonstrations of such exploits?


Do you have a source for that? AFAIK Pixel 7 uses an ARM v8.2 CPU, which doesn't support MTE. The upcoming Pixel 8 might support it, but Google hasn't announced anything yet.



Why aren't people using safe C compilers or libraries and stuff like that? Do they affect performance that much? If yes then what about libraries written in C so that they can be used in other languages (meaning performance is not the number one concern)?


This is a great question.

The answer is that it's literally impossible to write a "safe C compiler" since the language is inherently memory unsafe.

There are various static analysis tools that can try to simulate C programs and try to automatically discover memory management bugs, but due to fundamental limitations of computation they can never catch all possible faults.


How difficult is it to make a compiler extension that remembers buffers' size and checks if we're overflowing at each access? It could be used at least just in debug versions of critical software.

It doesn't sound impossible to me but I know nothing about compiler development :)


Hard. Apple actually has an RFC for this where functions taking buffer-like parameters are adjusted to take an additional length parameter, and then the compiler edits the code to plumb lengths through all of these things to insert a bounds check at use. This can work in many cases, but not all.

Rolling out this sort of change across a large codebase is hard as shit. While it sounds like it is mostly transparent, as soon as you run into a sufficiently large codebase all sorts of things start blowing up that you need to fix by hand before such a feature can be rolled out.

You can also do this with pointer tagging and some other techniques, but without hardware support this is amazingly slow. You can see just how much slower an asan build is, for example.


Apple is basically catching up with the Windows XP SP2 effort, which led to the introduction of SAL annotations on Windows, and yes it was the reason for its delay.


I think the short answer is "trivial in some cases, impossible in others". It's almost certainly possible that your compiler could inspect every allocation and tag each pointer with it internally. The problem comes with everything else - once you add loops and conditionals the length of that pointer can be all over the place. You'd basically need a symbolic executor tracking every pointer.

There are some big issues with this:

1. It's slow. Symbolic execution involves the interpretation of your program.

2. It would be imperfect and you'd likely have false positives.

3. It would likely be incomplete - for example, how would you handle the situation of only having a header?

So it's a good idea but it's very hard to make practically useful.


The easy way to do it involves changing the ABI of pointers so that they are now (address, bounds) pairs instead of just addresses. However, an awful lot of C code assumes that a pointer is just an address, and changing the ABI in this way will break the vast majority of non-trivial programs. (Witness the difficulty CHERI has in getting major software to work with it.)
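
A minimal software sketch of that (address, bounds) idea; real schemes (CHERI capabilities, ASan shadow memory, bounds-checking compiler extensions) work quite differently, this only shows why the representation change is ABI-breaking:

    /* A "fat pointer" carries its bounds, and every access goes through a
       check. Note sizeof(FatPtr) is twice sizeof(void*), which is exactly
       the ABI change that breaks code assuming a pointer is just an address. */
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
      unsigned char* base;
      size_t len;
    } FatPtr;

    static FatPtr fat_malloc(size_t n) {
      FatPtr p = { malloc(n), n };
      if (p.base == NULL) p.len = 0;
      return p;
    }

    static unsigned char checked_load(FatPtr p, size_t i) {
      if (i >= p.len) {                 /* the bounds check plain C never does */
        fprintf(stderr, "out-of-bounds read: index %zu, length %zu\n", i, p.len);
        abort();
      }
      return p.base[i];
    }

    int main(void) {
      FatPtr p = fat_malloc(4);
      printf("sizeof(void*) = %zu, sizeof(FatPtr) = %zu\n",
             sizeof(void*), sizeof(FatPtr));
      checked_load(p, 3);   /* fine */
      checked_load(p, 4);   /* one past the end: traps instead of reading */
      free(p.base);         /* never reached in this demo */
      return 0;
    }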


You can, it's called valgrind (or more accurately, memcheck). And people don't use valgrind because it is slooooooow. Dynamic checking is useful, but not an ultimate way to go.


> remembers buffers' size

Where?

Once you have a bare pointer, you've lost track of what the original definition might have been, so you (the compiler / runtime / programmer) have no way of knowing that you've exceeded the size.


That's not true, it is merely true on most ABIs. The only case where C really erases this information is casting to uintptr_t and back.


Unless it is clearly specified in the ISO C standard, it is true in practice, and something that is impossible to rely on.


gcc also has some builtins to check pointer sizes when the compiler is able to figure it out.

https://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Object-Size-Che...
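
For instance (GCC/Clang builtin; the exact folding behaviour is compiler- and optimization-dependent):

    /* When the compiler can see the allocation it knows the object size,
       which is what _FORTIFY_SOURCE-style checks build on. Mode 0 returns
       (size_t)-1 when the size cannot be determined; compile with
       optimization so the pointer-arithmetic case can be tracked. */
    #include <stdio.h>

    int main(void) {
      char buf[16];
      printf("%zu\n", __builtin_object_size(buf, 0));      /* 16 */
      printf("%zu\n", __builtin_object_size(buf + 4, 0));   /* 12: bytes left in the object */
      return 0;
    }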

Which is why I harp on the idea that the real problem is the gold bricks on WG14 who are intentionally blocking improvements to make C safer.

Also, I'd point out that if you can implement C on 16-bit x86's segmented architecture, you can certainly implement C with phat pointers too.


It's trivial but Big Tech is in bed with Big Hacker

Or it's hard like everyone keeps saying.

I'm going with the second option


...What is a "safe C compiler"?


Check in the fiction section, sir. This is the CS one.

Although I think all the C compilers are safe-ish lately. I haven't seen exploits that target defects in their output. Usually the error is ID10T, located in the pre-keyboard device.


...a textbook oxymoron?


Not necessarily a "safe compiler", but maybe a safe library for containers and things like that. It seems to me that most if not all major C projects just run sanitizers and static analysers.


rustc /s


All decent C compilers have compilation options so that at run-time any undefined actions, including integer overflow and out-of-bounds accesses, will be trapped.

The only problem is that these options are not the default and most C developers do not use them, especially for release versions.

I always use them, including for releases. In the relatively rare cases when this has a performance impact, I disable the sanitize options only for the functions where this matters and only after an analysis that guarantees that events like overflows or out-of-bounds accesses cannot happen.
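
As a concrete example of the kind of run-time check being discussed (the flags are the usual GCC/Clang spellings; availability and overhead vary by platform, and see the replies below about sanitizers in production builds):

    /* Compiled with, e.g.:  cc -O2 -fsanitize=address,undefined demo.c
       the write below is reported as a heap-buffer-overflow at run time
       instead of silently corrupting the heap. */
    #include <stdlib.h>

    int main(void) {
      int* buf = malloc(4 * sizeof *buf);
      if (buf == NULL) return 1;
      buf[4] = 42;   /* one element past the end of the allocation */
      free(buf);
      return 0;
    }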

Despite the hype, by default Rust is not safer than C compiled with the right options, because the default for Rust releases is also to omit many run-time checks.

Only when Rust will change the default to keep all run-time checks also in release builds, it will be able to claim that by default it is safer than C.

For now, when safety is desired, both C and Rust must be compiled with non-default options.


> Only when Rust will change the default to keep all run-time checks also in release builds, it will be able to claim that by default it is safer than C.

Which checks are you thinking of? The only thing that comes to mind is that integer overflow wraps instead of panics, but given that bounds are checked, it is still going to be a panic or logic bug rather than a buffer overflow.


It sounds like you're referring to sanitizers.

1. Notably, some sanitizers are not intended for production use. I think this has changed a bit for asan but at one point it made vulns easier to exploit. These aren't mitigations.

2. They're extremely expensive. You need tons of bookkeeping for pointers for them to work. If you're willing to take that hit I don't really understand why you're using C, just use a GC'd language, which is probably going to be faster at that point.

> Only when Rust changes its default to keep all run-time checks in release builds will it be able to claim that it is safer than C by default.

The only thing Rust turns off in release is that integer overflow panics in debug but wraps in release. That wrap cannot lead to memory unsafety.


FWIW it is not recommended to use asan+co for release builds. These are designed as debugging tools, if you use them in production builds they may actually open up new bugs. See also: https://www.openwall.com/lists/oss-security/2016/02/17/9

I don't think anyone has built anything practically usable that is meant for production, though it wouldn't be impossible to do so.


It's more or less okay to use UBSan in production though, and that can be good.

But sometimes DoS is considered an exploit, and in that case you don't want to make things easier to crash.


> the default for Rust releases is also to omit many run-time checks.

...because the type system and borrow checker satisfies them at compile-time?

The only checks that are omitted at runtime are:

- checks that are exhaustively proven to be unnecessary by LLVM

- checks that can never be triggered in the absence of UB

You shouldn't be triggering UB checks at runtime. If you rely on these checks, you're relying on UB itself, when all UB should be provably impossible.


Do you release with `-fsanitize=address` or what?


You can't really retrofit safety onto C. The best that can be achieved is seL4, which, while written in C, has a separate proof of its correctness: https://github.com/seL4/l4v

The proof is much, much more work than the microkernel itself. A proof for something as large as WebP might take decades.


> A proof for something as large as webP might take decades.

Assuming that it is even provable in the first place.


Stagefright was the point when I started to tell people to use a safe programming language.

When they ask "Which one?" I am answering "I don't care, as long as it's a programming language that uses a VM".

Every time I hear the discussions about how fast and perfect C is, people seem to miss the point that new programming languages try to avoid that complexity because their designers were using C/C++ themselves in the past and fell on their noses more than once.

It's not about what is faster. It is about how often you make mistakes, statistically. And chances are you make dozens of mistakes per day in C that you will never be aware of, whereas memory-safe languages have hundreds of mechanisms implemented to catch those common mistakes.


You should allow Rust as well, at least.


Yeah, ofc. My view on programming languages is that both Rust and golang try to use VM methodologies wherever possible when it comes to memory safety and ownership. They just try to do as much as possible at compile time rather than at execution time.


[flagged]


Mozilla's wiki [1] says Firefox has shipped more than 20 components implemented in Rust and has several more currently in progress. That sounds like tangible results.

The first component they implemented in Rust, I think, was the MP4 parser, in order to avoid exactly the kinds of vulnerabilities this article is about, which tend to occur in parsing code.

[1] https://wiki.mozilla.org/Oxidation#Shipped


Rust is now being introduced in Chrome, exactly because, no matter what, the team keeps running into these issues.

https://security.googleblog.com/2023/01/supporting-use-of-ru...


Servo was a major start to the Rust ecosystem (+ major help on Rust's path to a final design for 1.0) and also produced webrender and a few other pretty important libraries. Those are absolutely tangible results.


> without actual tangible results

The entire styling engine in firefox is written in rust.


Pretty solid motivation for anyone who tries to tackle the browser market (a task I envy no one for) to go with a language like Rust. Inherent advantages + the incumbents can't get it to work internally.


I don't know that there's a "browser market". As it is, I think the existing code bases are better served by gradually employing C/C++ verification and invariant-checking tools, and by not making the web any more complicated than it already is without a need (other than to maintain a browser cartel, that is).


> anyone tried to tackle the browser market

Bad news, barely anyone is even thinking about it. There are one or two players that are trying to build a new browser from scratch, but they are far from mainstream and nobody knows how long these efforts will exist.


Completely untrue. Servo was from the start a research project, but even then, big components like webrender and stylo are now parts of Firefox, and there's a whole list more here: https://wiki.mozilla.org/Oxidation


What is completely untrue? When the Rust team was let go in 2020, only a very minor part of Servo (Stylo) had been delivered and become part of FF, according to [1]. Yet the Rust core team, already notorious for lone language decisions, engaged in what amounted to an epic risk of a publicity catastrophe for Mozilla. And even in their post-mortem in [1], they care more about progressing Rust than about their employer or the (much more important) work they were paid for.

[1]: https://news.ycombinator.com/item?id=24143819


9.6% of Firefox is Rust as of today https://4e6.github.io/firefox-lang-stats/

That is 3.4 million lines.

And frankly I am upset by the way you characterize my post there. First of all, as I said, it was in a personal capacity, not a statement from the team. Second, as I was not employed by Mozilla at the time, of course I care more about Rust than I do Mozilla. However, that does not mean that I did not care for those who were let go, we were all very upset at how unfortunate the situation was. And the broader community stepped up and hired everyone, as far as I know.

And as for language like "already notorious for lone language decisions": that is how both the Rust project and Mozilla wanted it! Mozilla actively did not control language decisions, even when Rust was small enough that the only folks who worked on it were at Mozilla. Nobody wanted Rust to be a single-vendor technology, because of the exact risk of the situation that came to pass. Imagine if Rust had been 100% Mozilla at that time and they had abandoned it. Instead of being a bump in Rust's development, it would have been a catastrophe.

And, your original post: "the entire fsking Rust team." Again, Mozilla employed a very small number of people that made up the overall team. Yes, they were critical people, who did a lot of important work, but "fired the entire team" is straightforwardly inaccurate.


We still don't have an alternative. No, you can't use Arc<HashMap<Whatever>> in the middle of a decoder routine. Some software actually has to perform well.


In this particular case Rust could have helped, because the table construction can use normal Rust while the tight decoder loop can remain `unsafe`, which can be justified more easily. But I'd demand more than the human-readable justification.


What Rust brings to the table is static enforcement of its ownership rules, that's it.

How does Rust deal with buffer overflows? Bounds checking. What an innovative solution, congratulations to the Rust people for this groundbreaking innovation. And they keep acting like they've fucking discovered hot water.


Rust is novel because it bundles lots of existing ideas into one usable package. Even the borrow checking is not new, it's adapted and improved from research languages like Cyclone.


No one is saying that bounds checking is novel. It's really C and C++ that are novel, in that they are the only languages where bounds checking is not the default.

But if you're having the "memory safe replacement for C/C++" conversation it shouldn't surprise you that people bring up Rust.


Agreed, the interesting part isn't that hot water exists, it's that given the widespread existence of hot water so many people still insist on exclusively showering with cold water.

I'm not even saying that bounds checking should be used everywhere, just that it really does seem like unsafe shouldn't be the default for so many projects.


Time to repost my favorite C.A.R Hoare quote from 1980, as C was gaining traction.

"Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interests of efficiency on production runs. Unanimously, they urged us not to--they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980, language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law."


I'd say the analogy goes the other way around.

Showering with cold water is obviously safer (no chance of accidentally scalding yourself). But most people prefer showering with hot water because it's the way they've always done it, they're more comfortable with it, and while they could get burned by it, they view the risk of significant damage to be relatively low (if you discover the water is too hot, fix it quickly and you'll probably be fine).


Yea but did anyone else market hot water first? I think not!


Thankfully you can write C-like Rust code if you want to, just safe. Nobody says that you have to use an "Arc<HashMap<Whatever>>" for this task. Indeed, people have written competitive, safe decoders for various image and video formats in Rust already.


Even if bounds checking made the decoder 10x slower, would that even matter, outside of low end mobile devices? How many milliseconds are spent decoding images in your average website anyway?


If you think Rust is entirely memory-safe, I've got a bridge in Brooklyn to sell you.

Man can make it, man can break it.


That’s like saying we shouldn’t wear seatbelts or have airbags and antilock brakes because you can still die in a crash.

If anything, that’s still underselling it: there are entire categories of bug which are widespread even in well-tested, heavily-fuzzed C codebases like Chrome but would be unlikely, or often outright impossible, in Rust.


Does it have to be? Riding a bicycle with a helmet isn’t entirely safe, but I still prefer to wear one.


IMO, other related code by the person who made that mistake should be audited; there could be a similar mistake somewhere else.

Additionally, IMO it's possible that some developers would intentionally make such mistakes in order to sell them to interested parties; there are millions to be made here.


The breakage [0] was introduced by the creator [1] of the project. If you want to audit 1674 commits over the past 12 years, it'd be easier to just audit the full project.

[0] https://github.com/webmproject/libwebp/commit/21735e06f7c1cb...

[1] https://github.com/webmproject/libwebp/commit/c3f41cb47e5f32...


Establishing a blame culture would be quite detrimental. Bugs happen, regardless of who commits the code.

A more robust approach is to implement appropriate checks (like fuzzing, code analysis etc) that offer an opportunity to review and correct issues before they ship (and for people to learn from).
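For reference, "fuzzing" here usually just means a tiny harness handed to a coverage-guided fuzzer such as libFuzzer; below is a minimal, illustrative sketch against libwebp's public decoding API (WebPGetInfo, WebPDecodeRGBA and WebPFree are real libwebp entry points, but this is not libwebp's actual fuzz target):

    /* Build (roughly): clang -g -O1 -fsanitize=address,fuzzer harness.c -lwebp */
    #include <stddef.h>
    #include <stdint.h>
    #include "webp/decode.h"

    /* libFuzzer calls this with endlessly mutated inputs; ASan flags any
       out-of-bounds write the decoder performs while parsing them. */
    int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        int width = 0, height = 0;
        if (!WebPGetInfo(data, size, &width, &height))
            return 0;                      /* not a parseable WebP header */
        uint8_t *rgba = WebPDecodeRGBA(data, size, &width, &height);
        if (rgba != NULL)
            WebPFree(rgba);
        return 0;
    }

As far as I know libwebp already runs on OSS-Fuzz along these lines, which, as the reply below points out, didn't prevent this particular bug: fuzzers only reach the paths their corpus and mutations happen to cover.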


The Chromium developers do all of this, but it hasn't prevented this bug.


That's not a good reason to just blame a specific developer instead of having best practices in place.


It's software, bugs happen.


Many bugs can be prevented.


Why even bother looking at other code this person wrote? Straight to jail for such a capital offense!!! (jeesus dude...)


Nobody said that. The suggestion was that, if this is caused by, say, one person having a slightly wrong mental model, it's probably worth checking whether they made similar mistakes elsewhere. Not necessarily implying fault, just a heuristic. (And given the particular author, in this case probably not super useful, but still a decent question.)


This really does not sound good.


This is a vulnerability in the WebP library, so this isn't only about Chrome. Which software is affected by this?

- Android apps?

- Cross-platform Flutter apps?

- Electron apps?

- Microsoft Edge and Brave (and all other Chromium based web browsers)?

- Signal and Telegram displaying WebP stickers?

- Which image editors are affected? Affinity tools, Gimp and Inkscape seem to use libwebp.

- LibreOffice displaying WebP images?

- server software such as WordPress (does it decode images to re-encode them?)?

Apple devices were also affected, but they already got a fix.

Anything else?


I mean, it would most likely affect any software that uses libwebp. Taking a quick look at Arch's libwebp page:

* allegro

* emacs (lmao)

* ffmpeg

* freeimage

* gd (required for gnuplot, fceux, graphviz, etc)

* godot

* openimageio

* qt5/6-imageformats

* sdl2_image

* thunderbird

* webkit2gtk

* etc

optionally:

* gdal

* imagemagick

* python-pillow

* python-piexif

* etc

Should note that a vuln doesn't mean an exploit, of course.


Firefox?

Android WebView got patched so there's that.



Have only recent Android phones been patched, or 3-4+ year old phones as well?


That was the whole purpose of Project Mainline [0], which turned many critical system components into modules that can be updated through the Play Store regardless of manufacturer and OS updates.

Media codecs were one of the first things they turned into a module, specifically for this reason; they are one of the biggest sources of security patches. I actually remember hearing a stat that 90%+ of security patches are limited to a very small handful of components (media codecs, crypto libs, network stack). So by turning those into modules that can be updated independently of the OS, all Android devices get to benefit, even years after they're abandoned by their OEMs.

[0] https://source.android.com/docs/core/ota/modular-system


Phones of that era and later get WebView patched through Google Play, so they'll get the update.


log4j again?


WebP is an image format so it is likely a buffer overflow.

So C/C++ again.


I mean log4j as in a widely used library that has a vulnerability impacting a vast number of products.


It's nice that Google identified this security hole, but I suspect that Google is also the main reason for its existence in the first place: if Google didn't want to reduce by a few percent its costs for serving images, there probably would be no widespread usage of the WebP format.

(Similarly, it is nice that Google finds so many bugs in JavaScript engines, but on the other hand they were one of the main proponents of the bad idea of turning every web page into an executable program in the first place.)

ADDED. OK I was unfair in my guessing as to Google's motivation: the main motivation seems to have been to speed up page loading across the web.


Read some history, friend. People wanted to turn web pages into executable programs even before Google existed. Have you heard of Java applets (Sun)? Have you heard of the beginnings of JavaScript (Netscape)? Have you heard of VBScript (Microsoft)?

This was inevitable. Even if none of the aforementioned companies had done it, some other company would have.


The network is the computer!

:barf:


How is every webpage executable?


In order to reliably view the web pages I visit, I must be prepared to execute Javascript or more precisely employ a browser to do it for me.

The web was not always that way.


In my 15 years of web usage, there was no time when I could turn JS off and have everything work fine.


Well, that's pretty bad. User-uploaded WebP images are used in tons of messengers and tons of messengers also use Chrome's render engine to produce "native" apps that get statically compiled (i.e. a single update isn't enough to patch your system, every app needs an update). I hope this is just Google Chrome, somehow.

The secrecy is a little annoying. The page links to two places where you can supposedly find details, but both are internal resources. It's hard to judge how pervasive the bug is, and to find out if other software using Chrome as a render engine are also affected.

Also pretty comical that Google's Project Zero policy is to release details seven days after (edit: originally "when") a public exploit is known to exist, yet the details are kept under wraps when their own software is vulnerable. Good for Apple that they didn't decide to pay Google back in kind.


> Also pretty comical that Google's Project Zero policy is to release details when a public exploit is known to exist, yet the details are kept under wraps when their own software is vulnerable.

Project Zero treats Google like any other vendor (much to the annoyance of other internal teams).


>Also pretty comical that Google's Project Zero policy is to release details when a public exploit is known to exist, yet the details are kept under wraps when their own software is vulnerable

That is not true; please stop spreading misinformation.


You're right, after looking it up I found that the time period for actively exploited bugs is seven days. I must've misremembered.


Why release details publicly before people have a chance to update?


Hastening the vendor's security response is usually the most important reason. Software vendors, especially corporate ones, will happily keep a known vulnerability under embargo for years if you let them.

Google's page (https://googleprojectzero.blogspot.com/p/vulnerability-discl...) has their reasoning explained in some more detail.


I'm not asking why vulnerabilities should be published, but rather why they should be published before an update is in place. Even the page you linked explains the importance of waiting 30 days after a fix is in place, and even allows a 14-day extension in the case of update cycles.

In the case of the Chrome update, the fix is rolling out during the coming "days/weeks", and you originally complained about the vulnerability details not being public. Which raised my question.


Google has a history of releasing vulnerability details and PoCs before updates could be rolled out; CVE-2020-17087 was perhaps the worst example.

The two-week grace period is in place for your run-of-the-mill 90-day disclosure time period, but for actively exploited bugs that extension period is up to three days.

After digging in deeper, it appears they have the fix rolling out 6 days after the report came in, so they're within their own deadline I suppose. Their statement about publishing the details doesn't mention releasing the details in a month like their own projects would, though: (https://www.bleepingcomputer.com/news/google/google-fixes-an...):

> "Access to bug details and links may be kept restricted until a majority of users are updated with a fix," Google said. "We will also retain restrictions if the bug exists in a third party library that other projects similarly depend on, but haven't yet fixed."

Google's inconsistencies when it comes to disclosure timelines irk me. "Wait until all third parties have also updated their software" isn't a luxury Google provides to others when they're the ones finding bugs. I'm all for swift disclosure timelines and pressure on manufacturers, but every Google team seems to have their own rules and guidelines written to serve themselves rather than general security.



