Hacker News new | past | comments | ask | show | jobs | submit login
CVE-2015-8126: Multiple buffer overflows in libpng (nist.gov)
205 points by pjmlp on Nov 14, 2015 | hide | past | favorite | 84 comments

Uh oh. It's time for the monthly global computer security meltdown. Again, but worse this time. Security vulnerabilities in libpng are a huge deal; they affect lots and lots of different programs, including things people don't usually think of, and many them embed their own copies of libpng which makes them hard to update. The last time there was a security vulnerability in libpng, it took years before it was fixed everywhere.

Patching this properly will require not just replacing a centralized copy of libpng, but scanning all binaries for statically-linked copies. There might be some tools left over from the last time libpng got into trouble; this would be a very good time to update and link to them. Some categories of software that are at particularly high risk right now:

* Web browsers. Sandboxing likely helps, but there's a possibility that there are some usages (eg favicons) that decode outside the sandbox.

* MMS on Android (possibly remote root on phones with no user interaction)

* Any server that accepts PNG uploads and processes them (eg, user avatar thumbnail generation)

* Anything which shows people user-provided avatar images (eg, IM clients)

* Video games which download levels that can contain images

* File browsers which show image previews

* Music players which display album covers

Expect malicious images to start appearing everywhere, and soon. If you're responsible for the computer security anywhere, this is a drop-everything priority.

This is exactly why static linking/bundled libs/containerization is such a piss poor idea. We are going to be dealing with this vuln for years and years, if not decades to come. The exact same insanity has has happened with embedded copies of zlib in the past.

The solution to dependency management in way too many cases becomes "never update". And those mechanisms allow negligence like that to fester for years ignored/unnoticed.

Technology moves so fast that everyone has short memory, patience and attention span.

So stuff like this happens [this is a tongue in cheek scenario, treat it as joke, not criticism of particular people or framework]:

Developer1: "Argh, I can't figure out dependencies for this library",

Then someone comes along and "fixes" it:

Developer2: "Look, just static compile it / include all deps in a subdir some place. See, fixed! I even made a language / framework thingie to do this for you. Yeah, those stupid old packagers and OS people made such a mess...".

Developer1: "Oh wow so easy, I'll be using this and tell others to use it too." /goes to twitter...

A few years later, everyone uses the new shiny dependency-bundling library/framework thingie. Then a security vulnerabily hits and everyone realizes why those old boring packagers and OS people set things up the way they did. 6 days later everyone forgets and jump on the next new shiny framework. Cycle repeats...

Containerization might well be the solution to this kind of problem: libpng's vulnerabilities no longer matter much if you've isolated it into its own VM with no access to resources beyond the ones it needs to decode a PNG into a pixel buffer. I've been working on a little OS project along these lines, where every process runs in its own VM, and every shared service is provided by a local server. There's no dynamic linking. Instead of dealing with the cost of constantly churning code, and all the instability and unreliability that comes along with it, the idea is to pay for security up front by isolating everything. If you want to decode a PNG you have to spin up a VM, stream the PNG data in, and stream the decoded pixels out. The PNG decoder has no access to the file system, network interfaces, user interfaces, or really anything but stdin and stdout and some memory. If it crashes, it crashes, and from the client's perspective the app just exited with a non-zero error code.

> The PNG decoder has no access to the file system, network interfaces, user interfaces, or really anything but stdin and stdout and some memory.

So you've rediscovered what UNIX has allowed us to do for decades now.

Things like process isolation, memory isolation, file isolation, and hardware isolation aren't new. UNIX systems offered mature implementations of them in the 1980s, if not before then.

Best of all, this functionality was offered without needing to resort to resource-intensive virtual machines or anything of that sort.

Those who want the level of isolation you propose have been able to get it for a long time now. They just tend not to, because it's inconvenient and impractical for real world systems.

That's just not true. A program needs to be started root to execute setuid(2): there is no way for a program started by an unprivileged user to drop privileges. Nor can it drop network privileges. That's why tools like pledge(2) and seccomp had to be added, but they have their own problems. To get the level of isolation GP described, you need some fairly esoteric solutions.

Well seccomp() strict mode (only allows read, write, exit) has been available since linux-2.6.12 (2005). So no need to invent something new, or set up a bunch of cgroups and namespaces, to get processes that can only read stdin and write stdout and use some of their own memory.

Oh, sure, but once you start playing "there's nothing new under the sun", the mainframe guys will eventually come along and explain how they were doing all that in some vastly more sophisticated way back in 1962 using systems none of us have ever heard of. So what? I don't care. I just think that the definition of "practical" is shifting, and I want to play with another way of doing things in case it turns out to suit my needs better.

For something like libpng, there isn't any "instability and unreliability" and the code is hardly churning. Just `apt-get update` or `pacman -Syu` and you've got the patched version, compatible with all binaries on your system.

The only exception would be something like Chrome/Chromium that bundles a bunch of its own tweaked copies of common libraries. Or various closed-source software not controlled by your package manager, like some commercial games which have a linux build available. But many systems, like my home router/fileserver, and my workstation, don't have any of these exceptions, and they were updated in about 2 minutes. (I have no long running network daemons using libpng, but if I did they would be easy to restart.)

We have no processors which can efficiently isolate memory spaces with that fine granularity.

Which is why we should have stopped using C long ago and used languages which could guarantee that kind of isolation with less overhead.

Sounds like Qubes

I'm a gentoo user and before preserved libs came in, updates to libpng required recompilation of almost everything on a system. So while libpng is unknown to most linux users, gentoo peeps are well aware of it.

Anyway, this is exactly the question I wanted to ask, whether stuff like this counts against bundling. I am, however, not very experienced in these things, I'm curious if anyone else who has more experience with distribution of software could weigh in on whether there is a way to mitigate this issue when it comes to people who choose to bundle software for ease for users' sake.

Any sane library should be using a stable ABI. Qt does it with giant libraries but any software compiled against any 5.x release will work with any of them since they require ABI stability. You don't even need to ABI break on new features if you use PIMPL right. Its only when you change your API function signatures that already exist that you break the ABI. And thats the way every library should be.

Most libraries are not sane. API breaks are common, and often security updates are only available in new, incompatible releases.

Then don't use those libraries! How can you trust a third party library to provide functionality to your software if they cannot even release sanely with ABI stability? Its a disservice to you and your users to risk security vulnerabilities like this.

with tons of functionality there is no realistic alternative.

Simplify the interface. Don't combine unrelated functionality in a single library.

PIMPL is an inefficient antipattern when the holder object is itself heap-allocated. You're better off coding against interfaces, COM-style.

It doesn't matter if your holder is heap or stack allocated, the d pointer is always heap allocated and the only reference to it is the private internal. With interfaces you either have an awful dummy static constructor for your private object somewhere or you have a non-virtual parent class with dummy implementations of all its functions and a mangled constructor that produces its private child.

For the developer, interfaces are cleaner, but for users, the interface style is leaky where PIMPL is not. That makes it superior. It also avoids overriding the intent of inheritance as a way of achieving ABI compatibility when it already has a significant and intentional function of runtime polymorphism, and can easily confuse users as to whether the implementation is to preserve the ABI or provide multiple implementations in the same release.

Right, so if your holder object is heap-allocated, you need to dereference a pointer so that you can then call into the implementation. That's inefficient, bad for the cache, and puts needless pressure on the heap allocator. C++ classes built out of pure virtual member functions are just as compatible. If you use something COM-like, you can reference-count and manage them in a uniform way.

PIMPL as commonly described is a waste of good electricity, like most design patterns. Especially because most of the time I see it used, it's from the keyboard of some well-meaning but inexperienced developer who doesn't even need to maintain binary compatibility.

> I'm a gentoo user and before preserved libs came in, updates to libpng required recompilation of almost everything on a system.

I'm also a Gentoo user. (Hi! :D )

ISTR that preserved-libs "just" lets you delay recompilation of affected packages [0] by keeping the old libs around until everything that needs rebuilding to link against the new version has been rebuilt. You still need to rebuild packages. Am I wrong about that?

[0] Or, -alternatively- "just" lets you keep your system working while you rebuild affected packages.

Oh man, the number of times I have delayed updating a system because different programs required different versions of libpng. On the other hand, having used gentoo as my primary distro for nearly a decade I have found that there are some serious benefits to actually knowing what libs are actually involved in creating a functional linux environment.

No that's exactly right. I'm specifically referring to your [0] comment. It allows me to spread the compilation over a longer time (days) while being able to use the old things without breakage. Before, I had to recompile everything and things would be unworkable until everything was recompiled. I now have a number of boxes with distcc so compilation times aren't that bad nowadays. But back in 2010 or so, when I just had a shitty gateway laptop, dealing with libpng version bumps was a weekend affair.

> Before, I had to recompile everything and things would be unworkable until everything was recompiled.

Oh god, tell me about it. I was around for the first libpng upgrade catastrophe, way back when. Portage/emerge has gotten so much better since then. :D

The problem is that it only works with system libraries and when you have a central software repository. You couldn't even do it with an appstore, as you can possibly break tons of software that you can't patch yourself. So what actually happens with dynamic linking there is that old .dll/.dylib files (with vulnerabilities) get copied around.

With zlib I agree though. Static linking doesn't make sense with such a standard library, at least under unix systems.

So do a mix? Make system libs shared and apps bundled when they are needed? Can't the app store work like your package manager anyway? Speaking from ignorance here, I want to know if it's possible.

The Click-n-Run app store from LindowsOS back in 2002 was based on apt and dpkg. So yes, it is possible, especially if there are automated tests that can tell when a dependency update breaks something.

Static linking may be bad for security but good for saving hours of frustration. It's a trade-off and it's not clear that security is automatically the most important thing always.

Just look at any forum about some open source software. There'll usually be endless posts from people complaining that they can't compile it because of some weird dependency errors. These are really widespread and time-wasting problems.

Unnecessary dependencies are a bug. I've seen people pull in half a dozen nested libraries just to avoid writing ten lines of code. Sometimes re-inventing the wheel is the right thing to do.

If a software use libpng to read images that only come from installation directory, I don't think it's really a problem to link statically libpng.

For example a GUI that loads icons, background images, button images etc.

A system can be owned using various exploits combined together. So if someone finds a way to alter the installation images, suddenly a mild trick that would normally just result in funny icons is now an exploit.

Kinda good example right here: https://www.youtube.com/watch?v=CkPAgv1Gjz0

Maybe Nintendo never considered someone screwing around with their network to load an arbitrary image in a place that only Nintendo-Generated Mii png images should be but then I found a way... and if that libpng was vulnerable, suddenly we've gained usermode execution on a Wii U. Or worse yet, similar to the Sony PSP .tiff image exploit, maybe after gaining usermode we find a kernel-mode exploit and attack that. Now, we're near the realm of game piracy and private keys.

The best part about android is that all the important apps are tied together and have to be updated at once. It seems like it really wouldn't be hard to write an MMS worm that stuck around for a while.

Is this more than a denial of service due to crashes?

No one's published an example which causes arbitrary code execution yet (that I'm aware of), but most bugs of this type end up there eventually.

You'd need an infoleak but from somewhere, but I doubt anyone who has one will come forward.

I guess the problem is that you don't know:

> allow remote attackers to cause a denial of service (application crash) or possibly have unspecified other impact

As far as I understand, trying to predict what is possible by exploiting a buffer overflow is hard because it depends on many specifics, such as compiler options used, OS protections, ...

No. Buffer overflows can be used for privilege escalation.

Crashes are better than privilege escalation, but they do not mean that privilege escalation is impossible.

A good opportunity to check how https://github.com/PistonDevelopers/image-png is doing (a PNG decoder written in Rust). Looks like it includes bindings to use miniz.c for DEFLATE decoding as well as "inflate" (which seems to be DEFLATE in Rust). Also, it seems to have a fuzzing driver (png-afl). Good times!

There are PNG decoders written in a bunch of safe languages.

For instance the JVM uses a Java PNG decoder. So it isn't vulnerable.

What killercup might be implying is that Rust is both a "safe language" and that Rust code can expose a C-compatible ABI, which means that a library written in Rust could theoretically replace one written in C regardless of which other language is ultimately making use of it. For example, this is what Mozilla is working on doing in Firefox by replacing security-conscious components with Rust implementations ( https://bugzilla.mozilla.org/show_bug.cgi?id=1151899 , https://bugzilla.mozilla.org/show_bug.cgi?id=1161350 ).

However, I don't know if piston-image specifically provides such a C interface.

However, if the bug is that the library writes to a too small allocation by the application (based on a lie told it by the library) then it doesn't much matter what language the library is in.

I see there are some downvotes here, which I don't think are deserved. It's true that at some level of your stack you're going to have some code that's just poking bytes into buffers, and that generally takes some manual effort to verify.

For what it's worth, I'd expect this sort of thing to be walled up behind an `unsafe` block in Rust, which if nothing else would lend increased scrutiny in an audit.

I believe the point tedunangst is trying to make is that if the API works something like this:

  size_t sz = lib_get_buffer_size();
  char *buf = malloc(sz);
and there's a bug in lib_get_buffer_size that causes it to return too small a number, but lib_fill_buffer assumes it's big enough, you've got a vulnerability regardless of what language the library is written in. This isn't a bug in the application, because it's doing what it was told to do. It will be hard to spot in an audit of the library because the bug is in lib_get_buffer_size, which probably contains no unsafe blocks, while lib_fill_buffer probably looks fine.

It's a deficiency in the API, which should require the buffer size to be passed to lib_fill_buffer so lib_fill_buffer doesn't have to make any assumptions about the size. But if you're trying to preserve compatibility with existing C APIs, you might be stuck with APIs like this.

To clarify further, the idiomatic usage of `unsafe` in Rust stipulates that if you can't guarantee that your function is memory-safe for all possible inputs, then you must mark the function itself as `unsafe` to force callers to be aware of the risk. Obviously if you're both calling this theoretical function from a language without an `unsafe` construct and if you're also striving to maintain exact API compatibility with the C function then you can't really make this aware to the caller. If you do have control of the API then the way that this would generally be presented on the Rust side would be to have two functions: a safe one named "foo" that also takes the length as an argument so that you can check at runtime and an unsafe function named "unsafe_foo" that has the same behavior as the C function.

Why on earth isn't this common practice for any software that can be remotely controlled over the internet? Ie anything rendering parts of web pages, opening untrusted files, etc?

Everyone has known what a massive security risk unsafe languages are for a long time. Nearly every vulnerability is a buffer overflow. What's the value in persisting in writing such dangerous code? Just because it's 10% faster than a safer language? Hopefully the likes of Rust bring and end to this.

Perhaps there's a culture of C++ being the only proper language for libraries?

Libpng isn't exactly new. libpng.org seems to be offline at the moment so I'm not having a lot of success finding details - but it's safe bet that it predates Rust, afl-fuzz, and probably even clang's static analyser.

I'm all for saying "now is the time to rewrite ", but it's hard to get upset at that project when anyone else could have gone and done a modern take on the project - I hadn't heard of the earlier quoted Rust PNG library until this happened.

> but it's safe bet that it predates Rust, afl-fuzz, and probably even clang's static analyser.

Not only it predates clang's static analyser, it predates clang itself! The site is now online, and I can see on it news about libpng from last century (1995 to be more exact).

    Some applications might read the bit depth from the IHDR chunk and allocate
    memory for a 2^N entry palette, while libpng can return a palette with up
    to 256 entries even when the bit depth is less than 8.
On initial review, this will require assumptions in the client application to be exploitable. I haven't found a place where libpng itself causes the overrun. It is just an enabler by returning a smaller value for bit_depth than num_palette. If the application double-checks the result from png_get_PLTE or uses PNG_MAX_PALETTE_LENGTH instead of 1<<bit_depth, it won't have a problem except wasting memory.

Unfortunately RFC 2083 contains the requirement that a compliant PLTE will not be larger than the bit depth. So I'm sure many applications that trusted libpng to do the validation will be exploitable. I consider this a caution against promising too much in the spec, as well as not overly relying on an implementation to exactly follow the spec. Remember the robustness principle: be conservative in what you send, be liberal in what you accept.

There's now an idea that that robustness principle was backwards. The classic example being web browsers which were so liberal they encouraged all sorts of weird non-standard web pages which now have to be supported forever.

If a PNG app trusts the spec, and the library doesn't comply, and that causes the app to crash. That's a good thing overall. It would lead to somebody discovering the problem and hopefully fixing it. It's only a bad thing for the specific app that crashes. Or in this case, risks getting exploited. Maybe for security, the principle doesn't apply the same way.

Does anyone actually interface libpng themselves? The library is pure insanity if all you care about is getting the image out in some format.

I'd wager that there is one libpng and then only about 4 or 5 distinct pieces of code that use it that everyone else in turn is using. If that code is wrong, that would still make it widely vulnerable.

Github search: png_get_PLTE

    C    48892
    C++   2916
    Obj-C   36
Apple programmers aren't using it directly at least. Nor Android, which is not exploitable[1]. Chromium also uses SKIA I believe. The C number is obviously inflated by the number of repositories that contain a copy of libpng.

If everyone is using common glue code then I think it's more likely this won't be a problem since the middleware library probably noticed the bug and works around it. It's when developers hastily reinvent the wheel that sloppy validation bugs like this are problems. It's why the maxim "never write your own crypto" exists. Or as Mark Twain said[2], "put all your eggs in one basket and _watch_that_basket_."

But as I said, there's no actual buffer overflow in libpng. This is a validation bug. The headline and the NVD summary are misleading.



Ah, that's what I came to the comments to find out, having just heard about an Android vulnerability in Chrome. The way the news of that vuln read (that one just needs to visit a web page) I thought this vuln would certainly fit that description.

Going through some random code on codesearch.com a lot of code seems to just pass in a png_colorp * as the pallete, which this doesn't break.

The libpng simple decoder sample is pretty reasonable C, except for the fact that it uses setjmp to handle errors. (libpng's fault, not the sample.)

libpng is actually very easy to use directly.

Some other image libraries are hard because they force you to write many code-paths to handle various different cases, but libpng provides many different helpers that make it pretty simple.

Are you referring to setjmp? Because it's quite the opposite. This blog article does a good job explaining: http://latentcontent.net/2007/12/05/libpng-worst-api-ever/

Huh? I'm not referring to setjmp at all... (why would you think I'm referring to setjmp?)

The PNG format has lots of variations and optional bits that would be quite annoying if you needed to handle them all yourself. I'm referring to the many various flags, options, and little helper functions in libpng that make handling all these format variations quite painless for the user of libpng.

[Some libraries, e.g. libtiff, provide mechanisms to abstract out the low-level details, but still expect the application to explicitly handle many format variations, and this is quite annoying.]

libpng is hardly perfect; it's old, and that certainly shows in various ways (e.g. the use of setjmp). But to call it the "worst API ever" is just wrong.

Ah, I thought you were referring to setjmp with this:

> Some other image libraries are hard because they force you to write many code-paths to handle various different cases, but libpng provides many different helpers that make it pretty simple.

It seemed that you were thinking of error code handling as a "code-path" and setjmp/longjmp being a mechanism to reduce code paths.

Anyway, it sounds like we're in agreement. It's not that big of a deal to use libpng directly, and "worst API ever" is too strong. The author has probably not done enough Windows API programming to find the worst API ever :-).

From CVE:

>and 1.6.x before 1.6.19

Unfortunately, the latest version on libpng site [0] is 1.6.18.

Why was this CVE announced before the patch and version update was released?

[0] http://www.libpng.org/pub/png/libpng.html

Actually, the patch was announced before the details of the bug: I subscribe to the libpng mailing list, and got about a day's notice to ship the patch out before the CVE and associated details were widely publicised. They did a good job of notifying us quickly, before there was chance for the 'bad guys' to jump on my customers.

One day isn't really long enough. I see there is no update for Centos 6 as yet.

The Sourceforge site [0] is up to date:

> UPDATE 12 November 2015:

> The latest released versions are libpng-1.5.24 and libpng-1.6.19 [DOWNLOAD].

[0] http://libpng.sourceforge.net/index.html

Do I understand correctly that for this to be potentially exploitable, the application must allocate too little memory for the palette? In other words, could this be mitigated on the application side by always allocating at least enough memory for a 256-entry palette?

I don't mean to suggest that this is "really" an application bug or that people shouldn't upgrade libpng ASAP; I'm just wondering whether there are other options for installations/applications where that would be painful for some reason.

I wonder if this was found by fuzzing or inspection, because the former seems a quite popular method recently.

The bug was found with American Fuzzy Lop from optipng (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=787647) and later the optipng author noticed that the same bug also affects libpng (http://sourceforge.net/p/png-mng/mailman/png-mng-implement/t...)

I wouldn't be surprised if this is from AFL.

What are the practical implications of this?

Buffer overflows are basically about memory corruption. They can lead to crashes, remote code execution, if run in privileged code to privilege escalation, etc.

Most modern operating systems have buffer overflow protection technologies such as ASLR. I tried recently exploiting a few guaranteed buffer overflows for fun, and it's getting irritatingly hard at least on Linux. Non-executable stack, *alloc functions have sanity checks, -fstack-protector provided canaries, ... It's possible to get past all that, but it takes a bit work.

I would be freaked if I was running some older operating system, and someone vendored a poorly compiled version of libpng. Windows applications are probably the scariest here, especially when run on older Windows servers...

No input validation in the IHDR as to the number of bits in the palette of the image. As I understand it, you just create an arbitrary image with the palette in the IHDR tweaked to a large number of bits, and watch as the process dies while trying to malloc() enough space to handle a palette that large.


I think it's the other way around: the malicious image specifies a small bits-per-pixel value in IHDR (thus in some sense advertising that the palette will have 2^bpp entries) but then contains a palette with more than 2^bpp entries, which libpng will copy in its entirety when asked to read the palette, overflowing the destination buffer if the application only allocated enough memory for the expected number of entries.

If that's the case, and bpp is always 8 or less, then software where the programmers were "lazy" and always allocated 256 entries for the palette might not be vulnerable.

From a quick search in MXR (https://mxr.mozilla.org/mozilla-central/) for the identifiers "png_set_PLTE" and "png_set_PLTE", it seems that at least Mozilla (Firefox) did the lazy thing and always allocated 256 entries.

Edit: I was right, from https://bugzilla.mozilla.org/show_bug.cgi?id=1224244 "Mozilla is not vulnerable to the security issues that it fixes, when using either the in-tree libpng or the system libpng."

You're precisely right.

Right, so the announcement is kind of vague; let's see what's actually going on. Here are the recent commits to libpng:


The first/latest one ("avoid potential pointer overflow") sounds scary, but I believe it's just about the pathological case where the library is handed (valid) buffers that extend to only a few bytes away from the maximum possible value (i.e. 0xfffff...). If your OS does this, it probably shouldn't.

Going down the list, there are three relevant-looking commits. Listed in chronological order:

[libpng16] Reject attempt to write over-length PLTE chunk https://github.com/glennrp/libpng/commit/81f44665cce4cb1373f...

[libpng16] Prevent reading over-length PLTE chunk (Cosmin Truta). https://github.com/glennrp/libpng/commit/a901eb3ce6087e0afee...

[libpng16] Silently truncate over-length PLTE chunk while reading. https://github.com/glennrp/libpng/commit/1bef8e97995c3312366...

- The first one only substantively changes png_write_PLTE, so it could only be an issue for applications that actually write out PNGs. It changes a check on the ‘num_pal’ argument from a hardcoded 256, aka PNG_MAX_PALETTE_LENGTH, to be based on the bit depth. The png_write_PLTE function is internal and called only from png_write_info, which passes it ‘info_ptr->palette’ and ‘info_ptr->num_palette’ as ‘palette’ and ‘num_pal’, respectively.

- The second one confusingly changes png_write_PLTE again, but just to rename a variable. The substantive change is a similar check on an argument to png_set_PLTE, which is a public function and, incidentally, the only thing that can set ‘info_ptr->num_palette’, unless the application accesses it directly, which is deprecated. (Thus, outside of the deprecated case, it should ensure that the behavior change in the previous patch never actually gets exercised.)

- The third one adds yet another check, to png_handle_PLTE, which is called when reading a PLTE chunk in a PNG file - this time, the variable is called ‘num’. After the check, ‘num’ is used to fill an array of size PNG_MAX_PALETTE_LENGTH - but there was already an earlier check for the length being too much for that, in which case it may longjmp out of the function with png_chunk_error or just return. png_handle_PLTE then calls png_set_PLTE with the same argument, so, in lieu of the added check, the previous commit’s check would still trigger and fail the image load. The new check just changes an error into something the library might be able to recover from.

So the second commit is the important one, and the third demonstrates how png_set_PLTE can be called while reading a PNG with a size argument greater than appropriate for the bit depth, but still <= PNG_MAX_PALETTE_LENGTH. What happens then? The oss-security post says:

> Some applications might read the bit depth from the IHDR chunk and allocate memory for a 2^N entry palette, while libpng can return a palette with up to 256 entries even when the bit depth is less than 8.

So are only applications that do something odd affected? The png_get_PLTE function gives the client the actual num_palette value as an out argument, so any client that uses that to size buffers wouldn’t be affected; nor would one that hardcoded 256 or PNG_MAX_PALETTE_LENGTH. For example, Skia does the latter in SkImageDecoder_libpng.cpp.

Are there any libpng-internal uses that could be affected? I grepped for ‘->(num_)?palette\b’. TLDR: I don’t think so, but if you want the pointless detail, among other less interesting uses were:

   /* Report invalid palette index; added at libng-1.5.10 */
   if (png_ptr->color_type == PNG_COLOR_TYPE_PALETTE &&
      png_ptr->num_palette_max > png_ptr->num_palette)
num_palette being too high could only make this fail, so it doesn’t matter.

png_image_read_header uses it to set image->colormap_entries, which has its own long list of uses. I think this is then used (by the application, via PNG_IMAGE_COLORMAP_SIZE) to determine the size of the colormap buffer, but there may be an issue somewhere.

png_image_read_colormap uses it to generate actual color map data, but this is not obviously dependent on the bit depth, and checked against image->colormap_entries (…in case it changed from earlier?).

png_set_quantize uses a user-specified num_palette value and in any case is not important functionality.

Some PNG_COMPOSE code uses num_palette to write to palette; not a problem…

png_do_expand_palette just looks up entries in palette (png_ptr->palette) by byte-sized indices without checking anything. The palette buffer must be at least 256 entries long, or else it will read out of bounds, which is presumably why ->palette is only set here:

   /* Changed in libpng-1.2.1 to allocate PNG_MAX_PALETTE_LENGTH instead
    * of num_palette entries, in case of an invalid PNG file or incorrect
    * call to png_set_PLTE() with too-large sample values.
   png_ptr->palette = png_voidcast(png_colorp, png_calloc(png_ptr,
       PNG_MAX_PALETTE_LENGTH * (sizeof (png_color))));

(it’s set differently in the dither case, which I don’t think is relevant?) Anyway, it’s not affected by this.

png_handle_tRNS and png_handle_hIST have separate arrays whose count is expected to be <= and ==, respectively, num_palette.

- For trans, the ‘trans_alpha’ buffer itself is always PNG_MAX_PALETTE_LENGTH sized, so there can be no overflow there; it also sets info_ptr->num_trans. This can affect the application if it uses png_get_tRNS.

- The hist buffer is also always maximum size and doesn’t have a separate length field, so it’s covered by me checking uses of num_palette.

png_do_check_palette_indexes checks the bit-depth invariant separately.

hIST again in pngwrite.c and pngwutil.c, nothing else interesting in those files.

I think num_trans is safe in the same way, but I haven’t looked as closely.

Curious if this affects PHP. AFAIKT GD is compiled with libpng for PHP (it's required) and is presumably used for functions like getimagesize() in PHP. That is used by WordPress core for uploads (although you need to be signed in). However I suspect many plugins use GD for unauthenticated image upload, so this could have a wide impact.

If I'm reading this comment https://news.ycombinator.com/item?id=10567353 and this code https://github.com/libgd/libgd/blob/4751b606fa38edc456d62714... correctly, then it looks like GD isn't vulnerable since it uses num_palette to size the palette rather than info_ptr->palette or bit_depth

Unlike, for example, heartbleed, a vulnerability in libpng really requires some contemplation to come up with all the places and ways that a typical production application uses the vulnerable code.

I have web applications that probably use libpng five different ways, each from a different client application and libpng version.

This vulnerability is quite ... foretold. We have had issues with image libraries before. Instead of just running about patching again for this one bug, why not use the opportunity to discuss some architecture-level approaches that may have mitigated this attack?

Knowing that image libraries may contain vulnerabilities, how would you design a secure image ingestion process?

Some reasonable strategies...

(0) Attack surface minimization: Similarly to other types of input, input is segregated and treated as toxic and dangerous until proven otherwise. (Eg. Don't store random inbound files where you store those you will re-use; never allow batch processes to touch temporary files received from external systems; delete external input as soon as possible; store input only after normalization)

(1) Attack surface minimization: only accept one file format. This provides no guarantees but helps to limit exposure to unknown vulnerabilities, though may not always be feasible.

(2) Parallelization (opposite of the above): explicitly convert all formats of input to some common format (using throwaway containers/VMs), ideally using multiple versions of multiple libraries on multiple operating systems, and compare the results. If the results of every environment do not match, raise an alarm and do not continue processing.

(3) Reverse-conversion check. Convert the file to another (non-lossy conversion possible) format, then convert it back (also in a throwaway environment). If the result matches (or the portion of the result that is non-lossy matches) continue. This could be combined with #2 for additional strength.

(4) "Application-level firewall" approach: progressively determine the various format-specific and general properties of the image in question (eg. starting with libmagic; and preferably all in relatively secured throwaway environments, using the above strategies!) then apply a neural network or basic statistical analysis to generate alerts for anomiles. For instance, the reduced pallette featured in this attack would not be normal for many types of image input (scans or photographs come to mind). Other useful checks may include "are the image dimensions sane?", "is the aspect ratio sane?", "is the image predominantly black and white?", etc.

Strategies #0 (in particular cases), #1 (with some luck), #2, #3 and #4 (with some luck) could have mitigated this attack. Executed wrongly, any of those strategies could also have triggered the vulnerability. #4 is ripe for a web2.0 Silly-Valley outsource-the-problem startup (investors?).

PS. I enjoy these types of problems; anyone looking for ideas or development get in touch.

Ah memories of flashing the PSP through an exploit in libtiff and ChickHen. Perhaps people could do something similar with libpng?

To assess the impact of this I would really need to know a few factors (ball park)

What fraction of applications that read png images use libpng?

How many of those applications have features that load 3rd party images (uploadable avatar etc)?

My gut feeling is that the most common usage is for loading things like UI elements, and not for external images? Are such programs safe from exploitation?

Where does one get the patched libpng-1.6.19? The libpng site is broken for me right now.

Sourceforge also seems to be under heavy load. I found this mirror:



I just compiled libpng-1.6.19, and then I read this post right after. XD

how about a sample "dangerous" png image?

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact