Show HN: Embed your source code in PNG files (github.com/fusion)
38 points by cyansmoker on Dec 23, 2021 | 28 comments



I thought that this is what PICO-8 cartridge files (.p8.png) did; but it turns out that those use steganography within the image pixmap itself, rather than taking advantage of ancillary PNG chunks. Kind of a strange choice, honestly.

On a separate note, a fun fact: PNG uses what is basically a de-facto v3 of the Interchange File Format (https://en.m.wikipedia.org/wiki/Interchange_File_Format). PNGs can be parsed and generated with generic IFF tools. (Which can also be used to operate on AIFF, TIFF, and, perhaps surprisingly, Erlang .beam files.)

IFF is, IMHO, an incredibly underutilized “metaformat” for how simple it is to work with, how observable/inspectable the results are (for a binary file format), and yet how efficient it is (compared to text-based formats.)

PNG’s (backward-compatible) extensions over IFF are all pretty great ideas as well — e.g. using chunk name capitalization as metadata to mark chunks as optional (plus a few other things); linking chunks with checksums to indicate when derived chunks need to be recalculated; etc. If these extensions were promoted to features of the metaformat itself, that’d make probably the best document-oriented container metaformat around, beating e.g. “a zip file with a META-INF dir inside” by a wide margin. Sadly, AFAIK, nobody has tried to write a formal IFFv3 RFC to formalize these extensions. (Maybe something I’ll do one day myself.)
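To make the chunk-name capitalization point concrete, here is a minimal Python sketch that walks a PNG's chunks and decodes the four property bits (bit 5, the "lowercase" bit, of each byte in the chunk type). The helper names (`make_chunk`, `tiny_png`, `iter_chunks`, `chunk_properties`) are made up for illustration, and the 1x1 grayscale image exists only so the example is self-contained.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one chunk: length, type, data, CRC over type+data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

def tiny_png() -> bytes:
    """Build a 1x1 8-bit grayscale PNG (illustrative test image)."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte 0 + one black pixel
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))

def iter_chunks(png: bytes):
    """Yield (type, data) for each chunk, verifying the CRC along the way."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        ctype = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", png[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(ctype + data) & 0xFFFFFFFF != crc:
            raise ValueError(f"bad CRC in {ctype!r} chunk")
        yield ctype, data
        pos += 12 + length
        if ctype == b"IEND":
            break

def chunk_properties(ctype: bytes) -> dict:
    """Decode the case bit of each type byte (lowercase letter means bit set)."""
    return {
        "ancillary": bool(ctype[0] & 0x20),     # decoder may ignore it
        "private": bool(ctype[1] & 0x20),       # not a registered public chunk
        "reserved": bool(ctype[2] & 0x20),      # must be uppercase in current PNG
        "safe_to_copy": bool(ctype[3] & 0x20),  # editors may copy it blindly
    }

if __name__ == "__main__":
    for ctype, data in iter_chunks(tiny_png()):
        print(ctype.decode("ascii"), len(data), chunk_properties(ctype))
```

A standard chunk such as tEXt comes out as ancillary, public, and safe to copy, while IHDR, IDAT, and IEND are all critical.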


Wow, I never knew that about PICO-8. Very strange choice indeed given how much easier they could have done it.

I remember downloading a lot of albums off 4chan where you would save the image, then rename it to .zip to get a folder of MP3s. Good times!

I do wonder though how "stable" embedding files is. Like do most major hosting services process images to the point it strips that stuff out?
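For anyone curious why the save-then-rename trick works: ZIP readers locate the archive through a directory at the end of the file, so bytes prepended to the archive (here, an image) are generally tolerated. Below is a rough Python sketch using a PNG as the carrier; the helper names and the file name `track01.txt` are invented for the example, and any host that re-encodes the image would be expected to drop the appended bytes.

```python
import io
import struct
import zipfile
import zlib

def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type+data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

def tiny_png() -> bytes:
    """Build a 1x1 8-bit grayscale PNG (illustrative carrier image)."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")
    return (b"\x89PNG\r\n\x1a\n" + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))

def append_zip(image: bytes, files: dict) -> bytes:
    """Return image bytes followed by a ZIP archive holding `files`."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return image + buf.getvalue()

def read_from_polyglot(blob: bytes, name: str) -> bytes:
    """Open the combined file as a ZIP and read one member."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return zf.read(name)

if __name__ == "__main__":
    blob = append_zip(tiny_png(), {"track01.txt": b"not really an mp3"})
    print(read_from_polyglot(blob, "track01.txt"))
```

Image viewers stop reading at the IEND chunk, so the same bytes display as a picture and unpack as an archive.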


>I do wonder though how "stable" embedding files is.

I was surprised to see this was still working 3 years later:

  $ curl -s https://pbs.twimg.com/media/Dq2sPGNU0AEKyyC.jpg | dd status=none bs=1 skip=599 count=40 | sh

From https://news.ycombinator.com/item?id=18347985


It’s not that strange if you consider that you use image files to transfer images. Trying to store data outside of that (in a custom chunk) isn’t a use case anyone is accommodating so it will get stripped even by accident.

So if you use stego and store data in the image, you have a bigger chance of preserving the data.


Depends on whether you're expecting people to treat the files as "images" that happen to contain other data, and so e.g. upload them to photo-sharing sites, imageboards, etc.; or whether you're expecting people to treat the files as "programs" that happen to render with a thumbnail by default on most Operating Systems.

Personally, I don't see a PICO-8 .p8.png cartridge as an "image" any more than a Fireworks project file is an "image." It's a document that wraps itself in an image container to enable the 'document' to be previewed. It just so happens that you're able to very carefully treat the document as "an image" in some contexts (e.g. if you put it up on your own web server, and then embed the resulting URL in a webpage, people who right-click "Save As..." the 'image' will get the original document.) But this isn't really the goal (since you could do that just as well by generating an ancillary "cover art" file to go with the cartridge, and linking to the cartridge file using the cover art file.) The goal of such embeddings is just to make your document visually "self-describing" when examined with regular OS tools.

Of course, if you're considering designing your own PNG-embedded document format, and sharing the document losslessly via imageboards, photo-sharing sites, etc. is explicitly the goal of your format choice of PNG; then yes, steganography is the way to go.

But, well... if you are going to go the "embed the data in the pixmap" route — why not go all the way? Skip steganography (which will survive re-containerization, but won't survive the slightest lossy re-encoding), and instead just generate a "cover art" image containing a QR code that embeds the document data. Then the document would even survive digital-analogue-digital conversion!

(For the PICO-8 case, if the .p8.png files were simply art containing QR codes that the software could read directly, then a PICO-8 mobile app could support importing cartridges using the camera. Then people could just stick their carts up as posters at indie game conferences, or give them out as business cards.)


If you are like me, you spend a non-negligible amount of time creating architectural diagrams using a DSL (UML, python libraries or what not), exporting them to a portable image format and uploading these to some form of Wiki.

I would like to be able to edit everything I store in said Wiki, and this flow breaks when it comes to images. Inspired by draw.io, I created this simple util that lets me (and you) store the diagram's code with the image. Now, as long as you have the final image, you and others can keep editing your diagram!


BTW, plantuml has been storing the source uml dsl in the metadata of generated png for years. cf. https://plantuml.com/command-line#ce21470ab49d1d19


I've been using Kroki for this - https://kroki.io is a hosted server for it - and it's pretty great, and supports a tonne of formats


I thought the first question in the FAQ was amusing:

FAQ: Is this an Electron app?

It's a few MB, which is two orders of magnitude below Electron size, but still seems rather large for what it does, especially the CLI version.


Looking at the makefile, it is possible to strip out a decent amount of a golang binary size with go build -ldflags '-w -s'...

https://stackoverflow.com/a/22276273/253773


Cool. But something I want to know is: what are the limits of the text chunk in PNG? I just browsed the specification[1] but there was no mention of it.

It's also interesting to me that the spec says that the viewer should give the user a way to look at all the textual parts of a png (there are three), although I've never seen this offered.

[1] https://www.w3.org/TR/PNG/#13Text-chunk-processing


https://www.w3.org/TR/PNG/#5DataRep

2GB like all the other chunks.
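The limit follows from the chunk layout: every chunk carries a 4-byte length field, and the spec caps its value at 2^31 - 1 bytes. As a rough illustration (not PNGSource's actual on-disk format), this Python sketch stores a string in a tEXt chunk and reads it back. The keyword `SourceCode` and all helper names are hypothetical; tEXt is Latin-1 only, so non-Latin-1 source would need iTXt instead.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_CHUNK_DATA = 2**31 - 1  # largest value the spec allows in the length field

def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one chunk: length, type, data, CRC over type+data."""
    if len(data) > MAX_CHUNK_DATA:
        raise ValueError("chunk data exceeds the PNG length limit")
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

def tiny_png() -> bytes:
    """Build a 1x1 8-bit grayscale PNG (illustrative test image)."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))

def embed_text(png: bytes, keyword: str, text: str) -> bytes:
    """Insert a tEXt chunk (keyword, NUL, text) just before IEND."""
    if not 1 <= len(keyword) <= 79:
        raise ValueError("tEXt keywords are 1 to 79 characters long")
    if png[-8:-4] != b"IEND":
        raise ValueError("expected IEND as the final chunk")
    payload = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
    return png[:-12] + make_chunk(b"tEXt", payload) + png[-12:]

def read_text(png: bytes, keyword: str):
    """Return the text of the first tEXt chunk with this keyword, or None."""
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        ctype = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, value = data.partition(b"\x00")
            if key.decode("latin-1") == keyword:
                return value.decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return None

if __name__ == "__main__":
    png = embed_text(tiny_png(), "SourceCode", "A -> B")
    print(read_text(png, "SourceCode"))
```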


Adobe Fireworks used to do something similar to this. Their base file format was a PNG of the document composite, and their proprietary data was stored in a nonstandard chunk (which was safely ignored by standard PNG readers). Thus a client could always see the latest version of a file simply by having it shared with them; no Fireworks required.


Draw.io does this. When you export a diagram as a PNG, there is an option to embed the source file in the PNG. If you subsequently open one of those PNGs in Draw.io, you can carry on editing it. I find it really handy.


Whoa. Had no idea!


Has anyone used this to create "ebooks" (including visible covers) that aren't ebooks per se?


Is this a troll post, considering the NSO hacking post seen here https://news.ycombinator.com/item?id=29640474 and the Google Project Zero post? https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-i...

What is being described will get you on your way to the NSO hack as a service, because their hack was using a decompression algo to build a virtual cpu of sorts and run it, in a single pass of the decompression process.

How hard would it be to embed source code in such a way that you could also build a limited cpu to then run this embedded source code in memory with a single pass of graphic processing or decompression algo?


No, that's irrelevant here.

PNGSource embeds code in ancillary chunks. That's it. No code execution. No steganography (yet).


Yeah, but because of NSO I now look at every mandatory or common practice process that is used on a file to see if the NSO methods can be used for exploitation.

For example, PNG seems benign, but what if it was stored in a zip file of sorts? Could the MS Windows zip process be exploited, or 7-Zip, or even PKZIP for that matter? Do you see where I am coming from?

What about if I embedded some icons and image files as a resource in an application exe or dll? You have persistence then, even if it's just a beacon or some unique domain name lookup to track the app online. https://docs.microsoft.com/en-us/windows/win32/menurc/enumer...

Likewise, what about compression built into HTML/Web browsers, could that be exploited? https://en.wikipedia.org/wiki/HTTP_compression

Would it be possible to build something into a webpage or image file on a popular website where it can exploit the methods NSO have/are using? Maybe we should go back to reading the internet using wget?


> Yeah, but because of NSO I now look at every mandatory or common practice process that is used on a file to see if the NSO methods can be used for exploitation.

NSO is definitely neither the first nor the only one to do this, but let's move on.

> For example, PNG seems benign, but what if it was stored in a zip file of sorts? Could the MS Windows zip process be exploited, or 7-Zip, or even PKZIP for that matter? Do you see where I am coming from?

Any nontrivial parser written in an unsafe language has a potential for being exploitable, that's for sure.

> What about if I embedded some icons and image files as a resource in an application exe or dll? You have persistence then, even if it's just a beacon or some unique domain name lookup to track the app online.

This is why we have code signing. Well, that works unless the ASN.1 parser or the signature verifier has got some security issues, of course.

> Likewise, what about compression built into HTML/Web browsers, could that be exploited? https://en.wikipedia.org/wiki/HTTP_compression

It's usually much easier to just exploit the renderer/JavaScript engine.

> Would it be possible to build something into a webpage or image file on a popular website where it can exploit the methods NSO have/are using?

This is basically how malware distribution works over the web, just look for some VirusTotal samples...

> Maybe we should go back to reading the internet using wget?

If we're at that level of paranoid, bugs in the HTTP parser, TLS implementation, or the TCP/IP stack should be just as sensitive.


>If we're at that level of paranoid, bugs in the HTTP parser, TLS implementation, or the TCP/IP stack should be just as sensitive.

How many zero-days exist when you put a new distro online in order to update? And that's without looking at the firmware for bugs.


...why are you even talking to some stranger here then? Is it worth the risk of being exploited with an RCE 0day?

Like, uh, just define a clear threat model, accept risks, and move on??? Or just don't use computers


Is this preferable over concatenating the code onto the end of the file? The PNG structure remains intact and no app needed for insertion and extraction, right?
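For comparison, here is a rough Python sketch of the concatenation approach. Insertion really is just appending bytes, but extraction still has to find where the PNG ends, which means walking the chunks up to IEND. The function names are invented for the example, and nothing here reflects how PNGSource itself works.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one chunk: length, type, data, CRC over type+data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

def tiny_png() -> bytes:
    """Build a 1x1 8-bit grayscale PNG (illustrative test image)."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))

def png_end_offset(blob: bytes) -> int:
    """Return the offset just past the IEND chunk."""
    if blob[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 12 <= len(blob):
        (length,) = struct.unpack(">I", blob[pos:pos + 4])
        ctype = blob[pos + 4:pos + 8]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
        if ctype == b"IEND":
            return pos
    raise ValueError("no IEND chunk found")

def split_trailing(blob: bytes):
    """Split a file into (png_bytes, bytes_appended_after_IEND)."""
    end = png_end_offset(blob)
    return blob[:end], blob[end:]

if __name__ == "__main__":
    combined = tiny_png() + b"print('hi')\n"  # same effect as `cat a.png src.py`
    image, source = split_trailing(combined)
    print(len(image), source)
```

One trade-off to keep in mind: trailing bytes sit outside the PNG structure, so any tool that rewrites the file chunk by chunk will drop them, whereas a "safe to copy" ancillary chunk at least has a chance of surviving a PNG-aware editor.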


What about the opposite, embedding PNG files into source code?


xxd can output any arbitrary set of bytes as C code.

For example:

    > xxd -i tmp.f

    unsigned char tmp_f[] = {
        0x38, 0x4a, 0x39, 0x6f, 0x61
    };
    unsigned int tmp_f_len = 5;

I use it as part of a number of build scripts.
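If xxd is not available in a build environment, the same idea is easy to approximate. The Python sketch below mimics the shape of the output shown above; `to_c_array` is a made-up helper, not part of xxd, and the indentation and 12-bytes-per-line layout are assumptions chosen to match the example.

```python
def to_c_array(name: str, data: bytes, per_line: int = 12) -> str:
    """Render bytes as a C array definition plus a length variable."""
    rows = []
    for i in range(0, len(data), per_line):
        # Format each byte as a two-digit lowercase hex literal.
        row = ", ".join(f"0x{b:02x}" for b in data[i:i + per_line])
        rows.append("    " + row)
    body = ",\n".join(rows)
    return (
        f"unsigned char {name}[] = {{\n{body}\n}};\n"
        f"unsigned int {name}_len = {len(data)};\n"
    )

if __name__ == "__main__":
    # The five bytes from the xxd example above spell the ASCII text "8J9oa".
    print(to_c_array("tmp_f", b"8J9oa"), end="")
```

Passing the bytes of a PNG file instead of the five-byte sample yields a header that can be compiled straight into a binary.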


Well, isn't it the binary representation of the PNG already?


Well, that's not exactly novel[1], though it can be handy.

1: https://png-pixel.com/


IIRC, HolyC from TempleOS could embed arbitrary files into a source file.



