The linker trick has a problem: it doesn't set the non-executable stack bit, so your whole binary ends up with an executable stack and is therefore vulnerable to various stack-smashing exploits.
Try running 'readelf -S blob.o' and you won't see any '.note.GNU-stack' in the output.
If you go via a C source file compiled with gcc (or clang), then the compiler sets the bit properly.
Edit: Also I tried to add a comment to the original posting to warn them, but blogger just eats my comment.
If you are right then this can be mitigated by two additional options passed to objcopy, namely --add-section to create an empty section and --set-section-flags to adjust the flags of this empty section. I'm just recreating the section/flags that I see in a normally compiled file.
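Something like this (untested; I'm just mirroring the empty section and the flags that a gcc-built object shows in readelf):

    # Recreate the empty .note.GNU-stack section so the linker no
    # longer assumes an executable stack is needed.
    objcopy --add-section .note.GNU-stack=/dev/null \
            --set-section-flags .note.GNU-stack=contents,readonly \
            blob.o blob-fixed.o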
Looks good. The reason I know a bit about this is that I had the same "bin2o" hacky script that used objcopy. It broke every time someone found a new architecture or platform (i.e. choosing the correct -O and -B flags is non-trivial if you want to support every architecture).
The solution (which is not really much better than yours) is a script that creates some assembler and assembles it:
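Roughly along these lines (a simplified sketch, not the actual script; file and symbol names are illustrative):

    #!/bin/sh
    # Generate assembler that .incbin's the data file, then assemble it.
    # (@progbits is x86 syntax; some targets want %progbits instead.)
    cat > blob.s <<'EOF'
        .section .rodata
        .global blob_start
    blob_start:
        .incbin "blob.bin"
        .global blob_end
    blob_end:
        .section .note.GNU-stack,"",@progbits
    EOF
    as -o blob.o blob.s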
No binary object file "has a stack" since it's created ephemerally for the executable at runtime. However the linker checks all objects and determines if any of them contain code that requires an executable stack. For backwards compat reasons, that check will say "executable stack needed" for objects that don't contain the special section flag.
In other words, the check is a logical && operation across all the objects, and therefore somewhat fragile. It's for this reason that in RHEL we have a separate check for all the RPMs we ship to make sure that none of them contain executable stack binaries (well, except in some rather exceptional circumstances).
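For anyone who wants to spot-check a binary themselves, the GNU_STACK program header is the thing to look at (an illustrative one-liner, not the actual RHEL check):

    # "RW" in the flags column is fine; "RWE" means an executable stack.
    readelf -lW ./app | grep GNU_STACK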
I think you can argue this is a bug in the linker command that creates the binary blob object.
Very likely larger amounts of binary data linked into your program should be "read only data", and hence be put in the ".rodata" section; the objcopy manpage documents a --rename-section option for exactly this.
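For instance (a sketch; the output target and architecture values depend on your platform):

    # Import raw data and move it from the default .data section into
    # .rodata with read-only flags.
    objcopy -I binary -O elf64-x86-64 -B i386:x86-64 \
            --rename-section .data=.rodata,alloc,load,readonly,data,contents \
            blob.bin blob.o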
I had made a short demo of this technique quite a few years ago, but regrettably hadn't included the .rodata thing: https://github.com/vogelchr/objcopy_to_carray. If you want to check this out with minimal effort, have fun with it ;-).
I'll add that by segregating read-only and writeable data, you allow the virtual memory system to efficiently share a single copy of the read-only data among all processes loading it.
You are perfectly right. And there's also a second use case: in some embedded applications code runs directly from ROM/flash, and things put in the .rodata section will stay there rather than being copied to RAM (which you obviously have to do for modifiable data).
This isn't extremely important, but you should probably declare those symbols as straight `char`s, not `void *`. The symbols themselves aren't pointers; each symbol refers to the first byte of your binary block, the same way `b` from `char b` refers to a byte on the stack. That's why you have to take the address of the symbol to get the address of the block. It makes more sense to declare it as a `char`, because then nobody will attempt to use the original block symbols to access the blob itself.
As it is, you have `_binary_blob_start`, `start`, `_binary_blob_end` and `end`. All four are declared as `void *`, but `start` and `end` are the only ones that actually point to the block! Read as pointers, `_binary_blob_start` and `_binary_blob_end` would just contain the first 4/8 bytes of your binary data - they aren't real pointers at all.
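So something like this, assuming the blob was linked in from a file named "blob" (the symbol names are whatever the linker generated for your input):

    #include <stddef.h>

    /* The symbols name bytes of the blob itself, so declare them as
       char arrays, not void pointers. */
    extern const char _binary_blob_start[];
    extern const char _binary_blob_end[];

    static size_t blob_size(void) {
        return (size_t)(_binary_blob_end - _binary_blob_start);
    }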
On OS X: pass -sectcreate when linking (see man ld) or use segedit.
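For example (the segment/section names here are just illustrative):

    # Embed blob.bin as section __blob in segment __DATA at link time;
    # read it back at runtime with getsectiondata() from <mach-o/getsect.h>.
    cc -o app main.c -Wl,-sectcreate,__DATA,__blob,blob.bin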
If you're building native executables rather than cross-compiling for some embedded target, xxd has the advantage of not needing different commands on different platforms, although xxd itself may not be commonly found on Windows - oddly enough, it's part of vim...
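The usual invocation, for reference (the names in the generated header are derived from the input file name):

    # Emits a C array plus a length variable:
    #   unsigned char blob_bin[] = { ... };
    #   unsigned int blob_bin_len = ...;
    xxd -i blob.bin > blob.h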
Albeit if you're not targeting an embedded system, you should consider why you're embedding data into executables in the first place.
Back in the days of BeOS and gcc 2.95, I ran out of memory (32M!) trying to compile xxd-type data. x86 BeOS used ELF and PPC BeOS used PE (and the MetroWerks toolchain) so using the linker trick probably wouldn't have been an option (had I known about it at the time!).
> does it really need embedding in the executable?
Yes, because self-contained executables are massively easier to deploy than ones that have data dependencies, because it's just one file that you can put anywhere. Plus you avoid having to write any file I/O code (and deal with potential errors).
Honestly I never understood why this technique is so obscure, rather than being standard practice for C/C++ devs.
> what about the case when you want n blobs of data instead of one?
Each one turns into a .o file, then you link them all together. There's nothing limiting you to just one.
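For example (file names illustrative):

    # Each data file becomes its own object with its own symbols:
    ld -r -b binary -o sounds.o sounds.bin
    ld -r -b binary -o sprites.o sprites.bin
    cc -o game main.c sounds.o sprites.o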
Well, the only drawback that comes to mind is that you can't mutate the state of that blob (well, you can, but you really really shouldn't). Also if it's obscenely large, it might be better to keep it on disk and load only as much as you need/can.
> Also if it's obscenely large, it might be better to keep it on disk and load only as much as you need/can.
That's what this technique does: the kernel doesn't fully load the executable file into memory; it mmaps it and only pages data in from disk as it's requested.
I wasn't aware that kernel did that. Now that you said it, it's obvious that they would, but it never actually crossed my mind. Thanks for providing some insight and making me a bit wiser! :-)
i'm also not sure i believe you that self-contained executables are massively easier to deploy in most cases where the end user is involved. normally users want minimal interaction: if it makes a difference to them how many files it is, then you are doing something even more wrong in how you expect them to deploy. for most users on most platforms it should be an installer or some app store, possibly both.
if you are targeting a very specific type of user then you might well be right, but that's not obvious at all.
i asked about the case for n blobs of data because it is unclear where the symbol names come from in your example, and changing them after the fact is not something i would expect people to work out for themselves...
That said, a self-contained executable is super-easy to distribute to a user: They download the executable and then they run it. No need for an installer. When they don't want it anymore they can delete it. That's actually pretty great.
That said, obviously if you are using some sort of package manager anyway then the benefits are greatly decreased -- effectively, the package becomes the self-contained unit of deployment. The benefit of a self-contained executable is more for cases where bringing in a package manager would create a lot of extra work, or where requiring the user to use some particular package manager would be an unacceptable limitation.
> it is unclear where the symbol names come from in your example
I'm not the author of the article.
The symbol name is based on the file name of the input data. This is my least-favorite part of this linker feature -- I have looked for a way to specify a different symbol name but haven't been able to find one. But in any case, yes, different input files will get different symbol names.
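You can see the generated names with nm; the input file name gets mangled into the symbols (sketch; addresses and sizes depend on your file):

    $ ld -r -b binary -o blob.o data.bin
    $ nm blob.o
    0000000000000400 D _binary_data_bin_end
    0000000000000400 A _binary_data_bin_size
    0000000000000000 D _binary_data_bin_start

(For what it's worth, objcopy does have a --redefine-sym old=new option that can rename symbols after the fact.)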
Finding the data belonging to an installed application is not portable across operating systems, so embedding the data will result in much simpler code.
When the application starts, the current working directory can be anywhere, you need to find the absolute paths to the data files, and this works differently on each operating system.
i think you are thinking of a special case... in general this is not a problem because the working directory is specified.
most users never touch the command line; this kind of thinking is most valid for a tool or system component that will be run from any old context (but even then...)
it's also quite common to think that embedding data in your executable is a bad idea if it is large, even when it isn't something changing or generic. there are some classic reasons for this:
* some platforms have tight constraints on executable size, so you actually just can't do it; most platforms have at least some constraints. you won't be embedding 64GB of lunar altimetry data into your executable no matter what...
* you lose control over what goes into and out of memory and have to trust some implementation detail. mmap isn't always great on your platform (it might load everything, twice!), and might not even be the approach the executable loader takes.
* you lose control over choosing to load the data after a delay or in the background. it impacts the time it takes for the executable to launch at all. on small devices this is easily measurable even without much data.
* file systems are a good way to manage data. it is what they are for. if you have lots of data and you need multiple people to work with it, then it's easier to manage in some hierarchical and divided way - like a file system.
tbh i've written a lot of cross platform software targeting every major platform from the last 10 years and plenty of tiny ones too... this is not something i've ever felt was necessary. as for that working directory problem: i stopped trying to work around it a very long time ago (the first time i tried to write a big bit of software i convinced myself it would be a problem somehow, and wrote something similar to this link [but worse and with fewer platforms]), and i've not looked back or had problems because of it, just less code to maintain.
Sure, it mainly depends on the data size whether embedding makes sense. For instance I've recently written an 8-bit emulator where all available software ever written for the original machine is under one MB. In this case it definitely makes sense to embed the data in the application executable. It doesn't make sense if the data is dozens or hundreds of megabytes in size, and especially if only a small chunk of the data is needed at any one time.
I don't quite agree that the 'finding the data' problem is trivial. You can't just do an fopen("mydata.txt", "r") and expect it to work across different platforms and launch methods, especially when launched through a desktop environment instead of the command line. There's always some platform-specific code involved to get the data's absolute location, and on some platforms it's more complicated than on others.
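A rough sketch of what that platform-specific part tends to look like (the APIs are real; error handling is minimal and the function name is made up):

    /* Get the absolute path of the running executable, as a starting
       point for locating data files installed next to it. */
    #if defined(__linux__)
    #include <unistd.h>
    #elif defined(__APPLE__)
    #include <mach-o/dyld.h>      /* _NSGetExecutablePath */
    #include <stdint.h>
    #elif defined(_WIN32)
    #include <windows.h>
    #endif

    int exe_path(char *buf, unsigned size)
    {
    #if defined(__linux__)
        ssize_t n = readlink("/proc/self/exe", buf, size - 1);
        if (n < 0) return -1;
        buf[n] = '\0';
        return 0;
    #elif defined(__APPLE__)
        uint32_t sz = size;
        return _NSGetExecutablePath(buf, &sz);   /* 0 on success */
    #elif defined(_WIN32)
        return GetModuleFileNameA(NULL, buf, size) ? 0 : -1;
    #else
        return -1;   /* no portable way */
    #endif
    }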
I do this for test scripts at work. They're written in Lua and I found it easier to embed all the possible Lua modules (not individual scripts; there's a subtle difference) in a custom Lua interpreter. That way, all that is needed is one executable and the specific script. The two dozen modules are already there in the executable.
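Roughly how that can be wired up, as a hedged sketch (assumes Lua 5.3/5.4 and a module linked in with the blob trick; every name here is illustrative):

    #include <lua.h>
    #include <lauxlib.h>

    /* Symbols created by e.g. `ld -r -b binary -o mymod.o mymod.lua`. */
    extern const char _binary_mymod_lua_start[];
    extern const char _binary_mymod_lua_end[];

    /* Loader invoked when a script does `require "mymod"`. */
    static int load_mymod(lua_State *L) {
        size_t len = (size_t)(_binary_mymod_lua_end - _binary_mymod_lua_start);
        if (luaL_loadbuffer(L, _binary_mymod_lua_start, len, "mymod") != LUA_OK)
            return lua_error(L);
        lua_call(L, 0, 1);    /* run the chunk; it returns the module table */
        return 1;
    }

    /* Register the loader in package.preload before running any scripts. */
    void register_embedded_modules(lua_State *L) {
        luaL_getsubtable(L, LUA_REGISTRYINDEX, LUA_PRELOAD_TABLE);
        lua_pushcfunction(L, load_mymod);
        lua_setfield(L, -2, "mymod");
        lua_pop(L, 1);
    }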
If anyone is curious, doing this in Java is only possible if the resulting Java file is no bigger than 65kb and does not include more than 65k literals/constants.
If you're shipping Java software, you probably already deliver more than one class file, possibly within a JAR. I think it makes more sense to include the binary data as a regular file and load it as a resource through the class loader.