
Embedding files in C programs with koio - yarosv
https://drewdevault.com/2018/05/29/Embedding-files-in-C.html
======
_pRwn_
This is like looking back to the 80s, when we coded for Amiga & Atari ST ...

------
dvh
I've been using bin2c for years.

------
stevekemp
As others have already mentioned, there are a lot of existing solutions to this
problem. I'm not averse to reimplementing tools myself, but if you're going to
do that it makes sense to add improvements along the way.

One obvious improvement would be to compress the stored data, via
gzip/bzip/similar, which would result in a smaller binary. As a small side-
effect the embedded resources would be less visible to anybody who ran
"strings" against your binary.

------
wahern
I imagine that many situations where you might want to embed assets into a
binary involve embedded work. With embedded work you often want to be able to
cross-compile. Requiring that the koio tool be built first on the host
architecture (as opposed to the target architecture) gets messy, especially if
you can't or don't want to depend on having it preinstalled.

The koio utility might be better written in POSIX shell.

FWIW, here's a simple POSIX shell-compatible routine that will convert an
8-bit stream into a quoted C string

    
    
      cstring() {
        # use od to translate each byte to hexadecimal, sed to format as
        # proper C string
        od -An -tx1 -v | sed -ne '/./p' | sed -e '
          # prefix \x to each hexadecimal pair and remove trailing space
          s/\([0-9a-fA-F][0-9a-fA-F]\)[[:space:]]*/\\x\1/g;
      
          # quote escaped bytes
          s/^[[:space:]]*/"/;
          s/$/"/;
      
          # escape newline for all but the last line
          $!s/$/ \\/;
        '
      }
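
On the C side, output like this might be consumed roughly as follows. This is only a sketch: the literal is the one I would expect `printf 'hi\n' | cstring` to produce, inlined so the example stands alone, and the names `greeting`, `greeting_len`, `greeting_matches` and the file name `greeting.inc` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Inlined here for illustration. In a real build the literal would come
 * from the generated file, along the lines of
 *     static const char greeting[] =
 *     #include "greeting.inc"
 *     ;
 * where greeting.inc is a hypothetical file written by the shell routine. */
static const char greeting[] = "\x68\x69\x0a";

/* sizeof counts the terminating NUL that the compiler appends to a string
 * literal, so the embedded payload is one byte shorter. */
static const size_t greeting_len = sizeof(greeting) - 1;

/* Returns 1 if the embedded bytes equal the given buffer, 0 otherwise. */
int greeting_matches(const void *buf, size_t len) {
    assert(buf != NULL);
    return len == greeting_len && memcmp(greeting, buf, len) == 0;
}
```

Because every byte is written as a hex escape, the array always carries one extra NUL at the end; code that cares about the exact size should use the `sizeof - 1` form rather than `strlen`, which would stop at the first embedded zero byte.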

------
andyonthewings
I have been using a library named incbin
([https://github.com/graphitemaster/incbin](https://github.com/graphitemaster/incbin)).
On Mac and Linux it doesn't even require a CLI tool to convert the file. It
just embeds the content using the `.incbin` directive of the inline assembler.

It is pretty perfect for my project, which is a deep learning application for
Android. I use it to embed the CNN model file into the C++ code. It lets me
avoid putting it in the apk, and then loading it from Java, and then passing
it to C++.

------
anilakar
Are there any advantages vs just embedding the file as a char array? I've
found it easier to mmap any input files anyway so as to avoid an extra level
of buffering in userspace.
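
For comparison, the mmap approach might look something like this on a POSIX system. It is a minimal sketch, not a hardened implementation; `map_file` is a hypothetical helper name.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only. On success returns the mapping and stores
 * its length in *len; returns NULL on failure or for an empty file.
 * The caller releases the mapping with munmap(). */
const void *map_file(const char *path, size_t *len) {
    assert(path != NULL && len != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    *len = (size_t)st.st_size;
    return p;
}
```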

~~~
jschwartzi
If you don't have a userspace or a filesystem, you can't memory map anything
from it. In that case, your binary needs to contain all of the assets it needs
to run such as images.

I've needed to do this to embed images in an application written on bare metal
before. objcopy and its ilk do turn the data into a byte array that has a
symbol in the symbol table. You then reference that symbol in your code.
Generally you're not embedding this stuff in an ELF in that case.
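
Referencing those symbols tends to look like the sketch below. It assumes an asset called `logo.bin` converted with objcopy's `-I binary` input mode (the output format flags depend on the target and are left out here); objcopy derives the `_binary_..._start` and `_binary_..._end` names from the input file name. The helper `blob_size` and the `LOGO_*` macros are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Symbols objcopy generates for an input file named logo.bin. They are
 * only declared here, so nothing needs to link against them until
 * LOGO_DATA or LOGO_SIZE is actually used. */
extern const unsigned char _binary_logo_bin_start[];
extern const unsigned char _binary_logo_bin_end[];

/* Number of bytes between two linker-provided markers. */
size_t blob_size(const unsigned char *start, const unsigned char *end) {
    assert(start != NULL && end >= start);
    return (size_t)(end - start);
}

/* Convenience accessors for the embedded asset. */
#define LOGO_DATA (_binary_logo_bin_start)
#define LOGO_SIZE blob_size(_binary_logo_bin_start, _binary_logo_bin_end)
```

Computing the size from the start and end addresses avoids relying on the separate `_size` symbol, whose value is an address-encoded integer and is awkward to read from C.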

------
kccqzy
Even xxd has a mode (`xxd -i`) to dump a file in the way expected by C. From
there it's just an #include away.
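
As an illustration, for a six-byte file `hello.txt` containing `hello\n`, `xxd -i hello.txt` should emit declarations like the two below, with names derived from the file name. They are pasted inline here so the sketch stands alone; `write_hello` is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>

/* Expected shape of the `xxd -i hello.txt` output; normally this lives in
 * a generated header that the program pulls in with #include. */
unsigned char hello_txt[] = {
  0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x0a
};
unsigned int hello_txt_len = 6;

/* Hypothetical helper: copy the embedded bytes to an open stream.
 * Returns the number of bytes written. */
size_t write_hello(FILE *out) {
    assert(out != NULL);
    return fwrite(hello_txt, 1, hello_txt_len, out);
}
```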

~~~
makapuf
I personally end up using my own simple Python scripts.

Alternatives being:

- use objcopy, but this requires specifying the binary format of the output,
  which some build settings can make complex at the compile stage
- use ld with -b binary. This works, but it puts all data in the .data
  section, which places it in RAM, with no way to change that.
- use incbin via inline assembly. This works too, but it's slightly more
  complex and relies on inline assembly, so it's a bit less portable.
- use the xxd utility: this only works on Linux, does not export _const_
  unsigned char[] (fixable with a simple sed), and lacks the ability to export
  just a header or to control prefixes / extensions.

------
EdSchouten
Aren't there many tools out there that can already do this? I thought even
objcopy(1) can turn an arbitrary file into a .o file containing a single
symbol holding the data.

~~~
emmelaich
What is the clang or macos equivalent to objcopy?

~~~
planteen
You are just dealing with ELF files, so objcopy works fine in those scenarios
as long as the architecture matches for the linker. This trick works with many
other compilers that use ELF, such as clang, MSVC, Green Hills, etc.

~~~
tetromino_
Except that MacOS does not use ELF, it uses Mach-O.

~~~
planteen
Clang handles intermediate ELF object files fine and can link into a Mach-O
final executable.

Windows under MSVC doesn't use ELF either (it uses PE for the .EXE) and it
links in intermediate ELF object files fine. I've made an EXE that was a mix
of files compiled by MSVC and GCC.

