DragonFFI: C Foreign Function Interface and JIT Using Clang/LLVM (github.com/aguinet)
120 points by matt_d on Feb 2, 2018 | 23 comments

I do some embedded hardware work, and have tests for the C code in Python via FFI. It's also amazing to be able to graph out scenarios with inputs and outputs in a Jupyter notebook.

But I hate, hate, hate, cffi's parsing of include files. Makes my life miserable every time I make a change there.

I'm really hoping DragonFFI takes off.

This is one of my favorite features of Terra [1]. It also uses Clang to parse the header files, so most things "Just Work" (including, even, JITing inline functions on the fly so they can be used).

[1]: http://terralang.org/api.html#using-c-inside-terra

Out of curiosity, what do you use Terra for? It looks cool but I haven't found an excuse to use it.

I used it to build Regent: http://regent-lang.org/ (TL;DR: Implicit parallelism on supercomputers with automatic compilation for GPUs, etc.)

I've been using ctypes for the exact same purposes and it has been pretty painless. Why have you decided to use FFI?

As far as I can see, ctypes requires you to manually specify everything, rather than load it from the .h include file. This is even more work.

Not to mention another source of error when testing the C code in Python. Your test might fail because you forgot to update a header definition on the Python side, not because you introduced an error in the C code.
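To illustrate the point about ctypes, here is a minimal sketch of what "manually specify everything" looks like. It assumes a POSIX system where the C library can be loaded through ctypes; the `Point` struct is hypothetical and stands in for whatever a project's header declares.

```python
import ctypes
import ctypes.util

# Load the C library. If find_library returns None, CDLL(None) falls back
# to the symbols already loaded in the process (POSIX only; an assumption).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# The prototype `size_t strlen(const char *)` has to be restated by hand.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# Hypothetical struct mirroring `struct Point { int x; int y; };`.
# If the C header changes, this copy has to be updated separately.
class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]

n = libc.strlen(b"hello")
point_size = ctypes.sizeof(Point)
```

Nothing ties `Point` or the `strlen` declaration to the real header, which is the drift problem described above.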

Compared with libffi, though, this can't replace it in most projects. libffi is a tiny library with no dependencies on Python, Clang or LLVM that's easily integrated into a language, and which doesn't rely on the parsing of any files.

You can statically link libffi for platforms that don't usually have it installed as a dynamic lib; it adds next to nothing to the footprint.

I believe other languages could use this without Python, in principle. It is currently distributed as a Python package to provide functionality similar to the existing cffi package. The Python wheels are 18MB and statically linked -- so no external Clang or LLVM dependencies as far as I can tell. That is indeed quite large relative to libffi (or compared to LuaJIT, which is well under 1MB IIRC). But relative to 100s of MB for a full SciPy stack with Anaconda, it's not that much.

This could also be a foundation for more than "just" C FFI: it includes a full C++ compiler and JIT! Adding C++ FFI support would take a lot of work I’m sure, but I doubt it would noticeably increase the footprint because the bulk is already in LLVM+Clang. Another possibility would be to JIT Cython-generated code on-demand.

Reading the code, that 18MB is likely from statically linking LLVM into Python.

ihnorton's comment says this clearly: "... 18MB and statically linked -- so no external Clang or LLVM dependencies".

This is of course the technically best way, but the immense footprint of 18MB just for the FFI and the constant C ABI changes in LLVM are the reasons we haven't done it so far either. You either ship the huge LLVM libs or risk an outdated system LLVM. But for proper LTS distros it's tempting.

(original author of dragonffi here)

FTR, the 18MB footprint is the compressed version; the uncompressed version is ~57MB. I think there are huge improvements possible here, among them:

* compile LLVM with -fvisibility=hidden

* compile the whole thing with ThinLTO, which could remove unused code

* potentially other ideas :)

libffi author here...

Did you consider writing a purpose-built JIT compiler? I believe the FFI use-case is narrow enough that a tiny hand-written JIT compiler would be pretty easy... like the old QEMU template based JIT.

LuaJIT’s DynASM would go a long way there; might even have reusable stubs. Or slightly crazier, if 520K isn’t too big a burden: wrap the LuaJIT cffi ...from C... and embed all of LuaJIT.

(by the way: thank you for libffi!)

I didn't, as I really wanted to see how that would be possible using only clang/llvm. The way I see it, it seems a shame to reimplement every possible ABI (and there are more of them all the time (https://xkcd.com/927/)) when compilers like clang and gcc already do all the work.

One point of my LLVM talk tomorrow is to discuss how we could "extract" these parts from clang to do this in the lightest way possible! (But that would still require embedding a full LLVM backend, which is still huge for the FFI use case alone.)

One advantage of embedding a full compiler is that we can really JIT C code from, say, a Python interpreter. The usefulness of this is another debate :)

Put differently, I thought the experiment was worth a try, and it seems fun to see how far we can go from here :)

I've always wondered if there's a way to do this without needing the header files. Like by parsing the debug information. Obviously, this means you need there to be debug information, but that always struck me as a cleaner solution.

Me too. Surely it would be faster and more portable to put function signatures within the binary lib. No more parsing of source needed.

If you only added the signatures and not all the debug stuff the overhead would be pretty small.

This is actually a pretty funny/nice idea, and could be possible with the current design!

Using debug info also makes the whole thing potentially usable with other languages that emit functions with the right ABI and DWARF information.

This should be possible as long as the binary is compiled with its symbol table intact.
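As a rough sketch of the idea in this subthread: if function signatures could be recovered from the binary (from debug info or some embedded table), bindings could be generated without touching a header. The example below assumes a POSIX system; the `SIGNATURES` table, the `CTYPES` mapping and the `bind` helper are all hypothetical, with the table written out by hand rather than actually parsed from DWARF.

```python
import ctypes
import ctypes.util

# Hypothetical: signatures as they might be recovered from debug info,
# written here by hand as (return type, [argument types]).
SIGNATURES = {
    "strlen": ("size_t", ["char*"]),
    "abs": ("int", ["int"]),
}

# Minimal mapping from C type names to ctypes types (far from complete).
CTYPES = {
    "int": ctypes.c_int,
    "size_t": ctypes.c_size_t,
    "char*": ctypes.c_char_p,
}

def bind(lib, signatures):
    """Attach argtypes/restype to each listed symbol and return the callables."""
    bound = {}
    for name, (ret, args) in signatures.items():
        func = getattr(lib, name)  # raises AttributeError if the symbol is missing
        func.restype = CTYPES[ret]
        func.argtypes = [CTYPES[a] for a in args]
        bound[name] = func
    return bound

# If find_library returns None, CDLL(None) uses the process's own symbols.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
funcs = bind(libc, SIGNATURES)
length = funcs["strlen"](b"dragon")
magnitude = funcs["abs"](-7)
```

The symbol lookup in `bind` only works because the library still exports those names, which matches the remark about keeping the symbol table intact.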

Anyone know what "triadic arguments" means in this context? Is it a typo of variadic?

  Some C features are still not supported by dffi (but will be in future releases):
     * C structures with bitfields
     * functions with triadic arguments

I believe it's supposed to be variadic. I thought it originally meant 3-arg functions, but I can see in one of the examples it calls a 3-arg function. The TODO file however says "var args" still needs to be implemented.

* Example calling three arg function: https://github.com/aguinet/dragonffi/blob/master/examples/ar...

* TODO File: https://github.com/aguinet/dragonffi/blob/master/TODO

This was an unfortunate typo I just fixed: https://github.com/aguinet/dragonffi/commit/dc623098d30f3706...

So it is about variadic arguments. The reason they are not supported yet is time: I haven't gotten around to making them work, but there don't seem to be any big obstacles.
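For readers unfamiliar with the term, a variadic function is one like `snprintf` whose prototype ends in `...`, so the types of the trailing arguments are not known from the declaration. A minimal sketch of calling one, using ctypes rather than dffi and assuming a POSIX C library:

```python
import ctypes
import ctypes.util

# If find_library returns None, CDLL(None) uses the process's own symbols.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# int snprintf(char *buf, size_t size, const char *fmt, ...);
# Only the three fixed parameters can be declared up front.
libc.snprintf.argtypes = [ctypes.c_char_p, ctypes.c_size_t, ctypes.c_char_p]
libc.snprintf.restype = ctypes.c_int

buf = ctypes.create_string_buffer(32)

# The variadic tail has no prototype, so each extra argument is wrapped
# explicitly to pin down its C type.
written = libc.snprintf(
    buf, len(buf), b"%d-%s", ctypes.c_int(42), ctypes.c_char_p(b"ffi")
)
text = buf.value
```

The caller has to pick the C type of every trailing argument at each call site, which is why an FFI layer needs separate handling for these functions.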
