Genny – Generate Nim library bindings for many languages (github.com/treeform)
54 points by elcritch 8 months ago | 8 comments



Here's an example of the Python produced by Genny: https://github.com/treeform/pixie-python/blob/master/src/pix...

That Python is the output from this file: https://github.com/treeform/pixie/blob/master/bindings/bindi...

This is very new stuff for us but we really like the idea that we can generate comfortable bindings for many languages and never have to worry about method signatures or comments or default values getting out of sync with the C / shared lib interface.
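To illustrate the "never out of sync" idea above, here is a minimal Python sketch of generating a wrapper from a single declarative description, so the signature, docstring, and default values all come from one place. This is not how Genny is implemented; `Param`, `ProcSpec`, `render_python`, and the `dll` variable name are hypothetical names made up for the illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Param:
    name: str
    type_name: str
    default: Optional[str] = None  # source text of the default value, if any


@dataclass
class ProcSpec:
    """Hypothetical single source of truth for one exported procedure."""
    name: str
    params: List[Param]
    returns: str
    doc: str


def render_python(spec: ProcSpec, lib_var: str = "dll") -> str:
    """Emit Python wrapper source that forwards to `lib_var.<name>`."""
    args = []
    for p in spec.params:
        arg = f"{p.name}: {p.type_name}"
        if p.default is not None:
            arg += f" = {p.default}"
        args.append(arg)
    call_args = ", ".join(p.name for p in spec.params)
    return (
        f"def {spec.name}({', '.join(args)}) -> {spec.returns}:\n"
        f'    """{spec.doc}"""\n'
        f"    return {lib_var}.{spec.name}({call_args})\n"
    )


# Assumed example procedure; nothing here comes from pixie or Genny.
spec = ProcSpec(
    name="scale",
    params=[Param("value", "float"), Param("factor", "float", "2.0")],
    returns="float",
    doc="Multiply value by factor.",
)
print(render_python(spec))
```

Because every target language would be rendered from the same `ProcSpec`, changing a default or a comment in one place changes it in all generated wrappers.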


This looks really cool. I kind of liked the idea of Nim from the get-go (I'm a Python guy first and foremost), and liked the idea of a "fast compiled Python", but never enough to make the leap.

Maybe I can write my loops in this and dip my toes this way? Good stuff, I'll be keen to see how it evolves.


For the specific case of Python executing Nim code you might be interested in [nimporter](https://github.com/Pebaz/nimporter) instead. Genny would definitely work for you as well, but it seems to be more geared towards creating libraries in Nim that can be imported by multiple languages.


Ah this looks beyond amazing, thanks!

Python always struggled with the “2 language problem” - you want something flexible and malleable to code in most of the time (-> usually slow) and something fast in small doses (-> usually rigid and compiled).

C/C++ is good but requires heavy scaffolding. Fortran has good support AFAIK, but it’s Fortran. There used to be an unholy and very useful utility called “scipy.weave” that let you run inline C++; I think it was a coding abomination, and it is now gone.

The recommended way is using Cython, but I find it generally neither nice nor fast.

Nim could seriously be it for me.


Now if only a full binding generator for C and C++ headers to Nim was done (the reverse), the language would really be cooking!

"What do you mean, Nim has two of these already?"

Yeah, I know, and -- not to hurt anyone's feelings -- but: they kind of suck. And I see no way they could be extended to do the job fully, given the way they're currently built.

Those are some bold claims to make!

So before I get stoned to death (no offense to the authors, I am grateful that they exist and have used them both) let me attempt an explanation and back up these statements.

---

To start off, the two tools available are "c2nim" and "nimterop". c2nim is an official Nim library, while nimterop is a community library.

  https://github.com/nim-lang/c2nim
  https://github.com/nimterop/nimterop

To preface this: I've spent a fair amount of time on the "codegen of bindings for interop from C/C++ headers" problem. The first thing I did was reach out to people who spent months or years on this problem and ask them what worked and what didn't. (I had no clue where to start and figured lots of smart people have done this before.)

Consistently, what I heard was (roughly):

  "If you want to programmatically interact with information about C/C++ units, you want to use the LLVM/clang ecosystem. You're going to think you can do what you want from 'libclang', the C API. But what you'll find out after sinking months of your time is that libclang DOESN'T expose enough info about the AST for certain scenarios, and there is ABSOLUTELY ZERO WAY to get it. So you'll need to use libTooling, which is C++, and has the full AST available."

I've found this to be true, with the exception that a few people have taken to "augmenting" libclang with extra bindings to libTooling functions or other C++-only methods.

Most notable are Tanner Gooding & Co's "libClangSharp" from the Microsoft "ClangSharp" project, and PathogenDavid's "Biohazrd" libraries (which again are a sort of enhanced fork based on ClangSharp):

  https://github.com/microsoft/ClangSharp  # See "/sources/libClangSharp"
  https://github.com/InfectedLibraries/Biohazrd

Okay so with that out of the way, my long-winded point is:

  - c2nim is a custom C lexer/parser written in Nim, with zero support for C++
  - nimterop uses tree-sitter, which isn't bad but it's certainly no "libTooling"

Neither tool supports C++, and edge cases (in either text or syntax) will still outright break them or leave them unable to cope. I had one where an unprintable Unicode character caused c2nim to freeze indefinitely.

Nimterop is also dead as far as I understand -- its last period of active commits was about a year ago.

It's hard to convince me that the path to success here doesn't involve leaning on the collective decades of man-hours of engineering put into clang/LLVM. C++ is a nightmare of a specification, a veritable cluster-fuck of complexity.

---

As a side note, I've had the opportunity to speak to one of the Nim maintainers (Timothee Cour) about this and the future here, and he's expressed interest in having a good story for easy type-safe C++ interop via binding generation and also believes it's important.

So hopefully this is something we'll see pop up eventually. Maybe I'll take a crack at it myself using ClangSharp (or its XML output format, using TypeScript/Python to munge the XML and spit out code) but there aren't enough hours in a day to do all the interesting things =(


It’d be great if you wanted to try wrapping clang tools directly! Nim can wrap C++ code, so while it might take some manual effort to wrap enough of clang to be useful, just getting an AST shouldn’t be too much work. Doesn’t clang also have an option to output JSON/XML of the C++ code?

That said, I’ve had good luck with just c2nim. As you mention, it’ll break on some corner cases, so I just tweak the C a bit (remove function attributes, etc.) and retry. c2nim was recently updated to fix a bunch of C99 things, though, and I haven’t had to do that in a while.


Clang can dump the AST to JSON but there are some headaches when it comes to things like specialization or other circumstances.

If you wanted to go this route, using ClangSharp's XML output would probably be easier because it's been tailored to extract the "useful" bits and to expose some of the info that is hard to get at from a raw AST dump.
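For the raw JSON route mentioned above, a dump can be produced with something like `clang -Xclang -ast-dump=json -fsyntax-only header.h`. The sketch below is a rough Python illustration of walking such a dump and collecting function signatures. The `SAMPLE` document is hand-written and heavily simplified rather than real clang output, the helper names are made up for the example, and it does nothing about the specialization headaches described above.

```python
import json
from typing import Any, Dict, Iterator


def iter_nodes(node: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield a node and all of its descendants (children live under "inner")."""
    yield node
    for child in node.get("inner", []):
        yield from iter_nodes(child)


def function_signatures(ast: Dict[str, Any]) -> Dict[str, str]:
    """Map each FunctionDecl name to its qualified type string."""
    return {
        n["name"]: n["type"]["qualType"]
        for n in iter_nodes(ast)
        if n.get("kind") == "FunctionDecl" and "name" in n and "type" in n
    }


# Hand-written, simplified stand-in for a clang JSON AST dump (an assumption).
SAMPLE = json.loads("""
{
  "kind": "TranslationUnitDecl",
  "inner": [
    {"kind": "TypedefDecl", "name": "u8", "type": {"qualType": "unsigned char"}},
    {"kind": "FunctionDecl", "name": "add", "type": {"qualType": "int (int, int)"},
     "inner": [
       {"kind": "ParmVarDecl", "name": "a", "type": {"qualType": "int"}},
       {"kind": "ParmVarDecl", "name": "b", "type": {"qualType": "int"}}
     ]}
  ]
}
""")

print(function_signatures(SAMPLE))  # prints {'add': 'int (int, int)'}
```

A real binding generator would still have to map each `qualType` string to a Nim type, which is where most of the actual work lies.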


Off topic, but if OP is a maintainer, please reduce the size and resolution of the GitHub banner image (2.4 MB).



