
Ask HN: Why calling C in mainstream scripting languages is not so simple? - gorer
This is how newLISP (http://www.newlisp.org) imports and calls the C printf function from the libc library:

  (set 'library "/usr/lib/libc-2.24.so")
  (import library "printf")

  (printf "%s\n" "hello world")

The question is: why is it not so simple in other scripting languages?
======
enkiv2
Consider what needs to be done in order to implement a general purpose FFI for
calling C from a scripting language:

1) You must provide mechanisms for converting types used in the scripting
language into C types. As a simple example, strings (usually Pascal-style
length-prefixed fat strings in scripting languages) must be easy to convert
into zero-terminated character arrays. Where a scripting language has a
single numeric type with a complex underlying structure, representing
everything from small integers to arbitrary-precision and complex values, the
FFI must decide how to convert arbitrary numbers into some fixed-width C
numeric type. Furthermore, the FFI must wrap C pointers as some new
script-native type.

2) C types, even in pretty clean standard library code, are often misleading:
functions that return single characters are often specced to return int for
historical reasons (getchar, for one, so that EOF fits), char arrays are often
byte arrays rather than strings, void pointers are used all over the place to
refer to basically any type that isn't built-in, and the rules about what to
expect are folk understanding rather than being properly notated. Any
scripting language that expects to determine types of even common standard
library functions must have a huge list of rules specific to particular
functions.

3) Exported symbols carry no type information, and much of a C library's
interface (macros, constants, struct layouts) is never exported at all. Your
FFI must either know how to parse C header files (in other words, include a
complete C preprocessor) or must require the user to write out the
declarations in some special DSL that duplicates some of the behavior of a C
preprocessor. In other words, writing an FFI for arbitrary C libraries really
means writing an extra programming language interpreter or compiler.

4) C++ adds name mangling rules, which can be complex, plus its own
complicated object system rules. If you want C++ support, it probably won't
"just work" by any stretch of the imagination. In some cases, if you have a
working C FFI, you can expect the user to determine what the appropriate name
mangling is and write the code to interface with it, but this won't work well
for methods -- only for public static functions. You're also going to need
some operation to dereference pointers or create a pointer out of an arbitrary
object, if you want to be able to use any of this stuff.

5) Usually, a scripting language will have a clean and well-defined mechanism
for adding functions written in C or some other language -- a mechanism that
sidesteps all of the above concerns by having an array of methods for casting
C types to specific language-native types, and for registering these functions
with the language engine. You'll often have a mechanism for loading binaries
of this type, which will include a convention for module initialization
functions to be called after load. Since writing wrappers in C for most
languages is straightforward and packages like SWIG exist for making this
easier, why bother with writing an FFI that only allows users to write in a
crappier version of C within the scripting language?
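
Points 1 and 2 can be made concrete with ctypes from Python's standard
library. This is only a sketch, assuming a POSIX system and a 32-bit C int;
the variable names are made up for illustration.

```python
import ctypes
import ctypes.util

# Assumption: POSIX. If find_library returns None, CDLL(None) falls back
# to the symbols already loaded into the process, which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Strings: a Python str must be encoded to bytes before it can be passed
# as a zero-terminated char array.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t
utf8_len = libc.strlen("héllo".encode("utf-8"))  # counts bytes, not characters
cut_len = libc.strlen(b"abc\0def")               # C stops at the first NUL

# Numbers: a Python int is arbitrary precision, a C int is not.
# ctypes does no overflow checking here, so the value silently wraps.
wrapped = ctypes.c_int(2**31 + 5).value

# "Characters": toupper is declared as int toupper(int), so the caller
# converts to and from a code point by hand.
upper = chr(libc.toupper(ord("a")))

# Pointers: malloc's void* comes back wrapped as a plain Python integer.
libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
ptr = libc.malloc(16)
libc.free(ptr)

print(utf8_len, cut_len, wrapped, upper)
```

Every one of those conversions is a decision the caller has to make explicitly;
nothing in libc's symbol table says which one is right.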

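Point 3 shows up even in tiny cases. Python's ctypes takes the "write out the
declarations" route: the user restates by hand what the C header says. A
minimal sketch, assuming a POSIX libc whose div_t stores quot before rem
(glibc and musl do; the C standard allows either order). DivT is a name
invented for this example.

```python
import ctypes
import ctypes.util

# Assumption: POSIX; CDLL(None) is the fallback if find_library finds nothing.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Hand-written copy of the struct layout from <stdlib.h>.
# The shared library itself does not record this layout anywhere.
class DivT(ctypes.Structure):
    _fields_ = [("quot", ctypes.c_int), ("rem", ctypes.c_int)]

# Hand-written copy of the prototype: div_t div(int numer, int denom);
libc.div.argtypes = [ctypes.c_int, ctypes.c_int]
libc.div.restype = DivT

result = libc.div(17, 5)
print(result.quot, result.rem)
```

If the field order or types in DivT are wrong, nothing complains; the call
just returns garbage, which is why header parsing or a declaration DSL ends
up being part of any serious FFI.
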
That said, I think the idea that mainstream scripting languages mostly lack
this facility is false. Several implementations of Python have an FFI like the
one described; so do LuaJIT and Julia. I don't know for sure if Perl does,
but I'd be shocked if it didn't, given Perl's userbase and habits. Mainstream
languages that lack it often do so for very reasonable security reasons: if
JavaScript had this facility, imagine how awful that would be for literally
every web browser!
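
For comparison with the newLISP snippet in the question, roughly the same call
can be made with ctypes from Python's standard library. This is a sketch,
assuming a POSIX system; variadic functions such as printf can need extra
declarations on some platforms, so treat it as illustrative rather than
portable.

```python
import ctypes
import ctypes.util

# Look up libc by name instead of hard-coding a versioned path.
# Assumption: POSIX; CDLL(None) is the fallback if the lookup returns None.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# C strings are bytes in Python 3. printf returns the number of bytes written.
written = libc.printf(b"%s\n", b"hello world")
```

That is three lines against newLISP's three, so for simple cases the gap is
smaller than the question suggests.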

