Extending OCaml Programs with Lua (baturin.org)
74 points by signa11 38 days ago | 3 comments

I'm not capable of speaking about OCaml, but I clicked through because I use Lua professionally and where it gets embedded always piques my interest.

This uses Lua 2.5; the current version of Lua is 5.3^, and Lua is not a fast-moving language. 2.5 dates from 1997, 22 years ago. So wow, an interesting use of Lua, but as a fairly capable Lua programmer I'm not sure I could code in it.

Instead of using the standard C Lua interpreter (or LuaJIT), this uses Lua-ML, which is an alternate interpreter for Lua 2.5. So moving it forward to Lua 5.1/5.3 would seem to be a lot of work: basically writing a new Lua 5.3 interpreter.

Seems super cool, and a fun project, but...wow. :)

^) https://www.lua.org/versions.html

It's based on research work on building an extensible interpreter [1]. It should be... well, extensible. Though I'm not sure what changed in Lua between 2.5 and 5.3, so maybe it's not that simple.

[1] https://www.cs.tufts.edu/~nr/pubs/maniaws-abstract.html

I spent many years working on a project with similar goals, except in my case the statically typed language was C++11/14 and the embedded one was a scripting language called Io. Io is similar in implementation size and philosophy of simplicity to Lua, and C++14 template programming is probably similar to OCaml (as far as I know, not being well versed in OCaml).

The good part, what makes you go wow that's neat, is that once you have a projection of types and method signatures into the other language, everything to do with that Just Works. You get objects with appropriate types and can call methods on the object which return values that are also appropriately wrapped and typed, which in turn have methods, etc. Provided your export is complete enough, you can navigate this whole tree or web of interconnected facilities of the static language from your dynamically typed scripts.

However despite spending years on this, I eventually gave up most of the idea, for several reasons:

First, I became discouraged that no one in either the Io language or C++ communities seemed to have any interest in the project.

Second, I hit a number of performance walls and impracticalities which led me to view the whole endeavor as a dead-end (for the purposes I wanted to use it for and the way that I built it).

If you don't export many types and methods from your host program, you find yourself often needing to tab out of your script editor to add another type or method to the exports list so you can call it. This requires recompiling the host program, removing much of the benefit of writing in a scripting language.

If you go the opposite way and try to export everything, your executable size balloons. Dead code elimination can't optimize away methods you exported script bindings for and didn't use, because it can't statically determine that no script calls them now or will call them in the future. The executable size and unused exported bindings also have an adverse effect on compile times.

Managing values via proxy objects (the bridge between the script and the static language) also imposes a heap allocated memory model. This might be less of a penalty if the static host language is also garbage collected, but for C++ you lose a lot of opportunities to treat pure values more efficiently.

Conversely, the wide variety of ways C++ has to refer to instances of types (by value, by reference, by pointer, const or not, etc.) posed a special problem for script binding. My quixotic goal was to fully support every C++ method signature; it would be silly if you couldn't pass a T value to something expecting T const reference. Therefore I included a type coercion graph attached to the bound types.
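
A minimal sketch of what a coercion graph of this kind could look like, purely as an illustration. The class name `CoercionGraph`, the helper `make_example_graph`, the string spellings of the passing conventions, and the particular rules are all assumptions made for this example; the original project's design is not described in enough detail to reproduce.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch: nodes are spellings of how a bound type is passed
// ("T", "T const&", "T*", ...); edges are conversions the binding layer allows.
class CoercionGraph {
public:
    // Register that a value held as `from` may be handed to a parameter of kind `to`.
    void allow(const std::string& from, const std::string& to) {
        assert(from != to && "a self-edge carries no information");
        edges_[from].push_back(to);
    }

    // Breadth-first search: returns the shortest chain of conversions from
    // `from` to `to` (both endpoints included), or an empty vector if none exists.
    std::vector<std::string> path(const std::string& from, const std::string& to) const {
        std::map<std::string, std::string> parent;
        std::set<std::string> seen{from};
        std::queue<std::string> work;
        work.push(from);
        while (!work.empty()) {
            std::string cur = work.front();
            work.pop();
            if (cur == to) {
                // Walk the parent links back to the start to rebuild the chain.
                std::vector<std::string> chain{cur};
                while (cur != from) {
                    cur = parent.at(cur);
                    chain.insert(chain.begin(), cur);
                }
                return chain;
            }
            auto it = edges_.find(cur);
            if (it == edges_.end()) continue;
            for (const auto& next : it->second) {
                if (seen.insert(next).second) {
                    parent[next] = cur;
                    work.push(next);
                }
            }
        }
        return {};
    }

private:
    std::map<std::string, std::vector<std::string>> edges_;
};

// Example rules for one bound type T (an assumed rule set, chosen for illustration).
inline CoercionGraph make_example_graph() {
    CoercionGraph g;
    g.allow("T", "T const&");   // a value can bind to a const reference
    g.allow("T&", "T const&");  // adding const is always safe
    g.allow("T&", "T*");        // take the address of an lvalue
    g.allow("T*", "T const*");
    g.allow("T const&", "T");   // copy, assuming T is copyable
    return g;
}
```

With these rules, `make_example_graph().path("T&", "T")` finds the two-step chain through `T const&`, while `path("T", "T*")` comes back empty, since nothing here permits taking the address of a temporary.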

This required a lot of deep thought about references to temporaries and lifetimes of ephemeral values (I had to change the proxy objects to hold a cached copy of the C++ object, in most cases.) But it was string handling which really showed why the idea of automatically coercing the bound types to make method calls Just Work was a bad idea.

My core C++ code used 8-bit char strings. When porting to Windows I needed the scripts to pass these to wide char Windows OS functions. I thought, OK, I'll just write a type converter for this. The problem with that is you need to specify a 3rd thing - the desired encoding - when converting strings. There simply isn't enough information just given the "from" type and the "to" type to make an intelligent decision about what to do.
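
To make the "3rd thing" concrete, here is a small sketch in which the same bytes widen to different UTF-16 strings depending on an explicit encoding argument. The names `Encoding`, `append_utf16` and `widen` are hypothetical and not part of any Windows or Io API, and the UTF-8 decoder does only minimal validation (it does not reject overlong forms or encoded surrogates).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical names throughout; this is not a Windows or Io API.
enum class Encoding { Latin1, Utf8 };

// Append one Unicode code point as UTF-16, using a surrogate pair above U+FFFF.
inline void append_utf16(std::u16string& out, std::uint32_t cp) {
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));
    } else {
        cp -= 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
}

// The "from" type (std::string) and the "to" type (std::u16string) are not
// enough: the caller must also say how the bytes are encoded.
inline std::u16string widen(const std::string& bytes, Encoding enc) {
    std::u16string out;
    for (std::size_t i = 0; i < bytes.size();) {
        std::uint32_t b = static_cast<unsigned char>(bytes[i]);
        if (enc == Encoding::Latin1 || b < 0x80) {
            append_utf16(out, b);  // Latin-1 bytes map directly to U+0000..U+00FF
            ++i;
            continue;
        }
        // UTF-8 multi-byte sequence: the lead byte determines the length.
        int extra = (b >> 5) == 0x6 ? 1 : (b >> 4) == 0xE ? 2 : (b >> 3) == 0x1E ? 3 : -1;
        if (extra < 0 || i + extra >= bytes.size()) throw std::invalid_argument("bad UTF-8");
        std::uint32_t cp = b & (0x3F >> extra);  // payload bits of the lead byte
        for (int k = 1; k <= extra; ++k) {
            std::uint32_t c = static_cast<unsigned char>(bytes[i + k]);
            if ((c >> 6) != 0x2) throw std::invalid_argument("bad UTF-8");
            cp = (cp << 6) | (c & 0x3F);
        }
        append_utf16(out, cp);
        i += extra + 1;
    }
    return out;
}
```

The two bytes `"\xC3\xA9"` widen to a single code unit (U+00E9) under `Encoding::Utf8` and to two code units (U+00C3, U+00A9) under `Encoding::Latin1`. An automatic converter keyed only on the two types has no way to pick between them.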

The relevance of all of the above to script language embedding in general is that it's often unclear how far to go with automatic conversion of types. Even if you don't foolishly try as much automatic conversion as I did, and even if you don't program in a host language with as many type decorators as C++ has, you still have the fundamental question of whether to decay host primitive types into script language primitive types or keep them proxied.
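
One way to picture the decay-versus-proxy question is the sketch below. It assumes a script language whose only number type is a double; the `ScriptValue` struct and the `export_decayed`, `export_proxied` and `import_int64` functions are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical script-side value: either a native script number (assumed to be
// a double) or an opaque proxy that still holds the host value.
struct ScriptValue {
    bool is_proxy;
    double number;                       // meaningful when !is_proxy
    std::shared_ptr<std::int64_t> host;  // meaningful when is_proxy
};

// Policy 1: decay the host integer into the script's primitive number type.
// Cheap and convenient for scripts, but lossy above 2^53.
inline ScriptValue export_decayed(std::int64_t v) {
    return ScriptValue{false, static_cast<double>(v), nullptr};
}

// Policy 2: keep the value proxied. Exact, but costs a heap allocation and the
// script can only use it through methods the host chose to bind.
inline ScriptValue export_proxied(std::int64_t v) {
    return ScriptValue{true, 0.0, std::make_shared<std::int64_t>(v)};
}

// Convert back when the script hands the value to a host function.
inline std::int64_t import_int64(const ScriptValue& s) {
    if (s.is_proxy) {
        assert(s.host && "a proxy must hold a host value");
        return *s.host;
    }
    return static_cast<std::int64_t>(s.number);
}
```

Round-tripping `(1LL << 53) + 1` through `export_decayed` does not give back the same integer, because a double cannot represent it exactly, while `export_proxied` preserves it at the price of the heap allocation mentioned earlier.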

The final problem is that now all your tooling (debugger, profiler, etc.) either doesn't work, because it doesn't understand your script language, or at best makes you jump between higher and lower level tools. I actually wrote a gdb wrapper which enabled debugging Io language code in the C++ IDE and displaying Io language variables in the IDE watch window. Then gdb on my system got updated and my wrapper could no longer parse the gdb output correctly.

In light of all this, I decided that in the case when you are the one writing both the host code and the interpreted code, you're better off just writing the whole application in the host language.

And yet, I've also been burned subsequently by using host-language-implemented DSLs (as opposed to an external DSL or script). So I don't exactly know what the right answer is. I dream of coming up with an intermediate approach which externalizes more of the "text" of the DSL while retaining the host-language static typing.
