
Extending OCaml Programs with Lua - signa11
https://blog.baturin.org/extending-ocaml-programs-with-lua-soupault-got-plugin-support.html
======
as-j
I'm not capable of speaking about OCaml, but I clicked through because I use
Lua professionally, and where it gets embedded always piques my interest.

This is using Lua 2.5; the current version of Lua is 5.3^, and Lua is not a
fast-moving language. 2.5 dates from 1996, more than 22 years ago. So wow, an
interesting use of Lua, but as a fairly capable Lua programmer I'm not sure I
could code in it.

Instead of using the standard C Lua interpreter (or LuaJIT), this uses Lua-ML,
which is an alternate interpreter for Lua 2.5. So moving it forward to Lua
5.1/5.3 would seem to be a lot of work: basically writing a new Lua 5.3
interpreter.

Seems super cool, and a fun project, but...wow. :)

^) [https://www.lua.org/versions.html](https://www.lua.org/versions.html)

~~~
ernst_klim
It's based on research work on building an extensible interpreter [1]. It
should be... well, extensible. Though I'm not sure what changed in Lua
between 2.5 and 5.3, so maybe it's not that simple.

[1] [https://www.cs.tufts.edu/~nr/pubs/maniaws-abstract.html](https://www.cs.tufts.edu/~nr/pubs/maniaws-abstract.html)

------
phaedrus
I spent many years working on a project with similar goals, except in my case
the statically typed language was C++11/14 and the embedded one was a
scripting language called Io. Io is similar in implementation size and
philosophy of simplicity to Lua, and C++14 template programming is probably
similar to OCaml (as far as I know, not being well versed in OCaml).

The good part, what makes you go wow that's neat, is that once you have a
projection of types and method signatures into the other language, everything
to do with that Just Works. You get objects with appropriate types and can
call methods on the object which return values that are also appropriately
wrapped and typed which in turn have methods, etc. Provided your export is
complete enough, you can navigate this whole tree or web of interconnected
facilities of the static language from your dynamically typed scripts.
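
To make the projection idea concrete, here is a minimal C++ sketch of a method table plus proxy objects. Everything in it (`Value`, `TypeInfo`, `Proxy`, and the `Car`/`Engine` host types) is hypothetical and invented for illustration; it is not the project's code and it does not use any Io API. It only shows how a call on a proxy can return either a plain value or another typed proxy that a script could keep navigating.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <variant>
#include <vector>

struct Proxy;

// Hypothetical script-side value: nothing, a decayed primitive, or a proxy.
using Value = std::variant<std::monostate, double, std::string, std::shared_ptr<Proxy>>;
using Args = std::vector<Value>;
using Method = std::function<Value(void* self, const Args& args)>;

// One exported host type: its name and the methods scripts may call.
struct TypeInfo {
    std::string name;
    std::map<std::string, Method> methods;
};

// The bridge object a script holds: a type tag plus a shared host object.
struct Proxy {
    const TypeInfo* type;
    std::shared_ptr<void> object;

    Value call(const std::string& method, const Args& args = {}) const {
        auto it = type->methods.find(method);
        assert(it != type->methods.end() && "method was not exported");
        return it->second(object.get(), args);
    }
};

// Hypothetical host types that the program wants to expose.
struct Engine { double rpm = 900.0; };
struct Car {
    std::string name = "demo";
    std::shared_ptr<Engine> engine = std::make_shared<Engine>();
};

inline const TypeInfo& engine_type() {
    static const TypeInfo info = [] {
        TypeInfo t{"Engine", {}};
        t.methods["rpm"] = [](void* self, const Args&) -> Value {
            return static_cast<Engine*>(self)->rpm;  // decays to a primitive
        };
        return t;
    }();
    return info;
}

inline const TypeInfo& car_type() {
    static const TypeInfo info = [] {
        TypeInfo t{"Car", {}};
        t.methods["name"] = [](void* self, const Args&) -> Value {
            return static_cast<Car*>(self)->name;    // decays to a primitive
        };
        t.methods["engine"] = [](void* self, const Args&) -> Value {
            // Stays proxied, so the script can call methods on the result.
            return std::make_shared<Proxy>(
                Proxy{&engine_type(), static_cast<Car*>(self)->engine});
        };
        return t;
    }();
    return info;
}
```

A chained script call would then map to `call("engine")` on the car proxy followed by `call("rpm")` on the proxy it returns. Note that even this toy already heap-allocates every proxied object and has to decide per method whether a result decays to a primitive or stays wrapped.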

However despite spending years on this, I eventually gave up most of the idea,
for several reasons:

First, I became discouraged because no one in either the Io language or C++
communities seemed to have any interest in the project.

Second, I hit a number of performance walls and impracticalities which led me
to view the whole endeavor as a dead-end (for the purposes I wanted to use it
for and the way that I built it).

If you don't export many types and methods from your host program, you find
yourself often needing to tab out of your script editor to add another type or
method to the exports list so you can call it. This requires recompiling the
host program, removing much of the benefit of writing in a scripting language.

If you go the opposite way and try to export everything, your executable size
balloons. Dead code elimination can't optimize away methods you exported
script bindings for but didn't use, because it can't statically determine that
no script uses them now or will use them in the future. The executable size
and unused exported bindings also have an adverse effect on compile times.

Managing values via proxy objects (the bridge between the script and the
static language) also imposes a heap allocated memory model. This might be
less of a penalty if the static host language is also garbage collected, but
for C++ you lose a lot of opportunities to treat pure values more efficiently.

Conversely, the wide variety of ways C++ has to refer to instances of types
(by value, by reference, by pointer, const or not, etc.) posed a special
problem for script binding. My quixotic goal was to fully support every C++ method
signature; it would be silly if you couldn't pass a T value to something
expecting T const reference. Therefore I included a type coercion graph
attached to the bound types.
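
A coercion graph of this kind could be sketched roughly as follows. This is an assumption about the general shape, not the design described above: the class name `CoercionGraph`, its `add` and `coerce` members, and the use of `std::any` with `std::type_index` are all illustrative choices. Types are nodes, registered converters are directed edges, and a breadth-first search finds the shortest chain of conversions.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <queue>
#include <set>
#include <typeindex>
#include <utility>
#include <vector>

// Hypothetical coercion graph: nodes are types, edges are converters.
class CoercionGraph {
public:
    // Register a directed edge From -> To with the function that converts.
    template <class From, class To>
    void add(std::function<To(const From&)> f) {
        assert(f && "converter must be callable");
        Edge e{typeid(To), [f](const std::any& v) {
            return std::any(f(std::any_cast<const From&>(v)));
        }};
        edges_[typeid(From)].push_back(std::move(e));
    }

    // Breadth-first search from the value's dynamic type to To.
    // For brevity this converts eagerly while searching; a real binding
    // layer would more likely find the path first and cache it.
    template <class To>
    std::optional<To> coerce(const std::any& value) const {
        const std::type_index target = typeid(To);
        std::set<std::type_index> seen{std::type_index(value.type())};
        std::queue<std::any> frontier;
        frontier.push(value);
        while (!frontier.empty()) {
            std::any current = std::move(frontier.front());
            frontier.pop();
            if (std::type_index(current.type()) == target) {
                return std::any_cast<To>(current);
            }
            auto it = edges_.find(current.type());
            if (it == edges_.end()) continue;
            for (const Edge& edge : it->second) {
                if (seen.insert(edge.to).second) {
                    frontier.push(edge.convert(current));
                }
            }
        }
        return std::nullopt;  // no chain of conversions reaches To
    }

private:
    struct Edge {
        std::type_index to;
        std::function<std::any(const std::any&)> convert;
    };
    std::map<std::type_index, std::vector<Edge>> edges_;
};
```

With edges `int -> long` and `long -> double` registered, `coerce<double>(std::any(7))` walks both edges. The sketch deliberately ignores the hard parts mentioned here: every hop produces a temporary, and nothing in the graph says who owns it or how long it lives.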

This required a lot of deep thought about references to temporaries and
lifetimes of ephemeral values (I had to change the proxy objects to hold a
cached copy of the C++ object, in most cases.) But it was string handling
which really showed why the idea of automatically coercing the bound types to
make method calls Just Work was a bad idea.

My core C++ code used 8-bit char strings. When porting to Windows I needed the
scripts to pass these to wide char Windows OS functions. I thought, OK, I'll
just write a type converter for this. The problem with that is you need to
specify a 3rd thing - the desired encoding - when converting strings. There
simply isn't enough information just given the "from" type and the "to" type
to make an intelligent decision what to do.
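
The missing third input is easy to show in code. The sketch below is purely illustrative: `Encoding` and `widen` are hypothetical names, `std::u32string` stands in for a platform wide string, and the UTF-8 branch is a lenient decoder that does not reject overlong forms or surrogates. The point is only that the same bytes widen differently depending on an encoding that neither the "from" type nor the "to" type carries.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical: the encoding must be passed in; the types cannot supply it.
enum class Encoding { Latin1, Utf8 };

std::u32string widen(const std::string& bytes, Encoding enc) {
    std::u32string out;
    for (std::size_t i = 0; i < bytes.size();) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (enc == Encoding::Latin1 || b < 0x80) {
            out.push_back(b);  // Latin-1 and ASCII bytes map 1:1 to code points
            ++i;
            continue;
        }
        // UTF-8: the lead byte tells how many continuation bytes follow.
        const std::size_t extra = (b >= 0xF0) ? 3 : (b >= 0xE0) ? 2 : (b >= 0xC0) ? 1 : 0;
        bool ok = extra != 0 && i + extra < bytes.size();
        char32_t cp = b & (0x3F >> extra);  // payload bits of the lead byte
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            assert(i + k < bytes.size());
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            ok = (c & 0xC0) == 0x80;        // continuation bytes look like 10xxxxxx
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(ok ? cp : char32_t{0xFFFD});  // replacement char on error
        i += ok ? extra + 1 : 1;
    }
    return out;
}

// The same two bytes, two different answers:
//   widen("\xC3\xA9", Encoding::Latin1)  yields U+00C3 U+00A9 (two characters)
//   widen("\xC3\xA9", Encoding::Utf8)    yields U+00E9        (one character)
```

An automatic converter keyed only on the source and destination types would have to pick one of these silently, which is exactly the kind of guess a binding layer should not make on the caller's behalf.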

The relevance of all of the above to script language embedding in general is
that it's often unclear how far to go with automatic conversion of types. Even
if you don't foolishly attempt as much automatic conversion as I did, and even
if you don't program in a host language with as many type decorators as C++
has, you still have the fundamental question of whether to decay host
primitive types into script language primitive types or keep them proxied.

The final problem is that now all your tooling (debugger, profiler, etc.)
either doesn't work, because it doesn't understand your script language, or at
best you have to jump between higher and lower level tools. I actually wrote a
gdb wrapper which enabled debugging Io code in the C++ IDE and displaying Io
variables in the IDE watch window. Then gdb on my system got updated and my
wrapper could no longer parse the gdb output correctly.

In light of all this, I decided that in the case when you are the one writing
both the host code and the interpreted code, you're better off just writing
the whole application in the host language.

And yet, I've also been burned subsequently by using host-language-implemented
DSLs (as opposed to an external DSL or script). So I don't exactly know what the
right answer is. I dream of coming up with an intermediate approach which
externalizes more of the "text" of the DSL while retaining the host-language
static typing.

