
Luje 0.1 – a pure Lua JVM - leafo
https://cowlark.com/luje/doc/stable/doc/index.wiki
======
jacques_chester
David Given is one of those prolific thinkers and doers who seem to be
overrepresented in the Lua community.

He's probably best known to HNers for his critical article "On Go" and
Objective Lua.

[http://cowlark.com/2009-11-15-go/](http://cowlark.com/2009-11-15-go/)
[https://cowlark.com/objective-lua/index.html](https://cowlark.com/objective-lua/index.html)

~~~
seabrookmx
Thanks for the link! That's a really interesting article.

Though out of date, I still think his point holds water. It's shocking how
little difference there is between Algol-68 and modern Go or C. Go has changed
quite a bit since his article, but not all of these changes have been well
received (a great example being automatic semicolon insertion).

------
tinco
The article makes a rather hefty claim: that, performance-wise, it is
comparable to or even in some ways faster than Sun's HotSpot JVM.

From what I've heard, the JVM is one of the most heavily researched and
optimized bytecode VMs out there. Could it really be that some guy can write a
competitor in a scripting language in his spare time?

Not saying it's impossible, just curious if there's anything to back that
claim up, besides the very specific microbenchmarks.

edit: I was a bit too quick to comment; reading again, he doesn't really claim
to be faster at anything other than small loops. Still interested in whether
it's fast at any real-world things, though.

~~~
pron
HotSpot is indeed one of the most optimized runtimes around. It is, however,
optimized for large, complex applications. It's quite possible to beat it in a
carefully designed microbenchmark. Beating it in a complex application is a
whole other matter (in fact, it's pretty hard to beat even in C++).

~~~
haberman
> in fact, it's pretty hard to beat even in C++

If there's one thing I've learned, it's that the wishful thinking will never
stop.

~~~
pron
Yes, but only if you call 15 years of experience in both C/C++ and Java
systems wishful thinking. And I'm talking big-ass, several MLOC, some soft and
some hard realtime systems.

But I guess some people must hold on to their beliefs, while some do extensive
research (well, to be fair, it was government funded, so we had the means to
write the same 2+ MLOC system twice, and we've had direct access to the Sun
team working on their realtime Java project at the time). I guess you just
don't have the data.

~~~
haberman
> I guess you just don't have the data.

Basically every piece of performance-critical open-source infrastructure (ie.
what we can actually all see and evaluate) is written in C or C++. That is
empirical data.

You say that long-running servers that don't have startup cost constraints
will perform better in Java, particularly when there is concurrency. Why then,
is the entire web served by web servers written in C/C++ (Apache, IIS, nginx,
LiteSpeed), mail servers written in C/C++ (Exim, Postfix, qmail, Sendmail,
Microsoft), DNS servers written in C/C++ (BIND, PowerDNS), and database
servers written in C/C++ (MySQL, PostgreSQL, Microsoft SQL Server, MongoDB,
Riak)?

You attempt to explain all this away, as if OS's, VMs, web browsers, etc
(which you admit are better off written in C or C++) are somehow different
from the applications you work on, and that somehow the Java you write in your
applications would be hard to beat in C++.

The only "data" you have offered is your own personal experience of working in
the defense industry. A couple problems though. First of all you never write
the "same" 2+ MLOC system twice, you'll always use the lessons gained from the
first go at the second design. Something tells me you wrote the C++ version
first. Your experience is also something that we can't see for ourselves, and
involves lots of variables that we can't evaluate. It seems likely that your
company didn't have a strong culture of best practices in C++, or maybe just
wasn't that good at it. Your C++ design might have had lots of opportunities
for improvement that good C++ people would immediately recognize.

It's not hard to write good and high-performance C++ when you have a good set
of guidelines like the Google C++ style guide
([http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml](http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml))
and a solid set of core
libraries, but without these it's easier to go astray. There's no doubt that
Java's reasonable set of "batteries included" libraries makes this harder to
screw up.

I can say from my brief stint in defense that I was not super impressed.
Everything was based around extremely heavy infrastructure like CORBA. There
are lots of decisions like this that can skew the results you were getting.

~~~
pron
> Basically every piece of performance-critical open-source infrastructure
> (ie. what we can actually all see and evaluate) is written in C or C++. That
> is empirical data.

Besides the fact that this is an appeal to an irrelevant authority, it is also
false.

MongoDB is about as far from being performant as you can be; Riak is not
written in C++, but in Erlang; PostgreSQL and MySQL predate Java (and predate
Java 1.4 – when it started getting performant – by many years), as do Exim,
Postfix, qmail, Sendmail and BIND (don't know about PowerDNS); Microsoft has
been an enemy of Java for political reasons.

On the other hand, recent Apache projects are all Java: Tomcat, ZooKeeper and
the list is long. Cassandra and Voldemort are Java; Twitter Storm; most
benchmarks show Java web-servers (of non-static content) to outperform or
match C and C++ servers
([http://www.techempower.com/benchmarks/#section=data-r6&hw=i7...](http://www.techempower.com/benchmarks/#section=data-r6&hw=i7&test=plaintext)).

> It's not hard to write good and high-performance C++ when you have a good
> set of guidelines like the Google C++ style guide
> ([http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml](http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml))
> and a solid set of core libraries, but without these it's easier to go astray.

This may have been true 5 or 6 years ago. But modern multicore hardware
requires a high level of concurrency, and a high level of concurrency requires
low contention, and low contention requires non-blocking data structures, and
nonblocking data structures require either a GC, or some other method of
automatic memory reclamation that's _extremely_ hard to achieve in C++ (or any
other language, for that matter). Even C++'s natural way of pointer sharing
(reference counting) is a huge contention point and a performance killer. Not
to mention the fact that until C++11, C++ didn't even have a memory model
(which, BTW, was designed by Java people).

So while it's not too hard to make single-threaded C++ code run a little
faster than Java, it's _extremely_ hard to make multithreaded C++ code
outperform Java, even for experts (and it was almost impossible prior to
C++11).
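
To make the GC point concrete, here's a minimal sketch (mine, not from any
particular library) of a Treiber stack, about the simplest non-blocking data
structure there is. The Java version below is complete as written; a C++ port
of the same dozen lines must additionally solve safe memory reclamation, or
popped nodes can be freed and reused while other threads still hold references
to them (the ABA problem).

    import java.util.concurrent.atomic.AtomicReference;

    // Minimal Treiber stack. In Java this is safe as written: the GC
    // guarantees a popped node cannot be recycled while another thread
    // still holds a stale reference to it, which rules out ABA.
    final class TreiberStack<T> {
        private static final class Node<T> {
            final T value;
            Node<T> next;
            Node(T value) { this.value = value; }
        }

        private final AtomicReference<Node<T>> head = new AtomicReference<>();

        public void push(T value) {
            Node<T> node = new Node<>(value);
            Node<T> current;
            do {
                current = head.get();
                node.next = current;
            } while (!head.compareAndSet(current, node));
        }

        public T pop() {
            Node<T> current;
            do {
                current = head.get();
                if (current == null) return null;
            } while (!head.compareAndSet(current, current.next));
            return current.value;
        }
    }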

~~~
haberman
> most benchmarks show Java web-servers (of non-static content) to outperform
> or match C and C++ servers
> ([http://www.techempower.com/benchmarks/#section=data-r6&hw=i7&test=plaintext](http://www.techempower.com/benchmarks/#section=data-r6&hw=i7&test=plaintext)).

Umm, there are no C or C++ servers in that benchmark, except for one extremely
slow one I've never heard of.

> and a high level of concurrency requires low contention, and low contention
> requires non-blocking data structures

That really does not follow.

Lock-free data structures are not about reducing contention, they are about
eliminating blocking and improving performance when contention _does_ occur.
Contention occurs whenever multiple threads attempt to access the same
resource, regardless of whether access to that resource is blocking or non-
blocking. Lock-free data structures can still experience contention, and while
they do not block, their performance does degrade under contention. Indeed
later in your message you say that reference counting is a "huge contention
point" -- refcounting a la C++'s shared_ptr is a lock-free algorithm.

So while lock-free algorithms don't affect the _amount_ of contention in the
system, they do exhibit better performance under high contention, for the
(small) subset of data structures that can be made lock-free.

But this is only an issue when contention exists to begin with. Many highly
parallel systems exhibit very low contention; for example, MapReduce. Even in
systems with inherent contention, there are many strategies for reducing it,
like partitioning and affinity schemes.
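
To illustrate with a sketch (the classes are real java.util.concurrent types;
the benchmark framing is mine): AtomicLong and LongAdder are both lock-free
counters, but only the second is designed to reduce contention, and it does so
by partitioning the count.

    import java.util.concurrent.atomic.AtomicLong;
    import java.util.concurrent.atomic.LongAdder;

    // Both counters are lock-free, but every AtomicLong increment
    // CASes the same word, so N writers fight over one cache line.
    // LongAdder stripes the count across cells and sums them on read:
    // contention is reduced by partitioning, not by lock-freedom.
    public class CounterContention {
        public static void main(String[] args) throws InterruptedException {
            AtomicLong atomic = new AtomicLong();
            LongAdder adder = new LongAdder();

            Runnable work = () -> {
                for (int i = 0; i < 1_000_000; i++) {
                    atomic.incrementAndGet();  // contended CAS retry loop
                    adder.increment();         // usually hits a private cell
                }
            };

            Thread[] threads = new Thread[8];
            for (int i = 0; i < threads.length; i++) threads[i] = new Thread(work);
            for (Thread t : threads) t.start();
            for (Thread t : threads) t.join();

            // Both report 8,000,000; the adder gets there with far less
            // cache-line ping-pong under heavy write contention.
            System.out.println(atomic.get() + " " + adder.sum());
        }
    }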

> nonblocking data structures require either a GC, or some other method of
> automatic memory reclamation that's extremely hard to achieve in C++ (or any
> other language, for that matter)

I'm not sure what is so hard about Hazard Pointers
([http://www.research.ibm.com/people/m/michael/ieeetpds-2004.p...](http://www.research.ibm.com/people/m/michael/ieeetpds-2004.pdf)).
Maged Michael is a smart guy who figured this stuff out 10 years ago.

But you don't even have to do it yourself. Non-blocking algorithms are
extremely subtle by their nature (I hope you don't implement them yourself in
Java; that would be a waste of effort). Just use an existing library like
Threading Building Blocks, Grand Central Dispatch, etc., that are specifically
designed to achieve maximum multi-threaded performance in C++.

~~~
pron
Contention is not binary, or, rather, its effects aren't binary. A lock that
is held for 10 microseconds potentially causes a lot more contention than a
CAS that takes a few nanos. OTOH, thousands of short contentions can be as
bad as, or worse than, a few long contentions (there's a whole great talk by
Doug Lea about how Java's fork/join tricks modern CPUs out of crazy
power-saving optimizations to reduce thread wakeup time).

Intel's TBB was modeled after java.util.concurrent, BTW. Later, the Java
memory model inspired the C++11 memory model (and the Java people were called
to help with that, too). When it comes to concurrency, Java is usually at
least 5 years ahead of C++ (perhaps with the exception of fork/join, which was
inspired by Cilk, but has since come to surpass it).

> I'm not sure what is so hard about Hazard Pointers

Hazard pointers are expensive without RCU (the hazards need to be visible to
other threads, which means store-load barriers), and RCU requires kernel
cooperation; and besides, this entire class of techniques is either very
limited or _incredibly_ complicated (what do you do if you have thousands of
threads?). A good GC vs hazard pointers with or without RCU is like driving
down a highway vs stumbling in a swamp when it comes to lock-free data
structures.

Here's Herb Sutter on the issue (source:
[http://herbsutter.com/2011/10/25/garbage-collection-synopsis-and-c/](http://herbsutter.com/2011/10/25/garbage-collection-synopsis-and-c/)):

 _All existing C++ garbage collectors, notably the Boehm collector, aren’t
enough for many interesting uses for several reasons, the main one being that
they’re conservative, not accurate. See Hans Boehm’s page Advantages and
Disadvantages of Conservative Garbage Collection for more discussion of
tradeoffs._

 _As usual, there’s something to standardize iff we want to enable
portability. So in this case the question is whether it’s interesting to
enable portable programs that rely on GC — and note that typically if they
rely on GC, they rely specifically on accurate GC, not conservative “maybe-
GC.” For example, C++11 atomics enable writing many kinds of portable lock-
free code, but not all; we need a standard for #2 if we want to enable people
to write portable lock-free singly-linked lists, where the only known
practical algorithms require automatic (non-refcount) GC and so they cannot be
implemented in portable C++ today without some nonportable platform-specific
code underneath._

 _Again, appeals to hazard pointers fail because the obscure convolutions and
limitations of hazard pointers only expose the limits of even the most heroic
efforts to work around the lack of true automatic GC. The main contribution of
hazard pointers, unfortunately, is that they thoroughly motivate why direct
support for automatic GC is needed._

~~~
haberman
> A lock that is held for 10 microseconds potentially causes a lot more
> contention than a CAS that takes a few nanos.

See, if you're trying to compare those two things, you're comparing a lot more
than locking vs lock-free; you're comparing two completely different designs.
All you can do with a single CAS is a very primitive operation like queue
enqueue/dequeue, stack push/pop, set add/remove, etc. If you're holding a lock
for 10us you're doing a whole lot more than that; you're doing something that
isn't possible to do lock-free, and your design is inherently less
concurrency-friendly.

If your experience comes from taking a coarse-grained mutex-heavy design in
C++ and reimplementing it in Java with a design that communicates between
threads only with lock-free data structures, then your results aren't
indicative of locking vs. lock-free _or_ C++ vs. Java.

If you really want to compare locks vs. lock-free apples-to-apples, then use
data structures that have the same API as your lock-free data structures, but
are implemented by taking a lock for a very short amount of time -- just
enough time for the operation you would have performed lock-free.
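
Something like this sketch (hypothetical code; the name and API are mine)
would be a fair baseline against java.util.concurrent.ConcurrentLinkedQueue:

    import java.util.ArrayDeque;

    // Apples-to-apples baseline: the same narrow API as a lock-free
    // queue, but each operation holds a lock only for the handful of
    // instructions the operation itself needs. Benchmarking this
    // against the lock-free version measures locks vs. lock-free,
    // not two unrelated designs.
    final class BriefLockQueue<T> {
        private final ArrayDeque<T> items = new ArrayDeque<>();

        public synchronized void enqueue(T item) {
            items.addLast(item);
        }

        public synchronized T dequeue() {
            return items.pollFirst();  // null when empty, like poll()
        }
    }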

And if you want to compare C++ vs. Java, use an equally concurrency-friendly
design with both. If you're using high-performance lock-free data structures
in Java, use the same in C++, since they are available in both. The memory
management complications can be confined to the container library; at the
application level exclusive ownership of objects can be passed from thread to
thread.

~~~
pron
Obviously. But lock-free data structures in C++ are not as good as their Java
counterparts (because of GC), and besides, most lock-free DSs are often
implemented in C++ several years after they're implemented in Java, if at all.

For example, Java's nonblocking hash map (by Cliff Click) is still not
available for C++, 6 years after its release. There are a few very
limited/half-baked implementations, but nothing "canonical" (and nothing so
thoroughly researched, optimized and used).

BTW, part of the reason is that the world's leading low-level concurrency
researchers and implementers (Doug Lea, Nir Shavit and others) do most of
their work in Java (most of my own work is designing non-blocking data
structures, BTW; not that I'm comparing myself to Lea or Shavit in any way).

To get a sense for the quality of Java data structure implementations, I
strongly suggest you watch this talk by Doug Lea:
[https://www.youtube.com/watch?v=sq0MX3fHkro](https://www.youtube.com/watch?v=sq0MX3fHkro)

His work is incredible. Even if you don't use Java, there is so much to learn
in the talk. One of the coolest things there (which I already hinted at), is
how to spin properly on modern Intel processors.

~~~
haberman
I appreciate the link to the talk, I learned a lot. It just reinforces my
beliefs though; so much of the talk is about having to work around the
idiosyncrasies of the JVM. Quotes from the talk:

4:15 "The main moral of this talk is that things are never what they seem, and
they are increasingly not even _close_ to what they seem as we have more and
more layers and systems. So there's you on top of core libraries who live on
top of JVMs, and the reason I know all the JVM folks so much is because I yell
at them pretty much all the time."

10:30 "Oh my god, [because we derive from java.util.Collection] we have put
more lines of code into supporting a method that nobody in their right mind
would call for a concurrent queue, than almost everything else combined.
Between iterators and interior remove, that's probably 80% of our concurrent
queue code. The algorithms themselves, they're maybe 20 lines, they're cool
but they're not a lot of code."

0:57:10 "How do you get a good CAS loop? CAS until success? Don't do while, do
do/while! Why? Safepoints! Remember safepoints I mentioned long ago? Yeah,
they're not your friend. [...] You know the great thing about JITs is that
they work so well, and the bad thing about JITs is that, you know, sometimes
they're idiots! [...] Boxing, oh my god I hate boxing. Classloading appearing
at mysterious times."
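
The loop shape he's advocating looks roughly like this (my sketch, not his
code):

    import java.util.concurrent.atomic.AtomicLong;

    class CasLoop {
        // The do/while CAS idiom from the talk: read, compute, CAS,
        // and loop only on failure. Per Lea, this shape interacts
        // better with the JIT's safepoint placement than a while-based
        // loop, so the common uncontended case stays on the fast path.
        static long getAndAdd(AtomicLong counter, long delta) {
            long current;
            do {
                current = counter.get();
            } while (!counter.compareAndSet(current, current + delta));
            return current;
        }
    }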

Maybe to you the JVM gives enough benefits to be worth giving up all this
control, but to me it's just nuts that you'd put yourself on top of a
platform that you have to work against so much.

And I still don't buy that GC makes for better lock-free data structures.
Lock-free data structures may need to have a GC-like scheme _internally_ to
memory-manage their nodes, but I can't see any reason why this requires the
entire program to be subject to GC.

~~~
pron
> I can't see any reason why this requires the entire program to be subject to
> GC.

It doesn't. Lots of Java low-level programs (DBs, caches etc.) manage most of
their heap manually. But an "internal" GC would have to be just as good as a
"whole program" GC anyhow in order to make the non-blocking data structure
robust (I generally don't use the term lock-free, as it's a very specific case
of a non-blocking data structure; many interesting nonblocking DSs are only
obstruction-free rather than lock-free).

Also, GCs have other advantages, such as a much higher memory allocation-
reclamation throughput than any general manual scheme, especially (though not
only) in the multi-threaded case. Many GCs may introduce pauses, but all the
good ones would give you a much higher throughput than anything you can
achieve manually.

> Maybe to you the JVM gives enough benefits to be worth giving up all this
> control, but to me it's just nuts that you'd put yourself on top of a
> platform that you have to work against so much.

Yes, to anyone doing hardcore multicore programming, the JVM is probably the
best choice. You are not giving up control, though, because C++ has far worse
problems in those areas (forget about safepoints - C++ doesn't even have a
memory model, or didn't until C++11, and most C++ programmers don't even know
what ordering guarantees their particular compiler provides). Also, if you
think you're not "giving up control" in C++ (or C; or Assembly), then you're
sorely mistaken. You have little or no control over ILP, prefetching, branch-
prediction, outstanding cache misses and all the magic modern CPUs do at
runtime. You could think of the JVM as an extension of the CPU only in
software – modern CPUs do so much crazy stuff to your code anyway, that you
can't possibly know how exactly it will be executed even if you program
directly in machine code. You can no longer get truly close to the metal, so
being one step rather than two steps removed is not much of an advantage any
more.
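
To illustrate what having a memory model means in practice (a sketch of
standard Java semantics, nothing exotic):

    // The JMM guarantees that a write to the volatile field `ready`
    // happens-before any read that observes it as true, so `data` is
    // safely published across threads. Pre-C++11 C++ had no portable
    // way to state this; you depended on whatever your compiler and
    // platform happened to do.
    class Publisher {
        private int data;
        private volatile boolean ready;

        void publish() {           // called by the writer thread
            data = 42;             // plain write...
            ready = true;          // ...made visible by the volatile write
        }

        Integer tryRead() {        // called by reader threads
            return ready ? data : null;  // if ready is true, data is 42
        }
    }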

> "Oh my god, [because we derive from java.util.Collection] we have put more
> lines of code into supporting a method that nobody in their right mind would
> call for a concurrent queue, than almost everything else combined. Between
> iterators and interior remove, that's probably 80% of our concurrent queue
> code.

Sure. The code is often complex to support moot use-cases. Still, it's leaps
and bounds beyond anything in pretty much any other language. Again, the
reason is mostly coincidental. From its beginnings, more or less, Java (or
Sun) has attracted the most prominent concurrency researchers. Java 1.5 was
the first general purpose programming environment with a memory model, iron-
level access to concurrency primitives (fences, CASs), and some world-class
GCs, all requirements for state-of-the-art concurrency research. To this day,
concurrency researchers prefer the JVM, despite its annoyances (and what
language doesn't have those?) to any other platform out there. That's why most
concurrent DSs get implemented in Java first, and their Java implementation is
usually the most polished and robust.

But to quickly go back to where we started, a higher-level PL often brings
about performance improvements. There are famous examples of sorting
algorithms in C++ easily outperforming their C implementations. Why? Because
of templates vs. function pointers. More precisely - due to inlining, as
inlining is "the mother of all optimizations" (it opens the door to other
optimizations). Now, of course you could achieve the same in C, by writing the
sort function many times, but it's cumbersome and error prone.

The JVM is "the inlining magic machine". It inlines code that a C++ compiler
never could. There are some areas where C++ is more performant, namely
computations on arrays of complex numbers, but hopefully that will be
addressed in Java 9. But this inlining capability of the JVM is why it often
outperforms C++ even in single-threaded code. Again, you could achieve the same
in C++ by writing the same code several times (or by generating machine code
at runtime, which is what the JVM does). Also, this effect is usually seen in
larger applications, because that's where you use virtual methods.
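
A rough Java illustration of the effect (my sketch; the point generalizes to
any hot call through an interface or virtual method):

    import java.util.Arrays;
    import java.util.Comparator;

    // Java analogue of the templates-vs-function-pointers story. The
    // comparator is invoked through an interface, the moral equivalent
    // of a callback passed to C's qsort(). HotSpot profiles the call
    // site at runtime; if it sees a single receiver type it can inline
    // the comparison into the sort loop, which a C compiler cannot do
    // for a function pointer compiled in another translation unit.
    public class InliningDemo {
        public static void main(String[] args) {
            Integer[] data = {5, 3, 9, 1};
            Comparator<Integer> byValue = Comparator.naturalOrder();
            Arrays.sort(data, byValue);  // the JIT can inline byValue here
            System.out.println(Arrays.toString(data));  // [1, 3, 5, 9]
        }
    }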

Anyway, that's pretty much what I have to say on the matter. I have a lot of
respect for C++. I was a C++ programmer for many years. The JVM brings many
interesting performance improvements over C++, as well as performance
regressions in comparison. All in all, Java usually comes up ahead in large
codebases, and falls behind in specific use cases (mostly numeric HPC). In
addition, there is no other programming language/platform out there that even
comes close to the JDK's power when it comes to hardcore concurrency; some
successful implementations find their way to TBB a few years later, and some
don't. If you're trying to push the envelope on concurrent software
development (my day job), Java is pretty much the only sensible way to go.

Still, there are valid reasons to prefer C++ over Java in many circumstances,
but sheer performance is seldom one of them.

~~~
haberman
> Also, GCs have other advantages, such as a much higher memory allocation-
> reclamation throughput than any general manual scheme

A benefit of using C/C++ is that you are not limited to general schemes. If
you have an allocation/deallocation hotspot, you can use more specialized
schemes like arena allocation, which will beat a GC handily.

Also, latency and predictability are often more important than throughput. GCs
fare poorly on these counts.

> forget about safepoints - C++ doesn't even have a memory model, or didn't
> until C++11

Aka it _does_ have a memory model. And one of the points Doug Lea made in the
talk you linked is that the Java memory model is actually broken and no one
knows how to fix it, whereas C++'s is not (or rather, is broken only in ways
that apparently don't affect anyone).

> Also, if you think you're not "giving up control" in C++ (or C; or
> Assembly), then you're sorely mistaken. You have little or no control over
> ILP, prefetching, branch-prediction, outstanding cache misses and all the
> magic modern CPUs do at runtime.

This is a tired and weak argument.

A mispredicted branch costs 5ns. A main memory reference (cache miss) costs
100ns. And you actually have a lot of control over these things; you can make
your branches more predictable and the working set of your tight loops smaller
so it will fit in cache. And there are actually instructions that give control
over prefetching.

A Java GC pause on the other hand is often on the order of 200ms. That's six
orders of magnitude in difference, and can be even longer in degenerate GC
situations. There is really no bound to the worst case.

The loss of control also comes from the fact that the effects are highly
nonlocal. If your C/C++ code is in a tight loop, the microarchitectural
effects are mostly a function of your own code and its behavior. But whether a
safepoint triggers a GC pause in Java is a function of the entire program's
behavior.

So you really can't equate these two things. Java's loss of control is many
orders of magnitude larger and less local.

> But this inlining capability of the JVM is why it often outperforms C++
> even in single-threaded code.

I don't think you've made this case at all.

> All in all, Java usually comes up ahead in large codebases

Coincidentally it is large codebases where it is most difficult to compare
apples to apples, because it is so unlikely that you can really get two large
programs that do the same thing with the same design. I'm deeply skeptical of
this claim.

> and falls behind in specific use cases (mostly numeric HPC).

Java falls behind in tons of cases. It loses in nearly all of the benchmarks
game, only some of which are numeric:
[http://benchmarksgame.alioth.debian.org/u64q/java.php](http://benchmarksgame.alioth.debian.org/u64q/java.php)
It loses very badly in this benchmark published a few years ago by Google,
which is not numeric:
[http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/37122.pdf](http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/37122.pdf)

Given comparable programs, C++ is nearly always faster. You really haven't
offered any evidence that contradicts this basic fact. The only benchmark you
cited didn't even have any C++ programs in it.

~~~
pron
> A benefit of using C/C++ is that you are not limited to general schemes. If
> you have an allocation/deallocation hotspot, you can use more specialized
> schemes like arena allocation, which will beat a GC handily.

Absolutely. And you can do the same in Java with about the same amount of
effort (lots of Java projects do that).
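
As a sketch of what that can look like (hypothetical bump-pointer arena; the
name and layout are mine):

    import java.nio.ByteBuffer;

    // A bump-pointer arena over an off-heap buffer, the kind of manual
    // scheme many Java projects use for allocation hotspots. An
    // allocation is just a position bump; freeing everything at once
    // is a single reset(); none of the storage is visible to the GC.
    final class Arena {
        private final ByteBuffer buffer;

        Arena(int capacityBytes) {
            buffer = ByteBuffer.allocateDirect(capacityBytes);
        }

        /** Reserves `size` bytes and returns their offset in the arena. */
        int allocate(int size) {
            int offset = buffer.position();
            buffer.position(offset + size);  // throws if the arena is full
            return offset;
        }

        long getLong(int offset)             { return buffer.getLong(offset); }
        void putLong(int offset, long value) { buffer.putLong(offset, value); }

        /** Frees the whole arena at once, like dropping a C++ arena. */
        void reset() { buffer.clear(); }
    }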

> it does have a memory model.

It does, and it probably fixes some mistakes in the JMM, but C++ got it, what,
10 years after Java? That's about how far behind C++ is in terms of
concurrency. I know this because that's my business (concurrent data
structures and high concurrency on many-core hardware in general - you really
can't do C++ well on a many-core system; it doesn't scale unless it's
embarrassingly parallel).

> A Java GC pause on the other hand is often on the order of 200ms.

It could actually be much worse. Fortunately, you could get a commercial GC
that offers exactly 0 pauses (in fact, I suggest my customers do exactly that
if they need 0 pause but fall short of true hard real-time; for hard real-time
there's real-time Java with built-in arena allocation and correctness
guarantees of no pointers from the heap into the arena).

> Java falls behind it tons of cases.

I know I won't be able to convince you, but those benchmarks are completely
irrelevant. I can easily make any single-threaded algorithm perform better in
C/C++. But, as I've said in the beginning, Java comes out ahead when you have
a large codebase with many team members, which requires software engineering
that precludes good optimizations. This realization is a result of many years'
experience with both Java and C++. I don't have much hard evidence (some, but
it's possibly anecdotal and also proprietary), but in our field, as in
medicine, sometimes hard-earned first-hand experience means a lot.

------
hbbio
It's an impressive project that shows the extraordinary work of a single
developer, Mike Pall. LuaJIT is not only an incredibly beautiful JIT; the
current project also demonstrates it can run generated code efficiently, which
is probably the hardest thing to do for a language.

When implementing Opa in OCaml, we ran into tons of problems since the OCaml
runtime was unable to run generated code efficiently. One of its authors,
Pierre Weis, told us at the time: "it's normal, OCaml is not meant to run
generated code at all".

As a side note regarding Luje source code: Fossil must be great and all that
but having the code on a publicly available Git repository would probably
bolster contributions.

~~~
jsnell
Luajit is of course awesome, but let's not get too carried away. It doesn't
sound like this JVM has been used for running anything beyond extremely simple
benchmarks. Of course you're not going to run into difficulties with the
translation of a 15 line method.

I saw a talk Mike gave on Luajit last week, and one thing he credited Luajit's
success to was that it works at a rather high abstraction layer. Stuff
like the loop optimizations and alias analysis works because the optimizer can
essentially understand the code at a fairly high level, and is somewhat
specific to Lua. When dealing with Java translated to JVM bytecode, naively
translated to Lua source, you lose much of that context. (And at the time he
didn't seem too convinced about using luajit as the backend for other
languages.)

But even if you ignore the loss of context, I'm sure that various luajit
limits would get hit with real life Java code. E.g. I'd guess there's a
fundamental 65k limit for the number of constants and bytecode instructions in
a single function.

~~~
justincormack
JVM bytecode is quite nasty from this point of view: it loses a lot of loop
structure. One of Mike's comments on the mailing list was to try Dalvik instead.

~~~
jonasb
[http://www.freelists.org/post/luajit/ANN-luje-01](http://www.freelists.org/post/luajit/ANN-luje-01)

------
doublec
There's an interesting paper about a similar project that was done using Self
[1]. "Design and Implementation of Pep, a Java Just-In-Time Translator written
in Self"

It goes through details with benchmarks. The Pep code is available with the
Self distribution. They were able to get faster than the Java implementation
at the time. Back then Java was interpreted though.

[1]
[http://bluishcoder.co.nz/self/97-pep.pdf](http://bluishcoder.co.nz/self/97-pep.pdf)

~~~
pjmlp
Languages are neither interpreted nor compiled. It is all about implementations.

------
JulianMorrison
Isn't this going to be intrinsically restricted to one numeric type? Lua
numbers are all the same type (double by default).

~~~
justincormack
No, LuaJIT has an ffi interface with support for any type, and you can just
use them as if they are native types.

~~~
derleth
Seems a bit like cheating if you do it that way, though.

~~~
justincormack
It's not really; they are just native types, but defined with C syntax so
they are compatible with C. The code generation knows about them natively and
optimises them.

------
swah
Is LuaJIT bytecode a good target for toy languages?

------
taranka231
Is there any Java/Lua interop?

