
Nice work, man. I was doing microbenchmarks where, among other things, I measure the 90ns (JIT) and 120ns (regular) function call overhead of Lua. My benchmarks were compared against my own scripting backend, where the call overhead is 4ns, and it always feels a little bit like cheating when you have a gigantic head start in each test.

I'm not surprised the JIT isn't that good, but this is almost crazy! Did you compare against the latest version of Lua?


In my microbenchmarks LuaJIT was always faster than Lua, but not by so much that I would stop using Lua. In my mind, once you stop developing a language it stops being interesting, because you are always looking for ways to simplify and remove whole classes of bugs.

EDIT: I just built lua5.4 from sources and ran my benchmarks again, and it's markedly worse than LuaJIT still, but it's better than before for sure. My conclusion is that Lua5.4 is the fastest PUC-Lua yet.


Some of these benchmarks are maybe not how one would do things if they were trying to go fast. An unfortunate idiom I wrote a thousand times at a company that uses Lua and tries to go fast is

  local n = 0
  local t = {}
  -- to do an insert
  n = n + 1
  t[n] = blorp
  -- to delegate some inserts to another function
  n = insert_some_stuff(t,n)
A less messy thing you could do is

  local table_insert = table.insert
  -- to do an insert
  table_insert(t, blorp)
but this is not as great, because Lua will still do a binary search to find the end of the array each time you insert something.
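
For reference, a minimal runnable sketch of the counter idiom above. The body of insert_some_stuff and its third argument (count) are made up for illustration; the only point is that the callee returns the new element count so the caller never has to ask Lua for the length.

```lua
-- Hypothetical helper: appends `count` values after index n and returns
-- the new element count so the caller can keep tracking it.
local function insert_some_stuff(t, n, count)
  for i = 1, count do
    n = n + 1
    t[n] = i * 10
  end
  return n
end

local n = 0
local t = {}

-- to do an insert: bump the counter, then store at that index
n = n + 1
t[n] = "blorp"

-- to delegate some inserts to another function
n = insert_some_stuff(t, n, 3)

print(n, #t)  -- both are 4
```

The cost is the bookkeeping: every function that appends has to take n and hand it back, which is where the messiness comes from.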

Lua's weak point is around doing things with arrays, IME. It's just not very fast when you actually need something simple.

I actually started poking around at a language design over the past month that tries to leverage Lua as the compiler for something that can run in its own separate bytecode interpreter, with more of an emphasis on low-level primitives. It started off as a simple adaptation of Forth ideas, but yesterday I started playing with a revision that shifts the data structure from a plain old stack towards growable arrays containing three cursors (thus, "tricursor array"), which can be purposed to describe bounds, destinations, insertion points, read and write positions, top of stack, etc. In Forth the top three values of the stack tend to get dedicated words for their manipulation because they are used quite often. This model runs with that idea, plus methods I've often used for array data (text edits, sound sample loops, blitting sprites), and tries to extrapolate that into a language that can ease certain forms of array programming, while falling back on using the array as a data stack. Still just sketching it now.

That might be, but the C++ array append is bounds-checked, which makes it half as fast as it could be, too. It would be 16ns if there were no checking. So you are right, but I am trying to take a balanced approach here. These benchmarks are made for me, and I want to write (more or less) normal code.

I don't want to write local table_insert = ... though. I'd rather drop Lua for something else then. That said, it's cool that there are things you can do to speed things up if you really have to.

Fair enough! I think t[#t+1] = x is not as offensive and will get you a few nanos less than the way with two table lookups, but if that's also not to your taste then that's fine.
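
If anyone wants to check this on their own build, here is a rough timing sketch of the three append styles mentioned in this thread. N and the report helper are my own assumptions, os.clock measures CPU time with coarse resolution, and a loop this simple is exactly what a JIT optimizes aggressively, so treat the numbers as relative hints at best.

```lua
local N = 1000000  -- arbitrary iteration count, chosen for illustration

-- Hypothetical helper: prints the average cost per insert since `start`.
local function report(label, start)
  local elapsed = os.clock() - start
  print(string.format("%-16s %.1f ns/insert", label, elapsed / N * 1e9))
end

-- 1. manual counter: no length lookup at all
local t1, n = {}, 0
local start = os.clock()
for i = 1, N do
  n = n + 1
  t1[n] = i
end
report("manual counter", start)

-- 2. length operator: asks Lua for the array border on every insert
local t2 = {}
start = os.clock()
for i = 1, N do
  t2[#t2 + 1] = i
end
report("t[#t+1] = x", start)

-- 3. cached table.insert: skips the global lookups, still a function call
local table_insert = table.insert
local t3 = {}
start = os.clock()
for i = 1, N do
  table_insert(t3, i)
end
report("table_insert", start)
```

All three loops build the same array, so any difference in the output comes from how the insertion index is found and whether a function call is involved.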
