LuaJIT gets allocation sinking; Complex numbers on par with C; Better than Java
21 points by daurnimator on July 3, 2012
From the LuaJIT mailing list:
http://www.freelists.org/post/luajit/Allocation-sinking-in-git-HEAD
http://www.freelists.org/post/luajit/Allocation-sinking-in-git-HEAD,3
http://www.freelists.org/post/luajit/Allocation-sinking-in-git-HEAD,5
-------------------------------

LuaJIT git HEAD now contains the new allocation sinking and store sinking optimization.

This optimization is enabled by default. In case you encounter any problems and want to check whether they are caused by this optimization, you can turn it off with: -O-sink
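For an A/B check from inside a script rather than from the command line, something like the following should work. It is only a sketch: it assumes `jit.opt.start` accepts the same flag names as `-O` (which is how the `jit.opt` module is documented) and that the running LuaJIT build is new enough to know the `sink` flag. `sum_pairs` is a made-up workload, not something from the announcement.

```lua
-- Sketch: programmatic counterpart of running `luajit -O-sink script.lua`.
-- `jit` is nil under plain Lua, hence the guard.
if jit then
  jit.opt.start("-sink")  -- assumption: build recognizes the "sink" flag
end

-- Hypothetical workload: one short-lived table per iteration.
local function sum_pairs(n)
  local total = 0
  for i = 1, n do
    local p = {x = i, y = i * 2}
    total = total + p.x + p.y
  end
  return total
end

print(sum_pairs(100))
```

Run it once with the `jit.opt.start` line and once without; if a problem only shows up without it, the sinking pass is a likely suspect.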

The optimization is geared towards the elimination of short-lived aggregates. It handles plain Lua tables as well as FFI cdata (e.g. structs, complex or short arrays). It also handles elimination of immutable FFI types that are implicitly boxed (e.g. 64 bit ints or pointers) in more contexts (e.g. loop-carried variables).
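To make "short-lived aggregate" concrete, here is a rough illustration (mine, not from the mailing list). The first function builds a temporary table on every iteration that never escapes the loop; the second is the hand-written scalar form. The hope is that, with sinking enabled, the compiled loop for the first behaves much like the second. The function names are made up and this says nothing about what the compiler actually emits.

```lua
-- Version 1: allocates a temporary table per iteration.
-- The table is only read inside the loop and never stored anywhere.
local function with_temp(n)
  local sx, sy = 0, 0
  for i = 1, n do
    local p = {x = i, y = 2 * i}  -- short-lived aggregate
    sx, sy = sx + p.x, sy + p.y
  end
  return sx, sy
end

-- Version 2: the same computation with the fields kept in locals.
local function scalars(n)
  local sx, sy = 0, 0
  for i = 1, n do
    local px, py = i, 2 * i
    sx, sy = sx + px, sy + py
  end
  return sx, sy
end

print(with_temp(1000))
print(scalars(1000))
```

Both functions return the same pair of sums; only the allocation behaviour differs.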

This optimization adds quite a bit of complexity, so I'd appreciate it if it would receive wider testing. Feedback welcome!

Here are a few examples that show the improved performance. The timings in seconds are for Lua 5.1.5 vs. LuaJIT git HEAD on x86 (32 bit). Lower numbers are better:

Typical point class with Lua tables:

  local point
  point = {
    new = function(self, x, y)
      return setmetatable({x=x, y=y}, self)
    end,
    __add = function(a, b)
      return point:new(a.x + b.x, a.y + b.y)
    end,
  }
  point.__index = point
  local a, b = point:new(1.5, 2.5), point:new(3.25, 4.75)
  for i=1,1e8 do a = (a + b) + b end
  print(a.x, a.y)

  140.0  Lua
   26.9  LuaJIT -O-sink
    0.20 LuaJIT -O+sink ***700x faster than Lua***

Typical point class with cdata struct:

  local ffi = require("ffi")
  local point
  point = ffi.metatype("struct { double x, y; }", {
    __add = function(a, b)
      return point(a.x + b.x, a.y + b.y)
    end,
  })
  local a, b = point(1.5, 2.5), point(3.25, 4.75)
  for i=1,1e8 do a = (a + b) + b end
  print(a.x, a.y)

   10.9  LuaJIT -O-sink
    0.20 LuaJIT -O+sink *700x faster than the 140.0 s Lua 5.1.5 run of the table version*

64 bit arithmetic in a loop:

  local x = 0LL
  for i=1,1e9 do x = x + 100 end

   45.8  LuaJIT -O-sink (x86)
   40.9  LuaJIT -O-sink (x64)
    0.84 LuaJIT -O+sink (x86)
    0.48 LuaJIT -O+sink (x64)

I'm planning to add a standard "ffi.complex" module, which implements all of the common operators and functions. But I'll probably have to add a couple of things to the JIT compiler: JIT-compiling calls to C functions with complex args/returns and marking C functions as 'pure'.
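Until such a module exists, a couple of operators can be attached by hand, following the same pattern as the point struct. This is only a sketch under assumptions: it is not the planned "ffi.complex" module, it assumes the FFI's built-in `complex` type (double complex) with its read-only `.re` and `.im` fields, and it only handles complex-with-complex operands (mixing in plain Lua numbers would need extra checks).

```lua
local ffi = require("ffi")

-- Sketch: attach + and * to the built-in complex type.
-- Note that a metatype can only be set once per type in a process.
local complex
complex = ffi.metatype("complex", {
  __add = function(a, b)
    return complex(a.re + b.re, a.im + b.im)
  end,
  __mul = function(a, b)
    return complex(a.re * b.re - a.im * b.im,
                   a.re * b.im + a.im * b.re)
  end,
})

-- (1+2i) * (3+4i) + (0.5+0.5i)
local z = complex(1, 2) * complex(3, 4) + complex(0.5, 0.5)
print(z.re, z.im)
```

Each operator returns a fresh complex value, which is exactly the kind of short-lived temporary the sinking pass is meant to remove inside hot loops.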

One thing to remember is to always use a "complex[?]" cdata array or cdata structs if you want to store lots of complex numbers. Avoid storing them in plain Lua tables, because then all of them must be individually boxed. No problem storing an occasional complex value in a table with other data, of course.

Actually, that's one example where the LuaJIT FFI can do much better than Java: in Java, a 1000 element array of complex numbers has 1000 pointers to 1000 individually boxed complex objects. You can imagine this isn't exactly efficient ...

OTOH a "complex[1000]" array needs only 2 * 8 * 1000 bytes with the LuaJIT FFI, the same as in C. And it's not scanned by the GC.
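A small sketch of the array advice, with the size check included. Assumptions: the built-in `complex` type is a 16-byte double complex, elements are replaced as a whole (the parts of a complex value are read-only), and the variable names are just for illustration.

```lua
local ffi = require("ffi")

local complex = ffi.typeof("complex")
local n = 1000

-- One contiguous block holding n complex values, no per-element boxes.
local arr = ffi.new("complex[?]", n)

-- cdata arrays are zero-based. Assign whole elements.
for i = 0, n - 1 do
  arr[i] = complex(i, -i)
end

-- Sum the real and imaginary parts.
local sre, sim = 0, 0
for i = 0, n - 1 do
  local z = arr[i]
  sre, sim = sre + z.re, sim + z.im
end

print(ffi.sizeof(arr))  -- expected 2 * 8 * 1000 = 16000 bytes
print(sre, sim)
```

The equivalent plain Lua table of 1000 complex cdata objects would hold 1000 separately allocated boxes, which is the layout the Java comparison is about.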
