Here are some performance measurements, based on a few benchmarks.
LuaJIT 2.0 is available with much improved performance! Please check the new » interactive performance comparison.
Interpreting the Results
As is always the case with benchmarks, care must be taken to interpret the results:
First, the standard Lua interpreter is already very fast. It's commonly the fastest of its class (interpreters) in the » Great Computer Language Shootout. Only true machine code compilers get a better overall score.
Any performance improvements due to LuaJIT can only be incremental. You can't expect a speedup of 50x if the fastest compiled language is only 5x faster than interpreted Lua in a particular benchmark. LuaJIT can't do miracles.
Also please note that most of the benchmarks below are not trivial micro-benchmarks, which are often cited with marvelous numbers. Micro-benchmarks do not realistically model the performance gains you can expect in your own programs.
It's easy to make up a few one-liners like:
local function f(...) end; for i=1,1e7 do f() end
This is more than 30x faster with LuaJIT. But you won't find this in a real-world program.
Measurement Methods
All measurements have been taken on a Pentium III 1.139 GHz running Linux 2.6. Both Lua and LuaJIT have been compiled with GCC 3.3.6 with -O3 -fomit-frame-pointer. You'll definitely get different results on different machines or with different C compiler options. *
The basis for the comparison is the user CPU time as reported by /usr/bin/time. The runtime of each benchmark is parametrized and has been adjusted to minimize the variation between several runs. The time for Lua divided by the time for LuaJIT gives the speedup. Only this number is shown because it's less dependent on a specific system.
E.g. a speedup of 6.74 means the same benchmark runs almost 7 times faster with luajit -O than with standard Lua (or with -j off). Your mileage may vary.
* Yes, LuaJIT relies on quite a bit of the Lua core infrastructure like table and string handling. All of this is written in C and should be compiled with full optimization turned on, or performance will suffer.
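The speedup calculation described above can be sketched in a few lines of Lua. The function name and the two timings are made up for illustration (they are not measurements from this page); the times stand in for the user CPU seconds that /usr/bin/time would report for each run.

```lua
-- Hypothetical helper: speedup = time with standard Lua / time with LuaJIT.
local function speedup(lua_seconds, luajit_seconds)
  assert(luajit_seconds > 0, "LuaJIT time must be positive")
  return lua_seconds / luajit_seconds
end

-- Made-up user CPU times in seconds, chosen only to illustrate the ratio.
local lua_user, luajit_user = 13.48, 2.00

print(string.format("speedup: %.2f", speedup(lua_user, luajit_user)))
-- prints "speedup: 6.74"
```

A value above 1 means LuaJIT was faster; a value close to 1 (such as regexdna in the table below) means the benchmark spends most of its time outside code that LuaJIT can speed up.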
Comparing Lua to LuaJIT
Here is a comparison using the current benchmark collection of the » Great Computer Language Shootout (as of 3/2006):
Benchmark | Speedup
mandelbrot | 6.74
recursive | 6.64
fannkuch | 5.37
chameneos | 5.08
nsievebits | 5.05
pidigits | 4.94
nbody | 4.63
spectralnorm | 4.59
cheapconcr | 4.46
partialsums | 3.73
fasta | 2.68
cheapconcw | 2.52
nsieve | 1.95
revcomp | 1.92
knucleotide | 1.59
binarytrees | 1.52
sumfile | 1.27
regexdna | 1.01
Note that many of these benchmarks have changed over time (both spec and code). Benchmark results shown in previous versions of LuaJIT are not directly comparable. The next section compares different versions with the current set of benchmarks.
Comparing LuaJIT Versions
This shows the improvements between the following versions:
- LuaJIT 1.0.x
- LuaJIT 1.1.x
Benchmark | Speedup (1.0.x → 1.1.x)
fannkuch | 3.96 → 5.37
chameneos | 2.25 → 5.08
nsievebits | 2.90 → 5.05
pidigits | 3.58 → 4.94
nbody | 4.16 → 4.63
cheapconcr | 1.46 → 4.46
partialsums | 1.71 → 3.73
fasta | 2.37 → 2.68
cheapconcw | 1.27 → 2.52
revcomp | 1.45 → 1.92
knucleotide | 1.32 → 1.59
All other benchmarks show only minor performance differences.
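To read the version table as "how much faster is 1.1.x than 1.0.x", the two speedups can be divided. This is a small sketch, assuming both speedups were measured against the same standard Lua baseline, so that the baseline time cancels out. The helper name is hypothetical; the numbers are three rows copied from the table.

```lua
-- Speedups over standard Lua, copied from the table: { 1.0.x, 1.1.x }
local speedups = {
  chameneos  = { 2.25, 5.08 },
  cheapconcr = { 1.46, 4.46 },
  fasta      = { 2.37, 2.68 },
}

-- Hypothetical helper. Assumes both speedups share the same Lua baseline:
-- (t_lua / t_new) / (t_lua / t_old) = t_old / t_new
local function relative_gain(old_speedup, new_speedup)
  return new_speedup / old_speedup
end

-- Iteration order of pairs() is unspecified, so rows may print in any order.
for name, s in pairs(speedups) do
  print(string.format("%-10s %.2fx", name, relative_gain(s[1], s[2])))
end
```

A ratio well above 1 (the coroutine-heavy benchmarks) marks a large gain in 1.1.x, while a ratio near 1 (fasta) marks a benchmark that changed little between the two versions.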
Summary
These results should give you an idea about what speedup you can expect depending on the nature of your Lua code:
- LuaJIT is really good at (floating-point) math and loops (mandelbrot, pidigits, spectralnorm, partialsums).
- Function calls (recursive), vararg calls, table lookups (nbody), table iteration and coroutine switching (chameneos, cheapconc) are a lot faster than with plain Lua.
- It's still pretty good for indexed table access (fannkuch, nsieve) and string processing (fasta, revcomp, knucleotide). But there is room for improvement in a future version.
- If your application spends most of the time in C code you won't see much of a difference (regexdna, sumfile). Ok, so write more code in pure Lua. :-)
- The real speedup may be shadowed by other dominant factors in a benchmark:
  - Common parts of the Lua core: e.g. memory allocation and GC (binarytrees).
  - Language characteristics: e.g. lack of bit operations (nsievebits).
  - System characteristics: e.g. CPU cache size and memory speed (nsieve).
The best idea is of course to benchmark your own applications. Please report any interesting results you may find. Thank you!
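As a starting point for benchmarking your own code, here is a minimal timing sketch. It uses os.clock (CPU time used by the program) instead of /usr/bin/time, and keeps the best of several runs to reduce noise. Both are choices made for this example, not the method used for the tables above. The names bench and workload are placeholders; replace workload with a representative part of your application.

```lua
-- Hypothetical helper: run fn several times, return the best CPU time in seconds.
local function bench(fn, runs)
  local best = math.huge
  for _ = 1, runs do
    local start = os.clock()
    fn()
    local elapsed = os.clock() - start
    if elapsed < best then best = elapsed end
  end
  return best
end

-- Placeholder workload: a simple arithmetic loop.
local function workload()
  local sum = 0
  for i = 1, 1e6 do
    sum = sum + i % 7
  end
  return sum
end

print(string.format("best of 5 runs: %.3f s", bench(workload, 5)))
```

Run the same script once with standard Lua and once with LuaJIT, then divide the Lua time by the LuaJIT time to get a speedup comparable to the numbers on this page.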