LuaJIT Performance

Here are some performance measurements, based on a few benchmarks.

LuaJIT 2.0 is available with much improved performance!
Please check the new » interactive performance comparison.


Interpreting the Results

As is always the case with benchmarks, care must be taken to interpret the results:

First, the standard Lua interpreter is already very fast. It's commonly the fastest of its class (interpreters) in the » Great Computer Language Shootout. Only true machine code compilers get a better overall score.

Any performance improvements due to LuaJIT can only be incremental. You can't expect a speedup of 50x if the fastest compiled language is only 5x faster than interpreted Lua in a particular benchmark. LuaJIT can't do miracles.

Also please note that most of the benchmarks below are not trivial micro-benchmarks, which are often cited with marvelous numbers. Micro-benchmarks do not realistically model the performance gains you can expect in your own programs.

It's easy to make up a few one-liners like:
  local function f(...) end; for i=1,1e7 do f() end
This is more than 30x faster with LuaJIT. But you won't find this in a real-world program.
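To try the one-liner yourself, something like the following sketch could be used. It wraps the same loop in a timer built on the standard os.clock function. The helper name cpu_time is an assumption made for this illustration and is not part of Lua or LuaJIT.

```lua
-- Hypothetical helper (not part of Lua or LuaJIT): returns the CPU time
-- in seconds that a call to fn takes, measured with os.clock.
local function cpu_time(fn)
  local t0 = os.clock()
  fn()
  return os.clock() - t0
end

-- The empty vararg function from the one-liner above.
local function f(...) end

-- The same loop as in the one-liner, with the iteration count as a parameter.
local function call_loop(n)
  for i = 1, n do f() end
end

local elapsed = cpu_time(function() call_loop(1e7) end)
print(string.format("1e7 empty vararg calls: %.3f s CPU", elapsed))
```

Run the same file once with standard Lua and once with LuaJIT and compare the two printed times. The numbers depend entirely on your machine.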


Measurement Methods

All measurements have been taken on a Pentium III 1.139 GHz running Linux 2.6. Both Lua and LuaJIT have been compiled with GCC 3.3.6 with -O3 -fomit-frame-pointer. You'll definitely get different results on different machines or with different C compiler options. ^*

The basis for the comparison is the user CPU time as reported by /usr/bin/time. The runtime of each benchmark is parametrized and has been adjusted to minimize the variation between several runs. The ratio of the time for Lua to the time for LuaJIT gives the speedup. Only this number is shown because it's less dependent on a specific system.

E.g. a speedup of 6.74 means the same benchmark runs almost 7 times faster with luajit -O than with standard Lua (or with -j off). Your mileage may vary.
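As a minimal sketch of the arithmetic, the speedup is the standard Lua time divided by the LuaJIT time. The function name speedup and the two timings below are made up for illustration. They are not measured values from the tables.

```lua
-- Speedup as described above: user CPU time under standard Lua divided
-- by user CPU time under LuaJIT for the same benchmark.
-- The function name is an assumption made for this example.
local function speedup(lua_seconds, luajit_seconds)
  assert(luajit_seconds > 0, "LuaJIT time must be positive")
  return lua_seconds / luajit_seconds
end

-- Made-up timings for illustration only (not measured values).
local lua_seconds, luajit_seconds = 13.48, 2.00

-- prints "speedup: 6.74"
print(string.format("speedup: %.2f", speedup(lua_seconds, luajit_seconds)))
```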

^* Yes, LuaJIT relies on quite a bit of the Lua core infrastructure like table and string handling. All of this is written in C and should be compiled with full optimization turned on, or performance will suffer.


Comparing Lua to LuaJIT

Here is a comparison using the current benchmark collection of the » Great Computer Language Shootout (as of 3/2006):


Benchmark	Speedup
mandelbrot	6.74
recursive	6.64
fannkuch	5.37
chameneos	5.08
nsievebits	5.05
pidigits	4.94
nbody	4.63
spectralnorm	4.59
cheapconcr	4.46
partialsums	3.73
fasta	2.68
cheapconcw	2.52
nsieve	1.95
revcomp	1.92
knucleotide	1.59
binarytrees	1.52
sumfile	1.27
regexdna	1.01

Note that many of these benchmarks have changed over time (both spec and code). Benchmark results shown in previous versions of LuaJIT are not directly comparable. The next section compares different versions with the current set of benchmarks.


Comparing LuaJIT Versions

This shows the improvements between the following versions:

LuaJIT 1.0.x
LuaJIT 1.1.x


Benchmark	Speedup (LuaJIT 1.0.x → 1.1.x)
fannkuch	3.96 → 5.37
chameneos	2.25 → 5.08
nsievebits	2.90 → 5.05
pidigits	3.58 → 4.94
nbody	4.16 → 4.63
cheapconcr	1.46 → 4.46
partialsums	1.71 → 3.73
fasta	2.37 → 2.68
cheapconcw	1.27 → 2.52
revcomp	1.45 → 1.92
knucleotide	1.32 → 1.59

All other benchmarks show only minor performance differences.


Summary

These results should give you an idea about what speedup you can expect depending on the nature of your Lua code:

- LuaJIT is really good at (floating-point) math and loops (mandelbrot, pidigits, spectralnorm, partialsums).
- Function calls (recursive), vararg calls, table lookups (nbody), table iteration and coroutine switching (chameneos, cheapconc) are a lot faster than with plain Lua.
- It's still pretty good for indexed table access (fannkuch, nsieve) and string processing (fasta, revcomp, knucleotide). But there is room for improvement in a future version.
- If your application spends most of the time in C code you won't see much of a difference (regexdna, sumfile). Ok, so write more code in pure Lua. :-)
- The real speedup may be masked by other dominant factors in a benchmark:
  - Common parts of the Lua core: e.g. memory allocation and GC (binarytrees).
  - Language characteristics: e.g. lack of bit operations (nsievebits).
  - System characteristics: e.g. CPU cache size and memory speed (nsieve).
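To illustrate the point about missing bit operations: Lua 5.1 has no bitwise operators, so a benchmark like nsievebits has to emulate them with arithmetic. The following is only a sketch of one common way to do that. The helper names testbit and setbit are assumptions made for this example and are not taken from the benchmark source.

```lua
-- Sketch: test and set bit n (0-based) of a non-negative integer using
-- only arithmetic, since Lua 5.1 has no bitwise operators.
-- Both helper names are hypothetical.
local function testbit(x, n)
  -- Shift right by n via division, then look at the lowest bit.
  return math.floor(x / 2^n) % 2 == 1
end

local function setbit(x, n)
  if testbit(x, n) then return x end
  return x + 2^n
end

local flags = 0
flags = setbit(flags, 3)   -- flags now has the value 8
flags = setbit(flags, 0)   -- flags now has the value 9
print(flags, testbit(flags, 3), testbit(flags, 1))
```

Every bit test costs a division, a math.floor call and a modulo, so this part of the work is the same for any implementation of the language and limits how much of the runtime a compiler can remove.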

The best idea is of course to benchmark your own applications. Please report any interesting results you may find. Thank you!
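A minimal harness for timing your own code might look like the sketch below. It makes two assumptions that are not part of the method described on this page: the workload function is a hypothetical floating-point loop standing in for your application, and the harness keeps the best of several runs as a simple way to reduce variation.

```lua
-- Hypothetical workload: a floating-point loop standing in for your own
-- code. It sums 1/k^2, which converges towards pi^2/6.
local function workload(n)
  local sum = 0
  for k = 1, n do
    sum = sum + 1 / (k * k)
  end
  return sum
end

-- Run fn several times and keep the smallest CPU time (via os.clock).
-- Taking the minimum is an assumption of this sketch. It is less
-- sensitive to interference from other processes than a single run.
local function best_cpu_time(fn, runs)
  local best = math.huge
  for _ = 1, runs do
    local t0 = os.clock()
    fn()
    local dt = os.clock() - t0
    if dt < best then best = dt end
  end
  return best
end

local best = best_cpu_time(function() workload(1e6) end, 5)
print(string.format("best of 5 runs: %.4f s CPU", best))
```

Run the same file with standard Lua and with LuaJIT, then divide the first time by the second to get a speedup comparable to the tables above.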
