Here are some performance measurements, based on a few benchmarks.

LuaJIT 2.0 is available with much improved performance!
Please check the new » interactive performance comparison.

Interpreting the Results
As is always the case with benchmarks, care must be taken to interpret the results:
First, the standard Lua interpreter is already very fast. It's commonly the fastest of its class (interpreters) in the » Great Computer Language Shootout. Only true machine code compilers get a better overall score.
Any performance improvement due to LuaJIT can therefore only be incremental. You can't expect a speedup of 50x if the fastest compiled language is only 5x faster than interpreted Lua in a particular benchmark. LuaJIT can't do miracles.
Also, please note that most of the benchmarks below are not trivial micro-benchmarks, which are often cited with marvelous numbers. Micro-benchmarks do not realistically model the performance gains you can expect in your own programs.

It's easy to make up a few one-liners like:
  local function f(...) end; for i=1,1e7 do f() end
This is more than 30x faster with LuaJIT. But you won't find this in a real-world program.
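
For reference, a one-liner like this can be timed with os.clock() from the standard Lua library. This is only a minimal sketch: the helper name time_calls and the printed message are illustrative and not part of LuaJIT or of the benchmark collection.

```lua
-- Empty vararg function, as in the one-liner above.
local function f(...) end

-- Hypothetical helper: returns the CPU seconds used by n calls to f.
local function time_calls(n)
  local start = os.clock()
  for i = 1, n do f() end
  return os.clock() - start
end

print(string.format("1e7 empty calls: %.2f s CPU", time_calls(1e7)))
```

Running the same file with standard Lua and with LuaJIT gives two times to compare. The resulting ratio still says little about real-world programs, for the reasons given above.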
Measurement Methods
All measurements were taken on a Pentium III 1.139 GHz running Linux 2.6. Both Lua and LuaJIT were compiled with GCC 3.3.6 using -O3 -fomit-frame-pointer. You'll definitely get different results on different machines or with different C compiler options. *
The basis for the comparison is the user CPU time as reported by /usr/bin/time. The runtime of each benchmark is parametrized and has been adjusted to minimize the variation between several runs. The speedup is the ratio of the time for Lua to the time for LuaJIT. Only this number is shown because it's less dependent on a specific system.
E.g. a speedup of 6.74 means the same benchmark runs almost 7 times faster with luajit -O than with standard Lua (or with -j off). Your mileage may vary.
* Yes, LuaJIT relies on quite a bit of the Lua core infrastructure, such as table and string handling. All of this is written in C and should be compiled with full optimization turned on, or performance will suffer.
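
The speedup figure described in this section can be computed from two timings as sketched here. The function name and the two CPU times are made-up values chosen for illustration; they are not measurements from this page.

```lua
-- Speedup as used in this document: user CPU time of standard Lua
-- divided by user CPU time of LuaJIT for the same benchmark.
local function speedup(lua_seconds, luajit_seconds)
  assert(luajit_seconds > 0, "LuaJIT time must be positive")
  return lua_seconds / luajit_seconds
end

-- Hypothetical timings, e.g. read off the "user" field of /usr/bin/time.
local lua_user, luajit_user = 13.48, 2.00
print(string.format("speedup: %.2f", speedup(lua_user, luajit_user)))  -- speedup: 6.74
```

A value above 1 means LuaJIT was faster; a value close to 1 means the benchmark gained almost nothing.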
Comparing Lua to LuaJIT
Here is a comparison using the current benchmark collection of the » Great Computer Language Shootout (as of 3/2006):
Benchmark    | Speedup
mandelbrot   | 6.74
recursive    | 6.64
fannkuch     | 5.37
chameneos    | 5.08
nsievebits   | 5.05
pidigits     | 4.94
nbody        | 4.63
spectralnorm | 4.59
cheapconcr   | 4.46
partialsums  | 3.73
fasta        | 2.68
cheapconcw   | 2.52
nsieve       | 1.95
revcomp      | 1.92
knucleotide  | 1.59
binarytrees  | 1.52
sumfile      | 1.27
regexdna     | 1.01
Note that many of these benchmarks have changed over time (both spec and code). Benchmark results shown in previous versions of LuaJIT are not directly comparable. The next section compares different versions with the current set of benchmarks.

Comparing LuaJIT Versions
This shows the improvements between the following versions:
- LuaJIT 1.0.x
- LuaJIT 1.1.x
Benchmark   | Speedup
fannkuch    | 3.96 → 5.37
chameneos   | 2.25 → 5.08
nsievebits  | 2.90 → 5.05
pidigits    | 3.58 → 4.94
nbody       | 4.16 → 4.63
cheapconcr  | 1.46 → 4.46
partialsums | 1.71 → 3.73
fasta       | 2.37 → 2.68
cheapconcw  | 1.27 → 2.52
revcomp     | 1.45 → 1.92
knucleotide | 1.32 → 1.59
All other benchmarks show only minor performance differences.

Summary
These results should give you an idea of the speedup you can expect, depending on the nature of your Lua code:
- LuaJIT is really good at (floating-point) math and loops (mandelbrot, pidigits, spectralnorm, partialsums).
- Function calls (recursive), vararg calls, table lookups (nbody), table iteration and coroutine switching (chameneos, cheapconc) are a lot faster than with plain Lua.
- It's still pretty good for indexed table access (fannkuch, nsieve) and string processing (fasta, revcomp, knucleotide). But there is room for improvement in a future version.
- If your application spends most of the time in C code you won't see much of a difference (regexdna, sumfile). Ok, so write more code in pure Lua. :-)
The real speedup may be masked by other dominant factors in a benchmark:
- Common parts of the Lua core: e.g. memory allocation and GC (binarytrees).
- Language characteristics: e.g. lack of bit operations (nsievebits).
- System characteristics: e.g. CPU cache size and memory speed (nsieve).
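
One way to reason about such limits is Amdahl's law. It is not part of the measurements on this page and is used here only as an illustration. Assume a fraction p of the run time is spent in Lua code that LuaJIT speeds up by a factor s, while the rest (C code, memory allocation, GC) is unchanged. The overall speedup is then 1 / ((1 - p) + p / s). The function name and the sample values below are hypothetical.

```lua
-- p: fraction of the original run time spent in code that gets faster (0..1)
-- s: speedup factor of that fraction
local function overall_speedup(p, s)
  return 1 / ((1 - p) + p / s)
end

-- Mostly C code: only 10% of the time is sped up 5x.
print(string.format("%.2f", overall_speedup(0.10, 5)))  -- 1.09
-- Mostly pure Lua: 90% of the time is sped up 5x.
print(string.format("%.2f", overall_speedup(0.90, 5)))  -- 3.57
```

Under these assumptions a program dominated by C code or by the garbage collector barely benefits, however fast the compiled Lua code is.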
The best idea is of course to benchmark your own applications. Please report any interesting results you may find. Thank you!