<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>LuaJIT Performance</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="Author" content="Mike Pall">
<meta name="Copyright" content="Copyright (C) 2005-2011, Mike Pall">
<meta name="Language" content="en">
<link rel="stylesheet" type="text/css" href="bluequad.css" media="screen">
<link rel="stylesheet" type="text/css" href="bluequad-print.css" media="print">
<style type="text/css">
table.bench {
  line-height: 1.2;
}
tr.benchhead td {
  font-weight: bold;
}
td img, li img {
  vertical-align: middle;
}
td.barhead, td.bar {
  font-size: 8pt;
  font-family: Courier New, Courier, monospace;
  width: 360px;
  padding: 0;
}
td.bar {
  background: url('img/backbar.png');
}
td.speedup {
  text-align: center;
}
</style>
</head>
<body>
<div id="site">
<a href="http://luajit.org/"><span>Lua<span id="logo">JIT</span></span></a>
</div>
<div id="head">
<h1>LuaJIT Performance</h1>
</div>
<div id="nav">
<ul><li>
<a href="index.html">Index</a>
</li><li>
<a href="luajit.html">LuaJIT</a>
<ul><li>
<a href="luajit_features.html">Features</a>
</li><li>
<a href="luajit_install.html">Installation</a>
</li><li>
<a href="luajit_run.html">Running</a>
</li><li>
<a href="luajit_api.html">API Extensions</a>
</li><li>
<a href="luajit_intro.html">Introduction</a>
</li><li>
<a class="current" href="luajit_performance.html">Performance</a>
</li><li>
<a href="luajit_debug.html">Debugging</a>
</li><li>
<a href="luajit_changes.html">Changes</a>
</li></ul>
</li><li>
<a href="coco.html">Coco</a>
<ul><li>
<a href="coco_portability.html">Portability</a>
</li><li>
<a href="coco_api.html">API Extensions</a>
</li><li>
<a href="coco_changes.html">Changes</a>
</li></ul>
</li><li>
<a href="dynasm.html">DynASM</a>
<ul><li>
<a href="dynasm_features.html">Features</a>
</li><li>
<a href="dynasm_examples.html">Examples</a>
</li></ul>
</li><li>
<a href="http://luajit.org/download.html">Download <span class="ext">&raquo;</span></a>
</li></ul>
</div>
<div id="main">
<p>
Here are some performance measurements, based on a few benchmarks.
</p>
<p style="background: #ffd0d0; text-align: center;">
LuaJIT 2.0 is available with much improved performance!<br>
Please check the new
<a href="http://luajit.org/performance.html"><span class="ext">&raquo;</span> interactive performance comparison</a>.
</p>

<h2 id="interpretation">Interpreting the Results</h2>
<p>
As is always the case with benchmarks, care must be taken to
interpret the results:
</p>
<p>
First, the standard Lua interpreter is already <em>very</em> fast.
It's commonly the fastest of its class (interpreters) in the
<a href="http://shootout.alioth.debian.org/"><span class="ext">&raquo;</span>&nbsp;Great Computer Language Shootout</a>.
Only true machine code compilers get a better overall score.
</p>
<p>
Any performance improvements due to LuaJIT can only be incremental.
You can't expect a speedup of 50x if the fastest compiled language
is only 5x faster than interpreted Lua in a particular benchmark.
LuaJIT can't do miracles.
</p>
<p>
Also please note that most of the benchmarks below are <em>not</em>
trivial micro-benchmarks, which are often cited with marvelous numbers.
Micro-benchmarks do not realistically model the performance gains you
can expect in your own programs.
</p>
<p>
It's easy to make up a few one-liners like:<br>
<tt>&nbsp;&nbsp;local function f(...) end; for i=1,1e7 do f() end</tt><br>
This is more than 30x faster with LuaJIT. But you won't find
this in a real-world program.
</p>
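<p>
To reproduce such a micro-benchmark yourself, a minimal timing sketch could
look like the following. It assumes that <tt>os.clock()</tt> is an acceptable
approximation of the CPU time used by the process. The helper name
<tt>time_calls</tt> is made up for this illustration.
</p>

```lua
-- The empty vararg function from the one-liner above.
local function f(...) end

-- Hypothetical helper: CPU seconds spent on n calls to f,
-- measured with os.clock() from the standard library.
local function time_calls(n)
  local t0 = os.clock()
  for i = 1, n do f() end
  return os.clock() - t0
end

local elapsed = time_calls(1e7)
print(string.format("1e7 empty vararg calls: %.2f s CPU", elapsed))
```

<p>
Run the same file once with standard Lua and once with LuaJIT and compare
the two printed times. The exact numbers depend on your machine.
</p>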

<h2 id="methods">Measurement Methods</h2>
<p>
All measurements have been taken on a Pentium&nbsp;III 1.139&nbsp;GHz
running Linux&nbsp;2.6. Both Lua and LuaJIT have been compiled with
GCC&nbsp;3.3.6 with <tt>-O3 -fomit-frame-pointer</tt>.
You'll definitely get different results on different machines or
with different C&nbsp;compiler options. <sup>*</sup>
</p>
<p>
The basis for the comparison is the user CPU time as reported by
<tt>/usr/bin/time</tt>. The runtime of each benchmark is parameterized
and has been adjusted to minimize the variation between several runs.
The speedup is the time for standard Lua divided by the time for LuaJIT.
Only this number is shown because it's less dependent on a specific system.
</p>
<p>
E.g. a speedup of 6.74 means the same benchmark runs almost 7 times
faster with <tt>luajit&nbsp;-O</tt> than with standard Lua (or with
<tt>-j off</tt>). Your mileage may vary.
</p>
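<p>
The speedup calculation can be sketched as follows. The function name
<tt>speedup</tt> and the two timings are hypothetical and were chosen only
so that the ratio comes out as 6.74. They are not measured values.
</p>

```lua
-- Speedup as described above: user CPU time under standard Lua divided
-- by user CPU time under LuaJIT, for the same benchmark and parameters.
local function speedup(lua_seconds, luajit_seconds)
  assert(luajit_seconds > 0, "LuaJIT time must be positive")
  return lua_seconds / luajit_seconds
end

-- Hypothetical user CPU times in seconds (illustration only).
print(string.format("%.2f", speedup(13.48, 2.00)))  -- prints 6.74
```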
<p style="font-size: 80%;">
<sup>*</sup> Yes, LuaJIT relies on quite a bit of the Lua core infrastructure
like table and string handling. All of this is written in C and
should be compiled with full optimization turned on, or performance
will suffer.
</p>

<h2 id="lua_luajit" class="pagebreak">Comparing Lua to LuaJIT</h2>
<p>
Here is a comparison using the current benchmark collection of the
<a href="http://shootout.alioth.debian.org/"><span class="ext">&raquo;</span>&nbsp;Great Computer Language Shootout</a> (as of 3/2006):
</p>

<div class="tablewrap">
<table class="bench">
<tr class="benchhead">
<td>Benchmark</td>
<td class="speedup">Speedup</td>
<td class="barhead">
<img src="img/spacer.png" width="360" height="12" alt="-----1x----2x----3x----4x----5x----6x----7x----8x">
</td>
</tr>
<tr class="odd">
<td>mandelbrot</td>
<td class="speedup">6.74</td>
<td class="bar"><img src="img/bluebar.png" width="303" height="12" alt="========================================"></td>
</tr>
<tr class="even">
<td>recursive</td>
<td class="speedup">6.64</td>
<td class="bar"><img src="img/bluebar.png" width="299" height="12" alt="========================================"></td>
</tr>
<tr class="odd">
<td>fannkuch</td>
<td class="speedup">5.37</td>
<td class="bar"><img src="img/bluebar.png" width="242" height="12" alt="================================"></td>
</tr>
<tr class="even">
<td>chameneos</td>
<td class="speedup">5.08</td>
<td class="bar"><img src="img/bluebar.png" width="229" height="12" alt="=============================="></td>
</tr>
<tr class="odd">
<td>nsievebits</td>
<td class="speedup">5.05</td>
<td class="bar"><img src="img/bluebar.png" width="227" height="12" alt="=============================="></td>
</tr>
<tr class="even">
<td>pidigits</td>
<td class="speedup">4.94</td>
<td class="bar"><img src="img/bluebar.png" width="222" height="12" alt="=============================="></td>
</tr>
<tr class="odd">
<td>nbody</td>
<td class="speedup">4.63</td>
<td class="bar"><img src="img/bluebar.png" width="208" height="12" alt="============================"></td>
</tr>
<tr class="even">
<td>spectralnorm</td>
<td class="speedup">4.59</td>
<td class="bar"><img src="img/bluebar.png" width="207" height="12" alt="============================"></td>
</tr>
<tr class="odd">
<td>cheapconcr</td>
<td class="speedup">4.46</td>
<td class="bar"><img src="img/bluebar.png" width="201" height="12" alt="==========================="></td>
</tr>
<tr class="even">
<td>partialsums</td>
<td class="speedup">3.73</td>
<td class="bar"><img src="img/bluebar.png" width="168" height="12" alt="======================"></td>
</tr>
<tr class="odd">
<td>fasta</td>
<td class="speedup">2.68</td>
<td class="bar"><img src="img/bluebar.png" width="121" height="12" alt="================"></td>
</tr>
<tr class="even">
<td>cheapconcw</td>
<td class="speedup">2.52</td>
<td class="bar"><img src="img/bluebar.png" width="113" height="12" alt="==============="></td>
</tr>
<tr class="odd">
<td>nsieve</td>
<td class="speedup">1.95</td>
<td class="bar"><img src="img/bluebar.png" width="88" height="12" alt="============"></td>
</tr>
<tr class="even">
<td>revcomp</td>
<td class="speedup">1.92</td>
<td class="bar"><img src="img/bluebar.png" width="86" height="12" alt="============"></td>
</tr>
<tr class="odd">
<td>knucleotide</td>
<td class="speedup">1.59</td>
<td class="bar"><img src="img/bluebar.png" width="72" height="12" alt="=========="></td>
</tr>
<tr class="even">
<td>binarytrees</td>
<td class="speedup">1.52</td>
<td class="bar"><img src="img/bluebar.png" width="68" height="12" alt="========="></td>
</tr>
<tr class="odd">
<td>sumfile</td>
<td class="speedup">1.27</td>
<td class="bar"><img src="img/bluebar.png" width="57" height="12" alt="========"></td>
</tr>
<tr class="even">
<td>regexdna</td>
<td class="speedup">1.01</td>
<td class="bar"><img src="img/bluebar.png" width="45" height="12" alt="======"></td>
</tr>
</table>
</div>
<p>
Note that many of these benchmarks have changed over time (both spec
and code). Benchmark results shown in previous versions of LuaJIT
are not directly comparable. The next section compares different
versions with the current set of benchmarks.
</p>

<h2 id="luajit_versions" class="pagebreak">Comparing LuaJIT Versions</h2>
<p>
This shows the improvements between the following versions:
</p>
<ul>
<li>LuaJIT&nbsp;1.0.x <img src="img/bluebar.png" width="30" height="12" alt="(===)"></li>
<li>LuaJIT&nbsp;1.1.x <img src="img/bluebar.png" width="30" height="12" alt="(===##)"><img src="img/magentabar.png" width="20" height="12" alt=""></li>
</ul>

<div class="tablewrap">
<table class="bench">
<tr class="benchhead">
<td>Benchmark</td>
<td class="speedup">Speedup</td>
<td class="barhead">
<img src="img/spacer.png" width="360" height="12" alt="-----1x----2x----3x----4x----5x----6x----7x----8x">
</td>
</tr>
<tr class="odd">
<td>fannkuch</td>
<td class="speedup">3.96&nbsp;&rarr;&nbsp;5.37</td>
<td class="bar"><img src="img/bluebar.png" width="178" height="12" alt="========================"><img src="img/magentabar.png" width="64" height="12" alt="########"></td>
</tr>
<tr class="even">
<td>chameneos</td>
<td class="speedup">2.25&nbsp;&rarr;&nbsp;5.08</td>
<td class="bar"><img src="img/bluebar.png" width="101" height="12" alt="=============="><img src="img/magentabar.png" width="128" height="12" alt="################"></td>
</tr>
<tr class="odd">
<td>nsievebits</td>
<td class="speedup">2.90&nbsp;&rarr;&nbsp;5.05</td>
<td class="bar"><img src="img/bluebar.png" width="131" height="12" alt="================="><img src="img/magentabar.png" width="96" height="12" alt="#############"></td>
</tr>
<tr class="even">
<td>pidigits</td>
<td class="speedup">3.58&nbsp;&rarr;&nbsp;4.94</td>
<td class="bar"><img src="img/bluebar.png" width="161" height="12" alt="====================="><img src="img/magentabar.png" width="61" height="12" alt="#########"></td>
</tr>
<tr class="odd">
<td>nbody</td>
<td class="speedup">4.16&nbsp;&rarr;&nbsp;4.63</td>
<td class="bar"><img src="img/bluebar.png" width="187" height="12" alt="========================="><img src="img/magentabar.png" width="21" height="12" alt="###"></td>
</tr>
<tr class="even">
<td>cheapconcr</td>
<td class="speedup">1.46&nbsp;&rarr;&nbsp;4.46</td>
<td class="bar"><img src="img/bluebar.png" width="66" height="12" alt="========="><img src="img/magentabar.png" width="135" height="12" alt="##################"></td>
</tr>
<tr class="odd">
<td>partialsums</td>
<td class="speedup">1.71&nbsp;&rarr;&nbsp;3.73</td>
<td class="bar"><img src="img/bluebar.png" width="77" height="12" alt="=========="><img src="img/magentabar.png" width="91" height="12" alt="############"></td>
</tr>
<tr class="even">
<td>fasta</td>
<td class="speedup">2.37&nbsp;&rarr;&nbsp;2.68</td>
<td class="bar"><img src="img/bluebar.png" width="107" height="12" alt="=============="><img src="img/magentabar.png" width="14" height="12" alt="##"></td>
</tr>
<tr class="odd">
<td>cheapconcw</td>
<td class="speedup">1.27&nbsp;&rarr;&nbsp;2.52</td>
<td class="bar"><img src="img/bluebar.png" width="57" height="12" alt="========"><img src="img/magentabar.png" width="56" height="12" alt="#######"></td>
</tr>
<tr class="even">
<td>revcomp</td>
<td class="speedup">1.45&nbsp;&rarr;&nbsp;1.92</td>
<td class="bar"><img src="img/bluebar.png" width="65" height="12" alt="========="><img src="img/magentabar.png" width="21" height="12" alt="###"></td>
</tr>
<tr class="odd">
<td>knucleotide</td>
<td class="speedup">1.32&nbsp;&rarr;&nbsp;1.59</td>
<td class="bar"><img src="img/bluebar.png" width="59" height="12" alt="========"><img src="img/magentabar.png" width="13" height="12" alt="##"></td>
</tr>
</table>
</div>
<p>
All other benchmarks show only minor performance differences.
</p>

<h2 id="summary">Summary</h2>
<p>
These results should give you an idea about what speedup
you can expect depending on the nature of your Lua code:
</p>
<ul>
<li>
LuaJIT is really good at (floating-point) math and loops
(mandelbrot, pidigits, spectralnorm, partialsums).
</li>
<li>
Function calls (recursive), vararg calls, table lookups (nbody),
table iteration and coroutine switching (chameneos, cheapconc)
are a lot faster than with plain Lua.
</li>
<li>
It's still pretty good for indexed table access (fannkuch, nsieve)
and string processing (fasta, revcomp, knucleotide).
But there is room for improvement in a future version.
</li>
<li>
If your application spends most of its time in C&nbsp;code,
you won't see much of a difference (regexdna, sumfile).
OK, so write more code in pure Lua. :-)
</li>
<li>
The real speedup may be masked by other dominant factors in a benchmark:
<ul>
<li>Common parts of the Lua core: e.g. memory allocation
and GC (binarytrees).</li>
<li>Language characteristics: e.g. lack of bit operations (nsievebits).</li>
<li>System characteristics: e.g. CPU cache size and memory speed (nsieve).</li>
</ul>
</li>
</ul>
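<p>
As a rough illustration of the point about time spent in C&nbsp;code, the
sketch below applies Amdahl's law. This is a simplifying model brought in for
this example and not something the measurements above are based on. It
assumes that only the fraction of runtime spent in Lua code gets faster.
The function name <tt>overall_speedup</tt> and the input values are
hypothetical.
</p>

```lua
-- Amdahl's law: lua_fraction is the share of the original runtime spent
-- in Lua code (0..1), lua_speedup is how much faster that share becomes.
-- The remaining share (C code, GC, I/O) is assumed to stay unchanged.
local function overall_speedup(lua_fraction, lua_speedup)
  return 1 / ((1 - lua_fraction) + lua_fraction / lua_speedup)
end

-- Hypothetical inputs: the Lua part runs 5x faster in both cases.
print(string.format("%.2f", overall_speedup(0.10, 5)))  -- prints 1.09
print(string.format("%.2f", overall_speedup(0.90, 5)))  -- prints 3.57
```

<p>
Under this model, a program that spends only 10% of its time in Lua code
gains very little, even when that Lua code itself runs five times faster.
</p>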
<p>
The best idea is of course to benchmark your <em>own</em> applications.
Please report any interesting results you may find. Thank you!
</p>
<br class="flush">
</div>
<div id="foot">
<hr class="hide">
Copyright &copy; 2005-2011 Mike Pall
<span class="noprint">
&middot;
<a href="contact.html">Contact</a>
</span>
</div>
</body>
</html>