Diffstat (limited to 'libraries/LuaJIT-1.1.7/jitdoc/luajit_performance.html')
-rw-r--r--libraries/LuaJIT-1.1.7/jitdoc/luajit_performance.html394
1 files changed, 394 insertions, 0 deletions
diff --git a/libraries/LuaJIT-1.1.7/jitdoc/luajit_performance.html b/libraries/LuaJIT-1.1.7/jitdoc/luajit_performance.html
new file mode 100644
index 0000000..7f2307c
--- /dev/null
+++ b/libraries/LuaJIT-1.1.7/jitdoc/luajit_performance.html
@@ -0,0 +1,394 @@
1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
2<html>
3<head>
4<title>LuaJIT Performance</title>
5<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
6<meta name="Author" content="Mike Pall">
7<meta name="Copyright" content="Copyright (C) 2005-2011, Mike Pall">
8<meta name="Language" content="en">
9<link rel="stylesheet" type="text/css" href="bluequad.css" media="screen">
10<link rel="stylesheet" type="text/css" href="bluequad-print.css" media="print">
11<style type="text/css">
12table.bench {
13 line-height: 1.2;
14}
15tr.benchhead td {
16 font-weight: bold;
17}
18td img, li img {
19 vertical-align: middle;
20}
21td.barhead, td.bar {
22 font-size: 8pt;
23 font-family: Courier New, Courier, monospace;
24 width: 360px;
25 padding: 0;
26}
27td.bar {
28 background: url('img/backbar.png');
29}
30td.speedup {
31 text-align: center;
32}
33</style>
34</head>
35<body>
36<div id="site">
37<a href="http://luajit.org/"><span>Lua<span id="logo">JIT</span></span></a>
38</div>
39<div id="head">
40<h1>LuaJIT Performance</h1>
41</div>
42<div id="nav">
43<ul><li>
44<a href="index.html">Index</a>
45</li><li>
46<a href="luajit.html">LuaJIT</a>
47<ul><li>
48<a href="luajit_features.html">Features</a>
49</li><li>
50<a href="luajit_install.html">Installation</a>
51</li><li>
52<a href="luajit_run.html">Running</a>
53</li><li>
54<a href="luajit_api.html">API Extensions</a>
55</li><li>
56<a href="luajit_intro.html">Introduction</a>
57</li><li>
58<a class="current" href="luajit_performance.html">Performance</a>
59</li><li>
60<a href="luajit_debug.html">Debugging</a>
61</li><li>
62<a href="luajit_changes.html">Changes</a>
63</li></ul>
64</li><li>
65<a href="coco.html">Coco</a>
66<ul><li>
67<a href="coco_portability.html">Portability</a>
68</li><li>
69<a href="coco_api.html">API Extensions</a>
70</li><li>
71<a href="coco_changes.html">Changes</a>
72</li></ul>
73</li><li>
74<a href="dynasm.html">DynASM</a>
75<ul><li>
76<a href="dynasm_features.html">Features</a>
77</li><li>
78<a href="dynasm_examples.html">Examples</a>
79</li></ul>
80</li><li>
81<a href="http://luajit.org/download.html">Download <span class="ext">&raquo;</span></a>
82</li></ul>
83</div>
84<div id="main">
85<p>
86Here are some performance measurements, based on a few benchmarks.
87</p>
88<p style="background: #ffd0d0; text-align: center;">
89LuaJIT 2.0 is available with much improved performance!<br>
90Please check the new
91<a href="http://luajit.org/performance.html"><span class="ext">&raquo;</span> interactive performance comparison</a>.
92</p>
93
94<h2 id="interpretation">Interpreting the Results</h2>
95<p>
96As is always the case with benchmarks, care must be taken to
97interpret the results:
98</p>
99<p>
100First, the standard Lua interpreter is already <em>very</em> fast.
101It's commonly the fastest of its class (interpreters) in the
102<a href="http://shootout.alioth.debian.org/"><span class="ext">&raquo;</span>&nbsp;Great Computer Language Shootout</a>.
103Only true machine code compilers get a better overall score.
104</p>
105<p>
106Any performance improvements due to LuaJIT can only be incremental.
107You can't expect a speedup of 50x if the fastest compiled language
108is only 5x faster than interpreted Lua in a particular benchmark.
109LuaJIT can't do miracles.
110</p>
111<p>
112Also please note that most of the benchmarks below are <em>not</em>
113trivial micro-benchmarks, which are often cited with marvelous numbers.
114Micro-benchmarks do not realistically model the performance gains you
115can expect in your own programs.
116</p>
117<p>
118It's easy to make up a few one-liners like:<br>
119<tt>&nbsp;&nbsp;local function f(...) end; for i=1,1e7 do f() end</tt><br>
120This is more than 30x faster with LuaJIT. But you won't find
121this in a real-world program.
122</p>
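<p>
To try the one-liner yourself, it can be wrapped in a CPU timer. This is only a sketch: it uses the standard <tt>os.clock()</tt> function, and the names <tt>N</tt> and <tt>elapsed</tt> are illustrative. Run the same file with standard Lua and with LuaJIT and compare the printed times.
</p>

```lua
-- Micro-benchmark from the text, wrapped in a CPU timer.
-- os.clock() is standard Lua; N matches the loop count of the one-liner.
local N = 1e7

local function f(...) end  -- empty vararg function, as in the text

local start = os.clock()
for i = 1, N do f() end
local elapsed = os.clock() - start

print(string.format("%d empty calls: %.3f s CPU", N, elapsed))
```

<p>
The number printed here says little about real programs, which is exactly the point made above.
</p>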
123
124<h2 id="methods">Measurement Methods</h2>
125<p>
126All measurements have been taken on a Pentium&nbsp;III 1.139&nbsp;GHz
127running Linux&nbsp;2.6. Both Lua and LuaJIT have been compiled with
128GCC&nbsp;3.3.6 with <tt>-O3 -fomit-frame-pointer</tt>.
129You'll definitely get different results on different machines or
130with different C&nbsp;compiler options. <sup>*</sup>
131</p>
132<p>
133The comparison is based on the user CPU times as reported by
134<tt>/usr/bin/time</tt>. The runtime of each benchmark is parametrized
135and has been adjusted to minimize the variation between several runs.
136The speedup is the time for Lua divided by the time for LuaJIT.
137Only this number is shown because it's less dependent on a specific system.
138</p>
139<p>
140E.g. a speedup of 6.74 means the same benchmark runs almost 7 times
141faster with <tt>luajit&nbsp;-O</tt> than with standard Lua (or with
142<tt>-j off</tt>). Your mileage may vary.
143</p>
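<p>
As a worked example, the speedup can be computed from two user CPU times. The helper name <tt>speedup</tt> and the two timings below are hypothetical; they are not taken from the measurements on this page and were chosen only so the ratio is easy to check by hand.
</p>

```lua
-- Speedup as used on this page: user CPU time under standard Lua
-- divided by user CPU time under LuaJIT.
local function speedup(lua_seconds, luajit_seconds)
  assert(luajit_seconds > 0, "LuaJIT time must be positive")
  return lua_seconds / luajit_seconds
end

-- Hypothetical timings (NOT measured values from this page).
local s = speedup(13.48, 2.00)
print(string.format("speedup: %.2f", s))  -- prints "speedup: 6.74"
```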
144<p style="font-size: 80%;">
145<sup>*</sup> Yes, LuaJIT relies on quite a bit of the Lua core infrastructure
146like table and string handling. All of this is written in C and
147should be compiled with full optimization turned on, or performance
148will suffer.
149</p>
150
151<h2 id="lua_luajit" class="pagebreak">Comparing Lua to LuaJIT</h2>
152<p>
153Here is a comparison using the current benchmark collection of the
154<a href="http://shootout.alioth.debian.org/"><span class="ext">&raquo;</span>&nbsp;Great Computer Language Shootout</a> (as of March 2006):
155</p>
156
157<div class="tablewrap">
158<table class="bench">
159<tr class="benchhead">
160<td>Benchmark</td>
161<td class="speedup">Speedup</td>
162<td class="barhead">
163<img src="img/spacer.png" width="360" height="12" alt="-----1x----2x----3x----4x----5x----6x----7x----8x">
164</td>
165</tr>
166<tr class="odd">
167<td>mandelbrot</td>
168<td class="speedup">6.74</td>
169<td class="bar"><img src="img/bluebar.png" width="303" height="12" alt="========================================"></td>
170</tr>
171<tr class="even">
172<td>recursive</td>
173<td class="speedup">6.64</td>
174<td class="bar"><img src="img/bluebar.png" width="299" height="12" alt="========================================"></td>
175</tr>
176<tr class="odd">
177<td>fannkuch</td>
178<td class="speedup">5.37</td>
179<td class="bar"><img src="img/bluebar.png" width="242" height="12" alt="================================"></td>
180</tr>
181<tr class="even">
182<td>chameneos</td>
183<td class="speedup">5.08</td>
184<td class="bar"><img src="img/bluebar.png" width="229" height="12" alt="=============================="></td>
185</tr>
186<tr class="odd">
187<td>nsievebits</td>
188<td class="speedup">5.05</td>
189<td class="bar"><img src="img/bluebar.png" width="227" height="12" alt="=============================="></td>
190</tr>
191<tr class="even">
192<td>pidigits</td>
193<td class="speedup">4.94</td>
194<td class="bar"><img src="img/bluebar.png" width="222" height="12" alt="=============================="></td>
195</tr>
196<tr class="odd">
197<td>nbody</td>
198<td class="speedup">4.63</td>
199<td class="bar"><img src="img/bluebar.png" width="208" height="12" alt="============================"></td>
200</tr>
201<tr class="even">
202<td>spectralnorm</td>
203<td class="speedup">4.59</td>
204<td class="bar"><img src="img/bluebar.png" width="207" height="12" alt="============================"></td>
205</tr>
206<tr class="odd">
207<td>cheapconcr</td>
208<td class="speedup">4.46</td>
209<td class="bar"><img src="img/bluebar.png" width="201" height="12" alt="==========================="></td>
210</tr>
211<tr class="even">
212<td>partialsums</td>
213<td class="speedup">3.73</td>
214<td class="bar"><img src="img/bluebar.png" width="168" height="12" alt="======================"></td>
215</tr>
216<tr class="odd">
217<td>fasta</td>
218<td class="speedup">2.68</td>
219<td class="bar"><img src="img/bluebar.png" width="121" height="12" alt="================"></td>
220</tr>
221<tr class="even">
222<td>cheapconcw</td>
223<td class="speedup">2.52</td>
224<td class="bar"><img src="img/bluebar.png" width="113" height="12" alt="==============="></td>
225</tr>
226<tr class="odd">
227<td>nsieve</td>
228<td class="speedup">1.95</td>
229<td class="bar"><img src="img/bluebar.png" width="88" height="12" alt="============"></td>
230</tr>
231<tr class="even">
232<td>revcomp</td>
233<td class="speedup">1.92</td>
234<td class="bar"><img src="img/bluebar.png" width="86" height="12" alt="============"></td>
235</tr>
236<tr class="odd">
237<td>knucleotide</td>
238<td class="speedup">1.59</td>
239<td class="bar"><img src="img/bluebar.png" width="72" height="12" alt="=========="></td>
240</tr>
241<tr class="even">
242<td>binarytrees</td>
243<td class="speedup">1.52</td>
244<td class="bar"><img src="img/bluebar.png" width="68" height="12" alt="========="></td>
245</tr>
246<tr class="odd">
247<td>sumfile</td>
248<td class="speedup">1.27</td>
249<td class="bar"><img src="img/bluebar.png" width="57" height="12" alt="========"></td>
250</tr>
251<tr class="even">
252<td>regexdna</td>
253<td class="speedup">1.01</td>
254<td class="bar"><img src="img/bluebar.png" width="45" height="12" alt="======"></td>
255</tr>
256</table>
257</div>
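<p>
The bars can be read against the scale in the header row, which spans 1x to 8x over 360 pixels. The sketch below assumes a scale of 45 pixels per 1x with rounding to the nearest pixel; this is inferred from the widths in the table, not a documented rule, and <tt>bar_width</tt> is an illustrative name. It reproduces the widths of the sample rows listed.
</p>

```lua
-- Assumed chart scale: the header spans 1x..8x over 360 pixels.
local PIXELS_PER_1X = 360 / 8  -- 45 pixels per 1x (an inference)

-- Round the scaled speedup to the nearest whole pixel.
local function bar_width(speedup)
  return math.floor(speedup * PIXELS_PER_1X + 0.5)
end

-- Sample rows copied from the table: name, speedup, bar width in pixels.
local rows = {
  { "mandelbrot", 6.74, 303 },
  { "fannkuch",   5.37, 242 },
  { "nbody",      4.63, 208 },
  { "nsieve",     1.95, 88 },
  { "regexdna",   1.01, 45 },
}

for _, row in ipairs(rows) do
  local name, speedup, width = row[1], row[2], row[3]
  print(string.format("%-12s %.2f -> %d px (table: %d px)",
    name, speedup, bar_width(speedup), width))
end
```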
258<p>
259Note that many of these benchmarks have changed over time (both spec
260and code). Benchmark results shown in previous versions of LuaJIT
261are not directly comparable. The next section compares different
262versions with the current set of benchmarks.
263</p>
264
265<h2 id="luajit_versions" class="pagebreak">Comparing LuaJIT Versions</h2>
266<p>
267This shows the improvements between the following versions:
268</p>
269<ul>
270<li>LuaJIT&nbsp;1.0.x <img src="img/bluebar.png" width="30" height="12" alt="(===)"></li>
271<li>LuaJIT&nbsp;1.1.x <img src="img/bluebar.png" width="30" height="12" alt="(===##)"><img src="img/magentabar.png" width="20" height="12" alt=""></li>
272</ul>
273
274<div class="tablewrap">
275<table class="bench">
276<tr class="benchhead">
277<td>Benchmark</td>
278<td class="speedup">Speedup</td>
279<td class="barhead">
280<img src="img/spacer.png" width="360" height="12" alt="-----1x----2x----3x----4x----5x----6x----7x----8x">
281</td>
282</tr>
283<tr class="odd">
284<td>fannkuch</td>
285<td class="speedup">3.96&nbsp;&rarr;&nbsp;5.37</td>
286<td class="bar"><img src="img/bluebar.png" width="178" height="12" alt="========================"><img src="img/magentabar.png" width="64" height="12" alt="########"></td>
287</tr>
288<tr class="even">
289<td>chameneos</td>
290<td class="speedup">2.25&nbsp;&rarr;&nbsp;5.08</td>
291<td class="bar"><img src="img/bluebar.png" width="101" height="12" alt="=============="><img src="img/magentabar.png" width="128" height="12" alt="################"></td>
292</tr>
293<tr class="odd">
294<td>nsievebits</td>
295<td class="speedup">2.90&nbsp;&rarr;&nbsp;5.05</td>
296<td class="bar"><img src="img/bluebar.png" width="131" height="12" alt="================="><img src="img/magentabar.png" width="96" height="12" alt="#############"></td>
297</tr>
298<tr class="even">
299<td>pidigits</td>
300<td class="speedup">3.58&nbsp;&rarr;&nbsp;4.94</td>
301<td class="bar"><img src="img/bluebar.png" width="161" height="12" alt="====================="><img src="img/magentabar.png" width="61" height="12" alt="#########"></td>
302</tr>
303<tr class="odd">
304<td>nbody</td>
305<td class="speedup">4.16&nbsp;&rarr;&nbsp;4.63</td>
306<td class="bar"><img src="img/bluebar.png" width="187" height="12" alt="========================="><img src="img/magentabar.png" width="21" height="12" alt="###"></td>
307</tr>
308<tr class="even">
309<td>cheapconcr</td>
310<td class="speedup">1.46&nbsp;&rarr;&nbsp;4.46</td>
311<td class="bar"><img src="img/bluebar.png" width="66" height="12" alt="========="><img src="img/magentabar.png" width="135" height="12" alt="##################"></td>
312</tr>
313<tr class="odd">
314<td>partialsums</td>
315<td class="speedup">1.71&nbsp;&rarr;&nbsp;3.73</td>
316<td class="bar"><img src="img/bluebar.png" width="77" height="12" alt="=========="><img src="img/magentabar.png" width="91" height="12" alt="############"></td>
317</tr>
318<tr class="even">
319<td>fasta</td>
320<td class="speedup">2.37&nbsp;&rarr;&nbsp;2.68</td>
321<td class="bar"><img src="img/bluebar.png" width="107" height="12" alt="=============="><img src="img/magentabar.png" width="14" height="12" alt="##"></td>
322</tr>
323<tr class="odd">
324<td>cheapconcw</td>
325<td class="speedup">1.27&nbsp;&rarr;&nbsp;2.52</td>
326<td class="bar"><img src="img/bluebar.png" width="57" height="12" alt="========"><img src="img/magentabar.png" width="56" height="12" alt="#######"></td>
327</tr>
328<tr class="even">
329<td>revcomp</td>
330<td class="speedup">1.45&nbsp;&rarr;&nbsp;1.92</td>
331<td class="bar"><img src="img/bluebar.png" width="65" height="12" alt="========="><img src="img/magentabar.png" width="21" height="12" alt="###"></td>
332</tr>
333<tr class="odd">
334<td>knucleotide</td>
335<td class="speedup">1.32&nbsp;&rarr;&nbsp;1.59</td>
336<td class="bar"><img src="img/bluebar.png" width="59" height="12" alt="========"><img src="img/magentabar.png" width="13" height="12" alt="##"></td>
337</tr>
338</table>
339</div>
340<p>
341All other benchmarks show only minor performance differences.
342</p>
343
344<h2 id="summary">Summary</h2>
345<p>
346These results should give you an idea of the speedup
347you can expect, depending on the nature of your Lua code:
348</p>
349<ul>
350<li>
351LuaJIT is really good at (floating-point) math and loops
352(mandelbrot, pidigits, spectralnorm, partialsums).
353</li>
354<li>
355Function calls (recursive), vararg calls, table lookups (nbody),
356table iteration and coroutine switching (chameneos, cheapconc)
357are a lot faster than with plain Lua.
358</li>
359<li>
360It's still pretty good for indexed table access (fannkuch, nsieve)
361and string processing (fasta, revcomp, knucleotide).
362But there is room for improvement in a future version.
363</li>
364<li>
365If your application spends most of the time in C&nbsp;code
366you won't see much of a difference (regexdna, sumfile).
367Ok, so write more code in pure Lua. :-)
368</li>
369<li>
370The real speedup may be overshadowed by other dominant factors in a benchmark:
371<ul>
372<li>Common parts of the Lua core: e.g. memory allocation
373and GC (binarytrees).</li>
374<li>Language characteristics: e.g. lack of bit operations (nsievebits).</li>
375<li>System characteristics: e.g. CPU cache size and memory speed (nsieve).</li>
376</ul>
377</li>
378</ul>
379<p>
380The best idea is of course to benchmark your <em>own</em> applications.
381Please report any interesting results you may find. Thank you!
382</p>
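<p>
A minimal harness for timing your own code might look like the following sketch. The names <tt>bench</tt> and <tt>workload</tt> are illustrative and not part of any LuaJIT API, and the workload shown is only a stand-in. Taking the best of several runs is one simple way to reduce the variation between runs.
</p>

```lua
-- Run a workload several times and keep the best CPU time.
-- bench() and its arguments are illustrative names, not a LuaJIT API.
local function bench(workload, runs)
  local best = math.huge
  for _ = 1, runs do
    local start = os.clock()
    workload()
    local elapsed = os.clock() - start
    if elapsed < best then best = elapsed end
  end
  return best
end

-- Stand-in workload: replace with a representative part of your application.
local function workload()
  local sum = 0
  for i = 1, 1e6 do sum = sum + math.sqrt(i) end
  return sum
end

local best = bench(workload, 5)
print(string.format("best of 5 runs: %.3f s CPU", best))
```

<p>
Run the same script once with standard Lua and once with <tt>luajit&nbsp;-O</tt>, then divide the first time by the second to get the speedup.
</p>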
383<br class="flush">
384</div>
385<div id="foot">
386<hr class="hide">
387Copyright &copy; 2005-2011 Mike Pall
388<span class="noprint">
389&middot;
390<a href="contact.html">Contact</a>
391</span>
392</div>
393</body>
394</html>