Diffstat (limited to 'libraries/LuaJIT-1.1.7/jitdoc/luajit_performance.html')
-rw-r--r-- | libraries/LuaJIT-1.1.7/jitdoc/luajit_performance.html | 394 |
1 files changed, 394 insertions, 0 deletions
diff --git a/libraries/LuaJIT-1.1.7/jitdoc/luajit_performance.html b/libraries/LuaJIT-1.1.7/jitdoc/luajit_performance.html
new file mode 100644
index 0000000..7f2307c
--- /dev/null
+++ b/libraries/LuaJIT-1.1.7/jitdoc/luajit_performance.html
@@ -0,0 +1,394 @@ | |||
1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> | ||
2 | <html> | ||
3 | <head> | ||
4 | <title>LuaJIT Performance</title> | ||
5 | <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> | ||
6 | <meta name="Author" content="Mike Pall"> | ||
7 | <meta name="Copyright" content="Copyright (C) 2005-2011, Mike Pall"> | ||
8 | <meta name="Language" content="en"> | ||
9 | <link rel="stylesheet" type="text/css" href="bluequad.css" media="screen"> | ||
10 | <link rel="stylesheet" type="text/css" href="bluequad-print.css" media="print"> | ||
11 | <style type="text/css"> | ||
12 | table.bench { | ||
13 | line-height: 1.2; | ||
14 | } | ||
15 | tr.benchhead td { | ||
16 | font-weight: bold; | ||
17 | } | ||
18 | td img, li img { | ||
19 | vertical-align: middle; | ||
20 | } | ||
21 | td.barhead, td.bar { | ||
22 | font-size: 8pt; | ||
23 | font-family: Courier New, Courier, monospace; | ||
24 | width: 360px; | ||
25 | padding: 0; | ||
26 | } | ||
27 | td.bar { | ||
28 | background: url('img/backbar.png'); | ||
29 | } | ||
30 | td.speedup { | ||
31 | text-align: center; | ||
32 | } | ||
33 | </style> | ||
34 | </head> | ||
35 | <body> | ||
36 | <div id="site"> | ||
37 | <a href="http://luajit.org/"><span>Lua<span id="logo">JIT</span></span></a> | ||
38 | </div> | ||
39 | <div id="head"> | ||
40 | <h1>LuaJIT Performance</h1> | ||
41 | </div> | ||
42 | <div id="nav"> | ||
43 | <ul><li> | ||
44 | <a href="index.html">Index</a> | ||
45 | </li><li> | ||
46 | <a href="luajit.html">LuaJIT</a> | ||
47 | <ul><li> | ||
48 | <a href="luajit_features.html">Features</a> | ||
49 | </li><li> | ||
50 | <a href="luajit_install.html">Installation</a> | ||
51 | </li><li> | ||
52 | <a href="luajit_run.html">Running</a> | ||
53 | </li><li> | ||
54 | <a href="luajit_api.html">API Extensions</a> | ||
55 | </li><li> | ||
56 | <a href="luajit_intro.html">Introduction</a> | ||
57 | </li><li> | ||
58 | <a class="current" href="luajit_performance.html">Performance</a> | ||
59 | </li><li> | ||
60 | <a href="luajit_debug.html">Debugging</a> | ||
61 | </li><li> | ||
62 | <a href="luajit_changes.html">Changes</a> | ||
63 | </li></ul> | ||
64 | </li><li> | ||
65 | <a href="coco.html">Coco</a> | ||
66 | <ul><li> | ||
67 | <a href="coco_portability.html">Portability</a> | ||
68 | </li><li> | ||
69 | <a href="coco_api.html">API Extensions</a> | ||
70 | </li><li> | ||
71 | <a href="coco_changes.html">Changes</a> | ||
72 | </li></ul> | ||
73 | </li><li> | ||
74 | <a href="dynasm.html">DynASM</a> | ||
75 | <ul><li> | ||
76 | <a href="dynasm_features.html">Features</a> | ||
77 | </li><li> | ||
78 | <a href="dynasm_examples.html">Examples</a> | ||
79 | </li></ul> | ||
80 | </li><li> | ||
81 | <a href="http://luajit.org/download.html">Download <span class="ext">»</span></a> | ||
82 | </li></ul> | ||
83 | </div> | ||
84 | <div id="main"> | ||
85 | <p> | ||
86 | Here are some performance measurements, based on a few benchmarks. | ||
87 | </p> | ||
88 | <p style="background: #ffd0d0; text-align: center;"> | ||
89 | LuaJIT 2.0 is available with much improved performance!<br> | ||
90 | Please check the new | ||
91 | <a href="http://luajit.org/performance.html"><span class="ext">»</span> interactive performance comparison</a>. | ||
92 | </p> | ||
93 | |||
94 | <h2 id="interpretation">Interpreting the Results</h2> | ||
95 | <p> | ||
96 | As is always the case with benchmarks, care must be taken when | ||
97 | interpreting the results: | ||
98 | </p> | ||
99 | <p> | ||
100 | First, the standard Lua interpreter is already <em>very</em> fast. | ||
101 | It's commonly the fastest of its class (interpreters) in the | ||
102 | <a href="http://shootout.alioth.debian.org/"><span class="ext">»</span> Great Computer Language Shootout</a>. | ||
103 | Only true machine code compilers get a better overall score. | ||
104 | </p> | ||
105 | <p> | ||
106 | Any performance improvements due to LuaJIT can only be incremental. | ||
107 | You can't expect a speedup of 50x if the fastest compiled language | ||
108 | is only 5x faster than interpreted Lua in a particular benchmark. | ||
109 | LuaJIT can't do miracles. | ||
110 | </p> | ||
111 | <p> | ||
112 | Also please note that most of the benchmarks below are <em>not</em> | ||
113 | trivial micro-benchmarks, which are often cited with marvelous numbers. | ||
114 | Micro-benchmarks do not realistically model the performance gains you | ||
115 | can expect in your own programs. | ||
116 | </p> | ||
117 | <p> | ||
118 | It's easy to make up a few one-liners like:<br> | ||
119 | <tt> local function f(...) end; for i=1,1e7 do f() end</tt><br> | ||
120 | This is more than 30x faster with LuaJIT. But you won't find | ||
121 | this in a real-world program. | ||
122 | </p> | ||
123 | |||
124 | <h2 id="methods">Measurement Methods</h2> | ||
125 | <p> | ||
126 | All measurements have been taken on a Pentium III 1.139 GHz | ||
127 | running Linux 2.6. Both Lua and LuaJIT have been compiled with | ||
128 | GCC 3.3.6 with <tt>-O3 -fomit-frame-pointer</tt>. | ||
129 | You'll definitely get different results on different machines or | ||
130 | with different C compiler options. <sup>*</sup> | ||
131 | </p> | ||
132 | <p> | ||
133 | The basis for the comparison is the user CPU time as reported by | ||
134 | <tt>/usr/bin/time</tt>. The runtime of each benchmark is parametrized | ||
135 | and has been adjusted to minimize the variation between several runs. | ||
136 | The speedup is the time for Lua divided by the time for LuaJIT. | ||
137 | Only this number is shown because it depends less on the specific system. | ||
138 | </p> | ||
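<p>
The measurement described above can be approximated with a short script.
This is only a sketch, not the harness used for the numbers on this page:
it assumes a Unix-like system, reads the child user CPU time from Python's
<tt>os.times()</tt> instead of calling <tt>/usr/bin/time</tt>, and the
helper name <tt>user_cpu_seconds</tt> is made up for illustration.
</p>

```python
import os
import subprocess
import sys


def user_cpu_seconds(argv):
    """Run argv to completion and return the user CPU time it consumed.

    Assumption: a Unix-like system, where os.times() accounts for the
    user CPU time of child processes that have been waited for.
    """
    before = os.times().children_user
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    after = os.times().children_user
    return after - before


if __name__ == "__main__":
    # Stand-in workload; replace with e.g. ["lua", "bench.lua"]
    # (hypothetical script name) to time a real benchmark.
    cmd = [sys.executable, "-c", "sum(i * i for i in range(10**6))"]
    print("user CPU seconds:", user_cpu_seconds(cmd))
```

<p>
Running the same benchmark several times and checking that the reported
times stay close together mirrors the adjustment for run-to-run variation
mentioned above.
</p>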
139 | <p> | ||
140 | E.g. a speedup of 6.74 means the same benchmark runs almost 7 times | ||
141 | faster with <tt>luajit -O</tt> than with standard Lua (or with | ||
142 | <tt>-j off</tt>). Your mileage may vary. | ||
143 | </p> | ||
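<p>
The arithmetic behind a speedup figure is a single division. The sketch
below spells it out; the two times are hypothetical values picked only to
reproduce the 6.74 from the example, not measurements from this page.
</p>

```python
def speedup(lua_seconds, luajit_seconds):
    """Speedup as defined on this page: Lua time over LuaJIT time."""
    if luajit_seconds <= 0:
        raise ValueError("LuaJIT time must be positive")
    return lua_seconds / luajit_seconds


if __name__ == "__main__":
    # Hypothetical user CPU times in seconds, chosen only so the
    # result matches the 6.74 used as the example above.
    print(round(speedup(13.48, 2.00), 2))
```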
144 | <p style="font-size: 80%;"> | ||
145 | <sup>*</sup> Yes, LuaJIT relies on quite a bit of the Lua core infrastructure | ||
146 | like table and string handling. All of this is written in C and | ||
147 | should be compiled with full optimization turned on, or performance | ||
148 | will suffer. | ||
149 | </p> | ||
150 | |||
151 | <h2 id="lua_luajit" class="pagebreak">Comparing Lua to LuaJIT</h2> | ||
152 | <p> | ||
153 | Here is a comparison using the current benchmark collection of the | ||
154 | <a href="http://shootout.alioth.debian.org/"><span class="ext">»</span> Great Computer Language Shootout</a> (as of March 2006): | ||
155 | </p> | ||
156 | |||
157 | <div class="tablewrap"> | ||
158 | <table class="bench"> | ||
159 | <tr class="benchhead"> | ||
160 | <td>Benchmark</td> | ||
161 | <td class="speedup">Speedup</td> | ||
162 | <td class="barhead"> | ||
163 | <img src="img/spacer.png" width="360" height="12" alt="-----1x----2x----3x----4x----5x----6x----7x----8x"> | ||
164 | </td> | ||
165 | </tr> | ||
166 | <tr class="odd"> | ||
167 | <td>mandelbrot</td> | ||
168 | <td class="speedup">6.74</td> | ||
169 | <td class="bar"><img src="img/bluebar.png" width="303" height="12" alt="========================================"></td> | ||
170 | </tr> | ||
171 | <tr class="even"> | ||
172 | <td>recursive</td> | ||
173 | <td class="speedup">6.64</td> | ||
174 | <td class="bar"><img src="img/bluebar.png" width="299" height="12" alt="========================================"></td> | ||
175 | </tr> | ||
176 | <tr class="odd"> | ||
177 | <td>fannkuch</td> | ||
178 | <td class="speedup">5.37</td> | ||
179 | <td class="bar"><img src="img/bluebar.png" width="242" height="12" alt="================================"></td> | ||
180 | </tr> | ||
181 | <tr class="even"> | ||
182 | <td>chameneos</td> | ||
183 | <td class="speedup">5.08</td> | ||
184 | <td class="bar"><img src="img/bluebar.png" width="229" height="12" alt="=============================="></td> | ||
185 | </tr> | ||
186 | <tr class="odd"> | ||
187 | <td>nsievebits</td> | ||
188 | <td class="speedup">5.05</td> | ||
189 | <td class="bar"><img src="img/bluebar.png" width="227" height="12" alt="=============================="></td> | ||
190 | </tr> | ||
191 | <tr class="even"> | ||
192 | <td>pidigits</td> | ||
193 | <td class="speedup">4.94</td> | ||
194 | <td class="bar"><img src="img/bluebar.png" width="222" height="12" alt="=============================="></td> | ||
195 | </tr> | ||
196 | <tr class="odd"> | ||
197 | <td>nbody</td> | ||
198 | <td class="speedup">4.63</td> | ||
199 | <td class="bar"><img src="img/bluebar.png" width="208" height="12" alt="============================"></td> | ||
200 | </tr> | ||
201 | <tr class="even"> | ||
202 | <td>spectralnorm</td> | ||
203 | <td class="speedup">4.59</td> | ||
204 | <td class="bar"><img src="img/bluebar.png" width="207" height="12" alt="============================"></td> | ||
205 | </tr> | ||
206 | <tr class="odd"> | ||
207 | <td>cheapconcr</td> | ||
208 | <td class="speedup">4.46</td> | ||
209 | <td class="bar"><img src="img/bluebar.png" width="201" height="12" alt="==========================="></td> | ||
210 | </tr> | ||
211 | <tr class="even"> | ||
212 | <td>partialsums</td> | ||
213 | <td class="speedup">3.73</td> | ||
214 | <td class="bar"><img src="img/bluebar.png" width="168" height="12" alt="======================"></td> | ||
215 | </tr> | ||
216 | <tr class="odd"> | ||
217 | <td>fasta</td> | ||
218 | <td class="speedup">2.68</td> | ||
219 | <td class="bar"><img src="img/bluebar.png" width="121" height="12" alt="================"></td> | ||
220 | </tr> | ||
221 | <tr class="even"> | ||
222 | <td>cheapconcw</td> | ||
223 | <td class="speedup">2.52</td> | ||
224 | <td class="bar"><img src="img/bluebar.png" width="113" height="12" alt="==============="></td> | ||
225 | </tr> | ||
226 | <tr class="odd"> | ||
227 | <td>nsieve</td> | ||
228 | <td class="speedup">1.95</td> | ||
229 | <td class="bar"><img src="img/bluebar.png" width="88" height="12" alt="============"></td> | ||
230 | </tr> | ||
231 | <tr class="even"> | ||
232 | <td>revcomp</td> | ||
233 | <td class="speedup">1.92</td> | ||
234 | <td class="bar"><img src="img/bluebar.png" width="86" height="12" alt="============"></td> | ||
235 | </tr> | ||
236 | <tr class="odd"> | ||
237 | <td>knucleotide</td> | ||
238 | <td class="speedup">1.59</td> | ||
239 | <td class="bar"><img src="img/bluebar.png" width="72" height="12" alt="=========="></td> | ||
240 | </tr> | ||
241 | <tr class="even"> | ||
242 | <td>binarytrees</td> | ||
243 | <td class="speedup">1.52</td> | ||
244 | <td class="bar"><img src="img/bluebar.png" width="68" height="12" alt="========="></td> | ||
245 | </tr> | ||
246 | <tr class="odd"> | ||
247 | <td>sumfile</td> | ||
248 | <td class="speedup">1.27</td> | ||
249 | <td class="bar"><img src="img/bluebar.png" width="57" height="12" alt="========"></td> | ||
250 | </tr> | ||
251 | <tr class="even"> | ||
252 | <td>regexdna</td> | ||
253 | <td class="speedup">1.01</td> | ||
254 | <td class="bar"><img src="img/bluebar.png" width="45" height="12" alt="======"></td> | ||
255 | </tr> | ||
256 | </table> | ||
257 | </div> | ||
258 | <p> | ||
259 | Note that many of these benchmarks have changed over time (both spec | ||
260 | and code). Benchmark results shown in previous versions of LuaJIT | ||
261 | are not directly comparable. The next section compares different | ||
262 | versions with the current set of benchmarks. | ||
263 | </p> | ||
264 | |||
265 | <h2 id="luajit_versions" class="pagebreak">Comparing LuaJIT Versions</h2> | ||
266 | <p> | ||
267 | This shows the improvements between the following versions: | ||
268 | </p> | ||
269 | <ul> | ||
270 | <li>LuaJIT 1.0.x <img src="img/bluebar.png" width="30" height="12" alt="(===)"></li> | ||
271 | <li>LuaJIT 1.1.x <img src="img/bluebar.png" width="30" height="12" alt="(===##)"><img src="img/magentabar.png" width="20" height="12" alt=""></li> | ||
272 | </ul> | ||
273 | |||
274 | <div class="tablewrap"> | ||
275 | <table class="bench"> | ||
276 | <tr class="benchhead"> | ||
277 | <td>Benchmark</td> | ||
278 | <td class="speedup">Speedup</td> | ||
279 | <td class="barhead"> | ||
280 | <img src="img/spacer.png" width="360" height="12" alt="-----1x----2x----3x----4x----5x----6x----7x----8x"> | ||
281 | </td> | ||
282 | </tr> | ||
283 | <tr class="odd"> | ||
284 | <td>fannkuch</td> | ||
285 | <td class="speedup">3.96 → 5.37</td> | ||
286 | <td class="bar"><img src="img/bluebar.png" width="178" height="12" alt="========================"><img src="img/magentabar.png" width="64" height="12" alt="########"></td> | ||
287 | </tr> | ||
288 | <tr class="even"> | ||
289 | <td>chameneos</td> | ||
290 | <td class="speedup">2.25 → 5.08</td> | ||
291 | <td class="bar"><img src="img/bluebar.png" width="101" height="12" alt="=============="><img src="img/magentabar.png" width="128" height="12" alt="################"></td> | ||
292 | </tr> | ||
293 | <tr class="odd"> | ||
294 | <td>nsievebits</td> | ||
295 | <td class="speedup">2.90 → 5.05</td> | ||
296 | <td class="bar"><img src="img/bluebar.png" width="131" height="12" alt="================="><img src="img/magentabar.png" width="96" height="12" alt="#############"></td> | ||
297 | </tr> | ||
298 | <tr class="even"> | ||
299 | <td>pidigits</td> | ||
300 | <td class="speedup">3.58 → 4.94</td> | ||
301 | <td class="bar"><img src="img/bluebar.png" width="161" height="12" alt="====================="><img src="img/magentabar.png" width="61" height="12" alt="#########"></td> | ||
302 | </tr> | ||
303 | <tr class="odd"> | ||
304 | <td>nbody</td> | ||
305 | <td class="speedup">4.16 → 4.63</td> | ||
306 | <td class="bar"><img src="img/bluebar.png" width="187" height="12" alt="========================="><img src="img/magentabar.png" width="21" height="12" alt="###"></td> | ||
307 | </tr> | ||
308 | <tr class="even"> | ||
309 | <td>cheapconcr</td> | ||
310 | <td class="speedup">1.46 → 4.46</td> | ||
311 | <td class="bar"><img src="img/bluebar.png" width="66" height="12" alt="========="><img src="img/magentabar.png" width="135" height="12" alt="##################"></td> | ||
312 | </tr> | ||
313 | <tr class="odd"> | ||
314 | <td>partialsums</td> | ||
315 | <td class="speedup">1.71 → 3.73</td> | ||
316 | <td class="bar"><img src="img/bluebar.png" width="77" height="12" alt="=========="><img src="img/magentabar.png" width="91" height="12" alt="############"></td> | ||
317 | </tr> | ||
318 | <tr class="even"> | ||
319 | <td>fasta</td> | ||
320 | <td class="speedup">2.37 → 2.68</td> | ||
321 | <td class="bar"><img src="img/bluebar.png" width="107" height="12" alt="=============="><img src="img/magentabar.png" width="14" height="12" alt="##"></td> | ||
322 | </tr> | ||
323 | <tr class="odd"> | ||
324 | <td>cheapconcw</td> | ||
325 | <td class="speedup">1.27 → 2.52</td> | ||
326 | <td class="bar"><img src="img/bluebar.png" width="57" height="12" alt="========"><img src="img/magentabar.png" width="56" height="12" alt="#######"></td> | ||
327 | </tr> | ||
328 | <tr class="even"> | ||
329 | <td>revcomp</td> | ||
330 | <td class="speedup">1.45 → 1.92</td> | ||
331 | <td class="bar"><img src="img/bluebar.png" width="65" height="12" alt="========="><img src="img/magentabar.png" width="21" height="12" alt="###"></td> | ||
332 | </tr> | ||
333 | <tr class="odd"> | ||
334 | <td>knucleotide</td> | ||
335 | <td class="speedup">1.32 → 1.59</td> | ||
336 | <td class="bar"><img src="img/bluebar.png" width="59" height="12" alt="========"><img src="img/magentabar.png" width="13" height="12" alt="##"></td> | ||
337 | </tr> | ||
338 | </table> | ||
339 | </div> | ||
340 | <p> | ||
341 | All other benchmarks show only minor performance differences. | ||
342 | </p> | ||
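<p>
To see how much each benchmark gained between the two versions, the pairs
in the table can be divided. The following sketch copies the speedup values
from the table above; the names <tt>SPEEDUPS</tt>, <tt>relative_gain</tt>
and <tt>ranked_gains</tt> are illustrative and not part of LuaJIT.
</p>

```python
# Speedup over standard Lua, copied from the table above:
# benchmark -> (LuaJIT 1.0.x, LuaJIT 1.1.x)
SPEEDUPS = {
    "fannkuch": (3.96, 5.37),
    "chameneos": (2.25, 5.08),
    "nsievebits": (2.90, 5.05),
    "pidigits": (3.58, 4.94),
    "nbody": (4.16, 4.63),
    "cheapconcr": (1.46, 4.46),
    "partialsums": (1.71, 3.73),
    "fasta": (2.37, 2.68),
    "cheapconcw": (1.27, 2.52),
    "revcomp": (1.45, 1.92),
    "knucleotide": (1.32, 1.59),
}


def relative_gain(old, new):
    """How many times faster 1.1.x runs compared to 1.0.x."""
    return new / old


def ranked_gains(table):
    """Return (benchmark, gain) pairs, largest gain first."""
    gains = [(name, relative_gain(old, new)) for name, (old, new) in table.items()]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, gain in ranked_gains(SPEEDUPS):
        print(f"{name:12s} {gain:.2f}x")
```

<p>
By this measure cheapconcr improves the most (roughly 3x) and nbody the
least, which matches the lengths of the magenta bars in the table.
</p>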
343 | |||
344 | <h2 id="summary">Summary</h2> | ||
345 | <p> | ||
346 | These results should give you an idea about what speedup | ||
347 | you can expect depending on the nature of your Lua code: | ||
348 | </p> | ||
349 | <ul> | ||
350 | <li> | ||
351 | LuaJIT is really good at (floating-point) math and loops | ||
352 | (mandelbrot, pidigits, spectralnorm, partialsums). | ||
353 | </li> | ||
354 | <li> | ||
355 | Function calls (recursive), vararg calls, table lookups (nbody), | ||
356 | table iteration and coroutine switching (chameneos, cheapconc) | ||
357 | are a lot faster than with plain Lua. | ||
358 | </li> | ||
359 | <li> | ||
360 | It's still pretty good for indexed table access (fannkuch, nsieve) | ||
361 | and string processing (fasta, revcomp, knucleotide). | ||
362 | But there is room for improvement in a future version. | ||
363 | </li> | ||
364 | <li> | ||
365 | If your application spends most of the time in C code | ||
366 | you won't see much of a difference (regexdna, sumfile). | ||
367 | Ok, so write more code in pure Lua. :-) | ||
368 | </li> | ||
369 | <li> | ||
370 | The real speedup may be masked by other dominant factors in a benchmark: | ||
371 | <ul> | ||
372 | <li>Common parts of the Lua core: e.g. memory allocation | ||
373 | and GC (binarytrees).</li> | ||
374 | <li>Language characteristics: e.g. lack of bit operations (nsievebits).</li> | ||
375 | <li>System characteristics: e.g. CPU cache size and memory speed (nsieve).</li> | ||
376 | </ul> | ||
377 | </li> | ||
378 | </ul> | ||
379 | <p> | ||
380 | The best idea is of course to benchmark your <em>own</em> applications. | ||
381 | Please report any interesting results you may find. Thank you! | ||
382 | </p> | ||
383 | <br class="flush"> | ||
384 | </div> | ||
385 | <div id="foot"> | ||
386 | <hr class="hide"> | ||
387 | Copyright © 2005-2011 Mike Pall | ||
388 | <span class="noprint"> | ||
389 | · | ||
390 | <a href="contact.html">Contact</a> | ||
391 | </span> | ||
392 | </div> | ||
393 | </body> | ||
394 | </html> | ||