<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Introducing LuaJIT</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="Author" content="Mike Pall">
<meta name="Copyright" content="Copyright (C) 2005-2011, Mike Pall">
<meta name="Language" content="en">
<link rel="stylesheet" type="text/css" href="bluequad.css" media="screen">
<link rel="stylesheet" type="text/css" href="bluequad-print.css" media="print">
</head>
<body>
<div id="site">
<a href="http://luajit.org/"><span>Lua<span id="logo">JIT</span></span></a>
</div>
<div id="head">
<h1>Introducing LuaJIT</h1>
</div>
<div id="nav">
<ul><li>
<a href="index.html">Index</a>
</li><li>
<a href="luajit.html">LuaJIT</a>
<ul><li>
<a href="luajit_features.html">Features</a>
</li><li>
<a href="luajit_install.html">Installation</a>
</li><li>
<a href="luajit_run.html">Running</a>
</li><li>
<a href="luajit_api.html">API Extensions</a>
</li><li>
<a class="current" href="luajit_intro.html">Introduction</a>
</li><li>
<a href="luajit_performance.html">Performance</a>
</li><li>
<a href="luajit_debug.html">Debugging</a>
</li><li>
<a href="luajit_changes.html">Changes</a>
</li></ul>
</li><li>
<a href="coco.html">Coco</a>
<ul><li>
<a href="coco_portability.html">Portability</a>
</li><li>
<a href="coco_api.html">API Extensions</a>
</li><li>
<a href="coco_changes.html">Changes</a>
</li></ul>
</li><li>
<a href="dynasm.html">DynASM</a>
<ul><li>
<a href="dynasm_features.html">Features</a>
</li><li>
<a href="dynasm_examples.html">Examples</a>
</li></ul>
</li><li>
<a href="http://luajit.org/download.html">Download <span class="ext">&raquo;</span></a>
</li></ul>
</div>
<div id="main">
<p>
This is a little essay that tries to answer the question:
<em>'So, how does LuaJIT really work?'</em>
</p>
<p>
I tried to avoid going into all the gory details, but at the
same time provide a deep enough explanation to let you find
your way around LuaJIT's inner workings.
</p>
<p>
The learning curve may be a little steep for newbies, and
compiler gurus will certainly fall asleep after two paragraphs.
It's difficult to strike a balance here.
</p>

<h2>Acronym Soup</h2>
<p>
As the name says, LuaJIT is a <em>Just-In-Time</em> (JIT) compiler.
This means that functions are compiled on demand, i.e. when they
are first run. This ensures a quick application startup and
avoids useless work: unused functions are never compiled at all.
</p>
<p>
The alternative is known as <em>Ahead-Of-Time</em> (AOT)
compilation. Here everything is compiled before running any function.
This is the classic approach for many languages, such as C or C++.
</p>
<p>
In fact plain Lua allows you to pre-compile Lua source code into
Lua bytecode and store it in a binary file that can be run
later on. This is used only in specific settings (e.g. memory-limited
embedded systems), because the Lua bytecode compiler is really fast.
The ability to run source files right away is part of what makes
a dynamic language (aka scripting language) so powerful.
</p>
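<p>
The pre-compilation step is visible from within the language itself:
plain Lua&nbsp;5.1 exposes it through <tt>string.dump()</tt> and
<tt>loadstring()</tt>. A small sketch (the function name is made up):
</p>
<pre>
-- Dump a function to a binary chunk (what luac stores in a file)
-- and load it back. string.dump() only works for functions
-- without upvalues; loadstring() accepts source text or bytecode.
local function greet(name)
  return "hello, " .. name
end

local bytecode = string.dump(greet)           -- precompiled binary chunk
local restored = assert(loadstring(bytecode)) -- load the bytecode again
print(restored("world"))                      -- prints: hello, world
</pre>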
<p>
JIT compilation has a few other advantages for dynamic languages
that AOT compilation can only provide with a massive amount
of code analysis. More can be found in the literature.
One particular advantage is explained later.
</p>

<h2>Quick, JIT &mdash; Run!</h2>
<p>
JIT compilation is mostly invisible. You'll probably never
notice that a compilation is going on. Part of the secret is
that everything happens in little pieces, intermixed with running
the application itself in between. The other part of the secret
is that JIT compilation can be made pretty fast.
</p>
<p>
Most applications quickly converge to a stable state where
everything that really needs to be compiled is compiled
right away. Only occasional isolated compiles happen later on.
</p>
<p>
Even though the name doesn't suggest it, LuaJIT <em>can</em> operate
in AOT mode, too. But this is completely under user control
(see <a href="luajit_api.html#jit_compile"><tt>jit.compile()</tt></a>)
and doesn't happen automatically.
</p>
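<p>
Under LuaJIT&nbsp;1.x, requesting AOT compilation could look like the
sketch below. The guard around <tt>require("jit")</tt> is an assumption
so the snippet also runs on plain Lua; only the
<tt>jit.compile()</tt> call itself is LuaJIT-specific:
</p>
<pre>
local function hot_loop(n)
  local sum = 0
  for i = 1, n do sum = sum + i end
  return sum
end

-- Ask for ahead-of-time compilation where the jit module is present.
local ok, jit = pcall(require, "jit")
if ok and jit.compile then
  jit.compile(hot_loop)   -- compile now, not on the first call
end
print(hot_loop(100))      -- prints: 5050
</pre>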
<p>
Unless you have good reason to suspect that AOT compilation
might help for a specific application, I wouldn't bother, though.
Compilation speed is usually a non-issue, because LuaJIT
is extremely fast. Compilation times are typically in the
<em>microsecond range</em> for individual Lua functions.
</p>

<h2>Starting Up</h2>
<p>
The next few paragraphs may not be exactly breaking news to you
if you are familiar with JIT compilers. Still, please read on,
because some terms are introduced that are used later on.
</p>
<p>
When you start LuaJIT everything proceeds as in standard Lua:
the Lua core is initialized, the standard libraries are loaded and
the command line is analyzed. Then usually the first Lua source
code file is loaded and translated to Lua bytecode. And finally
the function for the initial main chunk is run ...
</p>
<h2>Kicking the Compiler</h2>
<p>
This is where LuaJIT kicks in:
</p>
<p>
All Lua functions carry an additional <em>status code</em> for LuaJIT.
Initially this is set to 'NONE', i.e. the function has not been
looked at (yet). If a function is run with this setting,
the LuaJIT <em>compiler pipeline</em> is started up.
</p>
<p>
If you haven't loaded any special LuaJIT modules and optimization
is not turned on, the compiler pipeline consists only of the
<em>compiler backend</em>.
</p>
<p>
The compiler backend is the low-level encoding engine that translates
bytecode instructions to machine code instructions. Without any
further hints from other modules, the backend more or less does a
1:1 translation, i.e. a single variant of a bytecode instruction
corresponds to a single piece of machine code.
</p>
<p>
If all goes well, these little code pieces are put together,
a function prologue is slapped on and voil&agrave;: your Lua function
has been translated to machine code. Of course things are not
that simple when you look closer, but hey &mdash; this is
the theory.
</p>
<p>
Anyway, the status code for the function is set to 'OK' and the
machine code is run. If this function runs another Lua function
which has not been compiled, that one is compiled, too. And so on.
</p>
<h2>Call Gates</h2>
<p>
Ok, so what happens when a function is called repeatedly? After all,
this is the most common case.
</p>
<p>
Simple: the status code is checked again. This time it's set to 'OK',
so the machine code can be run directly. Well &mdash; that's not the
whole truth: for calls that originate in a JIT-compiled function
a better mechanism, tentatively named <em>call gates</em>, is used.
</p>
<p>
Every function has a call gate field (a function pointer). By default
it's set to a function that does the above checks and runs the
compiler. But as soon as a function is compiled, the call gate
is modified to point to the just-compiled machine code.
</p>
<p>
Calling a function is then as easy as calling the code that the
call gate points to. But due to special (faster) calling conventions
this function pointer cannot be used directly from C. So calls from
a non-compiled function or from a C&nbsp;function use an extra entry
call gate which in turn calls the real call gate. But this is
really a non-issue since most calls in typical applications
are intra-JIT calls.
</p>
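<p>
The self-patching idea behind call gates can be simulated in a few
lines of Lua. This is only an illustration; the real gates are
machine-level function pointers, and all names here are made up:
</p>
<pre>
-- Wrap an implementation in a record with a self-patching gate.
local function make_function(impl)
  local fn = { status = "NONE" }
  fn.gate = function(...)       -- default gate: "compile", then run
    fn.status = "OK"
    fn.code = impl              -- stand-in for the generated machine code
    fn.gate = fn.code           -- patch the gate: later calls go direct
    return fn.code(...)
  end
  return fn
end

local square = make_function(function(x) return x * x end)
print(square.gate(3))   -- first call "compiles", then runs: 9
print(square.status)    -- prints: OK
print(square.gate(4))   -- goes straight through the patched gate: 16
</pre>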

<h2>The Compiler Pipeline</h2>
<p>
The compiler pipeline has already been mentioned. This sounds
more complicated than it is. Basically it is a coroutine that
runs a <em>frontend</em> function which in turn calls all functions
from the <em>pipeline table</em>.
</p>
<p>
The pipeline table is sorted by <em>priorities</em>. The standard
backend has priority 0. Positive priorities are run before the
backend and negative priorities are run after the backend. Modules
can dynamically attach themselves to or detach themselves from the
pipeline with the library function <tt>jit.attach()</tt>.
</p>
<p>
So a typical optimizer pass had better have a positive priority,
because it needs to be run before the backend is run. E.g. the
LuaJIT optimizer module registers itself with priority 50.
</p>
<p>
On the other hand a typical helper module for debugging &mdash;
a machine code disassembler &mdash; needs to be run after the
backend and is attached with a negative priority.
</p>
<p>
One special case occurs when compilation fails. This can be due to
an internal error (ouch) or on purpose. E.g. the optimizer module
checks some characteristics of the function to be compiled and
may decide that it's just not worth it. In this case a status
other than 'OK' is passed back to the pipeline frontend.
</p>
<p>
The easiest thing would be to abort pipeline processing and just
give up. But this would remove the ability to trace the progress
of the compiler (which had better include failed compilations, too).
So there is a special rule that odd priorities are still run,
but even priorities are not. That's why e.g. <tt>-j trace</tt>
registers itself with priority -99.
</p>
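<p>
Put together, the frontend's dispatch loop might look roughly like
the sketch below. The data structures and pass functions are
illustrative, not LuaJIT's actual internals:
</p>
<pre>
local pipeline = {}

local function attach(prio, pass)
  pipeline[#pipeline + 1] = { prio = prio, pass = pass }
  table.sort(pipeline, function(a, b) return a.prio > b.prio end)
end

local function run_pipeline(func)
  local status = "OK"
  for _, p in ipairs(pipeline) do
    -- After a failure, only odd priorities (e.g. tracers) still run.
    if status == "OK" or p.prio % 2 ~= 0 then
      status = p.pass(func, status) or status
    end
  end
  return status
end

attach(50,  function(f) f.hints = true; return "OK" end) -- optimizer
attach(0,   function(f) return "FAIL" end)               -- backend, fails here
attach(-2,  function(f) f.disasm = true end)             -- even: skipped on failure
attach(-99, function(f, s) f.trace = s; return s end)    -- odd: still runs

local f = {}
print(run_pipeline(f))    -- prints: FAIL
print(f.trace, f.disasm)  -- prints: FAIL   nil
</pre>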

<h2>The Optimizer</h2>
<p>
Maybe it hasn't become clear from the above description,
but a module can attach any Lua or C&nbsp;function to the compiler
pipeline. In fact all of the loadable modules are Lua modules.
Only the backend itself is written in C.
</p>
<p>
So, yes &mdash; the LuaJIT optimizer is written in pure Lua!
</p>
<p>
And no, don't worry, it's quite fast. One reason for this is
that a very simple <em>abstract interpretation</em> algorithm
is used. It mostly ignores control flow and/or basic block
boundaries.
</p>
<p>
Thus the results of the analysis are really only <em>hints</em>.
The backend <em>must</em> check the preconditions (the contracts)
for these hints (e.g. the object type). Still, the generated
hints are pretty accurate and quite useful for speeding up the
compiled code (see below).
</p>
<p>
Explaining how abstract interpretation works is beyond the
scope of this short essay. You may want to have a look at the
optimizer source code and/or read some articles or books on
this topic. The canonical reference is
<a href="http://www2.imm.dtu.dk/~riis/PPA/ppa.html"><span class="ext">&raquo;</span>&nbsp;Principles of Program Analysis</a>.
Ok, so this one is a bit more on the theoretical side (a gross
understatement). Try a search engine with the keywords "abstract
interpretation", too.
</p>
<p>
Suffice it to say the optimizer generates hints and passes them
on to the backend. The backend then decides whether to encode
different forms for the same bytecode instruction, to combine
several instructions, or to inline code for C&nbsp;functions. If the
hints from the optimizer are good, the resulting code will perform
better because shorter code paths are used for the typical cases.
</p>

<h2>The JIT Advantage</h2>
<p>
One important feature of the optimizer is that it takes 'live'
function arguments into account. Since the JIT compiler is
called just before the function is run, the arguments for this
first invocation are already present. This can be used to great
advantage in a <em>dynamically typed language</em>, such as Lua.
</p>
<p>
Here's a trivial example:
</p>
<pre>
function foo(t, k)
  return t[k]
end
</pre>
<p>
Without knowing the most likely arguments for the function
there's not much to optimize.
</p>
<p>
Ok, so 't' is most likely a table. But it could be userdata, too.
In fact it could be any type since the introduction of generic
metatables for types.
</p>
<p>
And more importantly, 'k' can be a number, a string
or any other type. Oh, and let's not forget about metamethods ...
</p>
<p>
If you know a bit about Lua internals, it should be clear by now
that the code for this function could potentially branch to half
of the Lua core. And it's of course impossible to inline all
these cases.
</p>
<p>
On the other hand, if it's <em>known</em> (or there's a good hint)
that 't' is a table and that 'k' is a positive integer, then there
is a high likelihood that the key 'k' is in the array part
of the table. This lookup can be done with just a few machine code
instructions.
</p>
<p>
Of course the preconditions for this fast path have to be checked
(unless there are definitive hints). But if the hints are right,
the code runs a lot faster (about a factor of 3 in this case
for the pure table lookup).
</p>
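<p>
In Lua terms, the guarded fast path behaves roughly like the sketch
below (for illustration only; the real check is emitted as machine
code, and <tt>lookup</tt> is a made-up name):
</p>
<pre>
local function lookup(t, k)
  -- Precondition check for the hinted fast path:
  -- a table indexed by a positive integer key.
  if type(t) == "table" and type(k) == "number"
     and k >= 1 and k % 1 == 0 then
    local v = rawget(t, k)   -- fast path: raw array access
    if v ~= nil then return v end
  end
  return t[k]                -- generic path: hash part, metamethods, ...
end

print(lookup({ "a", "b", "c" }, 2))  -- prints: b
</pre>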

<h2>Optimizing the Optimizer</h2>
<p>
A question that surely popped up in your mind while reading
the above section: does the optimizer optimize itself? I.e.
is the optimizer module compiled?
</p>
<p>
The current answer is no, mainly because the compiler pipeline
is single-threaded. It's locked during compilation and
any parallel attempt to JIT compile a function results in
a 'DELAYED' status code. In fact all modules that attach to
the compiler pipeline disable compilation for the entire
module (because LuaJIT would do that anyway). The main chunk
of modules loaded with <tt>require()</tt> is never compiled,
so there is no chicken-and-egg problem here.
</p>
<p>
Of course you could do an AOT compilation in the main chunk of
the optimizer module. But then only with the plain backend.
Recompiling it later on with the optimizer attached doesn't work,
because a function cannot be compiled twice (I plan to lift
this restriction).
</p>
<p>
The other question is whether it pays off to compile the optimizer
at all. Honestly, I haven't tried, because the current optimizer
is really simple. It runs very quickly, even under the bytecode
interpreter.
</p>

<h2>That's All Folks</h2>
<p>
Ok, that's all for now. I'll extend this text later on with
new topics that come up in questions. Keep on asking these
on the mailing list if you are interested.
</p>
<p>
Thank you for your attention!
</p>
<br class="flush">
</div>
<div id="foot">
<hr class="hide">
Copyright &copy; 2005-2011 Mike Pall
<span class="noprint">
&middot;
<a href="contact.html">Contact</a>
</span>
</div>
</body>
</html>