Diffstat:
 -rw-r--r--  libraries/LuaJIT-1.1.7/jitdoc/luajit_intro.html | 389
 1 file changed, 0 insertions, 389 deletions

diff --git a/libraries/LuaJIT-1.1.7/jitdoc/luajit_intro.html b/libraries/LuaJIT-1.1.7/jitdoc/luajit_intro.html
deleted file mode 100644
index 9e1fb23..0000000
--- a/libraries/LuaJIT-1.1.7/jitdoc/luajit_intro.html
+++ /dev/null
@@ -1,389 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Introducing LuaJIT</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="Author" content="Mike Pall">
<meta name="Copyright" content="Copyright (C) 2005-2011, Mike Pall">
<meta name="Language" content="en">
<link rel="stylesheet" type="text/css" href="bluequad.css" media="screen">
<link rel="stylesheet" type="text/css" href="bluequad-print.css" media="print">
</head>
<body>
<div id="site">
<a href="http://luajit.org/"><span>Lua<span id="logo">JIT</span></span></a>
</div>
<div id="head">
<h1>Introducing LuaJIT</h1>
</div>
<div id="nav">
<ul><li>
<a href="index.html">Index</a>
</li><li>
<a href="luajit.html">LuaJIT</a>
<ul><li>
<a href="luajit_features.html">Features</a>
</li><li>
<a href="luajit_install.html">Installation</a>
</li><li>
<a href="luajit_run.html">Running</a>
</li><li>
<a href="luajit_api.html">API Extensions</a>
</li><li>
<a class="current" href="luajit_intro.html">Introduction</a>
</li><li>
<a href="luajit_performance.html">Performance</a>
</li><li>
<a href="luajit_debug.html">Debugging</a>
</li><li>
<a href="luajit_changes.html">Changes</a>
</li></ul>
</li><li>
<a href="coco.html">Coco</a>
<ul><li>
<a href="coco_portability.html">Portability</a>
</li><li>
<a href="coco_api.html">API Extensions</a>
</li><li>
<a href="coco_changes.html">Changes</a>
</li></ul>
</li><li>
<a href="dynasm.html">DynASM</a>
<ul><li>
<a href="dynasm_features.html">Features</a>
</li><li>
<a href="dynasm_examples.html">Examples</a>
</li></ul>
</li><li>
<a href="http://luajit.org/download.html">Download <span class="ext">»</span></a>
</li></ul>
</div>
<div id="main">
<p>
This is a little essay that tries to answer the question:
<em>'So, how does LuaJIT really work?'</em>.
</p>
<p>
I have tried to avoid going into all the gory details, but at the
same time to provide an explanation deep enough to let you find
your way around LuaJIT's inner workings.
</p>
<p>
The learning curve may be a little steep for newbies, and
compiler gurus will certainly fall asleep after two paragraphs.
It's difficult to strike a balance here.
</p>

<h2>Acronym Soup</h2>
<p>
As the name says, LuaJIT is a <em>Just-In-Time</em> (JIT) compiler.
This means that functions are compiled on demand, i.e. when they
are first run. This ensures a quick application startup and
avoids useless work: unused functions are not compiled at all.
</p>
<p>
The alternative is known as <em>Ahead-Of-Time</em> (AOT)
compilation. Here everything is compiled before any function is run.
This is the classic approach for many languages, such as C or C++.
</p>
<p>
In fact, plain Lua allows you to pre-compile Lua source code into
Lua bytecode and store it in a binary file that can be run
later on. This is used only in specific settings (e.g. memory-limited
embedded systems), because the Lua bytecode compiler is really fast.
The ability to run source files right away is part of what makes
a dynamic language (aka scripting language) so powerful.
</p>
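<p>
For illustration, here is one way to pre-compile a chunk from within
Lua itself, using only the standard Lua 5.1 functions
<tt>loadstring()</tt> and <tt>string.dump()</tt>. This is just a
sketch of the mechanism; the precompiled bytecode stays in memory
here instead of being written to a file:
</p>
<pre>
-- Compile a source string to a function, then dump its bytecode.
local src = "return 2 + 3"
local f = assert(loadstring(src, "=example"))
local bytecode = string.dump(f)

-- The bytecode could be written to a file and loaded again later;
-- loadstring() accepts binary chunks, too.
local g = assert(loadstring(bytecode))
print(g())  --> 5
</pre>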
<p>
JIT compilation has a few other advantages for dynamic languages
that AOT compilation can only provide with a massive amount
of code analysis. More can be found in the literature.
One particular advantage is explained later.
</p>

<h2>Quick, JIT — Run!</h2>
<p>
JIT compilation happens mostly invisibly. You'll probably never
notice that a compilation is going on. Part of the secret is
that everything happens in little pieces, intermixed with running
the application itself. The other part of the secret
is that JIT compilation can be made pretty fast.
</p>
<p>
Most applications quickly converge to a stable state where
everything that really needs to be compiled is compiled
right away. Only occasional isolated compiles happen later on.
</p>
<p>
Even though the name doesn't suggest it, LuaJIT <em>can</em> operate
in AOT mode, too. But this is completely under user control
(see <a href="luajit_api.html#jit_compile"><tt>jit.compile()</tt></a>)
and doesn't happen automatically.
</p>
<p>
Unless you have good reason to suspect that AOT compilation
might help for a specific application, I wouldn't bother, though.
Compilation speed is usually a non-issue, because LuaJIT
is extremely fast. Compilation times are typically in the
<em>microsecond range</em> for individual Lua functions.
</p>
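<p>
As a rough sketch of what user-controlled AOT compilation can look
like (see the <a href="luajit_api.html#jit_compile">API page</a> for
the exact arguments and options; the single-argument call shown here
is an assumption, not a definitive recipe):
</p>
<pre>
local function hot_loop(n)
  local sum = 0
  for i = 1, n do sum = sum + i end
  return sum
end

-- Ask LuaJIT to compile the function right now instead of waiting
-- for its first call.
jit.compile(hot_loop)

print(hot_loop(100))  --> 5050
</pre>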

<h2>Starting Up</h2>
<p>
The next few paragraphs may not exactly be breaking news to you
if you are familiar with JIT compilers. Still, please read on,
because some terms are introduced here that are used later on.
</p>
<p>
When you start LuaJIT, everything proceeds like in standard Lua:
the Lua core is initialized, the standard libraries are loaded and
the command line is analyzed. Then the first Lua source
file is usually loaded and translated to Lua bytecode. And finally
the function for the initial main chunk is run ...
</p>
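<p>
In plain Lua terms, those last two steps look roughly like this
(the file name is just an example):
</p>
<pre>
-- loadfile() translates the source to bytecode and returns the
-- function for the main chunk ...
local main_chunk = assert(loadfile("app.lua"))

-- ... and calling that function runs the chunk. This is the point
-- where LuaJIT's compiler pipeline kicks in.
main_chunk()
</pre>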

<h2>Kicking the Compiler</h2>
<p>
This is where LuaJIT kicks in:
</p>
<p>
All Lua functions carry an additional <em>status code</em> for LuaJIT.
Initially this is set to 'NONE', i.e. the function has not been
looked at (yet). If a function is run with this setting,
the LuaJIT <em>compiler pipeline</em> is started up.
</p>
<p>
If you haven't loaded any special LuaJIT modules and optimization
is not turned on, the compiler pipeline consists only of the
<em>compiler backend</em>.
</p>
<p>
The compiler backend is the low-level encoding engine that translates
bytecode instructions to machine code instructions. Without any
further hints from other modules, the backend more or less does a
1:1 translation, i.e. a single variant of a bytecode instruction
corresponds to a single piece of machine code.
</p>
<p>
If all goes well, these little code pieces are put together,
a function prologue is slapped on and voila: your Lua function
has been translated to machine code. Of course things are not
that simple when you look closer, but hey — this is
the theory.
</p>
<p>
Anyway, the status code for the function is set to 'OK' and the
machine code is run. If this function runs another Lua function
which has not been compiled, that one is compiled, too. And so on.
</p>

<h2>Call Gates</h2>
<p>
Ok, so what happens when a function is called repeatedly? After all,
this is the most common case.
</p>
<p>
Simple: the status code is checked again. This time it's set to 'OK',
so the machine code can be run directly. Well — that's not the
whole truth: for calls that originate in a JIT-compiled function
a better mechanism, tentatively named <em>call gates</em>, is used.
</p>
<p>
Every function has a call gate field (a function pointer). By default
it's set to a function that does the above checks and runs the
compiler. But as soon as a function is compiled, the call gate
is modified to point to the just-compiled machine code.
</p>
<p>
Calling a function is then as easy as calling the code that the
call gate points to. But due to special (faster) calling conventions
this function pointer cannot be used directly from C. So calls from
a non-compiled function or from a C function use an extra entry
call gate, which in turn calls the real call gate. But this is
really a non-issue since most calls in typical applications
are intra-JIT calls.
</p>

<h2>The Compiler Pipeline</h2>
<p>
The compiler pipeline has already been mentioned. This sounds
more complicated than it is. Basically it is a coroutine that
runs a <em>frontend</em> function, which in turn calls all functions
from the <em>pipeline table</em>.
</p>
<p>
The pipeline table is sorted by <em>priorities</em>. The standard
backend has priority 0. Positive priorities are run before the
backend and negative priorities are run after the backend. Modules
can dynamically attach themselves to or detach themselves from the
pipeline with the library function <tt>jit.attach()</tt>.
</p>
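<p>
As a sketch of what attaching a pass might look like (the exact
<tt>jit.attach()</tt> signature and the arguments a pass receives are
documented on the <a href="luajit_api.html">API page</a>; the
two-argument form used here is an assumption):
</p>
<pre>
-- A do-nothing pass: it just notes that it has been called.
-- The arguments it receives are whatever the pipeline frontend
-- passes along (not shown here).
local function my_pass(...)
  print("my_pass called")
end

-- Priority 50 would place the pass before the backend (priority 0),
-- like the LuaJIT optimizer module does.
jit.attach(my_pass, 50)
</pre>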
<p>
So a typical optimizer pass had better have a positive priority,
because it needs to run before the backend does. E.g. the
LuaJIT optimizer module registers itself with priority 50.
</p>
<p>
On the other hand, a typical helper module for debugging —
a machine code disassembler — needs to be run after the
backend and is attached with a negative priority.
</p>
<p>
One special case occurs when compilation fails. This can be due to
an internal error (ouch) or on purpose. E.g. the optimizer module
checks some characteristics of the function to be compiled and
may decide that it's just not worth it. In this case a status
other than 'OK' is passed back to the pipeline frontend.
</p>
<p>
The easiest thing would be to abort pipeline processing and just
give up. But this would remove the ability to trace the progress
of the compiler (which had better include failed compilations, too).
So there is a special rule: modules with odd priorities are still run,
but those with even priorities are not. That's why e.g. <tt>-j trace</tt>
registers itself with priority -99.
</p>

<h2>The Optimizer</h2>
<p>
Maybe it hasn't become clear from the above description,
but a module can attach any Lua or C function to the compiler
pipeline. In fact all of the loadable modules are Lua modules.
Only the backend itself is written in C.
</p>
<p>
So, yes — the LuaJIT optimizer is written in pure Lua!
</p>
<p>
And no, don't worry, it's quite fast. One reason for this is
that a very simple <em>abstract interpretation</em> algorithm
is used. It mostly ignores control flow and/or basic block
boundaries.
</p>
<p>
Thus the results of the analysis are really only <em>hints</em>.
The backend <em>must</em> check the preconditions (the contracts)
for these hints (e.g. the object type). Still, the generated
hints are pretty accurate and quite useful for speeding up the
compiled code (see below).
</p>
<p>
Explaining how abstract interpretation works is not within the
scope of this short essay. You may want to have a look at the
optimizer source code and/or read some articles or books on
this topic. The canonical reference is
<a href="http://www2.imm.dtu.dk/~riis/PPA/ppa.html"><span class="ext">»</span> Principles of Program Analysis</a>.
Ok, so this one is a bit more on the theoretical side (a gross
understatement). Try a search engine with the keywords "abstract
interpretation", too.
</p>
<p>
Suffice it to say that the optimizer generates hints and passes them
on to the backend. The backend then decides to encode different
forms for the same bytecode instruction, to combine several
instructions, or to inline code for C functions. If the hints
from the optimizer are good, the resulting code will perform
better because shorter code paths are used for the typical cases.
</p>

<h2>The JIT Advantage</h2>
<p>
One important feature of the optimizer is that it takes 'live'
function arguments into account. Since the JIT compiler is
called just before the function is run, the arguments for this
first invocation are already present. This can be used to great
advantage in a <em>dynamically typed language</em>, such as Lua.
</p>
<p>
Here's a trivial example:
</p>
<pre>
function foo(t, k)
  return t[k]
end
</pre>
<p>
Without knowing the most likely arguments for the function,
there's not much to optimize.
</p>
<p>
Ok, so 't' is most likely a table. But it could be userdata, too.
In fact it could be any type, since the introduction of generic
metatables for types.
</p>
<p>
And, more importantly, 'k' can be a number, a string
or any other type. Oh, and let's not forget about metamethods ...
</p>
<p>
If you know a bit about Lua internals, it should be clear by now
that the code for this function could potentially branch to half
of the Lua core. And it's of course impossible to inline all
these cases.
</p>
<p>
On the other hand, if it's <em>known</em> (or there's a good hint)
that 't' is a table and that 'k' is a positive integer, then there
is a high likelihood that the key 'k' is in the array part
of the table. This lookup can be done with just a few machine code
instructions.
</p>
<p>
Of course the preconditions for this fast path have to be checked
(unless there are definitive hints). But if the hints are right,
the code runs a lot faster (about a factor of 3 in this case
for the pure table lookup).
</p>
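<p>
If you want to get a feel for this effect yourself, a rough
micro-benchmark along these lines will do (the loop counts are
arbitrary and the actual numbers depend on your machine and LuaJIT
version, so treat this as a sketch, not as a measurement):
</p>
<pre>
local function foo(t, k)
  return t[k]
end

local t = {}
for i = 1, 100 do t[i] = i end   -- integer keys in the array part

local x = 0
local start = os.clock()
for i = 1, 5000000 do
  x = x + foo(t, (i % 100) + 1)  -- the first call already sees a
end                              -- table and a positive integer key
print("elapsed:", os.clock() - start, "sum:", x)
</pre>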

<h2>Optimizing the Optimizer</h2>
<p>
A question that surely popped up in your mind while reading
the above section: does the optimizer optimize itself? I.e.
is the optimizer module compiled?
</p>
<p>
The current answer is no, mainly because the compiler pipeline
is single-threaded. It's locked during compilation, and
any parallel attempt to JIT-compile a function results in
a 'DELAYED' status code. In fact all modules that attach to
the compiler pipeline disable compilation for the entire
module (because LuaJIT would do that anyway). The main chunk
of modules loaded with <tt>require()</tt> is never compiled,
so there is no chicken-and-egg problem here.
</p>
<p>
Of course you could do an AOT compilation in the main chunk of
the optimizer module. But then only with the plain backend.
Recompiling it later on with the optimizer attached doesn't work,
because a function cannot be compiled twice (I plan to lift
this restriction).
</p>
<p>
The other question is whether it pays off to compile the optimizer
at all. Honestly, I haven't tried, because the current optimizer
is really simple. It runs very quickly, even under the bytecode
interpreter.
</p>

<h2>That's All Folks</h2>
<p>
Ok, that's all for now. I'll extend this text later on with
new topics that come up in questions. Keep on asking these
on the mailing list if you are interested.
</p>
<p>
Thank you for your attention!
</p>
<br class="flush">
</div>
<div id="foot">
<hr class="hide">
Copyright © 2005-2011 Mike Pall
<span class="noprint">
·
<a href="contact.html">Contact</a>
</span>
</div>
</body>
</html>