+This is a list of changes between the released versions of LuaJIT.
+The current development version is LuaJIT 2.0.0-beta9.
+The current stable version is LuaJIT 1.1.7.
+
FFI library: Resolve ld script redirection in ffi.load().
+
From Lua 5.2: package.searchpath(), fp:read("*L"),
+load(string).
+
From Lua 5.2, disabled by default: empty statement,
+table.unpack(), modified coroutine.running().
+
+
Correctness and completeness:
+
+
FFI library: numerous fixes.
+
Fix type mismatches in store-to-load forwarding.
+
Fix error handling within metamethods.
+
Fix table.maxn().
+
Improve accuracy of x^-k on x64.
+
Fix code generation for Intel Atom in x64 mode.
+
Fix narrowing of POW.
+
Fix recording of retried fast functions.
+
Fix code generation for bit.bnot() and multiplies.
+
Fix error location within cpcall frames.
+
Add workaround for old libgcc unwind bug.
+
Fix lua_yield() and getmetatable(lightuserdata) on x64.
+
Misc. fixes for PPC/e500 interpreter.
+
Fix stack slot updates for down-recursion.
+
+
Structural and performance enhancements:
+
+
Add dual-number mode (int/double) for the VM. Enabled for ARM.
+
Improve narrowing of arithmetic operators and for loops.
+
Tune loop unrolling heuristics and increase trace recorder limits.
+
Eliminate dead slots in snapshots using bytecode data-flow analysis.
+
Avoid phantom stores to proxy tables.
+
Optimize lookups in empty proxy tables.
+
Improve bytecode optimization of and/or operators.
+
+
+
+
LuaJIT 2.0.0-beta6 — 2011-02-11
+
+
New features:
+
+
PowerPC/e500v2 port of the LuaJIT interpreter is complete.
+
Various minor features from Lua 5.2: Hex escapes in literals,
+'\*' escape, reversible string.format("%q",s),
+"%g" pattern, table.sort checks callbacks,
+os.exit(status|true|false[,close]).
+
Lua 5.2 __pairs and __ipairs metamethods
+(disabled by default).
+
Initial release of the FFI library.
+
+
Correctness and completeness:
+
+
Fix string.format() for non-finite numbers.
+
Fix memory leak when compiled to use the built-in allocator.
+
x86/x64: Fix unnecessary resize in TSETM bytecode.
+
Fix various GC issues with traces and jit.flush().
+
x64: Fix fusion of indexes for array references.
+
x86/x64: Fix stack overflow handling for coroutine results.
+
Enable low-2GB memory allocation on FreeBSD/x64.
+
Fix collectgarbage("count") result if more than 2GB is in use.
+
Fix parsing of hex floats.
+
x86/x64: Fix loop branch inversion with trailing
+HREF+NE/EQ.
+
Add jit.os string.
+
coroutine.create() permits running C functions, too.
+
Fix OSX build to work with newer ld64 versions.
+
Fix bytecode optimization of and/or operators.
+
+
Structural and performance enhancements:
+
+
Emit specialized bytecode for pairs()/next().
+
Improve bytecode coalescing of nil constants.
+
Compile calls to vararg functions.
+
Compile select().
+
Improve alias analysis, esp. for loads from allocations.
+
Tuning of various compiler heuristics.
+
Refactor and extend IR conversion instructions.
+
x86/x64: Various backend enhancements related to the FFI.
+
Add SPLIT pass to split 64 bit IR instructions for 32 bit CPUs.
+
+
+
+
LuaJIT 2.0.0-beta5 — 2010-08-24
+
+
Correctness and completeness:
+
+
Fix trace exit dispatch to function headers.
+
Fix Windows and OSX builds with LUAJIT_DISABLE_JIT.
+
Reorganize and fix placement of generated machine code on x64.
+
Fix TNEW in x64 interpreter.
+
Do not eliminate PHIs for values only referenced from side exits.
+
OS-independent canonicalization of strings for non-finite numbers.
+
Fix string.char() range check on x64.
+
Fix tostring() resolving within print().
+
Fix error handling for next().
+
Fix passing of constant arguments to external calls on x64.
+
Fix interpreter argument check for two-argument SSE math functions.
+
Fix C frame chain corruption caused by lua_cpcall().
+
Fix return from pcall() within active hook.
+
+
Structural and performance enhancements:
+
+
Replace on-trace GC frame syncing with interpreter exit.
+
Improve hash lookup specialization by not removing dead keys during GC.
+
Turn traces into true GC objects.
+
Avoid starting a GC cycle immediately after library init.
+
Add weak guards to improve dead-code elimination.
+
Speed up string interning.
+
+
+
+
LuaJIT 2.0.0-beta4 — 2010-03-28
+
+
Correctness and completeness:
+
+
Fix precondition for on-trace creation of table keys.
+
Fix {f()} on x64 when table is resized.
+
Fix folding of ordered comparisons with same references.
+
Fix snapshot restores for multi-result bytecodes.
+
Fix potential hang when recording bytecode with nested closures.
+
Fix recording of getmetatable(), tonumber() and bad argument types.
+
Fix SLOAD fusion across returns to lower frames.
+
+
Structural and performance enhancements:
+
+
Add array bounds check elimination. -Oabc is enabled by default.
+
More tuning for x64, e.g. smaller table objects.
+
+
+
+
LuaJIT 2.0.0-beta3 — 2010-03-07
+
+
LuaJIT x64 port:
+
+
Port integrated memory allocator to Linux/x64, Windows/x64 and OSX/x64.
+
Port interpreter and JIT compiler to x64.
+
Port DynASM to x64.
+
Many 32/64 bit cleanups in the VM.
+
Allow building the interpreter with either x87 or SSE2 arithmetics.
+
Add external unwinding and C++ exception interop (default on x64).
+
+
Correctness and completeness:
+
+
Fix constructor bytecode generation for certain conditional values.
+
Fix some cases of ordered string comparisons.
+
Fix lua_tocfunction().
+
Fix cutoff register in JMP bytecode for some conditional expressions.
+
Fix PHI marking algorithm for references from variant slots.
+
Fix package.cpath for non-default PREFIX.
+
Fix DWARF2 frame unwind information for interpreter on OSX.
+
Drive the GC forward on string allocations in the parser.
+
Implement call/return hooks (zero-cost if disabled).
+
Implement yield from C hooks.
+
Disable JIT compiler on older non-SSE2 CPUs instead of aborting.
+
+
Structural and performance enhancements:
+
+
Compile recursive code (tail-, up- and down-recursion).
+
Improve heuristics for bytecode penalties and blacklisting.
+
Split CALL/FUNC recording and clean up fast function call semantics.
+
Major redesign of internal function call handling.
+
Improve FOR loop const specialization and integerness checks.
+
Switch to pre-initialized stacks. Avoid frame-clearing.
+
Colocation of prototypes and related data: bytecode, constants, debug info.
+
Cleanup parser and streamline bytecode generation.
+
Add support for weak IR references to register allocator.
+
Switch to compressed, extensible snapshots.
+
Compile returns to frames below the start frame.
+
Improve alias analysis of upvalues using a disambiguation hash value.
+
Compile floor/ceil/trunc to SSE2 helper calls or SSE4.1 instructions.
Remove a (sometimes) wrong assertion in luaJIT_findpc().
+
DynASM now allows labels for displacements and .aword.
+
Fix some compiler warnings for DynASM glue (internal API change).
+
Correct naming for SSSE3 (temporarily known as SSE4) in DynASM and x86 disassembler.
+
The loadable debug modules now handle redirection to stdout
+(e.g. -j trace=-).
+
+
+
LuaJIT 1.1.2 — 2006-06-24
+
+
Fix MSVC inline assembly: use only local variables with
+lua_number2int().
+
Fix "attempt to call a thread value" bug on Mac OS X:
+make values of consts used as lightuserdata keys unique
+to avoid joining by the compiler/linker.
The C stack is kept 16 byte aligned (faster).
+Mandatory for Mac OS X on Intel, too.
+
Faster calling conventions for internal C helper functions.
+
Better instruction scheduling for function prologue, OP_CALL and
+OP_RETURN.
+
+
+
Miscellaneous optimizations:
+
+
Faster loads of FP constants. Remove narrow-to-wide store-to-load
+forwarding stalls.
+
Use (scalar) SSE2 ops (if the CPU supports it) to speed up slot moves
+and FP to integer conversions.
+
Optimized the two-argument form of OP_CONCAT (a..b).
+
Inlined OP_MOD (a%b).
+With better accuracy than the C variant, too.
+
Inlined OP_POW (a^b). Unroll x^k or
+use k^x = 2^(log2(k)*x) or call pow().
+
+
+
Changes in the optimizer:
+
+
Improved hinting for table keys derived from table values
+(t1[t2[x]]).
+
Lookup hinting now works with arbitrary object types and
+supports index chains, too.
+
Generate type hints for arithmetic and comparison operators,
+OP_LEN, OP_CONCAT and OP_FORPREP.
+
Remove several hint definitions in favour of a generic COMBINE hint.
+
Complete rewrite of jit.opt_inline module
+(ex jit.opt_lib).
+
+
+
Use adaptive deoptimization:
+
+
If runtime verification of a contract fails, the affected
+instruction is recompiled and patched on-the-fly.
+Regular programs will trigger deoptimization only occasionally.
+
This avoids generating code for uncommon fallback cases
+most of the time. Generated code is up to 30% smaller compared to
+LuaJIT 1.0.3.
+
Deoptimization is used for many opcodes and contracts:
+
+
OP_CALL, OP_TAILCALL: type mismatch for callable.
+
Inlined calls: closure mismatch, parameter number and type mismatches.
+
OP_GETTABLE, OP_SETTABLE: table or key type and range mismatches.
+
All arithmetic and comparison operators, OP_LEN, OP_CONCAT,
+OP_FORPREP: operand type and range mismatches.
+
+
Complete redesign of the debug and traceback info
+(bytecode ↔ mcode) to support deoptimization.
+Much more flexible and needs only 50% of the space.
+
The modules jit.trace, jit.dumphints and
+jit.dump handle deoptimization.
+
+
+
Inlined many popular library functions
+(for commonly used arguments only):
+
+
Most math.* functions (the 18 most used ones)
+[2x-10x faster].
+
string.len, string.sub and string.char
+[2x-10x faster].
+
table.insert, table.remove and table.getn
+[3x-5x faster].
+
coroutine.yield and coroutine.resume
+[3x-5x faster].
+
pairs, ipairs and the corresponding iterators
+[8x-15x faster].
+
+
+
Changes in the core and loadable modules and the stand-alone executable:
+
+
Added jit.version, jit.version_num
+and jit.arch.
+
Reorganized some internal API functions (jit.util.*mcode*).
+
The -j dump output now shows JSUB names, too.
+
New x86 disassembler module written in pure Lua. No dependency
+on ndisasm anymore. Flexible API, very compact (500 lines)
+and complete (x87, MMX, SSE, SSE2, SSE3, SSSE3, privileged instructions).
+
luajit -v prints the LuaJIT version and copyright
+on a separate line.
+
+
+
Added SSE, SSE2, SSE3 and SSSE3 support to DynASM.
+
Miscellaneous doc changes. Added a section about
+embedding LuaJIT.
+LuaJIT adds some extensions to the standard Lua/C API. The LuaJIT include
+directory must be in the compiler search path (-Ipath)
+to be able to include the required header for C code:
+
+
+#include "luajit.h"
+
+
+Or for C++ code:
+
+
+#include "lua.hpp"
+
+
+
luaJIT_setmode(L, idx, mode)
+— Control VM
+
+This is a C API extension to allow control of the VM from C code. The
+full prototype of LuaJIT_setmode is:
+
+
+LUA_API int luaJIT_setmode(lua_State *L, int idx, int mode);
+
+
+The returned status is either success (1) or failure (0).
+The second argument is either 0 or a stack index (similar to the
+other Lua/C API functions).
+
+
+The third argument specifies the mode, which is 'or'ed with a flag.
+The flag can be LUAJIT_MODE_OFF to turn a feature on,
+LUAJIT_MODE_ON to turn a feature off, or
+LUAJIT_MODE_FLUSH to flush cached code.
+
+
+The following modes are defined:
+
+
+
luaJIT_setmode(L, 0, LUAJIT_MODE_ENGINE|flag)
+
+Turn the whole JIT compiler on or off or flush the whole cache of compiled code.
+
+This sets the mode for the function at the stack index idx or
+the parent of the calling function (idx = 0). It either
+enables JIT compilation for a function, disables it and flushes any
+already compiled code or only flushes already compiled code. This
+applies recursively to all sub-functions of the function with
+LUAJIT_MODE_ALLFUNC or only to the sub-functions with
+LUAJIT_MODE_ALLSUBFUNC.
+
+Flushes the specified root trace and all of its side traces from the cache.
+The code for the trace will be retained as long as there are any other
+traces which link to it.
+
+This mode defines a wrapper function for calls to C functions. If
+called with LUAJIT_MODE_ON, the stack index at idx
+must be a lightuserdata object holding a pointer to the wrapper
+function. From now on all C functions are called through the wrapper
+function. If called with LUAJIT_MODE_OFF this mode is turned
+off and all C functions are directly called.
+
+
+The wrapper function can be used for debugging purposes or to catch
+and convert foreign exceptions. But please read the section on
+C++ exception interoperability
+first. Recommended usage can be seen in this C++ code excerpt:
+
+
+#include <exception>
+#include "lua.hpp"
+
+// Catch C++ exceptions and convert them to Lua error messages.
+// Customize as needed for your own exception classes.
+static int wrap_exceptions(lua_State *L, lua_CFunction f)
+{
+ try {
+ return f(L); // Call wrapped function and return result.
+ } catch (const char *s) { // Catch and convert exceptions.
+ lua_pushstring(L, s);
+ } catch (std::exception& e) {
+ lua_pushstring(L, e.what());
+ } catch (...) {
+ lua_pushliteral(L, "caught (...)");
+ }
+ return lua_error(L); // Rethrow as a Lua error.
+}
+
+static int myinit(lua_State *L)
+{
+ ...
+ // Define wrapper function and enable it.
+ lua_pushlightuserdata(L, (void *)wrap_exceptions);
+ luaJIT_setmode(L, -1, LUAJIT_MODE_WRAPCFUNC|LUAJIT_MODE_ON);
+ lua_pop(L, 1);
+ ...
+}
+
+
+Note that you can only define a single global wrapper function,
+so be careful when using this mechanism from multiple C++ modules.
+Also note that this mechanism is not without overhead.
+
+
+The FFI library allows calling external C functions and
+using C data structures from pure Lua code.
+
+
+
+
+The FFI library largely obviates the need to write tedious manual
+Lua/C bindings in C. No need to learn a separate binding language
+— it parses plain C declarations! These can be
+cut-n-pasted from C header files or reference manuals. It's up to
+the task of binding large libraries without the need for dealing with
+fragile binding generators.
+
+
+
+The FFI library is tightly integrated into LuaJIT (it's not available
+as a separate module). The code generated by the JIT-compiler for
+accesses to C data structures from Lua code is on par with the
+code a C compiler would generate. Calls to C functions can
+be inlined in JIT-compiled code, unlike calls to functions bound via
+the classic Lua/C API.
+
+
+This page gives a short introduction to the usage of the FFI library.
+Please use the FFI sub-topics in the navigation bar to learn more.
+
+
+
Motivating Example: Calling External C Functions
+
+It's really easy to call an external C library function:
+
+② Add a C declaration
+for the function. The part inside the double-brackets (in green) is
+just standard C syntax.
+
+
+③ Call the named
+C function — Yes, it's that simple!
+
+
+Actually, what goes on behind the scenes is far from simple: ③ makes use of the standard
+C library namespace ffi.C. Indexing this namespace with
+a symbol name ("printf") automatically binds it to the the
+standard C library. The result is a special kind of object which,
+when called, runs the printf function. The arguments passed
+to this function are automatically converted from Lua objects to the
+corresponding C types.
+
+
+Ok, so maybe the use of printf() wasn't such a spectacular
+example. You could have done that with io.write() and
+string.format(), too. But you get the idea ...
+
+
+So here's something to pop up a message box on Windows:
+
+Compare this with the effort required to bind that function using the
+classic Lua/C API: create an extra C file, add a C function
+that retrieves and checks the argument types passed from Lua and calls
+the actual C function, add a list of module functions and their
+names, add a luaopen_* function and register all module
+functions, compile and link it into a shared library (DLL), move it to
+the proper path, add Lua code that loads the module aaaand ... finally
+call the binding function. Phew!
+
+
+
Motivating Example: Using C Data Structures
+
+The FFI library allows you to create and access C data
+structures. Of course the main use for this is for interfacing with
+C functions. But they can be used stand-alone, too.
+
+
+Lua is built upon high-level data types. They are flexible, extensible
+and dynamic. That's why we all love Lua so much. Alas, this can be
+inefficient for certain tasks, where you'd really want a low-level
+data type. E.g. a large array of a fixed structure needs to be
+implemented with a big table holding lots of tiny tables. This imposes
+both a substantial memory overhead as well as a performance overhead.
+
+
+Here's a sketch of a library that operates on color images plus a
+simple benchmark. First, the plain Lua version:
+
+
+local floor = math.floor
+
+local function image_ramp_green(n)
+ local img = {}
+ local f = 255/(n-1)
+ for i=1,n do
+ img[i] = { red = 0, green = floor((i-1)*f), blue = 0, alpha = 255 }
+ end
+ return img
+end
+
+local function image_to_grey(img, n)
+ for i=1,n do
+ local y = floor(0.3*img[i].red + 0.59*img[i].green + 0.11*img[i].blue)
+ img[i].red = y; img[i].green = y; img[i].blue = y
+ end
+end
+
+local N = 400*400
+local img = image_ramp_green(N)
+for i=1,1000 do
+ image_to_grey(img, N)
+end
+
+
+This creates a table with 160.000 pixels, each of which is a table
+holding four number values in the range of 0-255. First an image with
+a green ramp is created (1D for simplicity), then the image is
+converted to greyscale 1000 times. Yes, that's silly, but I was in
+need of a simple example ...
+
+
+And here's the FFI version. The modified parts have been marked in
+bold:
+
+
+①
+
+
+
+
+
+②
+
+③
+④
+
+
+
+
+
+
+③
+⑤local ffi = require("ffi")
+ffi.cdef[[
+typedef struct { uint8_t red, green, blue, alpha; } rgba_pixel;
+]]
+
+local function image_ramp_green(n)
+ local img = ffi.new("rgba_pixel[?]", n)
+ local f = 255/(n-1)
+ for i=0,n-1 do
+ img[i].green = i*f
+ img[i].alpha = 255
+ end
+ return img
+end
+
+local function image_to_grey(img, n)
+ for i=0,n-1 do
+ local y = 0.3*img[i].red + 0.59*img[i].green + 0.11*img[i].blue
+ img[i].red = y; img[i].green = y; img[i].blue = y
+ end
+end
+
+local N = 400*400
+local img = image_ramp_green(N)
+for i=1,1000 do
+ image_to_grey(img, N)
+end
+
+
+Ok, so that wasn't too difficult:
+
+
+① First, load the FFI
+library and declare the low-level data type. Here we choose a
+struct which holds four byte fields, one for each component
+of a 4x8 bit RGBA pixel.
+
+
+② Creating the data
+structure with ffi.new() is straightforward — the
+'?' is a placeholder for the number of elements of a
+variable-length array.
+
+
+③ C arrays are
+zero-based, so the indexes have to run from 0 to
+n-1. One might want to allocate one more element instead to
+simplify converting legacy code.
+
+
+④ Since ffi.new()
+zero-fills the array by default, we only need to set the green and the
+alpha fields.
+
+
+⑤ The calls to
+math.floor() can be omitted here, because floating-point
+numbers are already truncated towards zero when converting them to an
+integer. This happens implicitly when the number is stored in the
+fields of each pixel.
+
+
+Now let's have a look at the impact of the changes: first, memory
+consumption for the image is down from 22 Megabytes to
+640 Kilobytes (400*400*4 bytes). That's a factor of 35x less! So,
+yes, tables do have a noticeable overhead. BTW: The original program
+would consume 40 Megabytes in plain Lua (on x64).
+
+
+Next, performance: the pure Lua version runs in 9.57 seconds (52.9
+seconds with the Lua interpreter) and the FFI version runs in 0.48
+seconds on my machine (YMMV). That's a factor of 20x faster (110x
+faster than the Lua interpreter).
+
+
+The avid reader may notice that converting the pure Lua version over
+to use array indexes for the colors ([1] instead of
+.red, [2] instead of .green etc.) ought to
+be more compact and faster. This is certainly true (by a factor of
+~1.7x). Switching to a struct-of-arrays would help, too.
+
+
+However the resulting code would be less idiomatic and rather
+error-prone. And it still doesn't get even close to the performance of
+the FFI version of the code. Also, high-level data structures cannot
+be easily passed to other C functions, especially I/O functions,
+without undue conversion penalties.
+
+This page describes the API functions provided by the FFI library in
+detail. It's recommended to read through the
+introduction and the
+FFI tutorial first.
+
+
+
Glossary
+
+
cdecl — An abstract C type declaration (a Lua
+string).
+
ctype — A C type object. This is a special kind of
+cdata returned by ffi.typeof(). It serves as a
+cdataconstructor when called.
+
cdata — A C data object. It holds a value of the
+corresponding ctype.
+
ct — A C type specification which can be used for
+most of the API functions. Either a cdecl, a ctype or a
+cdata serving as a template type.
+
cb — A callback object. This is a C data object
+holding a special function pointer. Calling this function from
+C code runs an associated Lua function.
+
VLA — A variable-length array is declared with a
+? instead of the number of elements, e.g. "int[?]".
+The number of elements (nelem) must be given when it's
+created.
+
VLS — A variable-length struct is a struct C
+type where the last element is a VLA. The same rules for
+declaration and creation apply.
+
+
+
Declaring and Accessing External Symbols
+
+External symbols must be declared first and can then be accessed by
+indexing a C library
+namespace, which automatically binds the symbol to a specific
+library.
+
+
+
ffi.cdef(def)
+
+Adds multiple C declarations for types or external symbols (named
+variables or functions). def must be a Lua string. It's
+recommended to use the syntactic sugar for string arguments as
+follows:
+
+
+ffi.cdef[[
+typedef struct foo { int a, b; } foo_t; // Declare a struct and typedef.
+int dofoo(foo_t *f, int n); /* Declare an external C function. */
+]]
+
+
+The contents of the string (the part in green above) must be a
+sequence of
+C declarations,
+separated by semicolons. The trailing semicolon for a single
+declaration may be omitted.
+
+
+Please note that external symbols are only declared, but they
+are not bound to any specific address, yet. Binding is
+achieved with C library namespaces (see below).
+
+
+C declarations are not passed through a C pre-processor,
+yet. No pre-processor tokens are allowed, except for
+#pragma pack. Replace #define in existing
+C header files with enum, static const
+or typedef and/or pass the files through an external
+C pre-processor (once). Be careful not to include unneeded or
+redundant declarations from unrelated header files.
+
+
+
ffi.C
+
+This is the default C library namespace — note the
+uppercase 'C'. It binds to the default set of symbols or
+libraries on the target system. These are more or less the same as a
+C compiler would offer by default, without specifying extra link
+libraries.
+
+
+On POSIX systems, this binds to symbols in the default or global
+namespace. This includes all exported symbols from the executable and
+any libraries loaded into the global namespace. This includes at least
+libc, libm, libdl (on Linux),
+libgcc (if compiled with GCC), as well as any exported
+symbols from the Lua/C API provided by LuaJIT itself.
+
+
+On Windows systems, this binds to symbols exported from the
+*.exe, the lua51.dll (i.e. the Lua/C API
+provided by LuaJIT itself), the C runtime library LuaJIT was linked
+with (msvcrt*.dll), kernel32.dll,
+user32.dll and gdi32.dll.
+
+
+
clib = ffi.load(name [,global])
+
+This loads the dynamic library given by name and returns
+a new C library namespace which binds to its symbols. On POSIX
+systems, if global is true, the library symbols are
+loaded into the global namespace, too.
+
+
+If name is a path, the library is loaded from this path.
+Otherwise name is canonicalized in a system-dependent way and
+searched in the default search path for dynamic libraries:
+
+
+On POSIX systems, if the name contains no dot, the extension
+.so is appended. Also, the lib prefix is prepended
+if necessary. So ffi.load("z") looks for "libz.so"
+in the default shared library search path.
+
+
+On Windows systems, if the name contains no dot, the extension
+.dll is appended. So ffi.load("ws2_32") looks for
+"ws2_32.dll" in the default DLL search path.
+
+
+
Creating cdata Objects
+
+The following API functions create cdata objects (type()
+returns "cdata"). All created cdata objects are
+garbage collected.
+
+Creates a cdata object for the given ct. VLA/VLS types
+require the nelem argument. The second syntax uses a ctype as
+a constructor and is otherwise fully equivalent.
+
+
+The cdata object is initialized according to the
+rules for initializers,
+using the optional init arguments. Excess initializers cause
+an error.
+
+
+Performance notice: if you want to create many objects of one kind,
+parse the cdecl only once and get its ctype with
+ffi.typeof(). Then use the ctype as a constructor repeatedly.
+
+
+Please note that an anonymous struct declaration implicitly
+creates a new and distinguished ctype every time you use it for
+ffi.new(). This is probably not what you want,
+especially if you create more than one cdata object. Different anonymous
+structs are not considered assignment-compatible by the
+C standard, even though they may have the same fields! Also, they
+are considered different types by the JIT-compiler, which may cause an
+excessive number of traces. It's strongly suggested to either declare
+a named struct or typedef with ffi.cdef()
+or to create a single ctype object for an anonymous struct
+with ffi.typeof().
+
+
+
ctype = ffi.typeof(ct)
+
+Creates a ctype object for the given ct.
+
+
+This function is especially useful to parse a cdecl only once and then
+use the resulting ctype object as a constructor.
+
+
+
cdata = ffi.cast(ct, init)
+
+Creates a scalar cdata object for the given ct. The cdata
+object is initialized with init using the "cast" variant of
+the C type conversion
+rules.
+
+
+This functions is mainly useful to override the pointer compatibility
+checks or to convert pointers to addresses or vice versa.
+
+
+
ctype = ffi.metatype(ct, metatable)
+
+Creates a ctype object for the given ct and associates it with
+a metatable. Only struct/union types, complex numbers
+and vectors are allowed. Other types may be wrapped in a
+struct, if needed.
+
+
+The association with a metatable is permanent and cannot be changed
+afterwards. Neither the contents of the metatable nor the
+contents of an __index table (if any) may be modified
+afterwards. The associated metatable automatically applies to all uses
+of this type, no matter how the objects are created or where they
+originate from. Note that pre-defined operations on types have
+precedence (e.g. declared field names cannot be overriden).
+
+
+All standard Lua metamethods are implemented. These are called directly,
+without shortcuts and on any mix of types. For binary operations, the
+left operand is checked first for a valid ctype metamethod. The
+__gc metamethod only applies to struct/union
+types and performs an implicit ffi.gc()
+call during creation of an instance.
+
+
+
cdata = ffi.gc(cdata, finalizer)
+
+Associates a finalizer with a pointer or aggregate cdata object. The
+cdata object is returned unchanged.
+
+
+This function allows safe integration of unmanaged resources into the
+automatic memory management of the LuaJIT garbage collector. Typical
+usage:
+
+
+local p = ffi.gc(ffi.C.malloc(n), ffi.C.free)
+...
+p = nil -- Last reference to p is gone.
+-- GC will eventually run finalizer: ffi.C.free(p)
+
+
+A cdata finalizer works like the __gc metamethod for userdata
+objects: when the last reference to a cdata object is gone, the
+associated finalizer is called with the cdata object as an argument. The
+finalizer can be a Lua function or a cdata function or cdata function
+pointer. An existing finalizer can be removed by setting a nil
+finalizer, e.g. right before explicitly deleting a resource:
+
+
+ffi.C.free(ffi.gc(p, nil)) -- Manually free the memory.
+
+
+
C Type Information
+
+The following API functions return information about C types.
+They are most useful for inspecting cdata objects.
+
+
+
size = ffi.sizeof(ct [,nelem])
+
+Returns the size of ct in bytes. Returns nil if
+the size is not known (e.g. for "void" or function types).
+Requires nelem for VLA/VLS types, except for cdata objects.
+
+
+
align = ffi.alignof(ct)
+
+Returns the minimum required alignment for ct in bytes.
+
+
+
ofs [,bpos,bsize] = ffi.offsetof(ct, field)
+
+Returns the offset (in bytes) of field relative to the start
+of ct, which must be a struct. Additionally returns
+the position and the field size (in bits) for bit fields.
+
+
+
status = ffi.istype(ct, obj)
+
+Returns true if obj has the C type given by
+ct. Returns false otherwise.
+
+
+C type qualifiers (const etc.) are ignored. Pointers are
+checked with the standard pointer compatibility rules, but without any
+special treatment for void *. If ct specifies a
+struct/union, then a pointer to this type is accepted,
+too. Otherwise the types must match exactly.
+
+
+Note: this function accepts all kinds of Lua objects for the
+obj argument, but always returns false for non-cdata
+objects.
+
+
+
Utility Functions
+
+
err = ffi.errno([newerr])
+
+Returns the error number set by the last C function call which
+indicated an error condition. If the optional newerr argument
+is present, the error number is set to the new value and the previous
+value is returned.
+
+
+This function offers a portable and OS-independent way to get and set the
+error number. Note that only some C functions set the error
+number. And it's only significant if the function actually indicated an
+error condition (e.g. with a return value of -1 or
+NULL). Otherwise, it may or may not contain any previously set
+value.
+
+
+You're advised to call this function only when needed and as close as
+possible after the return of the related C function. The
+errno value is preserved across hooks, memory allocations,
+invocations of the JIT compiler and other internal VM activity. The same
+applies to the value returned by GetLastError() on Windows, but
+you need to declare and call it yourself.
+
+
+
str = ffi.string(ptr [,len])
+
+Creates an interned Lua string from the data pointed to by
+ptr.
+
+
+If the optional argument len is missing, ptr is
+converted to a "char *" and the data is assumed to be
+zero-terminated. The length of the string is computed with
+strlen().
+
+
+Otherwise ptr is converted to a "void *" and
+len gives the length of the data. The data may contain
+embedded zeros and need not be byte-oriented (though this may cause
+endianess issues).
+
+
+This function is mainly useful to convert (temporary)
+"const char *" pointers returned by
+C functions to Lua strings and store them or pass them to other
+functions expecting a Lua string. The Lua string is an (interned) copy
+of the data and bears no relation to the original data area anymore.
+Lua strings are 8 bit clean and may be used to hold arbitrary,
+non-character data.
+
+
+Performance notice: it's faster to pass the length of the string, if
+it's known. E.g. when the length is returned by a C call like
+sprintf().
+
+
+
ffi.copy(dst, src, len)
+ffi.copy(dst, str)
+
+Copies the data pointed to by src to dst.
+dst is converted to a "void *" and src
+is converted to a "const void *".
+
+
+In the first syntax, len gives the number of bytes to copy.
+Caveat: if src is a Lua string, then len must not
+exceed #src+1.
+
+
+In the second syntax, the source of the copy must be a Lua string. All
+bytes of the string plus a zero-terminator are copied to
+dst (i.e. #src+1 bytes).
+
+
+Performance notice: ffi.copy() may be used as a faster
+(inlinable) replacement for the C library functions
+memcpy(), strcpy() and strncpy().
+
+
+
ffi.fill(dst, len [,c])
+
+Fills the data pointed to by dst with len constant
+bytes, given by c. If c is omitted, the data is
+zero-filled.
+
+
+Performance notice: ffi.fill() may be used as a faster
+(inlinable) replacement for the C library function
+memset(dst, c, len). Please note the different
+order of arguments!
+
+
+
Target-specific Information
+
+
status = ffi.abi(param)
+
+Returns true if param (a Lua string) applies for the
+target ABI (Application Binary Interface). Returns false
+otherwise. The following parameters are currently defined:
+
+
+
+
Parameter
+
Description
+
+
+
32bit
32 bit architecture
+
+
64bit
64 bit architecture
+
+
le
Little-endian architecture
+
+
be
Big-endian architecture
+
+
fpu
Target has a hardware FPU
+
+
softfp
softfp calling conventions
+
+
hardfp
hardfp calling conventions
+
+
eabi
EABI variant of the standard ABI
+
+
win
Windows variant of the standard ABI
+
+
+
ffi.os
+
+Contains the target OS name. Same contents as
+jit.os.
+
+
+
ffi.arch
+
+Contains the target architecture name. Same contents as
+jit.arch.
+
+
+
Methods for Callbacks
+
+The C types for callbacks
+have some extra methods:
+
+
+
cb:free()
+
+Free the resources associated with a callback. The associated Lua
+function is unanchored and may be garbage collected. The callback
+function pointer is no longer valid and must not be called anymore
+(it may be reused by a subsequently created callback).
+
+
+
cb:set(func)
+
+Associate a new Lua function with a callback. The C type of the
+callback and the callback function pointer are unchanged.
+
+
+This method is useful to dynamically switch the receiver of callbacks
+without creating a new callback each time and registering it again (e.g.
+with a GUI library).
+
+
+
Extended Standard Library Functions
+
+The following standard library functions have been extended to work
+with cdata objects:
+
+
+
n = tonumber(cdata)
+
+Converts a number cdata object to a double and returns it as
+a Lua number. This is particularly useful for boxed 64 bit
+integer values. Caveat: this conversion may incur a precision loss.
+
+
+
s = tostring(cdata)
+
+Returns a string representation of the value of 64 bit integers
+("nnnLL" or "nnnULL") or
+complex numbers ("re±imi"). Otherwise
+returns a string representation of the C type of a ctype object
+("ctype<type>") or a cdata object
+("cdata<type>: address").
+
+
+
Extensions to the Lua Parser
+
+The parser for Lua source code treats numeric literals with the
+suffixes LL or ULL as signed or unsigned 64 bit
+integers. Case doesn't matter, but uppercase is recommended for
+readability. It handles both decimal (42LL) and hexadecimal
+(0x2aLL) literals.
+
+
+The imaginary part of complex numbers can be specified by suffixing
+number literals with i or I, e.g. 12.5i.
+Caveat: you'll need to use 1i to get an imaginary part with
+the value one, since i itself still refers to a variable
+named i.
+
+This page describes the detailed semantics underlying the FFI library
+and its interaction with both Lua and C code.
+
+
+Given that the FFI library is designed to interface with C code
+and that declarations can be written in plain C syntax, it
+closely follows the C language semantics, wherever possible.
+Some minor concessions are needed for smoother interoperation with Lua
+language semantics.
+
+
+Please don't be overwhelmed by the contents of this page — this
+is a reference and you may need to consult it, if in doubt. It doesn't
+hurt to skim this page, but most of the semantics "just work" as you'd
+expect them to work. It should be straightforward to write
+applications using the LuaJIT FFI for developers with a C or C++
+background.
+
+
+Please note: this doesn't comprise the final specification for the FFI
+semantics, yet. Some semantics may need to be changed, based on your
+feedback. Please report any problems you may
+encounter or any improvements you'd like to see — thank you!
+
+
+
C Language Support
+
+The FFI library has a built-in C parser with a minimal memory
+footprint. It's used by the ffi.* library
+functions to declare C types or external symbols.
+
+
+It's only purpose is to parse C declarations, as found e.g. in
+C header files. Although it does evaluate constant expressions,
+it's not a C compiler. The body of inline
+C function definitions is simply ignored.
+
+
+Also, this is not a validating C parser. It expects and
+accepts correctly formed C declarations, but it may choose to
+ignore bad declarations or show rather generic error messages. If in
+doubt, please check the input against your favorite C compiler.
+
+
+The C parser complies to the C99 language standard plus
+the following extensions:
+
+
+
+
The '\e' escape in character and string literals.
+
+
The C99/C++ boolean type, declared with the keywords bool
+or _Bool.
+
+
Complex numbers, declared with the keywords complex or
+_Complex.
+
+
Two complex number types: complex (aka
+complex double) and complex float.
+
+
Vector types, declared with the GCC mode or
+vector_size attribute.
+
+
Unnamed ('transparent') struct/union fields
+inside a struct/union.
+
+
Incomplete enum declarations, handled like incomplete
+struct declarations.
+
+
Unnamed enum fields inside a
+struct/union. This is similar to a scoped C++
+enum, except that declared constants are visible in the
+global namespace, too.
+
+
Scoped static const declarations inside a
+struct/union (from C++).
+
+
Zero-length arrays ([0]), empty
+struct/union, variable-length arrays (VLA,
+[?]) and variable-length structs (VLS, with a trailing
+VLA).
+
+
C++ reference types (int &x).
+
+
Alternate GCC keywords with '__', e.g.
+__const__.
+
+
GCC __attribute__ with the following attributes:
+aligned, packed, mode,
+vector_size, cdecl, fastcall,
+stdcall.
+
+
The GCC __extension__ keyword and the GCC
+__alignof__ operator.
+
+
GCC __asm__("symname") symbol name redirection for
+function declarations.
+
+
MSVC keywords for fixed-length types: __int8,
+__int16, __int32 and __int64.
+You're encouraged to use these types in preference to the
+compiler-specific extensions or the target-dependent standard types.
+E.g. char differs in signedness and long differs in
+size, depending on the target architecture and platform ABI.
+
+
+The following C features are not supported:
+
+
+
+
A declaration must always have a type specifier; it doesn't
+default to an int type.
+
+
Old-style empty function declarations (K&R) are not allowed.
+All C functions must have a proper prototype declaration. A
+function declared without parameters (int foo();) is
+treated as a function taking zero arguments, like in C++.
+
+
The long double C type is parsed correctly, but
+there's no support for the related conversions, accesses or arithmetic
+operations.
+
+
Wide character strings and character literals are not
+supported.
+
+
See below for features that are currently
+not implemented.
+
+
+
+
C Type Conversion Rules
+
+
Conversions from C types to Lua objects
+
+These conversion rules apply for read accesses to
+C types: indexing pointers, arrays or
+struct/union types; reading external variables or
+constant values; retrieving return values from C calls:
+
+
+
+
Input
+
Conversion
+
Output
+
+
+
int8_t, int16_t
→sign-extint32_t → double
number
+
+
uint8_t, uint16_t
→zero-extint32_t → double
number
+
+
int32_t, uint32_t
→ double
number
+
+
int64_t, uint64_t
boxed value
64 bit int cdata
+
+
double, float
→ double
number
+
+
bool
0 → false, otherwise true
boolean
+
+
Complex number
boxed value
complex cdata
+
+
Vector
boxed value
vector cdata
+
+
Pointer
boxed value
pointer cdata
+
+
Array
boxed reference
reference cdata
+
+
struct/union
boxed reference
reference cdata
+
+
+Bitfields or enum types are treated like their underlying
+type.
+
+
+Reference types are dereferenced before a conversion can take
+place — the conversion is applied to the C type pointed to
+by the reference.
+
+
+
Conversions from Lua objects to C types
+
+These conversion rules apply for write accesses to
+C types: indexing pointers, arrays or
+struct/union types; initializing cdata objects;
+casts to C types; writing to external variables; passing
+arguments to C calls:
+
+If the result type of this conversion doesn't match the
+C type of the destination, the
+conversion rules between C types
+are applied.
+
+
+Reference types are immutable after initialization ("no re-seating of
+references"). For initialization purposes or when passing values to
+reference parameters, they are treated like pointers. Note that unlike
+in C++, there's no way to implement automatic reference generation of
+variables under the Lua language semantics. If you want to call a
+function with a reference parameter, you need to explicitly pass a
+one-element array.
+
+
+
Conversions between C types
+
+These conversion rules are more or less the same as the standard
+C conversion rules. Some rules only apply to casts, or require
+pointer or type compatibility:
+
+
+
+
Input
+
Conversion
+
Output
+
+
+
Signed integer
→narrow or sign-extend
Integer
+
+
Unsigned integer
→narrow or zero-extend
Integer
+
+
Integer
→round
double, float
+
+
double, float
→truncint32_t →narrow
(u)int8_t, (u)int16_t
+
+
double, float
→trunc
(u)int32_t, (u)int64_t
+
+
double, float
→round
float, double
+
+
Number
n == 0 → 0, otherwise 1
bool
+
+
bool
false → 0, true → 1
Number
+
+
Complex number
convert real part
Number
+
+
Number
convert real part, imag = 0
Complex number
+
+
Complex number
convert real and imag part
Complex number
+
+
Number
convert scalar and replicate
Vector
+
+
Vector
copy (same size)
Vector
+
+
struct/union
take base address (compat)
Pointer
+
+
Array
take base address (compat)
Pointer
+
+
Function
take function address
Function pointer
+
+
Number
convert via uintptr_t (cast)
Pointer
+
+
Pointer
convert address (compat/cast)
Pointer
+
+
Pointer
convert address (cast)
Integer
+
+
Array
convert base address (cast)
Integer
+
+
Array
copy (compat)
Array
+
+
struct/union
copy (identical type)
struct/union
+
+
+Bitfields or enum types are treated like their underlying
+type.
+
+
+Conversions not listed above will raise an error. E.g. it's not
+possible to convert a pointer to a complex number or vice versa.
+
+
+
Conversions for vararg C function arguments
+
+The following default conversion rules apply when passing Lua objects
+to the variable argument part of vararg C functions:
+
+
+
+
Input
+
Conversion
+
Output
+
+
+
number
→
double
+
+
boolean
false → 0, true → 1
bool
+
+
nil
NULL →
(void *)
+
+
userdata
userdata payload →
(void *)
+
+
lightuserdata
lightuserdata address →
(void *)
+
+
string
string data →
const char *
+
+
float cdata
→
double
+
+
Array cdata
take base address
Element pointer
+
+
struct/union cdata
take base address
struct/union pointer
+
+
Function cdata
take function address
Function pointer
+
+
Any other cdata
no conversion
C type
+
+
+To pass a Lua object, other than a cdata object, as a specific type,
+you need to override the conversion rules: create a temporary cdata
+object with a constructor or a cast and initialize it with the value
+to pass:
+
+
+Assuming x is a Lua number, here's how to pass it as an
+integer to a vararg function:
+
+If you don't do this, the default Lua number → double
+conversion rule applies. A vararg C function expecting an integer
+will see a garbled or uninitialized value.
+
+
+
Initializers
+
+Creating a cdata object with
+ffi.new() or the
+equivalent constructor syntax always initializes its contents, too.
+Different rules apply, depending on the number of optional
+initializers and the C types involved:
+
+
+
If no initializers are given, the object is filled with zero bytes.
Valarrays (complex numbers and vectors) are treated like scalars
+when a single initializer is given. Otherwise they are treated like
+regular arrays.
+
+
Aggregate types (arrays and structs) accept either a single
+table initializer or a flat list of
+initializers.
+
+
The elements of an array are initialized, starting at index zero.
+If a single initializer is given for an array, it's repeated for all
+remaining elements. This doesn't happen if two or more initializers
+are given: all remaining uninitialized elements are filled with zero
+bytes.
+
+
Byte arrays may also be initialized with a Lua string. This copies
+the whole string plus a terminating zero-byte. The copy stops early only
+if the array has a known, fixed size.
+
+
The fields of a struct are initialized in the order of
+their declaration. Uninitialized fields are filled with zero
+bytes.
+
+
Only the first field of a union can be initialized with a
+flat initializer.
+
+
Elements or fields which are aggregates themselves are initialized
+with a single initializer, but this may be a table
+initializer or a compatible aggregate.
+
+
Excess initializers cause an error.
+
+
+
+
Table Initializers
+
+The following rules apply if a Lua table is used to initialize an
+Array or a struct/union:
+
+
+
+
If the table index [0] is non-nil, then the
+table is assumed to be zero-based. Otherwise it's assumed to be
+one-based.
+
+
Array elements, starting at index zero, are initialized one-by-one
+with the consecutive table elements, starting at either index
+[0] or [1]. This process stops at the first
+nil table element.
+
+
If exactly one array element was initialized, it's repeated for
+all the remaining elements. Otherwise all remaining uninitialized
+elements are filled with zero bytes.
+
+
The above logic only applies to arrays with a known fixed size.
+A VLA is only initialized with the element(s) given in the table.
+Depending on the use case, you may need to explicitly add a
+NULL or 0 terminator to a VLA.
+
+
If the table has a non-empty hash part, a
+struct/union is initialized by looking up each field
+name (as a string key) in the table. Each non-nil value is
+used to initialize the corresponding field.
+
+
Otherwise a struct/union is initialized in the
+order of the declaration of its fields. Each field is initialized with
+the consecutive table elements, starting at either index [0]
+or [1]. This process stops at the first nil table
+element.
+
+
Uninitialized fields of a struct are filled with zero
+bytes, except for the trailing VLA of a VLS.
+
+
Initialization of a union stops after one field has been
+initialized. If no field has been initialized, the union is
+filled with zero bytes.
+
+
Elements or fields which are aggregates themselves are initialized
+with a single initializer, but this may be a nested table
+initializer (or a compatible aggregate).
+
+
Excess initializers for an array cause an error. Excess
+initializers for a struct/union are ignored.
+Unrelated table entries are ignored, too.
+
+
+
+Example:
+
+
+local ffi = require("ffi")
+
+ffi.cdef[[
+struct foo { int a, b; };
+union bar { int i; double d; };
+struct nested { int x; struct foo y; };
+]]
+
+ffi.new("int[3]", {}) --> 0, 0, 0
+ffi.new("int[3]", {1}) --> 1, 1, 1
+ffi.new("int[3]", {1,2}) --> 1, 2, 0
+ffi.new("int[3]", {1,2,3}) --> 1, 2, 3
+ffi.new("int[3]", {[0]=1}) --> 1, 1, 1
+ffi.new("int[3]", {[0]=1,2}) --> 1, 2, 0
+ffi.new("int[3]", {[0]=1,2,3}) --> 1, 2, 3
+ffi.new("int[3]", {[0]=1,2,3,4}) --> error: too many initializers
+
+ffi.new("struct foo", {}) --> a = 0, b = 0
+ffi.new("struct foo", {1}) --> a = 1, b = 0
+ffi.new("struct foo", {1,2}) --> a = 1, b = 2
+ffi.new("struct foo", {[0]=1,2}) --> a = 1, b = 2
+ffi.new("struct foo", {b=2}) --> a = 0, b = 2
+ffi.new("struct foo", {a=1,b=2,c=3}) --> a = 1, b = 2 'c' is ignored
+
+ffi.new("union bar", {}) --> i = 0, d = 0.0
+ffi.new("union bar", {1}) --> i = 1, d = ?
+ffi.new("union bar", {[0]=1,2}) --> i = 1, d = ? '2' is ignored
+ffi.new("union bar", {d=2}) --> i = ?, d = 2.0
+
+ffi.new("struct nested", {1,{2,3}}) --> x = 1, y.a = 2, y.b = 3
+ffi.new("struct nested", {x=1,y={2,3}}) --> x = 1, y.a = 2, y.b = 3
+
+
+
Operations on cdata Objects
+
+All of the standard Lua operators can be applied to cdata objects or a
+mix of a cdata object and another Lua object. The following list shows
+the valid combinations. All other combinations currently raise an
+error.
+
+
+Reference types are dereferenced before performing each of
+the operations below — the operation is applied to the
+C type pointed to by the reference.
+
+
+The pre-defined operations are always tried first before deferring to a
+metamethod for a ctype (if defined).
+
+
+
Indexing a cdata object
+
+
+
Indexing a pointer/array: a cdata pointer/array can be
+indexed by a cdata number or a Lua number. The element address is
+computed as the base address plus the number value multiplied by the
+element size in bytes. A read access loads the element value and
+converts it to a Lua object. A write
+access converts a Lua object to the element
+type and stores the converted value to the element. An error is
+raised if the element size is undefined or a write access to a
+constant element is attempted.
+
+
Dereferencing a struct/union field: a
+cdata struct/union or a pointer to a
+struct/union can be dereferenced by a string key,
+giving the field name. The field address is computed as the base
+address plus the relative offset of the field. A read access loads the
+field value and converts it to a Lua
+object. A write access converts a Lua
+object to the field type and stores the converted value to the
+field. An error is raised if a write access to a constant
+struct/union or a constant field is attempted.
+
+
Indexing a complex number: a complex number can be indexed
+either by a cdata number or a Lua number with the values 0 or 1, or by
+the strings "re" or "im". A read access loads the
+real part ([0], .re) or the imaginary part
+([1], .im) part of a complex number and
+converts it to a Lua number. The
+sub-parts of a complex number are immutable — assigning to an
+index of a complex number raises an error. Accessing out-of-bound
+indexes returns unspecified results, but is guaranteed not to trigger
+memory access violations.
+
+
Indexing a vector: a vector is treated like an array for
+indexing purposes, except the vector elements are immutable —
+assigning to an index of a vector raises an error.
+
+
+
+Note: since there's (deliberately) no address-of operator, a cdata
+object holding a value type is effectively immutable after
+initialization. The JIT compiler benefits from this fact when applying
+certain optimizations.
+
+
+As a consequence of this, the elements of complex numbers and
+vectors are immutable. But the elements of an aggregate holding these
+types may be modified of course. I.e. you cannot assign to
+foo.c.im, but you can assign a (newly created) complex number
+to foo.c.
+
+
+
Calling a cdata object
+
+
+
Constructor: a ctype object can be called and used as a
+constructor.
+
+
C function call: a cdata function or cdata function
+pointer can be called. The passed arguments are
+converted to the C types of the
+parameters given by the function declaration. Arguments passed to the
+variable argument part of vararg C function use
+special conversion rules. This
+C function is called and the return value (if any) is
+converted to a Lua object.
+On Windows/x86 systems, __stdcall functions are automatically
+detected and a function declared as __cdecl (the default) is
+silently fixed up after the first call.
+
+
+
+
Arithmetic on cdata objects
+
+
+
Pointer arithmetic: a cdata pointer/array and a cdata
+number or a Lua number can be added or subtracted. The number must be
+on the right hand side for a subtraction. The result is a pointer of
+the same type with an address plus or minus the number value
+multiplied by the element size in bytes. An error is raised if the
+element size is undefined.
+
+
Pointer difference: two compatible cdata pointers/arrays
+can be subtracted. The result is the difference between their
+addresses, divided by the element size in bytes. An error is raised if
+the element size is undefined or zero.
+
+
64 bit integer arithmetic: the standard arithmetic
+operators (+ - * / % ^ and unary
+minus) can be applied to two cdata numbers, or a cdata number and a
+Lua number. If one of them is an uint64_t, the other side is
+converted to an uint64_t and an unsigned arithmetic operation
+is performed. Otherwise both sides are converted to an
+int64_t and a signed arithmetic operation is performed. The
+result is a boxed 64 bit cdata object.
+
+These rules ensure that 64 bit integers are "sticky". Any
+expression involving at least one 64 bit integer operand results
+in another one. The undefined cases for the division, modulo and power
+operators return 2LL ^ 63 or
+2ULL ^ 63.
+
+You'll have to explicitly convert a 64 bit integer to a Lua
+number (e.g. for regular floating-point calculations) with
+tonumber(). But note this may incur a precision loss.
+
+
+
+
Comparisons of cdata objects
+
+
+
Pointer comparison: two compatible cdata pointers/arrays
+can be compared. The result is the same as an unsigned comparison of
+their addresses. nil is treated like a NULL pointer,
+which is compatible with any other pointer type.
+
+
64 bit integer comparison: two cdata numbers, or a
+cdata number and a Lua number can be compared with each other. If one
+of them is an uint64_t, the other side is converted to an
+uint64_t and an unsigned comparison is performed. Otherwise
+both sides are converted to an int64_t and a signed
+comparison is performed.
+
+
+
+
cdata objects as table keys
+
+Lua tables may be indexed by cdata objects, but this doesn't provide
+any useful semantics — cdata objects are unsuitable as table
+keys!
+
+
+A cdata object is treated like any other garbage-collected object and
+is hashed and compared by its address for table indexing. Since
+there's no interning for cdata value types, the same value may be
+boxed in different cdata objects with different addresses. Thus
+t[1LL+1LL] and t[2LL] usually do not point to
+the same hash slot and they certainly do not point to the same
+hash slot as t[2].
+
+
+It would seriously drive up implementation complexity and slow down
+the common case, if one were to add extra handling for by-value
+hashing and comparisons to Lua tables. Given the ubiquity of their use
+inside the VM, this is not acceptable.
+
+
+There are three viable alternatives, if you really need to use cdata
+objects as keys:
+
+
+
+
If you can get by with the precision of Lua numbers
+(52 bits), then use tonumber() on a cdata number or
+combine multiple fields of a cdata aggregate to a Lua number. Then use
+the resulting Lua number as a key when indexing tables.
+One obvious benefit: t[tonumber(2LL)]does point to
+the same slot as t[2].
+
+
Otherwise use either tostring() on 64 bit integers
+or complex numbers or combine multiple fields of a cdata aggregate to
+a Lua string (e.g. with
+ffi.string()). Then
+use the resulting Lua string as a key when indexing tables.
+
+
Create your own specialized hash table implementation using the
+C types provided by the FFI library, just like you would in
+C code. Ultimately this may give much better performance than the
+other alternatives or what a generic by-value hash table could
+possibly provide.
+
+
+
+
Garbage Collection of cdata Objects
+
+All explicitly (ffi.new(), ffi.cast() etc.) or
+implicitly (accessors) created cdata objects are garbage collected.
+You need to ensure to retain valid references to cdata objects
+somewhere on a Lua stack, an upvalue or in a Lua table while they are
+still in use. Once the last reference to a cdata object is gone, the
+garbage collector will automatically free the memory used by it (at
+the end of the next GC cycle).
+
+
+Please note that pointers themselves are cdata objects, however they
+are not followed by the garbage collector. So e.g. if you
+assign a cdata array to a pointer, you must keep the cdata object
+holding the array alive as long as the pointer is still in use:
+
+
+ffi.cdef[[
+typedef struct { int *a; } foo_t;
+]]
+
+local s = ffi.new("foo_t", ffi.new("int[10]")) -- WRONG!
+
+local a = ffi.new("int[10]") -- OK
+local s = ffi.new("foo_t", a)
+-- Now do something with 's', but keep 'a' alive until you're done.
+
+
+Similar rules apply for Lua strings which are implicitly converted to
+"const char *": the string object itself must be
+referenced somewhere or it'll be garbage collected eventually. The
+pointer will then point to stale data, which may have already been
+overwritten. Note that string literals are automatically kept
+alive as long as the function containing it (actually its prototype)
+is not garbage collected.
+
+
+Objects which are passed as an argument to an external C function
+are kept alive until the call returns. So it's generally safe to
+create temporary cdata objects in argument lists. This is a common
+idiom for passing specific C types to
+vararg functions.
+
+
+Memory areas returned by C functions (e.g. from malloc())
+must be manually managed, of course (or use
+ffi.gc()). Pointers to
+cdata objects are indistinguishable from pointers returned by C
+functions (which is one of the reasons why the GC cannot follow them).
+
+
+
Callbacks
+
+The LuaJIT FFI automatically generates special callback functions
+whenever a Lua function is converted to a C function pointer. This
+associates the generated callback function pointer with the C type
+of the function pointer and the Lua function object (closure).
+
+
+This can happen implicitly due to the usual conversions, e.g. when
+passing a Lua function to a function pointer argument. Or you can use
+ffi.cast() to explicitly cast a Lua function to a
+C function pointer.
+
+
+Currently only certain C function types can be used as callback
+functions. Neither C vararg functions nor functions with
+pass-by-value aggregate argument or result types are supported. There
+are no restrictions for the kind of Lua functions that can be called
+from the callback — no checks for the proper number of arguments
+are made. The return value of the Lua function will be converted to the
+result type and an error will be thrown for invalid conversions.
+
+
+It's allowed to throw errors across a callback invocation, but it's not
+advisable in general. Do this only if you know the C function, that
+called the callback, copes with the forced stack unwinding and doesn't
+leak resources.
+
+
+
Callback resource handling
+
+Callbacks take up resources — you can only have a limited number
+of them at the same time (500 - 1000, depending on the
+architecture). The associated Lua functions are anchored to prevent
+garbage collection, too.
+
+
+Callbacks due to implicit conversions are permanent! There is no
+way to guess their lifetime, since the C side might store the
+function pointer for later use (typical for GUI toolkits). The associated
+resources cannot be reclaimed until termination:
+
+
+ffi.cdef[[
+typedef int (__stdcall *WNDENUMPROC)(void *hwnd, intptr_t l);
+int EnumWindows(WNDENUMPROC func, intptr_t l);
+]]
+
+-- Implicit conversion to a callback via function pointer argument.
+local count = 0
+ffi.C.EnumWindows(function(hwnd, l)
+ count = count + 1
+ return true
+end, 0)
+-- The callback is permanent and its resources cannot be reclaimed!
+-- Ok, so this may not be a problem, if you do this only once.
+
+
+Note: this example shows that you must properly declare
+__stdcall callbacks on Windows/x86 systems. The calling
+convention cannot be automatically detected, unlike for
+__stdcall calls to Windows functions.
+
+
+For some use cases it's necessary to free up the resources or to
+dynamically redirect callbacks. Use an explicit cast to a
+C function pointer and keep the resulting cdata object. Then use
+the cb:free()
+or cb:set() methods
+on the cdata object:
+
+
+-- Explicitly convert to a callback via cast.
+local count = 0
+local cb = ffi.cast("WNDENUMPROC", function(hwnd, l)
+ count = count + 1
+ return true
+end)
+
+-- Pass it to a C function.
+ffi.C.EnumWindows(cb, 0)
+-- EnumWindows doesn't need the callback after it returns, so free it.
+
+cb:free()
+-- The callback function pointer is no longer valid and its resources
+-- will be reclaimed. The created Lua closure will be garbage collected.
+
+
+
Callback performance
+
+Callbacks are slow! First, the C to Lua transition itself
+has an unavoidable cost, similar to a lua_call() or
+lua_pcall(). Argument and result marshalling add to that cost.
+And finally, neither the C compiler nor LuaJIT can inline or
+optimize across the language barrier and hoist repeated computations out
+of a callback function.
+
+
+Do not use callbacks for performance-sensitive work: e.g. consider a
+numerical integration routine which takes a user-defined function to
+integrate over. It's a bad idea to call a user-defined Lua function from
+C code millions of times. The callback overhead will be absolutely
+detrimental for performance.
+
+
+It's considerably faster to write the numerical integration routine
+itself in Lua — the JIT compiler will be able to inline the
+user-defined function and optimize it together with its calling context,
+with very competitive performance.
+
+
+As a general guideline: use callbacks only when you must, because
+of existing C APIs. E.g. callback performance is irrelevant for a
+GUI application, which waits for user input most of the time, anyway.
+
+
+For new designs avoid push-style APIs (C function repeatedly
+calling a callback for each result). Instead use pull-style APIs
+(call a C function repeatedly to get a new result). Calls from Lua
+to C via the FFI are much faster than the other way round. Most well-designed
+libraries already use pull-style APIs (read/write, get/put).
+
+
+
C Library Namespaces
+
+A C library namespace is a special kind of object which allows
+access to the symbols contained in shared libraries or the default
+symbol namespace. The default
+ffi.C namespace is
+automatically created when the FFI library is loaded. C library
+namespaces for specific shared libraries may be created with the
+ffi.load() API
+function.
+
+
+Indexing a C library namespace object with a symbol name (a Lua
+string) automatically binds it to the library. First the symbol type
+is resolved — it must have been declared with
+ffi.cdef. Then the
+symbol address is resolved by searching for the symbol name in the
+associated shared libraries or the default symbol namespace. Finally,
+the resulting binding between the symbol name, the symbol type and its
+address is cached. Missing symbol declarations or nonexistent symbol
+names cause an error.
+
+
+This is what happens on a read access for the different kinds of
+symbols:
+
+
+
+
External functions: a cdata object with the type of the function
+and its address is returned.
+
+
External variables: the symbol address is dereferenced and the
+loaded value is converted to a Lua object
+and returned.
+
+
Constant values (static const or enum
+constants): the constant is converted to a
+Lua object and returned.
+
+
+
+This is what happens on a write access:
+
+
+
+
External variables: the value to be written is
+converted to the C type of the
+variable and then stored at the symbol address.
+
+
Writing to constant variables or to any other symbol type causes
+an error, like any other attempted write to a constant location.
+
+
+
+C library namespaces themselves are garbage collected objects. If
+the last reference to the namespace object is gone, the garbage
+collector will eventually release the shared library reference and
+remove all memory associated with the namespace. Since this may
+trigger the removal of the shared library from the memory of the
+running process, it's generally not safe to use function
+cdata objects obtained from a library if the namespace object may be
+unreferenced.
+
+
+Performance notice: the JIT compiler specializes to the identity of
+namespace objects and to the strings used to index it. This
+effectively turns function cdata objects into constants. It's not
+useful and actually counter-productive to explicitly cache these
+function objects, e.g. local strlen = ffi.C.strlen. OTOH it
+is useful to cache the namespace itself, e.g. local C =
+ffi.C.
+
+
+
No Hand-holding!
+
+The FFI library has been designed as a low-level library. The
+goal is to interface with C code and C data types with a
+minimum of overhead. This means you can do anything you can do
+from C: access all memory, overwrite anything in memory, call
+machine code at any memory address and so on.
+
+
+The FFI library provides no memory safety, unlike regular Lua
+code. It will happily allow you to dereference a NULL
+pointer, to access arrays out of bounds or to misdeclare
+C functions. If you make a mistake, your application might crash,
+just like equivalent C code would.
+
+
+This behavior is inevitable, since the goal is to provide full
+interoperability with C code. Adding extra safety measures, like
+bounds checks, would be futile. There's no way to detect
+misdeclarations of C functions, since shared libraries only
+provide symbol names, but no type information. Likewise there's no way
+to infer the valid range of indexes for a returned pointer.
+
+
+Again: the FFI library is a low-level library. This implies it needs
+to be used with care, but it's flexibility and performance often
+outweigh this concern. If you're a C or C++ developer, it'll be easy
+to apply your existing knowledge. OTOH writing code for the FFI
+library is not for the faint of heart and probably shouldn't be the
+first exercise for someone with little experience in Lua, C or C++.
+
+
+As a corollary of the above, the FFI library is not safe for use by
+untrusted Lua code. If you're sandboxing untrusted Lua code, you
+definitely don't want to give this code access to the FFI library or
+to any cdata object (except 64 bit integers or complex
+numbers). Any properly engineered Lua sandbox needs to provide safety
+wrappers for many of the standard Lua library functions —
+similar wrappers need to be written for high-level operations on FFI
+data types, too.
+
+
+
Current Status
+
+The initial release of the FFI library has some limitations and is
+missing some features. Most of these will be fixed in future releases.
+
C declarations are not passed through a C pre-processor,
+yet.
+
The C parser is able to evaluate most constant expressions
+commonly found in C header files. However it doesn't handle the
+full range of C expression semantics and may fail for some
+obscure constructs.
+
static const declarations only work for integer types
+up to 32 bits. Neither declaring string constants nor
+floating-point constants is supported.
+
Packed struct bitfields that cross container boundaries
+are not implemented.
+
Native vector types may be defined with the GCC mode or
+vector_size attribute. But no operations other than loading,
+storing and initializing them are supported, yet.
+
The volatile type qualifier is currently ignored by
+compiled code.
+The JIT compiler already handles a large subset of all FFI operations.
+It automatically falls back to the interpreter for unimplemented
+operations (you can check for this with the
+-jv command line option).
+The following operations are currently not compiled and may exhibit
+suboptimal performance, especially when used in inner loops:
+
+
+
Array/struct copies and bulk initializations.
+
Bitfield accesses and initializations.
+
Vector operations.
+
Table initializers.
+
Initialization of nested struct/union types.
+
Allocations of variable-length arrays or structs.
+
Allocations of C types with a size > 64 bytes or an
+alignment > 8 bytes.
+
Conversions from lightuserdata to void *.
+
Pointer differences for element sizes that are not a power of
+two.
+
Calls to C functions with aggregates passed or returned by
+value.
+
Calls to ctype metamethods which are not plain functions.
+
ctype __newindex tables and non-string lookups in ctype
+__index tables.
+
tostring() for cdata types.
+
Calls to the following ffi.* API
+functions: cdef, load, typeof,
+metatype, gc, sizeof, alignof,
+offsetof.
+This page is intended to give you an overview of the features of the FFI
+library by presenting a few use cases and guidelines.
+
+
+This page makes no attempt to explain all of the FFI library, though.
+You'll want to have a look at the ffi.* API
+function reference and the FFI
+semantics to learn more.
+
+
+
Loading the FFI Library
+
+The FFI library is built into LuaJIT by default, but it's not loaded
+and initialized by default. The suggested way to use the FFI library
+is to add the following to the start of every Lua file that needs one
+of its functions:
+
+
+local ffi = require("ffi")
+
+
+Please note this doesn't define an ffi variable in the table
+of globals — you really need to use the local variable. The
+require function ensures the library is only loaded once.
+
+
+
Accessing Standard System Functions
+
+The following code explains how to access standard system functions.
+We slowly print two lines of dots by sleeping for 10 milliseconds
+after each dot:
+
+
+
+①
+
+
+
+
+
+②
+③
+④
+
+
+
+⑤
+
+
+
+
+
+⑥local ffi = require("ffi")
+ffi.cdef[[
+void Sleep(int ms);
+int poll(struct pollfd *fds, unsigned long nfds, int timeout);
+]]
+
+local sleep
+if ffi.os == "Windows" then
+ function sleep(s)
+ ffi.C.Sleep(s*1000)
+ end
+else
+ function sleep(s)
+ ffi.C.poll(nil, 0, s*1000)
+ end
+end
+
+for i=1,160 do
+ io.write("."); io.flush()
+ sleep(0.01)
+end
+io.write("\n")
+
+
+Here's the step-by-step explanation:
+
+
+① This defines the
+C library functions we're going to use. The part inside the
+double-brackets (in green) is just standard C syntax. You can
+usually get this info from the C header files or the
+documentation provided by each C library or C compiler.
+
+
+② The difficulty we're
+facing here, is that there are different standards to choose from.
+Windows has a simple Sleep() function. On other systems there
+are a variety of functions available to achieve sub-second sleeps, but
+with no clear consensus. Thankfully poll() can be used for
+this task, too, and it's present on most non-Windows systems. The
+check for ffi.os makes sure we use the Windows-specific
+function only on Windows systems.
+
+
+③ Here we're wrapping the
+call to the C function in a Lua function. This isn't strictly
+necessary, but it's helpful to deal with system-specific issues only
+in one part of the code. The way we're wrapping it ensures the check
+for the OS is only done during initialization and not for every call.
+
+
+④ A more subtle point is
+that we defined our sleep() function (for the sake of this
+example) as taking the number of seconds, but accepting fractional
+seconds. Multiplying this by 1000 gets us milliseconds, but that still
+leaves it a Lua number, which is a floating-point value. Alas, the
+Sleep() function only accepts an integer value. Luckily for
+us, the FFI library automatically performs the conversion when calling
+the function (truncating the FP value towards zero, like in C).
+
+
+Some readers will notice that Sleep() is part of
+KERNEL32.DLL and is also a stdcall function. So how
+can this possibly work? The FFI library provides the ffi.C
+default C library namespace, which allows calling functions from
+the default set of libraries, like a C compiler would. Also, the
+FFI library automatically detects stdcall functions, so you
+don't need to declare them as such.
+
+
+⑤ The poll()
+function takes a couple more arguments we're not going to use. You can
+simply use nil to pass a NULL pointer and 0
+for the nfds parameter. Please note that the
+number 0does not convert to a pointer value,
+unlike in C++. You really have to pass pointers to pointer arguments
+and numbers to number arguments.
+
+
+The page on FFI semantics has all
+of the gory details about
+conversions between Lua
+objects and C types. For the most part you don't have to deal
+with this, as it's performed automatically and it's carefully designed
+to bridge the semantic differences between Lua and C.
+
+
+⑥ Now that we have defined
+our own sleep() function, we can just call it from plain Lua
+code. That wasn't so bad, huh? Turning these boring animated dots into
+a fascinating best-selling game is left as an exercise for the reader.
+:-)
+
+
+
Accessing the zlib Compression Library
+
+The following code shows how to access the zlib compression library from Lua code.
+We'll define two convenience wrapper functions that take a string and
+compress or uncompress it to another string:
+
+
+
+①
+
+
+
+
+
+
+②
+
+
+③
+
+④
+
+
+⑤
+
+
+⑥
+
+
+
+
+
+
+
+⑦local ffi = require("ffi")
+ffi.cdef[[
+unsigned long compressBound(unsigned long sourceLen);
+int compress2(uint8_t *dest, unsigned long *destLen,
+ const uint8_t *source, unsigned long sourceLen, int level);
+int uncompress(uint8_t *dest, unsigned long *destLen,
+ const uint8_t *source, unsigned long sourceLen);
+]]
+local zlib = ffi.load(ffi.os == "Windows" and "zlib1" or "z")
+
+local function compress(txt)
+ local n = zlib.compressBound(#txt)
+ local buf = ffi.new("uint8_t[?]", n)
+ local buflen = ffi.new("unsigned long[1]", n)
+ local res = zlib.compress2(buf, buflen, txt, #txt, 9)
+ assert(res == 0)
+ return ffi.string(buf, buflen[0])
+end
+
+local function uncompress(comp, n)
+ local buf = ffi.new("uint8_t[?]", n)
+ local buflen = ffi.new("unsigned long[1]", n)
+ local res = zlib.uncompress(buf, buflen, comp, #comp)
+ assert(res == 0)
+ return ffi.string(buf, buflen[0])
+end
+
+-- Simple test code.
+local txt = string.rep("abcd", 1000)
+print("Uncompressed size: ", #txt)
+local c = compress(txt)
+print("Compressed size: ", #c)
+local txt2 = uncompress(c, #txt)
+assert(txt2 == txt)
+
+
+Here's the step-by-step explanation:
+
+
+① This defines some of the
+C functions provided by zlib. For the sake of this example, some
+type indirections have been reduced and it uses the pre-defined
+fixed-size integer types, while still adhering to the zlib API/ABI.
+
+
+② This loads the zlib shared
+library. On POSIX systems it's named libz.so and usually
+comes pre-installed. Since ffi.load() automatically adds any
+missing standard prefixes/suffixes, we can simply load the
+"z" library. On Windows it's named zlib1.dll and
+you'll have to download it first from the
+» zlib site. The check for
+ffi.os makes sure we pass the right name to
+ffi.load().
+
+
+③ First, the maximum size of
+the compression buffer is obtained by calling the
+zlib.compressBound function with the length of the
+uncompressed string. The next line allocates a byte buffer of this
+size. The [?] in the type specification indicates a
+variable-length array (VLA). The actual number of elements of this
+array is given as the 2nd argument to ffi.new().
+
+
+④ This may look strange at
+first, but have a look at the declaration of the compress2
+function from zlib: the destination length is defined as a pointer!
+This is because you pass in the maximum buffer size and get back the
+actual length that was used.
+
+
+In C you'd pass in the address of a local variable
+(&buflen). But since there's no address-of operator in
+Lua, we'll just pass in a one-element array. Conveniently it can be
+initialized with the maximum buffer size in one step. Calling the
+actual zlib.compress2 function is then straightforward.
+
+
+⑤ We want to return the
+compressed data as a Lua string, so we'll use ffi.string().
+It needs a pointer to the start of the data and the actual length. The
+length has been returned in the buflen array, so we'll just
+get it from there.
+
+
+Note that since the function returns now, the buf and
+buflen variables will eventually be garbage collected. This
+is fine, because ffi.string() has copied the contents to a
+newly created (interned) Lua string. If you plan to call this function
+lots of times, consider reusing the buffers and/or handing back the
+results in buffers instead of strings. This will reduce the overhead
+for garbage collection and string interning.
+
+
+⑥ The uncompress
+functions does the exact opposite of the compress function.
+The compressed data doesn't include the size of the original string,
+so this needs to be passed in. Otherwise no surprises here.
+
+
+⑦ The code, that makes use
+of the functions we just defined, is just plain Lua code. It doesn't
+need to know anything about the LuaJIT FFI — the convenience
+wrapper functions completely hide it.
+
+
+One major advantage of the LuaJIT FFI is that you are now able to
+write those wrappers in Lua. And at a fraction of the time it
+would cost you to create an extra C module using the Lua/C API.
+Many of the simpler C functions can probably be used directly
+from your Lua code, without any wrappers.
+
+
+Side note: the zlib API uses the long type for passing
+lengths and sizes around. But all those zlib functions actually only
+deal with 32 bit values. This is an unfortunate choice for a
+public API, but may be explained by zlib's history — we'll just
+have to deal with it.
+
+
+First, you should know that a long is a 64 bit type e.g.
+on POSIX/x64 systems, but a 32 bit type on Windows/x64 and on
+32 bit systems. Thus a long result can be either a plain
+Lua number or a boxed 64 bit integer cdata object, depending on
+the target system.
+
+
+Ok, so the ffi.* functions generally accept cdata objects
+wherever you'd want to use a number. That's why we get a away with
+passing n to ffi.string() above. But other Lua
+library functions or modules don't know how to deal with this. So for
+maximum portability one needs to use tonumber() on returned
+long results before passing them on. Otherwise the
+application might work on some systems, but would fail in a POSIX/x64
+environment.
+
+
+
Defining Metamethods for a C Type
+
+The following code explains how to define metamethods for a C type.
+We define a simple point type and add some operations to it:
+
+① This defines the C type for a
+two-dimensional point object.
+
+
+② We have to declare the variable
+holding the point constructor first, because it's used inside of a
+metamethod.
+
+
+③ Let's define an __add
+metamethod which adds the coordinates of two points and creates a new
+point object. For simplicity, this function assumes that both arguments
+are points. But it could be any mix of objects, if at least one operand
+is of the required type (e.g. adding a point plus a number or vice
+versa). Our __len metamethod returns the distance of a point to
+the origin.
+
+
+④ If we run out of operators, we can
+define named methods, too. Here the __index table defines an
+area function. For custom indexing needs, one might want to
+define __index and __newindexfunctions instead.
+
+
+⑤ This associates the metamethods with
+our C type. This only needs to be done once. For convenience, a
+constructor is returned by
+ffi.metatype().
+We're not required to use it, though. The original C type can still
+be used e.g. to create an array of points. The metamethods automatically
+apply to any and all uses of this type.
+
+
+Please note that the association with a metatable is permanent and
+the metatable must not be modified afterwards! Ditto for the
+__index table.
+
+
+⑥ Here are some simple usage examples
+for the point type and their expected results. The pre-defined
+operations (such as a.x) can be freely mixed with the newly
+defined metamethods. Note that area is a method and must be
+called with the Lua syntax for methods: a:area(), not
+a.area().
+
+
+The C type metamethod mechanism is most useful when used in
+conjunction with C libraries that are written in an object-oriented
+style. Creators return a pointer to a new instance and methods take an
+instance pointer as the first argument. Sometimes you can just point
+__index to the library namespace and __gc to the
+destructor and you're done. But often enough you'll want to add
+convenience wrappers, e.g. to return actual Lua strings or when
+returning multiple values.
+
+
+Some C libraries only declare instance pointers as an opaque
+void * type. In this case you can use a fake type for all
+declarations, e.g. a pointer to a named (incomplete) struct will do:
+typedef struct foo_type *foo_handle. The C side doesn't
+know what you declare with the LuaJIT FFI, but as long as the underlying
+types are compatible, everything still works.
+
+
+
Translating C Idioms
+
+Here's a list of common C idioms and their translation to the
+LuaJIT FFI:
+
+
+
+
Idiom
+
C code
+
Lua code
+
+
+
Pointer dereference int *p;
x = *p; *p = y;
x = p[0] p[0] = y
+
+
Pointer indexing int i, *p;
x = p[i]; p[i+1] = y;
x = p[i] p[i+1] = y
+
+
Array indexing int i, a[];
x = a[i]; a[i+1] = y;
x = a[i] a[i+1] = y
+
+
struct/union dereference struct foo s;
x = s.field; s.field = y;
x = s.field s.field = y
+
+
struct/union pointer deref. struct foo *sp;
x = sp->field; sp->field = y;
x = s.field s.field = y
+
+
Pointer arithmetic int i, *p;
x = p + i; y = p - i;
x = p + i y = p - i
+
+
Pointer difference int *p1, *p2;
x = p1 - p2;
x = p1 - p2
+
+
Array element pointer int i, a[];
x = &a[i];
x = a+i
+
+
Cast pointer to address int *p;
x = (intptr_t)p;
x = tonumber( ffi.cast("intptr_t", p))
+
+
Functions with outargs void foo(int *inoutlen);
int len = x; foo(&len); y = len;
local len = ffi.new("int[1]", x) foo(len) y = len[0]
+This replaces several hash-table lookups with a (faster) direct use of
+a local or an upvalue. This is less important with LuaJIT, since the
+JIT compiler optimizes hash-table lookups a lot and is even able to
+hoist most of them out of the inner loops. It can't eliminate
+all of them, though, and it saves some typing for often-used
+functions. So there's still a place for this, even with LuaJIT.
+
+
+The situation is a bit different with C function calls via the
+FFI library. The JIT compiler has special logic to eliminate all
+of the lookup overhead for functions resolved from a
+C library namespace!
+Thus it's not helpful and actually counter-productive to cache
+individual C functions like this:
+
+
+local funca, funcb = ffi.C.funcb, ffi.C.funcb -- Not helpful!
+local function foo(x, n)
+ for i=1,n do funcb(funca(x, i), 1) end
+end
+
+
+This turns them into indirect calls and generates bigger and slower
+machine code. Instead you'll want to cache the namespace itself and
+rely on the JIT compiler to eliminate the lookups:
+
+
+local C = ffi.C -- Instead use this!
+local function foo(x, n)
+ for i=1,n do C.funcb(C.funca(x, i), 1) end
+end
+
+
+This generates both shorter and faster code. So don't cache
+C functions, but do cache namespaces! Most often the
+namespace is already in a local variable at an outer scope, e.g. from
+local lib = ffi.load(...). Note that copying
+it to a local variable in the function scope is unnecessary.
+
+The functions in this built-in module control the behavior of the JIT
+compiler engine. Note that JIT-compilation is fully automatic —
+you probably won't need to use any of the following functions unless
+you have special needs.
+
+
+
jit.on()
+jit.off()
+
+Turns the whole JIT compiler on (default) or off.
+
+
+These functions are typically used with the command line options
+-j on or -j off.
+
+jit.on enables JIT compilation for a Lua function (this is
+the default).
+
+
+jit.off disables JIT compilation for a Lua function and
+flushes any already compiled code from the code cache.
+
+
+jit.flush flushes the code, but doesn't affect the
+enable/disable status.
+
+
+The current function, i.e. the Lua function calling this library
+function, can also be specified by passing true as the first
+argument.
+
+
+If the second argument is true, JIT compilation is also
+enabled, disabled or flushed recursively for all sub-functions of a
+function. With false only the sub-functions are affected.
+
+
+The jit.on and jit.off functions only set a flag
+which is checked when the function is about to be compiled. They do
+not trigger immediate compilation.
+
+
+Typical usage is jit.off(true, true) in the main chunk
+of a module to turn off JIT compilation for the whole module for
+debugging purposes.
+
+
+
jit.flush(tr)
+
+Flushes the root trace, specified by its number, and all of its side
+traces from the cache. The code for the trace will be retained as long
+as there are any other traces which link to it.
+
+
+
status, ... = jit.status()
+
+Returns the current status of the JIT compiler. The first result is
+either true or false if the JIT compiler is turned
+on or off. The remaining results are strings for CPU-specific features
+and enabled optimizations.
+
+
+
jit.version
+
+Contains the LuaJIT version string.
+
+
+
jit.version_num
+
+Contains the version number of the LuaJIT core. Version xx.yy.zz
+is represented by the decimal number xxyyzz.
+
+
+
jit.os
+
+Contains the target OS name:
+"Windows", "Linux", "OSX", "BSD", "POSIX" or "Other".
+
+
+
jit.arch
+
+Contains the target architecture name:
+"x86", "x64" or "ppcspe".
+
+
+
jit.opt.* — JIT compiler optimization control
+
+This sub-module provides the backend for the -O command line
+option.
+
+
+You can also use it programmatically, e.g.:
+
+
+jit.opt.start(2) -- same as -O2
+jit.opt.start("-dce")
+jit.opt.start("hotloop=10", "hotexit=2")
+
+
+Unlike in LuaJIT 1.x, the module is built-in and
+optimization is turned on by default!
+It's no longer necessary to run require("jit.opt").start(),
+which was one of the ways to enable optimization.
+
+
+
jit.util.* — JIT compiler introspection
+
+This sub-module holds functions to introspect the bytecode, generated
+traces, the IR and the generated machine code. The functionality
+provided by this module is still in flux and therefore undocumented.
+
+
+The debug modules -jbc, -jv and -jdump make
+extensive use of these functions. Please check out their source code,
+if you want to know more.
+
+LuaJIT is also fully ABI-compatible to Lua 5.1 at the linker/dynamic
+loader level. This means you can compile a C module against the
+standard Lua headers and load the same shared library from either Lua
+or LuaJIT.
+
+
+LuaJIT extends the standard Lua VM with new functionality and adds
+several extension modules. Please note that this page is only about
+functional enhancements and not about performance enhancements,
+such as the optimized VM, the faster interpreter or the JIT compiler.
+
+
+
Extensions Modules
+
+LuaJIT comes with several built-in extension modules:
+
+
+
bit.* — Bitwise operations
+
+LuaJIT supports all bitwise operations as defined by
+» Lua BitOp:
+
+This module is a LuaJIT built-in — you don't need to download or
+install Lua BitOp. The Lua BitOp site has full documentation for all
+» Lua BitOp API functions.
+
+
+Please make sure to require the module before using any of
+its functions:
+
+
+local bit = require("bit")
+
+
+An already installed Lua BitOp module is ignored by LuaJIT.
+This way you can use bit operations from both Lua and LuaJIT on a
+shared installation.
+
+
+
ffi.* — FFI library
+
+The FFI library allows calling external
+C functions and the use of C data structures from pure Lua
+code.
+
+Unlike the standard implementation in Lua 5.1, xpcall()
+passes any arguments after the error function to the function
+which is called in a protected context.
+
+
+
loadfile() etc. handle UTF-8 source code
+
+Non-ASCII characters are handled transparently by the Lua source code parser.
+This allows the use of UTF-8 characters in identifiers and strings.
+A UTF-8 BOM is skipped at the start of the source code.
+
+
+
tostring() etc. canonicalize NaN and ±Inf
+
+All number-to-string conversions consistently convert non-finite numbers
+to the same strings on all platforms. NaN results in "nan",
+positive infinity results in "inf" and negative infinity results
+in "-inf".
+
+An extra argument has been added to string.dump(). If set to
+true, 'stripped' bytecode without debug information is
+generated. This speeds up later bytecode loading and reduces memory
+usage. See also the
+-b command line option.
+
+
+The generated bytecode is portable and can be loaded on any architecture
+that LuaJIT supports, independent of word size or endianess. However the
+bytecode compatibility versions must match. Bytecode stays compatible
+for dot releases (x.y.0 → x.y.1), but may change with major or
+minor releases (2.0 → 2.1) or between any beta release. Foreign
+bytecode (e.g. from Lua 5.1) is incompatible and cannot be loaded.
+
+
+
Enhanced PRNG for math.random()
+
+LuaJIT uses a Tausworthe PRNG with period 2^223 to implement
+math.random() and math.randomseed(). The quality of
+the PRNG results is much superior compared to the standard Lua
+implementation which uses the platform-specific ANSI rand().
+
+
+The PRNG generates the same sequences from the same seeds on all
+platforms and makes use of all bits in the seed argument.
+math.random() without arguments generates 52 pseudo-random bits
+for every call. The result is uniformly distributed between 0 and 1.
+It's correctly scaled up and rounded for math.random(n [,m]) to
+preserve uniformity.
+
+
+
io.* functions handle 64 bit file offsets
+
+The file I/O functions in the standard io.* library handle
+64 bit file offsets. In particular this means it's possible
+to open files larger than 2 Gigabytes and to reposition or obtain
+the current file position for offsets beyond 2 GB
+(fp:seek() method).
+
+
+
debug.* functions identify metamethods
+
+debug.getinfo() and lua_getinfo() also return information
+about invoked metamethods. The namewhat field is set to
+"metamethod" and the name field has the name of
+the corresponding metamethod (e.g. "__index").
+
+
+
Fully Resumable VM
+
+The LuaJIT 2.x VM is fully resumable. This means you can yield from a
+coroutine even across contexts, where this would not possible with
+the standard Lua 5.1 VM: e.g. you can yield across pcall()
+and xpcall(), across iterators and across metamethods.
+
+
+Note however that LuaJIT 2.x doesn't use
+» Coco anymore. This means the
+overhead for creating coroutines is much smaller and no extra
+C stacks need to be allocated. OTOH you can no longer yield
+across arbitrary C functions. Keep this in mind when
+upgrading from LuaJIT 1.x.
+
+
+
C++ Exception Interoperability
+
+LuaJIT has built-in support for interoperating with C++ exceptions.
+The available range of features depends on the target platform and
+the toolchain used to compile LuaJIT:
+
+
+
+
Platform
+
Compiler
+
Interoperability
+
+
+
POSIX/x64, DWARF2 unwinding
+
GCC 4.3+
+
Full
+
+
+
Other platforms, DWARF2 unwinding
+
GCC
+
Limited
+
+
+
Windows/x64
+
MSVC or WinSDK
+
Full
+
+
+
Windows/x86
+
Any
+
No
+
+
+
Other platforms
+
Other compilers
+
No
+
+
+
+Full interoperability means:
+
+
+
C++ exceptions can be caught on the Lua side with pcall(),
+lua_pcall() etc.
+
C++ exceptions will be converted to the generic Lua error
+"C++ exception", unless you use the
+C call wrapper feature.
+
It's safe to throw C++ exceptions across non-protected Lua frames
+on the C stack. The contents of the C++ exception object
+pass through unmodified.
+
Lua errors can be caught on the C++ side with catch(...).
+The corresponding Lua error message can be retrieved from the Lua stack.
+
Throwing Lua errors across C++ frames is safe. C++ destructors
+will be called.
+
+
+Limited interoperability means:
+
+
+
C++ exceptions can be caught on the Lua side with pcall(),
+lua_pcall() etc.
+
C++ exceptions will be converted to the generic Lua error
+"C++ exception", unless you use the
+C call wrapper feature.
+
C++ exceptions will be caught by non-protected Lua frames and
+are rethrown as a generic Lua error. The C++ exception object will
+be destroyed.
+
Lua errors cannot be caught on the C++ side.
+
Throwing Lua errors across C++ frames will not call
+C++ destructors.
+
+
+
+No interoperability means:
+
+
+
It's not safe to throw C++ exceptions across Lua frames.
+
C++ exceptions cannot be caught on the Lua side.
+
Lua errors cannot be caught on the C++ side.
+
Throwing Lua errors across C++ frames will not call
+C++ destructors.
+
Additionally, on Windows/x86 with SEH-based C++ exceptions:
+it's not safe to throw a Lua error across any frames containing
+a C++ function with any try/catch construct or using variables with
+(implicit) destructors. This also applies to any functions which may be
+inlined in such a function. It doesn't matter whether lua_error()
+is called inside or outside of a try/catch or whether any object actually
+needs to be destroyed: the SEH chain is corrupted and this will eventually
+lead to the termination of the process.
The community-managed » Lua Wiki
+has information about diverse topics.
+
The primary source of information for the latest developments surrounding
+Lua is the » Lua mailing list.
+You can check out the » mailing
+list archive or
+» subscribe
+to the list (you need to be subscribed before posting).
+This is also the place where announcements and discussions about LuaJIT
+take place.
+
+
+
+
+
Q: Where can I learn more about the compiler technology used by LuaJIT?
Q: Why do I get this error: "attempt to index global 'arg' (a nil value)"?
+Q: My vararg functions fail after switching to LuaJIT!
+
LuaJIT is compatible to the Lua 5.1 language standard. It doesn't
+support the implicit arg parameter for old-style vararg
+functions from Lua 5.0. Please convert your code to the
+» Lua 5.1
+vararg syntax.
+
+
+
+
Q: Why do I get this error: "bad FPU precision"?
+
Q: I get weird behavior after initializing Direct3D.
+
Q: Some FPU operations crash after I load a Delphi DLL.
+
+
+
+DirectX/Direct3D (up to version 9) sets the x87 FPU to single-precision
+mode by default. This violates the Windows ABI and interferes with the
+operation of many programs — LuaJIT is affected, too. Please make
+sure you always use the D3DCREATE_FPU_PRESERVE flag when
+initializing Direct3D.
+
+Direct3D version 10 or higher do not show this behavior anymore.
+Consider testing your application with older versions, too.
+
+Similarly, the Borland/Delphi runtime modifies the FPU control word and
+enables FP exceptions. Of course this violates the Windows ABI, too.
+Please check the Delphi docs for the Set8087CW method.
+
+
+
+
+
Q: Sometimes Ctrl-C fails to stop my Lua program. Why?
+
The interrupt signal handler sets a Lua debug hook. But this is
+currently ignored by compiled code (this will eventually be fixed). If
+your program is running in a tight loop and never falls back to the
+interpreter, the debug hook never runs and can't throw the
+"interrupted!" error. In the meantime you have to press Ctrl-C
+twice to get stop your program. That's similar to when it's stuck
+running inside a C function under the Lua interpreter.
+
+
+
+
Q: Why doesn't my favorite power-patch for Lua apply against LuaJIT?
+
Because it's a completely redesigned VM and has very little code
+in common with Lua anymore. Also, if the patch introduces changes to
+the Lua semantics, these would need to be reflected everywhere in the
+VM, from the interpreter up to all stages of the compiler. Please
+use only standard Lua language constructs. For many common needs you
+can use source transformations or use wrapper or proxy functions.
+The compiler will happily optimize away such indirections.
+
+
+
+
Q: Lua runs everywhere. Why doesn't LuaJIT support my CPU?
+
Because it's a compiler — it needs to generate native
+machine code. This means the code generator must be ported to each
+architecture. And the fast interpreter is written in assembler and
+must be ported, too. This is quite an undertaking.
+The install documentation shows the supported
+architectures. Other architectures will follow based on sufficient user
+demand and/or sponsoring.
+
+
+
+
Q: When will feature X be added? When will the next version be released?
+
When it's ready.
+C'mon, it's open source — I'm doing it on my own time and you're
+getting it for free. You can either contribute a patch or sponsor
+the development of certain features, if they are important to you.
+
+LuaJIT is only distributed as a source package. This page explains
+how to build and install LuaJIT with different operating systems
+and C compilers.
+
+
+For the impatient (on POSIX systems):
+
+
+make && sudo make install
+
+
+LuaJIT currently builds out-of-the box on most systems.
+Here's the compatibility matrix for the supported combinations of
+operating systems, CPUs and compilers:
+
+The standard configuration should work fine for most installations.
+Usually there is no need to tweak the settings. The following files
+hold all user-configurable settings:
+
+
+
src/luaconf.h sets some configuration variables.
+
Makefile has settings for installing LuaJIT (POSIX
+only).
+
src/Makefile has settings for compiling LuaJIT
+under POSIX, MinGW or Cygwin.
+
src/msvcbuild.bat has settings for compiling LuaJIT with
+MSVC or WinSDK.
+
+
+Please read the instructions given in these files, before changing
+any settings.
+
+
+
POSIX Systems (Linux, OSX, *BSD etc.)
+
Prerequisites
+
+Depending on your distribution, you may need to install a package for
+GCC, the development headers and/or a complete SDK. E.g. on a current
+Debian/Ubuntu, install libc6-dev with the package manager.
+
+
+Download the current source package of LuaJIT (pick the .tar.gz),
+if you haven't already done so. Move it to a directory of your choice,
+open a terminal window and change to this directory. Now unpack the archive
+and change to the newly created directory:
+
+The supplied Makefiles try to auto-detect the settings needed for your
+operating system and your compiler. They need to be run with GNU Make,
+which is probably the default on your system, anyway. Simply run:
+
+
+make
+
+
+This always builds a native x86, x64 or PPC binary, depending on the host OS
+you're running this command on. Check the section on
+cross-compilation for more options.
+
+
+By default, modules are only searched under the prefix /usr/local.
+You can add an extra prefix to the search paths by appending the
+PREFIX option, e.g.:
+
+
+make PREFIX=/home/myself/lj2
+
+
+Note for OSX: MACOSX_DEPLOYMENT_TARGET is set to 10.4
+in src/Makefile. Change it, if you want to build on an older version.
+
+
Installing LuaJIT
+
+The top-level Makefile installs LuaJIT by default under
+/usr/local, i.e. the executable ends up in
+/usr/local/bin and so on. You need root privileges
+to write to this path. So, assuming sudo is installed on your system,
+run the following command and enter your sudo password:
+
+
+sudo make install
+
+
+Otherwise specify the directory prefix as an absolute path, e.g.:
+
+
+make install PREFIX=/home/myself/lj2
+
+
+Obviously the prefixes given during build and installation need to be the same.
+
+
+Note: to avoid overwriting a previous version, the beta test releases
+only install the LuaJIT executable under the versioned name (i.e.
+luajit-2.0.0-beta9). You probably want to create a symlink
+for convenience, with a command like this:
+
+Either install one of the open source SDKs
+(» MinGW or
+» Cygwin), which come with a modified
+GCC plus the required development headers.
+
+
+Or install Microsoft's Visual C++ (MSVC). The freely downloadable
+» Express Edition
+works just fine, but only contains an x86 compiler.
+
+
+The freely downloadable
+» Windows SDK
+only comes with command line tools, but this is all you need to build LuaJIT.
+It contains x86 and x64 compilers.
+
+
+Next, download the source package and unpack it using an archive manager
+(e.g. the Windows Explorer) to a directory of your choice.
+
+
Building with MSVC
+
+Open a "Visual Studio .NET Command Prompt", cd to the
+directory where you've unpacked the sources and run these commands:
+
+
+cd src
+msvcbuild
+
+
+Then follow the installation instructions below.
+
+
Building with the Windows SDK
+
+Open a "Windows SDK Command Shell" and select the x86 compiler:
+
+
+setenv /release /x86
+
+
+Or select the x64 compiler:
+
+
+setenv /release /x64
+
+
+Then cd to the directory where you've unpacked the sources
+and run these commands:
+
+
+cd src
+msvcbuild
+
+
+Then follow the installation instructions below.
+
+
Building with MinGW or Cygwin
+
+Open a command prompt window and make sure the MinGW or Cygwin programs
+are in your path. Then cd to the directory where
+you've unpacked the sources and run this command for MinGW:
+
+
+mingw32-make
+
+
+Or this command for Cygwin:
+
+
+make
+
+
+Then follow the installation instructions below.
+
+
Installing LuaJIT
+
+Copy luajit.exe and lua51.dll (built in the src
+directory) to a newly created directory (any location is ok).
+Add lua and lua\jit directories below it and copy
+all Lua files from the lib directory of the distribution
+to the latter directory.
+
+
+There are no hardcoded
+absolute path names — all modules are loaded relative to the
+directory where luajit.exe is installed
+(see src/luaconf.h).
+
+
+
Cross-compiling LuaJIT
+
+The build system has limited support for cross-compilation. For details
+check the comments in src/Makefile. Here are some popular examples:
+
+
+You can cross-compile to a 32 bit binary on a multilib x64 OS by
+installing the multilib development packages (e.g. libc6-dev-i386
+on Debian/Ubuntu) and running:
+
+
+make CC="gcc -m32"
+
+
+You can cross-compile for a Windows target on Debian/Ubuntu by
+installing the mingw32 package and running:
+
+You can cross-compile for an ARM target on an x86 or x64 host
+system using a standard GNU cross-compile toolchain (Binutils, GCC,
+EGLIBC). The CROSS prefix may vary depending on the
+--target of the toolchain:
+
+You can cross-compile for Android (ARM) using the » Android NDK.
+The environment variables need to match the install locations and the
+desired target platform. E.g. Android 2.2 corresponds to ABI level 8:
+
+You can cross-compile for iOS 3.0+ (iPhone/iPad) using the » iOS SDK.
+The environment variables need to match the iOS SDK version:
+
+
+Note: the JIT compiler is disabled for iOS, because regular iOS Apps
+are not allowed to generate code at runtime. You'll only get the performance
+of the LuaJIT interpreter on iOS. This is still faster than plain Lua, but
+much slower than the JIT compiler. Please complain to Apple, not me.
+Or use Android. :-p
+
+You can cross-compile for a PPC target or a
+PPC/e500v2 target on x86 or x64 host systems using a standard
+GNU cross-compile toolchain (Binutils, GCC, EGLIBC).
+The CROSS prefix may vary depending on the --target
+of the toolchain:
+
+Whenever the host OS and the target OS differ, you need to specify
+TARGET_SYS or you'll get assembler or linker errors. E.g. if
+you're compiling on a Windows or OSX host for embedded Linux or Android,
+you need to add TARGET_SYS=Linux to the examples above. For a
+minimal target OS, you may need to disable the built-in allocator in
+src/Makefile and use TARGET_SYS=Other.
+
+
+
Embedding LuaJIT
+
+LuaJIT is API-compatible with Lua 5.1. If you've already embedded Lua
+into your application, you probably don't need to do anything to switch
+to LuaJIT, except link with a different library:
+
+
+
It's strongly suggested to build LuaJIT separately using the supplied
+build system. Please do not attempt to integrate the individual
+source files into your build tree. You'll most likely get the internal build
+dependencies wrong or mess up the compiler flags. Treat LuaJIT like any
+other external library and link your application with either the dynamic
+or static library, depending on your needs.
+
If you want to load C modules compiled for plain Lua
+with require(), you need to make sure the public symbols
+(e.g. lua_pushnumber) are exported, too:
+
On POSIX systems you can either link to the shared library
+or link the static library into your application. In the latter case
+you'll need to export all public symbols from your main executable
+(e.g. -Wl,-E on Linux) and add the external dependencies
+(e.g. -lm -ldl on Linux).
+
Since Windows symbols are bound to a specific DLL name, you need to
+link to the lua51.dll created by the LuaJIT build (do not rename
+the DLL). You may link LuaJIT statically on Windows only if you don't
+intend to load Lua/C modules at runtime.
+
+
+
+If you're building a 64 bit application on OSX which links directly or
+indirectly against LuaJIT, you need to link your main executable
+with these flags:
+
+-pagezero_size 10000 -image_base 100000000
+
+Also, it's recommended to rebase all (self-compiled) shared libraries
+which are loaded at runtime on OSX/x64 (e.g. C extension modules for Lua).
+See: man rebase
+
+
+
Additional hints for initializing LuaJIT using the C API functions:
+
+
Here's a
+» simple example
+for embedding Lua or LuaJIT into your application.
+
Make sure you use luaL_newstate. Avoid using
+lua_newstate, since this uses the (slower) default memory
+allocator from your system (no support for this on x64).
+
Make sure you use luaL_openlibs and not the old Lua 5.0 style
+of calling luaopen_base etc. directly.
+
To change or extend the list of standard libraries to load, copy
+src/lib_init.c to your project and modify it accordingly.
+Make sure the jit library is loaded or the JIT compiler
+will not be activated.
+
The bit.* module for bitwise operations
+is already built-in. There's no need to statically link
+» Lua BitOp to your application.
+
+
+
Hints for Distribution Maintainers
+
+The LuaJIT build system has extra provisions for the needs of most
+POSIX-based distributions. If you're a package maintainer for
+a distribution, please make use of these features and
+avoid patching, subverting, autotoolizing or messing up the build system
+in unspeakable ways.
+
+
+There should be absolutely no need to patch luaconf.h or any
+of the Makefiles. And please do not hand-pick files for your packages —
+simply use whatever make install creates. There's a reason
+for all of the files and directories it creates.
+
+
+The build system uses GNU make and auto-detects most settings based on
+the host you're building it on. This should work fine for native builds,
+even when sandboxed. You may need to pass some of the following flags to
+both the make and the make install command lines
+for a regular distribution build:
+
+
+
PREFIX overrides the installation path and should usually
+be set to /usr. Setting this also changes the module paths and
+the -rpath of the shared library.
+
DESTDIR is an absolute path which allows you to install
+to a shadow tree instead of the root tree of the build system.
+
Have a look at the top-level Makefile and src/Makefile
+for additional variables to tweak. The following variables may be
+overridden, but it's not recommended, except for special needs
+like cross-builds:
+BUILDMODE, CC, HOST_CC, STATIC_CC, DYNAMIC_CC, CFLAGS, HOST_CFLAGS,
+TARGET_CFLAGS, LDFLAGS, HOST_LDFLAGS, TARGET_LDFLAGS, TARGET_SHLDFLAGS,
+TARGET_FLAGS, LIBS, HOST_LIBS, TARGET_LIBS, CROSS, HOST_SYS, TARGET_SYS
+
+
+
+The build system has a special target for an amalgamated build, i.e.
+make amalg. This compiles the LuaJIT core as one huge C file
+and allows GCC to generate faster and shorter code. Alas, this requires
+lots of memory during the build. This may be a problem for some users,
+that's why it's not enabled by default. But it shouldn't be a problem for
+most build farms. It's recommended that binary distributions use this
+target for their LuaJIT builds.
+
+Finally, if you encounter any difficulties, please
+contact me first, instead of releasing a broken
+package onto unsuspecting users. Because they'll usually gonna complain
+to me (the upstream) and not you (the package maintainer), anyway.
+
+* Lua is a powerful, dynamic and light-weight programming language
+designed for extending applications. Lua is also frequently used as a
+general-purpose, stand-alone language. More information about
+Lua can be found at: » http://www.lua.org/
+
+
Compatibility
+
+LuaJIT implements the full set of language features defined by Lua 5.1.
+The virtual machine (VM) is API- and ABI-compatible to the
+standard Lua interpreter and can be deployed as a drop-in replacement.
+
+
+LuaJIT offers more performance, at the expense of portability. It
+currently runs on all popular operating systems based on
+x86 or x64 CPUs (Linux, Windows, OSX etc.) or embedded
+systems based on ARM (Android, iOS) or PPC CPUs.
+Other platforms will be supported in the future, based on user demand
+and sponsoring.
+
+
+
Overview
+
+LuaJIT has been successfully used as a scripting middleware in
+games, 3D modellers, numerical simulations, trading platforms and many
+other specialty applications. It combines high flexibility with high
+performance and an unmatched low memory footprint: less than
+125K for the VM plus less than 85K for the JIT compiler (on x86).
+
+
+LuaJIT has been in continuous development since 2005. It's widely
+considered to be one of the fastest dynamic language
+implementations. It has outperformed other dynamic languages on many
+cross-language benchmarks since its first release — often by a
+substantial margin. In 2009 other dynamic language VMs started to catch up
+with the performance of LuaJIT 1.x. Well, I couldn't let that slide. ;-)
+
+
+2009 also marks the first release of the long-awaited LuaJIT 2.0.
+The whole VM has been rewritten from the ground up and relentlessly
+optimized for performance. It combines a high-speed interpreter,
+written in assembler, with a state-of-the-art JIT compiler.
+
+
+An innovative trace compiler is integrated with advanced,
+SSA-based optimizations and a highly tuned code generation backend. This
+allows a substantial reduction of the overhead associated with dynamic
+language features.
+
+
+It's destined to break into the » performance
+range traditionally reserved for offline, static language compilers.
+
+
+
More ...
+
+Click on the LuaJIT sub-topics in the navigation bar to learn more
+about LuaJIT.
+
+
+Click on the Logo in the upper left corner to visit
+the LuaJIT project page on the web. All other links to online
+resources are marked with a '»'.
+
+LuaJIT has only a single stand-alone executable, called luajit on
+POSIX systems or luajit.exe on Windows. It can be used to run simple
+Lua statements or whole Lua applications from the command line. It has an
+interactive mode, too.
+
+
+Note: the beta test releases only install under the versioned name on
+POSIX systems (to avoid overwriting a previous version). You either need
+to type luajit-2.0.0-beta9 to start it or create a symlink
+with a command like this:
+
+Unlike previous versions optimization is turned on by default in
+LuaJIT 2.0! It's no longer necessary to use luajit -O.
+
+
+
Command Line Options
+
+The luajit stand-alone executable is just a slightly modified
+version of the regular lua stand-alone executable.
+It supports the same basic options, too. luajit -h
+prints a short list of the available options. Please have a look at the
+» Lua manual
+for details.
+
+
+LuaJIT has some additional options:
+
+
+
-b[options] input output
+
+This option saves or lists bytecode. The following additional options
+are accepted:
+
+
+
-l — Only list bytecode.
+
-s — Strip debug info (this is the default).
+
-g — Keep debug info.
+
-n name — Set module name (default: auto-detect from input name)
+
-t type — Set output file type (default: auto-detect from output name).
+
-a arch — Override architecture for object files (default: native).
+
-o os — Override OS for object files (default: native).
+
-e chunk — Use chunk string as input.
+
- (a single minus sign) — Use stdin as input and/or stdout as output.
+
+
+The output file type is auto-detected from the extension of the output
+file name:
+
+
+
c — C source file, exported bytecode data.
+
h — C header file, static bytecode data.
+
obj or o — Object file, exported bytecode data
+(OS- and architecture-specific).
+
raw or any other extension — Raw bytecode file (portable).
+
+
+Notes:
+
+
+
See also string.dump()
+for information on bytecode portability and compatibility.
+
A file in raw bytecode format is auto-detected and can be loaded like
+any Lua source file. E.g. directly from the command line or with
+loadfile(), dofile() etc.
+
To statically embed the bytecode of a module in your application,
+generate an object file and just link it with your application.
+
On most ELF-based systems (e.g. Linux) you need to explicitly export the
+global symbols when linking your application, e.g. with: -Wl,-E
+
require() tries to load embedded bytecode data from exported
+symbols (in *.exe or lua51.dll on Windows) and from
+shared libraries in package.cpath.
+
+
+Typical usage examples:
+
+
+luajit -b test.lua test.out # Save bytecode to test.out
+luajit -bg test.lua test.out # Keep debug info
+luajit -be "print('hello world')" test.out # Save cmdline script
+
+luajit -bl test.lua # List to stdout
+luajit -bl test.lua test.txt # List to test.txt
+luajit -ble "print('hello world')" # List cmdline script
+
+luajit -b test.lua test.obj # Generate object file
+# Link test.obj with your application and load it with require("test")
+
+
+
-j cmd[=arg[,arg...]]
+
+This option performs a LuaJIT control command or activates one of the
+loadable extension modules. The command is first looked up in the
+jit.* library. If no matching function is found, a module
+named jit.<cmd> is loaded and the start()
+function of the module is called with the specified arguments (if
+any). The space between -j and cmd is optional.
+
+
+Here are the available LuaJIT control commands:
+
+
+
-jon — Turns the JIT compiler on (default).
+
-joff — Turns the JIT compiler off (only use the interpreter).
+
-jflush — Flushes the whole cache of compiled code.
+
-jv — Shows verbose information about the progress of the JIT compiler.
+
-jdump — Dumps the code and structures used in various compiler stages.
+
+
+The -jv and -jdump commands are extension modules
+written in Lua. They are mainly used for debugging the JIT compiler
+itself. For a description of their options and output format, please
+read the comment block at the start of their source.
+They can be found in the lib directory of the source
+distribution or installed under the jit directory. By default
+this is /usr/local/share/luajit-2.0.0-beta9/jit on POSIX
+systems.
+
+
+
-O[level]
+-O[+]flag-O-flag
+-Oparam=value
+
+This options allows fine-tuned control of the optimizations used by
+the JIT compiler. This is mainly intended for debugging LuaJIT itself.
+Please note that the JIT compiler is extremely fast (we are talking
+about the microsecond to millisecond range). Disabling optimizations
+doesn't have any visible impact on its overhead, but usually generates
+code that runs slower.
+
+
+The first form sets an optimization level — this enables a
+specific mix of optimization flags. -O0 turns off all
+optimizations and higher numbers enable more optimizations. Omitting
+the level (i.e. just -O) sets the default optimization level,
+which is -O3 in the current version.
+
+
+The second form adds or removes individual optimization flags.
+The third form sets a parameter for the VM or the JIT compiler
+to a specific value.
+
+
+You can either use this option multiple times (like -Ocse
+-O-dce -Ohotloop=10) or separate several settings with a comma
+(like -O+cse,-dce,hotloop=10). The settings are applied from
+left to right and later settings override earlier ones. You can freely
+mix the three forms, but note that setting an optimization level
+overrides all earlier flags.
+
+
+Here are the available flags and at what optimization levels they
+are enabled:
+
+
+
+
Flag
+
-O1
+
-O2
+
-O3
+
+
+
+
fold
•
•
•
Constant Folding, Simplifications and Reassociation
+
+
cse
•
•
•
Common-Subexpression Elimination
+
+
dce
•
•
•
Dead-Code Elimination
+
+
narrow
•
•
Narrowing of numbers to integers
+
+
loop
•
•
Loop Optimizations (code hoisting)
+
+
fwd
•
Load Forwarding (L2L) and Store Forwarding (S2L)
+
+
dse
•
Dead-Store Elimination
+
+
abc
•
Array Bounds Check Elimination
+
+
fuse
•
Fusion of operands into instructions
+
+
+Here are the parameters and their default settings:
+
+
+
+
Parameter
+
Default
+
+
+
+
maxtrace
1000
Max. number of traces in the cache
+
+
maxrecord
4000
Max. number of recorded IR instructions
+
+
maxirconst
500
Max. number of IR constants of a trace
+
+
maxside
100
Max. number of side traces of a root trace
+
+
maxsnap
500
Max. number of snapshots for a trace
+
+
hotloop
56
Number of iterations to detect a hot loop or hot call
+
+
hotexit
10
Number of taken exits to start a side trace
+
+
tryside
4
Number of attempts to compile a side trace
+
+
instunroll
4
Max. unroll factor for instable loops
+
+
loopunroll
15
Max. unroll factor for loop ops in side traces
+
+
callunroll
3
Max. unroll factor for pseudo-recursive calls
+
+
recunroll
2
Min. unroll factor for true recursion
+
+
sizemcode
32
Size of each machine code area in KBytes (Windows: 64K)
+
+
maxmcode
512
Max. total size of all machine code areas in KBytes
+The LuaJIT 1.x series represents
+the current stable branch.
+Only a single bug has been discovered in the last two years. So, if
+you need a rock-solid VM, you are encouraged to fetch the latest
+release of LuaJIT 1.x from the » Download
+page.
+
+
+LuaJIT 2.0 is the currently active
+development branch.
+It has Beta Test status and is still undergoing
+substantial changes.
+It has » much better performance than LuaJIT 1.x.
+It's maturing quickly, so you should definitely
+start to evaluate it for new projects right now.
+
+
+
Current Status
+
+This is a list of the things you should know about the LuaJIT 2.0 beta test:
+
+
+
+Obviously there will be some bugs in a VM which has been
+rewritten from the ground up. Please report your findings together with
+the circumstances needed to reproduce the bug. If possible, reduce the
+problem down to a simple test case.
+There is no formal bug tracker at the moment. The best place for
+discussion is the
+» Lua mailing list. Of course
+you may also send your bug reports directly to me,
+especially when they contain lengthy debug output or if you require
+confidentiality.
+
+
+The x86 JIT compiler only generates code for CPUs with support for
+SSE2 instructions. I.e. you need at least a P4, Core 2/i3/i5/i7,
+Atom or K8/K10 to get the full benefit.
+If you run LuaJIT on older CPUs without SSE2 support, the JIT compiler
+is disabled and the VM falls back to the LuaJIT interpreter. This is faster
+than the Lua interpreter, but not nearly as fast as the JIT compiler of course.
+Run the command line executable without arguments to show the current status
+(JIT: ON or JIT: OFF).
+
+
+The VM is complete in the sense that it should run all Lua code
+just fine. It's considered a serious bug if the VM crashes or produces
+unexpected results — please report this. There are only very few
+known incompatibilities with standard Lua:
+
+
+The Lua debug API is missing a couple of features (return
+hooks for non-Lua functions) and shows slightly different behavior
+(no per-coroutine hooks, no tail call counting).
+
+
+Some of the configuration options of Lua 5.1 are not supported:
+
+
The number type cannot be changed (it's always a double).
+
The stand-alone executable cannot be linked with readline
+to enable line editing. It's planned to add support for loading it
+on-demand.
+
+
+
+Most other issues you're likely to find (e.g. with the existing test
+suites) are differences in the implementation-defined behavior.
+These either have a good reason (like early tail call resolving which
+may cause differences in error reporting), are arbitrary design choices
+or are due to quirks in the VM. The latter cases may get fixed if a
+demonstrable need is shown.
+
+
+
+
+The JIT compiler falls back to the
+interpreter in some cases. All of this works transparently, so unless
+you use -jv, you'll probably never notice (the interpreter is
+» quite fast, too). Here are the known issues:
+
+
+Most known issues cause a NYI (not yet implemented) trace abort
+message. E.g. for calls to some internal library
+functions. Reporting these is only mildly useful, except if you have good
+example code that shows the problem. Obviously, reports accompanied with
+a patch to fix the issue are more than welcome. But please check back
+with me, before writing major improvements, to avoid duplication of
+effort.
+
+
+Some checks are missing in the JIT-compiled code for obscure situations
+with open upvalues aliasing one of the SSA slots later on (or
+vice versa). Bonus points, if you can find a real world test case for
+this.
+
+
+Currently some out-of-memory errors from on-trace code are not
+handled correctly. The error may fall through an on-trace
+pcall (x86) or it may be passed on to the function set with
+lua_atpanic (x64).
+
+
+
+
+
+
Roadmap
+
+Please refer to the
+» LuaJIT
+Roadmap 2011 for the latest release plan. Here's the general
+project plan for LuaJIT 2.0:
+
+
+
+The main goal right now is to stabilize LuaJIT 2.0 and get it out of
+beta test. Correctness has priority over completeness. This
+implies the first stable release will certainly NOT compile every
+library function call and will fall back to the interpreter from time
+to time. This is perfectly ok, since it still executes all Lua code,
+just not at the highest possible speed.
+
+
+The next step is to get it to compile more library functions and handle
+more cases where the compiler currently bails out. This doesn't mean it
+will compile every corner case. It's much more important that it
+performs well in a majority of use cases. Every compiler has to make
+these trade-offs — completeness just cannot be the
+overriding goal for a low-footprint, low-overhead JIT compiler.
+
+
+More optimizations will be added in parallel to the last step on
+an as-needed basis. Sinking of stores
+to aggregates and sinking of allocations are high on the list.
+More complex optimizations with less pay-off, such as value-range-propagation
+(VRP) will have to wait.
+
+
+LuaJIT 2.0 has been designed with portability in mind.
+Nonetheless, it compiles to native code and needs to be adapted to each
+architecture. The two major work items are porting the the fast interpreter,
+which is written in assembler, and porting the compiler backend.
+Most other portability issues like endianess or 32 vs. 64 bit CPUs
+have already been taken care of.
+Several ports are already available, thanks to the
+» LuaJIT sponsorship program.
+More ports will follow in the future — companies which are
+interested in sponsoring a port to a particular architecture, please
+use the given contact address.
+
+
+Documentation about the internals of LuaJIT is still sorely
+missing. Although the source code is included and is IMHO well
+commented, many basic design decisions are in need of an explanation.
+The rather un-traditional compiler architecture and the many highly
+optimized data structures are a barrier for outside participation in
+the development. Alas, as I've repeatedly stated, I'm better at
+writing code than papers and I'm not in need of any academic merits.
+Someday I will find the time for it. :-)
+
+
+Producing good code for unbiased branches is a key problem for trace
+compilers. This is the main cause for "trace explosion".
+Hyperblock scheduling promises to solve this nicely at the
+price of a major redesign of the compiler. This would also pave the
+way for emitting predicated instructions, which is a prerequisite
+for efficient vectorization.
+