FFI Semantics

This page describes the detailed semantics underlying the FFI library and its interaction with both Lua and C code.

Given that the FFI library is designed to interface with C code and that declarations can be written in plain C syntax, it closely follows the C language semantics wherever possible. Some minor concessions are needed for smoother interoperation with Lua language semantics.

Please don't be overwhelmed by the contents of this page: it is a reference that you can consult when in doubt. It doesn't hurt to skim this page, but most of the semantics "just work" as you'd expect them to work. Developers with a C or C++ background should find it straightforward to write applications using the LuaJIT FFI.

Please note: this is not yet the final specification for the FFI semantics. Some semantics may need to be changed, based on your feedback. Please report any problems you encounter or any improvements you'd like to see. Thank you!

C Language Support

The FFI library has a built-in C parser with a minimal memory footprint. It's used by the ffi.* library functions to declare C types or external symbols.

Its only purpose is to parse C declarations, as found e.g. in C header files. Although it does evaluate constant expressions, it's not a C compiler. The body of inline C function definitions is simply ignored.

Also, this is not a validating C parser. It expects and accepts correctly formed C declarations, but it may choose to ignore bad declarations or show rather generic error messages. If in doubt, please check the input against your favorite C compiler.

The C parser complies with the C99 language standard plus the following extensions:

The '\e' escape in character and string literals.
The C99/C++ boolean type, declared with the keywords bool or _Bool.
Complex numbers, declared with the keywords complex or _Complex.
Two complex number types: complex (aka complex double) and complex float.
Vector types, declared with the GCC mode or vector_size attribute.
Unnamed ('transparent') struct/union fields inside a struct/union.
Incomplete enum declarations, handled like incomplete struct declarations.
Unnamed enum fields inside a struct/union. This is similar to a scoped C++ enum, except that declared constants are visible in the global namespace, too.
Scoped static const declarations inside a struct/union (from C++).
Zero-length arrays ([0]), empty struct/union, variable-length arrays (VLA, [?]) and variable-length structs (VLS, with a trailing VLA).
C++ reference types (int &x).
Alternate GCC keywords with '__', e.g. __const__.
GCC __attribute__ with the following attributes: aligned, packed, mode, vector_size, cdecl, fastcall, stdcall.
The GCC __extension__ keyword and the GCC __alignof__ operator.
GCC __asm__("symname") symbol name redirection for function declarations.
MSVC keywords for fixed-length types: __int8, __int16, __int32 and __int64.
MSVC __cdecl, __fastcall, __stdcall, __ptr32, __ptr64, __declspec(align(n)) and #pragma pack.
All other GCC/MSVC-specific attributes are ignored.

The following C types are pre-defined by the C parser (like a typedef, except re-declarations will be ignored):

Vararg handling: va_list, __builtin_va_list, __gnuc_va_list.
From <stddef.h>: ptrdiff_t, size_t, wchar_t.
From <stdint.h>: int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t, uint32_t, uint64_t, intptr_t, uintptr_t.

You're encouraged to use these types in preference to the compiler-specific extensions or the target-dependent standard types. E.g. char differs in signedness and long differs in size, depending on the target architecture and platform ABI.

The following C features are not supported:

A declaration must always have a type specifier; it doesn't default to an int type.
Old-style empty function declarations (K&R) are not allowed. All C functions must have a proper prototype declaration. A function declared without parameters (int foo();) is treated as a function taking zero arguments, like in C++.
The long double C type is parsed correctly, but there's no support for the related conversions, accesses or arithmetic operations.
Wide character strings and character literals are not supported.
See below for features that are currently not implemented.

C Type Conversion Rules


Conversions from C types to Lua objects

These conversion rules apply for read accesses to C types: indexing pointers, arrays or struct/union types; reading external variables or constant values; retrieving return values from C calls:

Input	Conversion	Output
`int8_t`, `int16_t`	→^sign-ext `int32_t` → `double`	number
`uint8_t`, `uint16_t`	→^zero-ext `int32_t` → `double`	number
`int32_t`, `uint32_t`	→ `double`	number
`int64_t`, `uint64_t`	boxed value	64 bit int cdata
`double`, `float`	→ `double`	number
`bool`	0 → `false`, otherwise `true`	boolean
Complex number	boxed value	complex cdata
Vector	boxed value	vector cdata
Pointer	boxed value	pointer cdata
Array	boxed reference	reference cdata
`struct`/`union`	boxed reference	reference cdata

Bitfields or enum types are treated like their underlying type.

Reference types are dereferenced before a conversion can take place: the conversion is applied to the C type pointed to by the reference.
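As a rough illustration of the read-access rules above, the following sketch reads elements of different C types and looks at the kind of Lua object that comes back. The variable names are illustrative only, and the one-element arrays are just a convenient way to force a read access.

```lua
local ffi = require("ffi")

-- Small integers are widened and returned as plain Lua numbers.
local i8 = ffi.new("int8_t[1]", -1)
local u8 = ffi.new("uint8_t[1]", 255)
local small_signed, small_unsigned = i8[0], u8[0]

-- 64 bit integers stay boxed as cdata to avoid losing precision.
local i64 = ffi.new("int64_t[1]", 5)
local boxed = i64[0]

-- bool becomes a Lua boolean.
local flags = ffi.new("bool[1]", true)
local flag = flags[0]

-- Reading a field of aggregate type yields a reference cdata,
-- so writes through it modify the outer object.
local outer = ffi.new("struct { struct { int x; } inner; }")
local inner = outer.inner
inner.x = 7
```

Use tonumber() on the boxed 64 bit value if a plain Lua number is needed and the value fits.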


Conversions from Lua objects to C types

These conversion rules apply for write accesses to C types: indexing pointers, arrays or struct/union types; initializing cdata objects; casts to C types; writing to external variables; passing arguments to C calls:

Input	Conversion	Output
number	→	`double`
boolean	`false` → 0, `true` → 1	`bool`
nil	`NULL` →	`(void *)`
userdata	userdata payload →	`(void *)`
lightuserdata	lightuserdata address →	`(void *)`
string	match against `enum` constant	`enum`
string	copy string data + zero-byte	`int8_t[]`, `uint8_t[]`
string	string data →	`const char[]`
function	create callback →	C function type
table	table initializer	Array
table	table initializer	`struct`/`union`
cdata	cdata payload →	C type

If the result type of this conversion doesn't match the C type of the destination, the conversion rules between C types are applied.

Reference types are immutable after initialization ("no re-seating of references"). For initialization purposes or when passing values to reference parameters, they are treated like pointers. Note that unlike in C++, there's no way to implement automatic reference generation of variables under the Lua language semantics. If you want to call a function with a reference parameter, you need to explicitly pass a one-element array.
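The write-access rules can be sketched with a small struct whose fields each exercise one row of the table. The types demo_color_t and demo_item_t are hypothetical and exist only for this example.

```lua
local ffi = require("ffi")

-- demo_color_t and demo_item_t are made-up types for this sketch.
ffi.cdef[[
typedef enum { DEMO_RED, DEMO_GREEN, DEMO_BLUE } demo_color_t;
typedef struct {
  demo_color_t color;
  const char *name;
  int *next;
  bool used;
} demo_item_t;
]]

local name = "widget"        -- keep the Lua string referenced while in use
local item = ffi.new("demo_item_t")
item.color = "DEMO_GREEN"    -- string is matched against the enum constants
item.name  = name            -- string data is converted to const char *
item.next  = nil             -- nil becomes a NULL pointer
item.used  = true            -- boolean becomes bool
```

Reading item.color back yields a Lua number, since enum types are treated like their underlying type.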


Conversions between C types

These conversion rules are more or less the same as the standard C conversion rules. Some rules only apply to casts, or require pointer or type compatibility:

Input	Conversion	Output
Signed integer	→^{narrow or sign-extend}	Integer
Unsigned integer	→^{narrow or zero-extend}	Integer
Integer	→^round	`double`, `float`
`double`, `float`	→^trunc `int32_t` →^narrow	`(u)int8_t`, `(u)int16_t`
`double`, `float`	→^trunc	`(u)int32_t`, `(u)int64_t`
`double`, `float`	→^round	`float`, `double`
Number	n == 0 → 0, otherwise 1	`bool`
`bool`	`false` → 0, `true` → 1	Number
Complex number	convert real part	Number
Number	convert real part, imag = 0	Complex number
Complex number	convert real and imag part	Complex number
Number	convert scalar and replicate	Vector
Vector	copy (same size)	Vector
`struct`/`union`	take base address (compat)	Pointer
Array	take base address (compat)	Pointer
Function	take function address	Function pointer
Number	convert via `uintptr_t` (cast)	Pointer
Pointer	convert address (compat/cast)	Pointer
Pointer	convert address (cast)	Integer
Array	convert base address (cast)	Integer
Array	copy (compat)	Array
`struct`/`union`	copy (identical type)	`struct`/`union`

Bitfields or enum types are treated like their underlying type.

Conversions not listed above will raise an error. E.g. it's not possible to convert a pointer to a complex number or vice versa.
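A few of the rows above can be tried out directly with ffi.new() and ffi.cast(). This is only a sketch with illustrative variable names; the values were picked so that every intermediate result fits the target type.

```lua
local ffi = require("ffi")

-- Floating-point to integer conversions truncate towards zero.
local pos = tonumber(ffi.new("int", 3.7))
local neg = tonumber(ffi.new("int", -3.7))

-- Narrowing keeps only the low bits: 257 = 0x101 becomes 0x01.
local narrow = tonumber(ffi.cast("uint8_t", 257))

-- An array converts to a pointer to its first element.
local arr = ffi.new("int[2]", {5, 6})
local ptr = ffi.cast("int *", arr)

-- Pointer/integer conversions need an explicit cast.
local addr = ffi.cast("uintptr_t", ptr)
local back = ffi.cast("int *", addr)
```

Here pos is 3, neg is -3 and narrow is 1, while back points to the same element as ptr. Keep arr referenced for as long as the derived pointers are used.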


Conversions for vararg C function arguments

The following default conversion rules apply when passing Lua objects to the variable argument part of vararg C functions:

Input	Conversion	Output
number	→	`double`
boolean	`false` → 0, `true` → 1	`bool`
nil	`NULL` →	`(void *)`
userdata	userdata payload →	`(void *)`
lightuserdata	lightuserdata address →	`(void *)`
string	string data →	`const char *`
`float` cdata	→	`double`
Array cdata	take base address	Element pointer
`struct`/`union` cdata	take base address	`struct`/`union` pointer
Function cdata	take function address	Function pointer
Any other cdata	no conversion	C type

To pass a Lua object, other than a cdata object, as a specific type, you need to override the conversion rules: create a temporary cdata object with a constructor or a cast and initialize it with the value to pass.

Assuming x is a Lua number, here's how to pass it as an integer to a vararg function:

local ffi = require("ffi")

ffi.cdef[[
int printf(const char *fmt, ...);
]]

local x = 42 -- any Lua number
-- Box x as a C int, so that %d receives an integer and not a double.
ffi.C.printf("integer value: %d\n", ffi.new("int", x))

If you don't do this, the default Lua number → double conversion rule applies. A vararg C function expecting an integer will see a garbled or uninitialized value.

Initializers

Creating a cdata object with ffi.new() or the equivalent constructor syntax always initializes its contents, too. Different rules apply, depending on the number of optional initializers and the C types involved:

If no initializers are given, the object is filled with zero bytes.
Scalar types (numbers and pointers) accept a single initializer. The Lua object is converted to the scalar C type.
Valarrays (complex numbers and vectors) are treated like scalars when a single initializer is given. Otherwise they are treated like regular arrays.
Aggregate types (arrays and structs) accept either a single table initializer or a flat list of initializers.
The elements of an array are initialized, starting at index zero. If a single initializer is given for an array, it's repeated for all remaining elements. This doesn't happen if two or more initializers are given: all remaining uninitialized elements are filled with zero bytes.
Byte arrays may also be initialized with a Lua string. This copies the whole string plus a terminating zero-byte. The copy stops early only if the array has a known, fixed size.
The fields of a struct are initialized in the order of their declaration. Uninitialized fields are filled with zero bytes.
Only the first field of a union can be initialized with a flat initializer.
Elements or fields which are aggregates themselves are initialized with a single initializer, but this may be a table initializer or a compatible aggregate.
Excess initializers cause an error.
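The flat-initializer rules above can be sketched as follows. The anonymous struct type point_t is a hypothetical example type; all other names are illustrative, too.

```lua
local ffi = require("ffi")

local zeroed   = ffi.new("int[3]")        -- no initializer: 0, 0, 0
local repeated = ffi.new("int[3]", 7)     -- single initializer: 7, 7, 7
local partial  = ffi.new("int[3]", 1, 2)  -- two initializers: 1, 2, 0

-- A byte array initialized from a Lua string receives the string data
-- plus a terminating zero byte.
local buf = ffi.new("uint8_t[8]", "hi")

-- point_t is a made-up struct type; fields are filled in declaration order.
local point_t = ffi.typeof("struct { int x, y; }")
local p = point_t(3)       -- x = 3, y = 0
local q = point_t(3, 4)    -- x = 3, y = 4

-- A third initializer for a two-field struct is an excess initializer.
local excess_ok = pcall(point_t, 1, 2, 3)
```

The pcall() is only there to show that the excess initializer is rejected; excess_ok is false in that case.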


Table Initializers

The following rules apply if a Lua table is used to initialize an Array or a struct/union:

If the table index [0] is non-nil, then the table is assumed to be zero-based. Otherwise it's assumed to be one-based.
Array elements, starting at index zero, are initialized one-by-one with the consecutive table elements, starting at either index [0] or [1]. This process stops at the first nil table element.
If exactly one array element was initialized, it's repeated for all the remaining elements. Otherwise all remaining uninitialized elements are filled with zero bytes.
The above logic only applies to arrays with a known fixed size. A VLA is only initialized with the element(s) given in the table. Depending on the use case, you may need to explicitly add a NULL or 0 terminator to a VLA.
If the table has a non-empty hash part, a struct/union is initialized by looking up each field name (as a string key) in the table. Each non-nil value is used to initialize the corresponding field.
Otherwise a struct/union is initialized in the order of the declaration of its fields. Each field is initialized with the consecutive table elements, starting at either index [0] or [1]. This process stops at the first nil table element.
Uninitialized fields of a struct are filled with zero bytes, except for the trailing VLA of a VLS.
Initialization of a union stops after one field has been initialized. If no field has been initialized, the union is filled with zero bytes.
Elements or fields which are aggregates themselves are initialized with a single initializer, but this may be a nested table initializer (or a compatible aggregate).
Excess initializers for an array cause an error. Excess initializers for a struct/union are ignored. Unrelated table entries are ignored, too.

Example:

local ffi = require("ffi")

ffi.cdef[[
struct foo { int a, b; };
union bar { int i; double d; };
struct nested { int x; struct foo y; };
]]

ffi.new("int[3]", {})            --> 0, 0, 0
ffi.new("int[3]", {1})           --> 1, 1, 1
ffi.new("int[3]", {1,2})         --> 1, 2, 0
ffi.new("int[3]", {1,2,3})       --> 1, 2, 3
ffi.new("int[3]", {[0]=1})       --> 1, 1, 1
ffi.new("int[3]", {[0]=1,2})     --> 1, 2, 0
ffi.new("int[3]", {[0]=1,2,3})   --> 1, 2, 3
ffi.new("int[3]", {[0]=1,2,3,4}) --> error: too many initializers

ffi.new("struct foo", {})            --> a = 0, b = 0
ffi.new("struct foo", {1})           --> a = 1, b = 0
ffi.new("struct foo", {1,2})         --> a = 1, b = 2
ffi.new("struct foo", {[0]=1,2})     --> a = 1, b = 2
ffi.new("struct foo", {b=2})         --> a = 0, b = 2
ffi.new("struct foo", {a=1,b=2,c=3}) --> a = 1, b = 2  'c' is ignored

ffi.new("union bar", {})        --> i = 0, d = 0.0
ffi.new("union bar", {1})       --> i = 1, d = ?
ffi.new("union bar", {[0]=1,2}) --> i = 1, d = ?    '2' is ignored
ffi.new("union bar", {d=2})     --> i = ?, d = 2.0

ffi.new("struct nested", {1,{2,3}})     --> x = 1, y.a = 2, y.b = 3
ffi.new("struct nested", {x=1,y={2,3}}) --> x = 1, y.a = 2, y.b = 3


Operations on cdata Objects

All of the standard Lua operators can be applied to cdata objects or a mix of a cdata object and another Lua object. The following list shows the valid combinations. All other combinations currently raise an error.

Reference types are dereferenced before performing each of the operations below: the operation is applied to the C type pointed to by the reference.

The pre-defined operations are always tried first before deferring to a metamethod for a ctype (if defined).

Indexing a cdata object

Indexing a pointer/array: a cdata pointer/array can be indexed by a cdata number or a Lua number. The element address is computed as the base address plus the number value multiplied by the element size in bytes. A read access loads the element value and converts it to a Lua object. A write access converts a Lua object to the element type and stores the converted value to the element. An error is raised if the element size is undefined or a write access to a constant element is attempted.
Dereferencing a struct/union field: a cdata struct/union or a pointer to a struct/union can be dereferenced by a string key, giving the field name. The field address is computed as the base address plus the relative offset of the field. A read access loads the field value and converts it to a Lua object. A write access converts a Lua object to the field type and stores the converted value to the field. An error is raised if a write access to a constant struct/union or a constant field is attempted.
Indexing a complex number: a complex number can be indexed either by a cdata number or a Lua number with the values 0 or 1, or by the strings "re" or "im". A read access loads the real part ([0], .re) or the imaginary part ([1], .im) of a complex number and converts it to a Lua number. The sub-parts of a complex number are immutable: assigning to an index of a complex number raises an error. Accessing out-of-bound indexes returns unspecified results, but is guaranteed not to trigger memory access violations.
Indexing a vector: a vector is treated like an array for indexing purposes, except that the vector elements are immutable: assigning to an index of a vector raises an error.

Note: since there's (deliberately) no address-of operator, a cdata object holding a value type is effectively immutable after initialization. The JIT compiler benefits from this fact when applying certain optimizations.

As a consequence of this, the elements of complex numbers and vectors are immutable. But the elements of an aggregate holding these types may be modified of course. I.e. you cannot assign to foo.c.im, but you can assign a (newly created) complex number to foo.c.
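The indexing rules can be illustrated with a short sketch. The anonymous struct and all variable names are illustrative; pcall() is used only to show that the rejected writes raise an error.

```lua
local ffi = require("ffi")

local arr = ffi.new("int[4]", {10, 20, 30, 40})
local third = arr[2]      -- read access: converted to a Lua number
arr[3] = 99               -- write access: Lua number converted to int

local pt = ffi.new("struct { int x, y; }", {1, 2})
pt.y = pt.x + 10          -- fields are accessed by name

-- Writing through a pointer to const elements is rejected.
local ro = ffi.cast("const int *", arr)
local const_ok = pcall(function() ro[0] = 1 end)

-- Complex numbers can be read by index or by name, but not modified.
local c = ffi.new("complex", 1.5, -2)
local re, im = c.re, c[1]
local complex_ok = pcall(function() c.im = 0 end)
```

Both const_ok and complex_ok end up false here, while ordinary reads and writes go through unchanged.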


Calling a cdata object

Constructor: a ctype object can be called and used as a constructor.
C function call: a cdata function or cdata function pointer can be called. The passed arguments are converted to the C types of the parameters given by the function declaration. Arguments passed to the variable argument part of a vararg C function use special conversion rules. This C function is called and the return value (if any) is converted to a Lua object.
On Windows/x86 systems, __stdcall functions are automatically detected and a function declared as __cdecl (the default) is silently fixed up after the first call.
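A minimal sketch of both kinds of calls, assuming the standard C library functions abs() and printf() are reachable through the default namespace (which is the usual case):

```lua
local ffi = require("ffi")

ffi.cdef[[
int abs(int x);
int printf(const char *fmt, ...);
]]

-- A ctype object can be called like a constructor.
local int_t = ffi.typeof("int")
local boxed = int_t(5)

-- Arguments are converted to the declared parameter types and the
-- int result is converted back to a Lua number.
local magnitude = ffi.C.abs(-42)

-- Arguments in the variable part use the vararg rules: a Lua string is
-- passed as const char * and the boxed int is passed without conversion.
ffi.C.printf("%s = %d\n", "answer", boxed)
```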


Arithmetic on cdata objects

Pointer arithmetic: a cdata pointer/array and a cdata number or a Lua number can be added or subtracted. The number must be on the right hand side for a subtraction. The result is a pointer of the same type with an address plus or minus the number value multiplied by the element size in bytes. An error is raised if the element size is undefined.
Pointer difference: two compatible cdata pointers/arrays can be subtracted. The result is the difference between their addresses, divided by the element size in bytes. An error is raised if the element size is undefined or zero.
64 bit integer arithmetic: the standard arithmetic operators (+ - * / % ^ and unary minus) can be applied to two cdata numbers, or a cdata number and a Lua number. If one of them is a uint64_t, the other side is converted to a uint64_t and an unsigned arithmetic operation is performed. Otherwise both sides are converted to an int64_t and a signed arithmetic operation is performed. The result is a boxed 64 bit cdata object.
These rules ensure that 64 bit integers are "sticky". Any expression involving at least one 64 bit integer operand results in another one. The undefined cases for the division, modulo and power operators return 2LL ^ 63 or 2ULL ^ 63.
You'll have to explicitly convert a 64 bit integer to a Lua number (e.g. for regular floating-point calculations) with tonumber(). But note this may incur a precision loss.
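The arithmetic rules above can be sketched as follows; the variable names are illustrative only.

```lua
local ffi = require("ffi")

local arr = ffi.new("int[4]", {10, 20, 30, 40})
local p = arr + 1                -- pointer to arr[1]
local q = p + 2                  -- pointer to arr[3]
local distance = q - p           -- counted in elements, not in bytes

local sum = 1LL + 2              -- the Lua number is converted to int64_t
local quotient = 7LL / 2         -- integer division
local wrapped = 1ULL - 2         -- unsigned arithmetic wraps around
local as_number = tonumber(sum)  -- explicit conversion to a Lua number
```

Note that sum, quotient and wrapped are all boxed 64 bit cdata objects, which is what the "sticky" rule describes.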


Comparisons of cdata objects

Pointer comparison: two compatible cdata pointers/arrays can be compared. The result is the same as an unsigned comparison of their addresses. nil is treated like a NULL pointer, which is compatible with any other pointer type.
64 bit integer comparison: two cdata numbers, or a cdata number and a Lua number can be compared with each other. If one of them is a uint64_t, the other side is converted to a uint64_t and an unsigned comparison is performed. Otherwise both sides are converted to an int64_t and a signed comparison is performed.
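A short sketch of the comparison rules, including the perhaps surprising effect of mixing signed and unsigned 64 bit operands (names are illustrative):

```lua
local ffi = require("ffi")

local arr = ffi.new("int[4]")
local first, last = arr + 0, arr + 3
local ordered = first < last        -- compares the addresses

local null = ffi.new("int *")       -- zero-filled, i.e. a NULL pointer
local is_null = (null == nil)       -- nil acts like a NULL pointer

local signed_lt = (-1LL < 1)        -- signed comparison
local unsigned_lt = (-1LL < 1ULL)   -- -1 is converted to 2^64-1 first
```

With these inputs ordered, is_null and signed_lt are true, but unsigned_lt is false.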


cdata objects as table keys

Lua tables may be indexed by cdata objects, but this doesn't provide any useful semantics: cdata objects are unsuitable as table keys!

A cdata object is treated like any other garbage-collected object and is hashed and compared by its address for table indexing. Since there's no interning for cdata value types, the same value may be boxed in different cdata objects with different addresses. Thus t[1LL+1LL] and t[2LL] usually do not point to the same hash slot and they certainly do not point to the same hash slot as t[2].

It would seriously drive up implementation complexity and slow down the common case, if one were to add extra handling for by-value hashing and comparisons to Lua tables. Given the ubiquity of their use inside the VM, this is not acceptable.

There are three viable alternatives, if you really need to use cdata objects as keys:

If you can get by with the precision of Lua numbers (52 bits), then use tonumber() on a cdata number or combine multiple fields of a cdata aggregate to a Lua number. Then use the resulting Lua number as a key when indexing tables.
One obvious benefit: t[tonumber(2LL)] does point to the same slot as t[2].
Otherwise use either tostring() on 64 bit integers or complex numbers or combine multiple fields of a cdata aggregate to a Lua string (e.g. with ffi.string()). Then use the resulting Lua string as a key when indexing tables.
Create your own specialized hash table implementation using the C types provided by the FFI library, just like you would in C code. Ultimately this may give much better performance than the other alternatives or what a generic by-value hash table could possibly provide.
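The first two alternatives can be sketched like this (table and variable names are illustrative):

```lua
local ffi = require("ffi")

local t = {}
local id = 2LL

-- Alternative 1: convert to a Lua number (only if the value fits).
t[tonumber(id)] = "by number"

-- Alternative 2: convert to a Lua string, which covers the full
-- 64 bit range.
local big = 0x7fffffffffffffffLL
t[tostring(big)] = "by string"
```

The first entry is found again with t[2]; the second one is stored under the string key "9223372036854775807LL".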


Garbage Collection of cdata Objects

All explicitly (ffi.new(), ffi.cast() etc.) or implicitly (accessors) created cdata objects are garbage collected. You need to retain valid references to cdata objects somewhere on a Lua stack, in an upvalue or in a Lua table while they are still in use. Once the last reference to a cdata object is gone, the garbage collector will automatically free the memory used by it (at the end of the next GC cycle).

Please note that pointers themselves are cdata objects, however they are not followed by the garbage collector. So e.g. if you assign a cdata array to a pointer, you must keep the cdata object holding the array alive as long as the pointer is still in use:

local ffi = require("ffi")

ffi.cdef[[
typedef struct { int *a; } foo_t;
]]

local s = ffi.new("foo_t", ffi.new("int[10]")) -- WRONG!

local a = ffi.new("int[10]") -- OK
local s = ffi.new("foo_t", a)
-- Now do something with 's', but keep 'a' alive until you're done.

Similar rules apply for Lua strings which are implicitly converted to "const char *": the string object itself must be referenced somewhere or it'll be garbage collected eventually. The pointer will then point to stale data, which may have already been overwritten. Note that string literals are automatically kept alive as long as the function containing them (actually its prototype) is not garbage collected.

Objects which are passed as an argument to an external C function are kept alive until the call returns. So it's generally safe to create temporary cdata objects in argument lists. This is a common idiom for passing specific C types to vararg functions.

Memory areas returned by C functions (e.g. from malloc()) must be manually managed, of course (or use ffi.gc()). Pointers to cdata objects are indistinguishable from pointers returned by C functions (which is one of the reasons why the GC cannot follow them).
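As a minimal sketch of managing malloc() memory with ffi.gc(), assuming malloc() and free() are reachable through the default namespace:

```lua
local ffi = require("ffi")

ffi.cdef[[
void *malloc(size_t size);
void free(void *ptr);
]]

-- Attach free() as a finalizer, so the block is released when the
-- pointer cdata 'mem' is garbage collected.
local mem = ffi.gc(ffi.C.malloc(4 * ffi.sizeof("int")), ffi.C.free)

-- The cast creates a second pointer cdata without a finalizer. The GC
-- doesn't follow it, so 'mem' must stay referenced while 'ints' is used.
local ints = ffi.cast("int *", mem)
ints[0] = 123
local value = ints[0]

-- Optional: release the memory right away instead of waiting for the GC.
-- Removing the finalizer first avoids a double free.
ffi.C.free(ffi.gc(mem, nil))
mem, ints = nil, nil
```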


Callbacks

The LuaJIT FFI automatically generates special callback functions whenever a Lua function is converted to a C function pointer. This associates the generated callback function pointer with the C type of the function pointer and the Lua function object (closure).

This can happen implicitly due to the usual conversions, e.g. when passing a Lua function to a function pointer argument. Or you can use ffi.cast() to explicitly cast a Lua function to a C function pointer.

Currently only certain C function types can be used as callback functions. Neither C vararg functions nor functions with pass-by-value aggregate argument or result types are supported. There are no restrictions for the kind of Lua functions that can be called from the callback, and no checks for the proper number of arguments are made. The return value of the Lua function will be converted to the result type and an error will be thrown for invalid conversions.

It's allowed to throw errors across a callback invocation, but it's not advisable in general. Do this only if you know that the C function which called the callback copes with the forced stack unwinding and doesn't leak resources.


Callback resource handling

Callbacks take up resources: you can only have a limited number of them at the same time (500 to 1000, depending on the architecture). The associated Lua functions are anchored to prevent garbage collection, too.

Callbacks due to implicit conversions are permanent! There is no way to guess their lifetime, since the C side might store the function pointer for later use (typical for GUI toolkits). The associated resources cannot be reclaimed until termination:

local ffi = require("ffi")

ffi.cdef[[
typedef int (__stdcall *WNDENUMPROC)(void *hwnd, intptr_t l);
int EnumWindows(WNDENUMPROC func, intptr_t l);
]]

-- Implicit conversion to a callback via function pointer argument.
local count = 0
ffi.C.EnumWindows(function(hwnd, l)
  count = count + 1
  return true
end, 0)
-- The callback is permanent and its resources cannot be reclaimed!
-- Ok, so this may not be a problem, if you do this only once.

Note: this example shows that you must properly declare __stdcall callbacks on Windows/x86 systems. The calling convention cannot be automatically detected, unlike for __stdcall calls to Windows functions.

For some use cases it's necessary to free up the resources or to dynamically redirect callbacks. Use an explicit cast to a C function pointer and keep the resulting cdata object. Then use the cb:free() or cb:set() methods on the cdata object:

-- Explicitly convert to a callback via cast.
local count = 0
local cb = ffi.cast("WNDENUMPROC", function(hwnd, l)
  count = count + 1
  return true
end)

-- Pass it to a C function.
ffi.C.EnumWindows(cb, 0)
-- EnumWindows doesn't need the callback after it returns, so free it.

cb:free()
-- The callback function pointer is no longer valid and its resources
-- will be reclaimed. The created Lua closure will be garbage collected.
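The examples above are specific to Windows. A portable sketch of the same explicit-cast pattern, assuming the standard C library function qsort() is reachable through the default namespace, might look like this:

```lua
local ffi = require("ffi")

ffi.cdef[[
void qsort(void *base, size_t nmemb, size_t size,
           int (*compar)(const void *, const void *));
]]

local values = ffi.new("int[5]", {42, 7, 19, 3, 25})

-- Explicitly create the callback, so that it can be freed afterwards.
local cmp = ffi.cast("int (*)(const void *, const void *)", function(a, b)
  local x = ffi.cast("const int *", a)[0]
  local y = ffi.cast("const int *", b)[0]
  if x < y then return -1 elseif x > y then return 1 end
  return 0
end)

ffi.C.qsort(values, 5, ffi.sizeof("int"), cmp)

-- qsort() doesn't keep the function pointer, so release the callback.
cmp:free()
```

This only demonstrates the mechanics. For sorting actual data, a sort routine written in Lua avoids the callback overhead entirely.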


Callback performance

Callbacks are slow! First, the C to Lua transition itself has an unavoidable cost, similar to a lua_call() or lua_pcall(). Argument and result marshalling add to that cost. And finally, neither the C compiler nor LuaJIT can inline or optimize across the language barrier and hoist repeated computations out of a callback function.

Do not use callbacks for performance-sensitive work: e.g. consider a numerical integration routine which takes a user-defined function to integrate over. It's a bad idea to call a user-defined Lua function from C code millions of times. The callback overhead will be absolutely detrimental to performance.

It's considerably faster to write the numerical integration routine itself in Lua. The JIT compiler will then be able to inline the user-defined function and optimize it together with its calling context, with very competitive performance.
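For illustration, such a routine could be as simple as the following sketch. It uses the trapezoidal rule as a generic textbook scheme; the function name integrate and its parameters are made up for this example.

```lua
-- Trapezoidal rule over [a, b] with n equal steps.
local function integrate(f, a, b, n)
  local h = (b - a) / n
  local sum = 0.5 * (f(a) + f(b))
  for i = 1, n - 1 do
    sum = sum + f(a + i * h)
  end
  return sum * h
end

-- The integrand is a plain Lua function; no callback is involved.
local area = integrate(function(x) return x * x end, 0, 1, 1000)
```

The result approximates the exact value 1/3, and the whole loop stays on the Lua side where the JIT compiler can see it.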

As a general guideline: use callbacks only when you must, because of existing C APIs. E.g. callback performance is irrelevant for a GUI application, which waits for user input most of the time, anyway.

For new designs avoid push-style APIs (C function repeatedly calling a callback for each result). Instead use pull-style APIs (call a C function repeatedly to get a new result). Calls from Lua to C via the FFI are much faster than the other way round. Most well-designed libraries already use pull-style APIs (read/write, get/put).

C Library Namespaces

A C library namespace is a special kind of object which allows access to the symbols contained in shared libraries or the default symbol namespace. The default ffi.C namespace is automatically created when the FFI library is loaded. C library namespaces for specific shared libraries may be created with the ffi.load() API function.

Indexing a C library namespace object with a symbol name (a Lua string) automatically binds it to the library. First the symbol type is resolved: it must have been declared with ffi.cdef. Then the symbol address is resolved by searching for the symbol name in the associated shared libraries or the default symbol namespace. Finally, the resulting binding between the symbol name, the symbol type and its address is cached. Missing symbol declarations or nonexistent symbol names cause an error.

This is what happens on a read access for the different kinds of symbols:

External functions: a cdata object with the type of the function and its address is returned.
External variables: the symbol address is dereferenced and the loaded value is converted to a Lua object and returned.
Constant values (static const or enum constants): the constant is converted to a Lua object and returned.

This is what happens on a write access:

External variables: the value to be written is converted to the C type of the variable and then stored at the symbol address.
Writing to constant variables or to any other symbol type causes an error, like any other attempted write to a constant location.

C library namespaces themselves are garbage collected objects. If the last reference to the namespace object is gone, the garbage collector will eventually release the shared library reference and remove all memory associated with the namespace. Since this may trigger the removal of the shared library from the memory of the running process, it's generally not safe to use function cdata objects obtained from a library if the namespace object may be unreferenced.

Performance notice: the JIT compiler specializes to the identity of namespace objects and to the strings used to index them. This effectively turns function cdata objects into constants. It's not useful and actually counter-productive to explicitly cache these function objects, e.g. local strlen = ffi.C.strlen. OTOH it is useful to cache the namespace itself, e.g. local C = ffi.C.
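The caching advice can be sketched as follows, assuming strlen() is reachable through the default namespace:

```lua
local ffi = require("ffi")

ffi.cdef[[
size_t strlen(const char *s);
]]

local C = ffi.C   -- caching the namespace itself is useful

local total = 0
for i = 1, 100 do
  -- Index the namespace at the call site instead of caching C.strlen.
  total = total + tonumber(C.strlen("hello"))
end
```

The tonumber() call is there because a size_t result may come back as a boxed 64 bit integer, depending on the target.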


No Hand-holding!

The FFI library has been designed as a low-level library. The goal is to interface with C code and C data types with a minimum of overhead. This means you can do anything you can do from C: access all memory, overwrite anything in memory, call machine code at any memory address and so on.

The FFI library provides no memory safety, unlike regular Lua code. It will happily allow you to dereference a NULL pointer, to access arrays out of bounds or to misdeclare C functions. If you make a mistake, your application might crash, just like equivalent C code would.

This behavior is inevitable, since the goal is to provide full interoperability with C code. Adding extra safety measures, like bounds checks, would be futile. There's no way to detect misdeclarations of C functions, since shared libraries only provide symbol names, but no type information. Likewise there's no way to infer the valid range of indexes for a returned pointer.

Again: the FFI library is a low-level library. This implies it needs to be used with care, but its flexibility and performance often outweigh this concern. If you're a C or C++ developer, it'll be easy to apply your existing knowledge. OTOH writing code for the FFI library is not for the faint of heart and probably shouldn't be the first exercise for someone with little experience in Lua, C or C++.

As a corollary of the above, the FFI library is not safe for use by untrusted Lua code. If you're sandboxing untrusted Lua code, you definitely don't want to give this code access to the FFI library or to any cdata object (except 64 bit integers or complex numbers). Any properly engineered Lua sandbox needs to provide safety wrappers for many of the standard Lua library functions. Similar wrappers need to be written for high-level operations on FFI data types, too.

Current Status

The initial release of the FFI library has some limitations and is missing some features. Most of these will be fixed in future releases.

C language support is currently incomplete:

C declarations are not passed through a C pre-processor, yet.
The C parser is able to evaluate most constant expressions commonly found in C header files. However it doesn't handle the full range of C expression semantics and may fail for some obscure constructs.
static const declarations only work for integer types up to 32 bits. Neither declaring string constants nor floating-point constants is supported.
Packed struct bitfields that cross container boundaries are not implemented.
Native vector types may be defined with the GCC mode or vector_size attribute. But no operations other than loading, storing and initializing them are supported, yet.
The volatile type qualifier is currently ignored by compiled code.
ffi.cdef silently ignores all re-declarations.

The JIT compiler already handles a large subset of all FFI operations. It automatically falls back to the interpreter for unimplemented operations (you can check for this with the -jv command line option). The following operations are currently not compiled and may exhibit suboptimal performance, especially when used in inner loops:

Array/struct copies and bulk initializations.
Bitfield accesses and initializations.
Vector operations.
Table initializers.
Initialization of nested struct/union types.
Allocations of variable-length arrays or structs.
Allocations of C types with a size > 64 bytes or an alignment > 8 bytes.
Conversions from lightuserdata to void *.
Pointer differences for element sizes that are not a power of two.
Calls to C functions with aggregates passed or returned by value.
Calls to ctype metamethods which are not plain functions.
ctype __newindex tables and non-string lookups in ctype __index tables.
tostring() for cdata types.
Calls to the following ffi.* API functions: cdef, load, typeof, metatype, gc, sizeof, alignof, offsetof.

Other missing features:

Bit operations for 64 bit types.
Arithmetic for complex numbers.
Passing structs by value to vararg C functions.
C++ exception interoperability does not extend to C functions called via the FFI, if the call is compiled.