This is a list of changes between the released versions of LuaJIT.
The current development version is LuaJIT 2.0.0-beta9.
The current stable version is LuaJIT 1.1.7.
Please check the Online Change History to see whether newer versions are available.
LuaJIT 2.0.0-beta9 — 2011-12-14
- New features:
- PPC port of LuaJIT is complete. Default is the dual-number port (usually faster). Single-number port selectable via src/Makefile at build time.
- Add FFI callback support.
- Extend -b to generate .c, .h or .obj/.o files with embedded bytecode.
- Allow loading embedded bytecode with require().
- From Lua 5.2: Change to '\z' escape. Reject undefined escape sequences.
- Correctness and completeness:
- Fix OSX 10.7 build. Fix install_name and versioning on OSX.
- Fix iOS build.
- Install dis_arm.lua, too.
- Mark installed shared library as executable.
- Add debug option to msvcbuild.bat and improve error handling.
- Fix data-flow analysis for iterators.
- Fix forced unwinding triggered by external unwinder.
- Record missing for loop slot loads (return to lower frame).
- Always use ANSI variants of Windows system functions.
- Fix GC barrier for multi-result table constructor (TSETM).
- Fix/add various FOLD rules.
- Add potential PHI for number conversions due to type instability.
- Do not eliminate PHIs only referenced from other PHIs.
- Correctly anchor implicit number to string conversions in Lua/C API.
- Fix various stack limit checks.
- x64: Use thread-safe exceptions for external unwinding (GCC platforms).
- x64: Fix result type of cdata index conversions.
- x64: Fix math.random() and bit.bswap() code generation.
- x64: Fix lightuserdata comparisons.
- x64: Always extend stack-passed arguments to pointer size.
- ARM: Many fixes to code generation backend.
- PPC/e500: Fix dispatch for binop metamethods.
- PPC/e500: Save/restore condition registers when entering/leaving the VM.
- PPC/e500: Fix write barrier in stores of strings to upvalues.
- FFI library:
- Fix C comment parsing.
- Fix snapshot optimization for cdata comparisons.
- Fix recording of const/enum lookups in namespaces.
- Fix call argument and return handling for I8/U8/I16/U16 types.
- Fix unfused loads of float fields.
- Fix ffi.string() recording.
- Save GetLastError() around ffi.load() and symbol resolving, too.
- Improve ld script detection in ffi.load().
- Record loads/stores to external variables in namespaces.
- Compile calls to stdcall, fastcall and vararg functions.
- Treat function ctypes like pointers in comparisons.
- Resolve __call metamethod for pointers, too.
- Record C function calls with bool return values.
- Record ffi.errno().
- x86: Fix number to uint32_t conversion rounding.
- x86: Fix 64 bit arithmetic in assembler backend.
- x64: Fix struct-by-value calling conventions.
- ARM: Ensure invocation of SPLIT pass for float conversions.
- Structural and performance enhancements:
- Display trace types with -jv and -jdump.
- Record isolated calls. But prefer recording loops over calls.
- Specialize to prototype for non-monomorphic functions. Solves the trace-explosion problem for closure-heavy programming styles.
- Always generate a portable vmdef.lua. Easier for distros.
LuaJIT 2.0.0-beta8 — 2011-06-23
- New features:
- Soft-float ARM port of LuaJIT is complete.
- Add support for bytecode loading/saving and -b command line option.
- From Lua 5.2: __len metamethod for tables (disabled by default).
- Correctness and completeness:
- ARM: Misc. fixes for interpreter.
- x86/x64: Fix bit.* argument checking in interpreter.
- Catch early out-of-memory in memory allocator initialization.
- Fix data-flow analysis for paths leading to an upvalue close.
- Fix check for missing arguments in string.format().
- Fix Solaris/x86 build (note: not a supported target).
- Fix recording of loops with instable directions in side traces.
- x86/x64: Fix fusion of comparisons with u8/u16 XLOAD.
- x86/x64: Fix register allocation for variable shifts.
- FFI library:
- Add ffi.errno(). Save errno/GetLastError() around allocations etc.
- Fix __gc for VLA/VLS cdata objects.
- Fix recording of casts from 32 bit cdata pointers to integers.
- tonumber(cdata) returns nil for non-numbers.
- Show address pointed to for tostring(pointer).
- Print NULL pointers as "cdata<... *>: NULL".
- Support __tostring metamethod for pointers to structs, too.
- Structural and performance enhancements:
- More tuning for loop unrolling heuristics.
- Flatten and compress in-memory debug info (saves ~70%).
LuaJIT 2.0.0-beta7 — 2011-05-05
- New features:
- ARM port of the LuaJIT interpreter is complete.
- FFI library: Add ffi.gc(), ffi.metatype(), ffi.istype().
- FFI library: Resolve ld script redirection in ffi.load().
- From Lua 5.2: package.searchpath(), fp:read("*L"), load(string).
- From Lua 5.2, disabled by default: empty statement, table.unpack(), modified coroutine.running().
- Correctness and completeness:
- FFI library: numerous fixes.
- Fix type mismatches in store-to-load forwarding.
- Fix error handling within metamethods.
- Fix table.maxn().
- Improve accuracy of x^-k on x64.
- Fix code generation for Intel Atom in x64 mode.
- Fix narrowing of POW.
- Fix recording of retried fast functions.
- Fix code generation for bit.bnot() and multiplies.
- Fix error location within cpcall frames.
- Add workaround for old libgcc unwind bug.
- Fix lua_yield() and getmetatable(lightuserdata) on x64.
- Misc. fixes for PPC/e500 interpreter.
- Fix stack slot updates for down-recursion.
- Structural and performance enhancements:
- Add dual-number mode (int/double) for the VM. Enabled for ARM.
- Improve narrowing of arithmetic operators and for loops.
- Tune loop unrolling heuristics and increase trace recorder limits.
- Eliminate dead slots in snapshots using bytecode data-flow analysis.
- Avoid phantom stores to proxy tables.
- Optimize lookups in empty proxy tables.
- Improve bytecode optimization of and/or operators.
LuaJIT 2.0.0-beta6 — 2011-02-11
- New features:
- PowerPC/e500v2 port of the LuaJIT interpreter is complete.
- Various minor features from Lua 5.2: Hex escapes in literals, '\*' escape, reversible string.format("%q",s), "%g" pattern, table.sort checks callbacks, os.exit(status|true|false[,close]).
- Lua 5.2 __pairs and __ipairs metamethods (disabled by default).
- Initial release of the FFI library.
- Correctness and completeness:
- Fix string.format() for non-finite numbers.
- Fix memory leak when compiled to use the built-in allocator.
- x86/x64: Fix unnecessary resize in TSETM bytecode.
- Fix various GC issues with traces and jit.flush().
- x64: Fix fusion of indexes for array references.
- x86/x64: Fix stack overflow handling for coroutine results.
- Enable low-2GB memory allocation on FreeBSD/x64.
- Fix collectgarbage("count") result if more than 2GB is in use.
- Fix parsing of hex floats.
- x86/x64: Fix loop branch inversion with trailing HREF+NE/EQ.
- Add jit.os string.
- coroutine.create() permits running C functions, too.
- Fix OSX build to work with newer ld64 versions.
- Fix bytecode optimization of and/or operators.
- Structural and performance enhancements:
- Emit specialized bytecode for pairs()/next().
- Improve bytecode coalescing of nil constants.
- Compile calls to vararg functions.
- Compile select().
- Improve alias analysis, esp. for loads from allocations.
- Tuning of various compiler heuristics.
- Refactor and extend IR conversion instructions.
- x86/x64: Various backend enhancements related to the FFI.
- Add SPLIT pass to split 64 bit IR instructions for 32 bit CPUs.
LuaJIT 2.0.0-beta5 — 2010-08-24
- Correctness and completeness:
- Fix trace exit dispatch to function headers.
- Fix Windows and OSX builds with LUAJIT_DISABLE_JIT.
- Reorganize and fix placement of generated machine code on x64.
- Fix TNEW in x64 interpreter.
- Do not eliminate PHIs for values only referenced from side exits.
- OS-independent canonicalization of strings for non-finite numbers.
- Fix string.char() range check on x64.
- Fix tostring() resolving within print().
- Fix error handling for next().
- Fix passing of constant arguments to external calls on x64.
- Fix interpreter argument check for two-argument SSE math functions.
- Fix C frame chain corruption caused by lua_cpcall().
- Fix return from pcall() within active hook.
- Structural and performance enhancements:
- Replace on-trace GC frame syncing with interpreter exit.
- Improve hash lookup specialization by not removing dead keys during GC.
- Turn traces into true GC objects.
- Avoid starting a GC cycle immediately after library init.
- Add weak guards to improve dead-code elimination.
- Speed up string interning.
LuaJIT 2.0.0-beta4 — 2010-03-28
- Correctness and completeness:
- Fix precondition for on-trace creation of table keys.
- Fix {f()} on x64 when table is resized.
- Fix folding of ordered comparisons with same references.
- Fix snapshot restores for multi-result bytecodes.
- Fix potential hang when recording bytecode with nested closures.
- Fix recording of getmetatable(), tonumber() and bad argument types.
- Fix SLOAD fusion across returns to lower frames.
- Structural and performance enhancements:
- Add array bounds check elimination. -Oabc is enabled by default.
- More tuning for x64, e.g. smaller table objects.
LuaJIT 2.0.0-beta3 — 2010-03-07
- LuaJIT x64 port:
- Port integrated memory allocator to Linux/x64, Windows/x64 and OSX/x64.
- Port interpreter and JIT compiler to x64.
- Port DynASM to x64.
- Many 32/64 bit cleanups in the VM.
- Allow building the interpreter with either x87 or SSE2 arithmetic.
- Add external unwinding and C++ exception interop (default on x64).
- Correctness and completeness:
- Fix constructor bytecode generation for certain conditional values.
- Fix some cases of ordered string comparisons.
- Fix lua_tocfunction().
- Fix cutoff register in JMP bytecode for some conditional expressions.
- Fix PHI marking algorithm for references from variant slots.
- Fix package.cpath for non-default PREFIX.
- Fix DWARF2 frame unwind information for interpreter on OSX.
- Drive the GC forward on string allocations in the parser.
- Implement call/return hooks (zero-cost if disabled).
- Implement yield from C hooks.
- Disable JIT compiler on older non-SSE2 CPUs instead of aborting.
- Structural and performance enhancements:
- Compile recursive code (tail-, up- and down-recursion).
- Improve heuristics for bytecode penalties and blacklisting.
- Split CALL/FUNC recording and clean up fast function call semantics.
- Major redesign of internal function call handling.
- Improve FOR loop const specialization and integerness checks.
- Switch to pre-initialized stacks. Avoid frame-clearing.
- Colocation of prototypes and related data: bytecode, constants, debug info.
- Cleanup parser and streamline bytecode generation.
- Add support for weak IR references to register allocator.
- Switch to compressed, extensible snapshots.
- Compile returns to frames below the start frame.
- Improve alias analysis of upvalues using a disambiguation hash value.
- Compile floor/ceil/trunc to SSE2 helper calls or SSE4.1 instructions.
- Add generic C call handling to IR and backend.
- Improve KNUM fuse vs. load heuristics.
- Compile various io.*() functions.
- Compile math.sinh(), math.cosh(), math.tanh() and math.random().
LuaJIT 2.0.0-beta2 — 2009-11-09
- Reorganize build system. Build static+shared library on POSIX.
- Allow C++ exception conversion on all platforms using a wrapper function.
- Automatically catch C++ exceptions and rethrow Lua error (DWARF2 only).
- Check for the correct x87 FPU precision at strategic points.
- Always use wrappers for libm functions.
- Resurrect metamethod name strings before copying them.
- Mark current trace, even if compiler is idle.
- Ensure FILE metatable is created only once.
- Fix type comparisons when different integer types are involved.
- Fix getmetatable() recording.
- Fix TDUP with dead keys in template table.
- jit.flush(tr) returns status. Prevent manual flush of a trace that's still linked.
- Improve register allocation heuristics for invariant references.
- Compile the push/pop variants of table.insert() and table.remove().
- Compatibility with MSVC link /debug.
- Fix lua_iscfunction().
- Fix math.random() when compiled with -fpic (OSX).
- Fix table.maxn().
- Bump MACOSX_DEPLOYMENT_TARGET to 10.4.
- luaL_check*() and luaL_opt*() now support negative arguments, too. This matches the behavior of Lua 5.1, but not the specification.
LuaJIT 2.0.0-beta1 — 2009-10-31
- This is the first public release of LuaJIT 2.0.
- The whole VM has been rewritten from the ground up, so there's no point in listing differences from earlier versions.
LuaJIT 1.1.7 — 2011-05-05
- Added fixes for the currently known bugs in Lua 5.1.4.
LuaJIT 1.1.6 — 2010-03-28
- Added fixes for the currently known bugs in Lua 5.1.4.
- Removed wrong GC check in jit_createstate(). Thanks to Tim Mensch.
- Fixed bad assertions while compiling table.insert() and table.remove().
LuaJIT 1.1.5 — 2008-10-25
- Merged with Lua 5.1.4. Fixes all known bugs in Lua 5.1.3.
LuaJIT 1.1.4 — 2008-02-05
- Merged with Lua 5.1.3. Fixes all known bugs in Lua 5.1.2.
- Fixed possible (but unlikely) stack corruption while compiling k^x expressions.
- Fixed DynASM template for cmpss instruction.
LuaJIT 1.1.3 — 2007-05-24
- Merged with Lua 5.1.2. Fixes all known bugs in Lua 5.1.1.
- Merged pending Lua 5.1.x fixes: "return -nil" bug, spurious count hook call.
- Remove a (sometimes) wrong assertion in luaJIT_findpc().
- DynASM now allows labels for displacements and .aword.
- Fix some compiler warnings for DynASM glue (internal API change).
- Correct naming for SSSE3 (temporarily known as SSE4) in DynASM and x86 disassembler.
- The loadable debug modules now handle redirection to stdout (e.g. -j trace=-).
LuaJIT 1.1.2 — 2006-06-24
- Fix MSVC inline assembly: use only local variables with lua_number2int().
- Fix "attempt to call a thread value" bug on Mac OS X: make values of consts used as lightuserdata keys unique to avoid joining by the compiler/linker.
LuaJIT 1.1.1 — 2006-06-20
- Merged with Lua 5.1.1. Fixes all known bugs in Lua 5.1.
- Enforce (dynamic) linker error for EXE/DLL version mismatches.
- Minor changes to DynASM: faster pre-processing, smaller encoding for some immediates.
This release is in sync with Coco 1.1.1 (see the Coco Change History).
LuaJIT 1.1.0 — 2006-03-13
- Merged with Lua 5.1 (final).
- New JIT call frame setup:
- The C stack is kept 16 byte aligned (faster). Mandatory for Mac OS X on Intel, too.
- Faster calling conventions for internal C helper functions.
- Better instruction scheduling for function prologue, OP_CALL and OP_RETURN.
- Miscellaneous optimizations:
- Faster loads of FP constants. Remove narrow-to-wide store-to-load forwarding stalls.
- Use (scalar) SSE2 ops (if the CPU supports it) to speed up slot moves and FP to integer conversions.
- Optimized the two-argument form of OP_CONCAT (a..b).
- Inlined OP_MOD (a%b). With better accuracy than the C variant, too.
- Inlined OP_POW (a^b). Unroll x^k or use k^x = 2^(log2(k)*x) or call pow().
- Changes in the optimizer:
- Improved hinting for table keys derived from table values (t1[t2[x]]).
- Lookup hinting now works with arbitrary object types and supports index chains, too.
- Generate type hints for arithmetic and comparison operators, OP_LEN, OP_CONCAT and OP_FORPREP.
- Remove several hint definitions in favour of a generic COMBINE hint.
- Complete rewrite of jit.opt_inline module (formerly jit.opt_lib).
- Use adaptive deoptimization:
- If runtime verification of a contract fails, the affected instruction is recompiled and patched on-the-fly. Regular programs will trigger deoptimization only occasionally.
- This avoids generating code for uncommon fallback cases most of the time. Generated code is up to 30% smaller compared to LuaJIT 1.0.3.
- Deoptimization is used for many opcodes and contracts:
- OP_CALL, OP_TAILCALL: type mismatch for callable.
- Inlined calls: closure mismatch, parameter number and type mismatches.
- OP_GETTABLE, OP_SETTABLE: table or key type and range mismatches.
- All arithmetic and comparison operators, OP_LEN, OP_CONCAT, OP_FORPREP: operand type and range mismatches.
- Complete redesign of the debug and traceback info (bytecode ↔ mcode) to support deoptimization. Much more flexible and needs only 50% of the space.
- The modules jit.trace, jit.dumphints and jit.dump handle deoptimization.
- Inlined many popular library functions (for commonly used arguments only):
- Most math.* functions (the 18 most used ones) [2x-10x faster].
- string.len, string.sub and string.char [2x-10x faster].
- table.insert, table.remove and table.getn [3x-5x faster].
- coroutine.yield and coroutine.resume [3x-5x faster].
- pairs, ipairs and the corresponding iterators [8x-15x faster].
- Changes in the core and loadable modules and the stand-alone executable:
- Added jit.version, jit.version_num and jit.arch.
- Reorganized some internal API functions (jit.util.*mcode*).
- The -j dump output now shows JSUB names, too.
- New x86 disassembler module written in pure Lua. No dependency on ndisasm anymore. Flexible API, very compact (500 lines) and complete (x87, MMX, SSE, SSE2, SSE3, SSSE3, privileged instructions).
- luajit -v prints the LuaJIT version and copyright on a separate line.
- Added SSE, SSE2, SSE3 and SSSE3 support to DynASM.
- Miscellaneous doc changes. Added a section about embedding LuaJIT.
This release is in sync with Coco 1.1.0 (see the Coco Change History).
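The OP_POW entry in the 1.1.0 notes above names two special cases: unrolling x^k for a constant exponent, and rewriting k^x as 2^(log2(k)*x) for a constant base. The sketch below only illustrates the arithmetic behind those two cases. It is not LuaJIT's generated code, and the function names `pow_const_exponent` and `pow_const_base` are hypothetical, chosen for this example. Repeated squaring stands in for the unrolled multiplies, on the assumption that k is a small non-negative integer.

```python
import math


def pow_const_exponent(x: float, k: int) -> float:
    """Compute x^k for a non-negative integer k using only multiplies.

    Repeated squaring is used here as a stand-in for an unrolled
    multiply sequence (an assumption made for this illustration).
    """
    result = 1.0
    base = x
    while k > 0:
        if k & 1:          # this bit of the exponent is set
            result *= base
        base *= base       # square for the next bit
        k >>= 1
    return result


def pow_const_base(k: float, x: float) -> float:
    """Compute k^x for a constant base k > 0 via k^x = 2^(log2(k) * x)."""
    # log2(k) depends only on the constant base, so it can be computed once.
    return 2.0 ** (math.log2(k) * x)
```

Both helpers should agree with the general power function up to floating-point rounding, for example when comparing `pow_const_base(3.0, 2.5)` against `3.0 ** 2.5`.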
LuaJIT 1.0.3 — 2005-09-08
- Even more docs.
- Unified closure checks in jit.*.
- Fixed some range checks in jit.util.*.
- Fixed __newindex call originating from jit_settable_str().
- Merged with Lua 5.1 alpha (including early bug fixes).
This is the first public release of LuaJIT.
LuaJIT 1.0.2 — 2005-09-02
- Add support for flushing the Valgrind translation cache (MYCFLAGS= -DUSE_VALGRIND).
- Add support for freeing executable mcode memory to the mmap()-based variant for POSIX systems.
- Reorganized the C function signature handling in jit.opt_lib.
- Changed to index-based hints for inlining C functions. Still no support in the backend for inlining.
- Hardcode HEAP_CREATE_ENABLE_EXECUTE value if undefined.
- Misc. changes to the jit.* modules.
- Misc. changes to the Makefiles.
- Lots of new docs.
- Complete doc reorg.
Not released because Lua 5.1 alpha came out today.
LuaJIT 1.0.1 — 2005-08-31
- Add missing GC step in OP_CONCAT.
- Fix result handling for C → JIT calls.
- Detect CPU feature bits.
- Encode conditional moves (fucomip) only when supported.
- Add fallback instructions for FP compares.
- Add support for LUA_COMPAT_VARARG. Still disabled by default.
- MSVC needs a specific place for the CALLBACK attribute (David Burgess).
- Misc. doc updates.
Interim non-public release. Special thanks to Adam D. Moss for reporting most of the bugs.
LuaJIT 1.0.0 — 2005-08-29
This is the initial non-public release of LuaJIT.