nat-5.0.3
This directory contains versions of the front end files that have been
rewritten to be more "native" to Lua. These files should be considered
exercises in exploring ways to write the front end, for example, a front
end that is optimized for size. See also the file size data further
below.
The following are the different versions available (mk2 == "mark 2",
this is commonly used in the UK, e.g. for aeroplanes during WWII):
Lexers
------
WARNING: These lexers may or may not behave exactly like the original
lexer when lexing strings or long strings with embedded CRLF newlines.
The CRLF sequence may be translated into LF (the reference manual is
unclear on this). The user is advised to stick to LF line endings
exclusively.
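One way to follow this advice is to normalize line endings before the source reaches a lexer. The following is a minimal sketch only; normalize_newlines is a hypothetical helper, not part of any of the lexers described here.

```lua
-- Hypothetical helper (not part of this directory's lexers): convert every
-- CRLF pair to a single LF so the lexer only ever sees LF line endings.
local function normalize_newlines(source)
  -- The parentheses drop gsub's second return value (the match count).
  return (string.gsub(source, "\r\n", "\n"))
end

local chunk = "local s = [[line one\r\nline two]]\r\n"
print(normalize_newlines(chunk) == "local s = [[line one\nline two]]\n")  --> true
```

Note that this also rewrites CRLF pairs inside string literals and long strings, so it changes the string's contents; that is the trade-off for getting predictable lexing.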
llex_mk2        Rewritten from original ported code to become more
                Lua-like. Still uses a stream-based input interface.
                MK2 still scans using a per-character function that
                is pretty inefficient.

                Status: TESTED

llex_mk3        A rewritten version of MK2 that needs input to be
                entered as a single string. Unless an application's
                need is very unusual, this should not be a problem.
                It will not work for per-line interaction, though.
                MK3 no longer needs stream input functions. This
                version is also heavily optimized for size. MK3 scans
                using find functions and doesn't create many strings.

                Status: TESTED

llex_mk4        A rewritten version of MK3 that is line-oriented.
                This allows a command-line version that works in the
                usual way to be written.

                Status: TESTED
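The difference between per-character scanning and find-based scanning can be illustrated with a small sketch. This is not code from llex_mk2 or llex_mk3; both function names are hypothetical, and the example only scans an identifier, assuming the caller has already checked that the first character can start a name.

```lua
-- Per-character style (illustrative): one string.sub call, and one short
-- temporary string, for every character examined.
local function scan_name_per_char(src, pos)
  local start = pos
  while pos <= string.len(src) do
    local c = string.sub(src, pos, pos)
    if not string.find(c, "^[%w_]$") then break end
    pos = pos + 1
  end
  return string.sub(src, start, pos - 1), pos
end

-- Find-based style (illustrative): a single anchored string.find call
-- captures the whole token and creates only one new string.
local function scan_name_find(src, pos)
  local _, last, name = string.find(src, "^([%a_][%w_]*)", pos)
  if not name then return nil, pos end
  return name, last + 1
end

print(scan_name_per_char("foo_1 = 2", 1))  --> foo_1   6
print(scan_name_find("foo_1 = 2", 1))      --> foo_1   6
```

Both functions return the token and the position just past it. The find-based version does the character loop inside the string library rather than in Lua code, which is one plausible reason for a speed gap of the kind measured below.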
The following is a comparison of file sizes (as of 20061111):
                    lzio     llex     TOTAL    Speed (2)
                    (bytes)  (bytes)  (bytes)  (KB/s)
----------------------------------------------------------
Binary (Mingw)      416      5312     5728     N/A
----------------------------------------------------------
(in orig-5.0.3:)
----------------------------------------------------------
normal              2219     12639    14585    404.9
stripped            1292     7618     8910
----------------------------------------------------------
(in nat-5.0.3:)
----------------------------------------------------------
mk2                 1995     7628     9623     469.5
mk2-stripped        1195     4003     5298
----------------------------------------------------------
mk3 (1)             -        6552     6552     1870.8
mk3-stripped        -        3286     3286
----------------------------------------------------------
mk4                 1337     6956     8293     802.9
mk4-stripped        798      3457     4225
----------------------------------------------------------
(1) mk3 does not have a file input streaming function
(2) Speed was benchmarked using a Sempron 3000+. Benchmark scripts are
in the test directories. Best of first three figures quoted. This is a
measurement of raw lexer speed, i.e. tokens are read but no processing
is done. All files are read in entirely before running the lexer.
The performance of the orig-5.0.3 parser is probably a whole order of
magnitude lower than that of the orig-5.0.3 lexer.
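A throughput figure of this kind can be obtained with a harness along the following lines. This is a rough sketch and not the benchmark script from the test directories; benchmark and the stand-in lex_all function are hypothetical names, and os.clock measures CPU time, which is an assumption about how the timing was done.

```lua
-- Hypothetical harness: run a lexer function over an in-memory source
-- string several times and return the best throughput in KB/s.
local function benchmark(lex_all, source, runs)
  local kbytes = string.len(source) / 1024
  local best = 0
  for i = 1, runs do
    local t0 = os.clock()
    lex_all(source)                 -- tokens are read, nothing is done with them
    local elapsed = os.clock() - t0
    if elapsed > 0 then             -- skip runs too short for the clock to resolve
      local rate = kbytes / elapsed
      if rate > best then best = rate end
    end
  end
  return best
end

-- Stand-in "lexer" for demonstration: just walks over non-space runs.
local function lex_all(source)
  local _, count = string.gsub(source, "%S+", "")
  return count
end

local source = string.rep("local x = x + 1\n", 20000)
print(benchmark(lex_all, source, 3))
```

The source is built in memory before timing starts, matching the note that files are read in entirely before the lexer runs.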
Parsers
-------
lparser_mk3     Written for the simplified lexer interface of llex_mk3+.
                (Should be compatible with llex_mk4 too, but untested.)
                This is a parser skeleton, stripped of codegen code. See
                the comments in the source code for more information.
                Without logging messages and comments, it's under 600 LOC.

                Sample output of the parser message logger can be found
                in the test/parser_log subdirectory.

                Tested with test_parser-5.0.lua (the Lua 5.0.x parser test
                cases in the test_lua/ directory); appears to be fine.

                Status: SNIPPETS APPEAR TO WORK

lparser_mk3b    As above, with variable management code added. Variable
                management is a big step towards making the parser
                useful, since it allows the parser to differentiate
                locals, upvalues and globals. The number of added lines
                is around 100 LOC. A binary chunk of lparser_mk3b
                (stripped) is 18076 bytes.

                Sample output of the parser message logger can be found
                in the test/parser_log subdirectory.

                Tested with test_parser-5.0.lua (the Lua 5.0.x parser test
                cases in the test_lua/ directory); appears to be fine.

                Status: SNIPPETS APPEAR TO WORK
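To illustrate what differentiating locals, upvalues and globals involves, here is a much simplified sketch. It is not the lparser_mk3b implementation: the function-state table and all three function names are hypothetical, and block scoping inside a function is ignored.

```lua
-- Hypothetical function-state record: each function being parsed keeps a
-- set of its declared local names and a link to the enclosing function.
local function new_funcstate(parent)
  return { parent = parent, locals = {} }
end

local function declare_local(fs, name)
  fs.locals[name] = true
end

-- Classify a name as seen from inside function state fs.
local function classify(fs, name)
  if fs.locals[name] then return "local" end
  local outer = fs.parent
  while outer do                    -- search enclosing functions outwards
    if outer.locals[name] then return "upvalue" end
    outer = outer.parent
  end
  return "global"                   -- not declared anywhere in the chain
end

local main = new_funcstate(nil)
declare_local(main, "x")
local inner = new_funcstate(main)
declare_local(inner, "y")
print(classify(inner, "y"))      --> local
print(classify(inner, "x"))      --> upvalue
print(classify(inner, "print"))  --> global
```

A real parser also has to track where each local's scope begins and ends within a block, which this sketch leaves out.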
There will be no further development beyond lparser_mk3b. Further work
will focus on a 5.1.x equivalent, for which both a parser skeleton and a
parser with full code generation, using nicely commented code, are
planned.
Other notes:
------------
For Lua 5.0.2, see Yueliang 0.1.3, which was the last release of Lua
5.0.2 material.
Test scripts for the lexer should probably be consolidated, but it's a
little difficult because not all lexers give the same error messages or
use the same token names or format.