sqlite source (unix build) added to libraries

author: dan miller 2007-10-20 02:49:29 +0000
committer: dan miller 2007-10-20 02:49:29 +0000
commit: e36d23a85ebff914d74bb541558c2b6082b78edb (patch)
tree: 54b58fdf162e78af64055282a6035c8d2443389d /libraries/sqlite/unix/sqlite-3.5.1/www/optimizer.tcl
parent: * Fixed an issue whereby avatar chat distances were being calculated against ... (diff)
download: opensim-SC-e36d23a85ebff914d74bb541558c2b6082b78edb.zip
opensim-SC-e36d23a85ebff914d74bb541558c2b6082b78edb.tar.gz
opensim-SC-e36d23a85ebff914d74bb541558c2b6082b78edb.tar.bz2
opensim-SC-e36d23a85ebff914d74bb541558c2b6082b78edb.tar.xz
1 files changed, 265 insertions, 0 deletions
diff --git a/libraries/sqlite/unix/sqlite-3.5.1/www/optimizer.tcl b/libraries/sqlite/unix/sqlite-3.5.1/www/optimizer.tcl
new file mode 100644
index 0000000..5b2897e
--- /dev/null
+++ b/libraries/sqlite/unix/sqlite-3.5.1/www/optimizer.tcl
@@ -0,0 +1,265 @@
+#
+# Run this TCL script to generate HTML for the goals.html file.
+#
+set rcsid {$Id: optimizer.tcl,v 1.1 2005/08/30 22:44:06 drh Exp $}
+source common.tcl
+header {The SQLite Query Optimizer}
+proc CODE {text} {
+  puts "<blockquote><pre>"
+  puts $text
+  puts "</pre></blockquote>"
+}
+proc IMAGE {name {caption {}}} {
+  puts "<center><img src=\"$name\">"
+  if {$caption!=""} {
+    puts "<br>$caption"
+  }
+  puts "</center>"
+}
+proc PARAGRAPH {text} {
+  puts "<p>$text</p>\n"
+}
+proc HEADING {level name} {
+  puts "<h$level>$name</h$level>"
+}
+HEADING 1 {The SQLite Query Optimizer}
+PARAGRAPH {
+  This article describes how the SQLite query optimizer works.
+  This is not something you have to know in order to use SQLite - many
+  programmers use SQLite successfully without the slightest hint of what
+  goes on in the inside.
+  But a basic understanding of what SQLite is doing
+  behind the scenes will help you to write more efficient SQL.  And the
+  knowledge gained by studying the SQLite query optimizer has broad
+  application since most other relational database engines operate 
+  similarly.
+  A solid understanding of how the query optimizer works is also
+  required before making meaningful changes or additions to the SQLite, so 
+  this article should be read closely by anyone aspiring
+  to hack the source code.
+}
+HEADING 2 Background
+PARAGRAPH {
+  It is important to understand that SQL is a programming language.
+  SQL is a perculiar programming language in that it
+  describes <u>what</u> the programmer wants to compute not <u>how</u>
+  to compute it as most other programming languages do.
+  But perculiar or not, SQL is still just a programming language.
+}
+PARAGRAPH {
+  It is very helpful to think of each SQL statement as a separate
+  program.
+  An important job of the SQL database engine is to translate each
+  SQL statement from its descriptive form that specifies what the
+  information is desired (the <u>what</u>) 
+  into a procedural form that specifies how to go
+  about acquiring the desired information (the <u>how</u>).
+  The task of translating the <u>what</u> into a 
+  <u>how</u> is assigned to the query optimizer.
+}
+PARAGRAPH {
+  The beauty of SQL comes from the fact that the optimizer frees the programmer
+  from having to worry over the details of <u>how</u>.  The programmer
+  only has to specify the <u>what</u> and then leave the optimizer
+  to deal with all of the minutae of implementing the
+  <u>how</u>.  Thus the programmer is able to think and work at a
+  much higher level and leave the optimizer to stress over the low-level
+  work.
+}
+HEADING 2 {Database Layout}
+PARAGRAPH {
+  An SQLite database consists of one or more "b-trees".
+  Each b-tree contains zero or more "rows". 
+  A single row contains a "key" and some "data".
+  In general, both the key and the data are arbitrary binary
+  data of any length.
+  The keys must all be unique within a single b-tree.
+  Rows are stored in order of increasing key values - each
+  b-tree has a comparision functions for keys that determines
+  this order.
+}
+PARAGRAPH {
+  In SQLite, each SQL table is stored as a b-tree where the
+  key is a 64-bit integer and the data is the content of the
+  table row.  The 64-bit integer key is the ROWID.  And, of course,
+  if the table has an INTEGER PRIMARY KEY, then that integer is just
+  an alias for the ROWID.
+}
+PARAGRAPH {
+  Consider the following block of SQL code:
+}
+CODE {
+  CREATE TABLE ex1(
+     id INTEGER PRIMARY KEY,
+     x  VARCHAR(30),
+     y  INTEGER
+  );
+  INSERT INTO ex1 VALUES(NULL,'abc',12345);
+  INSERT INTO ex1 VALUES(NULL,456,'def');
+  INSERT INTO ex1 VALUES(100,'hello','world');
+  INSERT INTO ex1 VALUES(-5,'abc','xyz');
+  INSERT INTO ex1 VALUES(54321,NULL,987);
+}
+PARAGRAPH {
+  This code generates a new b-tree (named "ex1") containing 5 rows.
+  This table can be visualized as follows:
+}
+IMAGE table-ex1b2.gif
+PARAGRAPH {
+  Note that the key for each row if the b-tree is the INTEGER PRIMARY KEY
+  for that row.  (Remember that the INTEGER PRIMARY KEY is just an alias
+  for the ROWID.)  The other fields of the table form the data for each
+  entry in the b-tree.  Note also that the b-tree entries are in ROWID order
+  which is different from the order that they were originally inserted.
+}
+PARAGRAPH {
+  Now consider the following SQL query:
+}
+CODE {
+  SELECT y FROM ex1 WHERE x=456;
+}
+PARAGRAPH {
+  When the SQLite parser and query optimizer are handed this query, they
+  have to translate it into a procedure that will find the desired result.
+  In this case, they do what is call a "full table scan".  They start
+  at the beginning of the b-tree that contains the table and visit each
+  row.  Within each row, the value of the "x" column is tested and when it
+  is found to match 456, the value of the "y" column is output.
+  We can represent this procedure graphically as follows:
+}
+IMAGE fullscanb.gif
+PARAGRAPH {
+  A full table scan is the access method of last resort.  It will always
+  work.  But if the table contains millions of rows and you are only looking
+  a single one, it might take a very long time to find the particular row
+  you are interested in.
+  In particular, the time needed to access a single row of the table is
+  proportional to the total number of rows in the table.
+  So a big part of the job of the optimizer is to try to find ways to 
+  satisfy the query without doing a full table scan.
+}
+PARAGRAPH {
+  The usual way to avoid doing a full table scan is use a binary search
+  to find the particular row or rows of interest in the table.
+  Consider the next query which searches on rowid instead of x:
+}
+CODE {
+  SELECT y FROM ex1 WHERE rowid=2;
+}
+PARAGRAPH {
+  In the previous query, we could not use a binary search for x because
+  the values of x were not ordered.  But the rowid values are ordered.
+  So instead of having to visit every row of the b-tree looking for one
+  that has a rowid value of 2, we can do a binary search for that particular
+  row and output its corresponding y value.  We show this graphically
+  as follows:
+}
+IMAGE direct1b.gif
+PARAGRAPH {
+  When doing a binary search, we only have to look at a number of
+  rows with is proportional to the logorithm of the number of entries
+  in the table.  For a table with just 5 entires as in the example above,
+  the difference between a full table scan and a binary search is
+  negligible.  In fact, the full table scan might be faster.  But in
+  a database that has 5 million rows, a binary search will be able to
+  find the desired row in only about 23 tries, whereas the full table
+  scan will need to look at all 5 million rows.  So the binary search
+  is about 200,000 times faster in that case.
+}
+PARAGRAPH {
+  A 200,000-fold speed improvement is huge.  So we always want to do
+  a binary search rather than a full table scan when we can.
+}
+PARAGRAPH {
+  The problem with a binary search is that the it only works if the
+  fields you are search for are in sorted order.  So we can do a binary
+  search when looking up the rowid because the rows of the table are
+  sorted by rowid.  But we cannot use a binary search when looking up
+  x because the values in the x column are in no particular order.
+}
+PARAGRAPH {
+  The way to work around this problem and to permit binary searching on
+  fields like x is to provide an index.
+  An index is another b-tree.
+  But in the index b-tree the key is not the rowid but rather the field
+  or fields being indexed followed by the rowid.
+  The data in an index b-tree is empty - it is not needed or used.
+  The following diagram shows an index on the x field of our example table:
+}
+IMAGE index-ex1-x-b.gif
+PARAGRAPH {
+  An important point to note in the index are that they keys of the
+  b-tree are in sorted order.  (Recall that NULL values in SQLite sort
+  first, followed by numeric values in numerical order, then strings, and
+  finally BLOBs.)  This is the property that will allow use to do a
+  binary search for the field x.  The rowid is also included in every
+  key for two reasons.  First, by including the rowid we guarantee that
+  every key will be unique.  And second, the rowid will be used to look
+  up the actual table entry after doing the binary search.  Finally, note
+  that the data portion of the index b-tree serves no purpose and is thus
+  kept empty to save space in the disk file.
+}
+PARAGRAPH {
+  Remember what the original query example looked like:
+}
+CODE {
+  SELECT y FROM ex1 WHERE x=456;
+}
+PARAGRAPH {
+  The first time this query was encountered we had to do a full table
+  scan.  But now that we have an index on x, we can do a binary search
+  on that index for the entry where x==456.  Then from that entry we
+  can find the rowid value and use the rowid to look up the corresponding
+  entry in the original table.  From the entry in the original table,
+  we can find the value y and return it as our result.  The following
+  diagram shows this process graphically:
+}
+IMAGE indirect1b1.gif
+PARAGRAPH {
+  With the index, we are able to look up an entry based on the value of
+  x after visiting only a logorithmic number of b-tree entries.  Unlike
+  the case where we were searching using rowid, we have to do two binary
+  searches for each output row.  But for a 5-million row table, that is
+  still only 46 searches instead of 5 million for a 100,000-fold speedup.
+}
+HEADING 3 {Parsing The WHERE Clause}
+# parsing the where clause
+# rowid lookup
+# index lookup
+# index lookup without the table
+# how an index is chosen
+# joins
+# join reordering
+# order by using an index
+# group by using an index
+# OR -> IN optimization
+# Bitmap indices
+# LIKE and GLOB optimization
+# subquery flattening
+# MIN and MAX optimizations
author	dan miller	2007-10-20 02:49:29 +0000
committer	dan miller	2007-10-20 02:49:29 +0000
commit	e36d23a85ebff914d74bb541558c2b6082b78edb (patch)
tree	54b58fdf162e78af64055282a6035c8d2443389d /libraries/sqlite/unix/sqlite-3.5.1/www/optimizer.tcl
parent	* Fixed an issue whereby avatar chat distances were being calculated against ... (diff)
download	opensim-SC-e36d23a85ebff914d74bb541558c2b6082b78edb.zip opensim-SC-e36d23a85ebff914d74bb541558c2b6082b78edb.tar.gz opensim-SC-e36d23a85ebff914d74bb541558c2b6082b78edb.tar.bz2 opensim-SC-e36d23a85ebff914d74bb541558c2b6082b78edb.tar.xz

diff --git a/libraries/sqlite/unix/sqlite-3.5.1/www/optimizer.tcl b/libraries/sqlite/unix/sqlite-3.5.1/www/optimizer.tcl new file mode 100644 index 0000000..5b2897e --- /dev/null +++ b/libraries/sqlite/unix/sqlite-3.5.1/www/optimizer.tcl
@@ -0,0 +1,265 @@
	1	#
	2	# Run this TCL script to generate HTML for the goals.html file.
	3	#
	4	set rcsid {$Id: optimizer.tcl,v 1.1 2005/08/30 22:44:06 drh Exp $}
	5	source common.tcl
	6	header {The SQLite Query Optimizer}
	7
	8	proc CODE {text} {
	9	puts "<blockquote><pre>"
	10	puts $text
	11	puts "</pre></blockquote>"
	12	}
	13	proc IMAGE {name {caption {}}} {
	14	puts "<center><img src=\"$name\">"
	15	if {$caption!=""} {
	16	puts "<br>$caption"
	17	}
	18	puts "</center>"
	19	}
	20	proc PARAGRAPH {text} {
	21	puts "<p>$text</p>\n"
	22	}
	23	proc HEADING {level name} {
	24	puts "<h$level>$name</h$level>"
	25	}
	26
	27	HEADING 1 {The SQLite Query Optimizer}
	28
	29	PARAGRAPH {
	30	This article describes how the SQLite query optimizer works.
	31	This is not something you have to know in order to use SQLite - many
	32	programmers use SQLite successfully without the slightest hint of what
	33	goes on in the inside.
	34	But a basic understanding of what SQLite is doing
	35	behind the scenes will help you to write more efficient SQL. And the
	36	knowledge gained by studying the SQLite query optimizer has broad
	37	application since most other relational database engines operate
	38	similarly.
	39	A solid understanding of how the query optimizer works is also
	40	required before making meaningful changes or additions to the SQLite, so
	41	this article should be read closely by anyone aspiring
	42	to hack the source code.
	43	}
	44
	45	HEADING 2 Background
	46
	47	PARAGRAPH {
	48	It is important to understand that SQL is a programming language.
	49	SQL is a perculiar programming language in that it
	50	describes <u>what</u> the programmer wants to compute not <u>how</u>
	51	to compute it as most other programming languages do.
	52	But perculiar or not, SQL is still just a programming language.
	53	}
	54
	55	PARAGRAPH {
	56	It is very helpful to think of each SQL statement as a separate
	57	program.
	58	An important job of the SQL database engine is to translate each
	59	SQL statement from its descriptive form that specifies what the
	60	information is desired (the <u>what</u>)
	61	into a procedural form that specifies how to go
	62	about acquiring the desired information (the <u>how</u>).
	63	The task of translating the <u>what</u> into a
	64	<u>how</u> is assigned to the query optimizer.
	65	}
	66
	67	PARAGRAPH {
	68	The beauty of SQL comes from the fact that the optimizer frees the programmer
	69	from having to worry over the details of <u>how</u>. The programmer
	70	only has to specify the <u>what</u> and then leave the optimizer
	71	to deal with all of the minutae of implementing the
	72	<u>how</u>. Thus the programmer is able to think and work at a
	73	much higher level and leave the optimizer to stress over the low-level
	74	work.
	75	}
	76
	77	HEADING 2 {Database Layout}
	78
	79	PARAGRAPH {
	80	An SQLite database consists of one or more "b-trees".
	81	Each b-tree contains zero or more "rows".
	82	A single row contains a "key" and some "data".
	83	In general, both the key and the data are arbitrary binary
	84	data of any length.
	85	The keys must all be unique within a single b-tree.
	86	Rows are stored in order of increasing key values - each
	87	b-tree has a comparision functions for keys that determines
	88	this order.
	89	}
	90
	91	PARAGRAPH {
	92	In SQLite, each SQL table is stored as a b-tree where the
	93	key is a 64-bit integer and the data is the content of the
	94	table row. The 64-bit integer key is the ROWID. And, of course,
	95	if the table has an INTEGER PRIMARY KEY, then that integer is just
	96	an alias for the ROWID.
	97	}
	98
	99	PARAGRAPH {
	100	Consider the following block of SQL code:
	101	}
	102
	103	CODE {
	104	CREATE TABLE ex1(
	105	id INTEGER PRIMARY KEY,
	106	x VARCHAR(30),
	107	y INTEGER
	108	);
	109	INSERT INTO ex1 VALUES(NULL,'abc',12345);
	110	INSERT INTO ex1 VALUES(NULL,456,'def');
	111	INSERT INTO ex1 VALUES(100,'hello','world');
	112	INSERT INTO ex1 VALUES(-5,'abc','xyz');
	113	INSERT INTO ex1 VALUES(54321,NULL,987);
	114	}
	115
	116	PARAGRAPH {
	117	This code generates a new b-tree (named "ex1") containing 5 rows.
	118	This table can be visualized as follows:
	119	}
	120	IMAGE table-ex1b2.gif
	121
	122	PARAGRAPH {
	123	Note that the key for each row if the b-tree is the INTEGER PRIMARY KEY
	124	for that row. (Remember that the INTEGER PRIMARY KEY is just an alias
	125	for the ROWID.) The other fields of the table form the data for each
	126	entry in the b-tree. Note also that the b-tree entries are in ROWID order
	127	which is different from the order that they were originally inserted.
	128	}
	129
	130	PARAGRAPH {
	131	Now consider the following SQL query:
	132	}
	133	CODE {
	134	SELECT y FROM ex1 WHERE x=456;
	135	}
	136
	137	PARAGRAPH {
	138	When the SQLite parser and query optimizer are handed this query, they
	139	have to translate it into a procedure that will find the desired result.
	140	In this case, they do what is call a "full table scan". They start
	141	at the beginning of the b-tree that contains the table and visit each
	142	row. Within each row, the value of the "x" column is tested and when it
	143	is found to match 456, the value of the "y" column is output.
	144	We can represent this procedure graphically as follows:
	145	}
	146	IMAGE fullscanb.gif
	147
	148	PARAGRAPH {
	149	A full table scan is the access method of last resort. It will always
	150	work. But if the table contains millions of rows and you are only looking
	151	a single one, it might take a very long time to find the particular row
	152	you are interested in.
	153	In particular, the time needed to access a single row of the table is
	154	proportional to the total number of rows in the table.
	155	So a big part of the job of the optimizer is to try to find ways to
	156	satisfy the query without doing a full table scan.
	157	}
	158	PARAGRAPH {
	159	The usual way to avoid doing a full table scan is use a binary search
	160	to find the particular row or rows of interest in the table.
	161	Consider the next query which searches on rowid instead of x:
	162	}
	163	CODE {
	164	SELECT y FROM ex1 WHERE rowid=2;
	165	}
	166
	167	PARAGRAPH {
	168	In the previous query, we could not use a binary search for x because
	169	the values of x were not ordered. But the rowid values are ordered.
	170	So instead of having to visit every row of the b-tree looking for one
	171	that has a rowid value of 2, we can do a binary search for that particular
	172	row and output its corresponding y value. We show this graphically
	173	as follows:
	174	}
	175	IMAGE direct1b.gif
	176
	177	PARAGRAPH {
	178	When doing a binary search, we only have to look at a number of
	179	rows with is proportional to the logorithm of the number of entries
	180	in the table. For a table with just 5 entires as in the example above,
	181	the difference between a full table scan and a binary search is
	182	negligible. In fact, the full table scan might be faster. But in
	183	a database that has 5 million rows, a binary search will be able to
	184	find the desired row in only about 23 tries, whereas the full table
	185	scan will need to look at all 5 million rows. So the binary search
	186	is about 200,000 times faster in that case.
	187	}
	188	PARAGRAPH {
	189	A 200,000-fold speed improvement is huge. So we always want to do
	190	a binary search rather than a full table scan when we can.
	191	}
	192	PARAGRAPH {
	193	The problem with a binary search is that the it only works if the
	194	fields you are search for are in sorted order. So we can do a binary
	195	search when looking up the rowid because the rows of the table are
	196	sorted by rowid. But we cannot use a binary search when looking up
	197	x because the values in the x column are in no particular order.
	198	}
	199	PARAGRAPH {
	200	The way to work around this problem and to permit binary searching on
	201	fields like x is to provide an index.
	202	An index is another b-tree.
	203	But in the index b-tree the key is not the rowid but rather the field
	204	or fields being indexed followed by the rowid.
	205	The data in an index b-tree is empty - it is not needed or used.
	206	The following diagram shows an index on the x field of our example table:
	207	}
	208	IMAGE index-ex1-x-b.gif
	209
	210	PARAGRAPH {
	211	An important point to note in the index are that they keys of the
	212	b-tree are in sorted order. (Recall that NULL values in SQLite sort
	213	first, followed by numeric values in numerical order, then strings, and
	214	finally BLOBs.) This is the property that will allow use to do a
	215	binary search for the field x. The rowid is also included in every
	216	key for two reasons. First, by including the rowid we guarantee that
	217	every key will be unique. And second, the rowid will be used to look
	218	up the actual table entry after doing the binary search. Finally, note
	219	that the data portion of the index b-tree serves no purpose and is thus
	220	kept empty to save space in the disk file.
	221	}
	222	PARAGRAPH {
	223	Remember what the original query example looked like:
	224	}
	225	CODE {
	226	SELECT y FROM ex1 WHERE x=456;
	227	}
	228
	229	PARAGRAPH {
	230	The first time this query was encountered we had to do a full table
	231	scan. But now that we have an index on x, we can do a binary search
	232	on that index for the entry where x==456. Then from that entry we
	233	can find the rowid value and use the rowid to look up the corresponding
	234	entry in the original table. From the entry in the original table,
	235	we can find the value y and return it as our result. The following
	236	diagram shows this process graphically:
	237	}
	238	IMAGE indirect1b1.gif
	239
	240	PARAGRAPH {
	241	With the index, we are able to look up an entry based on the value of
	242	x after visiting only a logorithmic number of b-tree entries. Unlike
	243	the case where we were searching using rowid, we have to do two binary
	244	searches for each output row. But for a 5-million row table, that is
	245	still only 46 searches instead of 5 million for a 100,000-fold speedup.
	246	}
	247
	248	HEADING 3 {Parsing The WHERE Clause}
	249
	250
	251
	252	# parsing the where clause
	253	# rowid lookup
	254	# index lookup
	255	# index lookup without the table
	256	# how an index is chosen
	257	# joins
	258	# join reordering
	259	# order by using an index
	260	# group by using an index
	261	# OR -> IN optimization
	262	# Bitmap indices
	263	# LIKE and GLOB optimization
	264	# subquery flattening
	265	# MIN and MAX optimizations