
Sqlite Performance - turrini
http://stackoverflow.com/questions/1711631/improve-insert-per-second-performance-of-sqlite
======
nodesocket
> If you have indices, consider calling CREATE INDEX after doing all your
> inserts. This is significantly faster than creating the index and then doing
> your inserts.

That's interesting. Can anybody explain why?

~~~
SQLite
Short answer: Write Amplification.

Long answer: When the index is created first, new entries have to be inserted
into the index (in sorted order) as they arrive. This involves writing various
individual records at arbitrary places in the b-tree. Each such write is
perhaps 20 bytes in size (depending on your index, of course). But, behind the
scenes, your OS is really reading an entire 4K page from your HDD/SSD,
swapping out the 20 bytes that you sent to write() and the pushing the whole
4K page back to the HDD/SSD. So your 20-byte random write turned into a 4K
byte write. That is "write applification". But if you insert all the records
into the table first (inserting into an unindexed table is just an append)
then create the index afterwards, SQLite is free to sort the columns to be
indexed (using an efficient external merge sort algorithm) then write them to
disk in sorted order. That way, each new 20-byte records is appended, rather
than written randomly into the file. That means multiple appends can be
grouped together to form a full 4K page which is then written exactly once.
Much less I/O is required.

~~~
nodesocket
Amazing explanation. I am envious of your knowledge :-)

------
alschwalm
Anyone have any thoughts on why removing the superfluous assignments had such
a dramatic effect? That was a surprise to me.

~~~
StavrosK
Looks to me like each one was taking 10 μsec, so 100k of them shaved a second
off. I am really surprised at how SQLite can insert a single record in 100
μsec, though.

