

LevelDB Benchmarks - dchest
http://leveldb.googlecode.com/svn/trunk/doc/benchmark.html

======
snewman
I wish they had done a benchmark of mixed reads and writes. LSMs have some
nice properties, but read performance in the presence of writes may not be as
good as when there are no writes. With no write traffic, you wind up with only
a single .sst file, so reads only have to consult that one table.
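The read-amplification concern above can be sketched in a few lines. This is a hypothetical illustration of an LSM read path, not LevelDB's actual code: a read probes the in-memory memtable first, then each on-disk table from newest to oldest, so the cost of a lookup grows with the number of tables that writes have produced.

```python
# Hypothetical LSM read path (illustration only, not LevelDB's code):
# check the memtable, then each table newest-to-oldest. With no write
# traffic there is one table and every read is a single lookup; under
# writes, a read may have to probe several tables before finding a key.

def lsm_get(key, memtable, sstables):
    """memtable: dict; sstables: list of dicts, newest first."""
    if key in memtable:
        return memtable[key]
    for table in sstables:  # read amplification grows with table count
        if key in table:
            return table[key]
    return None

# Two tables means up to three probes for a miss.
print(lsm_get("a", {}, [{"b": 1}, {"a": 2}]))  # found in the older table: 2
```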

Still, good to have some numbers.

------
thezilch
It'd be interesting to see LevelDB pitted against InnoDB/HandlerSocket and
Redis (with the eventual diskstore) in the k/v department. Both have a lot
more going for them when the need arises for higher-level or "more
structured" access to the same underlying data, via the structures MySQL or
Redis offer (lists, sets, hashes, etc).

~~~
dchest
Here's a comparison with InnoDB:
<http://blog.basho.com/2011/07/01/Leveling-the-Field/>

------
aaronblohowiak
This benchmark seems trustworthy because they include plenty of cases where
LevelDB performs much worse than the competition (writing large objects;
random reads with large caches).

------
rb2k_
Kyoto TreeDB and SQLite both use a B+Tree/B-Tree implementation. As far as I
remember LevelDB uses a Log-Structured Merge Tree.

So this is basically a comparison between those, right?

Does a LSM offer ordered access or do you lose that feature and gain a bit of
speed?

~~~
snewman
LSMs support ordered access. The implementation performs an ordered scan on
each .sst and merges the results.
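The merged-scan idea above can be shown with a small sketch (the data and names here are made up; this is not LevelDB's API). Since each .sst file is internally sorted, an ordered range scan is just a k-way merge of the per-table iterators, which Python's `heapq.merge` does lazily:

```python
import heapq

# Each "table" is already sorted by key, as .sst files are. An ordered
# scan across tables is a k-way merge of their sorted streams.
sst1 = [("apple", 1), ("cherry", 3)]
sst2 = [("banana", 2), ("date", 4)]

merged = list(heapq.merge(sst1, sst2))  # merges sorted inputs lazily
print(merged)
# [('apple', 1), ('banana', 2), ('cherry', 3), ('date', 4)]
```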

------
sirrice
It would be nice to see longer-running experiments to see how a) long-term
usage (compaction in LSM trees) and b) disk interactions (benefits of
compression) impact performance.

For example, the large values experiment executes 1000 writes, and the
results report LevelDB at 1060 ops/sec. Assuming op == write, the experiment
ran for all of one second. That shows great instantaneous performance, but
what if it kept running?

Additionally, it appears that no-compression is the way to go, which makes
sense for small values and an in-memory experiment, but is that the case when
disk comes into play?

~~~
andytwigg
I've read most of the messages on this list with interest. I've done some
benchmarking on LevelDB with billions of entries, and on commodity flash SSDs,
and thought that summarising it might be of interest to folks on this list:

<http://www.acunu.com/blogs/andy-twigg/benchmarking-leveldb/>

Comments most welcome.

-Andy

-- CTO, Acunu Inc. www.acunu.com |
<http://www.cs.ox.ac.uk/people/andy.twigg/>

------
ryanpetrich
The synchronous writes benchmark is suspect--on a 7200rpm disk it seems
impossible to get 2,400 dependent transactions per second (or even SQLite's
430) with a full disk sync. Is this due to ext3's relaxed notion of fsync?

~~~
MichaelGG
Well, look at how SQLite does write-ahead logging (assuming I understand it).
Transactions can go into the log and commit, without modifying the original
data. Now and then, it can "checkpoint" to move those logged transactions into
the database. This way, transactions are mainly sequential writes, with a bit
of maintenance now and then.
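A toy sketch of that WAL idea (this is an illustration of the concept, not SQLite's implementation; the class and method names are invented): commits are sequential appends to a log, reads consult the log before the main store, and an occasional checkpoint folds the log back into the database.

```python
# Toy write-ahead-log store (names invented for illustration).
class WalStore:
    def __init__(self):
        self.db = {}    # "original data" -- untouched by commits
        self.log = []   # sequentially appended transaction log

    def commit(self, txn):
        """txn: dict of key -> value. A sequential append, cheap to sync."""
        self.log.append(txn)

    def get(self, key):
        for txn in reversed(self.log):  # newest log entries win
            if key in txn:
                return txn[key]
        return self.db.get(key)

    def checkpoint(self):
        """Occasional maintenance: fold the log into the main store."""
        for txn in self.log:
            self.db.update(txn)
        self.log.clear()

s = WalStore()
s.commit({"x": 1})
s.commit({"x": 2, "y": 3})
print(s.get("x"))  # 2, served from the log
s.checkpoint()
print(s.get("x"))  # 2, now from the main store
```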

I've used a similar approach for a constant database. The transaction log goes
to disk, but also stays in memory. Scanning it linearly isn't that slow. When
it grows too large, the log is converted into a disk-optimized format, which
is a rather quick process. This way, it can take tons of writes, service reads
acceptably, while still being able to store much more data than RAM and
offering fast access to on-disk data.
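The constant-database approach described above can be sketched roughly as follows. All names and the threshold are made up for illustration, and real on-disk details (file formats, syncing) are omitted: writes append to a small log that is scanned linearly, and when the log grows past a limit it is merged into a sorted, binary-searchable structure standing in for the disk-optimized format.

```python
import bisect

LOG_LIMIT = 4     # invented threshold for when to convert the log

log = []          # recent writes, scanned linearly
sorted_keys = []  # "disk-optimized" portion: sorted for fast lookup
sorted_vals = []

def put(key, value):
    log.append((key, value))
    if len(log) >= LOG_LIMIT:  # log too large: convert it
        merged = dict(zip(sorted_keys, sorted_vals))
        merged.update(dict(log))
        items = sorted(merged.items())
        sorted_keys[:] = [k for k, _ in items]
        sorted_vals[:] = [v for _, v in items]
        log.clear()

def get(key):
    for k, v in reversed(log):  # linear scan of the small log
        if k == key:
            return v
    i = bisect.bisect_left(sorted_keys, key)  # fast "on-disk" lookup
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return sorted_vals[i]
    return None

for i in range(5):
    put(f"k{i}", i)
print(get("k1"), get("k4"))  # 1 4
```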

~~~
ryanpetrich
I understand that to be the case when not using synchronous mode. The point of
synchronous=FULL is that SQLite should only consider a transaction committed
when it is fully flushed to disk. Assuming a 7200rpm disk that isn't lying to
you, at most 120 dependent transactions a second should even be possible.

(this should apply equally to other databases that have a similar option)
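The arithmetic behind that 120 figure: if each fully synced, dependent transaction has to wait for the platter to come back around before the next one can commit, the rotation rate is the ceiling.

```python
# One dependent fsync per platter rotation puts an upper bound on
# fully synchronous commit rate.
rpm = 7200
rotations_per_second = rpm / 60
print(rotations_per_second)  # 120.0 dependent fsyncs/sec at most
```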

~~~
MichaelGG
Why does the disk need to re-seek to write to the next sector? With things
like SQLite WAL, you're just performing sequential writes to disk.

~~~
ryanpetrich
I hadn't thought of that. If the disk can perform some localized remapping,
it's possible to get sequential fsynced writes to exceed the rotation speed.

