
RethinkDB (YC S09): MySQL Storage Engine Built From The Ground Up For SSD - vaksel
http://www.techcrunch.com/2009/07/28/yc-funded-rethinkdb-a-mysql-storage-engine-built-from-the-ground-up-for-ssds/
======
edw519
Very interesting. I wish you great success.

Since your primary market will probably be developers, describing RethinkDB
will be necessary, but not sufficient. Also, demonstrated performance will be
necessary, but again, probably not sufficient. I, for one, want to understand
what goes on under the hood (and be able to describe it to my customers).

The "Time to Insert 2 Million Records" graph was impressive. How does the
"Time to Retrieve, Sort, & Present 2 Million Records" graph look? How about
the "Time to Modify 14% of the 2 Million Records" graph?

Your append-only approach sounds great for adds. How is it for changes and
deletes? How will garbage collection affect performance?

"No more locks" is a great claim, but how will it work in a real world
enterprise-quality app? User A takes 30 seconds to change Zip Code, Phone
Number, and increase Credit Limit while User B takes 10 seconds in the middle
of that to change City and decrease Credit Limit for the same customer. Who
wins? This is a difficult scenario in both optimistic and pessimistic
environments. I can only imagine how it's handled in a "no lock" environment.
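
To make the scenario concrete, here is a minimal sketch of how the optimistic case typically resolves. It assumes a hypothetical per-row version counter and an in-memory `CustomerStore`; none of these names come from RethinkDB, and this is not a claim about how their engine behaves.

```python
class StaleWrite(Exception):
    """Raised when a row changed after the writer read it."""


class CustomerStore:
    """In-memory stand-in for a table with a per-row version counter."""

    def __init__(self):
        self._rows = {}  # customer id -> (version, fields dict)

    def insert(self, cid, fields):
        self._rows[cid] = (1, dict(fields))

    def read(self, cid):
        version, fields = self._rows[cid]
        return version, dict(fields)

    def update(self, cid, read_version, changes):
        # Accept the write only if nobody has saved since this writer read.
        current_version, fields = self._rows[cid]
        if current_version != read_version:
            raise StaleWrite(
                f"row {cid} is at v{current_version}, writer read v{read_version}"
            )
        self._rows[cid] = (current_version + 1, {**fields, **changes})


store = CustomerStore()
store.insert(42, {"zip": "11790", "city": "Stony Brook", "credit_limit": 1000})

version_a, _ = store.read(42)  # User A opens the record (and takes 30 seconds)
version_b, _ = store.read(42)  # User B opens it in the middle of that

# User B saves first.
store.update(42, version_b, {"city": "Port Jefferson", "credit_limit": 500})

# User A saves later, still holding the old version number.
try:
    store.update(42, version_a, {"zip": "11777", "credit_limit": 2000})
    outcome = "A overwrote B"
except StaleWrite:
    outcome = "A rejected"
```

In this sketch User B wins and User A has to re-read and retry, which is exactly the part that gets awkward in a real app: A's Zip Code change did not conflict with anything, yet it is thrown away along with the Credit Limit change.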

(I'm not looking for answers here, just spouting off what's on the top of my
head, but I will be looking to better understand on your website. Your white
papers oughta be interesting.)

You make ambitious claims. I look forward to seeing you fulfill them. Best
wishes!

~~~
coffeemug
Thank you, this is a great post! We are looking forward to answering all of
these (and many other) concerns by publishing benchmarks, papers, blog posts,
tech talk videos, etc. We hope to do all of this in the coming weeks; we just
took it one step at a time.

~~~
mahmud
Good luck, Slava! And don't forget to send the compatibility patches for
clsql-mysql ;-)

------
cperciva
Congratulations -- but keep in mind that cleaning (a.k.a. garbage collection) is
the hardest part of any log-structured storage system. I learned this lesson
first-hand when writing tarsnap. :-)

~~~
leif
Thanks; we're doing GC over the next few days (starting...now actually).

~~~
cperciva
Full GC (removing everything which is not needed to read the current DB
state), or partial GC (removing old metadata such as indexes, but leaving
behind data which has been deleted/modified)?

The former is much harder -- it's a significantly complex task just to figure
out which data records are still live.

~~~
coffeemug
Full GC. In the framework of what we're doing, it's not very complicated (we
think). This is because of our indexing scheme (we can simply walk the index
tree while the database answers queries). We hope to release more info on this
shortly.
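
To illustrate the general idea only (a toy sketch, not RethinkDB's implementation): assume a flat dict standing in for the index tree, and a stop-the-world pass instead of one that runs while queries are answered. The class and method names are made up for the example.

```python
class AppendOnlyStore:
    """Toy log-structured store: records are appended, never edited in place."""

    def __init__(self):
        self.log = []    # list of (key, value) records
        self.index = {}  # key -> offset of the newest live record in the log

    def put(self, key, value):
        self.log.append((key, value))
        self.index[key] = len(self.log) - 1

    def delete(self, key):
        # The old record stays in the log as garbage; only the index forgets it.
        self.index.pop(key, None)

    def get(self, key):
        return self.log[self.index[key]][1]

    def full_gc(self):
        """Walk the index, copy every reachable record, drop everything else."""
        new_log, new_index = [], {}
        for key, offset in self.index.items():
            new_index[key] = len(new_log)
            new_log.append(self.log[offset])
        reclaimed = len(self.log) - len(new_log)
        self.log, self.index = new_log, new_index
        return reclaimed


store = AppendOnlyStore()
store.put("a", 1)
store.put("b", 2)
store.put("a", 3)   # supersedes the first record for "a"
store.delete("b")   # the record for "b" becomes garbage
reclaimed = store.full_gc()
```

The point of the sketch is that liveness falls out of the index: anything the index does not reach is garbage by definition, so there is no separate bookkeeping to decide which records are still live.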

~~~
cperciva
Cool -- good to see that you're thinking about this stuff ahead of time rather
than sitting down to write code and suddenly realizing that you painted
yourself into a corner. :-)

~~~
dbz
I did that once and regretted it.

~~~
mahmud
you must have programmed _once_ in your life then.

------
mcav
Cool snippet from one of the [founders?] on TechCrunch:

> _All of our interesting work is MySQL API-independent, so a Postgres port is
> not out of the question. We’ve also been entertaining the idea of porting to
> SQLite, as many embedded devices use that, and have SSDs already._

~~~
leif
yep, that's me. hi!

------
zhyder
Looks like it's not open source, though they're apparently considering it:
<http://news.ycombinator.com/item?id=729338> . For such an important part of
infrastructure, I don't think closed source will do, especially not from a
startup that's worked on it for only a few months. Open sourcing will:

\- Allow others to vet your codebase for stability and security

\- Give customers some recourse if your startup folds

\- Make you comparable to MyISAM/InnoDB/PostgreSQL, unless you want to be
compared to Oracle or Microsoft SQL

~~~
leif
To your points:

\- Nothing much I can say about stability and security, but then again, we're
not saying it's stable and secure yet. Besides, people trust Oracle without
seeing their sources, don't they?

\- If our startup folds, I doubt we'll drag our source to the grave. That
said, maybe it gives our users incentive to make sure we don't fold? :P

\- There are closed-source MySQL storage engines out there with whom we'd
rather be compared (TokuDB, Falcon (is Falcon open-source?)).

We just don't want to close any doors yet. If it makes good business sense,
we'd be glad to open the source.

~~~
zhyder
Oracle's been around a while and their stability and security have been
battle-tested, or at least that's the perception that makes a prospective
customer trust them.

Interesting that there are other closed-source MySQL storage engines. I wonder
how big a piece of the pie (among paying or willing-to-pay users) they have
compared to MyISAM/InnoDB.

------
antonovka
Using Apple's Xcode and Dictionary icons in the banner of the RethinkDB
website looks pretty skeevy. It's hard to do by accident, and beyond violating
Apple's copyright, it doesn't place your company in a good light.

I wouldn't bother commenting on it, but surprisingly this is not the first
time I've seen someone lifting Apple icons for their startup website, and it's
something that needs to be addressed: artists _are_ expensive, but you _can't_
take other people's artwork, and worse yet, it's _blazingly obvious_ when you
use Apple's.

~~~
mglukhovsky
All of the icons we use are under an open license[1,2]. If you do find any
specific copyright violations, we’ll certainly remove them.

[1] <http://www.iconfinder.net/icondetails/6166/128/>

[2] <http://www.iconfinder.net/icondetails/8722/128/>

~~~
antonovka
The icons are almost exact clones of Apple's -- the differences are so minor
that I didn't notice them when actually looking at the applications in my
dock. The Xcode icon, for instance, appears identical barring the reversal of
the hammer.

[edit]

The Xcode icon you linked to comes from a user-uploaded KDE Icon Theme:

<http://www.kde-look.org/content/show.php/Dark-Glass+reviewed?content=67902>

If you download the actual theme set, you'll find that the copyright ownership
is unknown: "99% of this set is GPL now and what's not is most likely creative
commons (a tiny number of the mime types may be proprietary). PLEASE abide by
the licence rules, if you use icons from this set please research and credit
the appropriate people. I have been given permission to release other peoples
art work under the GPL so respect the licence." (from the README)

The original icon theme may be found here:
<http://www.mentalrey.it/project.html>

As noted by mikejs below, Apple actually uses the same icon you're using.
Digging a bit, it appears it's the icon Apple used for Xcode in Mac OS X 10.4
Tiger (Xcode 2.5):
<http://www.command-tab.com/images/photoshop/tiger_icons/preview.jpg>

Xcode 3.0 actually introduced the right-facing hammer:

<http://developer.apple.com/DOCUMENTATION/Cocoa/Conceptual/ObjCTutorial/Art/xcode_icon.jpg>

~~~
mikejs
Apple seems to use the exact variant of the icon that these guys are using on
some parts of their site as well, see
<http://www.apple.com/ca/science/whymac/righttool.html>

------
neilc
_Database consistency problems require traditional databases to use
complicated locks. Because RethinkDB data is always consistent, locking is
unnecessary._

You may not use locking for concurrency control (plenty of "traditional DBs"
don't, either), but you still need some sort of concurrency control scheme --
just using append-only/log-structured storage doesn't make CC free. I'd be
curious to hear how you guys are doing this.

~~~
gaius
Statements like that set my alarm bells ringing.

~~~
neilc
Sorry, statements like what?

~~~
gaius
Statements like "traditional X does Y but..."

The "traditional" (and I don't know how you could call the latest version of
any of the major databases "traditional", this is a brutally competitive
market) databases don't lock just for the fun of it, but to enable features
that users want. Anyone can come up with a product that doesn't do Y if it
can't do X either. So what're we missing here?

~~~
neilc
_I don't know how you could call the latest version of any of the major
databases "traditional"_

Oracle has done MVCC for many years, as has Postgres. The canonical paper on
optimistic concurrency control for DBs is from 1981
(<http://www.seas.upenn.edu/~zives/cis650/papers/opt-cc.pdf>).

I didn't follow the rest of your comment, I'm afraid -- I was just saying that
I didn't see how using append-only storage immediately makes concurrency
control a non-issue. The comments from the RethinkDB guys upthread support
that: not supporting concurrent writers makes your concurrency control much
more straightforward.
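
A minimal sketch of that point, assuming a hypothetical `SnapshotStore` that keeps every version as an immutable snapshot: readers need no locks, but writers still have to be serialized somehow, and here a plain mutex does it. This is an illustration of the general technique, not a description of RethinkDB.

```python
import threading


class SnapshotStore:
    """Toy append-only store: readers pin a version, a single writer at a time."""

    def __init__(self):
        self._versions = [{}]                # append-only list of snapshots
        self._write_lock = threading.Lock()  # serializes writers only

    def snapshot(self):
        """Return the id of the newest committed version."""
        return len(self._versions) - 1

    def read(self, version, key):
        # Committed snapshots are never mutated, so reads take no lock.
        return self._versions[version].get(key)

    def write(self, key, value):
        with self._write_lock:
            # Copying the whole dict is wasteful; a real engine would share
            # unchanged structure between versions.
            new_version = dict(self._versions[-1])
            new_version[key] = value
            self._versions.append(new_version)
            return len(self._versions) - 1


db = SnapshotStore()
db.write("city", "Stony Brook")
pinned = db.snapshot()              # a reader pins the current version
db.write("city", "Port Jefferson")  # a later write does not disturb it
old_view = db.read(pinned, "city")
new_view = db.read(db.snapshot(), "city")
```

Append-only storage buys lock-free readers here, but the write path still contains a concurrency control decision: one writer at a time.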

------
jrockway
I am confused by this graph. The linear default behavior makes sense -- it
always takes the same amount of time to insert a row. The logarithmic behavior
confuses me -- as you add more rows to the database, the time to insert a row
_decreases_? If you add an infinite number of rows, each row can be inserted
in zero time? That doesn't make sense to me.

I would like to read the benchmark script.

Also, I'm afraid to read their source code after reading the license
agreement. I can't sell support for any product that can communicate with
RethinkDB? That sounds unenforceable, but it is scary enough to prevent me
from even looking at the code.

~~~
coffeemug
The benchmark is incremental. It measures how fast you can insert N rows given
M rows already in the database. So, the second data point means you can insert
roughly 750,000 rows in 140 seconds, given 750,000 rows already in the
database. The limit certainly doesn't approach zero as the number of elements
approaches infinity :)
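
For anyone who wants to reproduce the shape of such a measurement, here is a rough sketch of an incremental insert benchmark. It is not the script behind the graph: it assumes SQLite in memory as a stand-in engine, and the table name, payload, and batch sizes are arbitrary.

```python
import sqlite3
import time


def incremental_insert_benchmark(batch_size, batches):
    """Time each batch of inserts; batch i starts with i * batch_size rows present."""
    conn = sqlite3.connect(":memory:")  # stand-in engine, not MySQL or RethinkDB
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
    results = []
    next_id = 0
    for _ in range(batches):
        rows = [(next_id + i, "x" * 32) for i in range(batch_size)]
        start = time.perf_counter()
        conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
        conn.commit()
        elapsed = time.perf_counter() - start
        # Record (rows already in the table, seconds taken by this batch).
        results.append((next_id, elapsed))
        next_id += batch_size
    total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return results, total


results, total = incremental_insert_benchmark(batch_size=1000, batches=4)
for rows_present, seconds in results:
    print(f"{rows_present:>6} rows present: {seconds:.4f}s for the next 1000")
```

Each data point is the cost of one batch given the rows already present, so a flat or slowly rising series means insert cost is holding steady, not that it tends to zero.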

The engine isn't open source. We're considering open sourcing it in the
future, but we want to understand all business implications of this decision
before we proceed - it's a decision you can't easily retract. The license is a
bit draconian, but this is because we've only released a developer pre-alpha.
We don't want people to use the engine in production yet - it's not ready.
AFAIK, the license says you can't sell RethinkDB support, not that you can't
sell support for software that uses RethinkDB.

~~~
jrockway
_AFAIK, the license says you can't sell RethinkDB support, not that you can't
sell support for software that uses RethinkDB._

From the license:

 _Prohibited activities include but are not limited to:

Selling support for products which incorporate RethinkDB._

This is the problem with rolling your own software licenses.

~~~
leif
For the moment, we cannot endorse using RethinkDB in a production environment,
and even then, this license is meant to be free for non-commercial use. Once we
get to that point, we'll be re-visiting the license anyway, before we start
licensing to commercial users.

------
mrduncan
Congrats guys, sounds very exciting!

I just had one of those "why didn't I think of that" moments as I read through
your wiki thinking back to this paper I read a few months ago:
<http://publications.csail.mit.edu/lcs/specpub.php?id=773>

~~~
neilc
In fairness, log-structured storage is not a new idea. This paper is a great,
classic read on the subject:
<http://www.eecs.berkeley.edu/~brewer/cs262/LFS.pdf>

~~~
leif
Very much not new. Nobody's done it for MySQL yet to my knowledge though, and
I don't think many have looked at the implications of log-structured storage
on SSDs. That kind of storage is notoriously difficult to do cleanly on a
rotational drive, but our indexing scheme is quite simple. It's essentially
some combination of shadow paging and side-effect-free style from the
functional programming world, if that makes any sense to you.
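
For readers to whom it does not, here is a small sketch of the combination in general terms: a search tree where an update copies only the nodes on the path to the change and returns a new root, leaving the old version intact. It assumes a plain unbalanced binary tree and is not our actual index structure.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """Immutable tree node; an 'update' builds new nodes instead of mutating."""
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key, value):
    """Return a new root; only nodes on the path to `key` are copied."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        return root._replace(left=insert(root.left, key, value))
    if key > root.key:
        return root._replace(right=insert(root.right, key, value))
    return root._replace(value=value)


def lookup(root, key):
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None


v1 = None
for k, v in [(50, "a"), (30, "b"), (70, "c")]:
    v1 = insert(v1, k, v)

v2 = insert(v1, 30, "B")  # new version of the tree; v1 is untouched
```

Nothing is overwritten, so every write is an append of a few new nodes plus a new root pointer, and a reader holding `v1` keeps seeing a consistent tree while `v2` is built.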

------
vicaya
I still think SSD is a mere diversion in storage. According to Jason Hoffman
(CTO/Founder of Joyent, speaking at Structure '09), an SSD under their typical
workload can last only a month before wearing out:
<http://gigaom.com/2009/06/25/structure-09-how-to-scale-up-with-distributed-data-storage/>

Cheap commodity disks, however, have an annualized failure rate of 4% in
Google's datacenter (according to their disk analysis paper.)

------
mattyb
Congratulations! Glad to see some fellow Stony Brook folks on the map.

~~~
bkudria
I know, right?

------
rjurney
MySQL's pluggable storage engine model is so much win. You can have MyISAM,
InnoDB, InfoBright and this in the same database engine for the same
application. And there are many others.

------
kvs
My question is, how much of the speed-up is due to RDB being append-only and
how much of it is due to specialization to SSD.

------
henryl
Congrats. These guys seriously know what they're doing from the talks I've had
with them.

------
leif
vaksel: who are you? you stole our thunder :( actually :)

~~~
vaksel
nope, just your karma points.

Stealing thunder = releasing the same exact product, a month before you
finish.

------
prakash
Good luck, guys! This is an interesting market to be in.

It's also nice to see a lot of interesting database related companies coming
out of Stony Brook -- I think one of the founders of tokutek is from Stony
Brook as well.

~~~
leif
Michael Bender (one of our professors) is a co-founder of Tokutek, and several
SBU grad students work there.

They are doing some _really_ cool stuff!

------
chime
I did not see any mention of full-text capabilities. MyISAM is the only option
in MySQL for doing full-text searching and it's pretty limited due to the
table-locking issue. Are there any plans to have a good full-text search
feature in Rethink? Having a lock-free full-text table in MySQL would be
awesome for many many sites out there.

------
pmorici
Does RethinkDB support full text search?

~~~
defen
Since they're building a MySQL storage engine, you should be able to just hook
it into Sphinx and have everything work.

<http://www.sphinxsearch.com/>

~~~
ovi256
Sphinx is awesome. On a database where MySQL answered full-text search queries
in 20s, Sphinx builds indexes in _2s_, and queries are instantaneous for all
practical purposes.

------
aberman
You guys are gonna kill it. You deserve it. Great concept.

------
siong1987
Congratulations. Another YC 09 startup.

------
Confusion
This sounds pretty similar to Drizzle?

------
billclerico
congrats guys

~~~
mglukhovsky
Thanks, Bill! We're all very excited here.

------
lzhou
Grats guys!

------
mrandle
Nice work. A great idea well executed.

------
shiftace15
You guys rock!

