

InnoDB faster than NoSQL solutions? - petervandijck
http://mysqldba.blogspot.com/2010/12/handlersocket-mysqls-nosql-php-and.html

======
coffeemug
You cannot get 750K transactions/second with durability for writes, or for
reads when they go out of RAM. An HDD operates at roughly 250 ops/second, an
SSD at roughly 10k ops/second (a huge oversimplification, but roughly
correct). Even when you batch things and sacrifice a little bit of latency,
you can't get even close to 750K QPS.
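
A rough sketch of the arithmetic behind that claim, using only the ballpark
figures quoted above. The function name is made up for illustration, and the
IOPS numbers are the comment's own oversimplification, not measurements. It
computes how many operations would have to share each device I/O to sustain
the headline number.

```python
# Back-of-the-envelope check using the rough figures quoted in the comment.
TARGET_QPS = 750_000
DEVICE_IOPS = {"HDD": 250, "SSD": 10_000}  # ballpark values, not measurements


def ops_per_io_needed(target_qps: int, device_iops: int) -> float:
    """Operations that must share each device I/O to sustain target_qps."""
    return target_qps / device_iops


for name, iops in DEVICE_IOPS.items():
    needed = ops_per_io_needed(TARGET_QPS, iops)
    print(f"{name}: {needed:,.0f} operations per device I/O")
```

With these inputs, every single I/O would have to carry about 3,000 operations
on the HDD and about 75 on the SSD, which is the gap batching would have to
close.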

InnoDB is a great engine, but we need to know exactly what the benchmarking
number means before we can start relying on it.

~~~
morgo
Right: how much this helps is very workload-sensitive. If your data set fits
in memory, it's probably very good. If you're waiting on IO, the cost of SQL
parsing becomes trivial in comparison.

What is actually interesting is that InnoDB has some optimizations around
writing that NoSQL databases do not (you can configure multiple read/write
threads, which is required for RAID controllers and faster SSDs). So while the
benchmark does show the absolute best case, it's not like everything else is a
write-off.

A B+Tree index (assuming hot spots) can usually scale quite well when the data
no longer fits in memory. This is different from, for example, a hash index,
which aims for a random distribution across buckets.

(Disclaimer: I work for Percona, the company that releases Percona Server).

------
teoruiz
There was a more detailed discussion about HandlerSocket in another thread:

<http://news.ycombinator.com/item?id=1886137>

Original article:

<http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html>

------
antirez
I think that this exact post, a few months ago, would have gotten very
different comments, possibly along the lines of "wow, so what is the hype
about NoSQL all about?".

Instead, here I see tons of comments that are to the point and that show how
we are evolving as a hacker community in our ability to understand the
tradeoffs of database systems. This is a _huge_ win.

Also, I bet the reverse is true: the "magical features" of NoSQL systems
(fast, scalable, handling huge data sets, fault tolerant while distributed,
everything at the same time) are hardly believed at this point.

------
meursault
Doesn't bypassing the SQL parser kind of make this a NOSQL solution?

~~~
seunosewa
It's also a NoACID solution.

~~~
jerf
It says you still get the ACID from InnoDB in the article; I have no way to
independently verify that, but since people are talking about the speed gains
coming from bypassing the SQL parser it would make sense that InnoDB still
sees the same basic queries and therefore has the same properties it usually
does. InnoDB, despite its association with MySQL, does have ACID and
transactions and stuff.

~~~
aidenn0
I'd want to verify that claim, since it is also avoiding mutex contention.

------
m0th87
Yes, I'm sure it is (for some definitions of faster) and isn't (for others).

No benchmarks, no comparative NoSQL code, and the example is trivial. There's
nothing interesting here.

~~~
cies
Indeed.

------
elliottcarlson
Is the same InnoDB data still available via MySQL itself using standard SQL
syntax, or does using InnoDB in this manner lock you out of that?

Having two ways of accessing the data could be very beneficial when it comes
to reporting etc., while maintaining high-throughput access for the actual
production code.

~~~
danudey
You do get both forms of access, so you can make queries from both interfaces.

I believe there is a locking mechanism whereby the HandlerSocket interface
holds its own lock for all its concurrent clients, which it releases
occasionally to allow MySQL to have access as well ('occasionally' in this
case might be 'a hundred times a second' for all I know).

Note that this doesn't give you SQL or any of its benefits. For example, you
can't do joins (but you could write them manually; since you get better
throughput, the speed might outweigh the inefficiencies of multiple
round-trips).

~~~
ams6110
Unlikely that you could manually join tables any faster than MySQL can, given
that the optimization of joins in MySQL has surely been given considerable
attention.

~~~
danudey
Maybe not, but if you're bypassing MySQL's internal mutexes and SQL parser to
get significantly faster results, it's possible that those speed improvements
would outweigh doing two fetches via HandlerSocket.
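
A minimal sketch of what "doing two fetches" as a manual join could look like
on the application side. This is not the HandlerSocket API: `fetch_by_key`,
the table contents, and the column names are all hypothetical, and plain
dictionaries stand in for the two indexed tables. In a real client each call
to `fetch_by_key` would be one network round-trip.

```python
# In-memory stand-ins for two tables indexed by primary key (made-up data).
USERS = {
    1: {"name": "alice", "city_id": 10},
    2: {"name": "bob", "city_id": 20},
}
CITIES = {
    10: {"city": "Tokyo"},
    20: {"city": "Lagos"},
}


def fetch_by_key(table, key):
    """Hypothetical single-row index lookup; one round-trip per call."""
    return table.get(key)


def user_with_city(user_id):
    """Manual join: fetch the user row, then the city row it points at."""
    user = fetch_by_key(USERS, user_id)
    if user is None:
        return None
    city = fetch_by_key(CITIES, user["city_id"])
    return {"name": user["name"], "city": city["city"] if city else None}


print(user_with_city(1))
```

Whether two cheap lookups beat one SQL join depends on the per-request
overhead on each path, which is exactly the point under debate here.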

------
runningdogx
This sounds like Drizzle, where they've modularized almost everything, and you
can get similarly ultrahigh TPS by keeping connections open and bypassing the
standard SQL parser and query planner. (Obviously, the assumption for such
high numbers is that your disk array can keep up or that the transactions are
only hitting memory.)

------
js4all
People would not be discussing speedier solutions if there weren't these fast
NoSQL DBs. Plus 1 for NoSQL.

Another aspect where this solution will always lose under real load is the
speed loss due to locking and blocking. CouchDB, for instance, uses MVCC,
which never blocks.

------
shykes
Riak uses InnoDB as its default disk storage engine. Membase is built around
the well-established memcache protocol. Both are very promising projects.

I see an encouraging trend in enhancing proven technology, instead of
rewriting everything indiscriminately.

~~~
peschkaj
The current version of Riak uses Bitcask as the default disk storage engine,
but InnoDB is an option that you can use. They each work well for specific
workloads - Bitcask scales up until your keys (not total data) can't fit in
RAM.
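
To illustrate why the keys, rather than the total data, are the limit, here is
a toy sketch of the general idea: values go to an append-only log, while an
in-memory index maps every key to its position in that log. `TinyLogStore` is
a hypothetical name, this is not Bitcask's actual code or file format, and an
in-memory buffer stands in for the on-disk file.

```python
import io


class TinyLogStore:
    """Append-only value log plus an in-memory key index (toy example)."""

    def __init__(self):
        self.log = io.BytesIO()  # stands in for the on-disk data file
        self.keydir = {}         # key -> (offset, length); always held in RAM

    def put(self, key: str, value: bytes) -> None:
        # Always append; older values for the same key stay in the log.
        offset = self.log.seek(0, io.SEEK_END)
        self.log.write(value)
        self.keydir[key] = (offset, len(value))

    def get(self, key: str):
        entry = self.keydir.get(key)
        if entry is None:
            return None
        offset, length = entry
        self.log.seek(offset)
        return self.log.read(length)


store = TinyLogStore()
store.put("a", b"1")
store.put("b", b"22")
store.put("a", b"333")  # overwrite: the index now points at the newest value
print(store.get("a"), len(store.keydir))
```

The log can grow far beyond RAM, but the index holds one entry per live key,
so the key count is what eventually runs out of memory.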

~~~
shykes
Thanks for the correction. I still stand by everything in my comment except
for the word "default" :)

