
MySQL is bazillion times faster than MemSQL - spudlyo
http://dom.as/2012/06/26/memsql-rage/
======
nikita
MemSQL CTO here. Great article! Domas has done a good job of digging into the
internals of MemSQL! A few questions/comments:

1.) The range query issue you pointed out can be explained by a well-known
limitation of skip lists. Unlike B-Trees, skip lists are unidirectional. By
default, our indexes are ascending, so indeed you have to skip to the end to
run a MAX() or "ORDER BY id DESC" query. To fix this, just change the primary
key to be descending in your schema:

    
    
        CREATE TABLE x (id int, ..., PRIMARY KEY id (id) DESC, ...)
        

This is explained here
[http://developers.memsql.com/docs/1b/indexes.html#skip-
list-...](http://developers.memsql.com/docs/1b/indexes.html#skip-list-
indexes). If you want both behaviors in your app, you'll have to create two
skip list indexes (for now).
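The directional limitation can be sketched with a toy structure (purely illustrative, not MemSQL's implementation): a sorted, singly-linked chain like the bottom level of a skip list, where the minimum is one hop away but the maximum requires walking the whole list.

```python
# Toy sketch of a unidirectional sorted structure (illustrative only):
# ascending traversal is cheap, but finding the maximum means skipping
# to the end -- which is why MAX() / ORDER BY id DESC is slow on an
# ascending index.

class Node:
    def __init__(self, key):
        self.key = key
        self.next = None

class AscendingList:
    def __init__(self):
        self.head = None

    def insert(self, key):
        # Keep nodes sorted ascending; O(n) here, O(log n) in a real skip list.
        node = Node(key)
        if self.head is None or key < self.head.key:
            node.next = self.head
            self.head = node
            return
        cur = self.head
        while cur.next is not None and cur.next.key < key:
            cur = cur.next
        node.next = cur.next
        cur.next = node

    def min(self):
        # O(1): the head is the smallest key.
        return self.head.key

    def max(self):
        # O(n): no back pointers, so we must walk the whole list.
        cur = self.head
        while cur.next is not None:
            cur = cur.next
        return cur.key
```

A descending index simply stores the keys in the other order, making MAX() the O(1) case instead.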

2.) The transaction-buffer > 0 setting is really where we shine. Synchronous
durability is something we have for flexibility, but I'll be the first to
admit that it's not really optimized to perform well. The customers that we're
working with are okay with this. And it's inspired by what modern companies
do. Maybe it's changed in the time since I've left (Domas?) but Facebook
certainly does not write comments synchronously to disk.

3.) Our synchronous durability does indeed flush in 50 ms cycles, so you'll
see poor performance on single-threaded inserts. However, as you pointed out,
we're optimized for the case with multiple concurrent writers. Since MemSQL
implements group commit, throughput picks up under high parallel load. Sure,
by writing your own very specific benchmark you can show us performing poorly,
but we've never worked with a customer that needed single-threaded,
synchronous writes to shine. If this is your use case, unless you need
substantial read performance, MemSQL is not the database for you.
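The group-commit idea described above can be sketched roughly (illustrative only, not MemSQL's code): writers queue log records in a shared buffer, and one flush per cycle persists the whole batch, amortizing the expensive sync across every transaction in the window.

```python
# Rough sketch of group commit (illustrative, not MemSQL's code):
# writers append log records to a shared buffer; a single flush
# durably persists the whole batch, amortizing the sync cost.

class GroupCommitLog:
    def __init__(self):
        self.pending = []   # records waiting for the next flush
        self.flushes = 0    # how many (expensive) syncs we performed

    def commit(self, record):
        self.pending.append(record)

    def flush_cycle(self):
        # One sync per cycle (e.g. every 50 ms) covers every
        # transaction that committed since the last cycle.
        if self.pending:
            self.flushes += 1   # stands in for a single fsync()
            self.pending.clear()

log = GroupCommitLog()
for i in range(1000):        # 1000 commits arriving in one window
    log.commit(f"txn-{i}")
log.flush_cycle()            # all 1000 made durable by one sync
```

A single threaded writer gets no such amortization: each commit waits out a whole cycle alone, which is exactly the slow case Domas benchmarked.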

~~~
timaelliott
You seem to be missing the main reason why people have an issue with MemSQL.
The issue is quotes like this on your website and video:

"MEMSQL IS 30 TIMES FASTER THAN MYSQL."

It's an invalid and biased comparison. You're quite literally
comparing the speed of writing to RAM versus the speed of writing to disk. If
you didn't make such ridiculous assertions, people would accept your product
for the actual awesome things it does and not focus on refuting your bullshit
claims.

~~~
tigerBL00D
Tim, if you want to talk substance, you have to peer deeper under the covers.
And if you do that, then you just can't overlook the fact that the query
execution model used by MemSQL is significantly different from that used by
old-school relational databases, including MySQL. What's different is that
MemSQL translates your SQL query into extremely efficient C++ code. Code that
is compiled and executed natively. Whereas MySQL, SQL Server, Postgres,
Oracle - all of these products evaluate queries by interpreting their
respective tree representations of your SQL queries. So which database do you
expect to be faster? The one that runs native code, or the ones that
interpret?

This is a huge differentiator we are talking about here. For someone who has
been around databases for quite some time (of which I had spent about 5 years
working on SQL Server engine) this is very exciting. Is it not?

~~~
phillmv
>And what's different is that MemSQL translates your SQL query into extremely
efficient C++ code. Code that is compiled and executed natively. Whereas
MySQL, SQL Server, Postgres, Oracle - all of these products evaluate queries
by interpreting their respective tree representations of your SQL queries.

This sounds like absurd cargo culting. I've never designed a database but
parsing the SQL can not have ever been the bottleneck; the fact that you
rarely write to disk is where any speed improvements can be found.

~~~
tigerBL00D
You are asserting that databases are always I/O bound and never CPU bound.
Your assertion is wrong. A properly tuned database will become CPU bound
(whether it's MemSQL or MySQL). At that point hyper-efficient execution
results in higher throughput and lower latency.

~~~
phillmv
> At that point hyper-efficient execution results in higher throughput and
> lower latency.

I don't get excited for 1% decreases in latency. I'm willing to bet money that
the performance penalty behind parsing the SQL queries is asymptotically a
constant - or at least, a tiny fraction of the time spent computing the data
set.

I feel especially confident in this because memsql never brags about the
specific increase in speed. This can't be hard to measure.

~~~
tigerBL00D
Phill, parsing has little to do with what goes on at query execution time.
Parsing is typically done once for a given parameterized query (and MemSQL
automatically parameterizes queries with constants). After parsing comes
compilation, which produces a query plan. The query plan is cached, and that
is what is actually used to evaluate the query when it's re-submitted. We are
discussing the differences in how MemSQL handles executing
compiled queries. Executing compiled queries is what a (well tuned) database
spends most of its time doing. It is the hot code path. And if you cut the
instruction count in half you can expect latency to decrease proportionally
and throughput to increase.

You are making all these skeptical claims that are technically wrong. I
suggest you download MemSQL, compile a query and read the generated C++ code.
Then download MySQL source and look at that too. At that point you will be
coming to your own conclusions and I'll be happy to resume this debate.
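The contrast being argued over can be made concrete with a small sketch (Python standing in for generated C++; illustrative only): a tree-walking interpreter pays per-node dispatch overhead on every row, while "compiling" the predicate once into a flat function removes that dispatch from the hot path.

```python
# Illustrative contrast (Python standing in for generated C++):
# interpreting an expression tree per row vs. compiling it once.

# Expression tree for: price * qty > 100
TREE = ('gt', ('mul', ('col', 'price'), ('col', 'qty')), ('const', 100))

def interpret(node, row):
    # Tree-walking interpreter: dispatches on node type for every row.
    op = node[0]
    if op == 'col':   return row[node[1]]
    if op == 'const': return node[1]
    if op == 'mul':   return interpret(node[1], row) * interpret(node[2], row)
    if op == 'gt':    return interpret(node[1], row) > interpret(node[2], row)
    raise ValueError(op)

def compile_tree(node):
    # One-time translation of the tree into a flat function, analogous
    # to emitting C++ once and reusing the compiled plan thereafter.
    op = node[0]
    if op == 'col':
        name = node[1]
        return lambda row: row[name]
    if op == 'const':
        val = node[1]
        return lambda row: val
    lhs, rhs = compile_tree(node[1]), compile_tree(node[2])
    if op == 'mul':   return lambda row: lhs(row) * rhs(row)
    if op == 'gt':    return lambda row: lhs(row) > rhs(row)
    raise ValueError(op)

predicate = compile_tree(TREE)   # compiled once, cached, reused per row
row = {'price': 30, 'qty': 4}
```

Both paths compute the same answer; the dispute is only about how much the per-row dispatch overhead matters relative to everything else a database does.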

~~~
Nitramp
I've actually written a compiler from database queries to native code (JIT'ed
bytecode actually, but doesn't make a difference here) once.

Some queries in databases are indeed CPU bound, but the general statement that
well-optimized databases turn out to be CPU bound is not correct as stated. It
all depends on the use case.

If you're thinking full table scans with complicated processing on a database
that's entirely in memory, then yes, you'll be CPU bound. But many (most?)
databases are not entirely in memory, and there's no reason they need to be.
With reasonable memory and indexing, most queries boil down to only a few
IOs/disk seeks, so you're looking at <50ms, which is entirely fine for your
average web request.

That's the case for the majority of "get by ID" style queries, which in my
experience really are the lion's share of queries, and in particular are also
what needs to be fast. A complicated reporting query can take a couple of
seconds, but your "main page/item view" must be fast.

If you have random reads on a large database that cannot be kept in memory,
this will be the majority of your workload. Those queries will spend all their
time waiting for the disk, and more or less none interpreting your query tree,
so optimizing the query tree processing part makes no sense at all.

TL;DR, in my experience optimizing query tree interpretation does not yield
significant benefits for the majority of queries, and IMHO won't be a
significant differentiator for MemSQL outside of marketing.

~~~
vasilizo
@Nitramp: Fortunately for all of us this world is full of unsolved problems
many of which exist outside of consumer web.

There are use cases where 50ms is way too long, and where jitter associated
with disk I/O is not acceptable. Capital markets for example. You are looking
for sub-millisecond latency to be in the intraday post-trade game, and single
digit milliseconds for quote-to-trade. The number of instructions in the hot
code path (and cycles per instruction) actually starts to matter. Lock-free
structures utilized for storage are also important. This combination gives
MemSQL some rather unique low-latency properties that certain folks can
exploit to their delight.

To your other point, random reads and key-value lookups are where MemSQL
shines. Since you never wait on I/O and don't take locks, optimizing execution
makes a lot of sense. All that's left is execution and network.

~~~
chris_wot
_Since you never wait on I/O and don't take locks, optimizing execution makes
a lot of sense actually. All that's left is execution and network._

I'm not sure where you are getting this information. According to the MemSQL
documentation, they have one and only one isolation level, READ COMMITTED.
READ COMMITTED takes write locks that block reads, though it doesn't take any
read locks. Blocking behaviour does (and should!) still occur.

It sounds like you are looking for dirty read behaviour, but from the
literature MemSQL doesn't support READ UNCOMMITTED.

 _Update:_ Read a later FAQ on the MemSQL website. They are using MVCC, so
write locks are irrelevant. My apologies to the MemSQL team.

~~~
vasilizo
Chris, MemSQL utilizes versioning to implement READ COMMITTED. In this
implementation readers are never blocked.

It is addressed in this FAQ:
<http://developers.memsql.com/docs/1b/faq.html#c1-q4>.
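The versioning approach can be sketched minimally (illustrative only, not MemSQL's implementation): each write creates a new version stamped with a commit timestamp, and a reader uses the snapshot timestamp taken when its statement began, so writers never block readers.

```python
# Minimal MVCC sketch (illustrative only): versioned values plus
# snapshot reads mean a reader never waits on a concurrent writer.

class MVCCStore:
    def __init__(self):
        self.clock = 0
        self.versions = {}   # key -> list of (commit_ts, value)

    def write(self, key, value):
        # Writer installs a new version; old versions stay readable.
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))

    def snapshot(self):
        return self.clock   # statement-begin timestamp

    def read(self, key, snap_ts):
        # Newest version committed at or before the snapshot.
        for ts, value in reversed(self.versions.get(key, [])):
            if ts <= snap_ts:
                return value
        return None

store = MVCCStore()
store.write('x', 'old')
snap = store.snapshot()   # reader's statement starts here
store.write('x', 'new')   # concurrent writer commits; reader unaffected
```

The reader holding `snap` still sees `'old'`, while any statement that starts after the write sees `'new'` - READ COMMITTED without read locks.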

~~~
chris_wot
OK, didn't see that you are using multiversion concurrency control.

I didn't realise you had a new FAQ up on the website - I'll have a look.

------
fusiongyro
Whenever one sees a claim about new database technology outperforming standbys
like MySQL and Postgres by factors of 10x or more, it's a good idea to suspect
the out-of-the-box configuration isn't really durable. What kills write
performance in databases is the time it takes to write to disk, not the
software overhead.

~~~
endersshadow
For what it's worth, SQL Server doesn't meet his definition of "durable." It
writes a transaction log and then writes those transactions to disk on a
checkpoint [1]. Typically, that checkpoint automagically happens, but you can
also force it to clear the buffer with a "checkpoint" command.

InnoDB does this too [2]. I don't know if MemSQL actually has checkpointing
or what, but it's just worth noting.

[1]: <http://msdn.microsoft.com/en-us/library/ms189573.aspx>

[2]: [http://dev.mysql.com/doc/refman/5.0/en/innodb-
checkpoints.ht...](http://dev.mysql.com/doc/refman/5.0/en/innodb-
checkpoints.html)

~~~
jstclair
This is a little misleading; an internal checkpoint happens upon transaction
commit[1]. The comment makes it sound as if one or more transactions can
commit before a checkpoint writes them to disk.

[1]: [http://msdn.microsoft.com/en-
us/library/ms186259(v=sql.105)....](http://msdn.microsoft.com/en-
us/library/ms186259\(v=sql.105\).aspx)

~~~
spudlyo
The transaction is actually written to disk twice. Once when it is written to
the transaction log, and then again when a checkpoint happens and it's written
to the underlying data blocks. Transaction log writes are sequential and
fast; writes to the underlying data blocks are random and slow. The whole
point of the checkpoint is to convert this sequential i/o into random i/o in
an efficient, batched manner.

Still, that's not what we're talking about. We're talking about the difference
between writing data to the transaction log, and writing data to the
transaction log _and then flushing it_. Once you call fsync() and the data is
flushed you know for a fact (provided the hardware isn't lying to you) the
bytes are on the damn platter, and not in some OS buffer cache.
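The distinction spudlyo draws can be shown in a few lines (illustrative sketch): write() only hands the bytes to the OS page cache, while fsync() blocks until the kernel has pushed them to the device (modulo lying hardware, as noted).

```python
# Sketch of write() vs. fsync(): write() leaves the bytes in the OS
# buffer cache; fsync() forces them toward the platter before returning.
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"COMMIT txn 42\n")  # buffered by the OS, not yet durable
    os.fsync(fd)                      # now (hardware willing) on the platter
finally:
    os.close(fd)

with open(path, "rb") as f:
    data = f.read()
os.remove(path)
```

The durability debate is entirely about whether the database calls that fsync() before, or some time after, acknowledging the commit.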

~~~
Nitramp
... you'd think.

However in practice most operating systems and/or file systems and/or disks
cheat. fsync() is usually buffered in the hard disk itself. You can disable
that and force the hard disk to truly write out on fsync, but that is so
prohibitively slow that people rarely do that.

If you want absolute durability, you'll have to have hard disks running on
some battery buffered power supply, which is a common configuration.

On the other hand, in a proper database system, at least the data files won't
be corrupted by a missing fsync, so the database will come back up. Figuring
out whether that one commit did or did not make it out - in the very rare
event of a fatal power failure with an fsync() still pending and the commit
acknowledgement already out of the network stack - is probably a
Heisenberg-esque inquiry into obscure realms of uncertainty.

~~~
spudlyo
First of all, the ext(2|3|4) and XFS filesystems honor fsync(). That's what
all the noise about Firefox and SQLite was about. Secondly, the HP RAID
controllers I'm familiar with disable the drive write cache by default, throw
dire warnings if you try to turn it on, and make you ACK your choice like:

 _Without the proper safety precautions, use of write cache on physical drives
could cause data loss in the event of a power failure...._

The bottom line is that database pros insist on and get actual durability,
and with a battery-backed write cache, it's not painful.

~~~
Nitramp
The file system does, but does the hard drive?

But I agree, getting battery backed write caches is not that hard, it's just
not a default config.

------
slewis
From MemSQL's home page: "MemSQL writes back to disk/SSD as soon as the
transaction is acknowledged in memory."

From the article: "See, MemSQL also has “transaction-buffer” setting, which,
in default “full durability mode” will asynchronously return “ok” until 128M
buffer is full (or background log flusher thread writes it out)."

So which is it?

~~~
tintor
In MemSQL, the transaction buffer is flushed continuously as long as it is
not empty. The default value of 128M is there just to absorb bursts in
workload.
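The at-risk window this implies can be sketched (illustrative only, not MemSQL's code): transactions are acknowledged once they land in the in-memory buffer, a background flusher drains it continuously, and whatever is still buffered at crash time is the data that can be lost.

```python
# Sketch of the async-durability window (illustrative): ack happens on
# buffer insert; a background flusher drains the buffer continuously.
# Bytes still in the buffer at crash time are the data at risk.

BUFFER_CAP = 128 * 1024 * 1024   # default transaction-buffer size

class AsyncLog:
    def __init__(self):
        self.buffered = 0   # bytes acknowledged but not yet on disk
        self.on_disk = 0

    def commit(self, nbytes):
        assert self.buffered + nbytes <= BUFFER_CAP, "burst exceeds buffer"
        self.buffered += nbytes   # the ack returns here, before any I/O

    def background_flush(self):
        # Runs continuously whenever the buffer is non-empty.
        self.on_disk += self.buffered
        self.buffered = 0

log = AsyncLog()
log.commit(4096)
at_risk = log.buffered        # these bytes would be lost on a crash now
log.background_flush()
```

Under steady load the buffer stays near empty, but the cap bounds the worst case - which is the 128MB figure the next comment objects to.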

~~~
Devilboy
So you could lose up to 128MB of updates if the machine crashes at the wrong
time. NOT DURABLE.

------
cs702
Whenever I read an article like this, I often end up feeling that on the
'truthiness' spectrum the scale goes like this: lies, damn lies, statistics,
benchmarks ;-)

~~~
maratd
Every public benchmark is a horrible blasphemous lie. The only benchmarks that
are honest are the ones you conduct in private, with your very specific use
case. There is no point in publishing them, because they're useless to
everybody but you. Public benchmarks, on the other hand, are crafted as
advertising for your favorite [fill in the blank].

~~~
slurgfest
If you generate some data which would be dishonest when published, it is also
dishonest in private, except that you are only being dishonest to yourself and
nobody can review your dishonesty.

If you generate data which are honest in your use case, then you can publish
them honestly as long as you are specific about your use case.

------
mumrah
This reminds me of some early MongoDB benchmarks. Claims of extremely high
write throughput when really it's writing to memory and relying on mmap to
eventually sync the data to disk.

Making write speed claims when you're not touching the disk is like bragging
about how fast you can cat a file to /dev/null.

------
arihant
When debating a fresh new database project, benchmarks aren't really the way
to go, because those can be improved. What should be debated is which
underlying fundamental the new project has gotten right. Almost every
successful database project got something fundamentally right - MySQL, Redis
(in-memory), MongoDB (page-based storage), Riak (distributed).

On that note, here's my take on MemSQL - People who love MySQL, almost never
want to migrate. People who hate MySQL, usually hate it because of SQL. People
who are far along in the projects enough that decision of database is no
longer a matter of taste, don't use a brand new untested DB.

Therefore, I am unsure how a brand new, in-memory, SQL based solution fits.

~~~
bsg75
> People who hate MySQL, usually hate it because of SQL.

Or because of the shortcuts it has taken to avoid the "hard stuff", which
results in developers taking shortcuts by using MySQL to avoid the "hard
stuff".

Love or hate SQL for what it is, not the MySQL implementation of it.

~~~
Domenic_S
> Or because of the shortcuts it has taken to avoid the "hard stuff", which
> results in developers taking shortcuts by using MySQL to avoid the "hard
> stuff".

I am deeply interested in what you think this is

~~~
bsg75
Avoiding referential integrity (in part due to poor FK support, depending on
version / storage engine), designs that avoid simple schema changes because
they can't be done online, and point-version migration difficulties.

Consider SQL constructs from other engines that MySQL doesn't support (CTEs,
window functions, better join and subquery support); the effort and mess
needed to accomplish what should be straightforward tasks generates SQL hate.

Some NOSQL projects have merit as they provide mechanisms for working with
data structures that do not map well to rows and columns. However,
difficulties faced with MySQL have also pushed some projects to use
equivalently broken data storage technology, when a "better" RDBMS would be
more appropriate.

------
mariusmg
Writing in a volatile storage is not "durable by default" you asshats.

------
bond
Can we say they're deceiving us with this "MemSQL is 30 times faster than
MySQL"?

~~~
aioprisan
that and the claim that it's a durable db

------
blake8086
Why is "written to hard disk" considered to be "durable"?

Isn't that just a higher likelihood of "durability"?

~~~
forgotusername
This is actually a fair and valid point. On all but the most delicately
configured systems, the standard configuration of basically all SQL databases
does not preclude committed data loss during power failure, due to the hard
disk's write cache. See for example
[http://www.postgresql.org/docs/8.3/static/wal-
reliability.ht...](http://www.postgresql.org/docs/8.3/static/wal-
reliability.html):

    
    
        When the operating system sends a write request to the disk
        hardware, there is little it can do to make sure the data has
        arrived at a truly non-volatile storage area. Rather, it is the
        administrator's responsibility to be sure that all storage
        components ensure data integrity. Avoid disk controllers that have
        non-battery-backed write caches. At the drive level, disable
        write-back caching if the drive cannot guarantee the data will be
        written before shutdown.
    

Which almost nobody does, because performance falls through the floor unless
you're pimping a $500 RAID card (don't even mention software RAID).

The SQLite documentation goes into further detail surrounding the atomicity of
writing a single 512 byte sector during a power failure (on old drives, on
4096 byte drives, and on SSDs). Few people seem to account for any of this
stuff, yet sleep soundly at night regardless.

~~~
Dylan16807
While backup power might be expensive in a hard drive situation, it looks like
even cheap SSDs are available with power loss protection. And as far as I'm
aware you're better off using SSDs for databases anyway.

------
peterwwillis
I think this whole conversation is making a mountain out of a molehill. When
you're building sites which have enormous performance requirements, you take
what you have and you force it to scale. You don't need to compare one tool to
the other and find the perfect benchmark. Whatever you use, I promise you, you
can make it perform as fast and scale as far as you need it to.

So just pick the tool that fits your requirements the best and stick with it.
That being said, I also urge you to pick something free and open-source if you
can. Part of the reason big sites have leveraged MySQL to the extent they
have is the custom patchsets they apply. Closed-source solutions may not
provide similar hotfixes or customization (and when they do, it's months later
than they claimed they'd have it ready).

------
ChikkaChiChi
Simple solution:

[1] Someone creates a project on GitHub
[2] Upload an amply sized CSV full of data
[3] Publish a list of queries needed to "gauge" speed
[4] Allow anyone to post their metrics to see how well tuned any database can
be in comparison

The only other criterion would be to ask them all to use the same form factor.
EC2 makes the most sense.

~~~
aphyr
Shared environments with virtualized IO are not good places to run benchmarks.

~~~
ChikkaChiChi
I couldn't agree more, but it's the closest to standardized testing we can get
that everyone can use.

I HATE the idea personally, but I cannot think of a better way to start
comparing apples to apples.

~~~
aphyr
I'd opt for the TPC workloads on standardized hardware:
<http://www.tpc.org/tpcc/results/tpcc_price_perf_results.asp>

------
mirou
Hey Nikita, having a colorful HTML5 home page is not enough to make a project
performant.

------
willvarfar
People should also look at - and benchmark - tokudb with its fractal tree
indices.

~~~
jeltz
Sadly for the development of databases, their fractal-tree technology is
patent-encumbered.

------
aaronh
I demand a free lunch.

------
ericfrenkiel
When someone writes in their intro "... and now in return I want to waste your
time a bit." it's a sign that they aren't being serious.

Domas' main criticism is that our video, <http://vimeo.com/44087431>, uses a
MySQL with standard defaults. Since tuning a database is by definition a
custom process, we wanted to demonstrate what performance you'd get "out of
the box." The video speaks for itself. MemSQL can push 80k/sec vs 3.5k/sec for
MySQL on EC2.

Domas actually demonstrates MemSQL's robust durability controls: he binds
MemSQL to disk. If you want to throw an anchor behind a speed boat, you can do
it, though we don't recommend it.

Even Domas agrees his article isn't serious: "I do agree, that its execution
speeds for table scans are not bad – 8M rows scanned a second (for SELECT
COUNT(*) query) from single thread is sure a great achievement."

If that's the highest praise I can get from Domas, I can live with it.

~~~
caladri
How do you feel about the reported performance of "SELECT * FROM table ORDER
BY id DESC LIMIT 5;"? Is this something that needs to be improved there, or
are you happy with that being one of the built-in anchors?

Additionally, I'm not sure that the implementation of the hard durability case
sounds very mature or even well thought-out. Having a syncer running at 20 Hz
and waiting on it even in the synchronous path seems like a bad idea. That's
kind of like adding an anchor to your anchor so your speedboat can crawl while
it crawls. Or was that a misrepresentation?

And who cares if the tone is serious? The criticisms seem to be valid, and
ones that anyone releasing software making bold claims should expect.
Statements about performance need to be qualified. Out-of-the-box behavior is
not a reasonable point of comparison if (1) that's not what people do in the
real world (2) the tests you use are ones that one piece of software is tuned
for and the other is tuned against. Some kind of sorted index by field is
necessary for some very common and desirable queries. Some of the performance
gains come at an unacknowledged cost. The durability just isn't there (and for
trivial reasons that could be fixed by making a few more parameters tunable,
it sounds like), but is implicitly lauded by claiming to be ACID.

I don't know if you need to go looking for praise in that article. It really
seems like there are some criticisms it would be beneficial to take seriously,
even if you object to the author's tone or lack of praise. Even-handedness is
not a marker of honesty, and honest criticism is worth the headaches it causes
and the growth it spurs if you're willing to take it seriously.

~~~
tintor
"How do you feel about the reported performance of "SELECT * FROM table ORDER
BY id DESC LIMIT 5;"?"

The index was missing on the id column in MemSQL's case, so MemSQL was running
a full table scan instead of using the index as MySQL did.

