

MemSQL Does Oracle’s Own Demo Ten Times as Fast, Sixty Times Cheaper - frostmatthew
http://blog.memsql.com/memsql-does-oracles-own-demo-ten-times-as-fast-sixty-times-cheaper/

======
jandrewrogers
It is generally not that difficult to greatly out-perform Oracle on specific
benchmarks. Postgres can do it too. Companies do not buy Oracle for the
performance per se, but as time has gone on Oracle has definitely been losing
ground in that arena.

MemSQL makes a good product but nothing about their architecture is
particularly novel, and other systems have a similar design. I've designed
several database engines; I would _expect_ numbers like they demonstrated on a
decent columnar implementation with the tradeoffs MemSQL has made. And Oracle
has never been optimized for these types of queries because those tradeoffs
adversely impact performance in other areas Oracle cares about. Features like
compiling queries are mundane; many commercial databases did that a decade ago
and it is a standard design element today. Same with the two-tier in-memory
and columnar disk storage.

I'm sure MemSQL is fast, but "world's fastest" should be assumed to be
hyperbole. Most of their benchmarks are table stakes for a modern, real-time
analytical database kernel. It is apparently a well-designed product, but it
does nothing that many other companies are not also doing.

Here is what I tell people a properly designed database kernel on modern mid-
range hardware should be able to do _per server_ : continuously insert
millions of records per second all the way through storage, leaving virtually
all of the cores idle for some large number of concurrent queries. If you
can't saturate 10 GbE on ingest concurrent with saturating 10 GbE on query
output, your design is most likely obsolete or broken.
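
To put "saturate 10 GbE" next to "millions of records per second", here is a rough back-of-the-envelope sketch. The record sizes are purely illustrative assumptions (nothing above specifies them), and protocol overhead is ignored, so these are upper bounds rather than measurements.

```python
# Back-of-the-envelope: how many records per second fit through one 10 GbE link?
# The record sizes below are hypothetical; they do not come from the thread.

LINK_BITS_PER_SEC = 10 * 10**9               # 10 GbE line rate, ignoring protocol overhead
LINK_BYTES_PER_SEC = LINK_BITS_PER_SEC // 8  # 1.25 GB/s


def records_per_second(record_bytes: int) -> int:
    """Upper bound on records/s at line rate for a given average record size."""
    return LINK_BYTES_PER_SEC // record_bytes


if __name__ == "__main__":
    for size in (100, 250, 1000):            # assumed average record sizes in bytes
        print(f"{size:>5} B/record -> {records_per_second(size):,} records/s")
```

With records in the low hundreds of bytes, a single saturated link already implies an ingest rate in the millions of records per second, which is consistent with the per-server figure quoted above.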

Things have moved a very long way since I first started designing massive
scale database systems (on Oracle, way back in the day).

~~~
jl6
What in your opinion counts as modern mid-range hardware?

~~~
jandrewrogers
The physical servers used in clusters I buy today run about $4k per server
without storage. Outside of the dual 10 GbE interfaces, they are like most
other budget servers. Storage costs per node vary depending on the size, type,
and mix of disks. Driving the disk I/O at 10 GbE can be trivial with a good
I/O scheduler design, so ironically the details of the storage device are
almost irrelevant for many workloads. Cheap SSDs are the default target but
cheap spinning disk works surprisingly well for many workloads.

Various AWS nodes are also used in clusters quite a bit (though we use
ephemeral disks rather than EBS). Nothing exotic.

------
tkyjonathan
There is a lot of use case optimization going on here. The sorting inside the
columnar store is in and of itself a huge performance boost. And saying that
it scans 138 billion rows per second is completely wrong because it uses a lot
of shortcuts to get the result and it doesn't even use rows to store the data.

There is nothing stopping Oracle from using the same optimization for their use
case, although they would be better served if someone from the crowd queried
something that MemSQL didn't sort/optimize for.

~~~
nikita
It's hard to define what a "fair" columnar store scan is. We use a technique
called "segment elimination", which lets the scan skip over compressed
segments of data. However, every solid columnar database uses such
techniques. It just happens that Oracle 12c doesn't have the ability to sort
columnar indexes, nor does it have the ability to store columnar
indexes on flash or disk, which bloats the overall cost.
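
For readers unfamiliar with the term, the general idea of segment elimination can be sketched as follows. This is a minimal illustration of the technique, not MemSQL's implementation: the `Segment` class, `build_segments`, and `count_in_range` are hypothetical names, and real systems keep the segments compressed.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Segment:
    """One block of a column plus min/max metadata (hypothetical layout)."""
    values: List[int]
    min_value: int = field(init=False)
    max_value: int = field(init=False)

    def __post_init__(self) -> None:
        self.min_value = min(self.values)
        self.max_value = max(self.values)


def build_segments(column: List[int], segment_size: int) -> List[Segment]:
    """Sort the column, then cut it into fixed-size segments."""
    ordered = sorted(column)
    return [
        Segment(ordered[i:i + segment_size])
        for i in range(0, len(ordered), segment_size)
    ]


def count_in_range(segments: List[Segment], low: int, high: int) -> Tuple[int, int]:
    """Count values in [low, high]; return (matches, segments actually scanned)."""
    matches = 0
    scanned = 0
    for seg in segments:
        # Metadata alone proves this segment cannot contain a match: skip it.
        if seg.max_value < low or seg.min_value > high:
            continue
        scanned += 1
        matches += sum(1 for v in seg.values if low <= v <= high)
    return matches, scanned


if __name__ == "__main__":
    segments = build_segments(list(range(1000)), segment_size=100)
    matches, scanned = count_in_range(segments, 250, 349)
    print(f"{matches} matches, scanned {scanned} of {len(segments)} segments")
```

Because the column is sorted before it is cut into segments, each segment covers a narrow value range, so a range predicate touches only a few of them. That is why sorting and segment elimination work well together.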

~~~
jma24
The much bigger problem with Oracle In-Memory is that a lot of operations like
joins and sorts often happen in the SGA, in row form. So you end up needing a
truckload of SGA for temporary calculations. And when the SGA runs out, which
it invariably does, you end up using TEMPDB. And everything comes to a crawl.

------
capkutay
MemSQL is doing great work. However, from a strategic point of view, I wonder
how they plan to compete against two of the world's largest DB vendors (SAP and
Oracle) at their core product. Oracle's flagship DB has more R&D, sales, and
marketing than a startup could ever muster, even with the help of big-name
VCs. But I'm sure that's part of startup glory. "No one thought we could do
it!" is how it always starts. I suppose they could be an acquisition target
for HP or EMC or a company that's lagging in the data management world.

~~~
ffk
They will be able to pull in customers who cannot afford to spend 3 million on
hardware for a database but still need very high performance. There should be
enough to build a business in that niche.

Those who can afford the high price tag will likely continue to buy from
Oracle and SAP in the short term. Over time, MemSQL can probably win some
of these customers as they establish more credibility and a proven track
record.

------
jma24
Doing some basic math... Wikipedia is around 80m rows a day, so 4 months of
Wikipedia is around 9.5bn rows. But they show 17bn on the graph.

Typical columnar compression gives about 11GB per 1bn rows, so 17bn rows
should be 187GB. The AWS machines they are using should be c3.4xlarge which
are 30GB, and 6 of them is 180GB. But you can't run an in-memory column store
at 100% RAM, you need to run it at 50-70% so you have capacity for
calculations.

Is it just me or do the results not make any sense? Seems likely they actually
had 9.5bn rows, or 4 months of data, which conveniently is what the graph shows?
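
The arithmetic above can be checked with a short sketch. Every input is the commenter's own estimate (rows per day, compression ratio, instance type and RAM); the 30-day month is an added assumption, which is why the row count comes out at 9.6bn rather than exactly 9.5bn.

```python
# Sanity check of the sizing arithmetic in the comment above.
# All inputs are the commenter's estimates; the 30-day month is an added assumption.

ROWS_PER_DAY = 80_000_000    # "around 80m rows a day"
GB_PER_BILLION_ROWS = 11     # "about 11GB per 1bn rows"
NODE_RAM_GB = 30             # commenter's figure for a c3.4xlarge
NODE_COUNT = 6


def rows_for_months(months: int, days_per_month: int = 30) -> int:
    """Estimated row count for the given number of months."""
    return ROWS_PER_DAY * days_per_month * months


def compressed_gb(rows: int) -> float:
    """Estimated compressed size in GB at the assumed compression ratio."""
    return rows * GB_PER_BILLION_ROWS / 1e9


def usable_ram_gb(percent: int) -> int:
    """Cluster RAM available for data if only `percent` of it is filled."""
    return NODE_RAM_GB * NODE_COUNT * percent // 100


if __name__ == "__main__":
    four_months = rows_for_months(4)
    print(f"4 months of rows:     {four_months / 1e9:.1f}bn")
    print(f"size of 4 months:     {compressed_gb(four_months):.1f} GB")
    print(f"size of 17bn rows:    {compressed_gb(17_000_000_000):.1f} GB")
    print(f"cluster RAM (100%):   {usable_ram_gb(100)} GB")
    print(f"usable RAM (50-70%):  {usable_ram_gb(50)}-{usable_ram_gb(70)} GB")
```

Under these figures, 17bn rows would not fit even at 100% of the cluster's RAM, while roughly four months of data fits only toward the upper end of the 50-70% band.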

------
angryasian
Is MemSQL a free product? I can't seem to find anything about pricing or
otherwise on their site.

~~~
gcommer
The 60x cheaper benchmark is based on hardware costs, comparing 6 EC2 nodes
(they say ~$20k worth of servers) vs a > $million Oracle server.

------
pacaro
I'm amused that Nikita Shamgunov's bio suggests that he was a 'distinguished
senior database engineer' at Microsoft. The title "Distinguished Engineer" at
Microsoft refers to a pretty select group of people, not just any senior dev.

~~~
nikita
That's right. "Distinguished Engineer" is a title at Microsoft. At the time of
my tenure at Microsoft my title was "Senior Engineer". The bio was written
without referring to the title, but rather the fact that I was "distinguished"
with Gold Star awards and the HiPo program. I asked to remove "distinguished" from
the bio to remove any confusion. It's already live.

~~~
pacaro
This is a classy response. Corporate bios, like any PR, can walk a fine line
in what they communicate.

------
xtacy
I was hoping the blogpost would give at least a pointer or some insights about
the actual experiment. Am I missing some context here?

------
timetraveller
I wonder if it's a good idea to list Comcast as one of your clients...

~~~
dguaraglia
Sure, why not? It's a big company with a lot of brand recognition. The people
most likely to be interested in this kind of product will first think 'oh,
they have a big client!' before they think 'oh, they are used by a company I
dislike because of anecdotal data or their stance on so-called net
neutrality.'

Heck, I'd be even more interested if they told me they serve the NSA or
Palantir, as much as I dislike everything they stand for.

~~~
ericfrenkiel
This would be of interest to you then:
[http://siliconangle.com/blog/2014/09/29/heres-why-the-intell...](http://siliconangle.com/blog/2014/09/29/heres-why-the-intelligence-community-bought-a-stake-in-memsql/)

~~~
dguaraglia
Ohhh... interesting. Thanks for the link!

------
currysausage
A font with strokes 1px thick when you _scale it to 200 percent?_ What a great
design! I'm sure it looks so cool on the designer's retina display.

You don't want me to read your page? Fine, then I won't.

Edit: Hehe, now the contrast of my comment is about as bad as the contrast of
the blog. I guess criticizing Silicon Valley's favorite design fad of the
2010s is too much for the HN crowd to stomach.

~~~
nacs
That's just the 300-weight version of the Lato font, one of the most widely
used fonts on the internet.

If you have trouble reading it, you may need a new monitor or get your
eyesight checked.

~~~
currysausage
_> you may need a new monitor_

No, I'm actually perfectly fine with my Eizo. It's not its fault that it isn't
able to show half-pixels. That's by design.

 _> the Lato font, one of the most widely used fonts on the internet_

I seriously doubt that this is the case. Last time I stumbled across it was
heartbleed.com, and I had to use the F12 console to de-Lato the page, that's
why I remember. That was half a year ago, thankfully.

If you spend your day reading about cool parallax scrolling plug-ins for your
site, then you might see a lot more Lato than I do, yeah.

~~~
nacs
> I seriously doubt that this is the case

[http://www.google.com/fonts#Analytics:total](http://www.google.com/fonts#Analytics:total)
. 5th most popular font at the moment ( 2,059,357,755 views for Lato in the
last 7 days and 64+ billion in the past year).

And that doesn't even include the Typekit deployments (or locally hosted
ones).

~~~
currysausage
_> 5th most popular font at the moment_

So, Google Web Fonts now is the benchmark for professional screen typography?

None of the sites that I visit on a regular basis use Google Web Fonts. The
only one using any webfont for body text is The Guardian, which uses an
excellent custom font designed by Christian Schwartz and Paul Barnes.

~~~
nacs
> So, Google Web Fonts now is the benchmark for professional screen
> typography?

It's certainly a better benchmark than "none of the sites that _I_ visit on a
regular basis"...

And if hard numbers like 64 billion+ views per year don't convince you then I
won't bother.

------
SEJeff
The name makes me think of the line from the "MongoDB is web scale" video[1]:
'Is /dev/null web scale?'

[1]
[https://www.youtube.com/watch?v=b2F-DItXtZs](https://www.youtube.com/watch?v=b2F-DItXtZs)

------
threeseed
Utterly stupid post from a company who should know better. The biggest reasons
that companies buy expensive appliances is (a) data sovereignty and (b)
ongoing support. The cloud (e.g. AWS) does not have decent answers for this yet.

And the companies that are buying these types of appliances like we do
couldn't care less about a few hundred thousand dollars.

~~~
koenigdavidmj
> _The biggest reasons that companies buy expensive appliances is (a) data
> sovereignty and (b) ongoing support_

Don't underestimate reason (c): because some C-levels went golfing together.

~~~
etherael
That's actually more likely to be reason (a).

