
MemSQL (YC W11) Raises $2.1 Million From Ashton Kutcher, SV Angel And More - ericfrenkiel
http://techcrunch.com/2011/07/12/y-combinator-alum-memsql-raises-2-1-million-from-ashton-kutcher-sv-angel-and-more/
======
saurik
The thing I keep finding confusing about these companies is: did anyone who
cares about performance bother coding their app to use MySQL in the first
place (as opposed to a database like PostgreSQL or Oracle, which have better
index search and lock algorithms)?

This is really important, as MySQL has a weird dialect of SQL with all kinds
of broken MySQL-isms (including fundamental things like not using the SQL-
standard meaning for the various forms of quoting) as well as drastically
limited support for supposedly-advanced things like sub-selects and truly-
advanced things like windowing and partition functions...

Seriously: I don't think I have any apps that would actually /run/ on
something that spoke MySQL without recoding all of the data access parts (and
that's pretty much what a web backend is) from scratch, swearing and cursing
the entire time about why I don't get to use X or Y or Z feature.

To me, this is just "yet another of those MySQL optimizer companies" that we
see come and go year in and year out, and one that is almost certainly going
to experience the same fate as all of the other ones: they are a dime a dozen,
and almost always fail. The failure rate on PostgreSQL optimizer companies,
however, seems much lower, despite the fact that they don't seem to get the
fancy funding (probably because, if you don't take into account the underlying
demographic of "performance-savy" from "database user" it looks like a smaller
market).

~~~
ericfrenkiel
The article should definitely have been clearer - we're not a plugin to MySQL
by any means. MemSQL is a completely new database built for modern hardware
scenarios. That means lock-free data structures optimized for DRAM, access
patterns, among others.

We maintain the MySQL protocol on the wire to minimize friction for existing
applications.

We really see ourselves as a counterpoint to NoSQL, which has traditionally
offered speed but sacrificed functionality. MemSQL is as fast, scalable, yet
retains SQL.

~~~
saurik
I actually understood that you weren't a plugin to MySQL, but the fact that
you chose to "maintain the MySQL protocol on the wire to minimize friction for
existing applications" is confusing to me, as in my experience the
performance-savy database operators did not write their application to target
MySQL due to all of the reasons I stated (poor or missing locking and indexing
algorithms, such as no hash joins, and an inability to perform partitioning,
windowing, or ubiquitous/efficient sub-selects), and therefore your product
actually maximizes friction for those seemingly key users (due to MySQL having
a horribly unique and generally incompatible take on SQL syntax in some areas,
and outright missing functionality elsewhere).

~~~
adw
Think of it as parasiting off the existing library of MySQL libraries for
every language under the sun. It's why every key-value store speaks its wire
protocol _and also_ memcached.

~~~
saurik
So, not using the same actual SQL syntax? Just the binary protocol?

~~~
tptacek
The binary protocol for MySQL is not, as I recall, particularly idiosyncratic.
It's mostly a vehicle for asking the server to parse a text-based query,
return a handle, and bind variables to executions of that query (when I
implemented it, MySQL didn't even seem to use all the functionality of the
protocol).

------
sallen
Not a lot of info on their website. But it strikes me as similar to VoltDB
(<http://voltdb.com/>). Especially the emphasis on OLTP queries.

Is it based on the same paradigm of single-threaded serial access? What
differentiates it from VoltDB? Is it the MySQL compatibility and ability to
use ad-hoc queries instead of stored procedures? (If so, how do you make that
bit fast?)

------
iamelgringo
Hey, guys. Congrats on the funding.

Why would I use MemSQL as opposed to relational database on SSD? RDMS on SSD
gets me a similar speed boost as opposed to running on a completely new
database.

~~~
ericfrenkiel
There are a few reasons in fact:

\+ DRAM is actually much higher on the memory pyramid than SSDs, so MemSQL
handily beats MySQL on SSD any day

\+ Putting MySQL on SSD doesn't change the fact that MySQL isn't optimized for
SSDs or DRAM; so you're still using b-trees, latching and locking, etc.

\+ SSDs have poor write endurance, meaning it has low number of write cycles

\+ You would still have to shard your database yourself; while MemSQL helps
manage that process

\+ DRAM is available on the cloud today while it will be some time before SSDs
are offered on most public clouds.

~~~
viggity
so there is zero persistence to disk? Your customers are just counting on
power systems never failing and the ram never fails?

~~~
ericfrenkiel
We are most definitely fully persistent. Nikita has elaborated more on this
further down the thread.

------
gfaremil
How MemSQL compares with TimesTen? Or even with Oracle RDBMS with bunch of
RAM? Or Oracle RDBMS with Oracle In-Memory Database Cache (TimesTen) on
application app servers? What is MemSQL "secret sause" comparing to
competitor?

EDIT: Just so say that I'm so very happy to see a company which tries to solve
real problems not Web 2.0 social networking crap is able to raise money.

EDIT2: The company like this needs $40M investment not $2M.

~~~
pbreit
Well, it's a lot cheaper!

I don't see why a pure software company needs that much funding in the
beginning. Their expenses are close to zero.

------
jganetsk
I might be an idiot for asking this question, but how do you guarantee data is
never lost?

~~~
ericfrenkiel
that's actually a great question! we keep replicas in hot standby in case a
node fails in the cluster. And to survive a datacenter/cluster failure, we
employ continuous snapshotting.

~~~
jganetsk
Replicas are master-slave or something Paxos-esque?

So, continuous snapshotting means writing to disk asynchronously?

~~~
nikita
Replicas are master/slave, and continues snapshotting is asynchronous. On top
of that, we can write a write ahead transaction log. If you put snapshotting
and write ahead logs on different I/O subsystems you only produce sequentual
writes, which guarantees that you get the most of your IO. Of course, that's
on a single box.

For our distributed cluster we have the option of writing the write ahead log
to the network "log servers" so you can do it even faster. In this case once
ones K out of N log servers receive a log record we acknowledge the
transaction. Then eventually the log servers flush it to disk. In this case
you are able to write very quickly (you can spin off as many log servers as
you like), you store your data in memory in one copy and you're durable.

~~~
sigil
What happens when master fails? jganetsk mentioned Paxos, which you could use
to reliably elect a new master. If I were evaluating your tech for usage at
scale, I'd be pretty interested in failure modes, and pretty skeptical of
anything that didn't employ distributed consensus. There may be other
distributed consensus algorithms out there (one was apparently discovered in
fruit fly cells [1]), but Paxos is the only one I've seen so far with a
_proof_ of eventual consistency.

An opensource implementation of Paxos: [2]

[1] [http://www.kurzweilai.net/fruit-fly-nervous-system-
provides-...](http://www.kurzweilai.net/fruit-fly-nervous-system-provides-new-
solution-to-fundamental-computer-network-problem)

[2] <https://github.com/ha/doozer>

~~~
nikita
Thanks, sigil. We are certainly aware of various options. It's either Paxos,
or a simpler failover mechanism used by traditional dbs, such as SQL Server or
Oracle on the per node basic based on replication.

------
zach
A real victory for YC as well as the founders (congrats to them - great
software!)

This is the best example yet of YC's considerable ability to take a very deep
tech, non-consumer startup, help vet and guide its technology and
marketability, and provide the credibility to get it well-funded.

Fantastic round for a database software startup, a breed written off not long
ago. I'd think it wouldn't happen maybe anywhere besides YC. Really shows why
YC should be (and is) sought out by founders looking to build this kind of
company.

~~~
nikita
Thanks Zach!

------
gattis
I never heard of a database that doesn't reside in memory if you have enough
ram. Most of the time, it's the OS making this happen by maintaining a cache
for disk reads, and flushing writes to disk asynchronously.

Does MemSQL just have more compact data structures that let more stuff fit in
memory?

~~~
jganetsk
My guess is if you design your database to reside in RAM, you can use pointers
in the data structures. You also can drop all layout optimizations that are
meant for linear access (optimize for random access instead).

~~~
nikita
This is a correct observation!

------
jonathanmarcus
I would love to have heard an investment pitch to Ashton Kutcher and Guy
Oseary about databases!

------
boris
When I hear about such "MySQL replacement" databases built from grounds up by
companies that are just a few months old, I always wonder how complete their
support for MySQL feature set is. Do they support every SQL
feature/quirk/misfeature of MySQL?

It seems to me that unless it is a complete one-to-one mapping, it is going to
be a nightmare to explain what is supported and what is not.

~~~
nikita
For total MySQL compatibility, you could only modify the MySQL storage engine
via their plugin system.

As the market has shown, this isn't a compelling improvement. For us, every
part of MemSQL has been optimized for memory - this gives us a 30x boost,
rather than the 4-5x you might see just switching out the storage engine.

The decision to support the MySQL protocol on the wire was made to reduce
friction for the application. However, that doesn't mean we will support every
quirk or 'misfeature' for MySQL just to say we're "MySQL compatible."

Our value proposition is easy performance at scale. In working with customers,
it's been pretty straightforward to explain the MemSQL surface area.

------
gojomo
How does performance compare to MySQL running from a RAM disk?

(I can imagine designing for RAM from day one can help... but curious as to
how much it helps, once the actual storage layer and investment in RAM is
identical.)

~~~
nikita
Order of magnitude faster. We will be publishing benchmarks shortly

~~~
gojomo
Order of magnitude: base 2, base 8, base 10, base 16?

(Sorry, 'order(s) of magnitude' is a pet peeve of mine.)

~~~
nl
_base 10_

Is that binary?

Seriously - if that is a pet peeve, then you need bigger problems. I'll make
it easy for you: if someone says "order of magnitude" they always mean base
10, unless otherwise noted.

~~~
gojomo
"10x" is shorter, less ambiguous, and less pretentious than "order of
magnitude". Same with "100x" or "1000x" compared to "orders of magnitude".

Those places where "order of magnitude" is most commonly used – especially
sciences and computing – are those places where other bases are possible, and
where precision matters.

Your 'make it easy for me' doesn't match my experience. Often "orders of
magnitude" is used as hand-waving, to make a difference sound more impressive
than the speaker is willing to go on record about. When tangible numbers
arrive, it may only turn out that what was meant was 'a large multiple', less
than the actual power-of-ten that was implied by the high-falutin' phrase.

If you've got a multiple in mind, and want to impress careful quantitative
listeners, name the multiple. Say "around 10x" or "up to 10x" or "more than
10x". In comparison, "order of magnitude" sounds like wordy posturing.

~~~
nl
"order of magnitude" is a deliberate way to present something as a rough
estimate.

In that way it is a non-mathematical form of big-O notation.

You might prefer saying 10x or whatever, but complaining about the _base_ is
unnecessarily pedantic.

Edit: Actually, Wikipedia even says _In its most common usage, the amount
being scaled is 10 and the scale is the (base 10) exponent being applied to
this amount (therefore, to be an order of magnitude greater is to be 10 times
as large)._

------
nilanp
Ok - so i posted 2 questions:

1) Can @memsql scale writes ?

2) Is @memsql ACID compliant ?

Your answering on sharding indiciates that @memsql scales writes in exactly
the same way as MYSQL - sharding

Re. ACID complaince. I'm guessing you are ?

------
Ishamael
Looks awesome! Is this as fast as nosql datastores?

~~~
nikita
Yes, that's the idea

------
old-gregg
Congrats MemSQL team! We've been in the same YC batch, when it comes to
databases, they know their craft.

------
ethank
Hate to ask such an obvious question, but any invites?

~~~
ericfrenkiel
Absolutely! Sign up at www.memsql.com for an invite =)

~~~
TY
Signed up as well - and looking forward to the invite.

This looks like a very interesting project!

