The thing I keep finding confusing about these companies is: did anyone who cares about performance bother coding their app to use MySQL in the first place (as opposed to a database like PostgreSQL or Oracle, which have better index search and lock algorithms)?
This is really important, as MySQL has a weird dialect of SQL with all kinds of broken MySQL-isms (including fundamental things like not using the SQL-standard meaning for the various forms of quoting) as well as drastically limited support for supposedly-advanced things like sub-selects and truly-advanced things like windowing and partition functions...
Seriously: I don't think I have any apps that would actually /run/ on something that spoke MySQL without recoding all of the data access parts (and that's pretty much what a web backend is) from scratch, swearing and cursing the entire time about why I don't get to use X or Y or Z feature.
To me, this is just "yet another of those MySQL optimizer companies" that we see come and go year in and year out, and one that is almost certainly going to experience the same fate as all of the others: they are a dime a dozen, and almost always fail. The failure rate of PostgreSQL optimizer companies, however, seems much lower, despite the fact that they don't seem to get the fancy funding (probably because, if you don't separate the underlying demographic of "performance-savvy" users from "database users" in general, it looks like a smaller market).
The article should definitely have been clearer - we're not a plugin to MySQL by any means. MemSQL is a completely new database built for modern hardware. That means lock-free data structures, DRAM-optimized access patterns, and more.
We maintain the MySQL protocol on the wire to minimize friction for existing applications.
We really see ourselves as a counterpoint to NoSQL, which has traditionally offered speed but sacrificed functionality. MemSQL is just as fast and scalable, yet retains SQL.
I actually understood that you weren't a plugin to MySQL, but the fact that you chose to "maintain the MySQL protocol on the wire to minimize friction for existing applications" is confusing to me, as in my experience the performance-savvy database operators did not write their application to target MySQL due to all of the reasons I stated (poor or missing locking and indexing algorithms, such as no hash joins, and an inability to perform partitioning, windowing, or ubiquitous/efficient sub-selects), and therefore your product actually maximizes friction for those seemingly key users (due to MySQL having a horribly unique and generally incompatible take on SQL syntax in some areas, and outright missing functionality elsewhere).
Think of it as piggybacking on the existing ecosystem of MySQL client libraries for every language under the sun. It's the same reason so many key-value stores speak the memcached wire protocol.
The binary protocol for MySQL is not, as I recall, particularly idiosyncratic. It's mostly a vehicle for asking the server to parse a text-based query, return a handle, and bind variables to executions of that query (when I implemented it, MySQL didn't even seem to use all the functionality of the protocol).
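To make the "mostly a vehicle for asking the server to parse a text-based query" point concrete, here is a minimal sketch of how a client frames a plain text query as a MySQL protocol packet (a 3-byte little-endian payload length, a 1-byte sequence id, then the payload, whose first byte is the command code):

```python
import struct

def com_query_packet(sql: str, seq: int = 0) -> bytes:
    """Frame a text query as a MySQL client packet:
    3-byte little-endian payload length, 1-byte sequence id,
    then the payload (0x03 = COM_QUERY, followed by the query text)."""
    payload = b"\x03" + sql.encode("utf-8")
    header = struct.pack("<I", len(payload))[:3] + bytes([seq])
    return header + payload

pkt = com_query_packet("SELECT 1")
# 9-byte payload: the COM_QUERY byte plus the 8 characters of "SELECT 1"
```

The prepared-statement half of the protocol (prepare, get a statement handle back, bind variables, execute) layers different command codes over this same framing; the framing itself is the unremarkable part.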
But MemSQL is targeting MySQL users (at least initially)! Its proposition is more compelling to current MySQL users than it is to Postgres (or whichever higher-performing DBs you are referring to) users. And frameworks will need little effort to support the new DB, and can optimize over time.
Not a lot of info on their website. But it strikes me as similar to VoltDB (http://voltdb.com/). Especially the emphasis on OLTP queries.
Is it based on the same paradigm of single-threaded serial access? What differentiates it from VoltDB? Is it the MySQL compatibility and ability to use ad-hoc queries instead of stored procedures? (If so, how do you make that bit fast?)
Why would I use MemSQL as opposed to a relational database on an SSD? An RDBMS on an SSD gets me a similar speed boost without switching to a completely new database.
Thanks for the responses. I have a few follow ups I'd love to get your perspective on.
+ DRAM is actually much higher on the memory pyramid than SSDs, so MemSQL handily beats MySQL on SSD any day
DRAM is an order of magnitude faster than SSDs. It's also an order of magnitude more expensive, however. You can get a TB of SSD for about $2k, which means 50TB of SSD for $100k, or the price of a single engineer for a year. Meanwhile, a quad high-mem EC2 instance (64GB of memory) runs around $1,700 a month, or about $20k a year.
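The back-of-envelope arithmetic above (using the 2012-era prices quoted in the comment) works out like this:

```python
# Cost comparison using the figures quoted above (circa-2012 prices).
ssd_per_tb = 2_000            # USD per TB of SSD
ssd_total = 50 * ssd_per_tb   # 50 TB of SSD
ec2_monthly = 1_700           # quad high-mem EC2 instance, 64 GB DRAM
ec2_yearly = ec2_monthly * 12

print(ssd_total)   # 100000 -> ~one engineer-year buys 50 TB of SSD
print(ec2_yearly)  # 20400  -> ~$20k/yr rents just 64 GB of hosted DRAM
```

So per byte, the rented-DRAM option is several orders of magnitude more expensive than owned SSD, which is the crux of the commenter's question.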
Thoughts?
+ Putting MySQL on SSD doesn't change the fact that MySQL isn't optimized for SSDs or DRAM; so you're still using b-trees, latching and locking, etc.
True, but there are still a number of optimizations that can be made.
Issues like locking become a problem when IO is slow and transactions have to wait for locks to clear. When locks clear 3 orders of magnitude faster on SSDs vs spinning disk, those issues tend to fade away.
+ SSDs have poor write endurance, meaning they support a low number of write cycles
Agreed, SSDs have worse write endurance than RAM. You should be able to architect around that fairly easily, however.
That being said, I spoke with Blekko's CTO, and out of the 700 SSDs they purchased two years ago to serve their index from, they've had 2 drive failures. Their index is write only.
+ You would still have to shard your database yourself; while MemSQL helps manage that process
Not having to shard is a Good Thing(TM). But, if you can throw 10TB of SSD storage in a server, sharding should be less of an issue unless you're dealing with Google(n) sized data problems, no? I suspect that 10 TB of DRAM MemSQL would be cost prohibitive.
+ DRAM is available on the cloud today while it will be some time before SSDs are offered on most public clouds.
Absolutely correct, but SSDs are rapidly replacing spinning disk in enterprise data centers. I suspect that the first VPS service that offers SSD storage is going to rapidly kill the competition because of the insane performance differential. How long am I going to be willing to pay an extra $10 a month for 1 GB of extra RAM in my VPS when I can purchase an 80 GB SSD for $120?
+++++++++++++++++++++
I'm not trying to troll or be difficult. I'm honestly looking for answers to the in memory / NoSQL / MemSQL vs SSD debate. I've looked at the issue from a number of different angles, but I keep coming down on the SSD side. I figured you would have some of the best, most thought out rebuttals to the issues I've been thinking about.
But everybody seems to be using cloud VPSes because they seem very wary of putting some hardware together and throwing it in a rack. I suspect that because I've been rolling my own PC builds for more than a decade, I'm probably much more excited about throwing 20 SSDs in a 2 x 8 core server to see what it would do than someone who has purchased Mac laptops for most of their career. (not flaming, just thinking out loud).
I have a theory that a side effect of Apple owning the startup developer market is that developers think of servers as something residing in a cloud to be rented rather than built from parts. And, because of that, things like NoSQL and non-relational databases have gotten a ton of buzz rather than throwing a little bit of money at hardware to solve the problem.
Typical relational databases are designed from the ground up with the assumption that storage behaves like a traditional spinning disk (high seek time). Whatever speed boost you're getting from an SSD is far below the potential.
How does MemSQL compare with TimesTen? Or even with Oracle RDBMS and a bunch of RAM? Or Oracle RDBMS with Oracle In-Memory Database Cache (TimesTen) on the application servers?
What is MemSQL's "secret sauce" compared to competitors?
EDIT: Just to say that I'm very happy to see that a company which tries to solve real problems, not Web 2.0 social networking crap, is able to raise money.
EDIT2: A company like this needs a $40M investment, not $2M.
That's actually a great question! We keep replicas in hot standby in case a node fails in the cluster. And to survive a datacenter/cluster failure, we employ continuous snapshotting.
Replicas are master/slave, and continuous snapshotting is asynchronous. On top of that, we can write a write-ahead transaction log. If you put snapshotting and the write-ahead log on different I/O subsystems you only produce sequential writes, which guarantees that you get the most out of your IO. Of course, that's on a single box.
For our distributed cluster we have the option of writing the write-ahead log to network "log servers", so you can do it even faster. In this case, once K out of N log servers receive a log record we acknowledge the transaction. Then eventually the log servers flush it to disk. This way you are able to write very quickly (you can spin up as many log servers as you like), you store your data in memory in one copy, and you're durable.
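The K-of-N acknowledgement rule described above can be sketched roughly like this (the `LogServer` class is a toy in-memory stand-in, not MemSQL's actual implementation):

```python
class LogServer:
    """Toy stand-in for a network log server; `up` simulates availability."""
    def __init__(self, up=True):
        self.up, self.records = up, []

    def append(self, record):
        if self.up:
            self.records.append(record)
        return self.up  # ack only if the server is reachable

def replicate(record, servers, k):
    """Acknowledge the transaction once K of the N log servers ack the
    record; the servers flush to disk asynchronously afterwards."""
    acks = sum(1 for s in servers if s.append(record))
    return acks >= k

servers = [LogServer(), LogServer(up=False), LogServer()]
print(replicate(("lsn-1", "INSERT ..."), servers, k=2))  # True: 2 of 3 acked
```

The appeal of the scheme is that the commit latency is bounded by the K fastest acks rather than the slowest disk, at the price of needing K surviving servers to recover the record.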
What happens when master fails? jganetsk mentioned Paxos, which you could use to reliably elect a new master. If I were evaluating your tech for usage at scale, I'd be pretty interested in failure modes, and pretty skeptical of anything that didn't employ distributed consensus. There may be other distributed consensus algorithms out there (one was apparently discovered in fruit fly cells [1]), but Paxos is the only one I've seen so far with a proof of eventual consistency.
Thanks, sigil. We are certainly aware of the various options. It's either Paxos, or a simpler failover mechanism like the ones used by traditional DBs such as SQL Server or Oracle, done on a per-node basis and based on replication.
Even if you put write-ahead logs on different I/O subsystems, the lock needed to ensure consistency will probably be a bottleneck (and the crappy Linux implementation of async IO on multiprocessor systems doesn't help). How do you guys organize data so it is consistent and the lock is not a bottleneck?
You don't necessarily need to write log entries in the order they are committed, nor do you need to store all of the entries in a single log - you could shard the log writing too, as long as you maintained the ordering information somewhere. So yes, you would still need to lock and sort out the order somewhere, but you already have to do that if you're ACID anyway.
Right on the money. Every log record contains an LSN (log sequence number) that allows us to reconstitute the order. Then at recovery time you merge-sort by LSN across the sharded log servers.
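Since each log server hands back its records already in LSN order, the recovery step is a k-way merge. A minimal sketch (the shard contents are made-up example data):

```python
import heapq

# Hypothetical recovery sketch: each log server shard holds (LSN, record)
# pairs in LSN order; recovery merge-sorts the shards into one global log.
shard_a = [(1, "begin"), (4, "update"), (6, "commit")]
shard_b = [(2, "insert"), (3, "insert"), (5, "delete")]

recovered = list(heapq.merge(shard_a, shard_b, key=lambda r: r[0]))
print([lsn for lsn, _ in recovered])  # [1, 2, 3, 4, 5, 6]
```

`heapq.merge` streams the shards rather than materializing them, which matters when the logs are much larger than memory.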
A real victory for YC as well as the founders (congrats to them - great software!)
This is the best example yet of YC's considerable ability to take a very deep tech, non-consumer startup, help vet and guide its technology and marketability, and provide the credibility to get it well-funded.
Fantastic round for a database software startup, a breed written off not long ago. I'd think it wouldn't happen maybe anywhere besides YC. Really shows why YC should be (and is) sought out by founders looking to build this kind of company.
I never heard of a database that doesn't reside in memory if you have enough ram. Most of the time, it's the OS making this happen by maintaining a cache for disk reads, and flushing writes to disk asynchronously.
Does MemSQL just have more compact data structures that let more stuff fit in memory?
My guess is if you design your database to reside in RAM, you can use pointers in the data structures. You also can drop all layout optimizations that are meant for linear access (optimize for random access instead).
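That difference can be illustrated with a toy contrast (this is illustrative only, not MemSQL's internals): a disk-oriented index maps a key to a (page, slot) location and pays an extra indirection per access, while a RAM-oriented index can hold a direct reference to the row itself.

```python
# Disk-style: the index stores a location, and every read goes through
# a page lookup (on a real system, possibly a buffer-pool miss and a seek).
pages = {0: [{"id": 1, "name": "ada"}]}   # simulated on-disk pages
disk_index = {1: (0, 0)}                  # key -> (page_id, slot)

def disk_lookup(key):
    page_id, slot = disk_index[key]
    return pages[page_id][slot]           # extra indirection per read

# RAM-style: the index holds a direct reference to the row object,
# so a lookup is a single pointer dereference.
row = {"id": 1, "name": "ada"}
ram_index = {1: row}                      # key -> the row itself

print(disk_lookup(1)["name"], ram_index[1]["name"])
```

Dropping the page indirection is also what frees an in-memory engine to abandon B-tree pages for structures like skip lists or lock-free hash tables that only make sense at RAM latencies.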
When I hear about such "MySQL replacement" databases built from grounds up by companies that are just a few months old, I always wonder how complete their support for MySQL feature set is. Do they support every SQL feature/quirk/misfeature of MySQL?
It seems to me that unless it is a complete one-to-one mapping, it is going to be a nightmare to explain what is supported and what is not.
For total MySQL compatibility, you could only modify the MySQL storage engine via their plugin system.
As the market has shown, this isn't a compelling improvement. For us, every part of MemSQL has been optimized for memory - this gives us a 30x boost, rather than the 4-5x you might see just switching out the storage engine.
The decision to support the MySQL protocol on the wire was made to reduce friction for the application. However, that doesn't mean we will support every quirk or 'misfeature' for MySQL just to say we're "MySQL compatible."
Our value proposition is easy performance at scale. In working with customers, it's been pretty straightforward to explain the MemSQL surface area.
It seems to me they should be able to support an intelligent sub-set of functionality. And if there are any glaring holes, they will hear from users. This doesn't strike me as a big issue.
How does performance compare to MySQL running from a RAM disk?
(I can imagine designing for RAM from day one can help... but curious as to how much it helps, once the actual storage layer and investment in RAM is identical.)
Seriously - if that is a pet peeve, then you need bigger problems. I'll make it easy for you: if someone says "order of magnitude" they always mean base 10, unless otherwise noted.
"10x" is shorter, less ambiguous, and less pretentious than "order of magnitude". Same with "100x" or "1000x" compared to "orders of magnitude".
Those places where "order of magnitude" is most commonly used – especially sciences and computing – are those places where other bases are possible, and where precision matters.
Your 'make it easy for me' doesn't match my experience. Often "orders of magnitude" is used as hand-waving, to make a difference sound more impressive than the speaker is willing to go on record about. When tangible numbers arrive, it may only turn out that what was meant was 'a large multiple', less than the actual power-of-ten that was implied by the high-falutin' phrase.
If you've got a multiple in mind, and want to impress careful quantitative listeners, name the multiple. Say "around 10x" or "up to 10x" or "more than 10x". In comparison, "order of magnitude" sounds like wordy posturing.
"order of magnitude" is a deliberate way to present something as a rough estimate.
In that way it is a non-mathematical form of big-O notation.
You might prefer saying 10x or whatever, but complaining about the base is unnecessarily pedantic.
Edit: Actually, Wikipedia even says: "In its most common usage, the amount being scaled is 10 and the scale is the (base 10) exponent being applied to this amount (therefore, to be an order of magnitude greater is to be 10 times as large)."
10x. The full answer will come from benchmarks. The short answer is that it depends on the workload, but we are very confident it's going to be 10x over MySQL on a RAM disk for a standard workload (TPC-C).