Hacker News new | past | comments | ask | show | jobs | submit login
Ex-Facebookers launch MemSQL (YC W11) to make your database fly (gigaom.com)
144 points by ericfrenkiel on June 18, 2012 | hide | past | favorite | 104 comments

I've seen a lot of these new SQL databases, and they all seem to compare themselves to MySQL. That seems like a particularly easy target, since MySQL's main strength is speed, and it's a fundamentally old-school (ie slow) design which has sacrificed everything else to get half-fast. What I'd like to know is how any of these compare in features and reliability to the featureful, reliable databases like Oracle and Postgres.

(For specific examples that I've run into recently: does it have geo-spatial extensions, and does it crash and burn and corrupt all its data if a malloc() fails)

It still uses a transaction log, but seems to make snapshots of the logs at a period of time to create compressed snapshots of the data itself.

You can turn off durability, but then you risk losing everything.

For features... forget about geo-spatial extensions - they can only do read-committed isolation level... but as they only allow for one SQL statement per transaction I suppose this might not be such a big concern.

If you were using this for something like a session management database, or something where durability is not so important then it's probably fine. Not sure I'd use it for anything that relies heavily on transaction management features.

I see other limitations, like it cannot support CHANGE COLUMN, and it can't do joins on more than 2 tables. Actually, it doesn't look like it can do FULL OUTER or RIGHT OUTER joins, either. :(

Memsql is a database that does some things incredibly well: deliver on high throughput for on small transactions and some things not as well (surface area).

Of course a new product will have certain limitation which will be removed as the product matures.

No probs, like I've said elsewhere, not attacking the product. Looks very interesting! I guess I'm trying to work out what sort of market you are trying to target here.

"and does it crash and burn and corrupt all its data if a malloc() fails"

MemSQL is fully durable (with option of sync and async durability, similar to sync and async replication) with its transactional log. If MemSQL process is killed it will recover all commited transactions on the next startup. Durability has very little performance cost on throughput as MemSQL uses only sequential I/O. And of course, durability can be completely turned off for maximum performance.

This company was probably an extremely easy investment decision:

1) Facebookers: Check

2) Data Scaling experience: Check

3) In-Memory with SQL semantics: Check

4) New-York based software that can be sold to quant funds: Check

I'm excited. It's downloading now.

Is it just me or does this entire product smell like it was designed for the sole purpose of extracting money from less than tech-savvy investors and clueless institutions?

The bullshit-bingo-lingo on their homepage is mindnumbing.

Meanwhile their actual software seems rather underwhelming, bordering on SnakeOil.

If you've ever tried to use SQL on a large, but fittable-in-memory dataset on existing row databases, then you would not be underwhelmed. Existing non-finance-industry solutions suck for things like mid-size trading queries, or (in coinlab's case) work setups that require periodic, frequent table scans.

You can buy something like kdb, but that costs $70k a year and requires that your engineers learn some extremely new semantics if they're only used to SQL and (choose any popular language here, Python to C++).

Really? I've done many, many queries on large, but fittable-in-memory datasets and with a few little tricks even MySQL can be fast (make everything a temporary table with an append only transaction table to rebuild the temporary tables if lost, don't use keys, use indexes, avoid bad practices (filtering after a join, using subselects, or needlessly avoiding HAVING clauses).

Did you mean large, fittable-in-memory, and (distributed across multiple computers and/or very reliable)? Because that is much harder.

Can you explain this further? How is using a temporary table faster? Explicit temporary tables are simply regular tables that only last your current session. Do you mean the MEMORY storage engine? What do you mean by "don't use keys, use indexes?" The term KEY and INDEX in MySQL are exact synonyms.

Temporary tables default to being mapped in memory and for more complex querying (where you have multiple queries and conditionally select further queries depending on results) are faster since the table that they are based off of could have changed, but they will not, so caching is much more likely. Keys and Indexes are not exact synonyms. Give it a try some time, do a where clause with mod(id, 5) = 0 on a keyed table and on an indexed table. The indexed table will be non-trivially faster (I've seen 3x faster).

You are clearly not talking about MySQL. CREATE TEMPORARY TABLE simply creates a physical table on disk that is destroyed at the end of the session, which can be created with any storage engine. Explicit temporary tables are handled just like "regular" tables and is dependent upon the storage engine (w/ MyISAM, indexes are stored in the key buffer, InnoDB keeps both recently accessed data and indexes in the InnoDB buffer pool).

KEY and INDEX are literally the exact same thing and are just there to support different syntaxes: http://dev.mysql.com/doc/refman/5.1/en/alter-table.html

Perhaps you are referring to primary keys vs secondary indexes, for which there is a difference in how they are physically stored on disk and in memory, particularly with InnoDB. There is a substantial performance gain to be had by having a relevant primary key that is referenced in the where clause as the data is stored in the same page as the primary key, both on disk and in the buffer pool. Secondary indexes reference the primary key, thus require two lookups.

Another key point is that temporary tables are horrible for concurrency. Obviously only one session can access them, but more importantly, they end up causing a great deal of disk activity. At the end of every session, the table is physically dropped. In ext3, this can take quite a while for large tables (not so bad in filesystems like xfs). It also can cause substantial stalls in InnoDB as the buffer pool is scanned for matching pages and are flushed out. Percona Server has innodb_lazy_drop_table to mitigate this issue to some degree, but dropping tables is still always an expensive operation. This is mostly a concern for OLTP, not necessarily OLAP servers.

So, MemSQL is a ColumnStore now? The docs didn't make that clear for me.

Anyway, why is there not a single benchmark available to support the sales-pitch?

(In your favor I'll just pretend you didn't mention kdb here...)

In your favor I'll just pretend you didn't mention kdb here

Please explain. What's bad about kdb, besides how nonstandard it is?

There is nothing bad about it (other than the price). It just seems an outrageous claim that MemSQL, with all its constraints, is even in the same game.

At the least they should come up with some seriously impressive benchmarks before dropping names like that.

I have had the misfortune of working with Q. It's a very clever little language that could be a great platform for numerical prototyping. Unfortunately, it's hobbled by a terrible programming culture. The interpreter gives cryptic single-character error messages and the C interface is written in a nightmarish soup of macros.

MemSQL is a row store. KDB is single threaded. More benchmarks are coming!

http://kx.com/faq.php seems pretty clear that kdb is multi-threaded. "Does kdb+ support multiple cores and/or multi-threading? Yes, both are built right in to the system and this makes kdb+ extraordinarily fast when compared with traditional applications, as it can make full use of all available cores. "

This ties with my own experience using it.

I'm sorry, I have to clarify this. It's a single writer database...

Any chance you could publish full info on the benchmarks?

For example, where can we find your configurations for the MySQL vs. MemSQL benchmark you show in your video? Or how big the dataset was, etc...?

Actually, what you say about the price for kdb+ is true, but Q (the new language to query kdb+, essentially the latest version of K which is now deprecated) is very much like SQL, syntaxwise.

The way I see it, this is very much targeted at financial firms that use kdb+ or products like it that perform really well for things like writing lots of tick data from dozens of exchanges at a time and querying across them quickly in a familiar SQL style.

Fast, in-memory data stores already exists with support and ease of use/deployment...I was able to set up hazelcast for my distributed cluster in minutes...I don't see the advantage of this over nosql or in-memory data storage.

It's not just you. But I think that's the whole idea. It's about fooling people. And I think some of the profits will be flowing back to Oracle, though that is not stated upfront. The marketing power of the acronym "SQL" should not be underestimated.

But that's only a query language. Not a db architecture.

Let's see the source code. That's the only way to verify the approach that is being taken.

Given the source the first thing I'd do is search for "mmap".

They already get one demerit for using a row store.

And another for disk access.


We do use mmap, but we do not use memory mapped files. We use mmap to allocate big chunks of memory from the OS, which we later distribute to our custom memory allocators.

It's not a column store. We are thinking about it, maybe some time in the future as we add more analytical feature. It's a row store.

We do write back to dist. We have a transaction log, which we truncate once it grows over a certain threshold, by taking a snapshot.

paul buchheit and max levchin seem pretty tech-savvy to me

All of the arguments for using this are contrived. How big is the total addressable market of projects where there's a huge budget and engineers who ultimately have no idea of what they're doing?

Thanks! We're super-excited to have the opportunity to build this. When we started we asked ourselves - what would a database look like if it were designed today?

We see a future where OLTP databases live in memory, and where you have hundreds of terabytes of memory and hundreds to thousands of CPUs at your disposal.

We built MemSQL to make it easy to go directly into memory and get the speed improvements we all need. What we're releasing aims to fulfills that vision.

p.s. Eric, two annoyances about your site: once I've been dumped on to the developers subsite, it's non-trivial to get back to the main company homepage.

Also, pricing. Come on, man! Pricing!!

Thanks! We'll add a link back to memsql.com =) What did you think of the video and diagram on developers?

I think none of that really matters until we see the price. There's a trial version and a free version of Microsoft SQL, but their full version can quickly top $20k without much effort - knowing how bad your "ouch" will be is key before I'd spend any time on the "ahhhh."

The diagram was fine; I didn't watch the video. I'll hit it up later.

p.s. Isn't a totally reasonable business model for a truly faster DB engine "free, everywhere, bought by Oracle?"

We think a lot about it. And even though this model definitely makes sense, we would like to build a business that can generate money and pay for itself.

Is this a joke?

Data must fit in RAM, no joins over >2 tables, no transactions, no builtin support for clustering/sharding/horizontal scaling whatsoever.

What is the advantage over memory-tables, MySQL on a ramdisk or something purpose-built (redis)?

That's a little harsh I think. It still has a transaction log, so it has certain advantages over memory tables. Some interesting enhancements to indexes also. And it is also binary compatible with MySQL, so you can actually add it as a MySQL node.

I'd say it's a database for a niche market, as one of the folks from MemSQL has been saying. It would fit in nicely under a few scenarios.

MEMORY tables have table-level locks and from a concurrency perspective, is just as bad as MyISAM, so really offer no advantage unless there is only a small number of consumers. A well tuned InnoDB table will outperform the MEMORY engine )(after the warm-up period) as soon as you have more than a handful of users performing write operations.

Advantages are clear. Blazing speed, transactional durability, ease of use and flexibility that come from using a well known protocol and SQL dialect. Did I miss something?

On the other hand, if your data fits in RAM and you don't need to join more than two tables and you don't need clustering/sharding whatsoever and you need the full power of SQL, what would you use?

> ..something purpose-built (redis)?

Errr maybe "features" like GROUP BY with HAVING? I think you're trolling, Mr Moe.


Says the guy who resorts to selective quoting to make a point? You may want to review my original comment where I mention two other options.

nope. you're definitely a troll

MemSQL does not yet support JOINs on more than 2 tables. [1]

1. http://developers.memsql.com/docs/1b/sql/join.html

Nor transactions with more than one statement. And only read committed isolation level. [1]

1. http://developers.memsql.com/docs/1b/isolationlevel.html

This seems really confusing and like it might be a maintenance nightmare http://developers.memsql.com/docs/1b/memory.html

Do you need to ensure that all of your data fits in memory? It's easily possible to have ten gigs in your db but only need a small amount of it to be hot in ram. Does it take a really long time to restart if it has to warm all the data?

Good questions - no idea! Can one of the developers of MemSQL clarify this? How long does it take to startup a large database of several 100 GB?

Recovery on the system restart is going at the speed of hard disk.

Can you pin how much data from a table can be placed into memory, or does the whole table need to be placed in memory? I can see settings that limit the transaction log memory usage, but no way of reducing the actual amount of data in-memory.

The whole table has to fit in memory. RAM is getting very cheap, you can buy 1Tb for 12K today.

But for anyone who is renting the server, it's $$$$ monthly for a fraction of that.

Recovery on the system restart is going at the "sequential scan" speed of hard disk.

The "sequential scan" speed of a generic SSD is pretty good. It's even better on something like FusionIO.

How it compares to VoltDB, which used similar approach and already "combines the proven power of relational processing with blazing speed, linear scalability and uncompromising fault tolerance" (according to VoldDB website) ?

There are few differences with VoltDB. 1. It's much easier to use (MySQL compatibility, ad-hoc sql no java dependency) 2. It doesn't have any issues with data skews. We are using lockfree skiplists vs b-trees partitioned by core. 3. I believe we are faster on a single box, however this claim should obviously be verified by a third party.

Congrats to the MemSQL team. It's always validating to see more entrants into the space. The comparison between the two systems is fascinating; the approach taking is really different than VoltDB, but the systems share some similar choices as well.

As for your comments:

1. Ad-Hoc SQL in VoltDB is a current area of focus in VoltDB development. We use similar compilation and expect them to be no slower than our stored procedures in the 2012 timeframe. Starting with our release next week, we'll also support multiple SQL statements in a serialized/ACID ad-hoc transaction.

2. On a single-node (or a several replicated nodes), VoltDB doesn't really have many issues with data skews. Rebalancing a distributed system is also in development, but hasn't turned out to be much of an issue with our production customers.

3. Faster at what? Single statement non-durable transactions? That's probably true for simple SQL operations for the time being. It gets murkier once you consider VoltDB features like materialized views that avoid table-scans altogether. I am also curious how durability affects performance though.

But I'm really interested to revisit this comparison once MemSQL supports real transactions and replication/sharding. For the time being, most of the VoltDB use cases we target won't run on it.

There's another in memory mysql replacement whose name escapes me right now. How are they doing?

Ycombinator seems to have backed a few of these next-gen databases. It will be interesting to see who wins.

EDIT: Rethinkdb. But it looks liked they've dumped their mysql engine and gone to a pure key-value store.

Xeround looks really interesting (http://xeround.com/). Seems to do everything in RAM and takes care of any replication / backup & scailing for you. Looks like it would get expensive with large amounts of data though... anyone using this in production?

One of the things that separate us from Xeround is the fact that you can see the benefit of super high throughput on any linux box, including your laptop. You can download the software and use it by yourself.

Xeround is a sas model, where you can't get exact same experience as on your local machine.

What do you think of Riak (basho.com)?

Riak is a great product with an incredible support team, if you choose to use their support.

I'm obviously missing something. Is SQL really the bottleneck for SQL performance? I'm not sure how much compiling down to C++ will help.

Is there more to this than the article suggests?

Most times that the press mention SQL, they actually mean "Relational Database Management System". But in this case, MemSQL seems to be doing two things: reducing locking semantics (the only transaction isolation level is read committed), and doing something or other to speed up query compilation.

Frankly, I can't see how their query compliation is much better than anything Oracle, SQL Server or PG does already. (see here: http://developers.memsql.com/docs/1b/codegen.html) Perhaps there is no plan cache in MySQL, or the plan cache is inadequate?

Incidentally, you don't want to use it if you need to do multi-statement transactions. Can't be done. Check out the documentation here:


It has a few extra tricks up its sleeve to enhance indexes in MySQL, but again nothing really particularly spectacular. http://developers.memsql.com/docs/1b/indexes.html

More interesting for me, is that it can be hosted on EC2, which is pretty neat!

Query compilation works incredibly well for high throughput. If you have a lot of small transaction (OLTP) you can benefit tremendously by dramatically reducing the amount of instruction you burn per query. Lots of database vendors know that, but they don't do it b/c they often have "one size fit all" dbms. BTW you will see some changed from the big db vendors as well.

There is no plan cache in MySQL. SQL Server has it, but it's nearly not as efficient. I know, since I used to work there.

Multistatement transaction are coming. We are beefing up SQL surface area, as well as working on many more exciting things.

Not sure what you are comparing against. However, from what I've seen of SQL Server execution plans, actually it's extremely robust and I would have to argue that it can often be far more efficient than even Oracle. I've used both.*

Incidentally, sorry if I seem to be disparaging the product. I'm actually looking at it myself, but want to know what the limitations are. Multi-statement queries would be a big win. Are you considering implementing higher isolation levels? Also, as was asked above, how are the startup times for multi-gigabyte/terrabyte databases?

* Hint for the unwary - if you run a SELECT statement with a WHERE 1=0 AND <rest of predicates> (for brain dead "dynamic" SQL generating apps), the initial compilation can take a dramatically long time in Oracle 10g (and I believe 11g, but cannot confirm). Oracle have not got any mechanism to shortcut the query compilation for such queries. A bit obscure, but I never saw this problem with SQL Server. Oracle support were stumped by the issue too and gave me short-shrift ("it might be fixed in an 11g patch... some time").

No worries :). MemSQL does very well for OLTP like workloads, but it's not an "all size fits all" database.

Recovery goes at the speed of disk.

The compile time for memsql is mostly code generation time. Once the code is generated/compiled MemSQL is doing a very good job matching the queries against the existing plan by removing parameters from the query text.

:-) I really like what you are attempting to do here. What would you say your core market is?

Some of the stuff is on the website (Capital Markets, Ad Serving, Mobile). It will be expanding as we add clustering and more analytical features.

Most DBMS products have "Stored Procedures" which, essentially, are compiled chunks of SQL that you can call by name, e.g.


On very large queries (many table joins) there can be a noticeable performance hit while the DBMS figures out the optimal query plans. Stored procedure compile and store this plan after first execution (some pre-compile) so the DBMS can just look it up in the cache.

It also removes the need to do runtime parsing (compiled vs. interpreted languages). There are more nuances to Stored Procedures than I can put here, but here's a link to a more in-depth analysis I wrote: http://stackoverflow.com/questions/59880/are-stored-procedur...

In general, MySQL has limited query plan cache (last I read, it was one cache per database connection and was wiped out when the client disconnected), so this is definitely a win. However, since every other DBMS out there already has robust query cache, I don't think the Postgres/Oracle/MS SQL etc. folks are losing any sleep.

Supporting stored procedures doesn't mean compiling to native code. E.g. sql server just stores a compiled plan so saves of parsing/optimization. Execution is still an interpretation.

Giants are investing into big memory technology no doubt about it, which is great. That's the beauty of the huge big data market - one size doesn't fit all.

SQL Server supports .NET stored procs, which are compiled code bits. Oracle / Sybase support Java.

I believe some DBMS vendors are working on the interpreted nature of the execution, but basically the query plan determination makes up the bulk of the parsing/execution time. Compiling SQL to something... "native" doesn't provide a whole lot of win in terms of execution latency.

I think its that its an in memory database, optimised for memory as opposed to disk. The SQL to C++ seems a bit of a confusion to me, I guess it means it has a JIT SQL compiler to optimise the query to native opcodes. Meh. But the in memory and mysql wire compatibility are big wins.

Over what? The big iron DB I use is to all intents and purposes in memory in any case. The only Input requirement is to pull the data in at some point (which can be forced) and the Output requirement is to write to a logical log (which we buffer in any case). It flies.

yes, memsql is an in-memory database. the big innovation here is that we're literally hardwiring the SQL into the server and matching queries to those compiled query templates.

Since we work in memory, you see a huge perf boost with code generation because there are no lags with disk.

I don't believe this is the case - it's not an in-memory database.

Even if it is (and I can't see anything on their developer website) then if it's ACID compliant, like they say, then it will have to be writing to the disk at some point - at the very least it will need a transaction log. Otherwise, it's not really usable for anything you need to persist to long term storage!

Yes, its not clear on their site, but the gigaom article states: "As its name implies, MemSQL achieves its fast performance in part by keeping data in memory".

Yes, it does write to disk (append only logging I think) but my point was, that if you are keeping ALL your data in memory, you can optimise the storage for fast queries, as opposed to a hybrid / paged memory and disk system.

Oops, sorry - I'm wrong. Actually, according to their documentation:

"By default, MemSQL runs with full durability enabled: transactions are committed to disk as a log and later compressed into full-database snapshots. The snapshot and log files are stored in the datadir directory configured in memsql.cnf. When MemSQL is restarted, it will recover itself into the state it was in before the shutdown, by reading the snapshot and log files in datadir" [1]

I missed that piece of documentation on their website. I think they could do better to explain better the advantages and how their technology works. Their product overview doesn't really tell you much of anything, except that they have an enhanced query parser. [2]

1. http://developers.memsql.com/docs/1b/durability.html

2. http://developers.memsql.com/

I am not sure this solves anything. From their implementation details I could just mount mysql on a memory filesystem and have 100% of MySQL's feature. If I need near 100% assurance that the data will be saved, there is always block replication, transaction logs and regular replication. their transition module adds nearly nothing but overhead, since stored procedures are better, and if you have the need for ultra performance, there is always the ability to access mysql at the command level and bypass sql entirely.

You could do that, but you won't be able to achieve as high throughput in highly concurrent environments. MemSQL achieves this via code generation and lockfree data structures.

We will also put the technology on steroids, by building a clustered system

> We will also put the technology on steroids, by building a clustered system

Isn't that the same as MySQL Cluster, minus the SQL-to-C++ part (and maybe lock-free structures)? Why not fork MySQL Cluster to add the SQL-to-C++ code generation? Then it seems you could claim (from MemSQL site) "MemSQL is ... still faster than other in-memory databases because of SQL-to-C++ code generation" and also support every feature of MySQL Cluster. Why reinvent the wheel only to create something with great limitations compared to the existing wheel?

"Why not fork MySQL Cluster to add the SQL-to-C++ code generation (and lock-free storage engine)?"

:) It is a little more complicated than that.

lets see some stats. Its a bit unfair to compare an in memory database with a severely truncated feature set to a more robust sick based system.

Correct! Different scenarios require different solution. One size doesn't fit all. And it's nice when solutions play nicely with each other.

lockfree data structures FTW!

There are many bottlenecks and it really depends on the workload. Sometimes it's IO, sometimes it's CPU. When you have a throughput problem, yes, it's the SQL.

Why should I use this over MySQL Cluster?

MySQL Cluster is lock-free, distributed, HA, and has better performances:



And MySQL Cluster has been used in mission critical apps for years.

What does MemSQL give me that MySQL Cluster doesn't?

Question - how well does the query parser handle stale queries? SQL Server for the longest time had issues with plan stability in that the plan became too stable. When the data distribution changed dramatically, the queries didn't age out of the cache and the queries would do such things as use the wrong index, or not work out the correct cardinality of a table and then use an index where really a full table scan would have been better... and so on.

How does the database handle this sort of thing?

Right now you can handle this with query hints. The plans are stable, but they are attached to the query text. When you throw a hint in there memsql generates a new plan since the query text changes.

To clarify - if you add the query hint, run the query, then remove the query hint then has the original query's plan expired?

Edit: I know I'm asking a lot about the plan cache, but as it's a core selling point, then I'd like to know how data distribution changes will affect performance. Stale queries will potentially adversley affect performance under certain circumstances.

No, original query plan will only expire if indexes are dropped or new ones are created.

It looks like this is a fork of MySQL. Did they actually pay Oracle for a proprietary license or is there a source release hidden on the website?

yes, we have commercial licensing agreements in place.

Looks useful. Drop in mysql compatibility is a huge plus for many projects that are tied to mysql (for whatever reasons).

Does anyone know if there are any plans for having MongoDB style replica sets (e.g. sharded and replicated databases in a cluster)?

Also, it would be great if it supported the native mysql replication, so you could have MemSQL replicas of a master mysql DB.

Sharded/replicated MemSQL is in the works. And yep, the replication will be compatible with MySQL replication to enable grouping MemSQL and MySQL nodes.

Excellent! Good work!

Side note: IMO, If you want to completely dominate this space, make it super easy to add new nodes and replicate the existing DB without having to take a snapshot from the master.

Thanks! This is absolutely on our minds.

Seconded, this would be an amazingly compelling and valuable feature!

Looks promising. Love how it inter-operates with MySQL driven apps.

I should be able to try it for one of my apps over this weekend!

Congrats guys on the launch! Looking forward to giving it a spin.

Looks like they're using jQuery Knobs: http://anthonyterrien.com/knob/

Yes, for the workload simulator :)

Congratulations, Eric!

Why not just use objects in your favourite shitty OOP language as usual and take regular snapshots of the data in memory and save them to disk as rollback points instead? Bam! Nothing new to implement in the application. Bam! Lightening fast access and write times. Bam! Data persistence.


Wow, HN is harsh. This is probably a really solid product that solves a specific problem and you all nit-pick at it's trade-offs and poo-poo it.

HN is not harsh. HN is one of the main marketing channels for YC-funded projects such as this.

Sorry some commenters on HN are harsh


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact