Hacker News new | past | comments | ask | show | jobs | submit login
MySQL now includes memcached, and a plugin that allows fast NoSQL-style access (mysql.com)
149 points by alpb on Aug 3, 2012 | hide | past | web | favorite | 43 comments

Title should be:

MySQL development server, not meant for production use, in development for almost 2 years, release date unknown, now includes memcached, and a plugin that allows fast NoSQL-style access.

I hope they are taken seriously however; I'd like to see all the major RDMS tightly integrate memcache in front. It only makes sense, web developers have been doing it themselves since LiveJournal's hayday.

This isn't integrating memcache in front, it's letting you query normal InnoDB tables directly using the memcache protocol instead of SQL.

from the docs: http://dev.mysql.com/doc/refman/5.6/en/innodb-memcached.html

"MySQL 5.6 includes a NoSQL interface, using an integrated memcached daemon that can automatically store data and retrieve it from an InnoDB table. By default, you use the memcached API purely to speed up database operations, letting InnoDB handle memory caching using its buffer pool mechanism. Advanced users can enable the traditional memcached in-memory caching and control whether operations look up and store data in memory, in InnoDB tables, or both. This offering provides users with the best of both worlds"

To really get this right they'd need to add some auto-rebalance / sharding magic and have it integrated much more deeply. This is really only a head-nod toward NoSQL.

Relevant excerpt:

The MySQL server now includes the widely used memcached in-memory caching system, and a plugin that allows fast NoSQL-style access to InnoDB tables through the memcached protocol. This access method avoids the overhead of SQL parsing and constructing a query optimization plan. You can store the underlying data in a single InnoDB table, or spread it across multiple tables. You can read and write data through both memcached and SQL. For example, you can do fast single-key lookups through memcached get calls, and do statistical reports across all the data through SQL.

So it's actually an implementation of the memcached API onto the table-space?

Advanced users can enable the traditional memcached in-memory caching and control whether operations look up and store data in memory, in InnoDB tables, or both.


I've read also about another solution for NoSQL-like access to Mysql, HandlerSocket, that is implemented in the Percona Mysql http://www.percona.com/doc/percona-server/5.5/performance/ha... but I had no opportunity to try it

One of the versions[1] of the 5.6 documentation[2] has a comment regarding that.

> This configuration differs from another popular MySQL NoSQL solution, HandlerSocket, by skipping even the Handler API layer and directly accessing the InnoDB storage engine through low-level APIs, making the code path even shorter and more efficient.

[1] http://www.oszone.co.kr/reference/refman-5.6-en.html-chapter...

[2] http://dev.mysql.com/doc/refman/5.6/en/innodb-memcached-bene...

Does it support range scan selects (example: "where f > 5") as HandlerSocket do?

Seems like a very misleading title. Surly it should be "MySQL now supports the memcache protocol for simple NoSQL-style access on InnoDB tables".

As far as I can make out, the memcached server software itself is not included in MySQL right?


No, it does seem to include memcached as well for write through caching. Interesting. http://dev.mysql.com/doc/refman/5.6/en/innodb-memcached-bene...

This is incredibly helpful.

However, I wish there's something in reverse: Redis as a MySQL storage engine. The most obvious useful things is that you can save a `tags` table without m2m tables, juse use Redis sets.

Why have an m2m table? Just have a table with two columns: post_id (or whatever is being tagged) and tag (varchar). If you want to present a list of possible tags just query for uniques. I may be missing something, so please correct me if there is a better way or I am misunderstanding your point.

there are three design patterns for tags in a table based db


Link is dead... Just redirects to dev.mysql.com

Cache link, since I can't find it on dev.mysql.com:


I believe Postgres supports something like that.

One might choose to use the array datatype, combined with GiST, GIN, and/or expression based indexes, depending on the workload. That would be somewhat more lucid in most situations than a string packed into an attribute.

If mysql would throw flying pigs out of my server I still won't be using it. Not in a world where PostgreSQL exists.

In a world where perfectly respectable, performant, and resilient companies use MySQL, your comment stinks of inexperience.

Just what App Engine already offers, its memcache api[1] on its NoSQL datastore.

App Engine FTW!


> Just what App Engine already offers

No, it doesn't.

"The Memcache Python API High performance scalable web applications often use a distributed in-memory data cache in front of or in place of robust persistent storage for some tasks"

MySQL memcached is not a cache, it's a persistent storage with memcached API. And it's fast.

Isn't this MySQL memcached thing just a memcached in front of MySQL where the 'key' is the expanded SELECT statement?

What you've described it's MySQL query cache.

MySQL memcached is a memcached deamon that uses InnoDB storage instead of in-memory storage. The caching is done by InnoDB itself.

(and actually it's possible to enable the optional in-memory storage too)


Given that the most common reason for sticking with MySQL is the maturity of InnoDB and the surrounding toolsets, and the ability to hire knowledge easily this sounds like an excellent development.

Have you tried to hire a competent MySQL DBA recently?

URL: http://dev.mysql.com/doc/refman/5.6/en/innodb-memcached-inte...

Last line: "If there is no innodb_memcache.containers.name value of default, the row with the first name value in alphabetical order is used. "

This is in regards to which table is used as the default for get/set operations. Anyone else foresee this leading to mass confusion/issues for unsuspecting users down the road?

I think this is pretty cool, waiting for some benchmarks.

This makes a lot more sense than adding the HandlerSocket API which provided marginally more utility at the cost of being a completely new API. Being able to bypass the query engine for key lookups is pretty cool and a write through cache is a bonus.

Postgresql has had a memcached extension for, um, 8 years. I haven't used it myself, but I think it is well respected.


pgmemcache is a memcached client, it connects to external memcached server. Which is, um, quite useless.

w.r.t "useless": Joins against memcached data, and expiry of keys via trigger. Different, yes, useless -- well, up to you. Not written for no reason, though.

That expiring triggers could be somehow useful. But in most cases it's not worth to query memcached from database (selects). The reason is it's much faster to query memcached directly without even opening database connection.

All these caching things make applications complex, because cache invalidation it's a complex task.

MySQL is doing right steps - it's doing caching itself via InnoDB buffer pool. And makes the interface faster by replacing SQL queries with fast memcached API.

> The reason is it's much faster to query memcached directly without even opening database connection.

Sometimes you need to augment data in a SQL database with memcache information, and then subsequently compute a predicate, and it would be faster to use your sql executor than Ruby, say. I'm not suggesting you'd always use this access path, but it can be handy, just like dblink is.

> MySQL is doing right steps - it's doing caching itself via InnoDB buffer pool. And makes the interface faster by replacing SQL queries with fast memcached API.

I'd really like to see some benchmarks on that. If one has a protocol-level prepared plan in Postgres (which is not an exotic entity, some drivers even create them transparently), one just has to send "Bind" and "Execute" messages to call functions.

If there are major advantages to be had where there is any disk i/o involved at all, my suspicion it'd be having a caching strategy that is more suited to the memcached workload: most SQL implementations already have fast-path mechanisms (such as prepared statements, which are in my understanding relatively slow in MySQL due to some vagary in the protocol) where I'm a bit hard pressed to believe on intuitive grounds alone that parsing (and not planning, if one uses prepared statements) is the principal culprit for poor performance.

I'm actually going to include this as an example of non-innovation in my pitch to take us to postgres.

What does postgres have that's equivalent or better? If I'm understanding this MySQL improvement, it persists memcache data on disk, a great thing. Does postgres already do that, or better?

I think this MySQL feature has the potential to be neat, but if one is using it in the mode where one persists or faults to disk it seems like the biggest feature will not be performance, but protocol-compatibility and ease of development and deployment.

It seems like there are a few tuning modes relayed in the documents that have various persistence trade-offs, though, so the performance angle of the protocol could come into play.

I'd like Postgres to be able to support some kind of carefully-sized multi-backend caching utility someday, but as an extension if at all possible. I think such a thing is technically not that exotic (but still a lot of work), yet does not exist AFAIK. Prepared statements would usually be fast enough as-is if one had the right backend functions (UDFS) I suspect.

Is there an example of posgtres nosql doing like 700K requests/sec?

does anybody know how this would work with a pool of servers? if servers were replicated, would the memcache pool stay in sync as well? I'd assume it has to work that way to be effective. it sounds cool though, I'm glad MySQL continues to evolve.

is it clear how this works? i can't work out what a memcache query would look like. is the key that SQL select statement as a string?

It's just a memcache API call.

It doesn't even go over the SQL interface, it's not text-based.

so there's no cross-use at all? it's just a memcache api implementation that happens to be implemented in the same software package as mysql?

This part of the manual lists several advantages as compared to simply packaging MySQL and memcached together, the first two probably being the most significant.


thanks. so the advantage seems to be that it's persistent (and in a way that's integrated with existing mysql system mgmt for backups etc).

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact