DocStore: Document Database for MySQL at Facebook [pdf] (percona.com)
60 points by ngrilly on April 17, 2015 | 17 comments



Things like MySQL and PHP have a bad reputation nowadays, but Facebook shows impressive use cases and keeps improving the technology.


While tools like MySQL and PHP aren't as bad as the community makes them out to be, Facebook is a terrible example of how "impressive" these tools can be, considering that Facebook almost entirely rewrote both of them (https://github.com/webscalesql/webscalesql-5.6, https://github.com/facebook/hhvm).


You can do many things with PHP and MySQL, but the real question is: is it worth it? I would be interested in comparing the TCO (including development time) with platforms like this. I think MySQL is lagging behind Postgres nowadays, so I would much rather use Postgres, even for storing JSON. I do not think that storing JSON in an SQL engine is a particularly good idea, but it might work. Our #1 complaint about MySQL + JSON was that insert and update times were significantly higher than for normal inserts/updates, for obvious reasons. If the workload is not update-heavy, this approach might even work. :)


If you're not stuck with a huge legacy MySQL installation, PostgreSQL 9.4's JSONB + jsonb_path_ops index queries seem neater. Or am I missing something?


I guess Facebook hired a lot of MySQL people, and that investment is bigger than the technological one. Also, you are right: Postgres is ahead of MySQL in JSON support and in many other areas.


MySQL 5.6+ / MariaDB 10 with its InnoDB engine is very popular and suitable for web scale.

PostgreSQL is a good open source alternative for GIS and enterprise needs (Oracle, DB2, MSSQL). What is unnerving is the tiny fraction of the Postgres community that acts like trolls.


Why exactly is "MariaDB 10 with its InnoDB engine" "suitable for web scale" while PostgreSQL is merely a good "alternative for GIS and enterprise needs"?

I think you may have some interesting information to add, but your comment currently reads like aped talking points, with a dash of accusing someone asking a reasonable question of trolling.


The question at the end wasn't there when I wrote my comment. The default PostgreSQL config is rather conservative. PostgreSQL is more of an open source Oracle, with many of Oracle's (formerly) unique features. PL/pgSQL resembles Oracle's PL/SQL procedural language. [1]

Most people simply don't understand MySQL. MySQL is unique in that it supports many database engines [4]; it's like a common SQL layer on top of dozens of database engines. MySQL had a bad reputation because of its old MyISAM engine [2], the former default engine (one of many). The current default database engine is InnoDB [3], which is very fast and is what Facebook and many other huge web sites use.

Rather than repeat myself and others: MySQL+memcached is quite popular; check out the highscalability website [5].

[1] http://en.wikipedia.org/wiki/PostgreSQL , http://stackoverflow.com/questions/12622524/postgresql-9-2-1... , http://stackoverflow.com/questions/110927/would-you-recommen... , etc.

[2] http://en.wikipedia.org/wiki/MyISAM

[3] http://en.wikipedia.org/wiki/InnoDB

[4] http://en.wikipedia.org/wiki/Comparison_of_MySQL_database_en...

[5] https://www.google.com/?gws_rd=ssl#q=site:highscalability.co...


The question at the end was always there.


The queries remind me of UnQL. It was introduced by the Couchbase and SQLite teams as a way to standardize NoSQL queries: http://unql.sqlite.org/index.html/doc/tip/doc/syntax/all.wik...


At least for Couchbase, this was replaced by N1QL: http://docs.couchbase.com/developer/n1ql-dp4/n1ql-intro.html


This reminds me of FriendFeed's Schemaless MySQL.

https://backchannel.org/blog/friendfeed-schemaless-mysql

They were acquired by FB.
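
For readers who haven't seen the schemaless pattern, here is a minimal sketch of the general idea: whole documents go into one generic table as opaque blobs, and separate index tables are maintained for the fields you want to query. Everything in it is an assumption made for illustration, not the setup from the linked post: SQLite stands in for MySQL so the example is self-contained, the body is stored as JSON text, and the names `entities`, `index_user_id`, `save_entity` and `find_by_user` are hypothetical.

```python
import json
import sqlite3
import uuid
from typing import Any, Dict, List

# In-memory SQLite stands in for MySQL here (assumption, for a runnable demo).
conn = sqlite3.connect(":memory:")

# One generic table holds every document as an opaque serialized body.
conn.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, body TEXT NOT NULL)")

# One narrow index table per queryable field (hypothetical name).
conn.execute(
    "CREATE TABLE index_user_id ("
    " user_id TEXT NOT NULL,"
    " entity_id TEXT NOT NULL,"
    " PRIMARY KEY (user_id, entity_id))"
)


def save_entity(entity: Dict[str, Any]) -> str:
    """Insert or overwrite a document and refresh its index row."""
    entity_id = entity.setdefault("id", uuid.uuid4().hex)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO entities (id, body) VALUES (?, ?)",
            (entity_id, json.dumps(entity)),
        )
        # Drop any stale index row, then re-add it if the field is present.
        conn.execute("DELETE FROM index_user_id WHERE entity_id = ?", (entity_id,))
        if "user_id" in entity:
            conn.execute(
                "INSERT INTO index_user_id (user_id, entity_id) VALUES (?, ?)",
                (entity["user_id"], entity_id),
            )
    return entity_id


def find_by_user(user_id: str) -> List[Dict[str, Any]]:
    """Look up documents through the index table, then decode the bodies."""
    rows = conn.execute(
        "SELECT e.body FROM index_user_id AS i"
        " JOIN entities AS e ON e.id = i.entity_id"
        " WHERE i.user_id = ? ORDER BY e.id",
        (user_id,),
    ).fetchall()
    return [json.loads(body) for (body,) in rows]
```

Adding a new field to a document needs no schema change; only a new queryable field needs a new index table. The cost is that the database cannot filter on anything inside the body without such an index.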


I came here to post just this. Bret Taylor, one of the co-founders of FriendFeed, was CTO @ Facebook after the acquisition.

Also, besides FBSON, most of what I've read in those slides looks like PostgreSQL's JSON support:

[1]: JSON type http://www.postgresql.org/docs/9.4/static/datatype-json.html

[2]: JSON functions http://www.postgresql.org/docs/9.4/static/functions-json.htm...


This is awesome. We would find it super useful.

Does anyone know if Facebook is planning to open source it, or if not, whether something similar exists?


I understand this is or will be a part of WebScaleSQL [1]. But WebScaleSQL does not provide binaries. You have to build it yourself.

An alternative is the JSON Labs Release of MySQL 5.7.7 [2]. The differences are explained in the slides, at the end.

Another great alternative is PostgreSQL, which has supported JSON columns stored in a binary format since version 9.4 [3].

That said, I really like the approach of Facebook with DocStore, because of its use of "document paths", instead of functions.

[1] http://webscalesql.org/

[2] http://mysqlserverteam.com/json-labs-release-overview/

[3] http://www.postgresql.org/docs/9.4/static/datatype-json.html
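
To make the "document paths" versus functions distinction above a bit more concrete, here is a rough sketch of what resolving a path against a nested document amounts to. It is not DocStore's implementation or syntax: the dotted notation and the helper name `get_path` are assumptions chosen only for illustration.

```python
from typing import Any


def get_path(doc: Any, path: str, default: Any = None) -> Any:
    """Follow a dotted path such as 'address.zipcode' or 'tags.1'
    through nested dicts and lists; return `default` if any step is missing."""
    current = doc
    for key in path.split("."):
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif isinstance(current, list) and key.isdigit() and int(key) < len(current):
            current = current[int(key)]
        else:
            return default
    return current


# Hypothetical sample document.
doc = {"name": "Alice", "address": {"zipcode": "94025"}, "tags": ["a", "b"]}
zipcode = get_path(doc, "address.zipcode")
second_tag = get_path(doc, "tags.1")
missing = get_path(doc, "address.city")
```

With a path syntax, the query names the nested field directly; with a function-based approach, the same lookup is written as an extraction function call that takes the column and the path as arguments.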


This looks similar to MongoDB.



