
RethinkDB versus PostgreSQL: my personal experience - williamstein
http://blog.sagemath.com/2017/02/09/rethinkdb-vs-postgres.html
======
mglukhovsky
I appreciate the detailed analysis. A few comments:

> This post is probably going to make some people involved with RethinkDB very
> angry at me.

Actually, our community has always felt the opposite. Performance and
scalability issues are considered bugs worth solving. That may have been the
reaction of one or two community members, but that doesn't represent our
values at all.

> A RethinkDB employee told me he thought I was their biggest user in terms of
> how hard I was pushing RethinkDB.

This may have been true (at the time) in terms of how SMC was using
changefeeds, but RethinkDB is used in far more aggressive contexts. Here's a
talk from Fidelity about how they used RethinkDB (for 25M customers across 25
nodes):
[https://www.youtube.com/watch?v=rm2zerSz6aE](https://www.youtube.com/watch?v=rm2zerSz6aE)

SMC did seem to uncover a number of surprising bugs along the way: I would
describe it as one of the more forward-thinking use cases that pushed the
envelope of some of RethinkDB's newest features. This definitely came with
lots of performance issues to solve along the way. I appreciate William’s
tenacity and patience in helping us track down and fix these.

> In particular, he pointed out this 2015 blog post, in which RethinkDB is
> consistently 5x-10x slower than MongoDB.

It’s worth pointing out that the methodology of that particular blog post
raised serious questions, and that more recent versions of RethinkDB have
included very significant performance improvements:
[https://github.com/rethinkdb/rethinkdb/issues/4282](https://github.com/rethinkdb/rethinkdb/issues/4282)

> Even then, the proxy nodes would often run at relatively high cpu usage. I
> never understood why.

I'd have to double-check with those who are far more familiar with RethinkDB's
proxy mode, but I believe it's because the nodes are parsing and processing
queries as well, which can be CPU-intensive. They don't store any data, but if you use
ReQL queries in a complex fashion (especially paired with changefeeds) it's
going to require more CPU usage. We generally recommend that you run nodes
with a lot of cores to take advantage of the parallelized architecture that
RethinkDB has. This can get expensive if you aren't running dedicated
hardware.

> The total disk space usage was an order of magnitude less (800GB versus
> 80GB).

RethinkDB doesn't yet have compression
([https://github.com/rethinkdb/rethinkdb/issues/1396](https://github.com/rethinkdb/rethinkdb/issues/1396)).
Between this fact and running 1/3 the number of replicas, the reduced disk
usage is not surprising.

> I imagine databases are similar. Using 10x more disk space means 10x more
> reading and writing to disk, and disk is (way more than) 10x slower than
> RAM…

This isn't necessarily true, especially with SSDs. RethinkDB's storage engine
neatly divides its storage into extents that can be logically accessed in an
efficient fashion. This is particularly valuable when running on SSDs, which
are fundamentally parallelized devices. RethinkDB also caches data in memory
as much as possible to avoid going to disk, but using more disk space doesn't
immediately translate to lower performance.

One other interesting detail: since RethinkDB doesn’t have schemas, it stores
the field names of each document individually. This is one of the trade-offs
of not having a schema: even with compression, RethinkDB would use more space
than Postgres for this reason. (This also impacts performance, since
schemaless data is more complicated to parse and process.)

> Not listening to users is perhaps not the best approach to building quality
> software. [referring to microbenchmarks]

I think William may have misinterpreted the quote he describes from Slava’s
post-mortem. Slava was referring to benchmarks that don’t affect the core
performance of the database or production quality of the system, but may look
better when you run micro-benchmarks:
[https://rethinkdb.com/blog/the-benchmark-youre-reading-is-probably-wrong/](https://rethinkdb.com/blog/the-benchmark-youre-reading-is-probably-wrong/)

We have always had an open development process on GitHub to collaboratively
decide what features to build, and what their implementation should look like.
I’m not certain what design choices William is suggesting we rejected. One only
has to look at the proposal for dates and times in RethinkDB to see how this
process and open conversation unfolds with our users:
[https://github.com/rethinkdb/rethinkdb/issues/977](https://github.com/rethinkdb/rethinkdb/issues/977)

> Really, what I love is the problems that RethinkDB solved, and where I
> believed RethinkDB could be 2-3 years from now if brilliant engineers like
> Daniel Mewes continued to work fulltime on the project.

RethinkDB development is proceeding after joining The Linux Foundation,
despite the company shutdown. We believe that with a few years of work,
RethinkDB will continue to mature as a database to reach Postgres’ level of
stability and performance. We’re exploring options for funding dedicated
developers long-term as an open-source project.

My thoughts: whatever technology you end up picking is going to have tradeoffs
depending on your use case (and the maturity of the technology) and it's going
to come with baggage. That's true of Postgres, MongoDB, RethinkDB, any
programming language you choose, any tools you pick. If you're willing to
carry that baggage it can be worth it: especially if it gives you developer
velocity or if the problem you're solving is particularly well-suited to the
tool.

Pick the technology that will have the least baggage for your problem. I often
recommend Postgres to people, despite being one of the RethinkDB founders.
Pragmatism wins over idealism, every time.

~~~
nemothekid
>It’s worth pointing out that the methodology of that particular blog post
raised serious questions, and that more recent versions of RethinkDB have
included very significant performance improvements:
[https://github.com/rethinkdb/rethinkdb/issues/4282](https://github.com/rethinkdb/rethinkdb/issues/4282)

I wouldn't even seriously consider that point - the article didn't even
mention what version of MongoDB was being used. Safe mode writes could have
been off, and he may have been just testing the latency between the client and
database nodes. It's a pretty poor benchmark.

------
dhd415
I think there are several useful points here:

1) It's rare to have enough insight into the internals of a particular
datastore to accurately predict how it will perform on a particular workload.
Whenever possible, early testing on production-scale workloads is essential
for planning and proofs of concept.

2) Database capabilities are a moving target. E.g., the performance
improvements to pgsql's LISTEN/NOTIFY are essential to its ability to handle
this particular workload. In previous jobs, I've had coworkers cite failed
experiences with 15-20 year-old databases as reasons for not considering them
for new projects. Database tech has come a long way in that time.

3) Carefully-tuned RDBMSs are more capable than many tend to admit.

~~~
koolba
> Carefully-tuned RDBMSs are more capable than many tend to admit.

I'd generalize that further: _RDBMSs are more capable than many tend to
admit._

When it comes to a persistent data store, you've got to go out of your way to
justify using something that isn't statically typed with firm ACID guarantees.
I'm not saying those use cases don't exist, I'm saying most people don't have
them.

"Saving" 15 minutes of dev time because you don't know what your schema is
going to look like is going to cost you orders of magnitude more time down the
road asking yourself that same question.

~~~
stouset
> "Saving" 15 minutes of dev time because you don't know what your schema is
> going to look like is going to cost you orders of magnitude more time down
> the road asking yourself that same question.

Yeah. In my experience, "schemaless" doesn't mean you have no schema. You
still have a schema, it's just implicit and you don't have any tools available
to actually operate on that schema.

~~~
mickronome
The problem with standard-ish SQL+RDBMS and their schemas is that types are
not enforced well in SQL and in most libraries and language integrations, so we
end up with the quagmire of static schemas that cannot be reliably type checked
when you use standard tooling.

Also lacking in many DB systems is integrated support for tables with
heterogeneous schemas, backed by page/row-level versioning and/or online schema
alteration. Having to rewrite a huge table just because you want to narrow a
couple of fields for future data gets old quickly.

~~~
ankrgyl
> The problem with standard-ish SQL+RDBMS and their schemas is that types are
> not enforced well in SQL and in most libraries and language integrations, so
> we end up with the quagmire of static schemas that cannot be reliably type
> checked when you use standard tooling.

This used to be an issue with MySQL (and not really any other major RDBMS
implementation), but with STRICT_MODE and the default settings on other
implementations, SQL is fantastic at representing and enforcing static types.
Furthermore, ORMs and tools like Apache Spark have really upped the
integration between types in the database and types in languages. Practically
speaking, RDBMS is really the _only_ way you can have sane static typing in
most applications (esp. ones that are built on dynamic languages).

> Also lacking in many DB systems is integrated support for tables with
> heterogeneous schemas, backed by page/row-level versioning and/or online
> schema alteration. Having to rewrite a huge table just because you want to
> narrow a couple of fields for future data gets old quickly.

This is just not true. All of the major commercial RDBMS systems support both
online schema alteration AND flexible/semi-structured types (JSON, XML).
Again, MySQL and Postgres are behind the curve on this (although they both now
support JSON). Commercial systems like SQL Server, Oracle, and MemSQL all have
both.

~~~
jeltz
PostgreSQL may be behind in some other areas but not here. PostgreSQL has some
of the best support for online schema changes (transactional DDL for almost
everything, lots of work put into reducing lock level of schema changes) and
was the first to implement support for JSON.
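
For anyone who hasn't seen it in action, here's a minimal sketch of what transactional DDL buys you, using psycopg2 (the connection string and table names are placeholders, not anything from the article):

```python
import psycopg2

# Connection details are placeholders.
conn = psycopg2.connect("dbname=test user=postgres")
try:
    with conn.cursor() as cur:
        # In Postgres, DDL participates in the transaction like any other statement.
        cur.execute("CREATE TABLE accounts_v2 (id bigint PRIMARY KEY, balance numeric)")
        cur.execute("ALTER TABLE accounts_v2 ADD COLUMN created_at timestamptz DEFAULT now()")
        raise RuntimeError("simulated failure mid-migration")
    conn.commit()  # only reached if the whole migration succeeded
except Exception:
    # Both the CREATE and the ALTER are rolled back together: no half-applied schema.
    conn.rollback()
finally:
    conn.close()
```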

~~~
rixed
Please try to alter the schema of a live multi-billion-row table before
claiming that it works. Unless they've changed how it's implemented recently,
such an operation can take hours. I love Postgres for many reasons, so I would
like to be proven wrong.

~~~
unixhero
Hours is okay, if it works.

~~~
sgift
Now, if you could just tell this to the people who depend on the software and
tell me that hours of downtime is not a possibility.

~~~
MarHoff
Hours of work <> Hours of downtime

The bigger the database, the more complex it will be to implement a schema
change while keeping track of the transactions that happen while you are
transitioning. However, I see no reason why it shouldn't be feasible given a
proper amount of preparation. And if short downtime is the goal, I don't see
why a proper amount of time spent developing a solution wouldn't get you there
either.

And on top of that, the PostgreSQL community is implementing new ways to solve
these problems with each release. I'm not in a huge live-data use case so I
might be wrong, but the upcoming logical replication looks promising for such
use cases:
[https://www.depesz.com/2017/02/07/waiting-for-postgresql-10-logical-replication/](https://www.depesz.com/2017/02/07/waiting-for-postgresql-10-logical-replication/)
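
For what it's worth, a rough sketch (not from the linked article) of one common shape such a prepared migration takes on a big table: a metadata-only ADD COLUMN, a batched backfill, and only then the constraint. Table and column names are invented.

```python
import psycopg2

conn = psycopg2.connect("dbname=test user=postgres")  # placeholder DSN
conn.autocommit = True  # each statement commits on its own, keeping locks short
cur = conn.cursor()

# Adding a nullable column is a metadata-only change: no table rewrite, no long lock.
cur.execute("ALTER TABLE events ADD COLUMN IF NOT EXISTS user_uuid uuid")

# Backfill in small batches so no single transaction holds locks for long.
while True:
    cur.execute("""
        UPDATE events
           SET user_uuid = legacy_user_id::text::uuid
         WHERE id IN (SELECT id FROM events WHERE user_uuid IS NULL LIMIT 10000)
    """)
    if cur.rowcount == 0:
        break

# The final constraint still needs a validation scan, but the expensive part
# (touching billions of rows) already happened without blocking writers.
cur.execute("ALTER TABLE events ALTER COLUMN user_uuid SET NOT NULL")
conn.close()
```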

------
eikenberry
IMO the main advantage for RethinkDB is its HA story. Last time I had to
manage a PostgreSQL cluster (2012-2013) its HA story was pretty bad. It was
limited to a master-slave(s) setup with manual failover and manual cluster
rebuilding all dependent on incomplete 3rd party tools. Has PostgreSQL
improved on this? A quick googling leads me to believe it hasn't and I'd only
even consider it again if it were managed by a 3rd party (eg. aws rds).

~~~
api
If you want a managed solution RedShift supports PGSQL syntax and connectors
now I think. That's a fully managed turn-key DB, but of course it locks you
into AWS.

~~~
dguaraglia
Not meaning to be an ass, but this is really bad advice.

Redshift is _not_ supposed to be used like Postgres. Redshift is a _data
warehousing_ solution, with completely different tradeoffs, and accommodating
completely different work loads than your average Postgres database. For
example, you can't create indices on Redshift tables or relationships between
them, the consistency story is completely different from Postgres and Redshift
is optimized for bulk loads, not millions of discrete inserts/updates a
second.

Amazon RDS does provide hosted Postgres, and _that_ is what you want to use if
you want managed Postgres. Or you can use Heroku's hosted Postgres (it does
get expensive with size, though.)

------
jwr
Do I understand correctly that the author went from a distributed database to
a single-master scenario? That's a valid tradeoff, but I'd clearly describe it
as such.

My experiences with RethinkDB have been rather positive, but my load is
nowhere near what the article describes. I agree that ReQL could be improved;
I found that there are too many limitations in chaining once you start using it
for more complex things.

But the two most important advantages remain for me:

* changefeeds (they work for me),

* a distributed database that I can run on multiple nodes.

I do agree that PostgreSQL is fantastic and that SQL is a fine tool. In my
case the above points were the only reasons why I did not use PostgreSQL.

EDIT: after thinking about this for a while, I wonder if the RethinkDB
changefeed scenario is doable with the tools in PostgreSQL: get initial user
data, then get all subsequent changes to that data, with no race conditions.
Many workloads seem to concentrate on twitter-like functionality, where there
is no clear concept of a change stream beginning and races do not matter.
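
For what it's worth, one way to get that with plain Postgres tools, assuming a trigger that NOTIFYs a JSON payload on every change (a sketch, with invented channel/table names): LISTEN before taking the initial snapshot, so nothing can slip through the gap; re-applying a change that was already in the snapshot is then harmless.

```python
import json
import select
import psycopg2

conn = psycopg2.connect("dbname=test user=postgres")  # placeholder DSN
conn.autocommit = True
cur = conn.cursor()

cur.execute("LISTEN user_changes")          # 1. subscribe first
cur.execute("SELECT id, data FROM users")   # 2. then read the initial state
state = {row[0]: row[1] for row in cur.fetchall()}

while True:                                 # 3. apply changes as they arrive
    if select.select([conn], [], [], 5.0) == ([], [], []):
        continue                            # timeout; loop and wait again
    conn.poll()
    while conn.notifies:
        note = conn.notifies.pop(0)
        change = json.loads(note.payload)   # trigger assumed to send {"id": ..., "data": ...}
        state[change["id"]] = change["data"]
```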

~~~
ruslan_talpa
You say that you did not have such loads, so why did you need multiple nodes
then? Why the complexity when databases like Postgres can do 1.5M queries per
second on a single box? StackOverflow is running just fine on basically one
big MSSQL box (the other is just a standby replica, if I remember correctly).

~~~
jwr
Two reasons:

* I expect a larger load in the future,

* I want multiple nodes not just for speed, but mostly for data replication.

~~~
ruslan_talpa
* you are solving problems you don't have

* you can have replicas with Postgres also

Even if you do end up getting the load you are hoping for (though I doubt you
will have 1.5 million qps), reads are easily scalable with replicas, and
scaling writes is also possible with sharding or tools from CitusData.

~~~
ruslan_talpa
I don't have other examples off the top of my head, but I know PG is used by
yandex.ru, which is like the Google of Russia; they migrated from Oracle and
they are very happy, so if it works for them I think it will work for you :)

~~~
rixed
Don't they use clickhouse instead?

~~~
ruslan_talpa
They probably use a lot of tools for a lot of reasons, but they do use Postgres:

[https://www.pgcon.org/2016/schedule/attachments/426_2016.05....](https://www.pgcon.org/2016/schedule/attachments/426_2016.05.19%20Yandex.Mail%20success%20story.pdf)

[https://www.youtube.com/watch?v=-SS4R1sFH3c](https://www.youtube.com/watch?v=-SS4R1sFH3c)

------
living-fossil
>Everything is an order of magnitude more efficient using PostgreSQL than it
was with RethinkDB.

A large part of the sales pitch of "NoSQL" was that traditional RDBMSs
couldn't handle "webscale" loads, whatever that meant.

Yet somehow, we continue to see PostgreSQL beating Mongo, Rethink, and other
trendy "NoSQL" upstarts at performance, one of the primary advantages they're
supposed to have over it.

Let's be frank. The only reason "NoSQL" exists at all is 20-something hipster
programmers being too lazy to learn SQL (let alone relational theory), and
ageism--not just against older programmers, but against older technology
itself, no matter how fast, powerful, stable, and well-tested it may be.

After all, PostgreSQL is "old," having its roots in the Berkeley Ingres
project three decades ago. Clearly, something hacked together by a cadre of
OSX-using, JSON-slinging hipster programmers _MUST_ be better, right?
Nevermind that "NoSQL" itself is technically even older, with "NoSQL" systems
like IBM's IMS dating back to the 1960s:
[https://en.wikipedia.org/wiki/IBM_Information_Management_Sys...](https://en.wikipedia.org/wiki/IBM_Information_Management_System)

~~~
Vinnl
You might have a point about Postgres being faster than whatever NoSQL
database, but there's really no need for all the name calling. Especially the
"lazy" is very uncalled for in my opinion - there's just too much to learn,
and too little time. If you're a 20-something (or whatever age, for that
matter) and have to maintain the entire system of a single company, you're
simply not going to be able to be a master of every part of the stack, or even
know where you're making the wrong choices.

~~~
sergiosgc
If there's too much to learn and little time, you learn the stuff that
computer science got right. Some abstractions are so good that you should
prioritize learning them above any technology of the day:

\- Relational algebra

\- The network stack model (OSI or IP; it is irrelevant)

\- Compiler theory

\- Category theory

You'll never regret the time spent on these.

~~~
leeoniya
so if there isn't even enough time to learn practical but sub-optimal
solutions, a dev should instead absorb books about IT, mathematics and
programming theory?

great advice...unless any part of your equation includes receiving a paycheck.

~~~
sergiosgc
Don't make it sound harder than it is. Each of these is 30-40 hours of work
for undergraduate students. You don't have to quit your job and join a
monastery to study compilers. It's interesting, challenging and fun for those
of us into computing; and it pays back the time spent learning.

------
cpr
Thanks for taking the time to write this up, William.

This is a great read, even if only as a helpful "this is how I did something
hard, and how it turned out" kind of hacker story.

And I could see this being quite relevant to some ideas I have for a multi-
user semi-real-time cooperative-editing web app.

------
bryogenic
How far has postgres come w.r.t. setting up a cluster with automatic fail-over
and recovery? I didn't see the author address this aspect of Rethink that has
a lot going for it.

~~~
ploxiln
Clustering and automatic fail-over and recovery is not always a great idea. It
sounds like you want it implemented so you "don't have to worry about it", but
fail-over and recovery systems are a lot more complicated and troublesome than
an alert, some read-only degraded operation, and a manual promotion.

Aphyr's "call me maybe" series shows just how hard it is to get automatic
failover and recovery working just right. If you're not ready to become a
full-time expert on the topic, and pay for more servers for the same load,
then simple replication and manual promotion can result in less downtime and
less dataloss. I've seen multiple small startups have cassandra clusters fail,
because they did not maintain them properly. I've seen them mysteriously lose
data in elasticsearch clusters, for the same reason.

That said, GitLab has recently shown that even simple replication can be
screwed up. I'd say the lesson is that the most effective thing is to keep it
simple (not "easy") and avoid disasters caused by mundane mistakes.

~~~
ansible
Yeah, if you can wait another year or two, something like CockroachDB [1]
might fit the bill. That is a database that is distributed from the start. So
you (theoretically) won't have to worry as much about individual datacenter
failures, and scaling.

[1] [http://www.cockroachlabs.com](http://www.cockroachlabs.com)

~~~
brightball
What about Couchbase?

~~~
StreamBright
We had serious operational issues with Couchbase, and some funny developers
who were always ready to answer our tickets with "works on my laptop". Once we
had 1.3 billion documents in the cluster, it became unstable and we had to
delete all of the data to bring the service back online. Luckily we used S3 as
the source of truth, given the operational characteristics of CB. I would not
recommend it to anybody who is not ready for massive data loss and continuous
operational issues. Some of these might have been fixed in the meantime,
though.

------
iamleppert
It doesn't make sense to me to architect these kinds of applications at the
database level.

What's wrong with using something like redis pubsub? I don't get the obsession
with evented databases, or with implementing this kind of thing at the database
level. I suppose it's attractive to "listen" to a table for changes, but the
pattern can be implemented elsewhere and with better tools.

Databases should be used for persistence, organization and schema of data,
have flexible querying, and not much else.

~~~
ankrgyl
You can flip this argument and make a similar point:

"What's wrong with using something like Postgres LISTEN? I don't get the
obsession with redis pubsub, or with implementing this kind of thing outside
the database. I suppose it's attractive to "subscribe" to a collection for
changes, but the pattern can be implemented inside a database, which has better
tools."

In more constructive terms, databases are familiar to a lot of folks and have
reliable guarantees (persistence, ACID) that are generally useful properties.
If you're able to achieve the performance you need within the database, then
you get a lot of operational benefits from keeping your workload running
inside of it. If your workload for some reason is slow inside of a database,
then it certainly makes sense to consider specialized alternatives (like
Redis). In my experience, however, you can tune a database to perform better
than these tools in pretty much every real-world case (i.e. moderate
concurrency with realistic load).

~~~
floatboth
Postgres LISTEN/NOTIFY is actually very similar to Redis pubsub. You can
manually add triggers to send notifications on changes, sure, but the real "in
the database" stuff is in RethinkDB.
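
For readers who haven't set this up before, the trigger side is only a few lines. A hedged sketch (table, channel, and column names are invented; DELETEs would need OLD instead of NEW, and NOTIFY payloads are limited to 8000 bytes):

```python
import psycopg2

DDL = """
CREATE OR REPLACE FUNCTION notify_change() RETURNS trigger AS $$
BEGIN
  -- Ship the changed row as JSON on a channel named 'user_changes'.
  PERFORM pg_notify(
    'user_changes',
    json_build_object('op', TG_OP, 'data', row_to_json(NEW))::text
  );
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS users_notify ON users;
CREATE TRIGGER users_notify
  AFTER INSERT OR UPDATE ON users
  FOR EACH ROW EXECUTE PROCEDURE notify_change();
"""

conn = psycopg2.connect("dbname=test user=postgres")  # placeholder DSN
with conn, conn.cursor() as cur:
    cur.execute(DDL)
conn.close()
```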

------
StreamBright
Should be titled: Finding out how awesome Postgres is (the hard way)

------
acidity
>>> Weird. OK, I tried it with some other parameters, and it suddenly took 15
seconds at 100% CPU, with PostgreSQL doing some linear scan through data.
Using EXPLAIN I found that with full production data the query planner was
doing something idiotic in some cases. I learned how to impact the query
planner, and then this query went back to taking only a few milliseconds for
any input. With this one change to influence the query planner (to actually
always use an index I had properly made), things became dramatically faster.
Basically the load on the database server went from 100% to well under 5%.

I am actually interested in this part. Figuring out issues with EXPLAIN is one
of my favorite things.

~~~
topspin
I'm replying due to this same part/paragraph. I've been dealing with some
manifestation of EXPLAIN since Oracle 6. In the last few years I've become
handy with PostgreSQL's EXPLAIN as well, and this reminds me of my biggest
hang-up regarding PostgreSQL: its hostility toward optimizer hints.

Like Mr. Stein I too have found myself in bad places with PostgreSQL's
optimizer. This is commonplace with relational systems; every such system I've
ever dealt with, including all versions of Oracle since the mid-90s, Informix,
MS-SQL, DB/2 (on AS/400, Windows, and Linux), and PostgreSQL, eventually gets
handed a query and a schema that produce the _wrong_ plan and intolerably bad
performance. No exception. None of these attempts to
create flawless optimizers that anticipate every use case has ever succeeded,
PostgreSQL included.

With other systems there are hints that, as a last resort, you can apply to
get efficient results. Not so much with PostgreSQL. Not implementing the sort
of hints that solve these problems (as opposed to the often ineffectual
enable_* planner configuration, unacceptable global configuration and other
workarounds needed with PostgreSQL) is policy:

"We are not interested in implementing hints in the exact ways they are
commonly implemented on other databases. Proposals based on 'because they've
got them' will not be welcomed."

How about proposals based on "because your hint-free optimizer gets it wrong
and I _require_ a working solution without too many backflips and somersaults
or database design lectures"? No? Then sorry; I can't risk getting painted
into a corner by your narrow-minded and naive policy. PostgreSQL goes no
further than non-critical, ancillary systems when I have say in it. And I do.

~~~
grogers
Having never used postgres, this seems extremely scary. I've only ever needed
index hints a couple times in other databases, but when you need them there
isn't usually an alternative. Messing with global knobs is a good way to cause
more problems than you are solving.

I've even entertained the idea that every query should be hinted. For OLTP
workloads, you practically always know exactly how you want the DB to execute
your query anyways. And often times you find out very late that the query
planner made the wrong choice and now your query is taking orders of magnitude
longer than it should (worse, sometimes this changes at runtime). I've never
actually gone through with this religiously though...

~~~
topspin
>> Having never used postgres, this seems extremely scary. I've only ever
needed index hints a couple times in other databases, but when you need them
there isn't usually an alternative. Messing with global knobs is a good way to
cause more problems than you are solving.

You've got the plot exactly. The last such battle I was involved with ended in
creating a materialized view to substitute for several tables in a larger
join; without the view there was no way[1] to get an acceptable plan. Creating
this view was effectively just a form of programming our own planner. And yes,
the need to update the view to get the desired result is an ongoing problem;
one that's scheduled to get solved with a migration to another DB.
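
In case it helps anyone in the same spot, the shape of that workaround looks roughly like this (all table and view names are invented; the unique index is what REFRESH ... CONCURRENTLY requires):

```python
import psycopg2

DDL = """
CREATE MATERIALIZED VIEW IF NOT EXISTS order_totals AS
  SELECT o.customer_id,
         date_trunc('day', o.created_at) AS day,
         sum(li.qty * li.unit_price)     AS total
    FROM orders o
    JOIN line_items li ON li.order_id = o.id
   GROUP BY 1, 2;

CREATE UNIQUE INDEX IF NOT EXISTS order_totals_idx
    ON order_totals (customer_id, day);
"""

conn = psycopg2.connect("dbname=test user=postgres")  # placeholder DSN
conn.autocommit = True
cur = conn.cursor()
cur.execute(DDL)
# The ongoing cost: the view has to be refreshed whenever freshness matters.
# CONCURRENTLY avoids blocking readers but needs the unique index above.
cur.execute("REFRESH MATERIALIZED VIEW CONCURRENTLY order_totals")
conn.close()
```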

Like you I've never been all that quick to employ hints. I tend to use them
while experimenting during development or troubleshooting and avoid them in
production code. But there have been production uses, and you know what? The
world did not end. No one laughed at or fired me. No regulatory agency fined
me. It did not get posted on Daily WTF. No subsequent maintenance programmer
has ever shown up at my home in the dead of night. It just solved the problem,
quickly and effectively.

Sure would be nice if people purporting to offer a fit-for-purpose relational
system understood the value of a little pragmatism.

[1] given the finite amount of time we could sacrifice to deal with it

------
api
What I didn't see mentioned here is clustering. One of the things that sold us
on RethinkDB was how easy it was to cluster compared to PostgreSQL. The latter
has poor documentation and it's very hard to know you've done things right...
and if you haven't, the results can be catastrophic failures or mysterious
replication problems with cryptic error messages.

Edit: also I was led to believe by PG documentation that LISTEN/NOTIFY is
impossible across a cluster, which means that code depending on LISTEN/NOTIFY
is impossible to cluster. If that's the case you're stuck with master/slave
and manual or (scary) automatic failover now.

We wanted a system that is masterless (or all-master) in the sense that any
node can fail at any time and the system doesn't care. RethinkDB delivers
that, at least within the bounds of sane failure scenarios, and it delivers it
without requiring a full time DBA to set up and maintain. That's worth a
certain amount of CPU, disk, and RAM in exchange for stability and personnel
costs, especially when a bare metal 32GB RAM SSD Xeon on OVH is <$200/month
fully loaded with monitoring and SLA. So far we've been unable to throw a real
world work load at those things that makes them do anything but yawn, and OVH
has three data centers in France with private fiber between them allowing for
a multi-DC Raft failover cluster. It's pretty sweet.

The only thing that would make me reconsider is if the use patterns of our
data were really aggressively relational. In that case PGSQL would be a clear
winner in terms of the performance of advanced relational operations and the
expressivity of SQL for those operations. ReQL gives you some relational
features on top of a document DB but it has limitations and is really designed
for simpler relational use cases like basic joins.

~~~
zzzcpan
"We wanted a system that is masterless (or all-master) in the sense that any
node can fail at any time and the system doesn't care."

Neither PostgreSQL nor MySQL can do much here, and this is actually why we
have the NoSQL movement. These problems are fundamentally unsolvable with the
trade-offs those RDBMSs made.

~~~
erichocean
_ahem_ CockroachDB would like a word…

~~~
kodablah
And that's why we have the "NewSQL" movement, though a bit slow. The benefit
of NoSQL in some cases (Cassandra and HBase for me) is that it restricts
developers from using non-scalable data structuring and querying. RDBMSs have
a bad reputation primarily because of how they were used: unrestricted queries
that came back to haunt the product owners.

And even then, some guarantees of a local DB are fundamentally impossible to
preserve in a distributed setting (CAP, exactly-once sends, etc.) without
trading something away, even with Cockroach.

------
skc
So are there any actual success stories that the NoSQL movement can point at?
It's bizarre to me how NoSQL can still be all the rage when, time after time,
all I read is post-mortems detailing painful experience after painful
experience.

I'm at a stage where I haven't built enough of my current project to make
moving back to an RDBMS painful yet, so all this stuff scares me.

~~~
keithnz
I've read a number of things on HN where people are building massive DBs with
Cassandra with good success.

------
h1d
I always felt that ReQL didn't look intuitive, judging from this comparison,
and stayed away; I guess that wasn't a wrong assumption.

"Definitely, the act of writing queries in SQL was much faster for me than
writing ReQL, despite me having used ReQL seriously for over a year. There’s
something really natural and powerful about SQL."

[https://www.rethinkdb.com/docs/sql-to-reql/python/](https://www.rethinkdb.com/docs/sql-to-reql/python/)

------
bartread
Good grief. Well this is a genuinely fascinating post containing a couple of
absolutely terrifying insights:

"A RethinkDB employee told me he thought I was their biggest user in terms of
how hard I was pushing RethinkDB."

Erm. If I were working for/a founder of a relatively small company using a
product or service, especially one that's so critical to my own business, that
is not the sort of thing I'd want to hear from the provider.

"Everything was a battle; even trying to do backups was really painful, and
eventually we gave up on making proper full consistent backups (instead,
backing up only the really important tables via complete JSON dumps)."

Holy crap.

Well, the story has a happy ending, and I think the point about the
fundamental expressiveness of SQL is something that a lot of people miss in
the mad dash to adopt "simpler" NoSQL solutions. I personally find SQL verbose
and a bit ugly, but I still sort of love it because it's _hugely_ powerful and
expressive. I was perhaps 6 or 7 years into my career before I became
comfortable with it, but I wish I'd thrown myself into learning it properly
sooner because it is so incredibly useful.

------
crad
(Also posted to the comments section of the blog)

Great writeup! One of the issues I've run into with LISTEN/NOTIFY is the fact
that it's not transaction-safe, i.e., if you call NOTIFY and then encounter an
error causing a rollback, you can't undo the NOTIFY.

I ended up building a system on top of PgQ
([https://wiki.postgresql.org/wiki/SkyTools#PgQ](https://wiki.postgresql.org/wiki/SkyTools#PgQ))
called Mikkoo
([https://github.com/gmr/mikkoo#mikkoo](https://github.com/gmr/mikkoo#mikkoo))
that uses RabbitMQ to talk with the distributed apps that needed to know about
the transaction log. Might be helpful if you end up running into transactional
issues with your use of LISTEN/NOTIFY.

~~~
neilc
NOTIFY is transaction-safe: if the NOTIFY-ing transaction rolls back, the
NOTIFY will not be delivered. See the discussion here ("NOTIFY interacts with
SQL transactions in some important ways..."):

[https://www.postgresql.org/docs/9.6/static/sql-notify.html](https://www.postgresql.org/docs/9.6/static/sql-notify.html)
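
A quick way to convince yourself of this behavior, with psycopg2 (connection details are placeholders):

```python
import select
import psycopg2

listener = psycopg2.connect("dbname=test user=postgres")  # placeholder DSN
listener.autocommit = True
listener.cursor().execute("LISTEN demo")

sender = psycopg2.connect("dbname=test user=postgres")
cur = sender.cursor()
cur.execute("NOTIFY demo, 'you should never see this'")
sender.rollback()                # aborted transaction: the NOTIFY is discarded

cur.execute("NOTIFY demo, 'delivered'")
sender.commit()                  # only notifications from committed transactions go out

select.select([listener], [], [], 5.0)
listener.poll()
for note in listener.notifies:
    print(note.payload)          # prints only 'delivered'
```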

~~~
crad
Yup, my bad, thanks. I confused the two problems I worked through for my use
case. The transaction safety one was not the NOTIFY version of
the project. That had different issues, which I need to go back and look at my
notes for. My memory is a bit hazy, but IIRC we were seeing notifications
dropped under high velocities and needed delivery guarantees.

~~~
netghost
I seem to remember that when it was first implemented there was a fairly small
buffer for queued messages. That might have contributed to your problem. The
docs say it's 8GB by default now, which is probably sufficient.

~~~
masklinn
The original NOTIFY also didn't have a payload (that was added in PG9) which
made it much harder to know what you were notified about.

That one simple addition made pre-9.0 NOTIFY and post-9.0 NOTIFY completely
different beasts.

------
rattray
> I wrote code that automates creation of all triggers to do listen/notify

Care to elaborate and/or open-source? Sounds potentially enticing.

~~~
rattray
The author actually linked it in the following paragraph; I believe the
referenced bit is mostly here:

[https://github.com/sagemathinc/smc/blob/master/src/smc-hub/postgres-synctable.coffee#L882](https://github.com/sagemathinc/smc/blob/master/src/smc-hub/postgres-synctable.coffee#L882)

Looks pretty nice!

------
grizzles
A really nice thing about code written in the 90s and earlier is that it was
designed from the get-go to be performant, because it had to be. No one in
conventional software writes code like that anymore, not really. E.g., when was
the last time you used a profiler?

I recently updated from Fedora 24 to 25. I noticed a big performance drop
until I shoved more RAM into my desktop, and now it's fine again. I can't be
certain but I'd wager that this might be because F25 is the first Fedora to
use Wayland (over X) by default. X might be old and fugly but it was certainly
written in an era where it had to achieve a certain baseline level of
performance.

~~~
trishume
It's true that Wayland uses more RAM and if you don't have enough swapping
will kill you. But in this case (and others) RAM use is inversely correlated
with performance. For example, Wayland uses more RAM for full backing buffers
for every window, and this actually makes moving windows around faster because
it doesn't have to ask the app to re-draw exposed areas. Unless you spill into
swap, then you are indeed screwed.

~~~
floatboth
Xorg with a compositor does the exact same thing with full backing buffers for
every window. The only way X is using less RAM is if you run without
compositing.

If you're not using a compositor, you must be either living in the 90s or
using a device without a supported GPU. Compositing is not new, people were
excited about Compiz desktop cubes and wobbly windows back in the mid-2000s!

A Wayland compositor running directly on EGL should always be faster than Xorg
with a compositor. Xorg acts as an awkward middleman between your app and the
compositor, wasting time on buffer copies.

------
thewhitetulip
> I care about solutions, not glorifying a particular piece of code for its
> own sake

I wish everyone were like this!

------
postila
I was curious when LISTEN/NOTIFY were implemented in Postgres.

Seems like version 6.4 already had it:
[https://www.postgresql.org/docs/6.4/static/sql-notify.html](https://www.postgresql.org/docs/6.4/static/sql-notify.html)

6.4 was released in 1998...

[https://github.com/postgres/postgres/blob/REL6_4/doc/src/sgm...](https://github.com/postgres/postgres/blob/REL6_4/doc/src/sgml/ref/listen.sgml)

~~~
williamstein
Thanks! NOTIFY is much nicer to use today, since you can also include a
payload, and there is a clean way to format data to include in the payload
using all the new json functions. I understand that NOTIFY has also been sped
up over the years...

------
perlgeek
I don't understand the approach, really. They manage a 25+ node cluster of
RethinkDB, but are reluctant to introduce a message broker?

In my experience, a message broker isn't such a big operational burden, even
more so if it doesn't have to persist messages. And a message broker with
pub/sub doesn't make the architecture necessarily more complicated.

Somehow that seems like optimizing along the wrong axis.

------
jbhatab
I would LOVE a layer built on top of postgres for reactive db events. +1

If it also integrated with GraphQL like you were mentioning at one point, even
sexier.

~~~
lobster_johnson
We're preparing for an open source release of exactly that -- a document-
oriented data store on top of Postgres and Elasticsearch, with transactions,
joins, optional schemas, a query language, fine-grained document patches and
change feeds. Email me if you want a ping when it's available.

~~~
floatboth
Why Elasticsearch when Postgres has full-text search built in? It's one of my
favorite Postgres features, I basically never have to bother with
Elasticsearch/Solr/etc.

~~~
lobster_johnson
Good question. I hope to create a backend that actually stores data in
Postgres, too.

We don't really use Elasticsearch because of the fulltext support, although we
do use that, too.

Our document store is split into two parts: (1) A highly transactional data
store (which uses Postgres) which stores master data and where everything is
strict; and (2) an eventually-consistent search index (which uses
Elasticsearch) where everything is expendable and queries are less strict.

This has the benefit of allowing extreme horizontal scalability on the read
path, without impacting the performance of the write path, but at the cost of
less consistency. By dividing the two, we can control the flow of data into
the write path; for example, some clients do batch imports that are indexed
more slowly than real-time updates.

The challenge with layering the search index on top of Postgres is how to
represent the data. The document store manages the schemas for you, so we know
to some extent what the data is, but also supports either partially or
completely open schemas (where the allowed fields are either partially
validated or completely schemaless).

We could build tables dynamically from schemas, or we could denormalize the
data into a less efficient [id, field, type, value] table, or we could index
JSONB documents (GIN, not as efficient as B-trees on normal columns afaik, and
limited in some ways). There are a bunch of options.

------
Kiro
My use case: I run a couple of stateful multiplayer games where I have a
Node.js server writing to MongoDB on each player update (e.g. "update player.x
to 123"). I only read from the DB when a player logs in and is added to the
game or when the server restarts (items etc). As long as a player is online
the state is just kept in in memory (a big array of all the players) but is
written to the DB every time it's updated.

This means the game could theoretically work without a DB at all; the time it
takes to write to the DB etc. doesn't matter as long as it happens in the
correct order. Read speed is also not really relevant since it happens so
seldom.

The MongoDB document is the same as the player object in Node.js.

I've been thinking of migrating to RethinkDB but I've also been looking at
PostgreSQL. Would the JSON support cover this sufficiently and would it make
sense? I don't need any schemas or anything like that. I just want to be able
to add and update JSON objects.
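
For concreteness, this is roughly what I imagine the write path would look like with a JSONB column (a sketch with invented names, not something from the article):

```python
import json
import psycopg2

conn = psycopg2.connect("dbname=game user=postgres")  # placeholder DSN
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS players (
            id  text PRIMARY KEY,
            doc jsonb NOT NULL
        )
    """)
    # Store the whole player object, much like the MongoDB document.
    cur.execute(
        "INSERT INTO players VALUES (%s, %s::jsonb) ON CONFLICT (id) DO NOTHING",
        ("p1", json.dumps({"x": 0, "y": 0, "hp": 100})),
    )
    # "update player.x to 123" without rewriting the rest of the document.
    cur.execute(
        "UPDATE players SET doc = jsonb_set(doc, '{x}', to_jsonb(%s::int)) WHERE id = %s",
        (123, "p1"),
    )
conn.close()
```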

~~~
raverbashing
You could move to redis

~~~
Kiro
Thanks. Can I update a specific property without overwriting the whole JSON
blob? Or how do I do it effectively?

~~~
gamesbrainiac
Yes, you can namespace properly, and have that effect.

------
mi100hael
_> I didn’t seriously consider MySQL since it doesn’t have LISTEN/NOTIFY, and
is also GPL licensed, whereas PostgreSQL has a very liberal license._

GPL/AGPL is perfectly permissive and doesn't require any sort of special
disclosure if you're just running a vanilla distribution of a server without
modifying its code...

~~~
newsat13
Yeah, that line stood out for me as well. Can someone explain why MySQL's GPL
license is a problem in practice for his use case?

~~~
infradig
In the actual article he mentioned a big potential customer who had license
limitations... no AGPL! I can understand someone not wanting to limit the
sales audience.

~~~
daenney
That particular quote was in relation to RethinkDB though, not MySQL:

> Then I was in a very long and intense meeting with a potentially major
> customer for an on-premises install, and one of their basic requirements was
> “no AGPL in the stack”. With the RethinkDB company gone, there was no way to
> satisfy that requirement, and my requests went nowhere at the time.

~~~
sirclueless
The point being that the AGPL licensing of RethinkDB caused problems selling
SMC to customers. So why invite those same problems again by replacing it with
other GPL'ed software?

~~~
daenney
I don't know. But if you read further down the article:

> All the code I wrote related to this blog post is – ironically – AGPL.
> Basically it is everything that starts with postgres- here.

All the code they just wrote for the Pgsql rewrite is AGPL. Obviously they can
relicense since they own the copyright, but it's weird to mention MySQL's GPL
license in passing as a reason they didn't go that way, yet put their own code
up as AGPL.

~~~
grzm
One of the motivations to use Postgres was that it was not AGPL, while the
associated code _is_ AGPL; this is what drives the inclusion of the word
_ironically_ , does it not? In my reading, the author is very aware of the
inconsistency.

------
ssfak
I am a bit puzzled about the scalability of listen/notify in Postgres and its
use in the article. Each "listener" in the code requires a connection to the
database, so it's not a good design to have one LISTEN "query" per user. You
will probably need a dedicated connection in a thread (or a limited number of
connections) for the database listening functionality. You can possibly use
some async PG driver, but still, on the database end I am not sure how
efficient and scalable this solution will be.

I can assume that this is a good solution if you don't have (need) a high rate
of "notify" statements and a high number of subscribers waiting on "listen".
Any comments on these limits of PostgreSQL?
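
What I would expect is something like a single dedicated connection doing all the LISTENing and fanning notifications out in-process, so the connection count doesn't grow with the number of users. A rough sketch of that pattern (channel names are assumed to be trusted identifiers):

```python
import select
from collections import defaultdict
import psycopg2

conn = psycopg2.connect("dbname=test user=postgres")  # placeholder DSN
conn.autocommit = True
cur = conn.cursor()

subscribers = defaultdict(list)              # channel name -> list of callbacks

def subscribe(channel, callback):
    if channel not in subscribers:
        cur.execute("LISTEN " + channel)     # channel assumed to be a safe identifier
    subscribers[channel].append(callback)

def run_forever():
    while True:
        if select.select([conn], [], [], 10.0) == ([], [], []):
            continue                          # idle; keep waiting
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            for cb in subscribers.get(note.channel, []):
                cb(note.payload)
```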

------
ruslan_talpa
Good read, and yes, Postgres is king, no argument there, but I do think the
comparison is a bit unfair to RethinkDB. Work on Postgres started in 1986; you
can't compare the performance of tools when one of them has had 30 years to
work on performance and the other is like 5 years old and had the benefit of
only a handful of brains working on it. I would say it's remarkable what
RethinkDB did in this timeframe.

In relation to Postgres and real-time messages, another approach is to use a
real messaging server instead of using only the simple listen/notify interface
pg provides. It's possible to connect them using this:
[https://github.com/gmr/pgsql-listen-exchange](https://github.com/gmr/pgsql-listen-exchange)

I am just wrapping up the integration here
([http://graphqlapi.com](http://graphqlapi.com)) and so far it looks good.
Postgres provides the power and features we all know and rabbitmq gives you
all the realtime capabilities you need, and you can route messages in complex
ways and have them delivered to a whole bunch of clients.

~~~
paulsutter
> you can't compare performance of tools when one of them had 30 years to work
> on performance and the other is like 5y old

Actually, you can and should. In fact that's the whole point.

You certainly can't make a judgement on the skills of the developers, but
William isn't doing that.

~~~
ruslan_talpa
You are right, that was not a good choice of words. What I was getting at is
that the article reads as "I was constantly fighting Rdb and Pg solved all my
problems", and even if people don't make judgements about the developers, they
do about the tool, and that kind of thing remains in your brain: you attach a
stigma to the tool and never consider it again, even though it will get a lot
better if you just give it time. For example, I tried pg back in 2000-something,
did just a stupid benchmark of how fast it does a count, compared it to mysql,
and never looked back; so for 10 years I ignored pg because of that stupid test
and because in my brain there was this idea that it's slow. Thank god I somehow
got over that ...

~~~
aidos
Might be wrong, but I feel like you may have a vested interest and hence read
the article as more of an attack than it really was.

To me it reads more like "I understood the problem space better during the
rewrite so I could solve things in more appropriate ways with the tools at
hand."

~~~
ruslan_talpa
Yes, you are wrong :) I do not have a vested interest in RethinkDB; I actually
do have a vested interest in Postgres :), so I was kind of defending the other
guy even though I am PG all the way (read my other comments, you'll see I quite
like PG, so much so that I built a tool/business around it).

EDIT: removed a link

------
gamesbrainiac
Quick question, how does one impact PG's query planner other than creating
indexes (partial or otherwise)?

~~~
radiowave
Potentially, there was an index which the author _knew_ would yield good
performance for a particular query, but the query planner over-estimated the
number of rows, and so decided that a sequential scan would be a better
choice. By executing the command, "SET enable_seqscan = OFF;" inside a
session, you can force postgres to use an index if it possibly can.

These sorts of problems can occur (for example) when very complicated joins
are performed, or when unstructured data is being used as part of a join, as
these tend to lead to poor estimates of the number of rows that the different
parts of the query will yield.

Relevant docs here:

[https://www.postgresql.org/docs/current/static/runtime-config-query.html](https://www.postgresql.org/docs/current/static/runtime-config-query.html)
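
Concretely, that nudge can be scoped to a single transaction with SET LOCAL so it doesn't leak into the rest of the application (the query and table below are placeholders):

```python
import psycopg2

conn = psycopg2.connect("dbname=test user=postgres")  # placeholder DSN
with conn, conn.cursor() as cur:
    cur.execute("SET LOCAL enable_seqscan = off")  # LOCAL: reverts at COMMIT/ROLLBACK
    cur.execute("""
        EXPLAIN ANALYZE
        SELECT * FROM projects WHERE last_edited > now() - interval '1 day'
    """)
    for (line,) in cur.fetchall():
        print(line)
# Outside the transaction the planner behaves normally again.
```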

~~~
Groxx
For additional anecdotes, I've had to hint MySQL's planner twice, both times
to exclude an index. It'd switch because the cardinality of one index was much
better, but there were enough pathological cases (and they were the most-often
used) that the planner's choices were proving fatal.

It's not too abnormal. Sometimes you know the structure of your data better
than the database's internal summary does, so you just give it a nudge in the
right direction.
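
For anyone curious, that kind of MySQL hint looks roughly like this (table, index, and connection details are all made up):

```python
import pymysql

conn = pymysql.connect(host="localhost", user="app", password="secret", database="appdb")
with conn.cursor() as cur:
    # Tell the optimizer not to consider the misbehaving index for this one query.
    cur.execute("""
        SELECT id, status, created_at
          FROM jobs IGNORE INDEX (idx_jobs_status)
         WHERE status = %s
           AND created_at > NOW() - INTERVAL 1 DAY
    """, ("pending",))
    rows = cur.fetchall()
conn.close()
```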

------
jorblumesea
I'm confused; the situation clearly called for a relational database. Almost
any RDBMS would have been better than any "schemaless" DB in this case. Was it
just to get the reactive architecture features? I don't understand why you
would architect an application from the database up.

~~~
williamstein
Nothing was clear to me when I started writing the application. However, after
years of running it in production, things did become much clearer, and indeed,
a relational database is ideal.

------
rattray
The author mentioned they're running in Google Cloud Engine. I'm curious; why
not use RDS, which has Postgres support? (Especially considering that they are
moving from a 3x-redundant setup to one with no redundancy)

~~~
brianwawok
GCE does not have RDS, RDS is AWS.

GCE only has CloudSQL which is MySQL.

------
devmunchies
No mention of RethinkDB joining the Linux foundation?

~~~
wmf
It happened after he finished migrating to Postgres so it's not that relevant.
The timing of the relicensing was pretty bad luck in this case.

------
brilliantcode
This gives me hope that maybe we could see something similar with Datomic.
Perhaps it is possible to implement the same append-only (this is the best way
I understand it), immutable audit trail on top of PostgreSQL and still walk
away with SQL (datalog is neat-o but has a learning curve).

but I dream the dream...

~~~
tensor
Technically PostgreSQL already has this in the form of the WAL. The problem is
that "rolling back" to a given point in time requires recreating the database
from the WAL, so it's not useful in the same way.

There are other "remember everything" schemes you can build with clever
triggers, but it always comes back to how you end up using the stored data and
how easy it is to bring it back to a queryable state.
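
One of those trigger schemes, sketched with invented table names: every UPDATE or DELETE copies the previous row into a history table, which keeps old versions queryable with plain SQL (the hard part, as noted, is deciding how you'll actually use them):

```python
import psycopg2

# Assumes an existing table "docs" with columns (id bigint, doc jsonb).
DDL = """
CREATE TABLE IF NOT EXISTS docs_history (
    id          bigint,
    doc         jsonb,
    valid_until timestamptz NOT NULL DEFAULT now()
);

CREATE OR REPLACE FUNCTION archive_doc() RETURNS trigger AS $$
BEGIN
    INSERT INTO docs_history (id, doc) VALUES (OLD.id, OLD.doc);
    IF TG_OP = 'DELETE' THEN
        RETURN OLD;   -- let the delete proceed
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS docs_archive ON docs;
CREATE TRIGGER docs_archive
    BEFORE UPDATE OR DELETE ON docs
    FOR EACH ROW EXECUTE PROCEDURE archive_doc();
"""

conn = psycopg2.connect("dbname=test user=postgres")  # placeholder DSN
with conn, conn.cursor() as cur:
    cur.execute(DDL)
conn.close()
```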

~~~
twic
PostgreSQL also has it in the form of its MVCC table structure. If you never
delete 'old' rows, you could query old states of the database by ignoring rows
with newer transaction IDs.

However, nothing about pgSQL is designed for this approach, so I imagine the
performance would be terrible.

~~~
grzm
_nothing about pgSQL is designed for this approach_

Interestingly enough, that was a part of the original design from the Berkeley
days.

[http://db.cs.berkeley.edu/papers/ERL-M85-95.pdf](http://db.cs.berkeley.edu/papers/ERL-M85-95.pdf)

 _Our proposed approach is to treat the log as normal data managed by the DBMS
which will simplify the recovery code and simultaneously provide support for
access to the historical data.

...

3.3. Time Varying Data

POSTQUEL allows users to save and query historical data and versions [KATZ85,
WOOD83]. By default, data in a relation is never deleted or updated.
Conventional retrievals always access the current tuples in the relation.
Historical data can be accessed by indicating the desired time when defining a
tuple variable.

...

Finally, POSTGRES provides support for versions. A version can be created from
a relation or a snapshot. Updates to a version do not modify the underlying
relation and updates to the underlying relation will be visible through the
version unless the value has been modified in the version._

One of the purposes of the much-maligned VACUUM command was to push the
historical data to archival (optical) media.

 _The archival store holds historical records, and the vacuum demon can ensure
that ALL archival records are valid._

~~~
twic
TIL!

> POSTQUEL allows users to save and query historical data and versions
> [KATZ85, WOOD83]. By default, data in a relation is never deleted or
> updated. Conventional retrievals always access the current tuples in the
> relation. Historical data can be accessed by indicating the desired time
> when defining a tuple variable.

Or, to put it another way [1]:

"Since one can't change the past, this implies that the database accumulates
facts, rather than updates places, and that while the past may be forgotten,
it is immutable."

&:)

[1] [https://www.infoq.com/articles/Datomic-Information-Model](https://www.infoq.com/articles/Datomic-Information-Model)

------
9gunpi
The year people discover that RDBMSs are quick and reliable. Again.

------
wildchild
OMG what next? Redis vs PostgreSQL?

------
fapjacks
I think any rewrite is going to be orders of magnitude more efficient anyway.
You've got the benefit of hindsight.

~~~
bjt
But it's not like he wrote RethinkDB and then wrote Postgres with the benefit
of hindsight. When he says that the big change was the database and not his
own code, I believe him.

~~~
fapjacks
I believe him, too. I'm not trying to participate in any database war. I've
only said that any rewrite is going to have the benefit of hindsight, and
because of that, his results are going to reflect that to a degree.

------
bandrami
Isn't that kind of like saying "jackhammers vs. crescent wrenches: my personal
experience"?

