
Performance Benchmark 2018 – MongoDB, PostgreSQL, OrientDB, Neo4j and ArangoDB - pluma
https://www.arangodb.com/2018/02/nosql-performance-benchmark-2018-mongodb-postgresql-orientdb-neo4j-arangodb/?hn
======
olavgg
I've stopped reading database benchmarks, because they are extremely vague.
Instead I spend my time optimizing my current solution/stack. For example
Postgresql has hundreds of knobs that you can adjust for almost every scenario
you can imagine. Sometimes you have a special query and increase the work_mem
just for that session. Other cases you adjust the cost settings for another
query/session. You can analyze your indexes and index types. And sometimes you
need to rewrite parts of a big query.
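
For illustration, a rough sketch of that per-session work_mem trick with the
node-postgres ("pg") client - the orders table and the 256MB value are just
made-up examples:

    const { Client } = require('pg');

    async function runHeavyQuery() {
      const client = new Client(); // connection settings from PG* env vars
      await client.connect();
      // Applies to this session only; other sessions keep the server default.
      await client.query("SET work_mem = '256MB'");
      const res = await client.query(
        'SELECT customer_id, count(*) FROM orders GROUP BY customer_id'
      );
      await client.end();
      return res.rows;
    }

    runHeavyQuery().then(console.log).catch(console.error);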

Learning all this takes time, you are much better off learning more about your
chosen technology stack than switching to another technology stack.

Though in a few rare cases, you need a different technology to solve your
business problem. Even then, it usually complements your existing solution,
like Elasticsearch/Solr for full-text search or ClickHouse for OLAP
workloads.

~~~
maxxxxx
Agreed. Switching to another system is expensive and the benefit is pretty
questionable.

~~~
emsy
Unless you hit a very specific use-case/bottleneck, which I only ever
witnessed once.

~~~
TremendousJudge
expand, please?

~~~
maxxxxx
I imagine something very specific like having a lot of inserts into a table
and that being your main use case. Depending on your data some databases may
be better than others and that should be easy to measure.

In most real-world cases the requirements however are not very clear and often
conflicting so it's much harder to get data that shows the performance of one
system over the other.

~~~
gopalv
> Depending on your data some databases may be better than others and that
> should be easy to measure.

And the performance difference could be an accidental feature of the design
and completely unintentional.

Postgres, for instance, has a native data engine, so it can store the exact
row-id for a row in an index, but this means that every update to the row
needs all indexes to be updated.

MySQL has many data engines (InnoDB and MyISAM to start with), so the row-id
is somewhat opaque: the index stores the primary key, which can be pushed
down to the data engine's scans, letting the engine look up the row-id
internally. This means an index only needs to be touched for the columns you
modify explicitly, or if the primary key is updated (which is a usual no-no
due to UNIQUE lookup costs).

When you have a single wide table with a huge number of indexes, where you
update a lot of dimensions frequently, the performance difference between
these two solutions is architectural.

And if you look up along an index with few updates but long-running open
txns, that is also materially different: one lookup versus two.

Though how it came about isn't really intentional.

~~~
janemanos
We just published an update to the benchmark; please find it here:
[https://news.ycombinator.com/item?id=16473117](https://news.ycombinator.com/item?id=16473117)

------
jakewins
Haha, I almost spit out my coffee when I saw how poorly we performed here!
Reading the code, it's kind of incredible that Neo4j even competes - all the
databases benchmarked are running with 25 concurrent database connections:

[https://github.com/weinberger/nosql-
tests/blob/master/arango...](https://github.com/weinberger/nosql-
tests/blob/master/arangodb/description.js#L5)

Except Neo, which is given a single connection to work with :(

[https://github.com/weinberger/nosql-
tests/blob/master/neo4j/...](https://github.com/weinberger/nosql-
tests/blob/master/neo4j/description.js#L13)

When I modify it locally to just set {maxConnectionPoolSize: 25}, like how
the other databases are set up, I get roughly an order of magnitude better
performance. neighbors2 went from 0.0226ms to 0.0068ms, for instance. Single
write sync went from 3.36415ms to 0.77549ms avg/op.
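
For anyone curious, the change is roughly this with the official Node.js
driver - a sketch only, with placeholder URL and credentials; the
maxConnectionPoolSize option is documented in the driver's connection pool
settings:

    const neo4j = require('neo4j-driver').v1; // the 1.x driver line

    const driver = neo4j.driver(
      'bolt://localhost:7687',
      neo4j.auth.basic('neo4j', 'secret'),
      { maxConnectionPoolSize: 25 } // match the other databases' setups
    );

    // Every session drawn from this driver now shares the larger pool.
    const session = driver.session();
    session.run('MATCH (n) RETURN count(n) AS c').then(result => {
      console.log(result.records[0].get('c').toString());
      session.close();
      driver.close();
    });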

Benchmarking is hard!

~~~
janemanos
Jan from ArangoDB here:

Did you read the appendix?

"We used a TCP/IP connection pool of up to 25 connections, whenever the driver
permitted this. All drivers seem to support this connection pooling, except
Neo4j. We sent instead twenty-five requests via NodeJS to Neo4j."

Just to avoid unfair conditions now: if connection pooling is possible with
the Node.js neo4j driver, then you would have to configure the benchmark
scripts to not also send 25 parallel requests via Node.

We tricked ourselves with such stuff when preparing the benchmark.

~~~
maxdemarzi
Did you read the manual where it says the driver permits connection pooling?

[http://neo4j.com/docs/developer-
manual/current/drivers/clien...](http://neo4j.com/docs/developer-
manual/current/drivers/client-applications/#driver-config-connection-pool-
management)

~~~
janemanos
Hi Max, thanks also for your feedback. We worked it into an update to the
benchmark; please find it here:
[https://news.ycombinator.com/item?id=16473117](https://news.ycombinator.com/item?id=16473117)

------
etxm
ArangoDB always makes for exciting benchmark posts.

I could see myself there in a bowler hat with a fistful of racing chits
screaming “go, Postgres, go.”

I’d love to see a competition where the developers of each database got to
use the same hardware and data and then tune the hell out of their configs,
queries, and indices.

Red Bull could sponsor it. I’d buy a T-shirt.

~~~
kbenson
That doesn't sound that hard to start. Something like RealWorld[1] and the Web
Framework Benchmarks[2] combined but for DB workloads. Have one dataset that
includes data amenable to OLAP and OLTP, but have separate tests each
consisting of OLAP queries, OLTP queries, and combined queries. Choose a low-
end, mid-range and high-end set of AWS or GCE instances/configs to normalize
against. Let people submit pull requests with new technologies or configs.

You'd want to get some funding to run the tests (or maybe solicit Google or
Amazon to see if you could get the instance time donated once a month or
something).

If you started small, with maybe a portion of these features, and then scaled
up over time, you might actually get to the point where you had tests that
emulated a power failure, or master/slave and dual master scenarios and how
they handle certain common network errors (split-brain). That would be an
amazing resource.

Edit: It occurs to me I probably should have read more of the article, since
this is _sort of_ what they are doing already...

1:
[https://github.com/gothinkster/realworld](https://github.com/gothinkster/realworld)

2:
[https://www.techempower.com/benchmarks/](https://www.techempower.com/benchmarks/)

~~~
odammit
Yeah after I posted it I started thinking about what it would take and what
that would actually look like... and how you’d cheat :)

It would probably require a few different categories, with some sort of
output assertion to validate that the query performed correctly, and a means
of tracking CPU usage, RAM usage, and execution time.

It would be cool to see things like disaster recovery and chaos proofing as
well.

------
ahachete
My feedback:

\- 1.6M documents. This is a tiny dataset. I bet it's much smaller than RAM
size (122GB), which means all data is in memory. Interesting benchmark, but
not what I'd typically expect for a database (data size >> RAM).

\- All other databases are allowed either to use all available memory or are
capped at 10GB, while PostgreSQL is capped at 128MB, the default
shared_buffers value on PostgreSQL 10! This means PostgreSQL is running off
of disk, whereas the other databases have all data in memory.

\- Just to insist on the previous bullet: I see the point of using the
default config (even though I don't consider it interesting, compared to a
minimum tuning of every db), but it puts PG in a very unfair position (it's
limited to 1/1000th of the RAM!). Just set shared_buffers to 32GB (or to
10GB, to cap all databases at 10GB!) and re-run the tests; see the sketch
after this list. I bet PG results will improve significantly.

\- Other basic tuning parameters (like work_mem or min_wal_size or
random_page_cost) may also significantly affect performance. This could be
left as an exercise for another benchmark with properly tuned databases. If
interested, here's a recent presentation I did with many other PostgreSQL
tuning recommendations: [https://speakerdeck.com/ongres/postgresql-
configuration-for-...](https://speakerdeck.com/ongres/postgresql-
configuration-for-humans)
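
As mentioned in the third bullet, here's a sketch of that one-line change via
node-postgres (hypothetical: ALTER SYSTEM requires superuser rights, and
shared_buffers only takes effect after a server restart):

    const { Client } = require('pg');

    async function capSharedBuffers() {
      const client = new Client(); // must connect as a superuser
      await client.connect();
      // Persists the setting to postgresql.auto.conf; unlike work_mem, it
      // is only picked up when PostgreSQL is restarted.
      await client.query("ALTER SYSTEM SET shared_buffers = '10GB'");
      await client.end();
    }

    capSharedBuffers().catch(console.error);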

Other than this, congratulations on the work on ArangoDB. Building a
database is really brave, hard work.

~~~
janemanos
Hi @ahachete, and thanks for your feedback on our 2018 benchmark. We worked
it into an update; please find it here:
[https://news.ycombinator.com/item?id=16473117](https://news.ycombinator.com/item?id=16473117)

------
maxdemarzi
I hate these types of posts. They do nothing to help the graph database
space. How good are they at bidirectional traversals? What about deep
traversals in complex networks with more than 50 relationship types? What
about x? Every vendor is going to be better at one thing than another; there
is always a trade-off, but it is irrelevant.

Last week we took a very deep and very wide sales hierarchy query from 10
seconds in Oracle to 4ms in Neo4j. Whether Arango could do it in 3ms, or
Orient in 5ms, doesn't really matter. The point is that graph databases are
much better at some queries than relational ones. They should be blogging
about use cases and customer successes, not benchmarks.

~~~
ifcologne
>They should be blogging about use cases and customer successes, not
benchmarks.

They do; just recently, the cases of AskBlue (location-aware recommendations)
and Thomson Reuters (fast & secure single view of everything):

[https://www.arangodb.com/why-arangodb/case-
studies/](https://www.arangodb.com/why-arangodb/case-studies/)

I am very interested in graph use cases and appreciate Neo4j's support for
projects like the Panama and Paradise Papers, which show how connected graphs
help uncover formerly hidden relationships. Graphs are very powerful and a
great addition to relational or document approaches.

(Full disclosure: I worked for ArangoDB 2 years ago.) And I've used Neo
successfully for a fraud detection PoC recently...

------
anilshanbhag
I don't think these results are comparable. Query runtimes are highly
dependent on how much memory the database can use. Postgres, for example,
isn't configured, which leaves it with less memory than the other systems.
Hence its performance is likely worse than it could be, and could be improved
significantly by increasing the buffer size.

~~~
mpv89
Hi, Mark from ArangoDB here. We used the default configuration for every
database in the benchmark, under the assumption that the defaults are chosen
reasonably. If you have any specific suggestions on how to configure Postgres
for this specific environment, please let us know.

~~~
bsg75
That is a terrible approach. No database platform is tuned by default.

You may have to run a utility, or edit a config file, but a new system must be
configured to fully utilize available hardware, and for the expected workload.

A benchmark run on default config is at best misleading - it smells like
marketing spam. You actually dampen interest in a forum like this with such an
approach.

~~~
don71
Disclaimer: I'm part of the ArangoDB team. As written in the post, the whole
benchmark is open source; the idea is that you can run it on your own. Also,
pull requests are welcome. If you think it's marketing spam, take the chance
and improve the configuration. We will publish an update of the post.

~~~
bsg75
One simple suggestion for one platform: [https://github.com/weinberger/nosql-
tests/issues/22](https://github.com/weinberger/nosql-tests/issues/22)

The more common pgtune CLI is not up to date for PostgreSQL 10+ at this time.

Also there are some old, open issues that indicate the benchmarks have
problems:

\- [https://github.com/weinberger/nosql-tests/issues/16](https://github.com/weinberger/nosql-tests/issues/16)

\- [https://github.com/weinberger/nosql-tests/issues/13](https://github.com/weinberger/nosql-tests/issues/13)

~~~
janemanos
We just published an update to the benchmark, including pgtune; please find
it here:
[https://news.ycombinator.com/item?id=16473117](https://news.ycombinator.com/item?id=16473117)

------
cldellow
The table of times notes:

> The shortest path query was not tested for MongoDB or PostgreSQL since those
> queries would have had to be implemented completely on the client side for
> those database systems.

I can't speak for Mongo, but I imagine the shortest path query could be
implemented as a recursive CTE or a sproc in Postgres, which would execute
server side. It might be a bear to write, and it might be slow, but it should
be possible.
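
For what it's worth, a rough sketch of how that could look with
node-postgres, assuming a hypothetical edges(_from, _to) table - depth-capped
so cycles can't run away, and quite possibly slow at Orkut scale:

    const { Client } = require('pg');

    const SHORTEST_PATH_SQL = `
      WITH RECURSIVE walk AS (
        -- seed: every edge leaving the start vertex
        SELECT e._to AS node, 1 AS depth, ARRAY[e._from, e._to] AS path
        FROM edges e
        WHERE e._from = $1
        UNION ALL
        -- step: extend each partial path by one edge, skipping revisits
        SELECT e._to, w.depth + 1, w.path || e._to
        FROM walk w
        JOIN edges e ON e._from = w.node
        WHERE w.depth < 10
          AND NOT e._to = ANY(w.path)
      )
      SELECT path, depth FROM walk
      WHERE node = $2
      ORDER BY depth
      LIMIT 1`;

    async function shortestPath(from, to) {
      const client = new Client();
      await client.connect();
      const res = await client.query(SHORTEST_PATH_SQL, [from, to]);
      await client.end();
      return res.rows[0]; // undefined if no path within the depth cap
    }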

~~~
janemanos
If you know how to write the shortest path query for Postgres, team ArangoDB
would be happy to include it in the next update.

~~~
tankenmate
Not exactly what you are asking for, but this sits on top of PostgreSQL and
uses non-standard indexing to improve graph queries.

[https://github.com/bitnine-oss/agens-graph](https://github.com/bitnine-
oss/agens-graph)

------
maxdemarzi
Don’t ever trust any vendor benchmarks:
[https://maxdemarzi.com/2015/10/16/benchmarks-and-
supercharge...](https://maxdemarzi.com/2015/10/16/benchmarks-and-
superchargers/)

~~~
janemanos
In your 2015 article, you criticized the ArangoDB team for restarting the
instances after each test run. In this 2018 edition, they don't do this
anymore.

~~~
maxdemarzi
You want to trust a vendor that used to restart the db between queries for
their benchmarks?

~~~
virmundi
Yes, because they learned. Arango has honest people who are open to
criticism.

~~~
tinymollusk
But the larger issue -- that they have direct financial incentive to
unconsciously bias the results -- still exists. This isn't anything against
Arango's team, I'm sure they're lovely people, but even physics experimental
results have been shown to be biased when the experimenter knew the testing
hypothesis.

"I think I've been in the top 5% of my age cohort all my life in understanding
the power of incentives, and all my life I've underestimated it. Never a year
passes that I don't get some surprise that pushes my limit a little farther."
\-- Charlie Munger

~~~
ifcologne
Performance tests - especially of databases - are a very complex and
resource-consuming venture. And because each use case is different, a
benchmark published somewhere will not fit your specific problem or your
available environment/budget.

Unfortunately, there is no independent organization that takes this on and
defines scenarios to be tested across different environments.

~~~
janemanos
We just published an update to the benchmark; please find it here:
[https://news.ycombinator.com/item?id=16473117](https://news.ycombinator.com/item?id=16473117)

------
zilchers
A big missing piece of this was clustered/HA configurations. To run a
production-level application, you’re going to be running with some sort of
clustering or replication, which is going to add pretty significant latency
depending on the implementation. But it’s an interesting article; I’m always
impressed that some of these smaller database companies are still kicking.

------
drej
Has any vendor ever produced a benchmark where their solution wasn't the
clear winner?

~~~
virmundi
Arango isn’t the clear winner in all of these. Unconfigured PG did pretty
well. PG with JSON and no indexes did well too. It also held its own while
using less memory. Finally, PG has transactions, whereas Arango doesn’t
really, if you have an index constraint on a document attribute. This shows
Arango is better than Neo4j and Mongo for graph-related services and document
storage. The benchmark skipped PG’s recursive query capability, which
probably allows for basic graph traversal to a known depth.

------
amorroxic
ArangoDB is an incredible database with some very shy marketing imo - edge
indexing over JSON documents is too good of a sweet spot.

Not sure if this is still the case, but among past headaches I remember race
conditions in the Node driver (RocksDB engine) and maybe less straightforward
clustering (compared to Couchbase/Cassandra/etc. at least).

~~~
mchacki
The race conditions have been eliminated and we greatly simplified the cluster
setup. With this "new" tool shipped with arangodb
[https://github.com/arangodb-helper/arangodb](https://github.com/arangodb-
helper/arangodb) it is just a single line on each machine to get the cluster
up and running.

~~~
amorroxic
Props to you guys, outstanding product really.

------
HugoDaniel
I didn't know about this project. Their homepage made me want to try it out.
Is AQL an invention of theirs? Reading their [examples] made me feel that
this is a very welcome update to SQL. Congrats.

[examples] [https://www.arangodb.com/why-arangodb/sql-aql-
comparison/](https://www.arangodb.com/why-arangodb/sql-aql-comparison/)

------
steinerj
It's curious that single-purpose graph DBs like Neo4j perform poorly at some
typical graph queries. Or am I getting something wrong?

~~~
mchacki
Hey, Michael from ArangoDB here. I was also a bit surprised, especially for
the graph case. We see that more memory is consumed, but throughput did not
really increase in comparison to the benchmark we conducted 3 years ago.
Nevertheless, we do see improvements on the shortest path queries, which
challenge the database more. As we are no dedicated Neo4j experts, we are
interested in any configuration that will speed things up; please share it
with us.

------
kbenson
In what case does it make sense to benchmark these databases but leave
MySQL/MariaDB/Percona out?

~~~
don71
Disclaimer: I'm part of the ArangoDB team. As written in the post, the whole
benchmark is open source, and you are welcome to add other DBs; that is
appreciated. It is just important that there is an official Node.js driver
and that a GA version is used.

~~~
e12e
It would be nice to have domain/product experts do some tweaking and host
this at a third party, a la "The benchmark game"[1]. A nice little
graph-oriented benchmark would be great, especially avoiding things like:

[https://news.ycombinator.com/item?id=16378157](https://news.ycombinator.com/item?id=16378157)

[ed: and:

"For comparison, we used three leading single-model database systems:... and
PostgreSQL for relational database."

JSONB with indexes doesn't rate as a "document db"; how about postgis for a
domain-specific graph db?]

Some agreement on hw (virtual or otherwise), dataset and queries might make
this actually rather useful.

[1]
[https://benchmarksgame.alioth.debian.org/](https://benchmarksgame.alioth.debian.org/)

------
lobo_tuerto
From the article, PostgreSQL still looks like the best option in general.

~~~
riku_iki
Unless you need easy clustering...

~~~
lobo_tuerto
In that case, you can go for CockroachDB. ;)

You can use the same PostgreSQL client drivers to talk to CockroachDB, as
specified here: [https://www.cockroachlabs.com/docs/stable/frequently-
asked-q...](https://www.cockroachlabs.com/docs/stable/frequently-asked-
questions.html#what-languages-can-i-use-to-work-with-cockroachdb)

"CockroachDB supports the PostgreSQL wire protocol, so you can use any
available PostgreSQL client drivers."
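
A minimal sketch of that in practice with node-postgres - host, user and
database are placeholders; 26257 is CockroachDB's default SQL port:

    const { Client } = require('pg');

    async function main() {
      const client = new Client({
        host: 'localhost',
        port: 26257,          // CockroachDB's default SQL port
        user: 'root',
        database: 'defaultdb',
      });
      await client.connect();
      const res = await client.query('SELECT version()');
      console.log(res.rows[0].version); // reports CockroachDB, not PostgreSQL
      await client.end();
    }

    main().catch(console.error);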

------
rambossa
Any particular reason why [https://github.com/dgraph-
io/dgraph](https://github.com/dgraph-io/dgraph) missed the benchmark?

\-- It seems to be highly capable (horizontal scaling/ACID) and "performant".
Reference vs Neo4j: [https://blog.dgraph.io/post/benchmark-
neo4j/](https://blog.dgraph.io/post/benchmark-neo4j/)

~~~
pluma
It doesn't seem like dgraph has an official Node driver?

------
oneweekwonder
Interesting, thanks for doing this, and sharing it so openly.

Do you maybe have the data for your graphs in CSV or JSON? I want to plot my
own graphs with it.

Personally, I would call it a JavaScript performance benchmark for MongoDB,
PostgreSQL, OrientDB, Neo4j and ArangoDB, as is mentioned at the end of the
article under the software section. To truly be a database performance
benchmark, it would need to interface with the C/native libraries.

Can anybody recommend other database benchmarks?

~~~
graetzer
I think the Orkut dataset is already in a tab-separated format; to turn it
into CSV you only need to add the column labels.

[http://snap.stanford.edu/data/com-
Orkut.html](http://snap.stanford.edu/data/com-Orkut.html) For an example you
can have a look at the import scripts: [https://github.com/weinberger/nosql-
tests/blob/master/arango...](https://github.com/weinberger/nosql-
tests/blob/master/arangodb/import.sh)
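
A rough sketch of that conversion in Node - file names and column labels are
illustrative, and the SNAP file's leading '#' comment lines get skipped:

    const fs = require('fs');
    const readline = require('readline');

    async function tsvToCsv(inPath, outPath) {
      const out = fs.createWriteStream(outPath);
      out.write('_from,_to\n'); // the missing column labels
      const rl = readline.createInterface({
        input: fs.createReadStream(inPath),
        crlfDelay: Infinity,
      });
      // Stream line by line so the huge edge list never sits in memory.
      for await (const line of rl) {
        if (line === '' || line.startsWith('#')) continue;
        out.write(line.replace('\t', ',') + '\n');
      }
      out.end();
    }

    tsvToCsv('com-orkut.ungraph.txt', 'edges.csv').catch(console.error);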

~~~
oneweekwonder
Sorry, I should have been more clear: I want the raw result data that was
used to draw the graphs.

I wanted to use Highcharts to make them interactive, e.g. hide DBs to compare
more easily.

I don't want to run my own tests, but I did have a look at the GitHub repo
and dataset.

~~~
janemanos
Here you go:
[https://docs.google.com/spreadsheets/d/1_unaj2x_NCVNUHcxwMUV...](https://docs.google.com/spreadsheets/d/1_unaj2x_NCVNUHcxwMUVkst9shlqLwrCZHRCMv_LDus/edit?usp=sharing)
Let me know if you have any questions or need something else.

------
robbiemitchell
Compare to a similar post from a couple years ago:

[https://news.ycombinator.com/item?id=9699102](https://news.ycombinator.com/item?id=9699102)

------
vijaybritto
I don't know if it's only me, but the graphs are very hard to understand. I'm
viewing them on a large desktop monitor, and I can't imagine what they look
like on a mobile phone!

------
CodeSheikh
Is it fair to include PostgreSQL's JSONB format in this benchmark, given that
it is not the main highlight of PostgreSQL but more of an added feature on
top of SQL?

~~~
Simran-B
Users asked for a JSONB comparison after the previous benchmark, so we
included it. The tabular format is also there. Thus, I don't see how it would
be unfair.

------
ChicagoDave
In-house performance stats are the least trustworthy thing on the planet. Are
there any independent studies?

~~~
Simran-B
No, there aren't even standard benchmarks that cover all types of databases to
compare performance etc. The ArangoDB benchmark is open source however. Feel
free to test it yourself and tune as many parameters as you want. You can also
add scripts for other database systems.

------
speby
A benchmark made by a database vendor that shows that the database vendor
performs well, if not the best, overall. Whether it is actually true or not,
the result being published is entirely unsurprising.

~~~
don71
Disclaimer: I'm part of the ArangoDB team. As written in the post, the whole
benchmark is open source. The idea is that you can run it on your own. Take
the chance and form your own impression.

------
ahs1200
Wow, that looks like a lot of work. Nice effort.

------
ChicagoDave
Why is Neo4j in this comparison at all? It's a graph database and its use-case
is very specifically different than relational or document databases.

Please compare apples to apples.

~~~
kbd
ArangoDB is also a graph database. Many of the benchmarks (shortest path,
neighbors, etc.) are specifically geared towards graph databases. Neo4j is
totally appropriate in this comparison.

------
allandubey
Much needed! Thank You

------
albertgoeswoof
TLDR, just use postgres

~~~
noah-kun
That’s not the conclusion of this at all...

~~~
pabl0rg
In most benchmarks, Postgres with tabular data got the best results. If you
need to store JSON or do graph queries and don't want to create indices
yourself, then consider Arango.

~~~
michaelbuckbee
And even then, should you be storing JSON? Is putting GraphQL into your
datastore the right thing?

~~~
gmueckl
That depends. If you know that your data contains a highly irregular, non-
exhaustive list of extra attributes that have no relations to other parts of
the data model, a JSON column might be the right choice. It's rarely the best
choice, but I can totally see applications where it is.
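
A minimal sketch of that pattern with node-postgres - table and field names
invented for the example:

    const { Client } = require('pg');

    const DDL = `
      CREATE TABLE IF NOT EXISTS devices (
        id     serial PRIMARY KEY,
        name   text NOT NULL,
        extras jsonb NOT NULL DEFAULT '{}'  -- irregular attributes live here
      )`;

    async function main() {
      const client = new Client();
      await client.connect();
      await client.query(DDL);
      // node-postgres serializes a plain object to JSON for a jsonb column.
      await client.query(
        'INSERT INTO devices (name, extras) VALUES ($1, $2)',
        ['sensor-7', { firmware: '1.4.2', calibrated: true }]
      );
      // ->> extracts a jsonb field as text, so it can be filtered on.
      const res = await client.query(
        "SELECT name FROM devices WHERE extras ->> 'firmware' = '1.4.2'"
      );
      console.log(res.rows);
      await client.end();
    }

    main().catch(console.error);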

------
maxpert
Hahahaha this makes me use PostgreSQL even more :D

