
MongoDB Responds to PostgreSQL Benchmarks - crcsmnky
https://www.mongodb.com/blog/post/benchmarking-do-it-right-or-dont-do-it-at-all
======
javiermaestro
As others have pointed out in another HN discussion (1), MongoDB's reply is
definitely questionable, if only for its tone.

I already replied to metheus on Twitter (2) in a thread where we asked for a
way to repro their claims. I found their reply and comments very
inappropriate, similar to the comment here: arrogant and derogatory to
OnGres.

Anyway, I was writing this to note that OnGres has replied to Mongo's reply,
setting an example of how tech discussions should happen: without derogatory
and arrogant comments, with transparency, and open to valid criticism (i.e.
criticism backed by something more than words and numbers that cannot be
reproduced).

Check it out:
[https://ongres.com/blog/benchmarking-do-it-with-transparency...](https://ongres.com/blog/benchmarking-do-it-with-transparency/)

In there you'll see how Mongo consistently misinterpreted (or
misrepresented?) the results. They kept mixing up the benchmarks and
constantly talked about an experimental driver and missing connection pooling.
In fact, OnGres used the official Mongo Lua driver _and the official Java
driver_ for different benchmarks, ran some of the benchmarks _with and
without_ connection pooling, and published both results.

It's really sad to see Mongo reply like this to a thorough benchmark. The
benchmark probably has its flaws, but instead of correcting them or publishing
a better benchmark, like the one they ran (to magically get 240x...), they
chose to mischaracterize the work of others, spreading FUD and accusing them
of cheating and being dishonest.

Hopefully they'll turn around and fix it. All it takes is to publish how they
got their amazing numbers so that others can comment on, repro or dispute the
benchmark.

(1)
[https://news.ycombinator.com/item?id=20479670](https://news.ycombinator.com/item?id=20479670)

(2)
[https://twitter.com/javiermaestro/status/1151849279226556417](https://twitter.com/javiermaestro/status/1151849279226556417)

------
feike
I was at the presentation last Thursday, they (OnGres) have fully open sourced
both their methodology and their results and had a pretty strict divide
between teams designing the benchmarks and teams running the benchmarks.

MongoDB could create a Pull Request/Merge Request against that repository so
we can all judge those results ourselves; their current response is only words
and a single table showing unlikely results.

I do think the criticism of not tuning MongoDB is valid; however, their
response is dishonest:

> with their own heavily tuned PostgreSQL.

This was explicitly not the case according to OnGres other than the
established norms of taking 25% memory for `shared_buffers` etc. No other
tuning that is normally done for big clusters was done.
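
For readers unfamiliar with the rule of thumb mentioned above, here is a minimal sketch of the arithmetic behind "25% memory for `shared_buffers`". The function name and the 64 GiB example are hypothetical illustrations; this is not taken from the OnGres repositories and says nothing about the values they actually used.

```python
def shared_buffers_setting(total_ram_bytes: int, fraction: float = 0.25) -> str:
    """Return a postgresql.conf line giving `fraction` of RAM to shared_buffers.

    The 0.25 default mirrors the commonly cited starting point; it is an
    assumption for illustration, not a measured optimum.
    """
    if total_ram_bytes <= 0:
        raise ValueError("total_ram_bytes must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # PostgreSQL accepts memory units such as MB in postgresql.conf.
    megabytes = int(total_ram_bytes * fraction) // (1024 * 1024)
    return f"shared_buffers = {megabytes}MB"


if __name__ == "__main__":
    # Hypothetical host with 64 GiB of RAM.
    print(shared_buffers_setting(64 * 1024**3))
```

Anything beyond this kind of baseline (work memory, checkpoint behavior, and so on) is the "tuning normally done for big clusters" that the comment says was left out.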

[https://gitlab.com/ongresinc/benchplatform/](https://gitlab.com/ongresinc/benchplatform/)
[https://gitlab.com/ongresinc/txbenchmark](https://gitlab.com/ongresinc/txbenchmark)

~~~
metheus
Hi, I work at MongoDB, and I'm here to elaborate in answer to your comment.

> I was at the presentation last Thursday, they (OnGres) have fully open
> sourced both their methodology and their results and had a pretty strict
> divide between teams designing the benchmarks and teams running the
> benchmarks.

> MongoDB could create a Pull Request/Merge Request against that repository so
> we can all judge those results ourselves

The existing, unaltered content of the OnGres repo is all the testimony one
needs to know that the OnGres team is incapable of or unwilling to produce a
valid test of MongoDB. Open source garbage is still garbage.

I understand the allure of asking for a pull request from our testing team to
demonstrate how we obtained the measurements we cited in our retort. It is
tempting to see this as a case of well-intentioned scientists, doing their
best, honestly asking for peer review. But that view relies on two things that
we cannot take for granted: 1) that the OnGres team is acting in good faith
and will work to correct their errors, fairly declaring MongoDB more
performant if they concur with our results; and 2) that such an open back-and-
forth will be illuminating to bystanders.

1) We cannot assume that OnGres is acting in good faith when their report so
clearly demonstrates that they biased the test against MongoDB. This
conversation should start and end with the fact that OnGres used an
experimental MongoDB driver to compare against PostgreSQL with a production
driver and a dedicated connection pooler in front of it. (What kind of pull
request could MongoDB submit to address the use of sysbench, which requires a
Lua driver?) They are simply not credible.

2) What would a MongoDB-submitted patch prove? It would certainly print out
different numbers, but that alone proves nothing. For those numbers to mean
anything, you have to read and understand the code. Anyone capable of
understanding why our patch is valid is equally capable of seeing the deep
flaws in the code as published, no patch required.

Consider this: if a research group funded by the fossil fuel industry
published a report, littered with false statements and methodological errors,
claiming that climate change isn't happening, NASA and NOAA aren't obligated
to issue a full correction of that report along with their response calling
shenanigans.

No, we're not going to get mired in a patch war with demonstrably biased
authors over a fundamentally flawed comparison methodology. We have published
our own benchmarks demonstrating how to test MongoDB performance, and in a few
months, one of our engineers will present her work adapting the industry-
standard TPC-C at the VLDB conference.

> their current response is only words and a single table showing unlikely
> results.

There is nothing unlikely about our obtaining speedups to queries by using
indexes that OnGres ignored.
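
As a generic illustration of why an index can change query cost so much, here is a toy, in-memory sketch. It is not MongoDB's or OnGres's benchmark code, all names are hypothetical, and it makes no claim about the size of the speedup in the disputed results. It contrasts a full scan with a lookup through a sorted index, which stands in for a B-tree.

```python
import bisect


def scan_lookup(records, key):
    """Full scan: examine records one by one; return (record, examined_count)."""
    examined = 0
    for rec in records:
        examined += 1
        if rec["id"] == key:
            return rec, examined
    return None, examined


def build_index(records):
    """Build a sorted list of (id, position) pairs as a stand-in for a B-tree."""
    return sorted((rec["id"], pos) for pos, rec in enumerate(records))


def index_lookup(records, index, key):
    """Binary-search the index, then fetch the matching record directly."""
    # (key, -1) sorts before every real (key, position) entry.
    i = bisect.bisect_left(index, (key, -1))
    if i < len(index) and index[i][0] == key:
        return records[index[i][1]]
    return None


if __name__ == "__main__":
    # Records stored in descending id order, so id 0 is the last one scanned.
    records = [{"id": i, "v": i * i} for i in reversed(range(1000))]
    index = build_index(records)
    rec, examined = scan_lookup(records, 0)
    print(rec, "found after examining", examined, "records")
    print(index_lookup(records, index, 0), "found via the index")
```

The scan touches a number of records proportional to the collection size, while the indexed lookup needs only a logarithmic number of comparisons. Whether that explains the numbers in either report can only be settled by reproducible runs.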

> I do think the criticism of not tuning MongoDB is valid; however, their
> response is dishonest:

>> with their own heavily tuned PostgreSQL.

> This was explicitly not the case according to OnGres other than the
> established norms of taking 25% memory for `shared_buffers` etc. No other
> tuning that is normally done for big clusters was done.

I'm very comfortable using the phrase "heavily tuned" when OnGres used
"established norms" for PostgreSQL and ignored the existence of those (clearly
documented) norms for MongoDB, while falsely claiming in their report that
MongoDB does not require tuning.

