Hacker News

Back in 2012 I worked at a startup where we ran several backend data stores (Vertica, Postgres, Redis, Bigtable, MySQL, Cassandra, Elasticsearch, Hadoop).

When 10gen came along, they sold us an amazing vision. We were processing 500 million to a billion social media messages a day, and they sold us on this dream of very fast writes and a cluster of 9 machines in a master-slave-slave configuration for failover. We wouldn't have to write SQL anymore (it was actually funny to watch one of our engineers try to unlearn table denormalization).

At the end of the day, it was hype. I stayed up through more than one of the 2012 debates, five hours at a time, flipping the Mongo databases because they kept crashing. The shard key they set up for us was slightly unbalanced, so the data distribution was 30-30-40, and every so often the server handling 40% of our data would spike to 50% and get knocked offline, leaving the other two to absorb its share on top of their own... which knocked them offline too. There were also tons of replication problems, all traced back to write locking. 10gen solved one problem, but our company traded that solution for other problems.
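The cascade pattern described above can be sketched as a toy simulation. The capacity threshold and the 25-25-50 spike numbers here are illustrative assumptions, not figures from the original incident:

```python
# Toy simulation of cascading shard failure: assume each server falls
# over once it carries more than 45% of total traffic, and its load is
# then split evenly among the survivors.

def cascade(shares, capacity=0.45):
    """Return the shards still standing after all overloads cascade."""
    shares = dict(shares)
    while True:
        over = [s for s, load in shares.items() if load > capacity]
        if not over:
            return shares
        victim = over[0]
        extra = shares.pop(victim)   # victim goes offline
        if not shares:
            return {}                # everyone is down
        for s in shares:             # survivors split the victim's load
            shares[s] += extra / len(shares)

# A 30-30-40 split is stable, but a spike to 25-25-50 takes the whole
# cluster down one server at a time.
print(cascade({"a": 0.30, "b": 0.30, "c": 0.40}))  # all three survive
print(cascade({"a": 0.25, "b": 0.25, "c": 0.50}))  # -> {} (total loss)
```

The point is that once one overloaded shard dies, the survivors inherit its traffic and follow it down, which matches the behavior described above.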

That experience taught me that you really need to understand what you're trying to solve before picking a database. Mongo is great for some things and terrible for others. Knowing what I know now, I would have probably chosen Kafka.

"which was actually funny to watch one of our engineers try to unlearn table ~~denormalization~~ normalization"

Going from SQL to NoSQL means denormalizing data, not normalizing it, so the engineer would have been unlearning table normalization.
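For anyone unfamiliar with the distinction, here is a minimal sketch (hypothetical schema, plain Python dicts standing in for tables and documents) of the same data in both shapes:

```python
# Normalized (relational style): each fact lives in exactly one place,
# and rows reference each other by key.
users = {1: {"name": "alice"}}
messages = [
    {"id": 100, "user_id": 1, "text": "hello"},
    {"id": 101, "user_id": 1, "text": "world"},
]

# Denormalized (document style, as in MongoDB): related data is
# embedded in each document, duplicating the user info per message.
message_docs = [
    {"id": 100, "user": {"name": "alice"}, "text": "hello"},
    {"id": 101, "user": {"name": "alice"}, "text": "world"},
]

# The denormalized view is just the normalized tables pre-joined:
joined = [
    {"id": m["id"], "user": users[m["user_id"]], "text": m["text"]}
    for m in messages
]
assert joined == message_docs
```

Going from SQL to NoSQL means learning to store the pre-joined shape up front, which is the habit the engineer had to build.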

Ah yes, you're correct!

I think the worst part about MongoDB in that timeframe was the poor CPU utilization, since it was effectively single-threaded.

In order to leverage the multiple CPUs in a box, you'd need to run multiple instances of mongod on the same machine, and we designed a whole system around sandboxing instances.

But then you'd run into memory-allocation issues, and it handled indexing and running out of memory badly, which mostly involved crashing.

We ended up having to run about 100 bare-metal servers to support a query load that probably 10 PostgreSQL boxes could have handled easily.

Truly a terrible DB.

You do realize it's 2017 now, right? Many things have changed in MongoDB (and in the IT world in general) during the last 5 years. MongoDB is a mature, established, full-featured database - recognized by Gartner as a "challenger" to more established RDBMS vendors like Oracle and Microsoft.

I'm sure it's better today. I'm talking about my experience back then. It may be a challenger, but it still has tradeoffs that system architects need to factor in when making database decisions.

Still no transaction control. It's not going anywhere any time soon.

But... Kafka isn't a database?...

No, it's not. We were trying to wrap our own system around Mongo to do the job of Kafka, which is why I said that.
