
The Marketing Behind MongoDB - nemild
https://www.nemil.com/mongo/3.html
======
martinald
100% of my friends who have used Mongo/similar NoSQL have given up and had a
nasty rewrite back to pgSQL.

This seems to be the journey:

1\. Lack of migrations is awesome! We can iterate so quickly for MVP

2\. Get users

3\. Add features, still enjoying the speed of iteration

4\. Get more users

5\. Start building reporting features for enterprise/customer support/product
metrics (ie: when the real potential success starts)

6\. Realise you desperately need joins, transactions and other SQL features

7\. Pause product dev for 1-3+ months to migrate back to SQL, or do some weird
parallel development process to move it back piecemeal.

I think the most interesting question though is: would they have been able to
get the MVP and initial customers that set all this off if they were moving
(slightly) slower due to SQL and the slight overhead that comes with it?

My thought is definitely yes.

~~~
brandur
> _I think the most interesting question though is: would they have been able
> to get the MVP and initial customers that set all this off if they were
> moving (slightly) slower due to SQL and the slight overhead that comes with
> it?_

I've used Postgres and Mongo pretty extensively, and for any reasonably
seasoned developer, the startup overhead of an SQL system is a myth. There may
be an upfront cost to learning how an RDBMS and SQL work in the first place,
but once you're familiar with them, they'll be faster than Mongo on any new
project.

The schemaless concept of a document database seems to be the major selling
point for development velocity, but once you've got a good handle on a
migration framework in the vein of ActiveRecord or other popular software,
that's negated completely. It also really doesn't take long before schemaless
starts to cause big problems for you in terms of data consistency -- it's not
just the big players that get bitten by this.

The simplified query language is another one. SQL is a little bit obtuse, but
it's not that bad once you have a handle on it, and a lot of people are
familiar with it. Once you add in an ORM layer, the lazy-style access of a
framework like Sequel or SQLAlchemy makes the developer experience quite a bit
better than any Mongo APIs that I've seen. Also, after you get beyond trivial
usage, SQL's flexibility so wildly outstrips Mongo's query documents that it's
not even worth talking about.

Postgres on the other hand ships with a great management CLI, a very powerful
REPL (psql), and features like data types/constraints/transactions that
guarantee you correctness with zero effort on your part. I can only speak for
myself, but I'd take Postgres to the hackathon any day of the week.
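The zero-effort-correctness point can be made concrete. Here is a runnable sketch of declared constraints doing the work, using Python's bundled sqlite3 purely because it needs no server (the table and column names are made up; Postgres enforces the same ideas, with stricter typing on top):

```python
import sqlite3

# Declared constraints let the database reject bad data with no
# application-side effort. Hypothetical "users" table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE,
        age   INTEGER CHECK (age >= 0)
    )
""")
conn.execute("INSERT INTO users (email, age) VALUES ('a@example.com', 30)")

rejected = False
try:
    # Violates the CHECK constraint: the database refuses the row.
    conn.execute("INSERT INTO users (email, age) VALUES ('b@example.com', -5)")
except sqlite3.IntegrityError:
    rejected = True

print("bad row rejected:", rejected)  # -> bad row rejected: True
```

A schemaless store would happily accept the second row, and the inconsistency would surface much later, in application code.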

~~~
rmrfrmrf
> the startup overhead of an SQL system is a myth.

It's also a myth that devs choose NoSQL over SQL because "SQL is too hard."

~~~
kafkaesq
_It's also a myth that devs choose NoSQL over SQL because "SQL is too hard."_

What I see more of is situations where the developers "know" SQL but still
just don't want to sit down and figure out their data models.

It's like, the temptation to reach for the nearest short-cut -- whether that
means over-reliance on ORMs (or hodgepodge 'data access layers'); or
continually munging stuff at the application layer for nearly every operation;
or... Mongo -- is always just too great.

~~~
bryanrasmussen
Figuring out the data model as you build can be beneficial: you realize what
you need as you work with the thing and can add it on the fly. At a certain
point, of course, you need to do a cleanup. Getting people to accept the
cleanup requirement is the difficult part.

------
nemild
(Author here) You can read parts 1 and 2 of the three part series:

\- Part 1, Why Did So Many Startups Choose MongoDB:
[https://www.nemil.com/mongo/1.html](https://www.nemil.com/mongo/1.html)

\- Part 2, Startup Engineers and Our Mistakes with MongoDB:
[https://www.nemil.com/mongo/2.html](https://www.nemil.com/mongo/2.html)

You can see most of the notes from my interview with MongoDB's CTO, Eliot,
here:

[https://news.ycombinator.com/item?id=14804765](https://news.ycombinator.com/item?id=14804765)

And the interview notes related to MongoDB's marketing are somewhere in this
HN post:

[https://news.ycombinator.com/item?id=15124316](https://news.ycombinator.com/item?id=15124316)

~~~
primitivesuave
I really enjoy your writing style and how you wrote about the "story of
humans" rather than writing YAMTT (Yet Another MongoDB Trash Talk).

I'm actually a lot more comfortable now using MongoDB in production after
understanding its level of maturity and what the right application is. For the
last couple years I was scared off by all the negative HN comments on MongoDB
articles.

~~~
bulldoa
So what is the right application for MongoDB? Only when you don't need ACID?
Can you give some examples?

Genuinely curious.

~~~
mandevil
We used Couchbase (a NoSQL database like MongoDB) for email batch sending
(think invitations to a party) and it worked very well. We could store each
email sent as a separate, denormalized document, so the sender could see
EXACTLY how their contact data was replaced in each individual instance. The
"View as a Web Page" functionality was trivial (instead of recalculating
everything from the normalized forms, which can be blown up by contact data
changes, you just load the document that was sent out), and its lovely TTL
feature meant we could handle the configurable retention policies trivially as
well.

It wasn't so good at doing reports (how many customers viewed the email,
responded, etc.). One thing we talked about doing was just storing a SQLite or
H2 database as a document for reporting purposes (if we had been more single-
threaded that could have worked nicely). We ended up using a separate SQL DB
for that.

There are cases where denormalized data is the "right" way to view stuff, and
cases where the data really is easier to work with if it's normalized, and
that is a good reason to push your DB selection one way or another.
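The denormalized-document-plus-TTL idea can be sketched in a few lines of Python. The field names and the retention period here are hypothetical, not from the comment; a document store's TTL feature would delete the document once `expires_at` passes:

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 30  # hypothetical configurable retention policy

def build_sent_email_doc(template: str, contact: dict) -> dict:
    """Freeze the merge result at send time, so 'View as a Web Page'
    just loads this document instead of re-merging contact data."""
    rendered = template.format(**contact)
    now = datetime.now(timezone.utc)
    return {
        "contact_id": contact["id"],
        "rendered_body": rendered,
        "sent_at": now.isoformat(),
        # A TTL index on this field would expire the document automatically.
        "expires_at": (now + timedelta(days=RETENTION_DAYS)).isoformat(),
    }

doc = build_sent_email_doc(
    "Hi {name}, you are invited to the party!",
    {"id": 42, "name": "Ada"},
)
print(doc["rendered_body"])  # -> Hi Ada, you are invited to the party!
```

Later changes to the contact record cannot retroactively alter what was sent, which is exactly the property the comment describes.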

~~~
lilbobbytables
Interesting. Seems like that would just be one part of a larger application,
though. And for that, my mind just jumps to something like Postgres with
`jsonb` fields to store it all denormalized, then using columns to store
relations, like the contact it was sent to. Along with other tables for other
parts of the application, of course.

This way you aren't complicating your stack by adding more services sooner.

~~~
mandevil
We were replacing a fully SQL email engine (that was starting to fall over due
to load) with this more hybrid approach. We had customers, and we knew that
the business case closed, but the load was starting to overpower the main
database, so we spun off separate databases and bought ourselves a little more
overhead by splitting out the normalized and denormalized data. Could well
have been a mistake, but we weren't thrilled with Postgres's ability to scale
horizontally, so we went to CB so we could scale a bit more. (As I recall, we
were doing 7M emails a day, and our goal was to support up to 70M with that
structure.)

------
kureikain
I used MongoDB in production. About six years ago, I had a bad experience with
it and abandoned it.

In a new job recently, I picked it up again and, to my surprise, it performs
super well and has lots of new concepts.

1\. TTL index: data is automatically removed after a certain time

2\. Hidden replicas: we can do whatever we want on such a node without slowing
down production

3\. Very easy to use oplog: it's super easy to get access to the oplog, just
as when you work with a normal collection

4\. The Aggregation Framework is awesome: it's tedious to write at first;
however, it becomes very clear with the pipeline design

5\. JOIN: I was surprised when I discovered this, but it does have a similar
concept to a join now with `$lookup`

6\. Very low storage: if you use MongoDB as a log/time-series/event database,
you will appreciate WiredTiger

7\. Metric exposing is awesome: lots of useful metrics

8\. The cluster is much easier to manage nowadays and very stable: adding a
new node is just a matter of booting up a server, and everything is handled
automatically. Think of MySQL, where you have to export the data and get the
bin log position and file name to configure the slave.

One thing that remains the same is that we still have to deal with migrations
and backfilling data.

Careful planning and database design are still requirements; you can't just
dump whatever you want into it.
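Conceptually, `$lookup` is a left outer join between two collections. A pure-Python sketch of what it computes (collection and field names here are made up for the example):

```python
# Hypothetical "orders" and "customers" collections as lists of dicts.
orders = [
    {"_id": 1, "item": "book", "customer_id": 10},
    {"_id": 2, "item": "pen", "customer_id": 99},  # no matching customer
]
customers = [{"_id": 10, "name": "Ada"}]

def lookup(local, foreign, local_field, foreign_field, as_field):
    """Left outer join: each local doc gets an array of matching
    foreign docs under as_field, like MongoDB's $lookup stage."""
    by_key = {}
    for doc in foreign:
        by_key.setdefault(doc[foreign_field], []).append(doc)
    return [
        {**doc, as_field: by_key.get(doc[local_field], [])}
        for doc in local
    ]

joined = lookup(orders, customers, "customer_id", "_id", "customer")
print(joined[0]["customer"])  # -> [{'_id': 10, 'name': 'Ada'}]
print(joined[1]["customer"])  # -> []
```

As with `$lookup`, unmatched documents survive the join with an empty array rather than being dropped.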

------
jerf
"but need to spend time debating how to protect ourselves from marketing
“attacks”."

I don't have a complete solution to this, but I know I've got two for you:

1\. Beware any solution where the same thing is demo'ed every time. It's a
sign that either nobody is actually using it, or it's over-tied to the
specific thing being demo'ed.

2\. Any time you see a solution that massively outperforms some existing and
well-developed solution, you must always ask what was dropped from the
existing solution to get that speed, on the assumption that the well-developed
solution is probably already pretty optimized to be whatever it is. It isn't
necessarily bad to drop things, heck 90% of "cloud" technology basically
consists of "dropping the things that don't work well in a distributed
environment", but you need to _know_ what those things are.

#2 in particular would have saved a lot of people in the context of MongoDB.
It doesn't mean Mongo is a bad choice for everything, or at least, not
anymore, but you need to _know_ what you're giving up to get there. (And in
particular I'd suggest "We can use this technology without having to have
discipline!" is... double-edged at the very least. You may not want a bondage-
and-discipline tech at a startup, but tools that offer some gentle-but-solid
defaults and guidance may help you focus your cognitive energies less on
establishing all the rules from scratch and more on whatever your actual
problem is.)

------
jasondc
There is a common perception in the developer community that "if you build it
they will come", which is usually not the case. There's a surprising amount of
marketing behind every successful company.

~~~
meritt
Absolutely. RethinkDB was technically a far superior product but lacked the
marketing and fanboyism that MongoDB orchestrated so effectively.

Today, Docker is following MongoDB's playbook.

HN Hiring Trends of MongoDB vs. Docker:
[http://i.imgur.com/tYKzmaG.png](http://i.imgur.com/tYKzmaG.png)

~~~
TheMissingPiece
tbh, I don't disagree with you here. RethinkDB invested in our open source
community more than in marketing... it's arguably part of why we didn't
succeed as a business.

------
goodroot
Great write up! That was informative and well-written.

As someone who writes content aimed at developers, I'm concerned about the
generalized sentiment this creates towards technical content marketing. I
stopped coding years ago and started trying to lead teams and figure out how
to make businesses grow; quality marketing is a way to do that. In the
HackerNews bubble, that's viewed as a step 'down the ladder'.

I think it's important to make a distinction: is the issue that they marketed
well, or that their product didn't match their marketing? It's the latter --
any company that hopes to succeed in the uber-competitive, fast-moving
technical world needs marketing.

If we're making a fancy tool and I write a quality guide demonstrating how you
can do "us" 'the ol' fashioned way' in an effort to demonstrate our value, is
that wrong? How else am I to do it? You simply won't come if I don't tantalize
you.

DigitalOcean had a brilliant content strategy: incentivize users to write
tutorials that would bring people to try those tutorials out on their
droplets. I hope we don't dismiss technical marketing as evil. Technical
marketing can be valuable and be written with care and integrity.

------
rdtsc
Reminder: they shipped with the defaults which basically would throw the data
over the wall and never acknowledge it back to the user (because speed and
such). Interestingly enough it still called itself a database, and that right
there is the power of marketing.

~~~
nemild
My own research doesn't back up the idea that this "unsafe write" default was
chosen by 10gen for benchmarking reasons; rather, it was due to an early
expected use case (from a footnote in part 1 of my series):

Waiting for writes was off by default:

> This "unchecked" type of write is just supposed to be for stuff like
> analytics or sensor data, when you're getting a zillion a second and don't
> really care [if] some get lost [or] if the server crashes.

This was rarely the use case at most startups - even though the defaults were
based on it for a long time.

It’s unclear to me how long this was the top Google result, and it was earlier
in Mongo’s life (2009). This may be one of the benchmarks that led to anger at
competing NoSQL vendors - and seems to me like a mistake rather than a
malevolent effort.

[https://www.nemil.com/mongo/1.html#fn2](https://www.nemil.com/mongo/1.html#fn2)

_____________________

MongoDB's CTO has also mentioned that if he could go back and change anything
it would be this early default, as his earliest customers valued it - but he
quickly realized that it caused issues for others.

(Throughout this series, my goal has been to be tough, but fair to both sides)

~~~
wmf
The real question is how many years they left the unsafe default after people
pointed out that it was a problem.

~~~
nemild
It took a little while (say 1.5ish years from when they realized it was causing
issues). According to MongoDB's CTO, this was because they couldn't quickly
move their early users from the behavior they were used to.

------
dblock
Just to counterbalance the amount of negative sentiment here: we've used
MongoDB for Artsy.net since 2010, following a recommendation from Foursquare's
CTO (and others). Eliot has also been very generous with his time. MongoDB
enabled us to iterate fast and continues to serve us well.

~~~
infecto
I would love to know what MongoDB offered you over MySQL/Postgres.

I always hear you can iterate fast, but for most use cases I cannot see how
MongoDB is quicker to iterate with than a relational database. Would love to
learn, though.

~~~
bjt
I can't speak for the parent, but there was a time around 2011 when I was
pretty excited that MongoDB would let me just start coding without needing to
1) configure Postgres's listening port and pg_hba.conf to allow easy local
access, or 2) set up a system for running migration scripts just to churn
through a dozen or so revisions of the schema.

I really think a lot of Mongo's success came from that "It just works"
experience the first time people tried it.

I later came to see it as a bad tradeoff. Open, unauthenticated access by
default created more costs in security than it saved in early prototyping.
Automating the creation of my dev environment made it not such a big deal to
get Postgres configured right when it booted. And once I'd gotten a SQL
migration system that I liked, I just kept re-using it. These things took some
time for me to develop and learn, though.

~~~
gaius
_configure postgres's listening port and pg_hba.conf to allow easy local
access_

That is literally 30 seconds of work!

~~~
bjt
Only with the benefit of hindsight, which isn't helpful to the newcomer.

To someone new who is just trying to work through some web framework tutorial
or hack out a half-formed proof of concept, it can be several minutes (or much
longer) of trying, failing, copying error messages into Google, and following
some guide written for Debian but failing to get it working because you're
running Red Hat (or vice versa). When you stack up enough similar problems
getting the rest of the stack to work (e.g. configuring Apache or nginx), it
stops a lot of new people in their tracks. When they're given something that
just works, it's a godsend.

~~~
Karrot_Kream
How is someone like that qualified to work as a software engineer? These were
things I did when I was 13 on our old family Pentium, _not_ something I would
want a production engineer to do.

~~~
gaius
_How is someone like that qualified to work as a software engineer?_

If software engineering was like civil engineering then using MongoDB would be
like building a bridge out of noodles.

------
thinbeige
Heavy user of Postgres and Mongo here; I've used both in production for years.
I don't think that there are good or bad databases. Every database has a
different focus and thus respective trade-offs.

You know this feeling when you have an idea and want to create a quick
prototype to test it? By quick I mean you want to have the thing running in a
few minutes without any effort. This is possible with both Mongo and Postgres.

But with Mongo you don't even need to create databases or collections (tables)
beforehand. You just write to the DB and everything will be created if it
isn't there yet. This small thing lets the code stay super simple, and no
migrations are required. With Docker and the official Mongo image you can set
up a web server with Mongo with authentication in seconds. Accessing the DB
without an ORM is still super comfortable, since everything is JSON (in case
you work with Node and the Mongo native driver).

So, one of the biggest reasons for me is that I actually _start_, rather than
finding an excuse not to build the prototype. I am lazy, and most of the time
for creating a prototype is spent setting up the development environment. This
is also why people like JSFiddle and Codepen so much: you just start coding.

It's great for prototyping, and most of my prototypes stay prototypes. The few
which got successful were migrated to a more mature architecture. Sometimes
they employ Postgres, sometimes Cassandra, but often enough they stay with
Mongo, with validators and so on added. All successful projects I started
began quick and dirty and slowly evolved into a mature architecture. The best
projects were those where I had a prototype running in less than two hours.

It's really the use case. I think disliking a technology just limits your
options.

~~~
msla
Limiting my options is a good reason for becoming more experienced. The more
of my options I limit, and the faster I limit them, the faster I can make
things, as opposed to exploring options.

It's why beginners' guides are opinionated: The beginner has no opinions of
their own, so they must be given the opinions of someone who has them so they
can make progress instead of being overwhelmed by choice. Making progress
involves replacing your teachers' opinions with your own, so you can limit
your options to a manageable set instead of them being limited for you, but at
no point are you truly open to looking at every single thing equally. It would
take too much time for very little reward.

------
nemo44x
I've seen the best success in the NoSQL world when you treat data like a
history book rather than a moment in time.

What I mean by this is that creating entities, mutating them, and trying to
maintain a state is the natural pattern to assume, but it is often wrong. It
leads to many of the issues reported here and elsewhere, and to people
declaring "such and such DB sucks". This is a SQL approach to using data;
NoSQL generally fails badly at it and should not be used to store "state",
since state often requires ACID and relationships.

Treating data as a history book, or a log of events over time, is a safer way
to use NoSQL as an authoritative source. It plays well with these databases'
ability to scale simply and naturally, and event data has no relations.
Everything in a single event is immutable, as it is a fact of occurrence.
Invalidating a past event is as simple as generating a later event that says
the earlier event is invalid from this point in time.

From this you can generate "entities" which describe a relationship between
your events. From your log data you can generate a session, for instance.
These are not authoritative sources and if found to be invalid you recalculate
the entity, or state, from the collection of events.

I could create an entire social network profile for a person by aggregating
and querying the history of events that make up their current state. And the
real power is I can generate their state at precisely any point in time in the
past and possibly predict what their state will be in the future based on
their past stream of events and derivative states. If a past event was found
to be invalid, a new event declares this and with a greater timestamp
authorizes that fact and I can recalculate the entity that describes their
"current state".

This takes work and thought, though, and one of the promises you've seen from
some NoSQL vendors is that they save you work. The work is always there, but
you need to get comfortable thinking of data in a different way, and not all
data needs this approach. But, in many cases, NoSQL is not trash.
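The "history book" idea above can be sketched in a few lines: state is derived by folding an immutable, timestamped event log, and a later event invalidates an earlier one instead of mutating it. The event shapes and field names here are hypothetical, for illustration only:

```python
# An append-only log of immutable, timestamped events.
events = [
    {"ts": 1, "id": "e1", "type": "set", "key": "city", "value": "Paris"},
    {"ts": 2, "id": "e2", "type": "set", "key": "job", "value": "engineer"},
    # Instead of mutating e2, a later event declares it invalid.
    {"ts": 3, "id": "e3", "type": "invalidate", "target": "e2"},
]

def state_at(events, as_of):
    """Derive the entity's state at any point in time by replaying the log."""
    log = sorted((e for e in events if e["ts"] <= as_of), key=lambda e: e["ts"])
    invalid = {e["target"] for e in log if e["type"] == "invalidate"}
    state = {}
    for e in log:
        if e["type"] == "set" and e["id"] not in invalid:
            state[e["key"]] = e["value"]
    return state

print(state_at(events, as_of=2))  # -> {'city': 'Paris', 'job': 'engineer'}
print(state_at(events, as_of=3))  # -> {'city': 'Paris'}
```

Because the log is the authoritative source, the derived state can be recomputed for any past moment, which is the time-travel property the comment describes.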

~~~
chrisco255
I love event sourcing as a pattern. ES has its own overhead, but it is an
intriguing pattern that works well for many use cases.

~~~
nemo44x
It really does and can be extrapolated to many, many use cases.

To me, this is the real power of document data stores. It's not for everything
such as configuration or business rule data. But for a lot of generated data
in the form of events it's a useful way to organize data.

------
gerhardi
Happy resident of a MongoDB environment here! In one of my projects, we are
collecting an IoT "clickstream" with slightly varying contents, around 20
million inserts per day. These are then transformed into "sessions" that are
inserted into another collection. Everything works as it should.

Reporting is based on PowerBI and aggregations are scheduled through cron jobs
to refresh the data sources that PowerBI uses. Realtime metrics are not
needed, but could easily be implemented through the "session" generator by
calling PowerBI Streaming API in parallel when writing a document.

~~~
nemo44x
I think what you're describing here is something that is often overlooked. You
are treating the same data in two ways, events and entities, and there's an
important distinction.

The events come in and you store them. They are immutable descriptions with a
timestamp, which provides sequence. There is no relation between any two
events.

Then you keep another collection of entities which you call "sessions". These
comb through the event data and combine them into sessions which can answer
different questions.

The event data is the authoritative source, and because there are timestamps
the events are sequenced. This way you can generate sessions from them, and if
anything were to happen to make the session collection invalid, you can "roll
back" to a certain point in time and recalculate the sessions from there.

In the NoSQL world you often want to begin with "events". And then form
"entities" from the sequence of events that are useful to you. Because you've
stored every action you can always go back and recalculate and create new
entities you hadn't yet discovered.
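Deriving "session" entities from a raw event log can be sketched concretely. The 30-minute inactivity rule and the field names below are assumptions for illustration, not from the comments; the point is that sessions are recomputable from the authoritative event log:

```python
SESSION_GAP = 30 * 60  # assumed: seconds of inactivity that closes a session

def sessionize(events):
    """Group timestamped events ({'ts': unix_seconds, ...}, any order)
    into sessions separated by gaps longer than SESSION_GAP."""
    sessions, current = [], []
    for e in sorted(events, key=lambda e: e["ts"]):
        if current and e["ts"] - current[-1]["ts"] > SESSION_GAP:
            sessions.append(current)
            current = []
        current.append(e)
    if current:
        sessions.append(current)
    return sessions

events = [{"ts": 0}, {"ts": 100}, {"ts": 5000}, {"ts": 5100}]
print(len(sessionize(events)))  # -> 2
```

If the gap rule ever changes, or the session collection is corrupted, you simply rerun the generator over the event log from any point in time.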

~~~
gerhardi
Yes, this is exactly our use case. Every now and then a need appears for new
attributes or metrics that are to be calculated or otherwise logically
determined for the generated "sessions". Sometimes a new device model /
software version starts providing new attributes with the transmitted event,
and we can just add metrics and aggregations based on those attributes into
the session generator if those attributes are present on an event.

------
Kiro
I have a Node server that restarts once a day. On start, it reads the state of
the app (which is just one big JSON document) from MongoDB into a JavaScript
variable. When something changes, it updates the variable in the Node process
and fires an event to update the field in MongoDB. In other words, it only
ever reads from MongoDB on server start. This has worked very well so far, but
with all the negative comments I'm thinking there must be some kind of
drawback, or a reason I should switch to Postgres JSON instead.

One problem I see is that I need to overwrite the whole thing when saving with
Postgres, whereas you can easily just set one field in MongoDB. If I need to
overwrite it anyway, I may as well just save the whole state to a file
instead.

Forgot to add that it does the same for users. Users are one collection which
reads the document on login and saves the same way as the global state does on
every update. How would that work with Postgres?

~~~
jerf
While I have no particular stake in whether you rewrite, presumably in the
Postgres rewrite, you'd break the monolithic JSON file up to be one row per
key you write.

But that on its own would hardly be a reason to change.

"Forgot to add that it does the same for users. Users are one collection which
reads the document on login and saves the same way as the global state does on
every update. How would that work with Postgres?"

Well, that boils down to "how do relational databases work?", which is not
really a suitable HN comment topic. Relational databases are designed
fundamentally differently.

It doesn't matter as much as long as you stay under a scaling limit where
reading and writing the entire state all the time isn't a problem for you, and
there's no need to do any sort of query like "Tell me the last login time for
all users in a fraction of a second, please". What is probably the best thing
for you to do is A: budget some time to read about and maybe even do some
playing with relational databases so that B: when you encounter a problem for
which they are the best solution, you realize that you've done so, and can
take appropriate action. "Appropriate action" doesn't always mean "rewrite the
entire website in Postgres"; sometimes it can be "denormalize the data into a
Postgres db, run the queries I need, and throw the database away", which can,
in some circumstances, still be faster than trying to convince a NoSQL to do
queries it really doesn't want to do. Especially on a production database
that's busy trying to do the real work.

The idea here is to make sure you don't get sucked into the tarpit of
accidentally trying to replicate relational database features at your
application level, which starts so beguilingly small and grows to consume your
entire development budget if you start down that road accidentally and don't
turn away soon enough, rather than to do things the "right" way for the
abstract sake of doing it right.

Oh, and here's a tip that can save your job and/or career: If you _ever_ find
yourself in a position where you're trying to implement "transactions" in your
application code, _immediately stop_ and go get an appropriate database on the
backend. The horrifying thing about this case isn't that you'll fail... it's
that you will _totally_ produce a solution that works perfectly in QA, and
fails miserably with data destruction and angry (if not actively litigious)
customers in production, and the road of "temporary fixes that seem to fix the
problem but actually make it worse" is basically never ending. Learn what this
looks like so you know it when you see it!

~~~
Kiro
Thank you for your very detailed and good response. I actually have a lot of
experience with relational databases, much more than with NoSQL.

The thing with the users collection is that I need to be able to add any
arbitrary attribute to the user document, so I can't really use a table with a
schema. I guess I could use a related key-value table for those, but the
attributes can be anything, nested objects and arrays. Maybe serializing the
value would work, but it still seems that Mongo is a better fit in that case.

I also like seeing the user document as a big JSON without joins when doing DB
admin stuff.

~~~
6nf
> I need to be able to add any arbitrary attribute to the user document so I
> can't really use a table with a schema.

I'd love to see a sample of this data

~~~
Kiro
It's not your usual CRUD app. It's a realtime sandbox game in early access.

Some arbitrary stuff I keep adding for various reasons:

    
    
        gunsLeftToCompensateDueToChangeInVersionX: 123,
        winnerOfEventX: true,
        didSomethingFunky: Date.now(),
        noOfEnemiesKilledBeforeVersionY: 123,
        inventory: [{id: GENERIC_ITEM_ID, quantity: 123}, {id: uuid, name: "Ultimate Weapon", damage: 10000, someUniqueProp: {doThis: true, uses: 123}}],
        flaggedForCheatingInThisOrThatArena: true
    

Etc., etc. The biggest reason, though, is just the convenience of adding some
state to the player when developing new features. It would be a complete pain
having to add schemas for every single situation, even if it's standardised
for all players. Now I just add it to the player object and it's saved
automatically to the player document.

Not saying what I do is best practice or anything but it enables me to move
fast without having to do much DB admin stuff.

------
jasonrhaas
Is MongoDB the leader in NoSQL? I stopped using it years ago because of
scaling issues. With things like Elasticsearch and DynamoDB, I see no reason
to use MongoDB anymore.

~~~
myth_drannon
Based on the job posts on SO, it is. But it's decreasing in popularity quite
strongly compared to, for example, Cassandra.
[http://www.reallyhyped.com/?keywords=cassandra%2Credis%2Cmon...](http://www.reallyhyped.com/?keywords=cassandra%2Credis%2Cmongodb)

~~~
scriptproof
I tried to add SQLite to the graph and it is at the bottom. We could question
the relevance of this tool!

~~~
neuland
Well, it does say "reallyhyped.com" not "reallyused.com".

Joking aside, SQLite is on a ton of devices, yes. But that's primarily because
it ships with OSes and is in lots of embedded devices. For web development and
other server-side things, I get the feeling that SQLite is not as heavily used
as client-server databases.

------
sandGorgon
Whenever I have talked to people choosing Mongodb, it always came down to two
things:

1\. Not knowing about the Postgres JSONB column type (which behaves pretty
much like MongoDB)

2\. Not knowing about Amazon RDS as a single-click hosting option for
Postgres.

Most people run their own MongoDB servers... which is what gives the
impression of "easy to get started".

And here's the other truth as well: people painfully move back to Postgres
after losing data. Not because of "oh, we can now afford to do SQL" or "oh,
our data model now needs Postgres", but plain and simple: they lost data that
they can't admit to in public.

~~~
vacri
1\. Hardly fair, considering that Postgres JSONB is only 2.5 years old, and
the marketing fanfare around Mongo had largely petered out by that point. I've
been involved in three companies which use Mongo, and all of them selected it
before JSONB was in Postgres.

~~~
sandGorgon
I'm not quite sure what time has got to do with it. I don't believe I
mentioned that this was more than 2.5 years ago.

I'm talking about contemporary companies: MongoDB still holds far higher
startup mind share than Postgres. The MEAN/MERN stack is still a thing.

------
edejong
Whenever I see a young, gullible engineer, all twinkly-eyed and bedazzled by
the MongoDB hype (or any other NoSQL hype), I'll send him a link [1] to a
rather interesting talk by a very thoughtful Turing Award-winning engineer
named Michael Stonebraker. It takes a couple of days for the charm of NoSQL to
wear off, but eventually they'll always see how they had been fooled into
believing without a sane level of critique.

[1] [https://youtu.be/KRcecxdGxvQ?t=2072](https://youtu.be/KRcecxdGxvQ?t=2072)

~~~
InTheArena
Bah. Stonebraker invented this marketing strategy in the first place. I've
heard his companies make all sorts of insane claims, from Postgres to Vertica
to VoltDB. After the first one, I didn't bite on the second and third, only to
have some random manager overrule me, and then spend millions migrating off of
these proprietary databases later (Vertica and VoltDB).

------
nemild
(Author here) As part of this three part series, MongoDB’s CTO, Eliot
Horowitz, was gracious enough to spend two hours chatting with me.

In our discussion, we touched on 10gen’s early marketing strategy, which I've
combined with notes from my research:

\- __10gen’s Marketing Focus__: Eliot noted that much of 10gen’s marketing
message was meant for large enterprise CTOs and engineers who made database
decisions. If you’re trying to build a database company, this is where most of
the money is. But many startup engineers I knew didn’t realize this, and
seemed to think the message applied to them. 10gen’s explicit focus on
sponsoring hackathons and targeting startups also encouraged these issues.

\- __Anger on HN__: Eliot and I disagreed about where some of the MongoDB
anger on HN comes from. In his view, a fair bit of it stems from competitors
and their supporters. In my view, much of the anger came from 10gen’s fanciful
marketing message, which outstripped the product in the early days. I believe
that if the marketing message had been more thoughtful and the product more
mature when the marketing ramped up, the community anger would have been much
less (but it likely would have significantly hurt MongoDB's adoption, which is
a challenging problem).

\- __10gen’s Marketing “Strategy”__: Eliot argued that 10gen didn’t have much
of an early marketing strategy. In my view, their marketing team made some
smart decisions that really set them apart from competitors. First, MongoDB’s
Javascript DSL, JSON data store, and onboarding experience were critical
differentiators - and their product was early to market. Their marketing team
then used this as they built the MongoDB user groups /conference network,
pitched NoSQL and the MEAN stack as the future, and brought industry allies to
their side.

\- __Engineers and Marketers__: We debated how much of a role CTOs should play
in dev tool marketing. At 10gen, Eliot noted that he was rarely involved,
instead focusing on engineering and product. My own view is that engineers
should be involved in the marketing message for highly technical products -
and have some input into marketing. I also believe that marketing a database
is very different from marketing other tools in the engineering stack, such as
a frontend framework.

Much of this debate stems from the differential objectives of engineers
(making the right decisions for their teams) and marketers/founders
(convincing customers to use your product, sometimes at any cost).

I let Eliot know that I would support him in sharing any follow-up thoughts,
with the hope of spurring a thoughtful debate in our community.

Finally, I won't argue that 10gen's marketing and product were enough on their
own to explain the growth amongst startups - the usability of MongoDB was a
key reason for its success in startups. Also less discussed was the marketing
undertaken by training programs, bootcamps, and conferences (see the marketing
around MEAN that inundated Hacker News and Reddit in the early-to-mid 2010s,
as NodeJS was growing).

~~~
gaius
_In his view, a fair bit of it stems from competitors and their supporters_

LOL no. 99% of it comes from actual, working DBAs and others with production
responsibilities.

~~~
Jare
"Default settings" and "DBA with production responsibilities" should never
coexist in the same post. Since >50% of the complaints are about "default
settings", you know the rest.

------
flavio81
>"The Marketing Behind MongoDB"

>"Countless NoSQL databases competed to be the database of choice. MongoDB's
marketing strategy helped it become the winner."

Of course, of course!

Because it's not as if the strategy could have been based on MongoDB's
scalability and security...

------
balozi
I have always wondered how much of what I see on Hacker News is "sponsored
content." Not that there is anything wrong with it.

~~~
owlmirror
There is a lot wrong with not declaring sponsored content as such, so much so
that it's illegal in many countries.

------
anothrowaway45
A similar success story, albeit a bit earlier in the hype cycle, is Realm.

They started with a lightweight db engine for embedded systems named TightDB.
Getting traction in that space is hard, so they pivoted to become a "mobile
database", taking a page out of the MongoDB playbook:

\- they built a slick wrapper for iOS and Android to make it extremely simple
to get started

\- they wrote a fantastic set of tutorials to get started quickly

\- they started doing lots of events to grow their community

And they've succeeded in getting quite a lot of companies to use their
product - by making something that is fun for developers to use (at least in
the beginning).

------
DonnyV
I feel like a lot of the issues people have with NoSQL or MongoDB come from
trying to fit the relational model and SQL way of doing things into MongoDB.
MongoDB is a document database, so the things you did in an RDBMS won't work
in a document database. Time to think different.

Some of the things I've seen people complain about:

JOINS - should be done in the application

SQL - one of the big issues I have with RDBMSs is that application logic lives
in both the database and the application, causing all kinds of brittle
spaghetti code. If you need that code to live outside of the application, then
it's time to build a web service around it or just have a shared library all
the apps use.

Reports - should be done in the application; you could also create a view in
MongoDB and pull report data from that.

Schemas - RDBMS schemas are a huge pain; their restrictiveness causes all
kinds of pain when you want a dynamic data store. The best thing about
MongoDB's collections is that every document in a collection can have a
totally different schema. If you want to do that in an RDBMS, you end up
hacking it by shoving JSON or XML data into fields. You're basically creating
a document database but with a horrible implementation of it.

ORMs - if you're using an ORM, then you're already going down the path of
using a NoSQL database. ORMs are hacks for RDBMSs because mapping your data
model to SQL is a pain. With MongoDB this is all built into the driver and
database. That's why MongoDB Inc. creates drivers for all major languages.
Plus, most drivers have attributes you can decorate your models with that make
them flexible when the structure changes. So nothing blows up.
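For context, "joins in the application" typically amounts to a hand-rolled
hash join over two query results. A minimal Python sketch, with hypothetical
in-memory documents standing in for two MongoDB collections:

```python
# Application-side join: emulate `users JOIN orders ON user_id`
# over two lists of documents, as a database would do internally.

users = [
    {"_id": 1, "name": "Ada"},
    {"_id": 2, "name": "Grace"},
]
orders = [
    {"user_id": 1, "total": 30},
    {"user_id": 1, "total": 12},
    {"user_id": 2, "total": 7},
]

# Build a lookup table on the join key (the "hash" side of a hash join),
# then probe it once per order.
users_by_id = {u["_id"]: u for u in users}
joined = [
    {"name": users_by_id[o["user_id"]]["name"], "total": o["total"]}
    for o in orders
    if o["user_id"] in users_by_id
]
```

Note that every guarantee the database would normally provide here (a
consistent snapshot across the two reads, handling of dangling `user_id`
references) becomes this code's responsibility.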

~~~
flavio81
_> JOINS - should be done in the application_

Ok... Also tell me, transactions should be done in the application as well?

Great! Next time, I'll do transactions and joins in the application, and
discard the 35+ years of experience and performance improvements that any
modern RDBMS has for joins and transactions.

 _> The best thing about Mongodb's collections is that every document in a
collection can have a totally different schema._

In which sense is this a good thing?

 _> ORMS - if your using an orm then your already going down the path of using
a NoSQL database. ORMs are hacks for RDBMS because mapping your data model to
SQL is a pain._

The ORM simply allows easier interaction with the RDBMS when using an object-
oriented programming language, mainly by moving data between tables and
objects. The ORM is explicitly for RDBMSs; the "R" in "ORM" stands for
Relational.
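To be concrete, the core of what an ORM does - relational rows in, objects
out - can be sketched in a few lines (Python over the stdlib sqlite3 module;
the `User` class is a made-up example, not any real ORM's API):

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users (name) VALUES ('Ada')")

# The essence of the "R" in ORM: map each relational row to an object.
users = [User(*row) for row in db.execute("SELECT id, name FROM users")]
print(users)  # → [User(id=1, name='Ada')]
```

Real ORMs add query building, identity maps, and change tracking on top, but
the mapping itself is this relational-to-object translation.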

------
noncoml
I start with MongoDB on every single one of my side projects because it’s so
easy to get started.

No setting up permissions or schemas.

Just works.

------
laichzeit0
From reading the comments on HN every time these MongoDB threads come up, I'm
beginning to think -- and I really hope I'm wrong about this -- that a lot of
people started using MongoDB who had basically no prior experience in using
relational DBs. If true, that is just insane. I guess I take it for granted
that programmers have at least some computer science background and are
familiar with the relational algebra, ACID, etc. It kind of saddens me.

------
vira28
As we all know, these marketing strategies work well only for the initial
bubble. In the long term, it's just going to be whoever provides the best
content.

------
bluetwo
Refreshingly honest.

------
ishtu
Winner of what? [https://www.linkedin.com/pulse/mongodb-world-2017-lonely-
sto...](https://www.linkedin.com/pulse/mongodb-world-2017-lonely-story-versus-
john-de-goes)

~~~
nemild
Survival? A planned IPO?

------
qaq
OK, for everyone describing issues with RDBMSs: take Postgres, create a table
with an id and a "data" jsonb column, and you will have a NoSQL solution with
Nx the performance and Yx the features.
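As a rough illustration of that pattern - using the stdlib sqlite3 module and
SQLite's `json_extract` as a stand-in for Postgres, where the jsonb column and
`data->>'name'` operator would play the same role:

```python
import json
import sqlite3

# One table, an id plus a JSON "data" column: a document store inside SQL.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, data TEXT)")
db.execute(
    "INSERT INTO docs (data) VALUES (?)",
    (json.dumps({"name": "widget", "tags": ["a", "b"]}),),
)
db.execute(
    "INSERT INTO docs (data) VALUES (?)",
    (json.dumps({"name": "gadget", "tags": []}),),
)

# Query inside the documents, schemaless, but with SQL on top.
rows = db.execute(
    "SELECT json_extract(data, '$.name') FROM docs "
    "WHERE json_extract(data, '$.name') = 'widget'"
).fetchall()
print(rows)  # → [('widget',)]
```

In Postgres you additionally get GIN indexes over the jsonb column, plus
ordinary joins and transactions alongside the documents.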

------
icantrank
Can somebody point me toward the 'also a marketing strategy for mongodb'
comment?

I came here to like it

~~~
pavlakoos
I think the only strategy would be: give it to developers for free, charge the
enterprise big bucks.

------
tschellenbach
A lot of their growth is due to a lack of mature ORMs. If you're using Django
it's super easy to use Postgres. If you're using Node you're out of luck. So
for certain ecosystems MongoDB can be easier to work with due to the available
libraries.

I personally much prefer Postgres though.

~~~
raverbashing
Django's ORM, especially with built-in migrations, is great.

Just don't think it doesn't have any gotchas. It also can't do JOINs (with
more than one element).

------
leandrod
Summary: learn the relational model of data management, as set out by Edgar F.
‘Ted’ Codd almost half a century ago, and enjoy inoculation against hype for
all your life.

~~~
jackweirdy
Except all the hype around CockroachDB

~~~
ojosilva
Even their logo reminds me of Mongo's. I wonder if it's intentional (a spoof
to be precise).

------
mindhash
if only elixir phoenix supported it :(

~~~
mischov
I can't tell if this is a serious comment or not, but there is no reason why
you can't use Mongo with Phoenix.

For example- [https://tomjoro.github.io/2017-02-09-ecto3-mongodb-
phoenix/](https://tomjoro.github.io/2017-02-09-ecto3-mongodb-phoenix/)

