

Ask HN: Why are there so many NoSQL databases? - kez

Back in the early 2000s it seemed like your choices for database-driven web sites were MySQL, PostgreSQL, Berkeley DB (for the Perlites) and maybe SQLite. Enterprises had their Oracle and SQL Server.

Now, when trying to get stuck into a bit of NoSQL/schema-free/document store databases for the web, I am overwhelmed by the number of options, and am struggling to understand the best one for the job.

Do people genuinely believe that the world needs this many NoSQL systems, or are we just in the infancy/resurgence of schema-free, and things are yet to settle down?
======
pierrefar
Because they're all different. We have key-value stores (Berkeley DB and
MemcacheDB), column stores (Cassandra), document stores (CouchDB, MongoDB),
and even new data structures (Redis).

They all solve different types of problems (e.g. document stores vs key-value
stores). Even similar databases solve the same problem differently (e.g.
sharding). They have different performance profiles and bottlenecks. They give
you different ways to model your data and query it. Some are persistent, some
are not, and some persist lazily.
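
As a rough illustration of the key-value vs document store difference, here is a toy sketch in plain Python. Nothing here is a real database API: `ToyKeyValueStore`, `ToyDocumentStore`, and the sample records are made up for the example. The point is only that a key-value store treats values as opaque blobs, while a document store can look inside them.

```python
import json


class ToyKeyValueStore:
    """Hypothetical in-memory key-value store: values are opaque bytes."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        # The only access path is the key; the store cannot filter on content.
        return self._data.get(key)


class ToyDocumentStore:
    """Hypothetical in-memory document store: values are dicts it can inspect."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, doc):
        self._docs[doc_id] = doc

    def find(self, **criteria):
        # Return the ids of documents whose fields match every criterion.
        return sorted(
            doc_id
            for doc_id, doc in self._docs.items()
            if all(doc.get(field) == value for field, value in criteria.items())
        )


users = {
    "user:1": {"name": "alice", "lang": "perl"},
    "user:2": {"name": "bob", "lang": "python"},
}

kv = ToyKeyValueStore()
docs = ToyDocumentStore()
for user_id, user in users.items():
    kv.put(user_id, json.dumps(user).encode("utf-8"))  # serialized blob
    docs.put(user_id, user)                            # structured document

# Key-value: fetch by key, then decode in the application.
alice = json.loads(kv.get("user:1"))

# Document store: ask the store itself to match on a field.
perl_users = docs.find(lang="perl")
```

With the key-value version, finding "all Perl users" means scanning and decoding every value in application code (or maintaining your own index), which is one reason the two families suit different problems.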

Big picture though: this is the first time your average startup/small
team/individual hacker has needed a very scalable database solution because of
websites. A website has the ability to get you a ton of users very quickly
even if you are just one man hacking on a personal pet peeve (I went through
this).

This kind of experimentation is awesome and it allows us to figure out what
really works in what situations and is a sign of a very healthy community. I
love being part of it.

------
simonw
It's a Cambrian explosion. A lot of the concepts involved (eventual
consistency, the CAP theorem, MapReduce) are relatively recent innovations, so
there's plenty of scope for exploring them with software. I imagine things
will settle down eventually.

~~~
z8000
You are consistently available for insights.

------
nawroth
To answer your question, I think things will settle down -- in the long run
it's too hard for developers to deal with all the alternatives and their
differences. To get a high-level overview of the NOSQL space you could read
this blog entry:
[http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to...](http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to-size-and-scaling-to-complexity.html)

As Ben Scofield puts it (cited in that post): "NoSQL DBs often provide better
substrates for modeling business domains". I think this aspect is often
forgotten in the debate. So I'd say: start from your business domain and ask
what its characteristics are. Then look for a DBMS that is a good fit.

And to get down to the details of some of the NOSQL systems, here's a
walkthrough:
[http://www.vineetgupta.com/2010/01/nosql-databases-part-1-la...](http://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html)

------
bradfordw
Because monoculture is bad and the more options we have, the more they all
learn from one another (which drums up competition). In the end, like the
other fellows on here have stated, you'll have a few emerge as the "standards"
based on the type of problem you are trying to solve.

------
silentbicycle
The new non-relational databases have fairly different designs. For example,
if your data set would fit entirely in memory (on one or a few servers), Redis
would probably be a great choice. Their different strengths come out of the
design choices that set them apart.

A while ago, there were several different database query languages for
relational databases, too. In the interest of having a standard, they
compromised on SQL. There are lots of version control systems, parsing
frameworks, programming languages, etc., too. This isn't really unique to
databases; they just get talked about more since there's so much buzz about
hot new web development stuff.

~~~
silentbicycle
Huh, apparently my saying so is enough to cite this as fact in kez's blog.

One good source about relational databases (including their history) is _An
Introduction to Database Systems_ by C.J. Date. The author has an axe to
grind, but he's thorough, and there are plenty of other references cited
should you want to dig deeper.

------
bitdiddle
I think people have been doing these things for years; in the past they were
just more apt to be embedded in desktop applications and so forth. Issues with
scaling for the web have changed the dynamics, so the new non-relational
approaches are quite distinct from the earlier ones such as Statice,
ObjectStore, and Ontos.

CouchDB is well worth a hard look mainly because it takes advantage of several
new ideas all in a very simple stack.

In a year or two I predict two or three will emerge as clear choices for a few
distinct scenarios.

------
majke
> Do people genuinely believe that the world needs this many NoSQL systems, or
> are we just in the infancy/resurgence of schema-free, ...

The non-SQL world is still pretty young. Well, the ideas themselves are old -
but recent implementations try to solve unique sets of problems.

> ... and things are yet to settle down?

Yes. IMO there will be 5-7 major projects supported by larger communities.
Each of these projects will solve a particular problem.

So, instead of having 2-3 general SQL providers, we can expect many solutions
for very specific problems. The issue right now is that we don't really know
what these problems are. Current NoSQL implementations are probing the market
- answering the question of whether these specific features are useful to a
broader audience.

I think we can guess some of these 5-7 major specializations, for example:

\- MemcacheDB: Distributed K-V optimized for speed - no replication

\- Distributed K-V optimized for reliability

\- Distributed K-V optimized for size - like Dynamo.

\- neo4j: Graph database

\- redis: K-V with rich features, but limited to data size that fits in
memory

\- K-V framework created to allow Map-Reduce jobs - including scheduler,
debugger and so on.

~~~
z8000
FYI if you are into the bleeding edge, redis has a virtual memory
implementation as of about 12 hours ago.

------
jokull
Because if you look closely - they're all different. There's a great deal of
feature overlap however. It's like the community collectively is throwing
things at a wall and seeing what sticks. It's the healthiest way to eventually
get the best. My bets are on redis and perhaps MongoDB.

------
jdp
NOSQL is definitely not a new idea, but the current favorite mode of access
and interaction (REST) is. I think the explosion in popularity among the
people building these stores is due to a lot of things, including: the ability
to start an open source project in a new environment, giving it a real chance
for wide adoption; the need to fill a niche, since there are many different
types of NOSQL stores; and to a lesser extent the perceived simplicity of such
a project. For people using NOSQL stores in their projects, the attraction
comes from the mix of shiny new technology and performance benefits, both real
and perceived. It also helps that there are many different types, each
addressing a different requirement.

------
keefe
I'm a big fan of document databases. It was one of those things where I was
working on my app, thinking to myself... self, aha! I need documents or
arbitrary kvp storage... then thinking, yeah somebody else must have done that
already and there I am on couchdb or whatever. Having to have schemas is just
an unnecessary pain in the ass imho (sometimes necessary blah)

------
kez
Thanks for all the comments; I have put together a brief summary:
<http://www.justkez.com/why-are-there-so-many-nosql-options/>

It's been very interesting reading the responses.

------
koenbok
It's a hip thing to work on, an interesting problem to solve, has some nice
ideas around it, has great potential to get lots of users and there are no
widely accepted solutions yet.

