

Ask HN: What's the current state of NoSQL? - olalonde

What's the current state of NoSQL? Is it ready to replace SQL? Should I stick with an ORM + hand-written SQL for my next app, or should I consider trying a NoSQL database? A combination of both? Share your thoughts!
======
simonw
NOSQL = Not Only SQL - I don't think you should ever expect it to completely
replace SQL since they're good at different things. I've played with a bunch
of NOSQL engines (including CouchDB, MongoDB, App Engine, Solr, Xapian and
Tokyo) and while I really like them and would use them for a bunch of
problems, for most of my projects the ability to create arbitrarily complex
queries using joins is essential for rapidly iterating and trying out new
ideas.

Instead, I'd suggest using NOSQL stuff to complement SQL. Songkick use MongoDB
as a fast caching layer, for example:
<http://effectif.com/ruby/manor/denormalising-your-rails-application> - and
I've found Redis incredibly useful as a way of handling write-heavy parts of
my applications and dealing with requirements to return random elements.

One of the most interesting aspects of document stores such as CouchDB is that
they are schemaless, which for some problem sets is incredibly powerful -
anything where you might be tempted to use key/value pairs in SQL for example.
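
To make the key/value comparison concrete, here is a minimal sketch using only Python's standard library. The table name `item_attrs`, the sample item and the `documents` dict (standing in for a document store) are all made up for illustration; this is not how CouchDB or any particular engine works internally.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Key/value workaround in SQL: one row per attribute of an item.
conn.execute("CREATE TABLE item_attrs (item_id INTEGER, key TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO item_attrs VALUES (?, ?, ?)",
    [(1, "title", "Gig poster"), (1, "colour", "red"), (1, "width_cm", "42")],
)


def load_item_eav(item_id):
    """Reassemble one logical item from its attribute rows."""
    rows = conn.execute(
        "SELECT key, value FROM item_attrs WHERE item_id = ?", (item_id,)
    ).fetchall()
    return dict(rows)  # every value comes back as text


# Document-style alternative: the whole item is one schemaless record.
# A plain dict of JSON strings stands in for a document store (assumption).
documents = {
    1: json.dumps({"title": "Gig poster", "colour": "red", "width_cm": 42}),
}


def load_item_doc(item_id):
    """Fetch and decode the single record that holds the item."""
    return json.loads(documents[item_id])
```

In the row-per-attribute version every value is stored as text and the item has to be stitched back together on read; in the document version the item keeps its shape and its types.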

------
gridspy
I'd only consider NoSQL if you expect a very heavy load and one of these:

    
    
\- You have live data that is changing very regularly

\- You have a large quantity of flat data (think column-oriented databases)

\- You don't need to index / find data and it is very flat (perhaps simple
files on disk will allow you to store more)
    

Plus:

\- You cannot possibly put off going NoSQL until you have further established
yourself in the marketplace

For Gridspy, we have live data and I expect large quantities of pretty flat
data. It makes sense to stream the data directly to the user via messaging
rather than polling through a database. Plus, I plan to store large quantities
of high resolution data in a specialised database or dumped to disk - it will
be much smaller and simpler without the indexing information since I don't
need to search it, only slice it.

See <http://blog.gridspy.co.nz/2009/09/database-meet-realtime-data-logging.html>

~~~
m0th87
To be fair, I think there are a lot more use cases than that. My last project
and my current one both used NoSQL solutions. The former because graph walking
is painful in SQL and the latter because schemaless documents enable so many
new scenarios (in my case, the automatic mapping of user-created forms to the
backend). For me, the performance ramifications of NoSQL are way overplayed
compared to its usability improvements. It is certainly not a successor or
replacement to SQL, but for applications that handle a lot of unstructured or
semi-structured data, NoSQL makes code tighter, simpler and easier to reason
about. Which means fewer bugs and faster deployments.

EDIT: I should qualify this argument by stating that I'm talking about
datastores that support custom queries (like MongoDB and CouchDB) rather than
ginormous k/v stores (like Tokyo Cabinet). I've found limited usefulness for
the latter.

~~~
gridspy
You know when you have a special case. If you don't, just use the most
convenient off-the-shelf database and get back to adding core functionality.

It sounds like you do, however I think that there are a lot of people who have
no real reason to move away from SQL other than fashion. The database choice
should be just like any other engineering decision - often your familiarity
with the new tool is very important.

------
justinsb
Is your app pushing the limits of SQL databases? Is there any reason to look
at NoSQL other than the fact that it's 'cool'? Currently all the NoSQL
databases are very early adopter products, and each has its own strengths
and weaknesses, so you'll have to choose a NoSQL database whose strengths
match the area where a SQL database is failing you, and where the weaknesses
aren't deal breakers.

Of course I'm biased, and tend to lean towards using SQL/relational
databases... FathomDB is all about trying to eliminate the pain points of
running a (My)SQL database. I feel a lot of the NoSQL marketing hype is
picking on weaknesses of MySQL (rather than relational databases per se), and
so we're thinking about how to make MySQL better, and we don't think it's a
good idea to abandon the relational model entirely. After all, our industry
started with NoSQL back in the 60s, and there were good reasons for adopting
the relational model 30 years ago!

~~~
olalonde
My ultimate goal is to increase productivity - I'm getting tired of writing
and maintaining all those SQL queries. I felt the driving force behind NoSQL
databases was cutting through the pain of SQL, but I can see from other
comments that it is not really the case.

~~~
justinsb
I believe that simplifying querying is a non-goal for NoSQL, in fact most
NoSQL databases actually push more of the burden of querying onto you. CouchDB
is arguably a bit better here with its concept of views. SQL is incredibly
powerful for expressing very complex queries succinctly, and it's pretty
difficult to beat. CRUD queries are tedious in SQL or NoSQL, and an ORM or
similar abstraction layer definitely helps productivity when programming those
bread-and-butter operations.

~~~
olalonde
Thanks, that's exactly the answer I was looking for. For some reason, I was
under the impression that simplifying queries was actually a design goal of
NoSQL whereas it really is about scalability and performance... right?

~~~
cperciva
NoSQL simplifies queries to the extent that it makes complex queries
impossible and thereby forces people to design their data structures in such a
way as to allow them to do everything they need using only simple queries. :-)
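
A minimal sketch of what that design pressure looks like in practice, assuming a store that only offers get/put by key. The `store` dict, the key naming scheme and the comment fields are illustrative, not taken from any specific engine: the lookups you will need are maintained at write time, so every read is a plain key fetch.

```python
store = {}  # stand-in for a key/value store that only supports get/put by key


def add_comment(comment_id, author, post_id, text):
    """Write the comment, plus the lookup lists a join would otherwise provide."""
    store[f"comment:{comment_id}"] = {
        "author": author,
        "post_id": post_id,
        "text": text,
    }
    # Denormalise at write time: one id list per question we want to ask later.
    store.setdefault(f"post:{post_id}:comments", []).append(comment_id)
    store.setdefault(f"user:{author}:comments", []).append(comment_id)


def comments_for_post(post_id):
    """Simple reads only: fetch the id list, then fetch each comment by key."""
    ids = store.get(f"post:{post_id}:comments", [])
    return [store[f"comment:{cid}"] for cid in ids]


def comments_by_user(author):
    ids = store.get(f"user:{author}:comments", [])
    return [store[f"comment:{cid}"] for cid in ids]
```

Any question that was not anticipated at write time (say, "comments on posts tagged X") has no list to read from, which is the trade-off being described.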

~~~
richcollins
How are complex queries impossible? You could easily write a declarative
language for querying graphs.

~~~
wlievens
Sure, but the performance would be horrible. That's his point. You can't do
some of the things rdbms's do.

------
kunley
There are different things to consider depending on whether you'd create a new
app alone / with trusty co-founders, or you'd want to introduce it to a team
using some form of agile development, or you'd want to expose NoSQL to people
using ol' rusty waterfall model.

The last case is hardest and I'll share some thoughts on it. I know most of
you don't live in such an environment, but still you can infer the "agile"
scenarios from the waterfall one. In other words, the following waterfall
issues can be areas of potential f**kups using whatever development model.

So, the impact of switching to NoSQL for different waterfall'ish teams:

\- it changes the way your data is organized -- mostly it's
denormalization and some strategies tied to the specifics of queries you'd
have to use most (read: ad hoc strategies). So, it influences the analysis,
architects, development & release management.

\- it changes the way db "schema" changes can be introduced. You'd say
"there's no schema". Well, it's partly true, but in real life you have to add
some metadata information to the underlying db, otherwise your db queries
won't run. For example, Cassandra has ColumnFamilies definitions, CouchDB has
its view definitions. Somebody has to agree what needs to be changed and then
write these changes and keep them in sync with the codebase. You'd probably
need a mechanism like Rails migrations to maintain them - you won't get rid of
it with the promise "there's no schema". Somebody has to apply such changes to
production as well. So, back to the waterfall: it influences analysis,
development, release management & operations.

\- it changes the way your app scales. The goal of many NoSQL engines is
to easily scale horizontally -- this is a big win to operations! But we're not
there yet (Cassandra? Maybe MongoDB?), see e.g.
<http://bjclark.me/2009/08/04/nosql-if-only-it-was-that-easy/>. Also, if
something you need crucially doesn't scale, you have to redesign your app. So
the influence is: operations have less work, release management has more work,
but in the worst case all the teams have to rework the app.

\- it allows for some non-standard app behaviours. E.g. CouchDB is excellent at
disconnected operations, meaning: occasionally synchronizing data between
nodes which are mostly offline. It's also called "no master" as opposed to
"multi master". No wonder IBM research funded CouchDB development (trying to
rewrite Lotus Notes? ;) and also Ubuntu chose it for their Ubuntu One sharing
platform. A feature like this is a relief for release mgmt & operations, but can
need a lot of work from the architects, analysis & devs.
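
On the "there's no schema" point, one possible shape for a migrations-like mechanism is to stamp each document with a version and upgrade old ones step by step. This is only a sketch: the `schema_version` field, the two example migrations and the field names are hypothetical, and a real setup would also have to version things like view or column family definitions.

```python
def _v1_to_v2(doc):
    # Hypothetical change: split "name" into "first_name" / "last_name".
    first, _, last = doc.pop("name").partition(" ")
    doc.update(first_name=first, last_name=last)
    return doc


def _v2_to_v3(doc):
    # Hypothetical change: every document gets a "tags" list.
    doc.setdefault("tags", [])
    return doc


# Maps "version the document is at" -> function that moves it one step forward.
MIGRATIONS = {1: _v1_to_v2, 2: _v2_to_v3}
CURRENT_VERSION = 3


def upgrade(doc):
    """Return a copy of doc brought up to CURRENT_VERSION, one step at a time."""
    doc = dict(doc)  # shallow copy, so the caller's document is not modified
    version = doc.get("schema_version", 1)  # unstamped documents count as v1
    while version < CURRENT_VERSION:
        doc = MIGRATIONS[version](doc)
        version += 1
    doc["schema_version"] = CURRENT_VERSION
    return doc
```

Whether this runs lazily on read or as a batch job at release time is exactly the kind of decision that lands on release management and operations.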

Hope this is useful. I'm considering convincing some BigCorp to use NoSQL in
some project and these are the issues I thought of.

~~~
kunley
I didn't write anything about the lack of transactions. Well, one should
examine the atomicity level provided by the engine and then there's some work
needed to be done by architects & devs to ensure data consistency where it's
needed.
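
As an illustration of the kind of work that falls on devs, here is a sketch of optimistic concurrency built on a single-key "write only if the version still matches" operation. The `VersionedStore` class is an in-memory stand-in invented for this example; in a real engine the check-and-write would have to be atomic on the server side, and the exact primitive differs per engine.

```python
class VersionConflict(Exception):
    """Raised when the stored version differs from the one the writer read."""


class VersionedStore:
    """In-memory stand-in for a store with per-key conditional writes."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        # Missing keys are reported as version 0 with no value.
        return self._data.get(key, (0, None))

    def put_if_version(self, key, expected_version, value):
        current_version, _ = self.get(key)
        if current_version != expected_version:
            raise VersionConflict(key)
        self._data[key] = (current_version + 1, value)


def increment(store, key, retries=5):
    """Read-modify-write a counter, retrying if another writer got in first."""
    for _ in range(retries):
        version, value = store.get(key)
        try:
            store.put_if_version(key, version, (value or 0) + 1)
            return
        except VersionConflict:
            continue  # someone else wrote in between; re-read and try again
    raise VersionConflict(key)
```

This only protects a single key. Anything that must change several records together still needs to be designed around, for example by keeping those fields in one document.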

------
alexpopescu
I think it would be a mistake to think of NoSQL as a replacement of RDBMS. Its
main goal is rather to make our lives easier for a set of scenarios that were
created with the read-write web. I'd encourage you to take a look at these
NoSQL usecases: <http://nosql.mypopescu.com/tagged/usecase>. Hopefully that
would give you an idea where some of these systems are fitting in. If you
check other presentations on MyNoSQL you'll notice that many live systems are
using a mixture of RDBMS and NoSQL.

:- alex

------
Jim_Neath
I've been using MongoDB to rewrite the activity feed on one of my apps. I
wouldn't use it for everything though (not just yet anyway).

Having said that, the guys behind Harmony (<http://get.harmonyapp.com/>) use
MongoDB for everything, as far as I know:
<http://railstips.org/blog/archives/2009/12/18/why-i-think-mongo-is-to-databases-what-rails-was-to-frameworks/>

~~~
simonw
I'm curious: how do you structure an activity feed in MongoDB? Are you doing
anything special to support the case of "show me activity from all of the
people I am following", or do you just use it as a fast append-only log?
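
For context on the question, one common way to support the "people I am following" case is fan-out on write: each event is copied into every follower's feed when it happens, so reading a feed is a single lookup. The sketch below uses in-memory structures and invented names (`followers`, `feeds`, `FEED_LENGTH`); it is not a description of how the app above does it.

```python
from collections import defaultdict, deque

FEED_LENGTH = 100  # assumed cap on how many events each feed keeps

followers = defaultdict(set)  # user -> set of users who follow them
feeds = defaultdict(lambda: deque(maxlen=FEED_LENGTH))  # user -> newest-first events


def follow(follower, followee):
    followers[followee].add(follower)


def publish(actor, action):
    """Fan-out on write: push the event into each follower's feed."""
    event = {"actor": actor, "action": action}
    for user in followers[actor]:
        feeds[user].appendleft(event)


def read_feed(user, limit=20):
    """Reading is a single lookup; no per-followee query is needed."""
    return list(feeds[user])[:limit]
```

The alternative, a single append-only log filtered at read time, makes writes cheap but pushes the "who do I follow" filtering into every read.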

------
siculars
from your comments here, it seems that you are a bit confused by the term
nosql. it is kind of a misnomer, imho, and should rather be called nordbms.
what this movement is really replacing is the traditional rdbms approach to
data storage, retrieval and searching. as @emileifrem points out in his talk
at nosqleast.com, nosql should be referred to as "not only sql". further he
likens the explosion of new systems under the nosql banner to the explosion
of rdbms's in the 1980s and 1990s. i tend to agree. there are a
number of solutions out there right now, each approaching nosql from a
different angle.

watch some of the videos from nosqleast 2009, <https://nosqleast.com/> to get
a better picture of some of the different options and major players in this
area before making a decision as to what nosql solution to base any of your
future projects on.

------
aita
Yahoo's benchmarks: <http://www.brianfrankcooper.net/pubs/ycsb-v4.pdf>

------
randliu
No, it's not ready to replace SQL, and I don't think it ever will. What are
your requirements? If it's horizontal scalability (and you're actually hitting
a performance wall) you should begin to think about it. Maybe also if you
never do any joins.

Relational database systems (+ normalization) compromise everything to ensure
the ACID properties, which, in the majority of cases, are the most important
part.

~~~
richcollins
How is the relational model inherently better for ACID than non-relational
models (graph dbs for instance)?

~~~
randliu
>database _systems_

------
w3matter
For us the big thing, during the current refactoring of
<http://www.funadvice.com>, is eliminating joins.

In testing of the new up-coming platform, that was a huge, huge win for speed.
And we're a Postgresql shop too.

MongoDB allowed us to:

\- Have embedded documents (very large performance improvements)

\- Have arrays and hashes as "columns"
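
To illustrate what eliminating a join with an embedded document means, here is a rough sketch. The `questions`/`answers` tables and the sample data are invented for this example, SQLite stands in for the relational side, and a plain dict stands in for a MongoDB-style document.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE questions (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE answers (id INTEGER PRIMARY KEY, question_id INTEGER, body TEXT);
    INSERT INTO questions VALUES (1, 'How do I tie a tie?');
    INSERT INTO answers VALUES (1, 1, 'Start with the wide end.');
    INSERT INTO answers VALUES (2, 1, 'Watch a video.');
""")


def question_with_answers_sql(question_id):
    """Relational version: the page needs a join across two tables."""
    rows = conn.execute(
        """SELECT q.title, a.body
           FROM questions q JOIN answers a ON a.question_id = q.id
           WHERE q.id = ? ORDER BY a.id""",
        (question_id,),
    ).fetchall()
    # Assumes the question has at least one answer (inner join).
    return {"title": rows[0][0], "answers": [body for _, body in rows]}


# Document version: answers are embedded, and "tags" is an array-valued field,
# so one fetch by id returns everything the page needs.
question_doc = {
    "_id": 1,
    "title": "How do I tie a tie?",
    "answers": ["Start with the wide end.", "Watch a video."],
    "tags": ["fashion", "how-to"],
}
```

The price is that embedded data is shaped for one access pattern; querying across all answers regardless of question gets harder.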

We also use Redis in a few crucial places, because of its really good support
for lists (queues) and sets, as well as its raw, blazing speed.

Downsides? Yes. Many rails-style plugins don't work well. But an upside is
that we're forced to write leaner code and not depend too much on those.

Another downside: MongoDB is super-fast, but is still a work in progress in
some places, and the ORM we're using (mongo_mapper) is somewhat of a moving
target right now.

But hey, that's what happens when you're on the bleeding edge.

MongoDB:

\- built-in replication

\- basic sharding

\- embedded documents

\- very, very fast

