

SQL vs. NoSQL  - thinkerer
http://sql.learncodethehardway.org/book/introduction.html
Theres a lot of debate on this and I understand theres no clear answers. Just wanted to hear some of your experiences working with both and whats applicable for what type of systems. I myself am working on a database for a web app analytics tool and witnessed some benefits of NoSQL while SQL is supposedly more reliable.<p>Understand companies like Amazon have shifted to NoSQL to manage their data. 
In short, I do think that SQL is for high processing, low disk space while NoSQL is vice versa in layman terms. Im pretty new to programming and appreciate a healthy discussion on this.<p>Please do not draw lines and have hate posts and all. Keep the discussion clean and intelligent.<p>Thanks!!
======
twic
_The problem with SQL is it seems everyone hates its guts. It is a weird
obtuse kind of "non-language" that most programmers can't stand._

I'm baffled. I'm a programmer and i rather like SQL. Most programmers i know
use SQL quite happily. Indeed, i often hear them wishing they could just write
SQL to solve some problem rather than use some API that purports to save you
from writing SQL.

I certainly prefer SQL to MongoDB's query language, which is verbose and
unexpressive by comparison.

Am i in a tiny minority? Or is this SQL hatred a popular myth?

~~~
crazygringo
I know SQL inside and out, and I hate it for anything put trivial queries that
involve a single table and a single WHERE clause.

Every time I write a query of reasonable complexity, I have to figure out what
I want the database to do (in an imperative sense -- read from this table
first, then look up this in that table, then do this other thing with this
other table for each row), and then figure out how to translate this into a
declarative SQL statement, just so the database's query interpreter can turn
that back into an imperative program it can run.

The whole thing is just a huge unnecessary round-trip of thinking, and SQL is
limited in that there are many simple things it can't do, and you never really
have any guarantee that the database's query analyzer will execute your query
imperatively in the manner you hope for (which can lead to terrible
performance problems, and hacks like index "hints" still don't always solve
it). I'd MUCH rather be able to program against the database imperatively,
directly.

So despite using SQL day-in, day-out, I totally agree that it is weird,
obtuse, and a kind of "non-language" (in the sense that any query is limited
to a single declarative statement, with a host of limitations, instead of an
imperative programming language that allows you to write whole programs of
execution).

~~~
nwenzel
Good points all the way around. Do you think SQL's obtuseness comes from it's
ability to do more than (perhaps) it should?

Maybe it really only should be used for simple queries with simple where
clauses. But because it can be used for more, we choose to do so.

It's based on relational algebra making it fairly predictable (query optimizer
aside which is completely unpredictable). It operates on relations and returns
relations.

"Select * from MyRelation;" returns a relation. So we could also write "select
* from (select * from MyRelation);" and get the same result. You can continue
that nested select as many times as you like and with as much complexity as
you like and it would still work. Though undoubtedly it gets difficult to
follow.

Adding to the obtuseness, SQL is not typed. So for "select * from MyRelation
where SomeField = 1;" you could replace the 1 with another select statement,
which would return a relation. But as long as that relation had a single
column and a single row, it would be interpreted as a single value and
successfully replace the 1.

Also adding to the obtuseness as you get more complex is that it variable
scope for table aliases can be difficult to follow. Table alias names can be
used across various select, where, from and other clauses to link them
together through a common filter but it happens in the single SQL statement.

I think it's that common issue of reducing a "program" into a so gel statement
that causes the issues I run into. At times, I feel that I am writing what
basically amounts to a stored procedure in a single statement. That's when I
start getting mad at SQL. Though, it's really my own choices that made it a
problem.

------
SchizoDuckie
A project from 2010 that's stalled half-way through in alpha state.

Still, he is quite correct. You still need a Structured Data Query Language.

~~~
mantrax5
SQL remains a useful tool in the box, but while SQL and NoSQL are fighting it
out to be the default go-to database technology, something else on the horizon
is threatening the concept of having any database separate from the app at
all: persistent in-application state.

We don't need a database if we can just load the application state in RAM and
save it back via serialization. This works well in surprisingly many cases.

~~~
_delirium
That's a fairly traditional no-DB approach [1], but what's the reason to
believe the dominance of that approach is on the horizon?

[1] Common Lisp has a particularly nice system for doing it transparently,
[http://common-lisp.net/project/elephant/](http://common-
lisp.net/project/elephant/), and I believe HN is implemented using a DB-via-
serialization strategy as well

~~~
mantrax5
Actor systems.

Actor systems allow this to scale and cross the machine boundary without
having some dedicated database product doing it for us.

I've noticed that the more I work with actor systems (each actor is stateful
and keeps its own state), the less I use databases.

In fact, the idea of keep state separate from the code reading it and changing
it begins to look oddly broken as a concept.

------
vertex-four
Personally, I find that SQL is hard to _generate_. It's hard to reason about
the creation of dynamic queries, say, for some types of searches. It's not
reasonable to create a generic function that can operate on any same-shaped
piece of data. And it's ridiculously hard to manage the concept of a pointer
to a piece of data which could be one of many types, a problem I'm sure most
people have faces.

Stored procedures can help with some of this, but then you've got to remember
to update them every time you need to access data in a slightly different way,
and it's hard to figure out when you can get rid of old stored procedures.
Tools for managing them aren't up to scratch with tools for managing "real"
programs.

Personally, I'm keen on RethinkDB. It's a document-oriented NoSQL database
which doesn't lose your data, has built-in clustering, and an extremely strong
query language built on chaining method calls. While it doesn't have
transactions, the query language is powerful enough that you can easily model
most forms of deterministic data transformation within a single query.

------
daleharvey
There is some irony in opening the post proclaiming how important SQL is
because of webSQL, webSQL has hung on by being the only choice with Safari,
with Apple almost certainly adopting IndexedDB webSQL will cease to exist in
any relevance pretty quickly (its already fairly niche).

This post is just filled with naive strawmen arguments, I work on a NoSQL
database, show me a SQL database that is implemented embedded inside every
browser and allows offline operations to sync between masters transparently
...

Trying to define or dismiss 'NoSQL' as a singular movement is not going to
work, there are lots of tools, I use different ones for different things, some
of them even have a SQL interface.

~~~
zik
NoSQL is a strange term. It really just means "everything else that isn't a
traditional SQL database". It encompasses a massive range of completely
different database technologies.

SQL has been astonishingly successful in recent years and it's a testament to
its dominance that the term NoSQL even exists. It's kind of like calling all
meats except for beef "NoBeef".

I'm really happy to see some different database technologies getting some
attention these days. They each have their own niche. There's no SQL vs redis
vs MongoDB vs cassandra vs whatever "winner", just databases with different
strengths which can benefit us in different ways.

~~~
Pacabel
I don't think that's necessarily the definition of "NoSQL database".

Databases that predate Codd's work, or long-established ones like dbm and BDB,
may appear similar to modern-day NoSQL databases in how they operate, but they
surely aren't the same.

Those systems couldn't use relational theory or SQL because they didn't exist
yet, or at least didn't reject them outright as one of their main goals.

NoSQL databases, on the other hand, are completely about rejecting relational
theory and rejecting SQL. That's at the very core of their philosophy.

True NoSQL databases have been developed as a reaction to several things:

1\. A very, very, very small number of situations where relational DB systems
cannot easily scale.

2\. The far more widespread ignorance of the basics of relational theory, and
a lack of willingness to learn about it.

3\. The far more widespread ignorance of SQL, and a lack of willingness to
learn about it.

4\. An urge to be "different" solely for the sake of being different, even if
it brings no technological benefit.

5\. Unmitigated hype surrounding the term "NoSQL".

Of those, 1) is perhaps the only legitimate reason for using a NoSQL approach
today. The number of times this sort of a situation actually arises is
remarkably small.

The other four are why those databases have become more widely used,
especially within the web development community. As anyone who has dealt with
such systems knows, they're rarely about safely and reliably storing and
managing data, and they're rarely about doing this efficiently. They're merely
a shortcut that some developers use to avoid learning how to use a RDMS.

~~~
jchrisa
We surveyed NoSQL users in 2012 and it turns out the biggest draw is the no-
schema thing. People don't want to coordinate migrations when they are
implementing new features. In some cases this may be technical debt, in
others, having a schema to maintain is technical debt.
[http://blog.couchbase.com/nosql-adoption-survey-
surprises](http://blog.couchbase.com/nosql-adoption-survey-surprises)

~~~
cms07
This is crazy. Why wouldn't somebody want a schema?

------
bitL
I'll just put a few points here to understand when you should use which
architecture.

NoSQL:

\- reading is fast, writing is expensive (if all data are pre-
processed/denormalized for reading during the writing phase)

\- often schema-less

\- low latency (for key-value storage)

\- offline batch processing (classical Map Reduce)

\- no ACID, choose 2 of 3 in CAP

\- demanding on SW engineers to get client-side conflict resolution, tricky in
general

\- Petabytes of data can be suddenly processed

\- huge variation of different paradigms, key-value, document, graph, batch
etc.

\- haywire indexing

SQL:

\- writes are fast (normalization), reads are expensive (JOINs)

\- ACID (well, only to some extent, clustering messes up many ACID properties
unfortunately and conflicts arise in corner cases)

\- set operations and a neat math theory behind them

\- stable indexing, easily constructable real-time JOINs

\- OLTP

\- easier for developers

\- non-flexible schema

\- tradition, well-known recipes on how to do things

Basically, if you want to have low-latency access, your concurrency model
allows eventual consistency, or you have a need to store your data in non-
standard structure such as graphs/trees, use NoSQL and pre-process all data to
be exactly in the format you require for reading.

If you need 99.999% guarantee of consistent data, amount of data you need to
handle is under 50TB, you can put your data into a fixed schema and latency
doesn't matter that much, use SQL.

I would recommend you to ask yourself a question - is your app/business read-
heavy or write-heavy and decide accordingly.

------
nawitus
>The NoSQL movement is partially a reaction to antiquated database servers,
and also a response to a fear of SQL borne from ignorance of how it works

I think the NoSQL movement was created for practical reasons. Since SQL didn't
scale properly for certain applications, new kinds of databases had to be
invented. What often happened is that a software used a typical SQL database
with locks, and as that didn't scale, the locks were decreased to a point ACID
wasn't theoretically guaranteed anymore. And it worked.

If the database in production doesn't guarantee consistency, you might aswell
design a new database which is based on that idea, which is a core reason
NoSQL databases were created. There's other reasons, too, of course..

~~~
jsemrau
"Movement" is a bit much, no? It is a solution for a niche problem.

~~~
Pacabel
"Movement" is a good term for it.

There are a lot of people out there who have a quasi-religious attitude toward
NoSQL databases.

It's a cause for them to rally around. In some ways, it gives them something
to "fight" for.

Objective consideration about whether it's the best technology to use in a
given situation is often disregarded.

There are a very small number of very rare situations where such technology is
the best or most feasible approach. Otherwise, it's merely something that a
lot of people get involved with in order to intentionally avoid learning how
to use SQL and relational databases, or to feel like they "belong" to some
greater cause.

~~~
nawitus
>There are a very small number of very rare situations where such technology
is the best or most feasible approach.

I don't agree that it's a very small number. In fact, usually SQL brings a lot
to the table of which very little is actually useful.

------
dorfsmay
He is right, the issue has been servers not scaling horizontally.

SQL the query language is great, the concept of set theory applies really well
to data. SQL can be and is used on non-SQL database, impala for example
implements SQL to query data burried in hadoop.

NoSQL stores become popular because they scale horizontally (able to use more
than one server) naturally. Once the SQL servers can do automatic
partitioning, I suspect people will start migrating back.

The truth is that there is no pixie dust, at the end of the day you need to
index. I see nosql proponent having the same strugles sql people have with
indexing, but right now they have the upper hand because they can spread the
work over several servers.

~~~
mattdeboard
I guess this might explain some of its popularity but I think it has much more
to do with the fact you don't have to do any advance planning about your data
(i.e. no need to write schemata). "Rapid prototyping" is much easier when you
don't have to think about the your data, its types nor the relationships
between them.

Of course, moving out of this phase becomes a massive headache, since basing
your product on essentially unstructured data is a very good definition of
"technical debt". And if you're using structured data in your rapidly
prototyped object, why not just a RDBMS in the first place? :)

~~~
vertex-four
> Of course, moving out of this phase becomes a massive headache, since basing
> your product on essentially unstructured data is a very good definition of
> "technical debt".

Of course, it's possible to use things like JSON-Schema to validate your data
if you choose to.

> And if you're using structured data in your rapidly prototyped object, why
> not just a RDBMS in the first place?

Because where's the RDBMS with a "natural" query language that is well-suited
to complex, dynamic queries and document structures? How do people model a
document store, with the ability to point to differently-shaped data, in an
RDBMS without giving up all that safety?

~~~
mattdeboard
All _what_ safety?

Anyway, I've used and have used both kinds of data stores. They have their
places. Solr's a great document store with a very good distributed data story,
for example. Not to mention interesting nested documents & queries features.

For the most part though, you'd have a hard time convincing me that MongoDB or
COuchDB (or whatever other document stores) are good enough as the system of
record for actual products.

Yes, relational databases require a lot more thought about your data. That's
the point.

------
SimeVidas
> Or Why You Still Need SQL

Great way to make people close the page immediately after reading the title.
(I personally don't need SQL for my small web app => the title's lying =>
close tab).

~~~
jimktrains2
I don't need a hammer to screw in a screw, so I don't need a hammer. QED.

