
SQL vs. NoSQL Databases: What’s the Difference? - rockdiesel
https://www.upwork.com/hiring/development/sql-vs-nosql-databases-whats-the-difference/
======
gjolund
I have issues with his "Reasons to use NoSQL"

* Making the most of cloud computing and storage

There are plenty of "cloud" hosting options for SQL databases, enough that a
simple google search should make my point for me.

* Rapid development

This is a really common reason that I have heard for wanting to use NoSQL
databases. It is flat out wrong.

While NoSQL databases may not force you into a schema for your database, you
will pay a toll for this mutability in your application layer.

I have worked on several NoSQL backed projects, and nearly without fail
development speed drops like a rock once you make changes to the way your data
is persisted.

SQL migrations may seem tedious, but the process of planning your data model
and having rigid definitions for your data types saves you huge amounts of
time in the long run.

Data is the foundation of most applications, and if you can't rely on your
data then you are building your application on sand.

~~~
Scarbutt
_SQL migrations may seem tedious, but the process of planning your data model
and having rigid definitions for your data types saves you huge amounts of
time in the long run._

Your reasoning seems too black and white, most the of the time, you _can 't_
know your data model ahead of time, that's why prototypes are so useful, so
just using your language data structures or some less rigid datastore lets you
get quicker to your data model.

~~~
paulmd
Having rigid definitions for data types doesn't necessarily mean the datastore
is rigid. It just means that you know what type of data goes in each column.
Like it or not, if you change your schema that change has to be handled
somewhere - either you update the old data to the new schema (do this), or you
handle old versions of the schema in the application (fundamentally non-
scalable IMO). It doesn't matter whether they're values in a JSON object or
columns in a relational DB.

Flyway is a fantastic tool for making changes to relational schemas. Your code
is versioned, and you can just keep the schema migration scripts right next to
it. If you need to upgrade an old schema, it just walks through the migration
scripts until it's up-to-date.

If there is some portion of your model that really, truly, honestly consists
of unstructured data, you can just put that in a JSON column.

~~~
Scarbutt
That is why I mentioned prototypes, they are a quick messy hack that is not
meant to scale, its purpose its to validate your ideas and decisions, what is
going to become your MVP, which IMO is far more important than getting the
perfect stack/tooling from the start.

~~~
gjolund
We obviously value very different things in regards to software development.

You really seem to favor initial velocity and rapid iteration. Fair, I can
totally understand why.

I favor planning, stability, and limiting thrash due to bad/uninformed
decision making.

In my experience the rush to market only pans out when you plan on having a
large influx of funding to rebuild your stack. Otherwise you are stuck holding
a mess of code and trying to scale ideas that never should have gone to
production in the first place.

There is a whole other side to the industry that demands provably correct
solutions, and will not tolerate the "hacker culture". Not because of any
petty reasons like work hours or perks, but because some applications require
a level of accuracy that your average growth hacker can't meet.

------
manigandham
Why is this even on HN? There's no reason to debate this, it's a content
marketing fluff piece for a company that has no technical pedigree, written by
a freelance copywriter rather than an engineer.

It's also about 50% wrong. Let's just move on.

~~~
zer00eyz
On some level I'm glad it is:

Things your boss or client will read, quickly turn into things your boss or
client will say. I would rather be prepared for the wacky misinformed
conversation that will happen, it helps keep down the laughter and
frustration.

~~~
abraae
And on that theme, then get ready for when your boss gets to the part about
how NoSQL gives you:

"Rapid development. If you’re developing within two-week Agile sprints,
cranking out quick iterations, or needing to make frequent updates to the data
structure without a lot of downtime between versions, a relational database
will slow you down."

Reminds me of this classic
[https://www.youtube.com/watch?v=b2F-DItXtZs](https://www.youtube.com/watch?v=b2F-DItXtZs)

------
callesgg
The article read like it is written by someone who has never understood how to
use a SQL database.

To quote Mr Torvalds himself:

"Bad programmers worry about the code. Good programmers worry about data
structures and their relationships."

Something that i struggled with before i learned how to use a SQL/relational
database was to actualy use the database relations in my code. I first
specified the relations in the database then i basicly redid them in the code.

The way i tend to program these days is to have basically all my relations
restrictions and relational rules in the database. Whenever i want to change
something you just alter stuff in the database GUI. Most of the application
code will just follow along.

~~~
wvenable
In my experience, if you have the relational database model fully defined and
correct then the rest of the application is pretty much designed. All the
screens and elements you need are obvious.

~~~
abraae
Indeed, but if you are building in microservices, a new level is introduced.

If you can get your service boundaries and APIs fully defined and correct,
then the relational database models (underpinning each of those services) just
fall from the tree for you.

And then as you say the screens and elements are obvious.

~~~
wvenable
Honestly I'd still prefer to go other way -- from the relational database
models you get your APIs defined and correct and then the screens and elements
are obvious.

I find if you start at the database model layer you much less likely to screw
up the API design. The database model forces a bit of rigor that you can't get
going the other direction.

~~~
abraae
IMO it's dead easy to create APIs from a database schema, where that schema
underpins your monolith and you're opening said monolith up to the outside
world, and you don't really care too much what goes on out there.

But if you're building a system which is not a monolith, and instead is made
of up of, say, 20 services, then putting good boxes around those services and
defining the lines that connect them (the APIs) is an essential prereq to
designing the databases that sit inside the boxes.

(Though really what's happening, as always, is you're doing it all in
parallel, just with more emphasis on the higher level stuff).

~~~
wvenable
I can't really disagree with that. I usually design micro-services the same as
do monoliths; it's just the client is different (another service).

------
hobs
Chuckling at: Your data is structured and unchanging. If your business is not
experiencing massive growth that would require more servers and you’re only
working with data that’s consistent, then there may be no reason to use a
system designed to support a variety of data types and high traffic volume.

SQL Databases support a variety of data types and high traffic volume, they
just dont allow you to change things without thinking a little more deeply on
the problem.

This article is just fluff.

------
hardwaresofton
Postgres is the last SQL DB thing mentioned?

No mention of RethinkDB ([https://rethinkdb.com/](https://rethinkdb.com/)) ?

SQL is good if your data is "unchanging"? Schema migrations happen, SQL can be
sharded and scaled to multiple servers, and the fact is, unless you're
facebook or google, you probably won't even need it. Didn't reddit run
postgres since it began?

Glad I skimmed this.

~~~
overcast
Yeh seriously RethinkDB is the perfect hybrid of "NOSQL" document storage,
with SQL like relationships. Been using it for all my projects the past year,
and I love it.

~~~
hardwaresofton
Same here, I have no idea why it doesn't get more press, as far as I'm
concerned it's the best document store out there.

Mongo has gotten away with some pretty substantial misses, yet still get
considered every time NoSQL comes up (maybe first-to-market is really just
that powerful), but RethinkDB rarely gets one, despite being so stable and
having so many good features.

And when I say mongo has gotten away with substantial misses, I don't mean the
documentation miss that lead to documents getting dropped, I mean core issues
with their engine that necessitated the wiredtiger contributions/rewrite. Of
course, mongo works very well generally, and they've definitely done their
part when it comes to documentation and growing the community, and carving the
path for what NoSQL document stores (in this day and age) look like.

------
ktamura
I personally think SQL v. NoSQL is less useful of the two dichotomies, the
other one being OLTP v. OLAP. In the end, SQL is just an API. You can put SQL-
like interface on NoSQL as well (like SlamData for MongoDB or Crate for
Elasticsearch).

On the other hand, OLTP v. OLAP (transactional v. analytic) is directly tied
to use cases and has more profound consequences as far as technical decisions
go (data structures, algorithms, etc.)

~~~
fpoling
This is very good point. For example, MongoDB and Oracle are not that
different compared with Splunk.

------
nthacker
This talk by Martin Fowler is a fantastic overview into the SQL vs NoSQL
discussion :
[https://www.youtube.com/watch?v=qI_g07C_Q5I](https://www.youtube.com/watch?v=qI_g07C_Q5I)

------
jcyw
There should be mentions about CAP theorem, and tradeoffs about consistency vs
availability. Sometimes you dont care as much about CAP-consistency than
availability or scalability.

------
lowbloodsugar
For me, the most important difference is that a SQL "Database" is actually an
incredibly powerful application server, with a built in compiler for a data
processing language, and many years of development on access strategies,
whereas a NoSQL database is just a level above an http server with an index
that scales really well. That's a bit of an exaggeration, but I find it helps
new teams understand the big picture. In a microservice environment, I
encourage my teams to think of their SQL database server as a microservice,
not a persistence store.

------
0x434D53
That happens, when a content marketer write about a technical topic... Just
sad.

------
BinaryIdiot
Many databases considered NoSQL actually have SQL now. Cassandra for instance.
Granted it's not as critical to the execution as SQL is for RDMS but I would
much prefer the distinction to be: RDMS / Relational versus KeyValue /
Document Stores. But I digress.

Some of the other criticisms in this thread are very valid especially in the
area of why you should use one versus the other so I won't bother regurgitate
them I just felt like a better distinction could have been used between what
they were comparing.

------
Florin_Andrei
> _NoSQL data doesn’t need to be prepped ahead of time._

Unless you're using CouchDB and you need to change some views, and then you
have to re-index the whole thing, which can take forever.

------
aristus
Bah. People keep confusing the syntax with the data model. SQL is simply a
notation to express theorems in relational set theory. It's pretty much the
only pure-functional, side-effect-free programming language that's actually
gone mainstream [0]. If you look carefully at the APIs of the "No SQL"
databases, you'll find them introducing relational set theoretic features.

To paraphrase Paul Graham's unbearably smug comment about Lisp: once you add
filter, group by, and join, you can no longer claim to have invented a new
query language, just another dialect of SQL. Math has this funny habit of
remaining true no matter how many mean things people say about it on their
blogs.

[0] I mean basic core SQL, not the vendor-specific stuff like PL/SQL.

~~~
gshulegaard
This is interesting, but I admit I am a bit confused.

Having read "An Introduction to Database Systems" by C.J. Date
([http://www.amazon.com/Introduction-Database-
Systems-8th/dp/0...](http://www.amazon.com/Introduction-Database-
Systems-8th/dp/0321197844)), I have a tremendous appreciation for Relational
Algebra and Theory which allows for the closed functional nature/syntax of SQL
which holds true across the various normal forms (1st-5th).

Perhaps due to my own lack of understanding, I always assumed that without an
underlaying relational model SQL loses its closure. So for me, I always
thought the distinction between SQL and NoSQL was whether or not the
underlying database model was relational.

Sure for many of the queries against a non-relational model you could adopt a
SQL-like syntax (which many languages seek to do), but it would no longer be
operating in the closed Relational Algebra space.

Could you comment on this and help me understand the finer points/where I went
wrong?

~~~
btilly
Syntax and being relational are orthogonal.

You can set up non-relational data structures in SQL. You can set up a
relational data structure in virtually any programming language.

However what the NoSQL people are learning is that smart developers really
want relational models. So they are offering the features that can be used to
achieve that.

~~~
gshulegaard
That's where I think where most NoSQL proponents start to miss the value of
relational models and the closed Relational Algebra it allows.

As I understood it, SQL was a language for describing the Relational Algebra
which was a closed mathematical concept.

You can take the syntax, assign functions to produce output from a non-
relational model, but it's no longer SQL...and it's underlying power (the
closed Relational Algebra) is no longer present.

So in a sense syntax and being relational are orthogonal, but syntax isn't the
defining characteristic of SQL (or why it's so powerful). I think a more
insightful statement is: SQL (Relational Algebra) and being relational are
codependent and relational and non-relational are orthogonal.

Which, if I am not mistaken (and again I am still trying to understand the
finer points which is what motivated my original comment), NoSQL people
fundamentally _cannot_ offer a query language similar to SQL based on the
closed relational algebra.

They can mimic the syntax to make it more intuitive for new users coming from
traditional RDMBS environments, but it's fundamentally _not_ SQL/relational
algebra.

But again, I am just a cursory admirer/learner and I am still trying to iron
out my own understanding of the underlying theory. So all additional
information/clarifications welcome :)

------
yamadapc
SQL...?

~~~
systems
not really, sql is usually used to refer to relational databases (and yes i am
aware that sql is not fully relational)

to sql vs nosql

is more like relational vs other models

nosql can also be seen as domain specific databases some nosql options work
great for specific types of apps but not others

and to conclude, the key feature of relational databases is integrity, your
data is normalized, so you are guaranteed integrity, problem is, integrity is
not free, this why we need nosql and other bi solutions like olap cubes

but olap cubes are different, since they dont replace relational databases,
they complement each other ... olap can also be seen as domain specific

------
Mister_Snuggles
I'm always saddened when I don't see MultiValue[0] databases in the list, but
the reality is that there's almost nothing that's open source in this space.

My first real job was doing development on a custom application built entirely
in UniVerse[1]. It was a really interesting way to work and mirrors a lot of
what the Open Source NoSQL databases do, though it does it differently.
Instead of storing the schema with the records (e.g., a JSON document), a
record was simply a list of data elements (fields), each field could have
multiple values (or not), each value could have multiple subvalues (or not),
and there was a lower level that you could go to in order to store multi-line
text data.

There was a dictionary attached to each file that primarily drove the query
tool. You could also build calculated fields (I-descriptors) to do primitive
joins, among other things. If you had a file containing invoices, you might
have two fields to list the item number and quantity (these would be repeated
for as many items as are on the invoice). You could then use an I-descriptor
to follow item number over to the items file and pull the description, price,
etc.

The entirety of the application ran within UniVerse. When users logged in to
the unix server, their shell would run a script that would set up the
environment and fire up UniVerse. The UniVerse LOGIN paragraph would then show
a menu system and allow them to do their thing. Everything was written as
either a paragraph, roughly equivalent to a shell script that ran UniVerse
commands, or a BASIC program. The dialect of BASIC natively understood dynamic
arrays, which were a 1:1 match to the structure of the records on disk.

UniVerse also had a built-in way to execute SQL queries against the
multivalued data. In the invoice example above, the item number and quantity
would be associated in the dictionary. These would then be exposed as another
table by the SQL engine. You could go "SELECT A.INVOICENO, B.ITEMID,
C.DESCRIPTION, B.QTY FROM INVOICES A, INVOICES_ITEMS B, ITEMS C WHERE A.@ID =
B.@ID and B.ITEMID = C.ITEMID" and it would know that INVOICES_ITEMS is really
part of the data in INVOICES.

It was a really cool system and I wish there was something like it in the Open
Source world. Sadly, development on MaVerick[2] seems to have stalled.

[0]
[https://en.wikipedia.org/wiki/MultiValue](https://en.wikipedia.org/wiki/MultiValue)

[1] [http://www.rocketsoftware.com/products/rocket-
universe](http://www.rocketsoftware.com/products/rocket-universe)

[2] [http://www.maverick-dbms.org/](http://www.maverick-dbms.org/)

------
neo2001
_Grabs popcorn_

