
NoSQL No More - bsg75
http://technosophos.com/2014/04/11/nosql-no-more.html
======
danbruc
The NoSQL data model - please read as document oriented - is fundamentally
flawed because it forces you into denormalization without good reasons like
performance optimization but with all the disadvantages.
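
To make the cost of denormalization concrete, here is a minimal sketch. The data and the function names are hypothetical and not tied to any particular document store; it only contrasts how many writes a rename needs in each layout.

```javascript
// Hypothetical document collection: the author's name is copied into every post.
const posts = [
  { id: 1, title: 'First', author: { id: 7, name: 'Ann' } },
  { id: 2, title: 'Second', author: { id: 7, name: 'Ann' } },
];

// Denormalized: renaming the author means rewriting every embedded copy.
function renameAuthor(docs, authorId, newName) {
  let touched = 0;
  for (const doc of docs) {
    if (doc.author.id === authorId) {
      doc.author.name = newName;
      touched += 1;
    }
  }
  return touched; // number of documents that had to be rewritten
}

// Normalized: one author row, posts would only hold the key (authorId).
const authors = new Map([[7, { name: 'Ann' }]]);

function renameAuthorNormalized(authorTable, authorId, newName) {
  authorTable.get(authorId).name = newName; // a single write
  return 1;
}
```

Any embedded copy that the denormalized loop misses (for example, in a collection nobody remembered) silently stays stale, which is the disadvantage being described.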

The only good reason not to use a relational database is that they suck at
modeling complex entities and their relationships. Shattering nice entities
into small relations to get them normalized, adding join tables to model many
to many relationships and all that fun. Everything that is not roughly tree-
like is just painful.
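
A rough sketch of the "shattering" described here, using plain arrays in place of tables. The entity, the table names and both helper functions are made up for illustration.

```javascript
// Hypothetical nested entity as the application sees it.
const article = { id: 1, title: 'NoSQL No More', tags: ['sql', 'nosql'] };

// Shatter it into three relations: articles, tags, and an article_tags join table.
function normalize(doc, tagIds) {
  const articles = [{ id: doc.id, title: doc.title }];
  const tags = [];
  const articleTags = [];
  for (const name of doc.tags) {
    if (!tagIds.has(name)) tagIds.set(name, tagIds.size + 1); // assign a surrogate key
    const tagId = tagIds.get(name);
    tags.push({ id: tagId, name });
    articleTags.push({ articleId: doc.id, tagId });
  }
  return { articles, tags, articleTags };
}

// Reassemble the entity by joining through the join table.
function denormalize({ articles, tags, articleTags }, articleId) {
  const row = articles.find((a) => a.id === articleId);
  const names = articleTags
    .filter((link) => link.articleId === articleId)
    .map((link) => tags.find((t) => t.id === link.tagId).name);
  return { id: row.id, title: row.title, tags: names };
}
```

One small entity with one list already needs three relations and a join to come back together; deeper nesting multiplies that.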

More than once I finally decided to just stuff the really complex pieces into
a blob - XML or whatever worked best - and just deal with it at the
application layer. Not pretty, not fun, but still less painful than modeling
and dealing with it in the relational model.

~~~
jchrisa
One thing that is not easy to do with a relational model is offline
synchronization for high availability on mobile devices. The document model
was built for this; I find it hilarious that so few of the NoSQL databases are
taking advantage of the latent capability.

Try out Couchbase Mobile
[http://mobile.couchbase.com](http://mobile.couchbase.com) and compare it to
best of breed relational sync.

~~~
SchizoDuckie
It is not hard at all. People just haven't built as many of the tools for it,
which also confuses me because we now have package managers that run on
package managers.

My own little WebSQL Object/Relation mapping library uses promises to get away
from callback hell, and it's a pure joy to work with.

It handles creating tables and inserting fixtures before the first connect
promise is returned, so that you don't have to worry about the setup process
and your data being available.

Creating and inserting entities is as easy as

    // Presentation is an entity defined through the library.
    var P = new Presentation();
    P.set('name', 'test');
    // Persist() returns a promise.
    P.Persist().then(function(result) { console.log('Done!'); });

[http://schizoduckie.github.io/CreateReadUpdateDelete.js/demo...](http://schizoduckie.github.io/CreateReadUpdateDelete.js/demo/)

I'm using it extensively in my Chrome plugin, and it runs on 10,000+ clients
right now.

Real world offline synchronisation example right here:

[https://github.com/SchizoDuckie/DuckieTV/blob/angular/js/ser...](https://github.com/SchizoDuckie/DuckieTV/blob/angular/js/services/MigrationService.js#L52)

~~~
frik
Nice work. This would be the way to go.

Sadly one cannot use it everywhere, as WebSQL is not implemented in Firefox or
IE :(

The good thing is that WebSQL also works great on mobile devices.

------
moron4hire
Typically, the best you can hope for out of your data is that it is a directed
graph. Relational databases exist in a sweet spot of efficiency, widespread
availability, and fairly decent ability to represent directed graphs.

But you need to know the shape of your graph first. The document-store model
tends towards giving up the graph entirely and just going with a tree. We
programmers love tree structures. They make things nice and tidy and can be
traversed in finite time.

But I've never worked on a project that could easily be modeled as a tree.
Unfortunately, this was often realized after I had been working on the project
for a year. You know how phenomenally fucked you are having a tree-like data
structure when what you really need is a directed graph? You're basically all
the fuckeds.

Directed graphs generalize trees, so trying to shoehorn a tree where a
directed graph is needed is basically bringing a Fisher-Price tool set with
you to Habitat for Humanity.

Thus, from experience, I never start with a tree now. I always start with a
directed graph. If I learn that the data is a tree after a year and not
actually a directed graph, oh well, look at all this flexibility we didn't
need.
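
A minimal sketch of the difference, assuming an acyclic graph and made-up node names. The edge list is the shape a relational table gives you (one row per edge); the nested version is what a document store nudges you toward.

```javascript
// A directed graph stored as a flat edge list: one row per edge.
const edges = [
  { from: 'project', to: 'alice' },
  { from: 'project', to: 'bob' },
  { from: 'alice', to: 'report' },
  { from: 'bob', to: 'report' }, // 'report' has two parents
];

function children(edgeList, node) {
  return edgeList.filter((e) => e.from === node).map((e) => e.to);
}

// Count incoming edges per node.
function parentCounts(edgeList) {
  const counts = new Map();
  for (const { to } of edgeList) counts.set(to, (counts.get(to) || 0) + 1);
  return counts;
}

// The data fits a tree only if no node has more than one parent.
// Simplification: this sketch does not check for cycles or multiple roots.
function fitsInTree(edgeList) {
  return [...parentCounts(edgeList).values()].every((n) => n <= 1);
}

// Forcing the graph into a nested document duplicates every shared node.
// Assumes no cycles; a cycle would recurse forever.
function toTree(edgeList, node) {
  return {
    name: node,
    children: children(edgeList, node).map((c) => toTree(edgeList, c)),
  };
}
```

With these edges, `toTree(edges, 'project')` contains two separate copies of `report`, one under each parent, and nothing keeps them in sync.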

------
SchizoDuckie
Hear hear!

Now can somebody convince the W3 and Mozilla that there's _nothing_ wrong with
implementing SQLite next to IndexedDB so that I can move forward without
having to write an IndexedDB adapter for my clearly relational TV Show ->
Season -> Episode data please?

[https://github.com/SchizoDuckie/DuckieTV/blob/angular/js/CRU...](https://github.com/SchizoDuckie/DuckieTV/blob/angular/js/CRUD.entities.js)

Foreign Relationships on foreign keys are not difficult

Many to many Relationships are not difficult

SQL is not difficult

Joining is not difficult

Grouping is not difficult

Now try to do that in IndexedDB / NoSQL, and suddenly you're in a world of
hurt. It can work, but can it perform? Maybe. With time and patches. So wham,
let's throw the option of implementing what is probably the most well-tested
piece of software (in the universe, probably; see
[http://sqlite.org/testing.html](http://sqlite.org/testing.html) ) out of the
window.
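
For a sense of what doing this by hand involves, here is a minimal sketch of a join plus a GROUP BY written in application code over plain arrays. The rows, field names and function name are made up for illustration; real IndexedDB code would also have to deal with asynchronous cursors and indexes, which this leaves out.

```javascript
// Hypothetical rows, as they might sit in two object stores.
const shows = [
  { id: 1, name: 'Show A' },
  { id: 2, name: 'Show B' },
];
const episodes = [
  { id: 10, showId: 1, season: 1 },
  { id: 11, showId: 1, season: 1 },
  { id: 12, showId: 1, season: 2 },
  { id: 13, showId: 2, season: 1 },
];

// SQL equivalent, for comparison only:
//   SELECT s.name, e.season, COUNT(*) AS episodes
//   FROM shows s JOIN episodes e ON e.showId = s.id
//   GROUP BY s.name, e.season;
function episodesPerSeason(showRows, episodeRows) {
  // Hand-built index that stands in for the join.
  const showById = new Map(showRows.map((s) => [s.id, s]));
  const groups = new Map();
  for (const ep of episodeRows) {
    const show = showById.get(ep.showId);
    if (!show) continue; // inner join: drop episodes without a show
    const key = show.name + '|' + ep.season; // hand-built grouping key
    const group = groups.get(key) || { name: show.name, season: ep.season, episodes: 0 };
    group.episodes += 1;
    groups.set(key, group);
  }
  return [...groups.values()];
}
```

The three-line query in the comment becomes a function you have to write, index and tune yourself for every new question you ask of the data.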

~~~
dragonwriter
> Now can somebody convince the W3 and Mozilla that there's nothing wrong with
> implementing SQLite next to IndexedDB

There's nothing wrong with _implementing_ SQLite next to IndexedDB, and it's
not like the W3C is going to send the Internet Standards cops to bust the
browser vendors that have.

OTOH, there is something wrong with standardizing "whatever the version of
SQLite used in the most recent version of Chrome happens to do" in a W3C spec.
Speccing out an API and a specific supported subset/dialect of
SQL for WebSQL that could support multiple independent compatible
implementations would be appropriate (and it could even be based closely on
what a particular version of SQLite does) -- but no one involved was
interested enough in doing that to actually, well, do it, and that's why
WebSQL ended up in limbo.

~~~
SchizoDuckie
I really don't understand this argument. They were able to throw up IndexedDB
support from scratch in a jiffy, but because something that actually works on
nearly all platforms (including mobile!) happens to be Chrome's (??) that is a
reason to hold back?

Sqlite has been around for almost 15 years now. It's time to adopt it as a
standard. Even if they're not satisfied with it, they can easily put a 'draft'
stamp on it and provide people with a working, well-tested way to use it
today. Firefox already has the support for it since your internal settings and
favorites are also stored in guess what.... SQLite databases!

What's even more confusing is that Mozilla is actually listed on the
Sqlite.org webpage as a sponsor, but they refuse to land it in Firefox. All
because of hipster politics and creative arguments like the one above.

Sure it can use work. Migrations suck. But it will never evolve any further as
long as they refuse to adopt it.

~~~
dragonwriter
> They were able to throw up IndexedDB support in a jiffy, but because
> something that actually works on nearly all platforms (including mobile!)
> happens to be Chrome's (??) that is a reason to hold back?

No, because the proposed spec did not actually _specify the behavior in a way
that was independently implementable_.

The spec did not specify the supported query language, either by simple
reference to a specific version of the SQL standard (which would be
problematic, because a complete and correct implementation of any, at least
recent, version of the SQL standard is rare, and probably inappropriate for
the use case), or by more complex reference-with-identified-exceptions, or by
just listing out the supported features and expected behavior.

It was, therefore, _not possible even in principle_ to have
mutually-compatible, independent implementations (and all of the existing
implementations just did "link to some version of SQLite and do whatever it
does".)

> Sqlite has been around for almost 15 years now. It's time to adopt it as a
> standard.

It's a good, widely used tool -- and one that is rapidly changing. But the
whole point of web standards is to specify _behavior_ in a way which permits
mutually compatible, independent implementations. And WebSQLDB didn't do that,
and didn't really seem to be progressing _toward_ doing that.

> What's even more confusing is that Mozilla is actually listed on the
> Sqlite.org webpage as a sponsor, but they refuse to land it in Firefox.

SQLite is -- or has been, at least, not sure if it _still_ is -- used in
Firefox. What they haven't included in Firefox is a WebSQL database
implementation, because they believe -- and rightly so -- that WebSQL database
wasn't appropriate as a web standard, and didn't show any sign of heading
toward something that would be appropriate as such a standard.

~~~
frik
> It was, therefore, not possible even in principle to have mutually-
> compatible, independent implementations (and all of the existing
> implementations just did "link to some version of SQlite and do whatever it
> does".)

The SQLite library is in the public domain, has more lines of test code than
lines of C library code, and its SQL API is stable. Plus, its SQL dialect
is well documented:
[https://www.sqlite.org/lang.html](https://www.sqlite.org/lang.html)

W3C could also simply just fork SQLite at any time and modify the SQL dialect.

Mozilla and Oracle are both official gold sponsors of SQLite, and both
companies use the library in their own products, yet they put a lot of effort
into creating IndexedDB and especially into trying to deprecate WebSQL - nice job!

Microsoft has several embedded SQL libraries (JetBlue, JetRed, SQL Server
Express) and it would be a piece of cake for them to modify the SQL parser a
bit to the WebSQL SQL dialect.

~~~
vertex-four
The point is that someone has to actually sit down and define the exact
feature set that a WebSQL implementation is supposed to implement, such that a
development team could sit down with a copy of the standard and no other code
and implement their own WebSQL implementation from scratch.

The fact that SQLite is available is entirely irrelevant to any of this. The
W3C can't just say "do it like SQLite does it" - that's not a standard. The
WebSQL spec that exists says:

> User agents must implement the SQL dialect supported by Sqlite 3.6.19.

Unless someone actually sits down and goes through the standardisation process
to define WebSQL properly, this is not the way to go, or the next thing will
be "implement this tag like Firefox does".

~~~
jeffdavis
To be more specific: let's say SQLite has a bug which manifests differently on
different platforms. It would be impossible to adhere to the standard.

~~~
SchizoDuckie
May I refer once again to
[http://sqlite.org/testing.html](http://sqlite.org/testing.html) and the fact
that it's running in the real world on millions of devices? This is a
non-argument in this discussion! What if IndexedDB has a bug which manifests
differently on different platforms? You would refer to the specs, which can be
tested.

------
DrJokepu
This is why PostgreSQL is awesome. You can freely mix relational with non-
relational (object) storage depending on your needs.

~~~
fidotron
This is only if you ignore why people moved to NoSQL in the first place.

PgSQL is great for a lot of things, but I would argue that if you're using it
you're betting on your overall product/service having some other killer
advantage than data processing. The competent wing of the NoSQL crowd are
using it in strange ways that enable new classes of product and service that
cannot be achieved with PgSQL.

~~~
jeffdavis
Can you give examples? I assume you're talking about hadoop-related tech, but
you left it vague.

~~~
fidotron
A lot of the Hadoop users are in the right ballpark. Not necessarily that
stack, but that approach to things, and the problems they are attacking,
especially graph based data.

My broader point is that if your project fits into pgSQL then you need another
unique selling point and the data functions are just an implementation detail
of some other aspect of your offering, whereas for many of the people not
using that kind of thing their analytics and data systems are their selling
point. (That's a backhanded compliment to pgSQL, in that it's easy enough to
get right on small systems that any competent developer should be able to
manage it, thus reducing the market value though). There is, of course, the
blurred line of crazy MySQL deployments, many of which are barely relational.

~~~
jeffdavis
I'm trying to follow this reasoning. I guess the idea is that if a system has
a smaller market share, then it's less likely to be used by your competitors,
and can therefore offer a competitive advantage. Is that right?

That doesn't make sense to me because it only really applies when there is a
high likelihood of competitors using one product but not the other. Although
postgres is doing great, in most markets it's still far more likely that your
competitors are using Oracle or SQL Server. So any advantage postgres has --
and I believe there are many -- offers a potential competitive advantage.

For instance, you could argue that data systems are a critical selling point
of Heroku, and they use postgres.

I think your point ultimately boils down to: "postgres is not quite at the
forefront of certain analytic use cases", which I agree with. It is at the
forefront of many other use cases though.

~~~
fidotron
That's not quite my reasoning, so I'll try again!

pgSQL does what it does easily enough that it reduces the barriers to entry to
such a level for traditional RDBMS workloads (which there are plenty of) that
such workloads are simply not economically worth pursuing (especially for
startups) except as small components in larger systems where the value add is
elsewhere. The "other" world of big data/time series/graphs/nosql is hard to
get right, and so is worth more, as if you crack it someone else copying you
is decidedly non-trivial, meaning that it alone can form the core of a
successful business.

This is a bit like what the web people are trying to do in mobile, where if
HTML5 was magically the best cross platform mobile deployment option when we
wake up tomorrow the value of mobile developers will collapse, and the web
people will then cease to be remotely excited about mobile.

~~~
jeffdavis
Oh, so it's a barrier-to-entry argument? That makes more sense. The barrier to
doing something with postgres is pretty low, so competitors can more easily
copy your ideas unless there are more barriers somewhere else.

An interesting point. More broadly: if what you're working on is not hard (and
awkward), then others can copy your idea easily. Technology like postgres
makes a new class of problems easy, and thus you need to find new problems to
solve if you want a sustainable business.

------
nemothekid
> Many people seem surprised that we, a tech-savvy startup, would be moving to
> an "old" technology.

Who is this "many?"

~~~
spamizbad
There's a sizable contingent (but still a tiny minority) of developers who
will outright dismiss anything SQL and typically view databases as these
things you kinda pick on a whim based on whatever is easiest to shoehorn your
MVP into. Fancy things like a powerful query language, robust data types,
transactions, ACID-compliance etc? YAGNI.

Although I don't share this view, I can't say it's always wrong. If you're
building simple MVPs to find product-market fit for ideas why bother? If
there's a 90% chance what you're building will never evolve to need those
features, why invest in them?

But if your "product" ends up being a 10% survivor you better have a plan to
move to something more powerful before you compound too much technical debt
from your NoSQL database. Once you start scaling your business and have to
face competition, NoSQL becomes a productivity tarpit. Queries that take 20
minutes to write and optimize in the SQL realm can turn into day-long
exercises in NoSQL.

The reality is most developers aren't cranking out disposable software used to
test the market for new ideas. We're working on things that already have a
place, we just need to make them better. For us, the best option right now is
SQL + _maybe_ some specialized databases for certain purposes (Columnar for
analytics, graph for relationship analysis, full-text for text, etc)

------
visarga
I'm sorry but all I need can be handled by simple key/value stores, and I
prefer the simplicity of JSON data structures, without predefined fields.

~~~
SchizoDuckie
That is all perfect until your project is 2, 3, or 5 years old and you need
to do patches and migrations. Or what if you have to transfer your project
to another developer? I bet he/she'll be happy not having a spec of what's in
where, when!

~~~
rogerbinns
Those can just as easily be screwed up in relational databases. For example
getting the schema wrong, or only updating it by accretion (adding new
tables).

A developer being below average (as half of all developers are), using the
wrong tool for the job, not understanding things etc will make a mess. A claim
that developers can screw things up is uninteresting (and true).

A claim that there is no scenario in which a "NoSQL" database is the better
solution is a lot more interesting, and requires more than "some developers
could screw it up" to substantiate.

------
pokstad
I love CouchDB because it works directly with web browsers without middleware
(and supports awesome replication), but I wouldn't mind structured records
like in SQL databases. I end up enforcing schemas using update validation
functions anyway. As long as it inputs and outputs in JSON over HTTP, I'd be
happy.
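
As a rough sketch of what such schema enforcement can look like: CouchDB calls a design document's `validate_doc_update` function with `(newDoc, oldDoc, userCtx, secObj)` on every write and rejects the write if the function throws an object with a `forbidden` property. The `episode` document type and its fields below are hypothetical, and the function is given a name only so it can be called directly here; in a design document it would be stored as an anonymous function string.

```javascript
// Sketch of a CouchDB-style update validation function.
// The 'episode' type and its required fields are made up for illustration.
function validateDocUpdate(newDoc, oldDoc, userCtx, secObj) {
  if (newDoc._deleted) return; // allow deletions without further checks

  // Reject the write unless the named field has the expected JS type.
  function requireField(name, type) {
    if (typeof newDoc[name] !== type) {
      throw { forbidden: 'Field "' + name + '" must be a ' + type };
    }
  }

  if (newDoc.type === 'episode') {
    requireField('title', 'string');
    requireField('season', 'number');
  }
}
```

This gives per-document checks, but not the cross-record guarantees (foreign keys, unique constraints) that a relational schema provides.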

------
mrsteveman1
Why refer to the database that didn't fit this use case as "the particular
NoSQL database that we selected", while specifically mentioning the one that
DID fit by name? I know Postgres is great, no convincing necessary :)

~~~
astine
Probably avoiding a backlash from partisans of that db complaining about
the bad press and from partisans of other NoSQL dbs complaining that the
points in the article don't apply to them.

------
iSnow
I think the author is conflating the document-oriented variant of NoSQL with
the whole family of databases.

I cannot criticize him for leaving Couch|Mongo if their data is rich in
relations, but I would like to read his thoughts on graph databases for
instance.

Maybe their data is too large for Neo4J, but for querying n:m relationships,
this model can offer advantages.

------
azth
Is the page down for anyone else?

~~~
farkerhaiku
yes. if only they had gone to a web scale mongodb backed blog engine.

------
jasallen
TL;DR; of my comment: NoSQL is a good optimization sometimes. Don't
prematurely optimize. If you don't know what you need, you need an RDBMS.

NoSQL / Document databases got so cool so fast, and were so usable with little
to no actual know-how, that people just lost their minds and used them by default.
That was always the wrong decision.

RDBMS are the Swiss Army knives. They do "all the things". But power,
responsibility, etc.

I use various types of NoSQL models for various special purposes, they are
great, and should be used, for those purposes. You almost always need an
RDBMS. If it's not your gold record (because you need quick writes and can
lock a whole document at a time), then you need to replicate to an RDBMS for
better ad hoc reporting. OTOH, if RDBMS is your gold, you may want to replicate
to NoSQL for fast lookups.

I'm currently working through a situation where the latter is required. I have
hundreds of thousands of records and the need to solve a classic knapsack
problem (with hundreds of thousands of potential items to go in the pack, each
of which can have n instances)... so I've replicated a few billion pre-solved
solutions to Azure Table Storage. It solves a particular problem, but it's not
my WHOLE application's data store.
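
A toy sketch of the general pattern, not of the actual system: solve a bounded knapsack once for every capacity, then copy the answers into a key/value store so each lookup is a single get. The items, the key format and the function names are hypothetical, and a plain `Map` stands in for the remote table; no Azure API is used here.

```javascript
// Hypothetical items; `count` is how many instances of each may be packed.
const items = [
  { name: 'a', weight: 2, value: 3, count: 2 },
  { name: 'b', weight: 3, value: 5, count: 1 },
];

// Bounded knapsack by dynamic programming: best[c] is the best total value
// achievable with capacity c.
function solveAll(itemList, maxCapacity) {
  const best = new Array(maxCapacity + 1).fill(0);
  for (const item of itemList) {
    // Treat each allowed instance as a separate 0/1 item.
    for (let copy = 0; copy < item.count; copy += 1) {
      // Iterate capacities downward so one copy is used at most once per pass.
      for (let c = maxCapacity; c >= item.weight; c -= 1) {
        best[c] = Math.max(best[c], best[c - item.weight] + item.value);
      }
    }
  }
  return best;
}

// "Replicate" the pre-solved answers into a key/value store.
function buildLookup(itemList, maxCapacity) {
  const store = new Map();
  solveAll(itemList, maxCapacity).forEach((value, capacity) => {
    store.set('capacity:' + capacity, value);
  });
  return store;
}
```

The expensive work happens once at build time; at request time the application only needs a key lookup, which is the kind of access a key/value store is good at.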

