
The Future of NoSQL - gk1
https://foundationdb.com/key-value-store/white-papers/future-of-nosql
======
haney
While I understand the initial appeal of schemaless databases, in my
experience the schema is the best living documentation of the shape of the
data. It becomes really handy to decouple this from the application layer when
you start having multiple clients connecting to the database (transactional vs
analytics workloads). I've also had my fair share of seemingly
non-deterministic behavior when working, well-tested code hits old data that
you forgot was in a slightly different format.

~~~
jn
> when you start having multiple clients connecting to the database

If you assume that will happen, then all the things you suggest are true: the
schema is indeed the best documentation, and clients will have to pay close
attention to versioning since you can really only have one version of a fixed
schema at a time. You'll probably also want to move business logic into the
database in the form of foreign key constraints, triggers and the like.
Getting that right is _really_ important to protect against a broken client
corrupting data.

But that isn't the only strategy. You can instead have clients connect to an
API, with the API implementation being the _only_ thing that connects to the
database. The API becomes the documentation. It can handle versioning. It
handles business logic. In this world, the database schema is much less
important, and you can safely use schemaless databases.

Both designs have their advantages, and multiple clients connecting directly
to the database may well be a better choice in many circumstances, but it's
not inevitable.

------
letstryagain
> Schema-less design allows data to be modeled more flexibly than in
> relational databases, which lock the developer into a single schema at any
> given point in operations.

Implying that schemaless design is a GOOD thing

~~~
xj9
I've always been interested in NoSQL databases, but I never understood the
advantage of a schema-less design. Migrations make it so you can change the
structure of your data at any time so you really aren't locked into anything.
Even ignoring that, your data _has_ to have some kind of shape to it (a kind
of ad-hoc schema) and you still have to deal with data whose shape has changed
(which migrations take care of _for you_ ).

~~~
skybrian
Migrations get expensive and risky when you have lots of data. Being able to
do them incrementally is sometimes worth the complication of maintaining code
to read older versions. A traditional database doesn't let you do that; you
have to execute an "alter table" statement all at once.
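
As a rough illustration of the incremental approach, here is a minimal sketch of upgrading records lazily on read. It is not tied to any particular database: the `store` is a plain dict, and the document shapes, the `schema_version` field, and the function names are all invented for the example.

```python
# Hypothetical document shapes, invented for illustration:
#   v1: {"name": "Ada Lovelace"}
#   v2: {"schema_version": 2, "first_name": "Ada", "last_name": "Lovelace"}

CURRENT_VERSION = 2


def upgrade_v1_to_v2(doc):
    """Split the single v1 'name' field into first and last name."""
    first, _, last = doc["name"].partition(" ")
    return {"schema_version": 2, "first_name": first, "last_name": last}


# One upgrade step per old version; a future v3 would add an entry keyed by 2.
UPGRADES = {1: upgrade_v1_to_v2}


def read_document(store, key):
    """Return the document in the current shape, upgrading it if it is stale."""
    doc = store[key]
    # Assumption: documents written before versioning existed count as v1.
    version = doc.get("schema_version", 1)
    while version < CURRENT_VERSION:
        doc = UPGRADES[version](doc)
        version = doc["schema_version"]
    store[key] = doc  # write back so this record is only migrated once
    return doc
```

The cost is exactly the one mentioned above: every old reader (`UPGRADES` here) has to stay in the codebase until the last old record has been touched or swept by a background job.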

~~~
Rapzid
The current state of the art with MySQL uses temporary tables and triggers to
perform online migrations.
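
The general shape of that technique can be sketched as follows. This is only an illustration of the idea, using SQLite through Python's `sqlite3` module rather than MySQL or any real migration tool; the `users` table, the added `email` column, and the trigger names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO users (id, name) VALUES (1, 'ada'), (2, 'alan'), (3, 'grace');

    -- 1. Shadow table with the new schema (adds an email column).
    CREATE TABLE users_new (id INTEGER PRIMARY KEY, name TEXT, email TEXT);

    -- 2. Triggers mirror every live write on the old table into the shadow.
    CREATE TRIGGER users_ins AFTER INSERT ON users BEGIN
        INSERT OR REPLACE INTO users_new (id, name) VALUES (NEW.id, NEW.name);
    END;
    CREATE TRIGGER users_upd AFTER UPDATE ON users BEGIN
        DELETE FROM users_new WHERE id = OLD.id;
        INSERT OR REPLACE INTO users_new (id, name) VALUES (NEW.id, NEW.name);
    END;
    CREATE TRIGGER users_del AFTER DELETE ON users BEGIN
        DELETE FROM users_new WHERE id = OLD.id;
    END;
""")


def backfill(conn, batch_size=2):
    """3. Copy existing rows in small batches instead of one big statement."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id FROM users WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        upper = rows[-1][0]
        # OR IGNORE: rows the triggers already copied are newer, so keep them.
        conn.execute(
            "INSERT OR IGNORE INTO users_new (id, name) "
            "SELECT id, name FROM users WHERE id > ? AND id <= ?",
            (last_id, upper),
        )
        conn.commit()
        last_id = upper


# Writes that arrive while the migration is in progress.
conn.execute("UPDATE users SET name = 'Ada' WHERE id = 1")
conn.execute("DELETE FROM users WHERE id = 2")
conn.execute("INSERT INTO users (id, name) VALUES (4, 'edsger')")

backfill(conn)

# 4. Swap the tables once the shadow has caught up.
conn.executescript("""
    DROP TRIGGER users_ins;
    DROP TRIGGER users_upd;
    DROP TRIGGER users_del;
    ALTER TABLE users RENAME TO users_old;
    ALTER TABLE users_new RENAME TO users;
""")

rows = conn.execute("SELECT id, name, email FROM users ORDER BY id").fetchall()
```

Real tools add a lot on top of this (throttling, replication awareness, an atomic swap under load), so treat it as a picture of the moving parts only.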

------
chad_walters
"After extensive experience working with Bigtable and other eventually
consistent systems..."

This is not accurate -- Bigtable is not eventually consistent. The scope of
transactions supported by a system is a different set of considerations from
the level of consistency it provides. Bigtable is consistent but only allows
for transactionality at the row level.

Optimistic concurrency control is nothing new and Percolator layered
transactions on top of Bigtable years back. Furthermore, TrueTime -- allowing
for comparatively low-latency update across a globally distributed set of DCs
-- is the real innovation in Spanner, not the use of optimistic concurrency
control.
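
For readers who haven't met the term, optimistic concurrency control boils down to "read without locking, then commit only if nothing changed underneath you". A minimal single-process sketch follows; it is a toy, not Percolator's or Spanner's actual protocol, and the `VersionedStore` class and its methods are invented for illustration.

```python
import threading


class ConflictError(Exception):
    """Raised when a value changed between a transaction's read and its commit."""


class VersionedStore:
    """Toy in-memory key-value store that keeps a version counter per key."""

    def __init__(self):
        self._data = {}  # key -> (version, value)
        self._lock = threading.Lock()  # guards only the short validate-and-write step

    def read(self, key):
        """Return (version, value); a missing key reads as version 0, value None."""
        with self._lock:
            return self._data.get(key, (0, None))

    def commit(self, key, expected_version, new_value):
        """Write new_value only if nobody else committed since expected_version."""
        with self._lock:
            current_version, _ = self._data.get(key, (0, None))
            if current_version != expected_version:
                raise ConflictError(key)
            self._data[key] = (current_version + 1, new_value)


def increment(store, key, max_retries=10):
    """Read-modify-write without holding a lock across the read and the write."""
    for _ in range(max_retries):
        version, value = store.read(key)
        try:
            store.commit(key, version, (value or 0) + 1)
            return
        except ConflictError:
            continue  # someone else won the race; re-read and try again
    raise RuntimeError("too many conflicts")
```

A writer that loses the race pays with a retry rather than by waiting on a lock, which is why the approach works best when conflicts are rare.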

Honestly, I am not sure what this article is trying to claim, except perhaps
that per-node performance has been improved. AFAICT, most of this is due to
the fact that RAM is cheaper than it was, SSDs have reached commoditization,
and networks in the DC are faster than they used to be.

------
postmeta
NoSQL always seemed like a misnomer; it should be SomeSQL. PostgreSQL can do
the same kind of ops, and usually faster than your average NoSQL db:
[http://www.enterprisedb.com/nosql-for-enterprise](http://www.enterprisedb.com/nosql-for-enterprise)

~~~
joeclark77
According to Martin Fowler, #nosql was initially just a hashtag somebody came
up with for a meetup on some of the new databases, with nary a thought that it
would become the name of a "movement".

~~~
collyw
It should maybe be renamed noRealUnderstandingOfSQL.

Certainly the majority of proponents I speak to want to use noSQL databases
without any real reason, and where a relational database would be a better
fit.

~~~
e12e
Right. What did that Codd guy know, anyway? As if data can be modelled as
relations! Information wants to be FREE! Normal forms are a straitjacket that
no self-respecting hipster would wear to the bar at lunch-time.

------
marknadal
I can't blame these guys for trying to write these articles, because I also
write on these subjects. BUT there is always something that bothers me about
them (maybe because I'm jealous), because their advertisements follow me around
- even though I use ad blockers.

Basically, I feel like it is all promotional material, despite the fact that
it is dealing with technical stuff that I research all the time, and well, as
a hypocrite also try to promote my work. But I open source my work, while with
them I feel like they are just trying to bait/give-an-excuse for me to click
the buy button.

But I know, as with all software, it is going to come with the initiation...
and in psychology, there is the sunk cost fallacy. Right? Where the more I
invest, the more I want to believe and will make excuses to keep going with
it. And so the incentives aren't actually aligned.

Compare this with let's say MongoDB - not only is it free AND ridiculously
easy to start playing with, they then have to wait long enough to bait you
into their consulting. However, as we go up the "cult ladder of software" we
don't have the sunk-cost fallacy of money we've invested (just time), so it is
easier for us to bail. That sucks for 10gen.

So with Foundation, I'm sure that once you are in the circle, it is wonderful,
because that is what you are paying for. But it makes me not even want to
enter into the circle in the first place. So what am I asking of Foundation?
To open source their product? Make their lives suckier and harder to pay their
bills?

Unfortunately yes, not because I'm malicious, but because database technology
is much more about academics than anything else. Not that business and money-
making can't co-exist with that, but because it is a field where we
fundamentally have to have open access and collaborate even if we are
competing for funds/grants/money/customers.

Let's ask why again. Why? Because nobody, businesses or not, is going to get
reliable progress until then. Yes, we'll get incremental innovations, and
Foundation probably has that, but these will be lost and reset time and time
again, until everyone in the field is willing to sacrifice their all.
Unfortunately, this is a field that requires extreme expertise and costly
talent to push forward, and even then, only slowly.

So I'm not going to even bother analyzing the technical claims of the post,
because we have so much work to do first in even defining and making the terms
more common, understandable, concrete and clear. Despite a century or so worth
of work in the field.

Honestly, I feel like [http://aphyr.com/](http://aphyr.com/) is making the most
important contribution, despite the fact that he isn't even building solutions
(compared to my team and Foundation's). Why? Because Aphyr has had great
success in popularizing the discussion and providing needed clarity. And yes
yes yes, I know that Foundation claims to have run their own Jepsen tests and
so on. Good for them, I'm still trying to even get to that point. But...
openness.

I probably shouldn't be as critical as I am being; a lot of my complaints in
here can easily be refuted with various aspects of Foundation, like their free
getting started, like their pay-what-you-need, etc. And obviously they are
contributing to the discussion by making posts like this. But I feel like at
least some of my comments and sentiments resonate, maybe at least in helping
Foundation know that they are giving off a weird vibe/signal that they aren't
even aware of.

