

5 predictions on the future of databases (from a guy who knows databases) - lukaseder
http://gigaom.com/2013/12/14/5-predictions-on-the-future-of-databases-from-a-guy-who-knows-databases

======
falcolas
"However, it’s still not an endorsement of MySQL as much as it is a
recognition of Facebook’s database chops."

That's certainly one way to interpret Facebook's dilemma (I know a few DBAs,
and this statement certainly isn't false). Another way is that nothing else
scales as well as MySQL, or that nothing else offers the same tooling for
managing a database at that scale.

Relational data stores aren't going anywhere; there are still quite a few use
cases they fulfill that NoSQL, NewSQL, and Graph DBs do not. This just
triggers the "use the right tool for the job" reflex in me.

~~~
lukaseder
Well, after all, this _was_ Michael Stonebraker, the salesman, talking. Not
Michael Stonebraker, the computer scientist...

------
angryasian
>in-memory architectures for transactions

Can someone explain this to me? In terms of ACID principles, wouldn't you
want your transactions to be durable?

>Durability means that once a transaction has been committed, it will remain
so, even in the event of power loss, crashes, or errors. In a relational
database, for instance, once a group of SQL statements execute, the results
need to be stored permanently (even if the database crashes immediately
thereafter). To defend against power loss, transactions (or their effects)
must be recorded in a non-volatile memory.

~~~
ericfrenkiel
I'm a cofounder at MemSQL. Durability for in-memory databases is more nuanced
than it is for disk-based systems.

First, it's worth mentioning that you use an in-memory database when you have
performance requirements that far exceed the physical capabilities of hard
disks or even SSDs: workloads with a high degree of contention, high
concurrency, and so on.

Second, you're probably working with non-human-generated data. If you're not
a bank that needs to guarantee a debit and a credit went through (and that's
slow, human-generated data anyway), then you're looking at in-memory
technology because you can guarantee every read will hit memory.

For writes, you basically have to use more machines to guarantee durability.

Any in-memory database worth its salt will write a transaction log to disk,
so in the event of a power failure it can replay that log and rebuild its
state in memory.

You can increase the probability of zero data loss, even under high
contention and high concurrency, by running the dataset across a set of
machines: that both multiplies the number of disks writing sequential logs
and lets you keep an extra copy in memory for high availability.
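
To make the log-and-replay idea concrete, here is a minimal sketch in
Python. The class and log format are invented for illustration; this is the
generic write-ahead-log pattern, not MemSQL's actual implementation.

    import json
    import os

    class InMemoryStore:
        """In-memory key/value store that appends every write to a
        transaction log on disk and replays that log on restart."""

        def __init__(self, log_path="txlog.jsonl"):
            self.log_path = log_path
            self.data = {}
            self._replay()  # recover state after a crash or restart

        def _replay(self):
            if not os.path.exists(self.log_path):
                return
            with open(self.log_path) as log:
                for line in log:
                    entry = json.loads(line)
                    self.data[entry["key"]] = entry["value"]

        def put(self, key, value):
            # Append to the sequential log and fsync before acknowledging,
            # so a committed write survives power loss.
            with open(self.log_path, "a") as log:
                log.write(json.dumps({"key": key, "value": value}) + "\n")
                log.flush()
                os.fsync(log.fileno())
            self.data[key] = value  # reads only ever touch memory

        def get(self, key):
            return self.data.get(key)

    store = InMemoryStore()
    store.put("balance:42", 100)
    assert InMemoryStore().get("balance:42") == 100  # survives a "restart"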

------
batbomb
I think VoltDB could be huge, but the utter reliance on stored procedures is
misguided, in my opinion. The reasoning behind it is really to optimize
database access at the database level, including targeting features like JIT
compilation and leveraging zero-copy, but it's hard to get all of that to
work right, to unit test, to debug, etc. On the other hand, his "NewSQL"
approach could probably become a good replacement for many use cases where
MongoDB would otherwise be used.

~~~
falcolas
I have a separate problem with stored procedures: they push more and more
work into the single hardest piece of software to scale - your database.

~~~
y0ghur7_xxx
> pushing more and more work into the single hardest piece of software to
> scale - your database.

I never understood this argument. Maybe you can elaborate a bit? Almost all
of our apps use stored procedures for the heavy data lifting (most of the
so-called "business logic"), and we've never had scaling problems. Granted,
we don't have that many users or that much data either, but how many users
and how much data do you have that your stored procs can't keep up with?
What numbers are we talking about?

What makes the DB slow is throwing bad SQL at it, not stored procs.

~~~
falcolas
Terabytes and 5-6 thousand queries per second.

> What makes the DB slow is throwing bad SQL at it, not stored procs

Absolute statements are absolutely wrong. :)

Particularly with DBs like PostgreSQL, where you can (and people do) run
Python scripts as stored procs. Great way to slow down your DB.
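
For example, something along these lines (plpythonu is PostgreSQL's Python
procedural language; the function and table names are made up for
illustration):

    CREATE FUNCTION slow_tokenize(doc text) RETURNS integer AS $$
        # arbitrary Python, running inside the database server process
        import re
        return len(re.findall(r'\w+', doc))
    $$ LANGUAGE plpythonu;

    -- every row now pays the interpreter cost on the DB server's CPU
    SELECT avg(slow_tokenize(body)) FROM comments;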

~~~
y0ghur7_xxx
> _Terabytes and 5-6 thousand queries per second._

That is not much info :)

Just for reference: a simple query by primary-key index on my very, very low
end PC¹, on a table with 1 million rows, takes about 0.04 ms (according to
EXPLAIN ANALYZE). That is 25,000 queries per second (a rough way to measure
this end to end is sketched below). What type of app are we talking about
here where the DB needs to scale? Something like Reddit? Stack Overflow?
Just curious; I've never seen such a beast where a DB server needs that much
power. Where I work we have 200 apps on the same DB server, and it's idling
most of the time.

¹ i5 CPU @ 2.40 GHz with 2 GB of RAM
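
A minimal measurement sketch, assuming psycopg2 and a made-up "accounts"
table with an integer primary key. Note that this includes client round
trips, so it will report fewer queries per second than EXPLAIN ANALYZE's
per-query plan time suggests:

    import random
    import time
    import psycopg2

    conn = psycopg2.connect("dbname=test")
    cur = conn.cursor()

    n = 10000
    start = time.perf_counter()
    for _ in range(n):
        cur.execute("SELECT balance FROM accounts WHERE id = %s",
                    (random.randint(1, 1000000),))
        cur.fetchone()
    elapsed = time.perf_counter() - start
    print(f"{n / elapsed:.0f} queries/second (round trips included)")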

> _run Python scripts as stored procs. Great way to slow down your DB._

There is more than one way to shoot yourself in the foot :)

------
Eleutheria
One prediction about databases. Codd will rule forever.

~~~
lukaseder
Yes! This:

http://www.opensourceconnections.com/2013/12/11/codds-relational-vision-has-nosql-come-full-circle/

------
Vektorweg
One size could fit everything. With a good, flexible query language and a
sufficiently smart compiler, it might be possible. Which would mean: one
size, one winner. And SQL isn't flexible enough, so SQL will fall back down
to earth.

------
d4nt
At some point, I think traditional database platforms like Oracle and SQL
Server will build things that make it easier to treat them like a NoSQL
store (simpler replication, the ability to distribute read-only queries
across multiple replicas, a few enhancements to sparse columns), and the
NoSQL stores will add ACID, a query language, ODBC support, etc. At that
point, talking about the "relational market" or the "NoSQL market" will not
make a lot of sense. They'll all just be competing storage engines.

