
Immutability, MVCC, and garbage collection - mattrjacobs
http://www.xaprb.com/blog/2013/12/28/immutability-mvcc-and-garbage-collection/
======
jasonwatkinspdx
Pretty disappointing critique.

There are manifold differences between couchdb, datomic, rethinkdb and more
traditional sql databases, but the author can't see past his pet issue. He
doesn't seem to understand the use cases, or the infrastructure differences.
In terms of use cases, there are plenty of analytical datasets that are
strictly monotonic. Nothing in such a dataset is ever overwritten, so there is
no reclaimable storage to worry about in the first place.

But CoW index write amplification you say? Well now let's talk about
infrastructure differences. Datomic can use s3 as a storage layer. Cold data
storage is then $10/month/terabyte. And compaction is trivially non-blocking
in a CoW dataset. It can be done by EC2 spot instances whenever it's convenient
with zero impact on the production system. Don't generalize the experiences of
a few folks running couchdb on a couple of in-house servers to people working
with a fundamentally different approach on fundamentally different
infrastructure.

Back to the cheap cost of cold storage, for most businesses that's well into
"who cares" cost territory, but having the data also allows them to preserve
full lineage in the event they discover a data-loss bug in their application.
That could potentially be quite valuable. One way to think of these systems is
that they have integrated backup data. MySQL folks generally don't count the
size of backup sets as part of the database footprint, and common backup
schedules (e.g., grandfather-father-son rotation) provide only a limited ability to dig back
into the past.

By way of simple argument: would _anyone_ advise using a source control system
that only kept the last few hundred commits? Why do we treat our data
differently? The answer is largely one of implementation constraints: text is
cheap, a VCS doesn't index the way databases do, etc. But as storage and
processing densities continue to grow, more datasets shift into the trivial
category. As much as many HNews posters dream of being in a 'big data' shop,
the vast majority of startups have maybe a couple of GB in MySQL or Postgres. Our
status-quo toolset and approach discard value for these folks. Maybe you
think that's fine, but some of us are going to keep pushing to enable new
capabilities and advantages. Increasingly these implementation constraints
will disappear as people attack them. It'd be nice not to suffer scornful
surface criticisms from people who base their careers on the status quo.

~~~
homerowilson
"By way of simple argument: would anyone advise using a source control system
that only kept the last few hundreds of commits? Why do we treat our data
differently?"

Well said!

~~~
RyanZAG
What? Come on, really? My database - only the latest version of all the data
that gets updated frequently - holds terabytes of data. My whole git repo with
full history is about 500 MB, and that's excluding large assets. It's a
completely different ball game. Sure, if I could buy 100 TB disks cheaply and
set things up so I could access data across hundreds of them in real time,
then there wouldn't be a problem. Obviously, I can't do that.

As the author of the blog post points out: the little thing called reality
gets in the way of such nice and perfect arguments and ideas.

~~~
jasonwatkinspdx
When talking about architecture at this level, you need to take a longer view.
Once upon a time, a gigabyte database was mammoth beyond thought. Also, you're
taking it as given that your data has its present storage footprint even
though alternative architectures might have very different properties. It's
worth noting that a purely additive database can be very aggressive about
compression.

But also, petabyte datasets are downright cheap to deal with these days.
They'll look more like that half-gigabyte source code repo sooner than I think
you expect.

~~~
corresation
_petabyte datasets are downright cheap to deal with these days_

Are you from the future? In no universe is a petabyte database remotely close
to being cheap, even for large organizations. At this point I am seriously
questioning your industry knowledge.

~~~
empthought
Sure it is: a few thousand entries pointing to 10-100 GB pirated HD video
blobs. What's the problem?

/s

------
noelwelsh
The author ends with "Also, if I’ve gone too far, missed something important,
gotten anything wrong, or otherwise need some education myself, please let me
know so I can a) learn and b) correct my error." so allow me to offer the
following:

When you spend a substantial portion of your post sniping at the competition,
you come across as an arrogant arse. For example, the second sentence in the
post is "My overall impressions of Datomic were pretty negative, but this blog
post isn’t about that." Then remove this line! It adds nothing to the point you
claim you want to make. The same goes for the next section regarding append-only
B-trees, and then we have snark directed at RethinkDB, and so on. I got about
half-way through the post before I just skimmed to the end because I was sick of
the arrogance and insults, and I was beginning to doubt your authority to
comment on this area as a result -- if you have substantial points to make, you
don't need to cover them up with this crap.

------
rdtsc
Well, they are all tools with trade-offs. It is good to have something like
Datomic. The trick with Datomic (from a 10-minute summary I listened to last
year) is that because of immutability, interesting things can happen with
respect to caching. You can cache things locally and have smart clients. It
simplifies the architecture and adds interesting performance properties.

Sometimes immutability is not right for you, sometimes it is.

He criticizes CouchDB as well. Yes, you need to have double the disk space
during compaction. But CouchDB now has auto-compaction triggers based on a
fragmentation threshold or on time of day. Immutability allows completely
lock-less reading. Crash-only stopping (you can just kill the power or the
service at any time without corrupting the database). Master-to-master
replication lets you build custom cluster topologies. Well-defined conflict
models and explicit conflict handling are invaluable (as opposed to other
databases that hide and paper over that, sometimes losing data in the process
-- read Aphyr's blog [http://aphyr.com/](http://aphyr.com/) for a thorough
study of that). Do you need those features? Maybe you don't, maybe you do. It
is good to be aware of them and have them in your toolbox if you need them one
day.
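
As a concrete illustration of the explicit conflict handling, here is a rough
sketch against CouchDB's HTTP API (Python with the requests library; the
database URL and the toy "keep the winner" policy are mine, not CouchDB's):
ask for a document together with its conflicting revisions, decide what to
keep, and delete the losing revisions.

```python
import requests

BASE = "http://localhost:5984/mydb"  # hypothetical database URL

def resolve_conflicts(doc_id):
    # conflicts=true makes CouchDB include the _conflicts revision list.
    doc = requests.get(f"{BASE}/{doc_id}", params={"conflicts": "true"}).json()
    losers = doc.get("_conflicts", [])
    if not losers:
        return doc  # no conflict, nothing to do

    # CouchDB only picks a deterministic winner; merging is up to the
    # application. As a toy policy, keep the current winner and drop the rest.
    for rev in losers:
        requests.delete(f"{BASE}/{doc_id}", params={"rev": rev})
    return doc
```

A real application would merge the conflicting bodies rather than discard
them, but the point stands: conflicts are visible and handled explicitly
instead of being silently papered over.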

------
richhickey
I'd ignored this, trying to have a nice holiday with my family (without having
to contend with wrongness on the internet :), but would like to introduce some
facts. Datomic is not an append-only b-tree, and never has been.

There's a presumption in the article that "immutable" implies "append-only
b-tree"; specifically, an argument against append-only b-trees is used to
disparage immutability. That's pretty naive. One need only look at, e.g., the
BigTable paper for an example of how non-append immutability can be used in a
DB. Such architectures are now pervasive. Datomic works similarly.
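
For readers who haven't seen that family of designs, here is a toy sketch
(Python, every name invented for illustration; this shows the general
BigTable/LSM idea, not Datomic's actual storage layer): writes land in a small
in-memory buffer that is periodically frozen into immutable sorted runs, reads
consult the newest data first, and a background compaction merges the runs
into a fresh one. The on-disk runs are never modified in place, and there is
no append-only b-tree anywhere.

```python
import bisect

class TinyLSM:
    """Toy LSM-style store: a mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}        # newest data, small and mutable
        self.runs = []            # immutable sorted lists of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            # Flush: freeze the memtable into an immutable sorted run.
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newer runs shadow older ones; nothing is rewritten in place.
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge every run into one new immutable run, newest value winning.
        # A real system does this in the background, a few runs at a time,
        # while readers keep using the old runs until the new one is ready.
        merged = {}
        for run in self.runs:          # oldest to newest
            merged.update(dict(run))
        self.runs = [sorted(merged.items())]
```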

One can't make a technical argument against keeping information. It's
obviously desirable (git) and often legally necessary. Our customers want and
need it. Yeah, it's also hard, but update-in-place systems (including MVCC
ones) that discard information can't possibly be considered to have "solved"
the same problem.

Prefixing a polemic with "apparently" doesn't get one off the hook for
spreading a bunch of misinformation.

------
vardump
Append-only b-tree/skiplist/whatever immutability can be combined with
"checkpoint hashes" (think git) using, for example, SHA-512; the checkpoints
can be transmitted elsewhere at low cost and used later to validate system
integrity. Any tampering with the data store would be detectable. In a mutable
database you'd need to transmit the whole transaction log for the same
integrity guarantee.
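
In toy form the scheme looks something like this (plain Python; the record
format, batching, and genesis value are made up for illustration):

```python
import hashlib

def checkpoint(prev, records):
    """Hash the previous checkpoint plus every record appended since it."""
    h = hashlib.sha512()
    h.update(prev)
    for record in records:
        h.update(hashlib.sha512(record).digest())
    return h.digest()

log = []                                   # the append-only store
checkpoints = [b"\x00" * 64]               # genesis checkpoint
batch = [b"fact-1", b"fact-2"]
log.extend(batch)
checkpoints.append(checkpoint(checkpoints[-1], batch))  # ship this elsewhere

def verify(log, checkpoints, batch_sizes):
    """An auditor holding only the published checkpoints can re-derive them
    from the full log; any edit to a past record changes every later hash."""
    prev, i = checkpoints[0], 0
    for size, expected in zip(batch_sizes, checkpoints[1:]):
        prev = checkpoint(prev, log[i:i + size])
        if prev != expected:
            return False
        i += size
    return True

assert verify(log, checkpoints, [2])
```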

Thus an immutable store makes it trivial to implement systems that need
full history and an audit trail.

When there's only a trivial amount of data involved, immutable state can save
a lot of headaches. No programming error can lose or corrupt data, for instance
- every case can be traced back after the fact.

I hope some immutable-store databases that cover those use cases take off.
Make it a graph database; bonus points if the underlying engine implements a
directed hypergraph. And an artificial-intelligence reasoning engine. :)

------
carry_bit
"If entities are made of related facts, and some facts are updated but others
aren’t, then as the database grows and new facts are recorded, an entity ends
up being scattered widely over a lot of storage."

As far as I understand, this isn't so. All data in Datomic is stored in the
indices, so the facts associated with an entity should be stored together.

------
gfodor
I'm going to go out on a limb here and use a logical fallacy, but I have a
feeling that Rich Hickey probably has some understanding of the history of
database theory and didn't just build Datomic out of ignorance.

~~~
jeffdavis
But he also does not have infinite engineering resources at his disposal. I
expect that a lot of the designs are probably somewhat simplistic compared to
the battle-hardened implementations in common use today.

It doesn't mean Rich Hickey was wrong; merely that we are seeing only the
ultra-clean design now. After Datomic takes off and has a lot of production
users in demanding environments for 5-10 years, I doubt the design will be
quite so clean.

~~~
DigitalJack
I thought he co-designed it with a company called Relevance.

~~~
moomin
Well, there used to be a firm called Datomic, and one called Relevance, but
they merged to form Cognitect.

------
webmaven
To me, what was most striking about the article (once I got past the pot-shots)
was how much the descriptions of Datomic and RethinkDB reminded me of
the ZODB, the object database built into the Zope framework:
[http://www.zodb.org](http://www.zodb.org)

ZODB is an append-only object database, and one of the data structures
commonly used for persistence in it is the BTree:
[http://www.zodb.org/en/latest/documentation/guide/modules.ht...](http://www.zodb.org/en/latest/documentation/guide/modules.html#btrees-
package)

Several disadvantages noted in the OP (infinitely growing data, needing to
pack the DB to discard old object versions, needing 2x disk space to do the
packing) definitely apply to ZODB, but others do not, as it is ACID-compliant
and implements MVCC.
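
For anyone curious, a minimal session looks roughly like this (file name and
keys are arbitrary; a sketch, not a complete Zope setup). Every commit appends
new object records to the Data.fs file, and pack() is the step that discards
old revisions and temporarily needs the extra disk space:

```python
import transaction
from ZODB import DB
from ZODB.FileStorage import FileStorage
from BTrees.OOBTree import OOBTree

db = DB(FileStorage('Data.fs'))   # append-only storage file
conn = db.open()
root = conn.root()

root['users'] = OOBTree()
root['users']['alice'] = 1
transaction.commit()              # appends new object records

root['users']['alice'] = 2
transaction.commit()              # file grows; the old revision stays on disk

db.pack()                         # discard unreachable old revisions (the 2x-disk step)
db.close()
```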

------
yresnob
The author needs to read up way more on Datomic... and Clojure for that
matter... then try again.

------
dschiptsov
A very rare example of old-school thinking and reasoning.

Indeed, industrial-strength databases are hard, and as an Informix DBA, I can
only agree that there is no silver bullet, and that a plain "checkpoint early,
checkpoint often" strategy has its trade-offs in terms of concurrent query
performance. And, of course, unpredictable stop-the-world GC pauses are
unacceptable. Informix has polished its Dynamic Server for a decade and it is
still one of the best (if not _the_ best) solutions available.

The more subtle point is that one can never avoid compaction/GC pauses, so
the strategy should be to partition the data and perform "separate"
compaction/GC, blocking/freezing only some "snapshot" of the data, without
stopping the whole world, as some modern filesystems do.
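
In toy form the idea is just this (Python, purely illustrative): compact one
partition at a time, freezing only that partition while the rest of the store
keeps serving traffic.

```python
import threading

partitions = [dict() for _ in range(16)]   # toy partitioned store
locks = [threading.Lock() for _ in partitions]

def compact_partition(i, is_live):
    # Freeze only this partition; the other fifteen stay fully available.
    with locks[i]:
        partitions[i] = {k: v for k, v in partitions[i].items() if is_live(k)}

def compact_all(is_live):
    for i in range(len(partitions)):       # incremental, never stop-the-world
        compact_partition(i, is_live)
```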

Of course, the implementation is hard, and _here_ the simplicity of persistent,
append-only data structures _could_ be beneficial for the implementer of
"partitioned/partial GC".

There must be some literature on this, because these ideas were well researched
back when the design of Lisp GCs and CPU caches were hot topics.

------
greatsuccess
I'd like to keep immutability and databases separate in this comment.
Immutability, the supposed "big advantage" of functional languages, is a
one-sided discussion.

No one ever discusses the implications of a system which is constantly,
needlessly, insanely doing nothing but MAKING COPIES OF DATA.

This is not how systems are/should be designed, and I'm sure as hell not going
to use a functional language until functional programming understands basic
common sense. A system that is constantly copying data for no reason will do
nothing else.

~~~
talaketu
> No one ever discusses the implications of a system which is constantly,
> needlessly, insanely doing nothing but MAKING COPIES OF DATA.

see _persistent data structure_
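
For instance, a persistent linked list (a toy Python sketch, not Clojure's
actual implementation) shows that an "update" produces a new version sharing
almost all of its structure with the old one, rather than copying the data:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Node:
    value: int
    rest: "Optional[Node]" = None   # shared with older versions, never mutated

def push(lst, value):
    """Return a new list; the old one is untouched and still usable."""
    return Node(value, lst)

v1 = push(push(None, 1), 2)   # the list [2, 1]
v2 = push(v1, 3)              # the list [3, 2, 1]

# v2 did not copy v1: it simply points at the same nodes.
assert v2.rest is v1
```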

~~~
greatsuccess
I don't believe in this data structure. It's ill-informed Rich Hickey nonsense,
it doesn't scale, and it has no purpose but to seem clever.

~~~
TheHydroImpulse
> It's ill-informed Rich Hickey nonsense

How so?

> it doesn't scale

I take it that since many of Clojure's primitives are built using this data
structure, they don't scale either? Yet someone else mentioned how performant
it actually is, given that it doesn't copy everything but merely shares common
structure.

Given your past comments, this isn't the first time you've said something
without knowing much about it.

