Interesting that you name them 'together'. On the surface, they are doing quite different things. On a deeper level, it seems to me, they are approaching things in a very similar manner. I think what they share is a style of work very detached from the hectic, local improvement approach, which is usually forced upon us in industry for efficiency reasons. They inspiringly take their time to dig deep to identify hidden assumptions to get to the root causes of problems. Quite in the sense of the artist or scientist Bertrand Russell thought of. http://downloads.bbc.co.uk/rmhttp/radio4/transcripts/1948_re...
At a deeper level they are both pragmatic philosophers. They think at a high level but they are hands-on and have their feet firmly on ground realities. Inventing on principle is Bret Victor's contribution, but Rich Hickey surely lives it; and Hammock-Driven Development is Rich's notion, but there wouldn't be "Inventing on principle" without HDD on Bret's part. These two are awesome.
Strange Loop is a very hot ticket. I tried to get a ticket w/ company support within 6 weeks of the early admission offer... no luck, the conference was sold out. I'm happy to see such a heavily tech-focused conference doing so well. I'll be looking to get a very early ticket next year.
Wholeheartedly concur. I'm also a fan of Rob Pike's straightforward way of "getting to the core" of quite tricky concepts and putting them forth in such a way that they become obvious. Pike's razor, perhaps?
Fascinating stuff. Some things that came up for me while watching this and the other videos on their site:
It's not Open Source, for anyone who cares about that. It's interesting how strange it feels to me for infrastructure code to be anything other than Open Source.
I'm sort of shocked that the query language is still passing strings, when Hickey made a big deal of how the old databases do it that way. I guess for me a query is a data structure that we build programmatically, so why force the developer to collapse it into a string? Maybe because they want to support languages that aren't expressive enough to do that concisely?
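To illustrate what "a query is a data structure that we build programmatically" could look like, here is a minimal Python sketch. It is not Datomic's API: the fact layout, the "?var" convention, and the helper names (run_query, by_attribute) are all assumptions made up for this example.

```python
# Hypothetical facts as (entity, attribute, value) tuples, for illustration only.
FACTS = [
    (1, "name", "alice"),
    (1, "email", "alice@example.com"),
    (2, "name", "bob"),
]


def is_var(term):
    """Treat strings starting with '?' as query variables (an assumed convention)."""
    return isinstance(term, str) and term.startswith("?")


def match_clause(clause, fact, bindings):
    """Unify one clause with one fact; return extended bindings, or None on mismatch."""
    new = dict(bindings)
    for term, value in zip(clause, fact):
        if is_var(term):
            if term in new and new[term] != value:
                return None
            new[term] = value
        elif term != value:
            return None
    return new


def run_query(find, where, facts):
    """Evaluate the clauses left to right, then project the 'find' variables."""
    results = [{}]
    for clause in where:
        extended = []
        for bindings in results:
            for fact in facts:
                new = match_clause(clause, fact, bindings)
                if new is not None:
                    extended.append(new)
        results = extended
    return [tuple(b[v] for v in find) for b in results]


def by_attribute(attr, value):
    """Build a query as plain data: no string concatenation or escaping."""
    return {"find": ["?e"], "where": [("?e", attr, value)]}


# Because the query is data, it can be extended with ordinary list operations.
q = by_attribute("name", "bob")
q["find"].append("?n")
q["where"].append(("?e", "name", "?n"))
rows = run_query(q["find"], q["where"], FACTS)
```

The point of the sketch is only that composing and extending the query needs no string manipulation; a real query engine would of course use indexes rather than scanning every fact.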
I'm always puzzled when the Datomic folks speak of reads not being covered under a transaction. This is dangerous.
Here's the scenario that, in a conventional update-oriented store, is termed a "lost update". "A" reads object.v1, "B" reads the same version, "B" adds a fact to the object making it v2, then "A" comes along and writes obj.v3 based on its own _stale_ knowledge of the object. In effect, it has clobbered what "B" wrote, because A's write came later and has become the latest version of the object. The fact that Datomic's transactor serializes writes is meaningless because it doesn't take read dependencies into account.
In other words, Datomic gives you an equivalent of Read-committed or snapshot isolation, but not true serializability. I wouldn't use it for a banking transaction for sure. To fix it, Datomic would need to add a test-and-set primitive to implement optimistic concurrency, so that a client can say, "process this write only if this condition is still true". Otherwise, two clients are only going to be talking past each other.
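The lost-update scenario and the proposed test-and-set fix can be sketched in a few lines of Python. This is a toy single-object store, not any real database's API; VersionedStore, blind_write and write_if_version are hypothetical names chosen for the example.

```python
class VersionedStore:
    """Toy single-object store; every successful write bumps the version."""

    def __init__(self, value):
        self.value = value
        self.version = 1

    def read(self):
        return self.value, self.version

    def blind_write(self, value):
        """Last writer wins, regardless of what the writer had read."""
        self.value = value
        self.version += 1

    def write_if_version(self, expected_version, value):
        """Test-and-set: apply the write only if nobody wrote since the read."""
        if self.version != expected_version:
            return False
        self.value = value
        self.version += 1
        return True


def lost_update_demo():
    store = VersionedStore({"balance": 100})
    a_val, _ = store.read()                              # A reads v1
    b_val, _ = store.read()                              # B reads v1
    store.blind_write({**b_val, "note": "added by B"})   # B writes v2
    store.blind_write({**a_val, "balance": 150})         # A writes v3 from stale v1
    return store.value                                   # B's note has been clobbered


def test_and_set_demo():
    store = VersionedStore({"balance": 100})
    a_val, a_ver = store.read()
    b_val, b_ver = store.read()
    b_ok = store.write_if_version(b_ver, {**b_val, "note": "added by B"})
    a_ok = store.write_if_version(a_ver, {**a_val, "balance": 150})  # rejected
    return b_ok, a_ok, store.value
```

In the second demo A's write is refused, so A has to re-read and retry rather than silently overwrite B's change.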
Does anyone understand how this system would deal with the CAP theorem, in the case of a regular "add $100 then remove $50 from the bank account, in that order and in one go" type of transaction?
The transactor is supposed to "send" novelty to peers, so that they update their live index. That's one point where I would see trouble (suppose it lags, one "add" request goes to one peer, the "read" goes to the second, you don't find what you just added...)
Another place where I see it could mess things up is the "Data store" tier, which uses the same traditional techniques as today to replicate data between different servers (one peer requests facts from a "part" of the data store that's not yet synchronized with the one a second peer requests).
It seems like all those issues are addressed on his "a fact can also be a function" slide, but he skips it very quickly, so if anyone here could tell me more...
In Datomic you can set up a function the transactor can call within a transaction. This function takes the current value of the database and other supplied arguments (e.g. $100 and -$50 from your example) and according to its logic produces and returns a list of "changes" the transactor should apply to the database. The function is pure in the sense that it doesn't have any side effects, it just "expands" into new data. And of course this "expansion" and application of its results happens in the same transaction.
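A rough Python sketch of that idea follows. It only models the shape described above (a pure function of the current database value that returns a list of changes, applied by a single serialized writer); Transactor, current_value and adjust_balance are hypothetical names, and nothing here is Datomic's actual API.

```python
import threading


class Transactor:
    """Toy single writer: runs registered functions one at a time."""

    def __init__(self):
        self._facts = ()               # immutable tuple of (entity, attribute, value, tx)
        self._tx = 0
        self._lock = threading.Lock()  # serializes all transactions
        self._functions = {}

    def register(self, name, fn):
        self._functions[name] = fn

    def db(self):
        """Return the current database value (an immutable tuple)."""
        return self._facts

    def transact(self, name, *args):
        with self._lock:
            # The function sees the database value as of this transaction
            # and only returns data; if it raises, nothing is applied.
            changes = self._functions[name](self._facts, *args)
            self._tx += 1
            new_facts = tuple((e, a, v, self._tx) for e, a, v in changes)
            self._facts = self._facts + new_facts
            return self._tx


def current_value(db, entity, attribute, default=None):
    """Latest asserted value for (entity, attribute) in a database value."""
    value = default
    for e, a, v, _tx in db:
        if e == entity and a == attribute:
            value = v
    return value


def adjust_balance(db, account, deltas):
    """Pure function: reads the db value and expands into new facts."""
    balance = current_value(db, account, "balance", 0)
    for delta in deltas:
        balance += delta
        if balance < 0:
            raise ValueError("insufficient funds")
    return [(account, "balance", balance)]


transactor = Transactor()
transactor.register("adjust", adjust_balance)
transactor.transact("adjust", "acct-1", [100, -50])   # both steps, in order, in one go
```

Because the read of the old balance and the write of the new one happen inside the same serialized step, no other writer can slip in between them.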
We're building some stuff with it atm but I can't go into details. I've run into several others using it for a variety of things as well and the support group is active. http://groups.google.com/group/datomic
I think Datomic is potentially disruptive and represents some great thinking on the part of an individual. Whether it will be disruptive will hinge on how well that thinking has subsumed the years of industry experience and practicalities, not to forget the conservative approach to data. I'd be interested to see how it pans out.
I think it will also be important what other databases come out of this. There is space for other priorities in terms of CAP in a world where perception and process are separated. Or just other implementations of the same ideas, open source maybe.
Conservative generally means using tried and tested technologies that have run stably for years on thousands of servers and where every corner case is well documented and well understood. Not using some new largely untested technology that sounds awesome on paper, but has yet to truly prove itself in the real world.
Disclaimer: I'm a huge fan of clojure and Hickey and I think Datomic sounds amazing. I'm currently looking for an excuse to play with Datomic and learn more about it, so don't take what I said as any indication that I'm somehow against Datomic.
- Rich's whole view on the world is pretty consistent with respect to this talk. If you know his view on immutability, values vs identity, transactions, and so forth, then you already have a pretty good idea about what kind of database Rich Hickey would build if Rich Hickey built a database (which, of course, he did!)
- The talk extends his "The Value of Values" keynote with specific applicability to databases
- Further, there is an over-arching theme of "decomplecting" a database so that problems are simpler. This follows from his famous "Simple made easy" talk
- His data product, Datomic, is what you get when you apply the philosophies of Clojure to a database
I've talked about this before, but I still think Datomic has a marketing problem. Whenever I think of it, I think "cool shit, big iron". Why don't I think about Datomic the same way I think about, say, "MongoDB"? As in, "Hey, let me just download this real quick and play around with it!" I really think the folks at Datomic need to steal some marketing tricks from the NoSQL guys so we get more people writing hipster blog posts about it ;-)
Watching this talk has so far (I'm halfway through, and now giving up) been very disappointing, primarily because many of the features and implementation details ascribed to "traditional databases" are not true of the common modern SQL databases, and almost none of them are true of PostgreSQL. As an initial trivial example, many database systems allow you to store arrays. In the case of PostgreSQL, you can have quite complex data types, from dictionaries and trees to JSON, or even whatever else you want to come up with, as it is a runtime extensible system.
However, it really gets much deeper than these kinds of surface details. As a much more bothersome example that is quite fundamental to the point he seems to be taking with this talk, at about 15:30 he seriously says "in general, that is an update-in-place model", and then has multiple slides about the problems of this data storage model. Yet, modern databases don't do this. Even MySQL doesn't do this (anymore). Instead, modern databases use MVCC, which involves storing all historical versions of the data for at least some time; in PostgreSQL, this could be a very long time (when a manual VACUUM occurs; if you want to store things forever, this can be arranged ;P).
This MVCC model thereby directly solves one of the key problems he spends quite a bit of time at the beginning of his talk attempting to motivate: that multiple round-trips to the server are unable to get cohesive state; in actuality, you can easily get consistent state from these multiple queries, as within a single transaction (which, for the record, is very cheap under MVCC if you are just reading things) almost all modern databases (Oracle, PostgreSQL, MySQL...) will give you an immutable snapshot of what the database looked like when you started your transaction. The situation is actually only getting better and more efficient (I recommend looking at PostgreSQL 9.2's serializable snapshot isolation).
At ~20:00, he then describes the storage model he is proposing, and keys in on how important storing time is in a database; the point is also made that storing a timestamp isn't enough: that the goal should be to store a transaction identifier... but again, this is how PostgreSQL already stores its data: every version (as again: it doesn't delete data the way Rich believes it does) stores the transaction range that it is valid for. The only difference between existing SQL solutions and Rich's ideal is that it happens per row instead of per individual field (which could easily be modeled, and is simply less efficient).
Now, the point he makes at ~24:00 actually has some merit: that you can't easily look up this information using the presented interfaces of databases. However, if I wanted to hack that feature into PostgreSQL, it would be quite simple, as the fundamental data model is already what he wants: so much so that the indexes are still indexing the dead data, so I could not only provide a hacked up feature to query the past but I could actually do so efficiently. Talking about transactions is even already simple: you can get the identifier of a transaction using txid_current() (and look up other running transactions if you must using info tables; the aforementioned per-row transaction visibility range is even already accessible as magic xmin and xmax columns on every table).
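For readers unfamiliar with the idea of row versions carrying a transaction range, here is a deliberately simplified Python sketch. The xmin and xmax names are borrowed from the PostgreSQL columns mentioned above, but the visibility rule is a toy: it ignores in-progress and aborted transactions, vacuum, and everything else a real MVCC implementation has to handle. ToyMvccTable and read_as_of are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    key: str
    value: str
    xmin: int                    # transaction that created this version
    xmax: Optional[int] = None   # transaction that superseded it, if any


class ToyMvccTable:
    def __init__(self):
        self.versions = []
        self.next_txid = 1

    def _begin(self):
        txid = self.next_txid
        self.next_txid += 1
        return txid

    def upsert(self, key, value):
        """An 'update' closes the live version and appends a new one."""
        txid = self._begin()
        for version in self.versions:
            if version.key == key and version.xmax is None:
                version.xmax = txid
        self.versions.append(RowVersion(key, value, xmin=txid))
        return txid

    def read_as_of(self, key, snapshot_txid):
        """Value visible to a snapshot taken just after snapshot_txid committed."""
        for version in self.versions:
            created = version.xmin <= snapshot_txid
            still_live = version.xmax is None or version.xmax > snapshot_txid
            if version.key == key and created and still_live:
                return version.value
        return None


table = ToyMvccTable()
tx_old = table.upsert("alice", "old@example.com")
tx_new = table.upsert("alice", "new@example.com")
old_email = table.read_as_of("alice", tx_old)   # the superseded version is still stored
new_email = table.read_as_of("alice", tx_new)
```

In this toy, "querying the past" is just choosing an older snapshot number, which is the sense in which the storage model already contains the history until something like vacuum removes it.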
In another talk he addresses your point specifically. He said, and I'm paraphrasing:
"It doesn't matter if you're using append only data-structures if your view of the world is update in place".
PostgreSQL exposes a view to the world of an update in place database, no matter what it's doing underneath. You could create a new interface to PostgreSQL's internals that doesn't and if you did, it would look a lot like datomic.
First, there is a major difference between MVCC and update-in-place that you can detect as a client, and that difference is that the problems that Rich outlines at the beginning of his talk do not happen: if one client edits something in the database, other transactions do not get an inconsistent view because the data on disk has already been permanently and irrevocably "updated in place". (Which, to be clear, means that modern SQL databases do not "expose a view to the world of an update in place database".)
Second, if all that is required to get his model is to add a command to an existing database (such as PostgreSQL, as I feel I know enough about how it works to be confident that this would be a reasonably simple task) "mark the current transaction read-only and pretend that it is as old as transaction X" (something that can be implemented quite rapidly in an existing system like PostgreSQL) we really aren't talking about something that is either very new, or that totally reinvents the "traditional database".
> Which, to be clear, means that modern SQL databases do not "expose a view to the world of an update in place database".
I disagree with you on this point. MVCC may prevent you from suffering from locking problems, but you can still, from a user perspective, modify rows. A row is a place and you are updating it. It's definitional.
On your second point. The data model that datomic databases expose is very different from SQL databases. That's enough of a difference to say that it's fundamentally new. Furthermore, I don't think anyone would disagree that the architecture of datomic is very different from that of PostgreSQL.
When you compare a distributed database that uses immutable data against PostgreSQL, the thing that is immediately apparent to me is that garbage collection is much more difficult in the distributed setting. You can't just rewrite the network interface for PostgreSQL and get datomic, but you might be able to get single-server datomic.
I think you are seeing a simple design and thinking, "anyone could have thought that up", when actually it's not that easy. The design that you are looking at has obviously gone through extensive refinement.
> When you compare a distributed database that uses immutable data against PostgreSQL, the thing that is immediately apparent to me is that garbage collection is much more difficult in the distributed setting. You can't just rewrite the network interface for PostgreSQL and get datomic, but you might be able to get single-server datomic.
Postgres-XC solves this problem by adding a global transaction management server. I guess it does the same thing as the transactor for Datomic so there really is no major difference here.
In your last paragraph, I feel like you are mischaracterizing my overall thesis. I am not claiming the design is simple: MVCC took many lives in sacrifice to its specification and discovery, and I certainly am not claiming "anyone could have thought that up". Instead, my primary issue is that this is a talk about databases and database design that is providing motivation vs a strawman: specifically, the way Rich seems to believe "traditional databases" work, and for which we spend the first almost 20 minutes learning the negatives, roadblocks, and general downsides.
However, almost none of the things that he indicates actually are downsides of most modern database systems, and certainly not of PostgreSQL. His downsides include that the data structuring is simplistic, that you can't have efficient and atomic replication of it (not multi-master mind you, but seemingly even doing real-time replication of a master to a read-only slave while maintaining serialization semantics seems to be dismissed), and that if you attempt to make multiple queries you will get inconsistent data due to update-in-place storage.
Yes: update-in-place "storage", not "update-in-place semantics within the scope of an individual transaction". Even if he was very clear about the latter (which is again quite different from "update-in-place semantics", which MVCC definitely does not have), that would still undermine his points, as the problem of inconsistent data from multiple reads, a problem he goes into great detail about with an example involving a request for a webpage that needs to make a query first for its backend data and then for its display information, does not exist with MVCC.
During this discussion of storage, he specifically talks about how existing database storage systems work, not at the model level, but at the disk level, discussing how b-trees and indexes are implemented with their destructive semantics... and all of these details are wrong, at least for PostgreSQL and Oracle, and I believe even for MySQL InnoDB (although a lot of its MVCC semantics are in-memory-only AFAIK, so I'm happily willing to believe that it actually destroys b-tree nodes on disk).
The talk then discusses a new way of storing data, and that new way of storing data happens to share the key property he calls new with the old way of storing data. The result is that it is very difficult to see why I should be listening to this talk, as the speaker either doesn't know much about existing database design or is purposely lying to me to make this new technology sound more interesting :(. Your response that in a different talk he attempted to backpatch his argument with something that still doesn't seem to address MVCC's detectably-not-the-same-as-update-in-place-semantics doesn't help this.
Now, as I stated up front, after listening to half of this talk, I couldn't take it anymore, and I gave up: I thereby didn't hear an entire half hour of him speaking. Maybe somewhere in that second half there is something new about how some particular quirk of his model allows you to get a distributed system, but that seemed sufficiently unlikely after the first half that it really doesn't seem worth it, and based on the comments from discussion (such as in the threads started by bsaul and sriram_malhar, which seem to indicate that writes are centralized and reads are distributed, something you can do with any off-the-shelf SQL solution these days) that seems to hold up.
The model of consistency envisioned by Datomic is one in which consistency normally available only within a transaction is available outside of any transactions, and without any central authority. Consistent views can be reconstituted the next hour, day or week. Consistent points in time can be efficiently communicated to other processes. Nothing about MVCC gives you any of that. MVCC is an implementation detail that reduces coordination overhead in transactional systems. I used MVCC in the implementation of Clojure's STM. While you might imagine it being simple to flip a bit on an MVCC system and get point-in-time support, it is a) not efficient to do so, and b) still a coordinated transactional system.
The differences I am pointing out, and the notion of place I discuss, are not about the implementation details in the small (e.g. whether or not a db is MVCC or updates its btree nodes in place) but the model in the large. If you 'update' someone's email is the old email gone? Must you be inside a transaction to see something consistent? Is the system oriented around preserving information (the facts of events that have happened), or is the system oriented around maintaining a single logical value of a model?
The fact is with PostgreSQL et al, if you 'update' someone's email the old one is gone, and you can only get consistency within a transaction. It is a system oriented around maintaining a single logical value of a model. And there's nothing wrong with that - it's a great system with a lot of utility. But it isn't otherwise just because you say it could be.
Also, you seem to be reacting as if I (or someone) has claimed that Datomic is revolutionary. I have never made such claims. Nothing is truly novel, everything has been tried before, and we all stand on the shoulders of giants.
I'm sorry my talk didn't convey to you my principal points, and am happy to clarify.
First of all, thank you very much for the reply: you really didn't need to bother, as despite being a Clojure user who stores a lot of data, I'm probably simply not in your target market segment ;P.
For the record, I do not believe that you have explicitly stated this is revolutionary, although I believe various other people on HN in various threads on Datomic have. However, my specific reactions in the comment you are responding to are due to DanWaterworth's insistence that I believe that it is trivial: my original comment does not touch on this angle, and is entirely about "real databases aren't implemented like this".
That said, I do believe that if after 30 minutes of listening to a talk that doesn't mention "this is largely how existing systems are implemented, but we provide the ability to see all the rows at once", there is an implication "this isn't at all like anything you've ever seen or implemented before", which is why after DanWaterworth's comment, I started exploring that angle.
Yes: in the case of PostgreSQL's MVCC, the old e-mail is gone from the perspective of the model for other people not inside of a transaction viewing the contents, however the kinds of problems you were describing at the beginning of the talk did not need to avoid transactions.
However, the implementation is so close that if I were explaining this concept to someone else, I'd probably use it as a model, especially given that it even already reifies the special columns required to let you do the historical lookups (xmin and xmax).
As I mentioned in another comment on this thread (albeit in an edit a few minutes later), you can get historical lookup in PostgreSQL by just adding a transaction variable that turns off the mechanism that filters obsolete tuples: you can then use the already-existing transaction identifier mechanism and the already-existing xmin and xmax columns as the ordering.
The result is then that I'm watching the talk wondering where the motivation is: many of the listed motivations weren't really true faults of the existing systems, and the ones that remain seem like implementation details of the database technology.
In the latter situation, when I say it "could be" I really do mean "it is": PostgreSQL can take advantage of the fact that it is built out of MVCC when it builds other parts of itself, such as its streaming master/slave replication (which is another feature of many existing systems that you seemed to discount in your motivation section).
I am thereby simply not certain what the problem is that Datomic is trying to solve for me, whether it be revolutionary or evolutionary (again: I don't really care; I'm just commenting on the motivation section), as the listed motivations seem to be fighting against a strawman design for a database solution that doesn't have transactions to get you 90% there and isn't itself implemented and taking advantage of append-only storage.
Well, all you point out is that one aspect of Datomic could be implemented with some SQL systems. Datomic however has many other aspects that are interesting.
Other than that, the true genius is in recognizing that a system like that would be worthwhile. Just pointing out that one could theoretically do that with something else is kind of pointless if nobody has ever done it.
I am not saying "Datomic is stupid" or anything so simple; I'm saying I was "disappointed" in this talk because it motivated Datomic against a strawman that mischaracterized the actual problems that people using "traditional databases" have sufficiently that it was no longer possible to determine what was actually being claimed as an advantage.
I realize that to many people it is impossible to dislike a presentation of something without disliking the thing being presented, the person making the presentation, and the entire ideology behind the presentation, but that is a horrible thing to assume and is unlikely to ever be the case to such a simple extreme.
I will even go so far as to say that watching this talk seems to be doing a disservice to many people on the road to doing them a legitimate service: some of the people commenting on this thread (or previous ones on HN about similar talks and articles about Datomic) actually do/did not realize that "traditional databases" can even do this at a transaction level, as the argument in the talk downright claims they can't.
The result is that when I bring up that you actually get even some of these advantages with off-the-shelf copies of PostgreSQL, I get comments of the form "I had no idea one could get a consistent read view across multiple queries within a transaction using most sql databases. That does poke a hole in a major benefit that I thought was unique to datomic, great to know!"; that can only happen when there is some serious misinformation (accidentally) being presented.
Now, does that mean that Datomic is something no one should use, and that it doesn't put things together in a really nice way, and that it doesn't have a single thing in it that is innovative, or that Rich is wasting his time working on it? No: certainly it does not. I did not claim that. I can't even claim that, as I gave up on the talk after the first half so I could spend my time attempting to clarify some of the things said in the first half that were confusing people.
I agree with most of what you say, but I don't think that MVCC is really what this is about. The qualities you describe are a feature of ACID. MVCC is just a way of implementing ACID so that it requires less locking.
More importantly, I think, there are issues with some data structures that are not well supported by postgres or any other DBMS (relational or otherwise). I do a lot of text analytics work and there are things I need to store about spans of text that I could model in a relational fashion but I don't because it would lead to 99% of my data being foreign keys and row metadata.
There will always be domains where you need highly specialized combinations of data structures and algorithms that are not efficient to model relationally and even less in terms of some of the other datamodels that you find in the NoSQL space.
That said, I found that even in natural language processing, RDBMS do a lot of things surprisingly more efficiently than conventional wisdom would have it. Storing lots of small files, for instance, is something that file systems are surprisingly bad at.
Sometimes I'm surprised how many people like to complain about premature optimization using languages that are hundreds of times slower than others but then go ahead and use horribly inflexible crap like the BigTable data model just in case they need to scale like Google.
Of course that's off topic because it's not remotely what Hickey proposes.
If you implement ACID using the "normal" locking semantics (such as the ones the SQL standard authors who defined the isolation levels used in the language were assuming) you can tell the difference because old values are not preserved.
Instead, we would have had contention: in the case of my example walkthrough, to implement the repeatable read semantics that I requested, the first connection would have taken a share lock on the rows it queried, causing the second connection to block on the update until after the first connection committed.
This means that you would not have been able to have the semantics where the first connection and the second connection were seeing different things at the same time (which, to be again clear, is due to none of the data being destroyed: MVCC is providing the semantics of a snapshot).
(As for your text analytics work, I am curious: are you using gin and trigrams at all? There are a bunch of things I dislike about PostgreSQL's implementation of both, but if you haven't looked at them yet you really should: if your use case fits into them they are amazing, and if not the entire point of PostgreSQL is to let you build your own data types and indexes using the ones it comes with as examples.)
I don't use gin or trigrams because I don't do much general purpose text search. I do things like named entity recognition, anaphora resolution, collecting statistics about the usage of terms over time, etc.
But you're right, it might be a good idea to look into the postgres extension mechanism. I've never seriously done that.
To elaborate further on what saurik has already pointed out: Hickey's "update-in-place" characterization of relational databases places blame in the wrong place. This is more about how people think about the data models; they think of their primary keys as identifying places, and updating rows as modifying those places. The relational model itself does not encourage this mode of thinking, although SQL arguably does, unless you think of UPDATE as shorthand for DELETE followed by INSERT. It's not that uncommon to build data models, or parts of models, that have the characteristic of never deleting facts. It's true that time-based as-of queries in such models are unwieldy, but that's a problem of the query languages that can be addressed by how queries are constructed, without redesigning the entire database system. There is research on "temporal databases" that addresses this, although I'm not up to date on it.
What I could not understand from the talk or from googling for more information on Datomic afterwards, is how it supposedly simplifies anything about the consistency issues that he talks about, with respect to the read/decide/write sequence. You read; time passes; other people make changes; when you write, the real world (modeled in your "transactor" as the single common store and arbiter of what is) will have changed.
Re: read/decide/write - Datomic supports transaction functions (written in Java or Clojure) which run within the transaction, on the transactor, and are true functions of the prior state of the database (with full access to the query API). They can be used to do all of the transformative work (traditional read/modify/write), or merely to confirm that preconditions of work done prior to transaction submission still hold true (optimistic CAS). The model is simple - there are no complexities of isolation levels etc. All transactions are serialized, period. Completely ACID.
The critical thing is that queries and reads not part of a transformation never create a transaction, and don't interact with the transactor at all.
Thanks for the reply. In the talk, I missed the point that transaction functions are able to implement "optimistic CAS" (which I assume is the same as what is more traditionally but inaccurately called "optimistic locking"). To clarify, though: Wouldn't precondition testing have to be done /inside/ the transaction, not before, in order to be effective?
That's a nice model, I suppose, but doesn't strike me as particularly novel. As has been pointed out elsewhere, the idea that read-only operations can avoid creating a transaction is not particularly important per se. What's more important is avoiding locking, and this is already available in several DBMSes. Indeed, in the sense that reads are consistent "as of" a certain point in time, Datomic really does have a notion of a read transaction. I saw in another of your replies here that you consider these to be more flexible than transactions because they are in effect immutable snapshots of the database state, and independent of a single process. I guess for some applications that could be important, but for many applications the process independence is irrelevant (perhaps it helps with scaling a middle tier?), and the eventual storage cost of true immutability will become intractable (although perhaps you'll eventually have ways to release state prior to a point in time). Coordinating different parts of code within a single process over a specific span of time doesn't strike me as any easier with Datomic's model than with traditional transactions.
This talk just didn't focus on the kinds of data modeling problems that are important to me and which I consider difficult, which mostly have to do with maintainability of the data model - except, possibly, for the ease of dealing with temporal queries. And I share others' unease with what seemed to be inaccurate characterizations of current DBMS implementations.
Please see the response I left from ten minutes ago to DanWaterworth's similar comment (the one where I point out that, as a client, you can tell the difference between those two models by testing for some of the specific problems that are mentioned near the beginning of this talk; yadda yadda).
I had no idea one could get a consistent read view across multiple queries within a transaction using most sql databases. That does poke a hole in a major benefit that I thought was unique to datomic, great to know!
However, I do think trying to set up a SQL database to be able to query against any previous view of the world based on a transaction id, as Datomic allows, wouldn't be as "simple" as you make it out to be.
Datomic allows one to get a consistent basis for multiple queries, separated by arbitrary amounts of time, outside of any transactions. Having to group queries motivated and conducted by different parts of your system into a single transaction in order to get a consistent basis is a source of coupling.
Given that I need to share that common basis among the different parts of my system that need the consistent view, I am already taking the hit on that coupling: I am going to need to make certain that all of those parts have access to the shared basis, whether it be a timestamp, a transaction identifier, or an active connection.
If the concern is coupling across space, you can store the transaction on the server in most existing databases. Past that, I would be highly interested in knowing the motivating use case that is causing the need for this much global and distributed snapshot isolation, but otherwise agree: I'd love to see more databases actually expose the ability to more easily "query the past" as well as request guarantees on vacuum avoidance.
I would characterize the level of coupling involved in sharing "the basis is 12345" as categorically different from having to nest database access within the same transaction. Consider, e.g. the ease of moving the former to different processes.
Agreed, but this is due to MVCC databases not saving the snapshots of committed transactions. If you have an active transaction in PostgreSQL you can start a new transaction using the same snapshot from another process. (This feature is exposed in SQL in version 9.2, and was motivated by the plan to implement parallel pg_dump.)
If PostgreSQL saved all historical snapshots and you disabled vacuum, you could communicate any point in time simply with a snapshot number. This probably won't be very efficient, though, since PostgreSQL is not optimized for this kind of use.
This feature is very interesting! However, it is definitely not designed for this purpose, and is thereby fairly heavyweight: it is writing a file to disk with information about the snapshot instead of just returning it to the client, and the result is the filename.
Not knowing this existed, I spent the last hour implementing this feature in a way that just requires getting a single integer out (the txid) and then restoring it as the snapshot being viewed (but not changing the txid of the running transaction, which solves a lot of the "omg what would that mean" problems).
With this implementation, you can just use txid_current() to save a transaction snapshot and you can restore it using my new variable (which correctly installs this functionality only into the currently executing transaction's snapshot). (In a more ideal world, I'd re-parse the string from txid_current_snapshot().)
I also didn't solve the vacuum problems, but I think there might be some reasonable ways to do that. Regardless, I imagine that most of the interesting usages of this are not "restore a snapshot from three days ago" but more "share a snapshot between multiple processes and machines for a few seconds".
However, if you can get even one of those processes to hold open a transaction, you can copy its snapshot to other transactions in other sessions, and then even this naive implementation should be guaranteed to have access to the old data.
If you are using it only for these short periods of time, for purposes of "distributed and decoupled consistency", we also don't need to worry about the overhead of never running a vacuum: the vacuum process runs as normal, as we are only holding on to data very temporarily.
That said, in practice, you really can go for quite a while without running a vacuum on your database without issues, and I imagine any alternative system is going to run into similar problems anyway (log structured merge trees, for example, get screwed on this after a while, as the bloom filters become less selective).
You can't, however, with PostgreSQL, make this work "forever" (if you wanted to store data going back until the beginning of time) due to 32-bit txid wraparound problems :(. (This also should affect my silly implementation, as I should save the epoch in addition to the txid.)
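To make the epoch point concrete, here is a rough sketch of the arithmetic in Python (not PostgreSQL code). It assumes the layout the PostgreSQL documentation describes for the txid_* functions: a 64-bit value with the epoch counter in the high 32 bits and the internal 32-bit xid in the low 32 bits. The helper names are made up for illustration.

```python
XID_BITS = 32
XID_MASK = (1 << XID_BITS) - 1


def split_txid(txid64):
    """Split an epoch-extended 64-bit txid into (epoch, 32-bit xid)."""
    return txid64 >> XID_BITS, txid64 & XID_MASK


def join_txid(epoch, xid32):
    """Rebuild the 64-bit value from an epoch and a 32-bit xid."""
    return (epoch << XID_BITS) | (xid32 & XID_MASK)


# Two transactions exactly one wraparound apart share the same 32-bit xid,
# so a saved basis that keeps only the xid cannot tell them apart.
early = join_txid(0, 711)
late = join_txid(1, 711)
same_xid = split_txid(early)[1] == split_txid(late)[1]
distinct = early != late
```

Saving the pair (epoch, xid), or equivalently the full 64-bit value, is what keeps a stored basis unambiguous across a wraparound.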
To get a consistent view across multiple queries you just use the SERIALIZABLE isolation level. In PostgreSQL REPEATABLE READ also works, but the standard does not guarantee this (the standard allows for phantom reads, since it assumes you use locking rather than MVCC snapshots to implement REPEATABLE READ).
The ability to query any historical view of the data is indeed not there in PostgreSQL in any simple or reliable way. That is an advantage of Datomic, but I do not see why it would be impossible to implement in a "traditional database".
The reason I claim this would be simple is that PostgreSQL is almost already doing this. The way the data is stored on disk, every row has two transaction identifiers, xmin and xmax, which represent the transaction in which that row was inserted and the transaction in which that row was deleted; rows, meanwhile, are never updated in place, so the old data stays around until it is deleted by a vacuum.
To demonstrate more tangibly how this works, I just connected to my database server (running PostgreSQL 9.1), created a table and added a row. I did so inside of a transaction, and printed the transaction identifier. I then queried the data in the table from a new transaction, showing that the xmin is set to the identifier of the transaction that added the row.
demo=> create table q (data int);
demo=> begin; select txid_current();
demo=> insert into q (data) values (0); commit;
INSERT 0 1
demo=> begin; select xmin, xmax, data from q;
Now, while this new transaction is still open, from a second connection, I'm going to create a new transaction in which I am going to update this row to set the value it is storing to 1 from 0, and then commit. In the first connection, as we are still in a "snapshot" (I put this term in quotes, as MVCC is obviously not copying the entire database when a transaction begins) from a transaction started before that update, we will not see the update happen, but the hidden xmax column (which stores the transaction in which the row is deleted) will be updated.
demo=> begin; select txid_current();
demo=> update q set data = 1; commit;
demo=> select xmin, xmax, data from q;
demo=> select xmin, xmax, data from q;
As you can see, the data that the other transaction was referencing has not been destroyed: the old row (the one with the value 0) is still there, but the xmax column has been updated to indicate that this row no longer exists for transactions that began after 189029 committed. However, at the same time, the new row (with the value 1) also exists, with an xmin of 189029: transactions that begin after 189029 committed will see that row instead. No data was destroyed, and this data is persisted this way to disk (it isn't just stored in memory).
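For anyone who wants to poke at the idea without a database handy, the same behavior can be mimicked with a toy model in Python. This is only a sketch of the visibility rule, not how PostgreSQL implements it: a snapshot is modeled as the set of transaction ids it considers committed, aborted transactions and wraparound are ignored, and every name here is invented for the example. The txids are just borrowed from the demo.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class RowVersion:
    data: int
    xmin: int                   # transaction that inserted this version
    xmax: Optional[int] = None  # transaction that deleted/replaced it, if any


def visible(row: RowVersion, snapshot: Set[int]) -> bool:
    """A version is visible if its inserter is committed in the snapshot
    and its deleter (if there is one) is not."""
    if row.xmin not in snapshot:
        return False
    return row.xmax is None or row.xmax not in snapshot


def update(table: List[RowVersion], txid: int, new_data: int) -> None:
    """Never overwrite: stamp xmax on each live version, append a new one."""
    for row in [r for r in table if r.xmax is None]:
        row.xmax = txid
        table.append(RowVersion(new_data, xmin=txid))


def select(table: List[RowVersion], snapshot: Set[int]) -> List[int]:
    return [r.data for r in table if visible(r, snapshot)]


table = [RowVersion(data=0, xmin=189028)]
old_snapshot = {189028}           # taken before the update committed
update(table, txid=189029, new_data=1)
new_snapshot = {189028, 189029}   # taken after the update committed

print(select(table, old_snapshot))  # the old transaction still sees 0
print(select(table, new_snapshot))  # a newer transaction sees 1
print(len(table))                   # both versions are still stored
```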
My contention then is that it should be a fairly simple matter to take a transaction and backdate when it began. As far as I know, there is no reason that this would cause any serious problems as long as a) it was done before the transaction updated or inserted any data, b) there have been no vacuums during the backdated period, c) HOT (heap-only tuple) updates are disabled (in essence, this is an optimization designed to do online vacuuming), and maybe d) the new transaction is read only (although I am fairly confident this would not be a requirement).
For a more complete implementation, one would then want to be able to build transactions (probably read-only ones; I imagine this would cause serious problems if used from a writable transaction, and that really isn't required) that "saw all data as if all data in the database was alive", which I also believe would be a pretty simple hack: you just take the code that filters dead rows from being visible based on these comparisons and add a transaction feature that lets you turn them off. You could then use the already-implemented xmin and xmax columns to do your historical lookups.
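Both ideas (backdating a read-only transaction, and a view where the dead-row filter is switched off) are easy to state over a toy heap. The following Python is a sketch under loud assumptions: nothing has been vacuumed, transactions commit in txid order so "committed" can be approximated as "txid <= basis", and wraparound is ignored. `as_of` and `history` are hypothetical names, not PostgreSQL functions.

```python
from typing import List, NamedTuple, Optional


class Version(NamedTuple):
    data: int
    xmin: int             # inserting transaction
    xmax: Optional[int]   # deleting transaction, or None if still live


# Heap contents after the update in the demo: both versions remain stored.
HEAP = [Version(0, 189028, 189029), Version(1, 189029, None)]


def as_of(heap: List[Version], basis: int) -> List[int]:
    """Data visible to a read-only transaction backdated to `basis`."""
    return [v.data for v in heap
            if v.xmin <= basis and (v.xmax is None or v.xmax > basis)]


def history(heap: List[Version]) -> List[Version]:
    """The 'all rows are alive' view: no filtering, xmin/xmax exposed
    so the caller can do its own historical lookups."""
    return sorted(heap, key=lambda v: v.xmin)
```

With that, `as_of(HEAP, 189028)` yields the old value and `as_of(HEAP, 189029)` the new one, while `history(HEAP)` returns every version together with the transaction range in which it was live.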
P.S. BTW, if you want to try that demo at home, to get that behavior you need to use the "repeatable read" isolation level, which uses the start of the transaction as the boundary as opposed to the start of the query. This is not the default; you might then wonder if it is because it is expensive and requires a lot more coordination, and as far as I know the answer is "no". In both cases, all of the data is stored and is tagged with the transaction identifiers: the difference is only in what is considered the reference time to use for "which of the rows is alive".
However, it does mean that a transaction that attempts to update a value that has been changed by another transaction will fail, even if the updating transaction had not previously read the state of the value. Most reasonable usages of a database actually work fine with the relaxed semantics that "data truly committed before the query executes" provides (that still wouldn't allow data you update to be concurrently and conflictingly updated by someone else: their update would block), and those semantics are not subject to "this transaction is impossible" errors, which seems a sensible reason for them to be the default.
Both Connections (setup):
demo=> set session characteristics as transaction isolation level repeatable read;
Thanks for taking the time to elaborate, very interesting. I wonder if the sql db vendors or open source projects will take the next step to make querying against a transaction ID possible given the underlying implementation details bring it pretty close.
I also see Rich has made some interesting points elsewhere in this thread about consistent views being available outside of transactions and without need for coordination (within Datomic); it seems more appropriate to comment directly there, though.
Overall I think it's important to understand these nuances, and not view Datomic as some revolutionary leap, even if I am excited about the project. I appreciate your insight into the power already within SQL database engines.
I am not certain whether your response comes from a reading of my comment before or after the paragraph I added that started with "for a more complete implementation", but if it was from before I encourage you to read that section: the ability to do the query is pretty much already there due to the xmin/xmax fields that PostgreSQL is already reifying.
(edit: Apparently, nearly an hour ago, jeltz pointed out that PostgreSQL 9.2 actually has implemented nearly this identical functionality through the usage of exported snapshots, so I recommend people go read that comment and the linked documentation. However, my comment is still an example of the functionality working.)
(edit: Ah, but the feature as implemented actually saves a file to disk and thereby has a lot of server-side state: the way I've gone ahead and implemented it does not have this complexity; I simply take a single integer and store nothing on the server.)
> I wonder if the sql db vendors or open source projects will take the next step to make querying against a transaction ID possible given the underlying implementation details bring it pretty close.
For the hell of it, I just went ahead and implemented the "backdate a transaction" feature; I didn't solve the vacuum guarantees problem, however: I only made it so that a transaction can be backdated to another point in time.
To demonstrate, I will start with a very similar sequence of events to before. However, I am going to instead use txid_current_snapshot(), which returns the range (and an exception set that will be unused for this example) of transaction identifiers that are valid.
demo=# create table q (data int);
demo=# begin; select txid_current_snapshot();
demo=# insert into q (data) values (0); commit;
INSERT 0 1
demo=# begin; select txid_current_snapshot();
demo=# select xmin, xmax, data from q;
demo=# begin; select txid_current_snapshot();
demo=# update q set data = 1; commit;
demo=# select xmin, xmax, data from q;
demo=# select xmin, xmax, data from q;
demo=# begin; select txid_current_snapshot();
demo=# select xmin, xmax, data from q;
So far, this is the same scenario as before: I have two connections that are seeing different visibility to the same data, based on these snapshots. Now, however, I'd like to "go back in time": I want our first connection to be able to use the same basis for its consistency that we were using in the previous transaction.
demo=# set snapshot_txid = 711;
demo=# select txid_current_snapshot();
demo=# select xmin, xmax, data from q;
This new variable, snapshot_txid, is something I created: it gets the current transaction's active snapshot and modifies it to be a range from that transaction id to that same id (I think a better version of this would take the exact same string value that is returned by txid_current_snapshot()).
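For reference, the string that txid_current_snapshot() returns is documented as xmin:xmax:xip_list, so the "better version" would mostly be a matter of parsing it back. A rough Python sketch of that parsing and of the documented visibility rule follows; it is not the patch described above, the names are invented, and it ignores whether a transaction actually committed or rolled back.

```python
from typing import FrozenSet, NamedTuple


class Snapshot(NamedTuple):
    xmin: int             # earliest txid still active at snapshot time
    xmax: int             # first txid not yet assigned at snapshot time
    xip: FrozenSet[int]   # txids in progress at snapshot time


def parse_snapshot(text: str) -> Snapshot:
    """Parse the 'xmin:xmax:xip_list' text form, e.g. '10:20:10,14,15'."""
    xmin, xmax, xip = text.split(":")
    active = frozenset(int(x) for x in xip.split(",") if x)
    return Snapshot(int(xmin), int(xmax), active)


def txid_visible(txid: int, snap: Snapshot) -> bool:
    """Finished before the snapshot: yes. Not started yet, or still in
    progress when the snapshot was taken: no."""
    if txid < snap.xmin:
        return True
    if txid >= snap.xmax:
        return False
    return txid not in snap.xip


snap = parse_snapshot("10:20:10,14,15")
```

Passing the whole string around instead of a single integer preserves the in-progress list, which is exactly the "exception set" that a single txid cannot express.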
From that previous basis, the row with the value 0 is visible, not the row with the value 1. I can, of course, go back to the future snapshot if I wish, in order to view the new row. (I am not yet certain what this will do to things writing to the database; this might actually be sufficient, however I feel like I might need to either mess with more things or mark the transaction read-only.)
demo=# set snapshot_txid = 712;
demo=# select txid_current_snapshot();
demo=# select xmin, xmax, data from q;
I noticed that the xmin in Connection 1 changed from 189018 to 189028. Is that just a typo?
Is the concept of transaction in this regard a 'state of the entire db'? If a transaction included multiple modifications would they all get the same xmax? If so, I see this as a difference between the presentation and your example. The transaction is a modification to the entirety of the db and is a state of the db. In Hickey's presentation, he very clearly says that the expectation is that the transaction component of the datom is specific to an individual datom.
Since it's been a while since I've worked on DBs, and even then I didn't know much, your demo has helped put it in perspective.
Thank you so much for noticing that error. What happened is that I started doing it, and then made a mistake; I then redid the entire flow, but forgot I needed to re-copy/paste the first half as the transaction identifier would have changed. I have updated my comment to fix this.
Yes: the transaction is the state of the entire database. However, you can make your transactions as fine-grained as you wish: the reason to not do so, however, is that you are likely to end up with scenarios where you want to atomically roll-back many changes that other people using the database should not see in the case of a failure. You certainly then will at least want the ability to make multiple changes at once.
The tradeoff in doing so, however, is that you will need to make a new transaction, which will have a different basis. I agree: if you then have a need to be able to make tiny changes to individual items one at a time that need to be from a shared consistency basis and yet have no need to be atomic with respect to each other (which is the part I am going to be quite surprised by), then yes: you need the history query function to implement this.
I would be fascinated by a better understanding of that use case. Does he go into an explicit example of why that would be required later in the talk? (If so, I could try to translate that use case as best I can into a "traditional database" to see whether you really need that feature; if you do, it might be valuable to try to get something related to this design into PostgreSQL: I am starting to get a better understanding of the corner cases as I think more about it, and think I can come up with a proposal that wouldn't make this sound insane to the developers.)
How would you idiomatically fix invalid data in Datomic? For example, if you needed to update a badly entered value in a record, but keep the record's timestamp the same so as not to screw up historical queries?
Figures. So I guess that, in the case where you both need to worry about fixing invalid data, and also need to do historical queries, you would have to add your own timestamp to the data to represent the actual time of the event, because the built-in timestamp is just giving you the time of the state of the database. Hopefully that wouldn't get too hairy.
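A sketch of what that might look like, in plain Python rather than anything Datomic-specific (the `Fact` shape, field names, and values are all made up for illustration): each fact carries an application-supplied event time next to the time the database recorded it, and a correction is simply a later fact about the same event time.

```python
from typing import List, NamedTuple, Optional


class Fact(NamedTuple):
    entity: str
    value: int
    event_time: int  # when it happened in the world (supplied by the app)
    tx_time: int     # when the database recorded it (assigned on write)


def value_as_known_at(facts: List[Fact], entity: str,
                      event_time: int, tx_time: int) -> Optional[int]:
    """Latest recorded value for (entity, event_time), looking only at
    facts the database had recorded by tx_time."""
    known = [f for f in facts
             if f.entity == entity
             and f.event_time == event_time
             and f.tx_time <= tx_time]
    return max(known, key=lambda f: f.tx_time).value if known else None


log = [
    Fact("order-1", 100, event_time=5, tx_time=10),  # badly entered value
    Fact("order-1", 150, event_time=5, tx_time=20),  # later correction
]
```

Historical queries by event time keep working after the fix, and you can still ask what the database believed before the correction was made.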
Of course not. You can certainly learn all you need to know on your own. However, that doesn't make the process of learning any easier. If you can get into a PhD program, it is a wonderful way to get access to information of various sorts.
If not, then the new free CS courses that are now being offered by Stanford and others provide extra help beyond reading books and papers. I'd highly recommend trying some!
I would say no, but a good chunk of knowledge is (though probably less is required than imagined) and if the academic methods of acquiring knowledge suit you (for many hackers they don't) then a Ph.D is a fine way to get a good chunk of knowledge. Here's Rich Hickey's recommended reading for Clojure specifically: http://www.amazon.com/Clojure-Bookshelf/lm/R3LG3ZBZS4GCTH/re... It's not exhaustive for even functional programming and design let alone the entirety of computer science, but it's certainly a good chunk of knowledge enough to do awesome work from.