
Great talk. Love the phrase "Information Technology not Technology Technology".

But I do think he has been a bit unfair to databases (and primary keys) generally, in characterizing them as "place oriented". The relational model is actually a brilliantly successful example of a value-oriented information technology.

The very foundation of the relational model is the information principle, in which the only way information is encoded is as tuples of attribute values.

As a consequence, the relational model provides a technology that is imbued with all of the virtues of values he discusses:

* language independence
* values can be shared
* don't need methods
* can send values without code
* are semantically transparent
* composition, etc.

It's true that we can think of the database itself as a place, but that's a consequence of having a shared data bank in which we try to settle a representation of what we believe to be true. Isolation gives the perception of a particular value. In some ways, this is just like a CDN "origin".

Also, regarding the use of a primary key as a "place": because capturing the information model is the primary task in designing a relational database schema, the designer wants to be fairly ruthless about discarding information that's not pertinent. For example, in recording student attendance, we don't record the name of the attending student, just their ID. This is not bad. We have simply decided that, in the case of a name change, it's not important to know the name of the student as at the time of their attendance. If we decide otherwise, then we change the schema.
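A minimal sketch of that attendance example, using Python's built-in sqlite3 module. The `student` and `attendance` tables, their columns, and the sample names are all made up for illustration; the point is only that attendance stores the ID, so a later join reports whatever name is current.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Attendance deliberately records only the student's ID, not the name.
    CREATE TABLE attendance (
        student_id INTEGER NOT NULL REFERENCES student(id),
        class_date TEXT NOT NULL,
        PRIMARY KEY (student_id, class_date)
    );
    INSERT INTO student VALUES (1, 'Ann Smith');
    INSERT INTO attendance VALUES (1, '2012-09-03');
""")

def names_on(class_date):
    """Names of students who attended on class_date, as currently recorded."""
    rows = conn.execute(
        "SELECT s.name FROM attendance a "
        "JOIN student s ON s.id = a.student_id "
        "WHERE a.class_date = ?",
        (class_date,),
    )
    return [row[0] for row in rows]

before = names_on("2012-09-03")
conn.execute("UPDATE student SET name = 'Ann Jones' WHERE id = 1")
after = names_on("2012-09-03")  # same attendance row, different name
```

If the name at the time of attendance mattered, the schema would have to change, for example by copying the name into `attendance` or by keeping a history of names.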




It wasn't a knock against relational databases. The issue is update in place. If you have a relational database that is append-only, there is no problem. He actually wrote one (Datomic).

The criticism of a primary key is again not anything against having primary keys, but that in a database that allows updates in place a primary key is meaningless. It is meaningless because it doesn't specify a value -- you pass a primary key and it could be anything by the time the receiver gets around to using it. If instead the value was immutable passing a primary key would be fine.
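A toy illustration of that point in Python, not tied to any real database. `MutableStore` and `AppendOnlyStore` are hypothetical classes invented for this sketch: in the first, a key names a slot whose contents can change; in the second, a (key, version) pair names one value that never changes.

```python
class MutableStore:
    """Update in place: a key identifies a slot, not a value."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value  # overwrites whatever was there

    def get(self, key):
        return self._rows[key]


class AppendOnlyStore:
    """Append only: every put adds a new version and never overwrites."""

    def __init__(self):
        self._versions = {}  # key -> list of values, oldest first

    def put(self, key, value):
        history = self._versions.setdefault(key, [])
        history.append(value)
        return (key, len(history) - 1)  # reference to this exact version

    def get(self, ref):
        key, version = ref
        return self._versions[key][version]


mutable = MutableStore()
mutable.put(42, "price=10")
handed_over = 42                  # sender passes the primary key
mutable.put(42, "price=99")       # someone updates the row in the meantime
seen_mutable = mutable.get(handed_over)

log = AppendOnlyStore()
ref = log.put(42, "price=10")     # sender passes (key, version) instead
log.put(42, "price=99")
seen_immutable = log.get(ref)
```

In the mutable case the receiver sees whatever happens to be in the slot when it looks; in the append-only case the reference keeps meaning the value the sender had in mind.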

I've done work with ERP systems and having the ability to query against arbitrary points in time would be amazing. What was the value of inventory on this date? There are other ways to go about this (event sourcing) but it moves all the complexity to application code. The goal would be for the database itself to do the work for us.
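A rough sketch of what an "as of" inventory query could look like over an append-only table, using Python's sqlite3. The `stock_fact` table, its columns, and the sample rows are assumptions for illustration; this is not how Datomic or any particular ERP system does it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Each row is a fact: from valid_from onward, this item has this quantity.
    -- Rows are only ever inserted, never updated.
    CREATE TABLE stock_fact (
        item       TEXT    NOT NULL,
        qty        INTEGER NOT NULL,
        unit_cost  REAL    NOT NULL,
        valid_from TEXT    NOT NULL  -- ISO dates, so text comparison sorts correctly
    );
    INSERT INTO stock_fact VALUES
        ('widget', 10, 2.0, '2012-01-01'),
        ('widget',  4, 2.0, '2012-02-01'),
        ('gadget',  5, 3.0, '2012-01-15');
""")

def inventory_value(as_of):
    """Total value of inventory using the latest fact per item on or before as_of."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(f.qty * f.unit_cost), 0)
        FROM stock_fact f
        WHERE f.valid_from = (
            SELECT MAX(g.valid_from)
            FROM stock_fact g
            WHERE g.item = f.item AND g.valid_from <= ?
        )
        """,
        (as_of,),
    ).fetchone()
    return row[0]
```

Calling `inventory_value("2012-01-20")` counts 10 widgets and 5 gadgets, while `inventory_value("2012-02-15")` counts 4 widgets and 5 gadgets, without any of the earlier rows having been touched.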


> you pass a primary key and it could be anything by the time the receiver gets around to using it. If instead the value was immutable passing a primary key would be fine.

Not sure what you mean by receiver here -- receiver as in the database or another component in your software hierarchy? The best way to ensure that your data goes unchanged between separate events is to insist that the database (Oracle, in my case, which is what I use in enterprise) lock the row. The easiest way is to use SELECT ... FOR UPDATE. The cursor you receive will have all the SELECTed rows locked until you close the cursor -- by commit or rollback. This will ensure that nobody can change your data whilst you're messing around with it, even if your goal is never to actually modify it, but merely capture a snapshot of the data. Obviously, if you have a lot of different processes hitting the same data they will block and wait for the lock to free (though this behaviour can be changed), so depending on what you're doing this may not be the most efficient way, though it certainly is the most transactionally safe way.

Another way is to use Oracle's ORA_ROWSCN which is, greatly simplified, incremented for a row when that row is changed. So: read your data incl. its ORA_ROWSCN, and when you update, only update if the ORA_ROWSCN is the same. A similar approach could be done with Oracle's auditing or a simple timestamp mechanism, but you obviously lose some of the atomicity from doing it that way.
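The "only update if nothing changed since I read it" idea can be sketched outside Oracle too. The example below uses Python's sqlite3 with a hand-maintained `version` column as a stand-in; it is not ORA_ROWSCN, and the `account` table and helper names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (
        id      INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL,
        version INTEGER NOT NULL DEFAULT 0  -- bumped on every update
    );
    INSERT INTO account (id, balance) VALUES (1, 100);
""")

def read_account(account_id):
    """Return (balance, version) as currently stored."""
    return conn.execute(
        "SELECT balance, version FROM account WHERE id = ?", (account_id,)
    ).fetchone()

def update_if_unchanged(account_id, new_balance, seen_version):
    """Apply the update only if the row still has the version we read."""
    cur = conn.execute(
        "UPDATE account SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, seen_version),
    )
    conn.commit()
    return cur.rowcount == 1  # zero rows touched means someone got there first

balance, version = read_account(1)
first = update_if_unchanged(1, balance - 30, version)   # succeeds
second = update_if_unchanged(1, balance - 50, version)  # stale version, rejected
```

Unlike SELECT ... FOR UPDATE, nothing is locked while the caller is thinking; the cost is that a rejected update has to be retried from a fresh read.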

> I've done work with ERP systems and having the ability to query against arbitrary points in time would be amazing.

You can do that in Oracle. You can SELECT temporally; so you could tell the DB to give you the state of the select query 5 minutes ago, subject to 1) flashback being enabled; and 2) the logs still having the temporal data.

Another way is to use an audit log table to store changes to data. We use this all the time in accounting/finance apps as people fiddling with invoices and bank account numbers must be logged; you can either CREATE TRIGGER your way to a decent solution, or use Oracle's built-in auditing version which is actually REALLY advanced and feature rich!
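For the CREATE TRIGGER route, here is a small sketch using SQLite through Python's sqlite3 rather than Oracle, so the trigger syntax differs from what you would write in PL/SQL. The `invoice` and `invoice_audit` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL);

    -- One row per change, written by the trigger below.
    CREATE TABLE invoice_audit (
        invoice_id INTEGER NOT NULL,
        old_amount INTEGER,
        new_amount INTEGER,
        changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );

    CREATE TRIGGER invoice_amount_audit
    AFTER UPDATE OF amount ON invoice
    BEGIN
        INSERT INTO invoice_audit (invoice_id, old_amount, new_amount)
        VALUES (OLD.id, OLD.amount, NEW.amount);
    END;

    INSERT INTO invoice VALUES (7, 100);
""")

# Two in-place updates; the application code never touches the audit table.
conn.execute("UPDATE invoice SET amount = 120 WHERE id = 7")
conn.execute("UPDATE invoice SET amount = 90 WHERE id = 7")

history = conn.execute(
    "SELECT old_amount, new_amount FROM invoice_audit ORDER BY rowid"
).fetchall()
```

The `invoice` row still only holds the latest amount, but `history` preserves each before/after pair, which is the part the in-place update would otherwise throw away.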

N.B.: I do not use other databases so my knowledge of them is limited, but it should give you some ideas at least!


I think the parent meant the value vs. reference separation.

Suppose you create a record, give it primary key N, and start referring to that value by the primary key. If at some point you update the record, the same primary key now refers to a different value. The primary key is just a reference to a placeholder (the record), and the value in the placeholder can change to anything. So you have to be careful about what you mean by the primary key, because it's just a reference, not a value. In the value paradigm your primary key would be a hash, like in git, that would forever identify that one value instead of referring to some value.
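A small sketch of the hash-as-key idea in Python. This only borrows the general principle from git (the key is derived from the content); it does not reproduce git's object format, and `put`/`get` are hypothetical helpers for this example.

```python
import hashlib
import json

store = {}  # content hash -> canonical JSON bytes

def put(record):
    """Store a record under the SHA-256 of its canonical JSON encoding."""
    encoded = json.dumps(record, sort_keys=True).encode("utf-8")
    key = hashlib.sha256(encoded).hexdigest()
    store[key] = encoded  # same content always lands under the same key
    return key

def get(key):
    """Return the record that this key identifies."""
    return json.loads(store[key])

original = {"id": 1, "name": "Ann Smith"}
k1 = put(original)

# An "update" is a new value with a new key; k1 still means the old value.
k2 = put({"id": 1, "name": "Ann Jones"})

# Field order doesn't matter because the encoding sorts keys.
k3 = put({"name": "Ann Smith", "id": 1})
```

Here `k1` can be handed to anyone at any time and it will always resolve to the same record, which is exactly what a primary key into a mutable row cannot promise.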


Exactly. A primary key is a subset of attributes, and is thus necessarily populated by values.

It's worth considering the correspondence between the concept of functional dependency in the relational model, and the concept of a pure function in functional programming. The issue under discussion, then, is whether referential transparency is afforded in the database.

While referential transparency in a database is achieved momentarily at the right isolation level, it is not achieved in the eternal sense of a pure function.

This is because the functional dependency in FP encodes an intensional definition, whereas the functional dependency captured in a relation is extensional, usually modelling the state of the knowledge of the relevant world, and therefore being subject to change.
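One way to picture the distinction, as a toy Python sketch with made-up names: `area` is defined by a rule, so the same arguments always give the same result, while `name_of` is defined by whatever pairs are currently stored, so its answer can change when the stored facts change.

```python
def area(width, height):
    """Intensional: the mapping is given by a rule and never changes."""
    return width * height

# Extensional: the mapping id -> name is just the set of pairs we hold now.
student_name = {1: "Ann Smith"}

def name_of(student_id):
    """Looks up the current pair; the dependency id -> name holds, but its content can change."""
    return student_name[student_id]

name_before = name_of(1)
student_name[1] = "Ann Jones"  # our knowledge of the world is revised
name_after = name_of(1)
```

At any single moment `name_of` behaves like a function of the ID, which matches the "momentarily, at the right isolation level" point above; it just isn't the same function forever.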



