
Time-travel queries in CockroachDB - ivank
https://www.cockroachlabs.com/blog/time-travel-queries-select-witty_subtitle-the_future/
======
josephg
Very cool. For a long time I've wanted databases to support this so I could
run query 1:

`SELECT ( _, $querytime) FROM table1 WHERE ...;`

Then based on some code in my application run a second query:

`SELECT _ FROM table2 WHERE ... AT $querytime;`

So much frontend code makes assumptions that databases aren't modified between
consecutive queries (like graphql-js). With an API like this it would be
trivial to make those queries correct. (And we wouldn't need to abort, and we
can cross database instance boundaries in interesting ways). The timestamp is
also super useful for doing isomorphic rendering of live-bound data - the
server can send the timestamp of the data queries it used to render the page.
When the client JS loads it can reconnect to the server and pass the rendered
timestamp back to check for deltas between the rendered version and the
current database view. This in turn lets you to safely cache server renders,
as well as a bunch of other fun things.

Hats off to the CockroachDB team! Fingers crossed some of these features start
making their way into other databases I love. (Looking at you, Rethinkdb!)

~~~
gaius
_Very cool. For a long time I 've wanted databases to support this so I could
run query_

This has been in Oracle for a few years now
[https://docs.oracle.com/cd/B28359_01/appdev.111/b28424/adfns...](https://docs.oracle.com/cd/B28359_01/appdev.111/b28424/adfns_flashback.htm#i1008579)

~~~
toyg
Yup, but it's one of those things where you need the most expensive license,
IIRC.

~~~
gaius
I'm a little bemused by _I’m not aware of any database that has implemented
it_

You would need to be hiding under a rock, like a cockroach I suppose, not to
know that Oracle and SQL Server 2016 have this :-)

~~~
toyg
I think we've reached a point where a lot of OSS folks never actually touched
a commercial database in their life. OSS alternatives are so good that less
and less people feel the need to go out and pay for something better.

In a way it's similar to some folks going crazy because you can do this or
that in the browser when desktop OSes have been able to do it for ages...
generational shifts.

~~~
gaius
Certainly modern open source DBs are very capable... It is just a little
frustrating to see the wheel being reinvented rather than genuine progress
being made. And the big commercial DBs aren't standing still either.

~~~
mjibson
I think CockroachDB is making genuine progress in the distributed, consistent
space. We now have to add many of the existing features on top of that.

------
grogers
Impressive that it works correctly across schema changes.

I assume the indexes are laid out like (col, timestamp)? Can you handle an
access pattern where keys are continuously inserted with the same value, and
then deleted shortly after? Will queries for the value work efficiently, or
have to scan all rows inserted with that value for the last 24 hours? Is there
a way to tweak GC to run more aggressively on a particular table (knowing that
you wouldn't be able to query back in time)?

We have an access pattern like that in our database. Rows are inserted into
the table with x=null, then later x is set to some unique value, but we need
to query where x is null in order to set the actual value. At one point we had
a long running transaction that prevented the database from cleaning up old
records, so the query for X is null grew slower and slower to the point where
it almost browned us out.

~~~
state_machine
[engineer at cockroach]

Basically, yes, all keys the KV, including the index entries, include a
timestamp suffix.

Repeated edits to the same value will accumulate MVCC revisions until they are
GCed, and those will be iterated over during access, so, as you point out, a
use-case like yours would likely benefit from shorter a GC cutoff and GC _is_
configurable in cockroach.

------
assface
This is an old idea. It was in the original version of Postgres from Berkeley
in the 1980s. They ended up taking it out in 1998 because systems kept running
out of disk space:

[https://www.postgresql.org/docs/6.3/static/c0503.htm](https://www.postgresql.org/docs/6.3/static/c0503.htm)

It's actually very easy to implement if you use the "append" version of MVCC
(which Postgres does). All you need to do is just disable garbage collection
(e.g., the vacuum).

~~~
nxzero
>> "They ended up taking it out in 1998 because systems kept running out of
disk space."

Reminds me of Microsoft SQL Server removing support for natural language data
queries, only to add it back years later.

It would be interesting to see a list of features from major database
management systems that have been removed over the years.

~~~
assface
> It would be interesting to see a list of features from major database
> management systems that have been removed over the years.

The only other thing I can think of right now are in-memory optimized indexes
(e.g., T-Trees) from the 1980s. These were later removed and replaced with
B+trees (or skip lists if you're MemSQL) because CPU caches (SRAM) got much
faster than memory (DRAM).

------
mjibson
Blog post OP here.

I originally wrote that I wasn't aware of any other DB that had this. I should
have specified any other SQL DB, since I was aware of Datomic.

While working on this feature I searched on Google for the syntax specified
("AS OF SYSTEM TIME"), but didn't find it referenced except about a third-
party Postgres extension. It is unfortunate that Oracle and MSSQL didn't come
up in my searching, since they support this feature.

I've edited the blog post to be more accurate in listing some of the previous
work.

~~~
baq
i'm still surprised that not a single link to
[https://en.wikipedia.org/wiki/Temporal_database](https://en.wikipedia.org/wiki/Temporal_database)
is present anywhere either here or in your post. on that page there's a link
to a very in-depth treatment of the problem:
[http://www.cs.arizona.edu/people/rts/tdbbook.pdf](http://www.cs.arizona.edu/people/rts/tdbbook.pdf).

~~~
mjibson
That's the first link under the link to the SQL:2011 section about this, so
it's not too difficult to get to there.

------
vyrotek
It looks like SQL Server 2016 has it as well. They call them Temporal Tables.

[https://msdn.microsoft.com/en-
us/library/dn935015.aspx](https://msdn.microsoft.com/en-
us/library/dn935015.aspx)

~~~
dspillett
A feature that made it to standard edition too, unlike CDC and related which
were Enterprise only, which is nice.

Similar support can be found in quite a few places. It is in the SQL:2011
standard:
[https://en.wikipedia.org/wiki/SQL:2011](https://en.wikipedia.org/wiki/SQL:2011)

------
talles
Amazing.

I always hated with passion keeping timestamps and messing the database design
so I can have some sort of time awareness, I always thought that this is
supposed to be handled by the database itself, I'm just a user.

I know there's Datomic[1] but its _stack_ is way different than what people
are used to (around here) and I'm yet to find an application that would
justify using it (yeah, I'm not proud of that).

If anyone from Cockroach happens to read this, I have some questions:

* Are you guys the first database _vendor_ to provide this as first class feature?

* How is the time travel performance-wise? What if I have a product that rely on this heavily?

* How the database would handle some deleted columns on the schema reappearing after some time? It would consider the same as before? What if the type is different?

Kudos for the feature, I barely know CockroachDB but now I have a good reason
to try it.

[1]: [http://www.datomic.com/](http://www.datomic.com/)

~~~
mjibson
Blog post OP here.

No, we are not the first SQL vendor to provide this. And many other non-SQL
databases also have this feature.

Performance is faster than other reads, in general, because of less risk of
transaction retries.

Columns can come and go, and change types, and that is all handled correctly.
If a column reappears as a different type, it will work correctly. This is
because we also version the table schemas each time they change, so we can
always fetch the correct schema when doing a time-travel query.

------
marknadal
This is really cool but slightly frustrating because both Datomic and GUN
(where I work) have talked about this for years. Several others mention
Snowflake, SQL Server, and Oracle are designed for this as well.

So while cool, it seems like CockroachDB's marketing/content team is what is
winning here. I don't know how they do it, but it is impressive.

~~~
mjibson
Nah, was just me (the engineer) not being aware of Oracle and MSSQL. I should
have written than I'm not aware of any other _SQL_ DB. I was aware of Datomic,
but it's not SQL. Just an error.

------
blandonnimrat
Snowflake has had the same functionality with similar syntax for some time
now, and for a longer default time period.

see: [https://docs.snowflake.net/manuals/user-guide/data-time-
trav...](https://docs.snowflake.net/manuals/user-guide/data-time-travel.html)

disclaimer - I work there.

~~~
teraflop
That sounds like comparing apples and oranges, though. For instance, according
to the documentation, every transaction in Snowflake locks the entire table,
so it doesn't seem like it would be usable for the kind of OLTP workloads that
Cockroach is designed for.

(Although I admit that I haven't tested this assumption, since Snowflake
appears to be proprietary.)

------
pmontra
There are two different kind of temporal queries. One is about getting the
state of the db at a given time. The other is to get the state about a given
time.

Example: I got paid yesterday and the operation will be recorded only
tomorrow.

The first kind of query AS OF today will never show the payment. The second
one will do (assuming that behavior is appropriate for the application, better
examples are possible.)

CockroachDB, SQLServer and Oracle implemented the first one. That's why they
are writing about backups.

Edit: see this PostgreSQL extension for the second kind of queries
[http://pgxn.org/dist/temporal_tables/](http://pgxn.org/dist/temporal_tables/)

~~~
pjungwir
Unfortunately the Postgres temporal_tables extension is about the first kind
not the second kind: "What did the database look like on April 1st?" not "What
did this customer look like on April 1st?". From your link:

> Currently, Temporal Tables Extension supports the system-period temporal
> tables only.

Also here is the same extension on github:
[https://github.com/arkhipov/temporal_tables](https://github.com/arkhipov/temporal_tables)

------
DanielShir
Off topic a bit, but is anyone using CockroachDB here and can share their
experience?

~~~
guftagu
The name itself is going to be a huge problem when you're proposing to use
this at your company.

~~~
jayzalowitz
Actually, its been easier for me in a large org. Maybe we should ask mongo if
it was an issue?

------
gaius
This says it is implemented in the GC - how does the history survive a
restart? Oracle does it because this history is kept in the translogs on disk.

~~~
kevincox
This is the same IIUC. They are talking about the disk gc.

------
Johnny555
Cool feature, how far back can I go? Looks like it depends on the setting of
the garbage collection interval with a default of 24 hours, how high can I
reasonably set that? Is there any overhead besides disk space? Would be nice
to have at least a week, so if I delete something on Friday evening, I can
still recover it the following week

~~~
state_machine
[engineer at cockroach]

How high you can set it depends on your access patterns -- there is some
overhead to iterating though MVCC revisions during reads, in addition to the
on-disk space you mention.

If your workload involves frequent writes to the same rows, GCing some of
those revisions sooner would have a greater impact, whereas if you have a
write-light workload, or if your writes are spread over rows such that a given
row doesn't doesn't see frequent repeated updates, then you could probably use
much a higher GC threshold with minimal overhead.

------
brobinson
Awesome to see this!

The only other DB I know of which has this is Oracle, where it's called
"Flashback". Are there any others?

~~~
rtpg
I don't know exactly how to do it in Datomic, but since it's append-only, it
must be possible to do something like "state of DB at this point in time".

~~~
valw
It is (without any time restriction re: GC etc.) The exact term would be
'accumulate-only' (technically, 'append-only' is more of a storage
implementation detail). Interestingly, you can also make 'WHAT IF' queries i.e
obtained a view _as if_ some writes were made (but without transacting them).

------
SEJeff
Reminds me of the "Freeze Frame Filesystem" that the CS dept @ Cornell is
working on:

[http://www.cs.cornell.edu/projects/quicksilver/public_pdfs/f...](http://www.cs.cornell.edu/projects/quicksilver/public_pdfs/fffs.pdf)

They want to power the backend of hundreds of millions of energy sensors for a
smart power grid.

------
jgalt212
With features like this you remove 1/2 of the raison d'etre for the
blockchain.

Away from applications that really need to solve the byzantine general
problem, I see no reason to use the blockchain.

In fact, I wonder if there are fintech start ups using language like
blockchain in their pitch docs, but in actually plan to use versioning
databases like this feature here of CockroachDB or Datomic.

------
overcast
This feature would actually be REALLY helpful for my
[http://imadefood.com](http://imadefood.com) project. Where I'm keeping
revision history for recipe changes. Basically do a time travel select, versus
a revision number select. Which I think works better in peoples minds.

------
Scarbutt
Could the same be accomplish using Postgres with Kafka's Event Sourcing
[http://martinfowler.com/eaaDev/EventSourcing.html](http://martinfowler.com/eaaDev/EventSourcing.html)
?

------
nwrk
Hats off to Cockroach Labs

------
baq
great to see some traction for temporal databases around here. my experience
tells me that pretty much every half-serious database would like to be at
least system time temporal or application time temporal and preferably both at
some point in time. the fact that this hasn't been more popular is kind of
baffling to me. i guess it's one of very hard and totally not cool problems.

