Still, I think it’s the right default to start with serializable. Then, when you have performance issues, you can think long and hard about whether relaxed isolation levels will work in a bug-free way. Better to start with a correct application.
Sybase SQLAnywhere implements (or at least did) strict serialization by taking an exclusive row lock on all rows... which you can imagine scales horribly for a table with a reasonable row count.
I found out the hard way at work. I had assumed it took an exclusive lock at the table level only; the documentation didn't really spell out the details of how it enforced the serialized access.
I changed it to a retry loop, which worked fine and was fairly easy to implement, all things considered. Not gonna reach for strict serialization again unless I have to.
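For what it's worth, the retry loop is basically the standard pattern: begin, run, and on a serialization/deadlock failure roll back, back off, and try again. A generic sketch in Go with database/sql (not the actual code; the error classification and backoff here are placeholders you'd swap for your driver's specific error codes):

```go
package retrysketch

import (
	"database/sql"
	"time"
)

// isRetryable is a placeholder: real code would inspect the driver's error
// codes (e.g. a serialization failure or deadlock code) instead of retrying
// everything.
func isRetryable(err error) bool {
	return err != nil
}

// runWithRetry retries a transactional closure a bounded number of times,
// rolling back and backing off between attempts.
func runWithRetry(db *sql.DB, attempts int, fn func(*sql.Tx) error) error {
	var err error
	for i := 0; i < attempts; i++ {
		tx, beginErr := db.Begin()
		if beginErr != nil {
			return beginErr
		}
		if err = fn(tx); err == nil {
			if err = tx.Commit(); err == nil {
				return nil // success
			}
		} else {
			_ = tx.Rollback()
		}
		if !isRetryable(err) {
			return err
		}
		// crude linear backoff before the next attempt
		time.Sleep(time.Duration(i+1) * 50 * time.Millisecond)
	}
	return err
}
```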
The object storage stuff is new, but the older architecture is by now mostly confirmed: MPP with shared (S3) storage, and everything above that (the compute layer) on local SSD, delivers the best performance. Even Snowflake finally came out with "interactive" warehouses built on this architecture.
Parquet, Iceberg, and other open formats seem good, but they may hit a complexity wall. There's already some inconsistency between platforms, e.g. with delete vectors.
Incremental view maintenance interests me as well, and I would like to see it more available on different platforms. It's ironic that people use dbt etc. to test every little edit of their manually coded delta pipelines, but don't look at IVM.
They should definitely include D4M and GraphQL [1], [2].
Not only can D4M cater for structured relational data, it's also suitable for unstructured and sparse data in spreadsheets, matrices, and graphs. It's essentially a generalization of SQL, but for all things data.
There's also an integration of D4M with SciDB [3].
[1] D4M: Dynamic Distributed Dimensional Data Model:
It should be this way. Clients should have some protocol to communicate the schema they expect to the database, probably with some versioning scheme. The database should be able to serve multiple mutually compatible views over the schema (staying robust to column renames, for example). The database should manage and prevent the destruction of in-use views of that schema. After an old view has been made incompatible, old clients needing that view should be locked out.
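A rough sketch of the client side of such a protocol, where everything (the schema_versions catalog table, the versioned view names) is hypothetical:

```go
package schemaver

import (
	"database/sql"
	"fmt"
)

// The client declares the schema version it was built against and only talks
// to views pinned to that version. None of this is a real driver feature;
// it's just the shape of the protocol described above.
const expectedSchemaVersion = 3

// checkCompatible asks the server whether the client's expected version is
// still among the views it is willing to serve; retired versions are locked out.
func checkCompatible(db *sql.DB) error {
	var supported bool
	// schema_versions is an assumed catalog table tracking which versioned
	// views are still alive; the server would refuse to drop a view while
	// it is marked as in use here.
	err := db.QueryRow(
		`SELECT is_supported FROM schema_versions WHERE version = $1`,
		expectedSchemaVersion,
	).Scan(&supported)
	if err != nil {
		return err
	}
	if !supported {
		return fmt.Errorf("schema version %d has been retired; client must upgrade",
			expectedSchemaVersion)
	}
	return nil
}

// Queries then go through a view named for the version (e.g. orders_v3),
// which the database keeps compatible independently of the underlying tables.
func ordersView() string {
	return fmt.Sprintf("orders_v%d", expectedSchemaVersion)
}
```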
> The database should manage and prevent the destruction of in-use views of that schema. After an old view has been made incompatible, old clients needing that view should be locked out.
this is the interesting part where the article's process matters: how do you make incompatible changes without breaking clients?
You’re right that it would run on a blockchain, but that fact would exist primarily to power some marketing. Everybody would end up interacting with it through a single centralized web site and API, because that’s the only usable way to get it to work.
Thank you for writing this. This comes up constantly, and it'll be great to have another reference to cite.
Another interesting thing about TPC-C is how the cross-warehouse contention was designed. About 10% of new-order transactions need to do a cross-warehouse transaction. If you can keep up with the workload, the rate of contention stays relatively low; most of the workload isn't pushing on concurrency control. If, however, you fall behind and transactions start to take too long, the contention will pile up.
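To make the number concrete: as I understand the spec, each of the roughly ten order lines in a new-order has about a 1% chance of being supplied by a remote warehouse, which works out to roughly 1 - 0.99^10 ≈ 10% of new-orders crossing warehouses. A toy sketch:

```go
package tpccsketch

import "math/rand"

// newOrderIsRemote sketches how cross-warehouse new-orders arise in TPC-C
// (per my reading of the spec; treat the constants as approximations): each
// order line has a small chance of a remote supplying warehouse, and with
// ~10 lines per order that comes out to roughly 10% of transactions.
func newOrderIsRemote(rng *rand.Rand, numWarehouses int) bool {
	if numWarehouses <= 1 {
		return false
	}
	lines := 5 + rng.Intn(11) // 5..15 order lines, ~10 on average
	for i := 0; i < lines; i++ {
		if rng.Intn(100) == 0 { // ~1% chance per line of a remote supplier
			return true
		}
	}
	return false
}
```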
When you run without the keying time, it turns out that concurrency control begins to dominate. For distributed databases, concurrency control and deadlock detection are fundamentally more expensive than they can be for single-node databases, so it makes sense that a classically single-node database would absolutely trounce distributed databases. I like to think of TPC-C "nowait" as really a benchmark of concurrency control because, due (I believe) to Amdahl's law, the majority of its execution time ends up in the contended portion of the workload.
Also very interesting that, as Justin points out, the workload sets up the warehouses so there is never cross-node contention. That's wild! I'm glad they didn't go and benchmark against even more distributed databases (like YugabyteDB, Spanner, or CockroachDB) and call it a fair fight.
Folks, for the love of god, please please stop running TPC-C without the “keying time” and calling it “the industry-standard TPCC benchmark”.
I understand there are practical reasons why you might want to just choose a concurrency and let it rip at a fixed warehouse size and say, “I ran TPC-C”, but you didn’t!
TPC-C, when run properly, is effectively an open-loop benchmark where the load scales with the dataset size: there is a fixed number of workers per warehouse (2?), each issuing transactions at some rate. It's designed to have a low level of built-in contention, driven by the frequency of cross-warehouse transactions; I don't remember the exact rate, but I think it's something like 10%.
The benchmark has an interesting property: if the system can keep up with the transaction load by processing transactions quickly, it remains a low-contention workload, but if it falls behind and transactions start to pile up, the number of contending transactions in flight will increase. This leads to a non-linear degradation mode even beyond what normally happens with an open-loop benchmark: you hit some limit and performance falls off a cliff, because now you have to do even more work than just catching up on the query backlog.
When you run without think time, you make the benchmark closed loop. Also, because you’re varying the number of workers without changing the dataset size (you have to vary something to make your pretty charts), you’re changing the rate at which any given transaction lands on the same warehouse as another in-flight transaction. So you’ve got more contending transactions generally, but worse than that, because of Amdahl’s law the uncontended transactions fly through, so most of the time for most workers will be spent sitting and waiting on contended keys.
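To make the open-loop vs. closed-loop distinction concrete, here’s a rough sketch of the two driver shapes (the pacing numbers are made up; the spec prescribes specific keying and think time distributions per transaction type):

```go
package loadgen

import (
	"math/rand"
	"time"
)

// Closed loop ("nowait"): each worker fires the next transaction as soon as
// the previous one returns, so offered load is bounded only by latency and
// contention, and adding workers mostly adds contention on the same warehouses.
func closedLoopWorker(runTxn func()) {
	for {
		runTxn()
	}
}

// Open loop with keying/think time (roughly what the spec intends): each
// emulated terminal is pinned to a warehouse and paces itself, so offered
// load is fixed by the number of warehouses, not by how fast the system is.
// The delay below is a placeholder, not the spec's actual distribution.
func openLoopTerminal(warehouseID int, runTxn func(warehouse int)) {
	for {
		keyingAndThink := time.Duration(5+rand.Intn(10)) * time.Second
		time.Sleep(keyingAndThink)
		runTxn(warehouseID)
	}
}
```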
percona/sysbench-tpcc has been subsequently updated to include a stronger disclaimer that it's "TPC-C-like" and doesn't comply with multiple TPC-C requirements. Fingers crossed that this helps stop vendors from doing non-TPC-C benchmarking without realizing it.
Some time ago I worked on CockroachDB, where I was implementing planning for complex online schema changes.
We really wanted a model that could convincingly handle, and reasonably schedule, arbitrary combinations of schema change statements that are valid in Postgres. Unlike MySQL, Postgres offers transactional schema changes. Unlike Postgres, Cockroach strives to implement online schema changes with a protocol inspired by F1 [0]. Also, you want to make sure you can safely roll back (until you’ve reached the point where you know the change can’t fail; after that, only metadata updates are allowed).
The model we came up with was to decompose all the things that can possibly change into “elements” [1], where each element has a schedule of state transitions that moves it through a sequence of states from public to absent, or vice versa [2]. Each state transition has operations attached to it [3].
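Very roughly, and only as an illustration (the real CockroachDB types look nothing like this), an element and its schedule are the kind of thing you could model as:

```go
package scsketch

// Illustrative only: each element walks a fixed schedule of states, and each
// transition carries the operations needed to take it.
type Status int

const (
	Absent Status = iota
	DeleteOnly
	WriteOnly
	Public
)

// Element is one atom of the schema that can change: a column, an index,
// a constraint, and so on.
type Element struct {
	Kind    string // e.g. "Column", "SecondaryIndex"
	Current Status
	Target  Status // Public when adding, Absent when dropping
}

// Transition is one hop in an element's schedule, with the side-effecting
// operations (descriptor updates, backfills, validations) attached to it.
type Transition struct {
	From, To Status
	Ops      []string // stand-in for real operation types
}

// scheduleForAdd shows the F1-style shape of the schedule for an element
// being added; dropping walks a similar schedule in reverse.
func scheduleForAdd() []Transition {
	return []Transition{
		{From: Absent, To: DeleteOnly, Ops: []string{"add descriptor"}},
		{From: DeleteOnly, To: WriteOnly},
		{From: WriteOnly, To: Public, Ops: []string{"backfill", "validate"}},
	}
}
```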
Anyway, you end up wanting to define rules that say that certain element states have to be entered before others if the elements are related in some way, or that some transitions should happen at the same time. To express these rules I created a little datalog-like framework I called rel [4]. It lets you embed a rules engine in Go and add indexes to it, so that you get a sufficiently efficient implementation and know statically that all your lookups are indexed. You write the rules in Go [5]. To be honest, it could be more ergonomic.
The rules are written in Go, but for testing and visibility they produce a datomic-inspired format [6]. There are a lot of rules now!
The internal implementation isn’t too far off from the search implementation presented here [7]. Here’s unify [8]. The thing has some indexes and index selection for acceleration; it also has inverted indexes for set-containment queries.
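If “unify” sounds opaque: the core idea is just matching a pattern containing variables against concrete facts while accumulating bindings. A toy version in Go, nothing like the real rel code:

```go
package unifysketch

// Toy unification over flat facts, to give a flavor of what a datalog-ish
// rules engine does internally; the real implementation [8] is far more
// involved (typed attributes, indexes, inverted indexes, and so on).

// A Term is either a bound constant or a named variable.
type Term struct {
	Var   string // non-empty means "variable"
	Const string
}

// Bindings maps variable names to the constants they've been unified with.
type Bindings map[string]string

// unify tries to match a pattern of terms against a concrete fact, extending
// the bindings; it returns false if any position conflicts.
func unify(pattern []Term, fact []string, b Bindings) (Bindings, bool) {
	if len(pattern) != len(fact) {
		return nil, false
	}
	out := Bindings{}
	for k, v := range b {
		out[k] = v
	}
	for i, t := range pattern {
		if t.Var == "" {
			// constant position: must match the fact exactly
			if t.Const != fact[i] {
				return nil, false
			}
			continue
		}
		// variable position: bind it, or check the existing binding
		if bound, ok := out[t.Var]; ok {
			if bound != fact[i] {
				return nil, false
			}
		} else {
			out[t.Var] = fact[i]
		}
	}
	return out, true
}
```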
It was fun to make a little embedded logic language and to have had a reason to!
While that’s sort of true, there’s a lot of language-specific work that goes into making the UX of a debugger pleasant (think container abstractions, coroutines, vtables, and interfaces). Async Rust and Tokio specifically get pretty interesting for a debugger to deal with.
Also, there’s usually some language- (and compiler-) specific garbage that makes the DWARF hard to use and requires special treatment.