
Millions of Tiny Databases - feross
https://blog.acolyer.org/2020/03/04/millions-of-tiny-databases/
======
mjb
I'm one of the authors of this paper, along with Fan and Tao. I'm also a huge
fan and religious reader of The Morning Paper, so it's really cool to see
Adrian feature our work. If you don't know The Morning Paper, be sure to check
out some of the other stuff that Adrian writes: deep looks at a mix of
systems, ML, and even classic papers.

~~~
maxtollenaar
Congrats on the work!

------
dwohnitmok
Another AWS component with TLA+ use. I'll shill it again because I think TLA+
is one of the most practical formal methods toolkits out there at the moment.
It's great. Try it out.

You're not getting rid of implementation bugs with TLA+, but it's a huge
breath of fresh air as a formal documentation language.

~~~
mjb
> it's a huge breath of fresh air as a formal documentation language.

Yeah! When I started with TLA+ I was mostly enamored with model checking and
proofs. Those turned out to be useful, but an unexpected use, as a
really great way to write crisp descriptions of protocols and algorithms, is
probably a bigger benefit. That's one of the reasons I tend to choose PlusCal
over "raw" TLA+ these days: it's easier for others to read and engage with.

After publishing this paper, I had a great email conversation with Leslie
Lamport about this use of TLA+. We talked about TLA+'s use as a "low ambiguity
documentation" tool, and some of the cases where we've been able to resolve
conversations about ambiguities in our implementation because we had the TLA+
spec to fall back on.

~~~
dwohnitmok
Fascinating; did you guys stick almost exclusively to PlusCal (except
presumably for writing invariants in TLC)?

I'm quite partial to just using straight TLA+ for everything because it's both
what ultimately it all desugars to anyway and because it makes what you put in
TLC and what you write for your spec the same language. Plus once you're in
the mindset of TLA+, the syntax of PlusCal has always seemed more of a
distraction than anything else, but it does seem that PlusCal is a lot less
scary for an experienced developer with no TLA+ experience.

~~~
mjb
Speaking only for myself, because we don’t have anything approaching a
standard here, I do about 60% PlusCal and 40% straight TLA+.

Mostly the tradeoffs are the ones you mentioned. If I was the only audience of
what I was writing, I’d pick TLA+ every time, but for a broader audience
PlusCal can make this stuff much more approachable.

~~~
dwohnitmok
Makes sense. One last question while I still have you here. What does TLA+
adoption look like within AWS? I imagine it's probably still only a small
minority of teams, but exactly how small are we talking (you guys are the only
ones, 2-5, 5-10, or 10s?).

------
topspin
Reading this makes me think of GitHub. I recall GitHub having a large,
distributed MySQL database at the heart of the system and when a partition
developed the whole system faltered[1]. This seems ironic to me; git was
designed to be decentralized and one can imagine a design for GitHub that did
not involve a globe spanning MySQL database, or at least one that didn't
directly impact the operation of all GitHub repos when it falls over.

Parts of this blog post also align well with another recent post: Simple
Systems Have Less Downtime[2].

[1] https://github.blog/2018-10-30-oct21-post-incident-analysis/
[2] https://news.ycombinator.com/item?id=22471355

~~~
collyw
I wonder how many people actually need a _distributed_ version control system.
It seems to make git more complex than is necessary.

~~~
topspin
I used to wonder this. Now I don't. The notion that with some other VCS I would
not have a complete copy of the entire history available to me locally now
seems dysfunctional. You make branches at will and it troubles no one unless
you need it to. Git is a great improvement over all that came before. Some
bizarre default CLI behaviors are my only complaint.

------
tkyjonathan
I said this 3 years ago, the future is having an SQLite DB inside a container
for each one of your customers.

/s

------
pjc50
> When I think about minimising blast radius, I immediately think of bulkheads

This is an excellent model to have for high-reliability work. There are going
to be failures, so the design should provide means of containing the failures.

The paper is also good at recognising the risk of cascade failures in failover
systems, where a single excessive load causes a failure - but the process of
trying to move the load elsewhere _also_ becomes overloaded.
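
To make the cascade concrete, here is a toy simulation in Python. It is only a sketch under stated assumptions: the function name, the equal-split redistribution rule, and the single shared capacity are all hypothetical and are not taken from the paper.

```python
def simulate_failover(loads, capacity):
    """Return indices of nodes still alive once failover settles.

    Assumption (hypothetical model): a node fails when its load exceeds
    `capacity`, and the load of failed nodes is split equally among the
    nodes that are still alive.
    """
    alive = {i: load for i, load in enumerate(loads)}
    while alive:
        overloaded = [i for i, load in alive.items() if load > capacity]
        if not overloaded:
            break  # every remaining node is within capacity
        # Remove the failed nodes and collect the load they were carrying.
        shed = sum(alive.pop(i) for i in overloaded)
        if not alive:
            break  # nothing left to absorb the load: total outage
        share = shed / len(alive)
        for i in alive:
            alive[i] += share
    return sorted(alive)


if __name__ == "__main__":
    # Plenty of headroom: the two survivors absorb the failed node's load.
    print(simulate_failover([12, 4, 4], capacity=10))
    # Little headroom: moving the load overloads the survivors as well.
    print(simulate_failover([12, 8, 8], capacity=10))
```

With little spare capacity, one overloaded node takes the rest down with it, which is the failure mode bulkheads are meant to contain.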

------
jefurii
I thought this was going to be an article about all the SQLite databases
embedded in applications, and on smartphones and watches and other devices. Or
about how Archive.org publishes metadata for individual objects in
OBJECT_meta.sqlite files (alongside OBJECT_meta.xml).

------
webdva
Very inspiring work and achievement. It is perhaps the equivalent of the
application of quantum mechanics during the middle to late twentieth century.
As such, perhaps "decentralized general computing" is more plausible than I
thought.

~~~
teraflop
It's a great paper, but I think you're hugely exaggerating its significance.

As the authors themselves point out, none of the fundamental building blocks
of this system are particularly new. For example, the idea of partitioning a
very large dataset into lots of independent slices, each of which is handled
by its own Paxos group, is the same idea that forms the basis of Google's
Megastore and Spanner, the former of which is more than a decade old.
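
As a rough illustration of that partitioning idea, here is a minimal Python sketch. The names `cell_for_key` and `blast_radius` and the hash-modulo placement are assumptions made for the example; this is not how Megastore, Spanner, or the system in the paper actually assign data to groups.

```python
import hashlib


def cell_for_key(key: str, num_cells: int) -> int:
    """Map a key to one of `num_cells` independent slices.

    Hypothetical placement rule: a stable hash of the key, modulo the
    number of cells. Each cell would be served by its own small
    consensus group, so cells share no state with each other.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_cells


def blast_radius(failed_cells, num_cells: int) -> float:
    """Fraction of cells (and roughly of keys) hit by a set of failures."""
    return len(set(failed_cells)) / num_cells


if __name__ == "__main__":
    cell = cell_for_key("volume-1234", num_cells=1000)
    print("volume-1234 lives in cell", cell)
    # Losing one cell out of 1000 affects about 0.1% of keys.
    print(blast_radius({cell}, num_cells=1000))
```

The point of the sketch is that a failure in one group only touches the keys mapped to it, rather than the whole dataset.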

Most of the interesting stuff in this paper is the discussion of the nuts-and-
bolts of software engineering, such as testing, deployment and monitoring.

~~~
thedance
People find this stuff exciting because most of our industry works on systems
that are at least ten years behind the state of the art. A good example is
HDFS, an extremely bad likeness of GFS which itself wasn't great and died ten
years ago. If you describe Colossus/D in detail many people in our industry
will think it's really amazing, but of course that's more than ten years old
now. Many people will choose HDFS for new systems in new designs, today. You
can spend your whole career without getting so much as a whiff of the state of
the art.

~~~
vkazanov
And what are the better modern alternatives to hdfs among distributed file
systems?

~~~
thedance
I would say that a distributed filesystem is a solution looking for a problem
in most cases. Amazon S3 or Google Cloud Storage address some use cases, and
Google Cloud Bigtable is a direct drop-in replacement compatible with the
HBase API but with dramatically better performance and reliability. There
are other use cases that have other alternatives; it all depends on what you
plan to do with the data on the filesystem, how far you need to scale it, and
whether your clients are in your own datacenters or in vendor clouds.

