
Learn how to design large-scale systems - donnemartin
https://github.com/donnemartin/system-design-primer
======
ris
I'm quite tired of everyone wanting to build "large scale systems" and play at
being Netflix. The truth of the matter is the vast vast majority of people
will never need to do this with their project and instead will just end up
making an expensive to maintain mess with way too many moving parts.

At least as important as designing something that can scale up is designing
something that can scale _down_. You don't know when the organization's going
to need to deprioritize this project and be able to keep it running without
burning a couple of million in resources every year.

See: microservices. (as in, for the problem, not the solution)

~~~
FLUX-YOU
>I'm quite tired of everyone wanting to build "large scale systems" and play
at being Netflix. The truth of the matter is the vast vast majority of people
will never need to do this with their project and instead will just end up
making an expensive to maintain mess with way too many moving parts.

Few companies will take a product that actually needs large scale systems and
hire someone that has no prior experience.

If you want to actually build large scale systems, you have to start
somewhere.

Even if you just want to be an entry-level person on a team that builds large
scale systems to learn by experience, they are likely going to ask you
questions about that topic.

You may not need that many people to build large scale systems, but you still
need a pathway as people leave that particular niche.

~~~
ris
> Few companies will take a product that actually needs large scale systems
> and hire someone that has no prior experience.

No I think most people end up hiring those who have experience _creating_ big
complicated systems but haven't stuck around long enough for their chickens to
come home to roost.

~~~
mmt
That's an important point, especially with the oft-repeated statistic of
2-years as the average tenure of an engineer.

Of course, averages (even if true) are like stereotypes.

It would be interesting to see the tenure data on the experts
(consultants/implementers) of large-scale systems, other than at the iconic
ones (e.g. Google, Netflix).

~~~
ljw1001
I think it likely that people with large-scale experience who aren't at Google
would have lower tenures than average, simply because they're becoming more
valuable and most companies don't pay people their replacement wage if they've
been there very long.

~~~
mmt
The effect that you mention is already cited for the trend of lowering average
tenure of technical professionals, in general, so, absent specific evidence
that this subset's market value differential (market value less existing
employers' willingness to keep up) is increasing faster than average, there's
no reason to believe that's the reason for a shorter than average tenure.

We don't even know if the tenure is shorter than average.

Regardless, neither the primary motivation for a short tenure, nor even any
average would be particularly meaningful with regard to what I believe to be
ris's implied accusation:

Absent at least one tenure long enough to see through the consequences of the
creation of the large-scale system, such a creator cannot be truly considered
experienced with large-scale systems, no matter how _many_ such creations are
on the resume (even though the market values/hires the latter).

------
cjhanks
I see something comparable to these diagrams (it feels like) a half-dozen
times a year.

The architecture is in general 'fine'. But the communication paths between
subsystems are probably the easiest part of the problem. And in general,
re-organizing the architecture of a system is usually possible - if and only
if - the underlying data model is sane.

The more important questions are:

- What is the convention for addressing assets and entities? Is it consistent
and useful for informing both security and data routing?

- What is the security policy for any specific entity in your system? How can
it be modified? How long does it take to propagate that change? How
centralized is the authentication?

- How can information created from failed events be properly garbage
collected?

- How can you independently audit consistency between all independent
subsystems?

- If a piece of "data" is found, how complex is it to find the origin of this
data?

- What is the policy/system for enforcing that subsystems have a very narrow
capability to mutate information?

If you get these questions answered correctly (amongst others not on the tip
of my tongue), you can grow your architecture from a monolith to anything you
want.

~~~
rargulati
Outside of raw experience, what can you do, read, or learn to build the
intuition for formulating and answering the above questions?

I can answer the above for systems I've built, but I've spent quite a bit of
time with those systems. How do I get better at doing this during the planning
phases, or even better, for a system I'm unfamiliar with (ie. are there tools
you lean on here)?

~~~
cjhanks
A bit of caution: I haven't worked in distributed systems for some time now.
And I am sure there are many people more competent than I.

But in general the cliche of "Great artists steal" applies here. If
AWS/GCE/Azure (or any other major software vendor) is offering a service or a
feature, then it is almost certainly solving a problem somebody has. If _you_
don't understand what problem is being solved, then you cannot possibly
account for that problem in your design. Today, the manuals for these software
features are documented with unprecedented accuracy. Read them, and try to
reverse engineer in your head how _you_ would build them.

For example, AWS's IAM roles seem like a problem which could be solved by far
more trivial solutions. Just put permissions in a DB and query it when a user
wants to do something. Why do we need URNs for users, resources, services,
operations, etc? And why do those URNs need to map to URIs? Well, if you
look at the problem - it ends up being a big graph which is in the general
case immutable over namespaced assets. So reverse engineer that, how would
you build that?
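
As one possible starting point for that exercise, here is a minimal sketch of permissions expressed as immutable statements over namespaced names. It is not how AWS IAM is implemented: the `Statement` type, the `urn:` naming scheme, the `POLICY` contents, and the `is_allowed` helper are all hypothetical, and the "deny by default, explicit deny wins" rule is an assumption chosen for the sketch.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)  # frozen: a statement cannot be mutated once written
class Statement:
    effect: str      # "allow" or "deny"
    principal: str   # glob pattern over namespaced principal names
    action: str      # glob pattern over "service:operation" names
    resource: str    # glob pattern over namespaced resource names


def is_allowed(statements, principal, action, resource):
    """Deny by default; an explicit deny overrides any matching allow."""
    allowed = False
    for s in statements:
        matches = (
            fnmatchcase(principal, s.principal)
            and fnmatchcase(action, s.action)
            and fnmatchcase(resource, s.resource)
        )
        if not matches:
            continue
        if s.effect == "deny":
            return False
        allowed = True
    return allowed


# Hypothetical policy: names and namespaces are made up for illustration.
POLICY = (
    Statement("allow", "urn:user:alice", "storage:*", "urn:storage:reports/*"),
    Statement("deny", "urn:user:*", "storage:delete", "urn:storage:reports/audit/*"),
)
```

Because every principal, action, and resource lives in one consistent namespace, the same names can drive both the security check and the routing of a request to the owning service, which a bare permissions table does not give you for free.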

~~~
antpls
Hi,

I agree with you about reverse engineering the giants, it is one way of
acquiring knowledge.

However I disagree on :

> If AWS/GCE/Azure (or any other major software vendor) is offering a service
> or a feature, then it is almost certainly solving a problem somebody has.

AWS/GCE/Azure have industrialized the process of offering new building
blocks. The cost for them to launch and maintain a new service is lower than
it was a few years ago. They are logically able to experiment more with users,
and eventually shut down services that meet no actual need (or that overlap
with another service they offer). This is especially true for Google.

I have the intuition it also works as a marketing process: the more time you
spend reading their documentation, the more you accept their brand, and the
more likely you are, statistically, to buy something from them.

------
madethemcry
Oh interesting, I have never seen Anki
([https://apps.ankiweb.net/](https://apps.ankiweb.net/)) being used for large
blocks of source code.

Anki is an open source application (desktop + mobile) for spaced repetition
learning (aka flashcards). It's a very popular tool among people who want to
learn languages (and basically anything else you want to remember). There are
many shared decks
([https://ankiweb.net/shared/decks/](https://ankiweb.net/shared/decks/)).
Creating and formatting cards is also possible and pretty easy.

If you are planning to learn a language or anything else, give Anki a try. I
used it for all of my language learning efforts. With this, at least my
vocabulary is rock solid.
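
For anyone who hasn't seen spaced repetition before, the core idea can be sketched in a few lines. This is a deliberately simplified scheduler, not Anki's actual algorithm: the `Card` type, the `review` function, and the "double the interval on success, reset to one day on failure" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Card:
    front: str
    back: str
    interval_days: int = 1  # days to wait before the next review


def review(card, remembered, today):
    """Update the card's interval and return the day number of its next review."""
    if remembered:
        card.interval_days *= 2   # remembered: wait twice as long next time
    else:
        card.interval_days = 1    # forgotten: start over with a short interval
    return today + card.interval_days
```

Cards you keep getting right drift out to long intervals, so most of each daily session goes to the cards you keep missing.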

~~~
nemo1618
I've been experimenting with Anki recently. I loaded up a deck of popular
fonts, with the goal being to memorize them to the point where I could
recognize them in the real world. Each card contains the sentence, "The quick
brown fox jumped over the lazy dog," and you have to identify what font it's
written in.

The first day was really tough; I missed cards over and over again. 20 new
cards (the default) is probably too many for this type of study. But I kept at
it, once per day, and today (a week later) I can recognize nearly every font
in the deck, and the ones that I have trouble with are very similar to other
fonts (which is a useful thing to know in its own right; you can start to
group fonts into "families" with a common ancestor). Pretty cool!

There's just one problem: so far, this hasn't translated into any ability to
recognize fonts in the real world. I can think of a few reasons why. First,
there are a LOT of fonts out there; even the "most popular" ones don't show up
all that frequently. This is especially true for business logos, which like to
use unusual fonts to make themselves stand out. Secondly, I think studying by
memorizing a single sentence has caused me to "overfit!"

For example, there's a font that I can instantly recognize (Minion Pro) by how
the 'T' and 'h' look together at the start of the sentence. I don't pay
attention to anything else about the font, because that single feature is
enough to distinguish it from the rest of the deck. And this turns out to be
true for most fonts: Today Sans has a funny-looking 'w', Syntax has a funny-
looking 'x', etc. So if I see a logo written in Today Sans, but it doesn't
contain a 'w', I can't recognize it! Similarly, because the cards only contain
the one sentence, which is entirely lowercase except for the 'T', I can't
identify any fonts from an uppercase writing sample. What I _can_ do is say,
"Hmm, I don't know what that font is, but it definitely has a lot in common
with Myriad..." and then I look it up and find out that the actual font is
Warnock, which was designed by the same guy (Robert Slimbach) who designed
Myriad.

So yeah, Anki is pretty cool, but an unintended side effect is that it can
give you a striking sense of how a classification algorithm "feels" from the
inside. :)

~~~
bjterry
I would be careful with what you put in Anki. There is only so much stuff that
you can memorize outside of stuff you'd learn from normal life, because the
time you have to devote to flashcarding is kind of limited (except if it's
something that excites you it creates more time). I think generally when you
choose to make the investment to add cards to your Anki deck you should have a
really concrete use case and I don't see how you'd save time over the course
of an entire life for your font project.

There are a lot of fonts out there (~50,000 families according to random Quora
people). Their distribution is probably power-law-like even if you discount
the ones that are preinstalled on major platforms. It might make sense to
recognize a few if you want to be able to really deeply discuss the difference
in how they are used for design, but just recognizing them doesn't seem like
the right way to gain that understanding. If you repeatedly perform a task
where you have to recognize a font, learning only the top 100 won't help you
much since it will eventually become pretty obvious. If you don't do that
task, then why train for it instead of looking it up as necessary?

My thinking on "what's worth flashcarding" is that there are two major
categories where it makes sense. First, if you need to remember a bunch of
specific facts and you will need to recall them more quickly than they can be
looked up. This is the case for things like tests, but there are also
reasonable possibilities for this in real work (for example, if you are a
programmer you may know you are going to need to look up the parameter
ordering of a standard library method that you use only once a month, or you
could memorize it).

The second is where you are using the flashcards as a scaffold, but the actual
knowledge is something that references or brings together the facts that are
contained in the flashcards. Recognizing fonts fits into this category, but I
have a hard time imagining that actually recognizing them is the knowledge
that is most efficient. Instead maybe you should be studying the major
categories of fonts, features of fonts, or something that would help you make
quicker decisions for whatever the real task is. I used to be able to
recognize a lot of fonts and it's basically only useful as a parlor trick.

Although if you are new to design then learning the top 20 or whatever could
be helpful to just have a basic fluency with Arial vs Times New Roman vs Comic
Sans, so you have a shared vocabulary to discuss with others. "It's like Times
New Roman but more suited for headlines and all caps" for example.

This is the weird confluence of work I've done at multiple companies (in one
case I basically implemented a SRS like Anki with applications to finance
exams, and in another I did a lot of work with fonts for a laser cutting
design editor).

~~~
nemo1618
Thanks for the detailed reply. I agree that this isn't the best use of Anki; I
did it more as a way of testing how effective spaced repetition is. And in
that respect, the experiment was a success, so now I feel confident using this
system to memorize other things I care about. The ML insight was just a nice
surprise. :)

------
bandwitch
If you liked this page, you might also like the excellent book "Designing
Data-Intensive Applications" that among other things surveys many
characteristics of large-scale systems and presents some. Note that it's not a
book for preparing you for system design questions, but it can definitely help.

~~~
crystalPalace
I've been reading this and it's great so far. Are there other similar books
that describe modern enterprise architecture at scale?

~~~
jordanab
I'm currently reading "Building Evolutionary Architectures", and I'm liking it
so far.

------
agentultra
I'd add a section on using TLA+ as a design tool. Diagrams and rules of thumb
are useful but they don't catch errors or help you discover the correct
architecture. See the Amazon paper [0] on their use of TLA+ in designing (and
trouble-shooting) services.

[0] [https://lamport.azurewebsites.net/tla/formal-methods-amazon....](https://lamport.azurewebsites.net/tla/formal-methods-amazon.pdf)

~~~
bandwitch
I feel TLA+ would be too much to ask in a system design interview, which is
what this site is about.

In case anybody is interested, there is a nice talk by Hillel Wayne on youtube
([https://www.youtube.com/watch?v=_9B__0S21y8](https://www.youtube.com/watch?v=_9B__0S21y8))
that provides a high-level overview on what TLA+ is about.

~~~
agentultra

> I feel TLA+ would be too much to ask

I think so too but I suggest putting it in an article like this because I
think TLA+ will have wider industry adoption. If candidates know it they could
be better equipped to ask more interesting questions of their interviewers
even if the position doesn't require knowledge of TLA+. And as the industry
does adopt such practices it would be great to be prepared!

------
Zeebrommer
Can we please come up with a more specific name for this type of expertise? A
large-scale system can mean anything from a social security system to a
rocket. I was a bit disappointed that it only concerns websites here (though
I'm aware that I'm browsing HN).

~~~
cc-d
The label is fine.

Nobody is confused as to what a "system administrator" is, even though
technically the word "system" itself can have a much broader range of meaning.

~~~
mcqueenjordan
I'm not saying the label is wrong, but I agree with the parent's sentiment for
a more specific label. "How to design a large-scale CRUD system" seems more
precise.

Large scale systems come in many different shapes and forms; this is an
instance of one of them. Its learnings are interdisciplinary and cross-
functional, but this isn't the roadmap for other types of systems, especially
asynchronous reactive systems.

~~~
s-shellfish
I agree. From my inferences in reading the usage of the label, large scale
means not only users interacting with defined components that operate in
predefined, predictable, static ways, but also components that involve the
automation of development. This can be anything from the development of APIs,
testing frameworks, parsers, code generation - all the computer science stuff
basically.

Large scale usually means some aspect of the business is focused on catering
to developers, because the systems have become so complex that they require
some form of automating existing automation.

------
mabbo
This design, roughly, is being used very widely and is well-documented
everywhere. But does anyone know of any lesser-known yet equally functional
designs that work at the same scale?

Are there cases this design does _not_ work for?

~~~
bsenftner
Yes. One can use a C++ library like Restbed and embed the web server directly
into a compiled executable that uses SQLite as an embedded database. The
"large-scale, multi-system architecture" in such common use today is
completely unnecessary when faced with this setup. I have multiple Restbed
integrated applications whose entire disk footprint is 7MB; they can run on a
$99 Intel Compute Stick, perform industrial grade facial recognition with
multiple HD video streams, and still overwhelm traditional web stacks with
events and data when pertinent events the software needs to report start
emitting over the wire.

The "only catches" are the developer(s) need experience working in multi-
threaded C++, and they need to understand the traditional web stack they are
eliminating.

~~~
s_ngularity
What about fault-tolerance though? That's definitely a single point of failure
scenario.

~~~
bsenftner
Run as many instances as your fault-tolerance requirements need. The expense
of adding another physical box is trivial when that physical box is $99 to
$250 total to own. They "pay for themselves" in their first month of use,
versus any cloud configuration running any 'amp or node or Mean or simply
"traditional" web stack.

------
e12e
Interesting how the write api doesn't appear to invalidate/update the memory
cache in the first diagram.
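
To make the concern concrete, here is a minimal sketch of cache-aside reads with and without invalidation on the write path. It says nothing about what the primer's authors intended: the `Store` class and its two in-memory dicts, standing in for the SQL database and the memory cache, are hypothetical.

```python
class Store:
    """Cache-aside reads, with and without invalidation on write."""

    def __init__(self):
        self.db = {}     # stands in for the SQL database
        self.cache = {}  # stands in for the memory cache

    def read(self, key):
        if key in self.cache:
            return self.cache[key]       # cache hit
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value      # populate the cache on a miss
        return value

    def write_without_invalidation(self, key, value):
        # What the diagram appears to show: only the database is updated,
        # so a cached copy of the old value keeps being served.
        self.db[key] = value

    def write(self, key, value):
        self.db[key] = value
        self.cache.pop(key, None)        # drop the stale entry; next read reloads it
```

Without the `pop`, readers see the old value until the entry expires or is evicted, which may be acceptable for some data but should be a deliberate choice.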

Still recommend people read Fielding's REST thesis - as it demonstrates a lot
of possible architectures (eg fat client or what we today call SPAs) - not
simply REST. Along with some trade-offs. (REST is mainly motivated by
simplicity of a simple hypertext application coupled with easy multi-level
caching).

[https://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm](https://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm)

For a preview of SPAs before the prevalence of Javascript, see 3.5, in
particular 3.5.3 "code on demand":

[https://www.ics.uci.edu/~fielding/pubs/dissertation/net_arch...](https://www.ics.uci.edu/~fielding/pubs/dissertation/net_arch_styles.htm#sec_3_5)

And keep in mind the text is from 2000. Early Ajax was introduced in IE in
1999, and late 2000 in Mozilla - but it took a while for Ajax to become
standardized...

------
yread
I hoped this would help me with this problem I have - I'm coding a web app
with a smallish database (<1GB for the next few years, <1% writes). I need low
latencies for accessing it. And I would like to have multiple servers over the
world sharing the database.

~~~
lalwanivikas
you need to provide more details to get any useful advice. but just based on
what you have described, any db would do the job. add a caching layer and you
have your low latencies.

again, what is the traffic and bandwidth load like? peak and average values?
what kind of data are you planning to store? small values but huge volumes or
the opposite? a lot will change based on your system requirements.

~~~
yread
To clarify: let's say I have servers in two locations A and B that are 200ms
from each other. When I issue a write to the db in A I don't want to wait
(multiples) of the 200ms before it returns. I don't really care whether the
write appears to a reader at B in 5s or 50 minutes but of course the writes
have to be at least causally consistent.

I won't have millions (realistically not even thousands) of users and the
database will be comparatively small.

I've looked at NDB cluster but it feels quite complicated to set up and
maintain.
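
The behaviour described above (writes return locally, replication happens later and in order) can be sketched as follows. This is an illustration of the requirement, not a recommendation of a product or a complete design: the `Replica` class, its `outbox`, and the `ship` method are hypothetical, and the sketch ignores conflicting writes made at both sites and causal ordering across more than two sites.

```python
from collections import deque


class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peers = []          # other replicas to ship writes to
        self.outbox = deque()    # local writes not yet shipped

    def write(self, key, value):
        """Apply locally and queue for replication; never waits on a peer."""
        self.data[key] = value
        self.outbox.append((key, value))

    def ship(self):
        """Deliver queued writes to every peer in the order they were made.

        In a real system this would run in the background and cross the
        200ms link; here it is a direct in-process call.
        """
        while self.outbox:
            key, value = self.outbox.popleft()
            for peer in self.peers:
                peer.data[key] = value
```

Shipping in FIFO order means a reader at B never sees a later write from A without the earlier ones, however long the delay between `write` and `ship`.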

~~~
mindcrime
Consider Couchbase. It uses a combination of asynchronous writes and automatic
replication to do a pretty good job of giving low latency writes even at high
volume, while also ensuring data integrity. And since reads are served from
the cache if possible, you usually get really good read performance as well.

------
bovermyer
You know what hasn't been done? A blog post about how to make a service that
fulfills the needs of most people most of the time.

All of the online and print material about such things focuses on how to
achieve massive scale correctly. Don't get me wrong; this is valuable and,
generally, sound advice.

However, it also ignores the majority of use cases for software.

I would love to see a blog post here from someone who has solved a very
specific problem for a very small audience, and gotten a very enthusiastic
response. That would be meaningful on a larger scale for me.

~~~
0xFACEFEED
This is because the people solving real world problems aren't writing
books/tutorials/guides.

Real world system design is dirty. Mostly this is due to constraints (time,
cost, etc). And no one starts with zero architecture and 10 million users.

Guides like this serve no purpose other than to fatten vocabularies and
promote the "brand" of people who aren't actually doing the work (speakers,
educators, etc).

~~~
bovermyer
I didn't say I was looking for a guide. I'm looking for a story.

Surfacing things like this elevates the entire practice, since it illuminates
what that "dirty" work looks like.

------
NightlyDev
I find it fun to tinker with high performance and high scalability designs,
but I, like most others, have no need for it.

Start out small, make efficient systems and have scalability in the back of
your head when doing so. Don't do as so many others: "Oh, this lib seems
popular, let's just use that! Heck, the cart sometimes takes 8 minutes to
load, we need to add more nodes on AWS!"

Yeah, stuff like that happens.

At least in my book optimization usually beats scalability as the place to
start for more performance.

------
stvnw
Is there something similar for designing scalable front-end systems, going
into deep discussions about how certain companies resolve similar issues at
scale? I'd be interested if there is a resource like that out there.
Everything out there tailored to systems design and architecture is
entrenched in backend components.

------
robax
As a junior dev who one day wants to be in a senior position, this is super
helpful. I failed the system design portion of the triplebyte interview and
this would have been invaluable to me. Thank you!

------
nwsm
This is a nice followup to the web architecture post yesterday

~~~
tryonqc
He/she means this one:
[https://news.ycombinator.com/item?id=17517155](https://news.ycombinator.com/item?id=17517155)

Today's post is way more in-depth. Good follow-up indeed.

~~~
peterwwillis
Obligatory pedantic HN grammar comment: on the outside chance that the gp's
gender is not binary, the word 'they' is a good stand-in gender neutral
pronoun to 'he/she'. You also have at least 14 alternatives to choose from
([https://en.wikipedia.org/wiki/Third-person_pronoun#Summary](https://en.wikipedia.org/wiki/Third-person_pronoun#Summary))
and two more if you're at a Renaissance faire
([https://en.wikipedia.org/wiki/Third-person_pronoun#Historica...](https://en.wikipedia.org/wiki/Third-person_pronoun#Historical_and_dialectal_gender-neutral_pronouns)).
For the grammar snobs, this convention has existed since the 16th century.

~~~
spraak
I wouldn't say it's obligatory, especially since the poster already was aware
of not assuming gender by using "he/she" (though I know some people identify
as neither of those). I do prefer singular they; it's very natural and yes,
it's been around in English for a long time.

~~~
tryonqc
I thought I did a good thing :(

The use of their / they referring to a single person doesn't come naturally to
me as English is my 2nd language and we're taught it's plural. (it can indeed
be used as "third person plural singular" according to oxford dict.)

Since it's the "least bad" (to my ears) of the gender-neutral pronouns on the
wiki page I'll try to use the "they/their" instead.

------
d--
I'm teaching an intro distributed systems class and would like to share this
with my students. I was wondering about how general the linked interview
prepwork is. Are the Anki cards and sample interview questions mostly from
large companies (FB, Google, MS) or also applicable to interviewing at smaller
places?

At first look, seems like these are fairly general questions, which is great.

------
geggam
No database access layer ?

~~~
cirgue
What's the distinction between a database access layer and read/write apis? Is
that a semantic distinction or do they accomplish different things?

~~~
geggam
From my understanding, you get the ability to put the DAL into a "pause" mode
where it queues all the API requests, allowing you to do updates / upgrades to
the database with no downtime.

It also gives you a way of controlling what queries are used by the API
servers preventing a developer from doing silly things and creating a
production outage
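
A minimal sketch of those two properties, pause-and-queue plus a fixed set of permitted queries, might look like this. The `DataAccessLayer` class, the `ALLOWED` table, and the callback-style `call` method are hypothetical, and `execute` is whatever function actually talks to the database.

```python
from collections import deque


class DataAccessLayer:
    # Hypothetical whitelist: API servers can only run queries named here.
    ALLOWED = {
        "get_user": "SELECT name FROM users WHERE id = ?",
    }

    def __init__(self, execute):
        self._execute = execute    # callable(sql, params) supplied by the caller
        self._paused = False
        self._pending = deque()    # requests received while paused

    def call(self, name, params, on_result):
        if name not in self.ALLOWED:
            raise PermissionError(f"query {name!r} is not permitted")
        if self._paused:
            # Hold the request until the database is available again.
            self._pending.append((name, params, on_result))
            return
        on_result(self._execute(self.ALLOWED[name], params))

    def pause(self):
        self._paused = True

    def resume(self):
        """Leave pause mode and run the queued requests in arrival order."""
        self._paused = False
        while self._pending:
            name, params, on_result = self._pending.popleft()
            on_result(self._execute(self.ALLOWED[name], params))
```

From the API servers' point of view a paused DAL looks like a slow database rather than a failed one, as long as the maintenance window is shorter than their request timeouts.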

~~~
gnahckire
It also makes changing your DB a lot easier since APIs using the DAL don't
need to be updated since they're DB agnostic -- you "only" need to update the
DAL API.

~~~
pc86
How often does one change the DB backing a live production application?

~~~
twistedpair
I've done it twice. If you're experiencing significant growth or change in
access patterns, you may for example go from Postgres to a KV store.

In one of the cases where I had to switch, we swapped from Cassandra to S3 for
100x OpEx savings since C* couldn't scale cost effectively to our needs, so we
rolled a database on top of S3 instead that well outperformed C* for our use
case (e.g. need to export a 3B row CSV in a minute?).

~~~
pc86
I'm sure there are rare exceptions but I would imagine if you dug deeply into
the business rules around "I need to export a 3,000,000,000 row CSV file" and
into what the users are actually trying to accomplish at the end of that
workflow, you could find a solution that meets those goals better while also
obviating the need to export a 3,000,000,000 row CSV file.

------
squegles
This is a great outline for studying before interviews. I recently studied off
of this and can say it contributed to my success in SRE/Infra interviews.
Highly recommended!

------
ex_amazon_sde
Most of this stuff would not pass a design review at Amazon.

Anything that requires a fleet of (relational) databases to ensure consistency
will not work on a global scale.

------
tanilama
Large-scale in what sense? A web service running many instances doesn't by
itself indicate complexity.

------
pier25
Does anyone know what software is being used to draw the diagrams?

~~~
poxrud
OmniGraffle

------
visviva
*Software systems

------
mozumder
What kind of numbers are they talking about for it to be "large-scale"?

One well designed fast app server can serve 1000 requests per second per
processor core, and you might have 50 processor cores in a 2U rack, for 50,000
requests per second. For database access, you now have fast NVMe disks that
can push 2 million IOPS to serve those 50,000 accesses.

50,000 requests per second is good enough for a million concurrent users,
maybe 10-50 million users per day.
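
The arithmetic behind those figures can be checked directly. The inputs below are the numbers from this comment; the assumption that the server sustains its peak rate all day is mine and is optimistic, so treat the per-user results as upper bounds.

```python
REQUESTS_PER_SECOND_PER_CORE = 1_000  # figure from the comment
CORES_PER_SERVER = 50                 # figure from the comment
SECONDS_PER_DAY = 24 * 60 * 60

server_rps = REQUESTS_PER_SECOND_PER_CORE * CORES_PER_SERVER  # 50_000
requests_per_day = server_rps * SECONDS_PER_DAY               # 4_320_000_000


def requests_per_user_per_day(daily_users):
    """Request budget per user if the server ran at peak rate all day."""
    return requests_per_day / daily_users


def seconds_between_requests(concurrent_users):
    """How often each concurrent user can send a request at peak rate."""
    return concurrent_users / server_rps
```

With a million concurrent users, each one gets a request roughly every 20 seconds; spread over 50 million daily users, the budget is about 86 requests per user per day, and about 432 at 10 million.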

If you have 50 million users per day, then you're already among the largest
websites in the world. Do you really need this sort of architecture for your
startup system?

If anything, you'd probably need a more distributed system that reduces
network latencies around the world, instead of a single scale-out system.

~~~
matachuan
Why not have a scale-up system?

~~~
marcosdumay
Because it costs money and slows development and ops down. Is there a good
reason for getting it when you are not one of the ~200 companies in the world
with enough scale to use it?

------
sillysaurus3
Note that HN, a top-1000 site in the US, runs on a single box via a single
Racket process.

"The key to performance is elegance, not battalions of special cases."

~~~
nlawalker
Heh, elegance like "There is a story on the front page getting lots of
attention, please log out so we can serve you from cache."

~~~
tambourine_man
Admittedly, that’s very rare.

~~~
nlawalker
I know, just a good natured poke :) Plus you could probably take that comment
at face value - making use of web caching is definitely an important tool when
building a large scale system.

~~~
copperx
Why is caching out of the window when logged in?

~~~
paulddraper
It's a page-level cache, and your view on a page depends on username, hidden
submissions, point counts, etc.
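
A rough sketch of why that rules out the cache for logged-in users. This is not HN's code: the `PageCache` class and the `render_page` function are hypothetical, and bypassing the cache entirely for logged-in requests is an assumption made to keep the example short.

```python
def render_page(path, user=None):
    """Hypothetical renderer; a real one would look up stories, votes, etc."""
    viewer = user if user is not None else "anonymous"
    return f"<html>{path} as seen by {viewer}</html>"


class PageCache:
    def __init__(self):
        self._pages = {}   # keyed by path only: one entry per page
        self.renders = 0   # number of times a page had to be rendered

    def get(self, path, user=None):
        if user is not None:
            # The output depends on the user, so a cache keyed by path
            # alone cannot serve it; render every time.
            self.renders += 1
            return render_page(path, user)
        if path not in self._pages:
            self.renders += 1
            self._pages[path] = render_page(path)
        return self._pages[path]
```

Caching per user would mean one entry per page per user, with a far lower hit rate, which is why asking people to log out sheds load so effectively.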

