
Systems Design for Advanced Beginners - _ttg
https://robertheaton.com/2020/04/06/systems-design-for-advanced-beginners/
======
dzink
As a solo tech founder of several sites, I’ve had the pleasure of digging into
each of these problems and more over the past few years. Some topics worth
adding to the list for consumer sites:

1\. Prototype and benchmark each of your stack pieces before you pick a stack.
It is far easier to fix architecture mistakes when you don’t have 10000 users
expecting overnight customer service. If you are using new technology, find
good open source products to see how they’ve structured their projects. Your
architecture, designed for speed and experience, will be your key
differentiator. Inherent speed at the core task, design for user needs, and name
choice are the three musketeers of solid growth.

2\. Prepare for abusers to attack your system from every direction, especially
if you enable users to publish content under your domain. You will see bots
looking for Wordpress installations, users trying to fill content with SEO
links, and users trying every hacking vector on the market. Collect known
vectors and test for them, and never refuse a legitimate bug bounty request.

3\. There is an eternal debate about the trade off between building a quality
product at start or opening early to get user feedback before you get too deep
into features. There is merit to both choices as a solo founder. The moment
you open the gates to users, your ability to make changes comes with very high
friction. With time, trying new features becomes a tremendous luxury hidden
under bug requests, roadmap, customer service replies, etc. Build your biggest
riskiest assumptions first.

4\. Testing is hard as a solo founder. Selenium is your friend. If you don’t
spend time on tests, your users will take that time back in multiples after a
mistake.

The best way to learn is to come up with a product you really want and build
it in your own time. You can test launch in a weekend.

When I started launching consumer sites solo as an engineer, I went from a
tight specialization to being unafraid to try any tech if it gives me an
advantage in solving a problem. Once you’ve simulated enough problems and
dealt with the consequences of your choices personally, you can play that
10-level chess game with the architecture of each new feature much faster.

~~~
cactus2093
This is interesting advice because every company I've come across that has
been successful enough to get past the early startup phase has by and large
ignored all of this.

The companies I've seen succeed were 100% focused on shipping their product to
customers. Not 90% focused on customers and 10% focused on code quality, but
100% focused. They'd rather have to spend 30 engineer-days a few years from
now fixing an issue if they get to that point than spend 3 hours getting it
right upfront.

As an engineer, that goes against every instinct I have; it really seems like
spending a couple hours upfront must be a better use of time. It seems like it
should be possible to spend 10% of your time setting yourself up well for the
future, that's still just a rounding error of your time. And then if you do
survive another few years, you'll have a huge leg up on other series B or C
stage competitors if you're not hindered by a lot of tech debt at that point.

But from a capitalist perspective, it's probably not so crazy. If you are
working with a $200,000 seed round in the beginning, and an engineer costs
$80,000 a year, 3 hours of their time costs $115 which is 0.06% of your funds.
And more importantly, that $200k is maybe enough for a year of runway, so 3
hours is 0.14% of the time you have to live given a 40-hour a week year (or
0.07% of an 80 hour a week year). Every bit of that starts to add up. Whereas
by the time you're a later-stage company and you've raised, say, $40 million
and are paying engineers $150k, 30 engineer-days of work is $17,300,
but that's only 0.04% of the money you've raised plus your runway is now
approaching infinity if you're close to profitable.
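
The back-of-the-envelope numbers above can be reproduced with a short sketch.
It assumes a 52-week working year, 8-hour engineer-days, and that salary is the
only cost of an engineer's time; none of those assumptions are stated in the
comment itself, and the function and variable names are made up for
illustration.

```python
# Rough reproduction of the runway arithmetic, under the stated assumptions.
HOURS_PER_YEAR_40 = 40 * 52  # 2,080 hours in a 40-hour-a-week year
HOURS_PER_YEAR_80 = 80 * 52  # 4,160 hours in an 80-hour-a-week year


def cost_of_hours(salary, hours, hours_per_year=HOURS_PER_YEAR_40):
    """Salary cost of `hours` of work (assumes salary is the whole cost)."""
    return salary / hours_per_year * hours


# Seed stage: $200k raised, $80k engineer, 3 hours of "getting it right".
seed_cost = cost_of_hours(80_000, 3)              # roughly $115
seed_share_of_funds = seed_cost / 200_000         # roughly 0.06%
seed_share_of_runway_40 = 3 / HOURS_PER_YEAR_40   # roughly 0.14%
seed_share_of_runway_80 = 3 / HOURS_PER_YEAR_80   # roughly 0.07%

# Later stage: $40M raised, $150k engineer, 30 engineer-days of cleanup.
late_cost = cost_of_hours(150_000, 30 * 8)        # roughly $17,300
late_share_of_funds = late_cost / 40_000_000      # roughly 0.04%

print(f"seed:  ${seed_cost:,.0f} = {seed_share_of_funds:.2%} of funds")
print(f"later: ${late_cost:,.0f} = {late_share_of_funds:.2%} of funds")
```

The later fix costs about 150 times more in dollars, yet it is a smaller
fraction of the money available at that point, which is the trade-off being
described.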

I'm still kind of playing devil's advocate here, my instinct really wants to
believe that a better balance than what I've seen is possible. One huge
missing factor is that people have a strong tendency to ignore spread out
costs like the time wasted fixing bugs that pop up later whereas the upfront
cost of writing a bunch of tests is more visible. But it has been interesting
for me to consider that maybe most founders really are acting pretty
reasonably, even though it originally seems pretty careless and lazy to let
your startup build up a ton of tech debt early on.

~~~
jacobr1
There are compounding gains from making "the right choice all else equal."
This is one of those semi-mythical 10x engineer super-powers. Or it might just
be experience. But technical choices need to be made and making slightly
better ones will pay off even in the short term. The technical changes should
just optimize for rapid change rather than stability or scale.

------
semicolonandson
Thought I'd post this here since I imagine there's a strong cross-over of
interest:

If you'd like a guided video code-tour of how all the pieces of a production
web-app fit together, I publish detailed weekly screencasts showcasing the
code, systems, and architecture behind Oxbridge Notes, the business that's
supported me for the past decade.

So far I've covered:

— software dependency vetting

— data integrity systems (constraints, foreign keys, transactions etc.)

— integration testing systems

— trade-offs in software quality between customer-facing and admin areas

— softer stuff, like designing for SEO (marketing ease is a critical part of
any system I design, as an indie-hacker)

[https://www.semicolonandsons.com/series/Inside-The-Muse](https://www.semicolonandsons.com/series/Inside-The-Muse)

~~~
sciencewolf
Hey Jack,

Just a heads up - I tried to sign up for your mailing list but kept seeing
"ERROR: Did you forget to type your email?" even after typing the email. I've
tried a few different emails and also tried viewing the page in incognito
mode. Hope you get this fixed soon, as I'd really love to get more of your
content!

~~~
igneo676
I get the same error :/

Also, it'd be great to be able to turn off the mailing list sign up on the
videos. I'm already technically subscribed via RSS and it'd be great if the
videos play through without intervention

------
moonchild
They have this diagram:

    
    
      +-----------+   +--------------+   +-----------------+
      |Web Browser|   |Smartphone App|   |Client Libraries/|
      +-----+-----+   +------+-------+   |Other API code   |
            |                |           +-------+---------+
            |                v                   |
            |          +-----+------+            |
            +--------->+ Steveslist +<-----------+
                       |  Servers   |
                       +------------+
    
    

Why not this?

    
    
      +-----------+   +-----------------+   +--------------+
      |Web Browser|-->|Client Libraries/|<--|Smartphone App|
      +-----------+   |Other API code   |   +--------------+
                      +-------+---------+
                              |
                              v
                        +-----+------+
                        | Steveslist |
                        |  Servers   |
                        +------------+

~~~
filipn
Yes, I think the second diagram that you showed is the preferred way of doing
things, since using or "dogfooding" the client libraries will ensure they are
always tested and correct, plus you'll get immediate feedback from the other
developers who work on the apps as well.

~~~
amarant
In this day and age, your client libraries should be generated automatically.
And of course used in your apps/webpage to save time.

See openapi/swagger for good examples of how client generation works.

~~~
awofford
I've been using openapi/swagger for generating my clients and honestly think
it creates a mess of files and manually keeping clients up to date is not a
huge task. I'm not sure that I buy into "your client libraries should be
generated automatically" as a blanket statement.

~~~
kqr
Granted, APIs aren't updated often, but when they are, generated clients can
save a lot of time. I can update a generated client in half an hour and be
very confident of its correctness, whereas a manually constructed client would
take hours to update and test.

Of course, the generated client is not very convenient, being a 1-to-1 mapping
with the API. You build a high-level client with the more common operations on
top of the generated one.

It's possible to do this in a way that gets you the best of both approaches,
with some up-front planning.
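
One way to picture that layering is sketched below. Everything in it is
hypothetical: `GeneratedListingsApi` stands in for whatever a generator would
emit (one method per endpoint), `ListingsClient` is the hand-written
convenience layer, and `fake_transport` replaces the network so the example
runs on its own. The `/listings` endpoint and its response shape are
assumptions, not part of any real API.

```python
from dataclasses import dataclass


class GeneratedListingsApi:
    """Stand-in for a generated client: a 1-to-1 mapping onto the API."""

    def __init__(self, transport):
        self._transport = transport

    def get_listings(self, page, page_size):
        # Mirrors a hypothetical GET /listings?page=..&page_size=.. endpoint.
        return self._transport(
            "GET", "/listings", {"page": page, "page_size": page_size}
        )


@dataclass
class Listing:
    id: int
    title: str


class ListingsClient:
    """Hand-written layer: hides pagination behind one common operation."""

    def __init__(self, api, page_size=2):
        self._api = api
        self._page_size = page_size

    def iter_all_listings(self):
        page = 1
        while True:
            body = self._api.get_listings(page=page, page_size=self._page_size)
            for item in body["items"]:
                yield Listing(id=item["id"], title=item["title"])
            if not body["has_more"]:
                return
            page += 1


def fake_transport(method, path, params):
    """In-memory replacement for HTTP so the sketch needs no server."""
    data = [{"id": i, "title": f"Listing {i}"} for i in range(1, 6)]
    start = (params["page"] - 1) * params["page_size"]
    end = start + params["page_size"]
    return {"items": data[start:end], "has_more": end < len(data)}


client = ListingsClient(GeneratedListingsApi(fake_transport))
titles = [listing.title for listing in client.iter_all_listings()]
print(titles)
```

When the API changes, only the generated class is regenerated; the wrapper
needs touching only if an operation it relies on changed shape.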

------
dreamcompiler
Nice tutorial! One small nitpick:

> Calculating a hash value from an input is computationally very easy, but
> reversing the transformation and recovering the original input from its hash
> value takes so much time and computing power that it is, practically-
> speaking, impossible.

The above is true for encryption but not for hash codes. Recovering the
original input from a hash code is not just practically impossible; it's
_provably_ impossible -- even with an infinite amount of computing power --
because in general a hash code contains less information than the original
text.

~~~
sz4kerto
Actually, it might be occasionally possible. Not perfectly, of course. If the
input space is limited and you have other constraints on it, then you might be
able to find the original text. For example, let's say your input is a
280-character tweet and the hash is only 16 bytes long. If you find that "I AM A
STABLE GENIUS" is one of the inputs that hashes to the given hash value, then
you can be reasonably sure that this was the original text, given that most of
the other inputs with the same hash won't be meaningful English sentences.

~~~
talaketu
I'd say the count of believable English tweets would be many orders of
magnitude greater than the count of 16-byte hashes, so in general we can't just
find a believable tweet and declare that we found the plaintext. There would
be many possible collisions.

The phrase you mentioned is notable (with a slight edit) - so I suppose you
mean you could index phrases by notability, then it may be tractable to find
tweets of notable phrases, since the number of notable phrases is relatively
low.
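
Both halves of this sub-thread can be sketched in a few lines. The first part
shows the counting argument: once there are more inputs than possible hash
values, some inputs must share a hash, so the hash alone cannot identify the
input. The second part shows the "notable phrases" idea: if the candidate list
is small, simply hashing every candidate finds the match. The 16-byte BLAKE2b
digest, the one-byte truncation, and the phrase list are all choices made for
illustration; nothing here is taken from the article.

```python
import hashlib


def digest16(text):
    """16-byte digest of a string (BLAKE2b with a 16-byte output)."""
    return hashlib.blake2b(text.encode("utf-8"), digest_size=16).digest()


# Part 1: pigeonhole. Truncate to one byte so collisions are easy to see:
# 257 distinct inputs cannot map to 257 distinct one-byte values.
inputs = [f"tweet {i}" for i in range(257)]
one_byte_hashes = {digest16(t)[:1] for t in inputs}
print(len(inputs), "inputs ->", len(one_byte_hashes), "distinct 1-byte hashes")


# Part 2: a small candidate space can be searched exhaustively.
def find_preimages(target, candidates):
    """Return every candidate whose digest equals `target`."""
    return [c for c in candidates if digest16(c) == target]


notable_phrases = ["to be or not to be", "i have a dream", "hello world"]
target = digest16("i have a dream")
matches = find_preimages(target, notable_phrases)
print(matches)
```

The search in the second part only works because the candidate list is tiny;
it says nothing about recovering an arbitrary input.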

------
Hyperborian
> How do they store their data?

SQL.

> How do their different applications talk to each other?

Proprietary APIs.

> How do they scale their systems to work for millions of users?

T H E C L O U D

> How do they keep them secure?

They just... don't.

> How do they make sure nothing goes wrong?

They just... don't.

> What are APIs, webhooks and client libraries, when you really get down to
> it?

Easily outsourced to India.

~~~
mberning
That last point is so painful and true. I work with a product that has dozens
if not hundreds of “connectors” to interface with other systems and you can
tell that most of them were farmed out to the lowest bidder.

------
gmanis
I have been doing a greater part of the things elaborated in the post as a
solo founder/freelance software person, albeit the scale of things I need to
handle is relatively small.

What kind of a job profile should I be looking at if I am in a position where
I absolutely need to have one?

I don’t slot particularly well into any one thing, I feel: neither a great
developer nor a great systems person. And I did do a bunch of recruiting and
tech consulting too.

~~~
fapjacks
Have you ever considered working technical support? As a lifelong generalist
programmer, I never imagined I would work in a support role beyond what every
programmer at a startup is expected to do, but I was persuaded to cross over
by someone I greatly respect. That support team was the most talented team I
ever worked with in almost twenty-five years in tech, and working in that role
grew many different skills both in and out of tech in ways I never imagined.
Hacking broken production environments back online and writing and pushing
bugfixes is exhilarating, and it was the single most valuable experience of my
career.

~~~
gmanis
Never considered support role to be honest. Maybe it feels a little inferior
after having those degrees I guess but that’s my own bias. I did think of like
a product management role at startups but they all expect you to have pedigree
or previous work ex in similar role. Catch-22!

~~~
fapjacks
Yeah, that's exactly what I thought, too. It is just one of those weird
cultural artifacts that when someone says "tech support" we automatically
conjure this mental image which is both not really technical and not really
flattering. It was definitely one of those life moments when I had to sit down
and talk myself out of a baseless prejudice.

Also, the sibling commenter mentions one of the big reasons I found so much
joy in it. This kind of hardcore technical support is in many respects closely
related to working on greenfield projects. In my experience, the fun stuff
usually ends up with some unfun caveats like company bureaucracy, office
politics, or having the tech stack chosen beforehand, this kind of thing. But
there's an interesting inversion when a company's production environment is
broken and you're the one that can fix it. Obstacles magically disappear and
you have a _tremendous_ amount of freedom to do your job, with effectively two
different companies doing what they can to enable your work, because the only
thing everyone cares about is getting it working again.

You do have to be able to walk the walk -- and I cannot stress enough that it
takes a certain type, and just knowing your shit isn't going to cut it -- and
you also have to get some personal enjoyment out of chaotic environments, and
be able to talk people out of a tree sometimes. But you'll never run out of
new and interesting problems to solve, you'll be testing your mettle far more
frequently than your peers, and you'll be expanding your professional network
every week about as much as everybody else does once or twice a year when they
go to a convention. It's the closest thing the tech industry has to a
firefighter or superhero or something.

~~~
Davertron
Maybe infosec or cyber security would be similar? I'm a fullstack turned
mostly FE developer these days, but I'm very interested in security and it
seems like there is a lot of demand out there for that skill set and plenty of
opportunity. It seems like it would push a lot of the same buttons that you
mentioned (fixing broken systems, lots of autonomy, working with smart folks).
I tend to get bored with long-term projects unless there are still interesting
problems to be solved so working on lots of smaller projects sounds pretty
appealing, and there's seemingly no end to new systems/devices/websites that
are just riddled with holes. Seems like a perfect role and potentially more
lucrative than IT (although this is just pure guessing, I haven't compared
numbers).

~~~
gmanis
I have explored security world too. Lots of experiments with reversing tools
like IDAPro, ollydbg.

Recently a client asked if I can help reverse their own android app because
the developers were holding their source code hostage. It was lots of fun and
lots of struggle but overall satisfying.

Maybe something like basic freelance security work, especially basic web app
security (XSS, SQLi, CSRF, and the like) and some process security: maintaining
proper logins, rotating credentials, etc.

Thank you for your ideas.

------
amdelamar
This basically outlines my current job. Huge system with many, constantly
moving/upgrading parts and services across teams, all while utilizing dozens
of internal services, tools, and navigating corporate policies. I’m thriving
in it, but totally recognize it’s a steep learning curve and takes longer to
onboard newcomers.

Eventually you get to enjoy deprecating old services as much as building new
ones, simply because you never have to teach others about them again.

------
bibabaloo
> Steveslist currently has a very simple and slightly fragile cron setup. We
> have a single “scheduled jobs server”. We use crontab on this server ...
> This setup isn’t scaling very well ... We’re considering setting up a new
> system using a modern tool like Kubernetes.

Do people use Kubernetes for running scheduled jobs like this? It seems like
it'd be overkill but in saying that I'm not sure if I know of anything that
can be used for running scheduled tasks in a reliable and observable way
that's scalable. Maybe Jenkins?

~~~
cyberdrunk
> Do people use Kubernetes for running scheduled jobs like this?

We do, mostly because we already have Kubernetes and automation to deploy to
it, so why not? The reliability and monitoring are better than something we'd
cook up on our own.

------
pc9
How can I gain practical experience in these things? The jobs I've had mainly
revolve around adding new features, not doing any of the things described in
the article. Does working at bigger companies actually give you experience
with this?

~~~
taigi100
Yes and no; it depends. It gives such experience to a software architect,
otherwise not really. Though software architects do a lot more than just systems
design (though that is pretty much their main task). But yes, you get a job as a
software architect, or as something like an "in-training" one or a helper to one,
at a big corporation to get such experience. (I'd argue that small companies
only need some code design; if you want a lot of systems/software design, you
should go for a big corporation.)

~~~
mav3rick
You evolve into the architect role as an IC. No new grad is hired like this.
You design things and keep going up in scale.

~~~
taigi100
It happens from time to time, but yes - there is no specific job / hiring for
this.

The default/common path is evolving into the architect role which seems like a
"bad" process of developing an architect.

The best way to develop a new architect is to have him learn alongside a
mentor/teacher who is an architect himself. Developing and architecture are
very different jobs which require different mindsets and skills. Also, I've
seen many places, teams, etc where people are mostly made to implement feature
after feature with no time in between for learning, self-development, courses,
etc.

Otherwise, after you've just "arrived" at that architect role, you start
learning on your own what architecture really is and means.

~~~
mav3rick
Developing and architecture are absolutely related. Many ideas seem great on
paper till you go and implement them and find out the pain points. All the
great architects have often been in the weeds of their systems.

------
rustamm
Regarding DB backups, it would be useful to add that it is not enough to
simply make backups, one should also test that the backups are actually
restorable.
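
A minimal sketch of what "test that the backup restores" can mean in practice.
It uses SQLite only because that keeps the example self-contained; a
production database would use its own dump and restore tooling, and the
`listings` table and the `backup_and_verify` helper are invented for
illustration. The idea is to open the backup as if it were the real database
and check it, rather than trusting that the backup job exited cleanly.

```python
import os
import sqlite3
import tempfile


def backup_and_verify(source, backup_path):
    """Back up `source` to `backup_path`, then reopen the copy and check it."""
    dest = sqlite3.connect(backup_path)
    try:
        source.backup(dest)  # sqlite3's online backup API (Python 3.7+)
    finally:
        dest.close()

    # "Restore": open the backup file fresh, exactly as a recovery would.
    restored = sqlite3.connect(backup_path)
    try:
        integrity = restored.execute("PRAGMA integrity_check").fetchone()[0]
        restored_rows = restored.execute(
            "SELECT COUNT(*) FROM listings"
        ).fetchone()[0]
    finally:
        restored.close()

    source_rows = source.execute("SELECT COUNT(*) FROM listings").fetchone()[0]
    return integrity == "ok" and restored_rows == source_rows


source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE listings (id INTEGER PRIMARY KEY, title TEXT)")
source.executemany(
    "INSERT INTO listings (title) VALUES (?)",
    [("sofa",), ("bike",), ("lamp",)],
)
source.commit()

with tempfile.TemporaryDirectory() as tmp:
    restorable = backup_and_verify(source, os.path.join(tmp, "backup.db"))

print("backup restorable:", restorable)
```

A row count is a weak check on its own; a fuller version would also run a few
representative application queries against the restored copy.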

~~~
chasd00
also the restore/recover process has to be laid out step by step and be as
simple as possible. No one is doing a restore in a relaxed, easy going state
of mind. Restores are done in full panic mode with the team thinking about how
they're going to break the news to their families that they've lost their job.

------
rhlsthrm
Just recently started messing with AWS Amplify. It's crazy how easy it makes
basically all this stuff. It really makes it simple to be a solo founder and
manage every piece of the stack.

------
imvetri
Title corrected - Web application system design for advanced beginners

~~~
fouc
Yeah, it's not true "Systems design" since it's specific to programming

------
avipars
I love the phrase "Advanced Beginner"

------
zeckalpha
Don’t forget a system for organizing contributions to all this code, a system
to ensure compliance with the many legal regimes “Steveslist” is offered in, a
system for onboarding new hires, a system for expanding market share, a system
for keeping money and reporting on it to would-be investors, and a system of
systems designers that know how to respond to changing and ever expanding
requirements.

Don’t sell yourself short: there are systems everywhere you look, not just
where there’s convenient precedent to make a web service.

------
zealsham
This is the best thing I have read in 2020. As a bug bounty hunter, this is
insanely helpful to me.

------
Roybot
> To do this, we need to write database queries that aggregate over the
> entirety of our data. We don’t want to run these queries against our
> production SQL database, because they could put an enormous amount of load
> on it. We don’t want a huge query issued by an internal analyst to be able
> to bring our production database to a grinding halt

What kind of query would you have to write to bring down a production db? What
makes a solution like hive much better - I guess it's optimized for this?

~~~
sradman
> What kind of query would you have to write to bring down a production db?

Scans and Sorts, seen in a query plan, are relatively expensive to run in a
production row store. OLAP queries (GROUP BY with aggregate functions like
COUNT, SUM, and AVG) do large^/full table scans by definition. They take
seconds to run while your goal in a Cloud OLTP system is thousands of requests
per second. An automatic sort issued per query in an OLTP system is
pathological and represents a vector for a DoS attack.

> What makes a solution like hive much better - I guess it's optimized for
> this?

Column stores use compressed bitmap indexes that are optimized for scans over
a small number of columns. Hive is SQL over Hadoop, and is inherently slow but
it does offload the processing from your Production OLTP server. Hive supports
the RCFile format which is partially column oriented. The ORC file format is
fully column oriented, replaces RCFile format, but requires Presto (or
equivalent). Hive is brownfield for existing Hadoop clusters but it has no
place in a discussion about greenfield architecture other than discussing
historical systems.

If you have a need for GROUP BY style analytics, a true column store like
Presto, Impala, or RedShift is a necessity.

^EDIT: based on zbentley's comment

~~~
zbentley
> OLAP queries (GROUP BY with aggregate functions like COUNT, SUM, and AVG) do
> full table scans by definition

Isn't it only a full table scan if your query isn't otherwise filtered? Those
functions have to read every row of "something", but that something might not
always be a whole table.

~~~
sradman
Very true, I've edited my comment. Some GROUP BY queries are inexpensive and
are fine in an OLTP system if the WHERE clause restriction limits the result
set size; most, however, are meant to scan a large number of rows.
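
The difference between the two cases shows up directly in a query plan. The
sketch below uses SQLite because it ships with Python; it is a small row
store, not the production database from the article, and the `listings` table,
its columns, and the index name are assumptions made for the example. An
unfiltered GROUP BY has to read every row, while the same aggregate restricted
by an indexed column only reads the matching rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE listings ("
    "id INTEGER PRIMARY KEY, seller_id INTEGER, category TEXT, price REAL)"
)
conn.execute("CREATE INDEX idx_listings_seller ON listings (seller_id)")
conn.executemany(
    "INSERT INTO listings (seller_id, category, price) VALUES (?, ?, ?)",
    [(i % 100, f"cat{i % 7}", float(i)) for i in range(10_000)],
)


def plan(sql):
    """Return the human-readable steps SQLite reports for a query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


AGGREGATE = "SELECT category, COUNT(*), AVG(price) FROM listings"

# OLAP-style: aggregate over the whole table, so the plan is a full scan.
unfiltered = plan(AGGREGATE + " GROUP BY category")

# Same aggregate, but the WHERE clause lets the index narrow the rows read.
filtered = plan(AGGREGATE + " WHERE seller_id = 42 GROUP BY category")

print("unfiltered:", unfiltered)
print("filtered:  ", filtered)
```

The exact wording of the plan steps varies between SQLite versions, but the
unfiltered query reports a SCAN of the table and the filtered one reports a
SEARCH using the index.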

~~~
Roybot
Thank you - this was insightful.

------
iblaine
Systems Design is a formidable topic because it can go in so many directions.
This is a good guide. Might I also recommend something that includes caching,
partitioning, indexing, and NoSQL vs SQL. More detail can be found here:
[https://www.educative.io/courses/grokking-the-system-design-...](https://www.educative.io/courses/grokking-the-system-design-interview)

~~~
humanlion87
I remember using the course you have linked to some time ago - it was very
useful. I noticed now that the course is sold on a subscription model. I don't
remember that being the case. Would you happen to know when that changed?

~~~
iblaine
I don't. I came across this in 2018 and recall the course being for sale for
$80.

------
withinboredom
> We’re considering either splitting up our cron jobs into multiple servers,
> or setting up a new system using a modern tool like Kubernetes.

:facepalm: beanstalkd can handle a pretty massive amount of jobs before you
need to start worrying about scale. I love the huge jump from simple to
complex.

~~~
zbentley
To be fair they offered a simple option (splitting crontabs) as well.

------
brianzelip
Really like the prose/Adventure-like writing style of this "tutorial"!

------
tinalumfoil
> database engines that are quick at small queries are typically unacceptably
> slow of answering giant queries

Even with proper indexing? I haven't seen this issue with Postgres but maybe I
wasn't working on large enough data sets.

------
boredatworkme
off topic: I like the simplistic design of this website. Can someone please
help me understand if this is hosted on WordPress or something like that?

~~~
tonyedgecombe
It might be Jekyll according to
[https://robertheaton.com/2014/07/26/lessons-from-a-surprisin...](https://robertheaton.com/2014/07/26/lessons-from-a-surprisingly-successful-blog/)

------
rataata_jr
This is amazing. Thank you Robert.

------
giggl
Great post! Thanks for sharing.

------
pantulis
Great post!

------
forgotmypwbctbi
anyone else just getting a blank page here?

------
banq
no DDD?

