
There's Just No Getting Around It: You're Building a Distributed System (2013) - federicoponzi
https://queue.acm.org/detail.cfm?id=2482856
======
apeace
> There are many reasons why an organization would need to build a distributed
> system, but here are two examples:

> - The demands of a consumer Web site/API or multitenant enterprise
> application simply exceed the computing capacity of any one machine.

> - An enterprise moves an existing application, such as a three-tier system,
> onto a cloud service provider in order to save on hardware/data-center
> costs.

When you've exhausted the capacity of a single machine, typically you don't
jump straight to a distributed system. You can scale out horizontally with a
stateless application layer, as long as the data storage on the backend can
handle all the load. You can also scale database reads horizontally, using
read replicas.

This horizontal scale-out is not a distributed system, since consensus
("source of truth") still lives in one machine.

So I think a better phrasing of #1 would be "When your write patterns exceed
the computing capacity of any one machine".
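
To make the read/write split above concrete, here is a minimal sketch. It assumes one primary and two read replicas; the `Node` and `Router` names are hypothetical and everything is simulated in memory. Replication is modeled as an immediate copy, which real replicas do not guarantee.

```python
import itertools


class Node:
    """Stand-in for a database server (hypothetical, in-memory)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Router:
    """Sends every write to the single primary and spreads reads over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas
        self._next_replica = itertools.cycle(replicas)

    def write(self, key, value):
        # The primary stays the one source of truth for writes.
        self.primary.data[key] = value
        # Assumption: replication is simulated as an instant copy; real replicas lag.
        for replica in self.replicas:
            replica.data[key] = value

    def read(self, key):
        # Round-robin, so read capacity grows with the number of replicas.
        replica = next(self._next_replica)
        return replica.name, replica.data.get(key)


router = Router(Node("primary"), [Node("replica-1"), Node("replica-2")])
router.write("user:1", "alice")
first = router.read("user:1")
second = router.read("user:1")
```

Adding replicas here raises read capacity only. Every write still lands on the one primary, which is why write volume is the limit that matters.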

~~~
flukus
There are also old-fashioned performance improvements. Most enterprise software
has plenty of room for improvement there. I've noticed a correlation between
people that push for distributed systems and ones that don't know how things
like indexes and transactions work or the need to avoid n+1 queries.

I'm sure some apps need to be distributed systems, but I bet it's a tiny
minority.
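
For readers who have not met the n+1 pattern mentioned above, here is a small sketch using Python's built-in `sqlite3` module. The `authors`/`posts` schema and the function names are made up for illustration. The first function issues one query per author; the second gets the same result with a single join over an indexed column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    CREATE INDEX idx_posts_author ON posts(author_id);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
""")


def titles_n_plus_one(conn):
    """One query for the authors, then one more query per author."""
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    queries = 1
    result = {}
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_single_join(conn):
    """Same data in one round trip (authors without posts would be omitted)."""
    rows = conn.execute(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    ).fetchall()
    result = {}
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result, 1


slow, slow_queries = titles_n_plus_one(conn)
fast, fast_queries = titles_single_join(conn)
```

With two authors the difference is three queries versus one. With thousands of rows and a network hop per query, the gap is usually large enough to look like a capacity problem.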

~~~
apeace
Yes, that's exactly my point. This article makes it sound like you need a
distributed system at almost any level of scale. Yet the reality is, you only
need one if you are doing tons of writes. Most systems are read-heavy, which
is why most people scale reads horizontally instead of using a distributed
system.

~~~
mazerackham
More or less agree with your point, just wanted to point out that scaling
reads horizontally usually means adding a distributed quality to your system,
and not understanding how that impacts the system can have bad side effects.

~~~
flukus
Where do you draw the line of where a horizontally scalable system begins?
Scaling reads might only involve a caching server, but most of us probably
wouldn't consider an app server with a cache to be distributed.

------
marknadal
Regarding testing distributed systems. Chaos Monkey, like they mention, is
awesome, and I also highly recommend getting Kyle to run Jepsen tests. But we
still need more tools on this front, so we built
[https://github.com/gundb/panic-server](https://github.com/gundb/panic-server)
which integrates with Mocha (and other unit test frameworks) to make it easy
to run failure scenarios across real and virtual machines. It has been a life
saver for me.

~~~
saryant
I've been keen to investigate Peter Alvaro's work around fault _injection_.
He's been working on that with Netflix[1].

The tl;dr is that you analyze _successful_ system outcomes and inject failures
along those paths to see if they subsequently fail. If they do, you've found a
bug. It's the next generation of Chaos Monkey.
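
A toy sketch of that idea follows. It is not the tooling described in the linked post. The `Harness` class and the cache/database request are hypothetical, and the "path" is just the list of named calls seen during one successful run.

```python
class InjectedFault(Exception):
    """Raised by the harness in place of a real call."""


class Harness:
    """Records which named calls a request makes and can fail one of them."""

    def __init__(self, fail_on=None):
        self.trace = []
        self.fail_on = fail_on

    def call(self, name, fn):
        self.trace.append(name)
        if name == self.fail_on:
            raise InjectedFault(name)
        return fn()


def handle_request(h):
    """Hypothetical request: the cache has a fallback, the database does not."""
    try:
        value = h.call("cache", lambda: None)  # simulated cache miss
    except InjectedFault:
        value = None  # a cache failure is treated like a miss
    if value is None:
        value = h.call("database", lambda: "row")
    return value


def find_unhandled_faults(request):
    # Step 1: run once with no faults and record the successful path.
    baseline = Harness()
    expected = request(baseline)
    bugs = []
    # Step 2: replay the request, failing one call on that path at a time.
    for name in baseline.trace:
        try:
            ok = request(Harness(fail_on=name)) == expected
        except InjectedFault:
            ok = False
        if not ok:
            bugs.append(name)
    return bugs


bugs = find_unhandled_faults(handle_request)
```

Here the cache fault is absorbed by the fallback, while the database fault surfaces, so only `database` is reported.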

[1] [http://techblog.netflix.com/2016/01/automated-failure-testing.html](http://techblog.netflix.com/2016/01/automated-failure-testing.html)

------
programminggeek
Any time you have something talking to a database, cache, or pretty much
anything else you have a distributed system, you just don't know it yet.

------
itsmemattchung
> Geographies. Will this system be global, or will it run in "silos" per
> region?

Although I've only worked at Amazon for about a year, I've learned that you
should always consider building siloed/regionalized applications—if not,
expect major headaches when the service needs to be deployed in multiple
environments.

------
spcelzrd
I'm going to join ACM just to see what happens. Who even is in this thing?

~~~
davidcuddeback
I joined ACM several years ago. It took filing chargebacks with my credit
card company to cancel my membership. Even after canceling, I still get emails
from them that aren't CAN-SPAM compliant and that I've found very difficult to
unsubscribe from.

If you're interested in joining a professional society for computer
scientists, software engineers, and electrical engineers that doesn't resort
to dark patterns, I'd recommend checking out IEEE. I don't get spam from them
and they respected my decision to cancel my membership. I generally find their
digital library to be of higher quality as well.

My two cents. YMMV.

~~~
spcelzrd
Thanks for the warning. I was serious about joining, so I'm still going to
give it a try. I'll use a disposable credit card.

~~~
tostitos1979
So... I'm a professional scientist. ACM isn't a fly-by-night operation. It is
very similar to IEEE in terms of stature. In CS, I consider ACM conferences
slightly higher tier, but this is not a hard and fast rule.

Yes, they do send a lot of material I don't care for. But if you are a
student, it is definitely cheap and worthwhile to join. As a professional, you
have to make your own call. I pay for it because I feel the money goes to
conferences, student subsidies, etc. (not really charity, but I feel like I am
paying it forward). You get discounts at conferences, but my employer pays for
those anyway.

~~~
smarks
Also agree. Plus, if you sign up for the ACM Digital Library, you get online
access to a _tremendous_ library of papers, publications, conference
proceedings, etc.

------
peterwwillis
This is a verbose paper written by a tech bro that over-generalizes "How To
Build A Scaling High Availability Web App", but fails to explain how such
systems are designed in general. If you're a tech startup and have never
worked in this industry before, the very last paragraph is useful.

My personal ideal system design is one that you can pick up and drop into a
single machine, a LAN, a cloud network, geographically dispersed colocated
datacenters, etc., without relying on third-party service providers. If you go
from a start-up to a billion dollar company, you will eventually have offices
with their own labs, dev, qa, middleware and ops teams, datacenters and
production facilities, and your hardware and software service providers will
run the gamut. If you can abstract the individual components of your system so
that dependencies can be replaced live without any changes to any other part
of the system, you have the start of a decent design.
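
One way to read "dependencies can be replaced live" is sketched below, using a structural interface from Python's `typing.Protocol`. All names here (`KeyValueStore`, `InMemoryStore`, `AuditedStore`, `Service`) are hypothetical. The two stores stand in for whatever real backends a deployment might use.

```python
from typing import Dict, List, Optional, Protocol


class KeyValueStore(Protocol):
    """The only thing the service is allowed to know about its storage."""

    def get(self, key: str) -> Optional[str]: ...

    def put(self, key: str, value: str) -> None: ...


class InMemoryStore:
    """Backend for a single machine or a test run."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


class AuditedStore(InMemoryStore):
    """Stand-in for a different provider; it also logs every write."""

    def __init__(self) -> None:
        super().__init__()
        self.log: List[str] = []

    def put(self, key: str, value: str) -> None:
        self.log.append(key)
        super().put(key, value)


class Service:
    def __init__(self, store: KeyValueStore) -> None:
        self._store = store

    def swap_store(self, store: KeyValueStore) -> None:
        # Replaces the dependency at runtime; data migration is out of scope here.
        self._store = store

    def remember(self, user: str, value: str) -> None:
        self._store.put(f"user:{user}", value)

    def recall(self, user: str) -> Optional[str]:
        return self._store.get(f"user:{user}")


service = Service(InMemoryStore())
service.remember("ann", "v1")
before_swap = service.recall("ann")

audited = AuditedStore()
service.swap_store(audited)
service.remember("ann", "v2")
after_swap = service.recall("ann")
```

`Service` never names a concrete backend, so swapping one needs no change to the service code. Moving existing data between backends is a separate problem that this sketch leaves out.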

However, nobody I've ever worked for designed their system this way initially,
and they made millions to billions of dollars, so there certainly is no
requirement that you have a perfect distributed system design for your emoji
app start-up.

~~~
d3ad1ysp0rk
"Mark Cavage is a software engineer at Joyent, where he works primarily on
distributed-systems software and maintains several open source toolkits, such
as ldapjs and restify. He was previously a senior software engineer with
Amazon Web Services, where he led the Identity and Access Management team."

What makes him a tech "bro", exactly?

