
Ask HN: Recommended journey on becoming a distributed systems practitioner? - fizwhiz
I've been programming professionally for ~3 years now and I started out knowing absolutely nothing. It's been a fun journey, and I've learned a great deal about straightforward application development (NodeJS/Spring). Lately though, I've been really fascinated by distributed systems architectures, why they're built in a particular way and what they're hoping to achieve. I find myself hopping hopelessly between blog posts by notable practitioners[1][2][3], early academic literature[4] and conference videos[5].

Let me be honest: I'm drowning in a deluge of information, but I'm convinced that these are the kinds of interesting problems I want to work on moving forward. I'm caught between trying too hard to focus on the theory/literature (which is a time-consuming endeavor) vs duct-taping systems whose tradeoffs I may not fully understand (which feels haphazard). Is there a middle ground? Is there a sane way for an app-developer-by-day like myself to work out a solid understanding of these concepts so that I can build these systems with a degree of confidence? A colleague recently suggested that building my own simple distributed key-value store would force me to learn a lot of things in the process (consistent hashing, leader election, MQs, Merkle trees, vector clocks). Does HN have any such recommendations and motivating projects to kick things off? It could be advice to build something in particular, or some repo that I should study intently to glean insights from.

[1] https://aphyr.com/posts/291-call-me-maybe-zookeeper
[2] https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
[3] http://book.mixu.net/distsys/single-page.html
[4] http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html
[5] https://www.youtube.com/watch?v=BKqgGpAOv1w
======
nostrademons
Get a job at a big tech company (Google, Facebook, Microsoft, Yahoo, Dropbox,
Uber, etc.). Beyond a certain scale, every problem is distributed, and you'll
be dealing with distributed systems and their pitfalls as a routine part of
your job.

It's very difficult to tell what is of practical significance vs. what is just
theoretically interesting without directly working with a real distributed
system storing real data for real users. For example, master election is
something you rarely face in the day-to-day: you build a system once (or just
use ZooKeeper) and then re-use it for all your systems. Merkle trees and
vector clocks are fairly special-purpose - once in a while you'll find a
system where they're critical, but a lot of the time you don't need them.
Versioning problems and choices of ID schemas, however, come up _all the
time_. Bloom filters are far more useful than the amount of theoretical or
blogosphere attention given to them would indicate, sharding functions are
important to know, and concurrency & buffering are critical skills.
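
To make two of those concrete, here is a minimal sketch of a Bloom filter and a hash-mod sharding function. The names (`BloomFilter`, `might_contain`, `shard_for`) and the sizes are assumptions chosen for illustration, not taken from any particular production system, and a real deployment would size the bit array from the expected item count and target false-positive rate.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: never a false negative, sometimes a false positive."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions by salting a single hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (hypothetical helper).

    Python's built-in hash() is randomized per process for strings,
    so a cryptographic digest is used here to keep the mapping stable.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

A typical use is to check `might_contain` before an expensive disk or network lookup and skip the lookup when it returns False. Note that plain hash-mod sharding remaps most keys when `num_shards` changes, which is the problem consistent hashing is meant to reduce.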

~~~
fizwhiz
My personal experience at a large tech company is that there are usually
designated teams thinking about these infrastructural problems so that the
rest of the engineers don't have to worry about them. So as someone managing a
"service" in a company that has a SOA model, sure, I'm technically a part of a
distributed system but the deliberate abstractions minimize the "necessary"
exposure to dealing with distributed systems.

I'm in complete agreement with your thoughts on stuff with practical
significance vs theoretically interesting topics (I'm far more interested in
building some expertise around practically significant things _first_.) Are
there any particular projects you'd recommend I follow to grow my knowledge in
this space? Alternatively, would you have any recommendations on specific
side-projects that could really motivate a good understanding of these
concepts?

