Ask HN: How to learn about distributed systems when not possible at work?
35 points by askertoday on Aug 19, 2016 | 6 comments

I work for a startup, and have worked for many before. All of them had monolithic, mid-sized code bases that do just fine on a conventional stack (2 or 3 big servers and simple rerouting).

My question is: How do people learn about distributed systems if never given the chance in a professional setting? I am sure there is a book or two to go over, but would that give me the same skill-set that would allow me to convince a company asking for experience with distributed systems to hire me?

It seems I'm a few weeks ahead of you on this same problem. I noticed a discussion in an HN thread and reached out to the person in it who seemed the most knowledgeable. Joke's on me: that guy's the CTO of HashiCorp.

Anyway, I just asked him how I should go about learning, and he linked me to (presumably) his alma mater's course list for distributed systems: https://courses.cs.washington.edu/courses/cse452/16wi/calend...

Moreover, he linked me some fundamentals:

* Lamport Time
* CAP Theorem
* FLP Impossibility Theorem
* Bimodal Multicast
* Paxos
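To make the first item on that list a bit more concrete, here is a minimal Python sketch of a Lamport logical clock. The class and method names are my own, made up for illustration; they are not taken from the linked course or from any library.

```python
class LamportClock:
    """Logical clock: a counter that orders events without wall-clock time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the counter by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending counts as an event; the returned timestamp travels
        # with the message.
        return self.tick()

    def receive(self, msg_time):
        # Jump past whatever the sender had seen, then count the
        # receive itself as an event.
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                     # a is at 1
stamp = a.send()             # a is at 2, message carries 2
b.tick()                     # b is at 1
received = b.receive(stamp)  # b moves to max(1, 2) + 1 = 3
print(stamp, received)
```

The point is that the receive always gets a larger timestamp than the send that caused it, even though the two processes never compare real clocks.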

What I would advise doing with those fundamentals is going to Google Scholar, finding the original papers, and adding them to your watch list. Then you can see wherever they're referenced, go through the list, pick out papers, and read one a day (that's what I'd advise, though it's not feasible for everyone. Just keep reading!)

I can't say I'm a guru at all in this sphere, but this is how I'm approaching it. Hopefully you can take some of my advice and tailor it to suit your needs and how you learn best. Beyond that, I'd say build, build, build: start out with your naive solution and improve it over time. I've been working on a distributed key-value store (https://github.com/GrappigPanda/Olivia), and once you get deeper into a problem like this, a lot of the problems really make themselves apparent and you can learn exactly where your naive solution differs from more optimal solutions.

Interesting approach. I'd like to know some of the specific lessons learnt using the naive approach and things to look out for.

I got started by doing the online MIT labs linked here:


You write a MapReduce/Paxos implementation in Golang, and then their test suite tests your implementation. There are many tricky edge cases, so having an automated test suite is seriously helpful.
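For anyone who hasn't seen the MapReduce model before, here is a toy, single-process Python sketch of the idea (word count). This is not the lab code, which is in Go and runs across multiple workers with failures to handle; the function names here are hypothetical and only show the map, group-by-key, and reduce steps.

```python
from collections import defaultdict


def map_fn(document):
    # Map step: emit a (word, 1) pair for every word in one document.
    for word in document.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    # Reduce step: combine all values emitted for one key.
    return sum(counts)


def map_reduce(documents):
    # Shuffle step: group every emitted value under its key.
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    # Run the reducer once per key.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


result = map_reduce(["the quick fox", "the lazy dog", "the fox"])
print(result)
```

The distributed version is hard precisely because the map and reduce calls run on different machines that can crash or stall, which is what the test suite pokes at.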

Looks like since last year they've switched from Paxos to Raft (pretty great idea, the Paxos papers are terrible). Here's the latest URL:


I would guess that any company doing distsys work would be impressed by you having a Raft or Paxos implementation on GitHub, enough to get an interview etc.

I would suggest looking at Rancher http://rancher.com/ - It's a great way to learn distributed container orchestration.

I was in the same boat; I started getting interested in distributed systems about 5 years ago but I never had the opportunity to play with those cool tools while at work so in my own time I built an open source project with a focus on scalability: http://socketcluster.io/

More recently, I've been implementing open source stacks to run and auto-scale on Docker + Kubernetes (using Rancher) and it's been pretty mind-blowing. I highly recommend playing around with those technologies. I don't have any doubt that this is where the whole software industry is heading.

I've already noticed some Kubernetes-related job postings coming up on various online job portals, and they tend to pay REALLY well, so it looks like a good area to work towards.

I also think it's important to read up online on the CAP theorem and on popular algorithms and patterns for building distributed systems, like Pub/Sub, Message Queues, Raft (consensus), data sharding (e.g. consistent hashing) and others... I think if you start with those, you will stumble upon new material as you go along and you can gradually build up your knowledge.
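As one example from that list, here is a rough Python sketch of consistent hashing for sharding keys across nodes. The `HashRing` class, the node names, and the choice of SHA-256 with 100 virtual points per node are all assumptions for illustration, not how any particular system does it.

```python
import bisect
import hashlib


def _hash(key):
    # Map any string to a large integer position on the ring.
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node, to even out load
        self._ring = []           # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Place several virtual points for this node around the ring.
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        # Drop every virtual point that belongs to this node.
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def lookup(self, key):
        # The owner is the first point at or after the key's position,
        # wrapping around to the start of the ring.
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        if idx == len(self._ring):
            idx = 0
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.remove("node-b")
after = {k: ring.lookup(k) for k in keys}

# Keys that changed owner after the removal.
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved")
```

The property worth noticing is that removing a node only moves the keys that node owned; with plain `hash(key) % num_nodes`, most keys would move.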

I think playing around with the advanced networking features offered by IaaS platforms like Amazon EC2 is also a good way to put your theories (and code) to the test.

This starts in two weeks. I'm taking it too!


Here's a complete tutorial, from concepts to step-by-step command-line scripting. E.g., you provision EC2 instances, deploy a Hadoop cluster, and build a simple data processing pipeline on the cluster.

This is intended primarily for data scientists; two mid-level devs in my shop worked through it and said it was solid.

