An introduction to distributed systems (2017) (github.com)
318 points by yarapavan 46 days ago | 9 comments

This is a great outline. Martin Kleppmann's Designing Data-Intensive Applications [0] covers much the same ground as these notes.

Speaking of scale and distributed systems, here are some AWS-specific resources I find interesting from their EdgeEngineering/NetEng teams (I haven't seen many other AWS service teams share as much about their designs as openly):

- https://aws.amazon.com/blogs/architecture/category/networkin... series of articles on Route53's 100% data-plane availability architecture [1].

- https://www.youtube.com/watch?v=O8xLxNje30M colmmacc [2] (seems to be the eng behind AWS HyperPlane [3][4]?) on 10 design patterns for building resilient systems.

- https://www.youtube.com/watch?v=swQbA4zub20 Peter Vosshall (co-creator of Dynamo [5]) presenting the "cell-based" design in use at AWS.


[0] https://dataintensive.net/

[1] https://www.slideshare.net/AmazonWebServices/under-the-hood-...

[2] https://news.ycombinator.com/user?id=colmmacc

[3] https://atscaleconference.com/videos/networking-scale-2018-l...

[4] https://www.youtube.com/watch?v=dfEcd3zqPOA&feature=youtu.be...

[5] DeCandia, G.; Hastorun, D.; Jampani, M.; Kakulapati, G.; Lakshman, A.; Pilchin, A.; Sivasubramanian, S.; Vosshall, P.; Vogels, W. (2007). "Dynamo: Amazon's Highly Available Key-value Store".

[6] Bonus: The SRE Book https://landing.google.com/sre/sre-book/toc/index.html

I haven't checked these links in a few years, but here are some more distributed systems resources; in this case, a list of reading lists.


Thank You

I've lost track of Kyle's work since my last job crumbled, but for anyone unfamiliar with him, it's a pretty safe bet that his material is among the best available on this topic.

His tool Jepsen is the gold standard for testing the consistency guarantees of distributed databases.

The materials for his Jepsen talks are here: https://github.com/aphyr/jepsen-talks

We (backend engineering at Remind) went through this repo over the course of 6 weeks, taking turns leading the discussion. It went really well. We had a variety of participants, from seasoned backend engineers to boot camp grads, and everyone got a lot out of it.

The notion of what constitutes a ‘distributed system’ has become vague. The original meaning implied a high degree of transparency, to the point that if you opened a new tab in Chrome (you know, a web browser), it might run on a different node. The idea was for the cluster to appear to the user as a single computer. I think Plan9/Inferno was the last example of such a system.

The Erlang VM presents such a unified view to programs that run on it. The asynchronous messaging built into the language works the same between servers as it does locally.

You can determine whether the process to which you need to communicate is local or remote if desired, but it's unusual to do so.
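For anyone who hasn't used Erlang, here is a rough Python sketch of the idea, not how the Erlang VM actually implements it. Everything in it (`Pid`, `Node`, the `cluster` dict standing in for the network, pickle as the wire format) is made up for illustration. The point is only that `send` has the same signature whether the target mailbox is on this node or another one, and that `is_local` is available if you really want to know, roughly what `node(Pid)` tells you in Erlang.

```python
import pickle
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class Pid:
    """Hypothetical process id: the owning node's name plus a mailbox id."""
    node: str
    ident: int


class Node:
    """Toy stand-in for one VM. The shared `cluster` dict plays the network."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster
        self.mailboxes = {}
        cluster[name] = self

    def spawn(self):
        # Create a new mailbox on this node and return its address.
        pid = Pid(self.name, len(self.mailboxes))
        self.mailboxes[pid.ident] = queue.Queue()
        return pid

    def is_local(self, pid):
        # Callers can ask where a process lives, but rarely need to.
        return pid.node == self.name

    def send(self, pid, msg):
        # Same call for local and remote targets; routing is hidden here.
        if self.is_local(pid):
            self.mailboxes[pid.ident].put(msg)
        else:
            # Simulated network hop: serialize, then hand to the owning node.
            wire = pickle.dumps(msg)
            self.cluster[pid.node]._deliver(pid, wire)

    def _deliver(self, pid, wire):
        self.mailboxes[pid.ident].put(pickle.loads(wire))

    def receive(self, pid):
        # Non-blocking read; raises queue.Empty if nothing is waiting.
        return self.mailboxes[pid.ident].get_nowait()


cluster = {}
a = Node("a", cluster)
b = Node("b", cluster)
local_pid = a.spawn()
remote_pid = b.spawn()

# The sender uses the identical call in both cases.
a.send(local_pid, ("hello", 1))
a.send(remote_pid, ("hello", 2))
```

A real implementation would also have to deal with the failure modes that only the remote path has (lost messages, dead nodes), which is exactly where the transparency starts to leak.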

> Lots of people use a queue with six disk writes and fifteen network hops where a single socket write() could have sufficed

This issue is so widespread in the industry... it makes me sad.
