
The first concrete thing I learnt is this - implement pull first, it works 100% of the time, but may be inefficient with regards to time. Then implement push, it works 99% of the time but is much faster. But always have both running.

I'm totally in agreement with this. Both processes should also be idempotent (you should be able to pull multiple times without side effects, and push and pull should be able to happen at the same time without side effects). When everything is working well, the 'push' does all the work, and though the 'pull' runs every few minutes/seconds/whatever it never has anything to do.
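To make that concrete, here is a minimal sketch of one way to get there. It assumes every record carries a monotonically increasing version number, and the names (`Replica`, `apply`, `on_push`, `pull`) are made up for illustration. Both paths funnel into the same idempotent upsert, so a repeated or late update is simply a no-op.

```python
class Replica:
    """Local copy kept in sync by a push handler plus a periodic pull."""

    def __init__(self):
        self.items = {}  # key -> (version, value)

    def apply(self, key, version, value):
        """Idempotent upsert: the same or an older version changes nothing."""
        current = self.items.get(key)
        if current is not None and current[0] >= version:
            return False  # already have this version or a newer one
        self.items[key] = (version, value)
        return True

    def on_push(self, key, version, value):
        """Fast path: the source notified us about a single change."""
        return self.apply(key, version, value)

    def pull(self, source):
        """Slow path: scan the whole source and repair whatever push missed."""
        changed = 0
        for key, (version, value) in source.items():
            if self.apply(key, version, value):
                changed += 1
        return changed


# Hypothetical source of truth: key -> (version, value)
source = {"a": (1, "x"), "b": (2, "y")}

replica = Replica()
replica.on_push("a", 1, "x")    # this push arrived
# the push for "b" got lost, so the next pull has to pick it up
first = replica.pull(source)    # repairs "b"
second = replica.pull(source)   # nothing left to do
```

Because `apply` ignores anything it has already seen, it doesn't matter whether a push lands before, during, or after a pull.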

This same thing applies to time-based events: your system should not assume that the process is always running, so if something needs to happen at exactly 9:00am (and it's not okay to just skip it if missed), it should be able to run anytime later with the same outcome as if it ran at 9:00am.
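A rough sketch of that idea, assuming a daily 9:00am job and a `last_done` date that would live in durable storage in a real system (the `state` dict and the function names are hypothetical). The job is handed the scheduled time rather than the wall-clock time, so a late run produces the same result as an on-time one.

```python
from datetime import date, datetime, time, timedelta


def due_runs(last_done, now, due=time(9, 0)):
    """Scheduled datetimes after `last_done` (a date) that are due by `now`."""
    runs = []
    day = last_done + timedelta(days=1)
    while datetime.combine(day, due) <= now:
        runs.append(datetime.combine(day, due))
        day += timedelta(days=1)
    return runs


def catch_up(state, now, job):
    """Call on every wake-up; runs whatever was missed, oldest first."""
    for scheduled in due_runs(state["last_done"], now):
        job(scheduled)  # the job sees the scheduled time, not `now`
        state["last_done"] = scheduled.date()  # record only after the job ran


log = []
state = {"last_done": date(2024, 1, 1)}  # stand-in for persisted state

catch_up(state, datetime(2024, 1, 2, 8, 59), log.append)   # too early, no run
catch_up(state, datetime(2024, 1, 2, 11, 30), log.append)  # late, still runs for 9:00
catch_up(state, datetime(2024, 1, 2, 11, 31), log.append)  # already done, no run
catch_up(state, datetime(2024, 1, 4, 9, 0), log.append)    # down for a day, runs Jan 3 and Jan 4
```

If the process dies after the job runs but before `last_done` is recorded, the job will run again for the same scheduled time, so the job itself still needs to be idempotent.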

When I was first getting into this, it helped me to understand that event delivery must follow one of three semantics: at-most-once, at-least-once, or exactly-once, with the catch that exactly-once is not strictly possible [1].
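In practice the usual workaround is at-least-once delivery plus an idempotent consumer. A minimal sketch, assuming each message carries a unique ID (the `Consumer` class and the simulated transport below are invented for illustration, not any real broker's API):

```python
class Consumer:
    """Deduplicates by message ID so redeliveries have no extra effect."""

    def __init__(self):
        self.seen = set()   # IDs already processed; durable storage in practice
        self.balance = 0

    def handle(self, msg_id, amount):
        if msg_id in self.seen:
            return False    # duplicate delivery, ignore it
        self.balance += amount
        self.seen.add(msg_id)
        return True


def deliver_at_least_once(consumer, messages):
    """Simulated transport that redelivers every other message."""
    for i, (msg_id, amount) in enumerate(messages):
        consumer.handle(msg_id, amount)
        if i % 2 == 0:
            consumer.handle(msg_id, amount)  # simulated redelivery


consumer = Consumer()
deliver_at_least_once(consumer, [("m1", 10), ("m2", 5), ("m3", 7)])
total = consumer.balance  # each message counted once despite the duplicates
```

The delivery is still at-least-once; only the effect looks exactly-once, and only as long as the set of seen IDs is updated atomically with the state change.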

    There are only two hard problems in distributed systems: 
      2. Exactly-once delivery 
      1. Guaranteed order of messages
      2. Exactly-once delivery 

    -Mathias Verraes [2]

[1] http://bravenewgeek.com/you-cannot-have-exactly-once-deliver...

[2] https://twitter.com/mathiasverraes/status/632260618599403520
