
Exponential backoff is a good idea, but to make it even better you'll want to add some random jitter to the delay period. Especially in the case of deadlocks, this helps avoid another deadlock when two clients would otherwise have retried at exactly the same time.
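A minimal sketch of the idea, assuming a Python client; the function and parameter names (`op`, `base`, `factor`, `attempts`) are illustrative, not from any particular library:

```python
import random
import time

def retry_with_jitter(op, base=0.05, factor=2, attempts=5):
    """Retry op(), doubling the nominal delay after each failure and
    sleeping a random fraction of it (jitter) so two clients that
    failed together don't retry together."""
    delay = base
    for attempt in range(attempts):
        try:
            return op()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; let the caller see the error
            # Jitter: sleep anywhere in [0, delay) rather than exactly delay.
            time.sleep(random.uniform(0, delay))
            delay *= factor
```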

Absolutely. I made some graphs of different kinds of jitter approaches at http://www.awsarchitectureblog.com/2015/03/backoff.html. Simply backing off with delay does very little to spread out spikes of work, and can lead to longer MTTRs if the cause of the issue was overload.

Would it be possible for the server to simply return a queue position to the contending clients? For instance, "There are 37 clients ahead of you. Try again in 37ms." (assuming each write is averaging 1ms)

I suspect that if the client and server worked together in this fashion, you could get much closer to the linear completion time seen with no backoff, and also closer to linear total work (since each client would try exactly twice in an ideal world where the server accurately predicted when it should retry).
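The server-side half of that idea could look something like this sketch; `queue_len` and `avg_write_ms` are hypothetical names for the "37 clients ahead of you, 1ms per write" figures above, and a touch of jitter is kept in case the estimate is off:

```python
import random

def server_retry_hint_ms(queue_len, avg_write_ms=1.0):
    """Hypothetical server-side hint: tell a contending client roughly
    how long until its turn, based on queue depth and average write time."""
    estimate_ms = queue_len * avg_write_ms
    # Small jitter so clients given the same estimate don't collide again.
    return estimate_ms + random.uniform(0, avg_write_ms)
```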

Sorry. I accidentally downvoted your comment when trying to upvote it on my phone! The article was very interesting.

I think most exponential backoff schemes I've looked at treat the 'backoff delay' as a range and do randrange(0, current_backoff) or similar, which does what you say.
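That scheme (sometimes called "full jitter") can be sketched like so, using `random.uniform` as the float analogue of `randrange`; the names here are illustrative:

```python
import random

def full_jitter_delay(attempt, base=1.0):
    """Treat the nominal exponential backoff as an upper bound and draw
    the actual delay uniformly from [0, current_backoff]."""
    current_backoff = base * 2 ** attempt
    return random.uniform(0, current_backoff)
```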

Truncated exponential backoff (where you cap the upper limit of the retry delay at some maximum) is also often a good idea, to prevent a short service outage from spiking the retry timers to huge values.
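Truncation is just a `min` against a ceiling; a one-liner sketch, with `cap` as an illustrative value:

```python
def truncated_backoff(attempt, base=1.0, cap=30.0):
    """Cap the exponential growth so a brief outage doesn't push retry
    timers out to minutes or hours."""
    return min(cap, base * 2 ** attempt)
```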

Doesn't exponential back-off mean that you select the time until retry uniformly at random from the interval [0, ..., 2^n-1] in the n-th round of failure, or something along those lines?
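That is the classic binary exponential backoff from CSMA/CD-style protocols: after the n-th collision, wait a random integer number of slot times drawn from [0, 2^n − 1]. A sketch:

```python
import random

def binary_exponential_slots(n):
    """After the n-th failure, pick a uniform random integer number of
    slot times in [0, 2**n - 1]."""
    return random.randint(0, 2 ** n - 1)
```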

