
Latency numbers every programmer should know (2012) - okket
https://gist.github.com/hellerbarde/2843375
======
eatbitseveryday
Instead of memorizing tables, it would be more valuable to programmers to read
how these numbers come to be in the first place. An excellent, and much longer
read, is by Ulrich Drepper[1].

[1] [http://lwn.net/Articles/250967/](http://lwn.net/Articles/250967/)

~~~
tayo42
I came across this article the other day for the first time and started to
read it. How much still applies to today?

~~~
Const-me
Part 1: 50% applies; RAM controllers are now part of the CPU, which deprecates
half of what's written, and some CPUs now have eDRAM. Part 2: almost
completely applies. Part 3: 70% applies; hardware SLAT (Intel EPT / AMD RVI)
deprecated what's written about virtualization. Part 5: 90% applies. The rest
of them, I don't know.

P.S. What I dislike most about the article is that it fails to explain why L3
cache is 10-20 times slower than L1 cache, while they're both made from SRAM.

~~~
nateberkopec
> it fails to explain why L3 cache is 10-20 times slower than L1 cache, while
> they're both made from SRAM.

Why is it? Is it because L3 is usually shared?

~~~
Const-me
Even when the cache line is unshared, L3 is still 10 times slower than L1.

The best explanation I saw is this:
[https://fgiesen.wordpress.com/2016/08/07/why-do-cpus-have-multiple-cache-levels/](https://fgiesen.wordpress.com/2016/08/07/why-do-cpus-have-multiple-cache-levels/)
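A rough way to see the gap yourself is pointer chasing through a randomly
permuted array, so every load depends on the previous one and the prefetcher
can't help. This is my own sketch, not from the linked post, and CPython's
interpreter overhead masks much of the ratio (a C version shows it far more
clearly):

```python
import random
import time

def chase(size, steps=200_000):
    """Time dependent loads along a random cyclic permutation.

    Every step is `i = nxt[i]`, so the CPU cannot issue the next
    load until the current one completes."""
    perm = list(range(size))
    random.shuffle(perm)
    nxt = [0] * size
    for k in range(size):
        nxt[perm[k]] = perm[(k + 1) % size]  # link into one big cycle
    i = 0
    t0 = time.perf_counter()
    for _ in range(steps):
        i = nxt[i]
    return (time.perf_counter() - t0) / steps  # seconds per access

small = chase(1 << 10)   # working set fits comfortably in cache
large = chase(1 << 22)   # tens of MB; loads mostly miss the caches
print(f"small: {small * 1e9:.0f} ns/access, large: {large * 1e9:.0f} ns/access")
```

On typical hardware the large working set costs noticeably more per access;
the measured ratio understates the raw cache-vs-DRAM gap because every
iteration also pays interpreter overhead.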

------
snarfy
Reminds me of the video of Grace Hopper talking about nanoseconds:

[https://www.youtube.com/watch?v=JEpsKnWZrJ8](https://www.youtube.com/watch?v=JEpsKnWZrJ8)

------
pacaro
It's also powerful to think of these numbers in the context of complexity. You
might not worry about a nanosecond here or there in an O(N) algorithm, but at
O(N^3), when N is 100 each nanosecond in the inner loop is a millisecond of
wall clock. Blowing the cache in that inner loop is going to hurt
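The arithmetic behind that claim is worth doing once explicitly; a quick
back-of-envelope check:

```python
# Back-of-envelope: what one extra nanosecond per inner-loop iteration
# costs when the loop body runs O(N^3) times.
N = 100
iterations = N ** 3                # 1,000,000 inner-loop executions
cost_per_iteration = 1e-9          # one nanosecond, in seconds
total = iterations * cost_per_iteration
print(f"{total * 1e3:.1f} ms of wall clock per nanosecond in the loop")
# → 1.0 ms of wall clock per nanosecond in the loop
```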

~~~
R_haterade
Can confirm. Currently scrambling to finish up code for school, and the
additional synchronization involved in false-sharing is killing me.

------
luckydude
I agree wholeheartedly that these are good numbers to have in your head.

[http://www.bitmover.com/lmbench/](http://www.bitmover.com/lmbench/)

measures a bunch of these in a portable way. It tries to give you latency &
bandwidth of all the things.

------
jheriko
Always always measure.

Better than guessing based on wrong or outdated information.

------
dredmorbius
I'm considering an "Ask HN" on things programmers should know.

There's "things programmers should know" according to HN:
[https://hn.algolia.com/?query=%22programmers%20should%20know...](https://hn.algolia.com/?query=%22programmers%20should%20know%22&sort=byPopularity&prefix=false&page=0&dateRange=all&type=story)

And as previously mentioned, _97 Things every programmer should know:
collective wisdom from the experts_ , Kevlin Henney, ed., O'Reilly, 2010

[http://www.worldcat.org/title/97-things-every-programmer-should-know-collective-wisdom-from-the-experts/oclc/460060136&referer=brief_results](http://www.worldcat.org/title/97-things-every-programmer-should-know-collective-wisdom-from-the-experts/oclc/460060136&referer=brief_results)

Most of the latter covers practices, with a few technical recommendations:
DRY, floating point, IPC, linkers, BTS. Most entries, however, aren't
technical.

------
a3n
I searched DDG and github for "every programmer." Man, I gotta get busy,
that's a lot to know.

~~~
dredmorbius
Fortunately, there's a book: _97 Things Every Programmer Should Know_

[https://books.google.com/books/about/97_Things_Every_Program...](https://books.google.com/books/about/97_Things_Every_Programmer_Should_Know.html?id=sS7aPtrUuw4C)

[http://www.worldcat.org/title/97-things-every-programmer-should-know-collective-wisdom-from-the-experts/oclc/460060136&referer=brief_results](http://www.worldcat.org/title/97-things-every-programmer-should-know-collective-wisdom-from-the-experts/oclc/460060136&referer=brief_results)

~~~
contingencies
Free Gitbook version @ [https://97-things-every-x-should-know.gitbooks.io/97-things-every-programmer-should-know/content/en/index.html](https://97-things-every-x-should-know.gitbooks.io/97-things-every-programmer-should-know/content/en/index.html)

------
webnanners
Not every programmer needs to know these numbers.

~~~
nxc18
It's not so much about programmers needing to know the numbers.

It's more about programmers needing to know what L1 cache is, the idea that
some operations are faster than others, etc.

I know a lot of web dev type people who have no idea how the CPU works, what a
register is, or what paging and virtual memory are. When you're treating
compute resources like they're free and abundant (as web devs like to do
these days), then of course you don't care. I just wish those devs did care,
because their fancy dev machines blind them to the fact that their
theoretically simple web app groans on anything other than an i7 with 8 GB of
RAM. Their fast internet and local servers also seem to make them forget why
it's bad that first load requires megabytes of JS. Sometimes I'd rather browse
with my cheap tablet, and that nonsense seriously blows.

There's a lot of gluttony in development these days. I wish every developer
was required to take a basic OS or assembly course to see just how much is
happening between writing js and having it actually execute. To see what it
really means to program a computer and not a web browser.

I also wish more devs would take a look at the performance of MS word and the
performance of Google Docs and apply a little critical thinking. On my cheap
Surface 3 (4 GB RAM, i3), Word loads instantly, sips power, and does everything
I'd ever want locally. Docs takes forever, destroys the battery and is slow as
molasses in January with both Edge and Chrome.

Yes, every programmer does need to know these numbers and why the numbers are
what they are.

~~~
digi_owl
Sadly such (willful?) blindness is far from unique to web devs (though much of
it seems to originate with web dev these days).

I have personally encountered people who dismiss the whole "let's stuff
everything in /usr" issue by claiming that everyone (or at least everyone they
care about) is using lights-out management anyway.

------
ilaksh
According to sources like this
[https://www.cs.utah.edu/~manua/pubs/systor15.pdf](https://www.cs.utah.edu/~manua/pubs/systor15.pdf)
NVMe 'disks' give another 5-10X IOPS boost. With M.2 the pricing is pretty
decent for many things.

I think programmers or system designers who miss things like new secondary
storage technologies are doing it wrong, because those orders of magnitude can
really affect development time.

------
caleblloyd
This is super applicable, even in web dev. Properly using a clustered index in
an RDBMS causes sequential reads instead of random reads.

That knowledge alone can speed up queries orders of magnitude and prevent the
need to move to a NoSQL solution.
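As a concrete illustration (my own sketch, using SQLite, where the INTEGER
PRIMARY KEY doubles as the clustered key): a range query over the clustered
key becomes an index SEARCH over adjacent pages, while a predicate on an
unindexed column forces a full table SCAN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (id, payload) VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(1, 1_001)],
)

# Range over the clustered key: a B-tree SEARCH over contiguous pages.
clustered = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM events WHERE id BETWEEN 100 AND 200"
).fetchall()

# Predicate on an unindexed column: a full SCAN of the table.
unindexed = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM events WHERE payload = 'row-150'"
).fetchall()

print(clustered)  # plan detail mentions SEARCH ... USING INTEGER PRIMARY KEY
print(unindexed)  # plan detail mentions SCAN
```

The same distinction shows up in any RDBMS's query plan output; the exact
wording of the plan details varies by SQLite version.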

------
webkike
I think it's far more useful to remember ratios rather than exact numbers.

~~~
hinkley
An important reason we keep reaching for strategies that were abandoned in the
past is that the ratios change over time.

The ubiquity of SSDs brings to a close an era when getting data from another
computer could be faster than getting it from your own disk drive; IIRC, that
ratio had not held since sometime in the eighties (see the Sprite system and
its process migration).

------
en4bz
For those interested in CPU specific numbers I've found this site [1] to be
quite useful.

[1] [http://www.7-cpu.com/](http://www.7-cpu.com/)

------
throwaway2016a
I've seen this chart in various forms going around for years. Do the numbers
ever get updated, or is this the same chart from 5 years ago? In that case I
would expect the numbers to run high.

------
Keyframe
_Assuming ~1GB /sec SSD_

Is this from the future?

~~~
to3m
Not very common at the time but they did exist! Here's a review of one from
2012: [http://www.anandtech.com/show/6124/the-intel-ssd-910-review/4](http://www.anandtech.com/show/6124/the-intel-ssd-910-review/4)

The SATA III bottleneck is still common today, but Mac laptops, at least, have
sported SSDs that can do ~1GByte/sec for a couple of revisions now. I think
they're PCI Express too.

