Hacker News new | comments | ask | show | jobs | submit login

Great series of articles and the lessons are very important to someone writing performance system's programs.

Here is another chart I like you show people: https://dl.dropboxusercontent.com/u/4893/mem_lat3.jpg

This is a circular linked list walk where the elements of the list are in order in memory. So in C the list walk looks like this: while (1) p = *p;

Then the time per access was measured as the total length of the array was increased and the stride across that array was increased. The linked-list walk prevents out of order processors from getting ahead. (BTW another huge reason why vectors are better than lists)

(This is from an old processor that didn't have a memory prefetcher with stride detection in the memory controller. A modern x86 will magically go fast.)

From that chart you can read, L1 size, L2 size, cache line size, cache associativly, page size, TLB size. (It also exposed an internal port scheduling bug on the L2. A 16-byte stride should have been faster than a 32-byte stride.)

Applications are open for YC Summer 2019

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact