Hacker News new | past | comments | ask | show | jobs | submit login

Cache performance is not constant, if it was, we wouldn't even consider it when optimizing.

The point is to use data that was recently fetched into the (faster) cache memory as much as possible instead of incuring the penalty of a cache miss

Cache performance really depends on memory access patterns.

Anyways :) I'm being pedantic here, I should probably go back to work.

Memory access time is a constant multiplier, whether it hits or misses the cache. There can be a big difference in that constant depending on the access pattern, but it is still considered a constant factor.

Yes, random access (missing the cache every time) may be 100x slower than sequential (hitting the cache almost always), but if you're iterating through an array twice as large, it will still be 100x slower.

I agree, on the same machine, the slow-downs are constant for each level of cache and for ram access.

One of my favorite ACM papers was an analysis on how the asymptotic runtime behaviors of various sort algorithms didn't tell the full story, and that the faster algorithms took a pretty big data set before they even broke even with some of the other sort algorithms. In fact I think at even 10,000 records it was still about 30% slower than a 'worse' algorithm and you had to get up to sorting a considerable amount of data before it was 'best'.

I don't know about you, but I don't sort 100k records in a single batch, and if I am it's because I messed up. But I might sort different batches of 100 records 1000 times a minute.

Not necessarily. You might want to create an algorithm that iterates in a sequential fashion through memory, since your memory controller will see the iteration and prefetch the next block of memory. This is a huge performance win on larger data structures.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact