To rephrase the sibling comment, if you had an array of four {ABCD} structs, there are basically two ways of storing them, whether on disk or in memory:

1. AAAA BBBB CCCC DDDD

2. ABCD ABCD ABCD ABCD

One major heuristic CPUs use to make your code fast is to assume that if you access some memory, you're probably interested in the memory nearby. So when you access the first "A" (common to both sequences above), the CPU will typically pull the neighboring bytes into cache along with it -- with layout 1 that's the next "AAA", with layout 2 it's "BCD".

Depending on your workload, one or the other of those might be faster. If you're only interested in a single ABCD element, all fields at once, because you're doing

  SELECT * FROM users WHERE id=$1

then you'll likely want "row-oriented" data -- the #2 scheme above. But if you're interested in all of the A values (say A is the age field) and none of the values from B/C/D because you're doing

  SELECT AVG(age) FROM users

then you'll likely want something "column-oriented" -- the #1 scheme above.
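
If it helps to see it concretely, here's a rough Python sketch of the two layouts as raw bytes. Everything in it is made up for illustration: the four sample records, the assumption that each field is a 4-byte integer, and the helper names. It isn't how any particular database stores things; it just shows which reads are contiguous in which layout.

```python
import struct

# Four hypothetical {A, B, C, D} records; each field is assumed to be a 32-bit int.
records = [
    (1, 10, 100, 1000),
    (2, 20, 200, 2000),
    (3, 30, 300, 3000),
    (4, 40, 400, 4000),
]

FIELD = struct.Struct("<i")   # one 4-byte field
ROW = struct.Struct("<4i")    # one whole ABCD record (16 bytes)

# Scheme 2 (row-oriented): ABCD ABCD ABCD ABCD
row_bytes = b"".join(ROW.pack(*r) for r in records)

# Scheme 1 (column-oriented): AAAA BBBB CCCC DDDD
col_bytes = b"".join(FIELD.pack(r[i]) for i in range(4) for r in records)


def read_record_row(buf, n):
    """Record n from the row layout: a single contiguous 16-byte read."""
    return ROW.unpack_from(buf, n * ROW.size)


def read_column_col(buf, field, count):
    """All values of one field from the column layout: a single contiguous read."""
    return struct.unpack_from(f"<{count}i", buf, field * count * FIELD.size)


def read_column_row(buf, field, count):
    """Same values from the row layout: `count` separate reads, 16 bytes apart."""
    return tuple(
        FIELD.unpack_from(buf, n * ROW.size + field * FIELD.size)[0]
        for n in range(count)
    )
```

read_record_row(row_bytes, 0) is the "WHERE id=$1" case: one read gets the whole record. read_column_col(col_bytes, 0, 4) is the "AVG(age)" case: one read gets every A. read_column_row returns the same A values from the row layout, but it has to hop over B/C/D each time, which is the access pattern that wastes the bytes the CPU fetched alongside.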


