Increasing the size of Ruby objects to minimize CPU cache misses

riffraff · on Jan 4, 2014

I wonder: couldn't it have been reduced to 4 words instead?

I seem to recall most RValue objects are actually smaller than 5 words and I only remember class objects being this large. I may be completely off base anyway.

FooBarWidget · on Jan 4, 2014

No, it couldn't. 5 words is the minimum amount of storage that Ruby needs to represent all of its core types.
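For a sense of scale, here is a rough sketch of how slot size interacts with cache lines. It assumes 8-byte words (so a 5-word slot is 40 bytes), 64-byte cache lines, and a heap whose first slot starts on a line boundary. `straddling_slots` is a made-up helper for illustration, not anything from Ruby's source.

```c
#include <assert.h>
#include <stddef.h>

/* Count how many of the first `count` fixed-size slots cross a cache-line
 * boundary, assuming slot 0 starts exactly on a line boundary.
 * Hypothetical helper, for illustration only. */
size_t straddling_slots(size_t slot_size, size_t line_size, size_t count)
{
    assert(slot_size > 0 && line_size > 0);
    size_t straddling = 0;
    for (size_t i = 0; i < count; i++) {
        size_t first_byte = i * slot_size;
        size_t last_byte = first_byte + slot_size - 1;
        /* A slot straddles when its first and last byte fall in different lines. */
        if (first_byte / line_size != last_byte / line_size)
            straddling++;
    }
    return straddling;
}
```

Under those assumptions, `straddling_slots(40, 64, 8)` gives 4, so half of the slots touch two cache lines, while `straddling_slots(64, 64, 8)` gives 0.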

programminggeek · on Jan 4, 2014

It's amazing to me how at every level caching helps considerably, and yet it is often one of the last things people seem to optimize.

FooBarWidget · on Jan 4, 2014

Because it's not easy to optimize for caching. The currently available tools for doing so are primitive at best. It's very hard to gain good insight into your program's cache behavior and very hard to figure out what you can do about it. There is also very little available literature about cache optimization.

_delirium · on Jan 4, 2014

They also tend to be brittle optimizations if your code is not tied to a specific platform, since cache behavior differs significantly across architectures and even micro-architectures.
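One way to avoid hard-coding a line size is to ask the system at runtime. A minimal sketch, assuming a Linux/glibc-style environment: `_SC_LEVEL1_DCACHE_LINESIZE` is a glibc extension rather than POSIX, so it may be missing or report 0 elsewhere, and `cache_line_size` with its `fallback` parameter is a hypothetical helper.

```c
#include <assert.h>
#include <unistd.h>

/* Best-effort L1 data cache line size in bytes.
 * Falls back to the caller-supplied guess when the system cannot tell us
 * (the sysconf name below is a glibc extension, not part of POSIX). */
long cache_line_size(long fallback)
{
    assert(fallback > 0);
    long size = -1;
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
    size = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
#endif
    /* sysconf may return -1 (unsupported) or 0 (unknown). */
    if (size <= 0)
        size = fallback;
    return size;
}
```

Calling `cache_line_size(64)` uses 64 bytes as the guess, which is common on x86-64 but is still only an assumption on any given machine.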

pkroll · on Jan 4, 2014

Since you said "at every level": optimizing via caching (at the program/memcached level) before you have otherwise made the code go as fast as it can has several problems.

It can obscure future performance enhancements, may take away cache memory from other parts of the code that are already optimized and need caching to reach full speed, and can be used as a replacement for seeking out improved algorithms, causing more structural changes down the line when performance needs to be improved and you can no longer "just add caching!"

The low-level CPU caching is, as _delirium points out, highly specific to an architecture, so it's almost got to be the last thing on the list to optimize.

All this is to say, starting by looking at the cache is premature optimization most of the time.

programminggeek · on Jan 5, 2014

Yes, I agree that starting with caching when you have something that could be a lot faster by just doing smarter things with the code in the first place is not the right path. However, my point was just that there are many levels of caching going on in a computer, and oftentimes developers aren't even aware of them.

For example, how many people write their own file system or memory based caching scheme or jump right to memcache to cache database queries, when they really could just cache the whole page using Varnish or Squid? Just understanding that there are those various levels of caching, from processor to memory to filesystem to web server to opcode caching (or JIT'ing) to memcaching to database caching, goes a long way. A lot of developers just flat out don't understand that all of those things exist and what to do to make the most of them in the right situations.

_gtly · on Jan 4, 2014

Looks like it's already assigned to Ruby core as a (potential) feature for 2.2: http://bugs.ruby-lang.org/issues/9362

jamesaguilar · on Jan 4, 2014

Another option: just annotate it with your compiler's cache alignment attribute. That way you need not redo the size for each architecture.

plorkyeran · on Jan 4, 2014

That'd just leave you with wasted space rather than making more space available for storing things inline. For the real-world tests I would not be at all surprised if the reduced number of heap allocations was a bigger source of speedups than the more cache-friendly reads.
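The difference between the two approaches can be sketched in C11. This assumes a 64-byte cache line, and the struct names are made up for illustration; they are not Ruby's actual `RVALUE` layout.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed line size */

/* Five words of payload, padded out to a full cache line by the
 * alignment attribute. The extra bytes are padding, not usable storage. */
struct aligned_slot {
    alignas(CACHE_LINE) uintptr_t words[5];
};

/* Eight words of payload: the same 64 bytes on a 64-bit target, but
 * all of them can hold data (for example, more content stored inline). */
struct wide_slot {
    uintptr_t words[8];
};

static_assert(sizeof(struct aligned_slot) % CACHE_LINE == 0,
              "aligned slots tile cache lines exactly");

/* Bytes in an aligned slot that exist only as padding. */
size_t padding_bytes(void)
{
    return sizeof(struct aligned_slot) - 5 * sizeof(uintptr_t);
}
```

On a 64-bit target `padding_bytes()` comes to 24: the aligned slot occupies a whole line but still stores only 40 bytes, whereas the wider slot spends the same 64 bytes entirely on payload.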

dschiptsov · on Jan 4, 2014

Someone read "The Lost Art of C Structure Packing" by Mr. Raymond.))
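The essay's core trick is reordering members so the compiler inserts less padding. A small sketch with made-up structs (exact sizes depend on the ABI; on typical x86-64 targets these come out to 24 and 16 bytes):

```c
#include <assert.h>
#include <stdint.h>

/* Narrow members on either side of a wide one: each char is followed by
 * padding so the next member (or the next array element) stays aligned. */
struct naive {
    char     tag;
    uint64_t value;
    char     flag;
};

/* Same members, widest first: the two chars sit together and share
 * a single run of trailing padding. */
struct reordered {
    uint64_t value;
    char     tag;
    char     flag;
};

static_assert(sizeof(struct reordered) <= sizeof(struct naive),
              "reordering should never make this struct larger");
```

Nothing about the data changes here, only the declaration order, which is why this kind of saving is easy to overlook.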