I do some graphics stuff. As soon as you get to "this chunk of code needs to run for every pixel of every 4K frame at 60 fps", suddenly the number of clock cycles and registers matters. Some of my platforms don't have GPUs, so it really comes down to squeezing everything possible out of the language and compiler.
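
For a rough sense of scale, here is a back-of-the-envelope sketch of the per-pixel budget. The numbers are assumptions, not measurements from any real platform: "4K" is taken to mean 3840x2160, and the CPU is a hypothetical single core at 3 GHz doing nothing else.

```python
# Back-of-the-envelope cycle budget per pixel.
# Assumptions (illustrative only): "4K" means 3840x2160 (UHD),
# and the CPU is a hypothetical single core clocked at 3 GHz.

def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Number of pixels that must be processed each second."""
    return width * height * fps

def cycles_per_pixel(clock_hz: float, width: int, height: int, fps: int) -> float:
    """Clock cycles available per pixel if one core does all the work."""
    return clock_hz / pixels_per_second(width, height, fps)

if __name__ == "__main__":
    rate = pixels_per_second(3840, 2160, 60)
    budget = cycles_per_pixel(3.0e9, 3840, 2160, 60)
    print(f"{rate:,} pixels/s")              # 497,664,000 pixels/s
    print(f"{budget:.2f} cycles per pixel")  # 6.03 cycles per pixel
```

Under those assumptions that is about six cycles per pixel on one core, before memory traffic is counted, so a single extra instruction or a register spill in the inner loop is a visible fraction of the budget. Multiple cores and SIMD widen it, but the per-pixel accounting stays the same.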