I'm an ME who went into embedded systems, so I'm not sure what qualifies as low level for a web developer and if checking timing using an oscilloscope or saleae is feasible for the type of work OP wants to do. But even just looking at the assembly would have made things obvious in my example.
EDIT- This made me realize that a rudimentary understanding of the assembly language for your architecture will be very valuable. Doesn't have to be enough to actually write any code in it, but it's great to be able to take a look at what was generated when things are acting weird.
Without that though, performance is really really bad. The inner loop of a motor controller is no place for them...
Anyway a bit toggle tells all, and if you're making speed optimizations you should be checking that things actually got faster.