Authors often do get noticeably worse after becoming famous, though. Take the reasonably well-known example of Robert Jordan, who wrote a series of 5 great books followed by 5 lackluster books. It's hard to say he just got lucky five times in a row.
And you can look at the same question from another angle by reading self-published (and therefore unedited) books -- I used to be quite open to reading those, but bitter experience has taught me not to bother.
Does anyone else find it hilarious to see a bureaucrat heading up an enormous and bloated state intelligence apparatus trying to give a lecture about free markets?
Using Horner's method to evaluate the polynomials is probably suboptimal: each step depends on the result of the previous one, so it creates a serial dependency chain and leaves no instruction-level parallelism. It might be worth trying Estrin's scheme instead (if the compiler doesn't already rewrite the evaluation that way).
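For concreteness, here's a sketch of the two schemes in Python. This is illustrative only -- the actual win depends on how the compiler schedules the resulting FMAs, which Python can't show -- but the dependency structure is visible in the code:

```python
def horner(coeffs, x):
    """Evaluate sum(coeffs[i] * x**i) with Horner's method.

    Each step depends on the previous accumulator value, so at machine
    level this is one long serial chain of fused multiply-adds.
    """
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def estrin(coeffs, x):
    """Evaluate the same polynomial with Estrin's scheme.

    Adjacent coefficients are paired into (c[2i] + c[2i+1]*x) terms,
    then the half-length polynomial is evaluated in x**2, and so on.
    The pairings at each level are independent of each other, which is
    where the instruction-level parallelism comes from.
    """
    cs = list(coeffs)
    while len(cs) > 1:
        if len(cs) % 2:
            cs.append(0.0)  # pad to an even number of coefficients
        cs = [cs[i] + cs[i + 1] * x for i in range(0, len(cs), 2)]
        x = x * x           # the next level works in x squared
    return cs[0]
```

Both evaluate the same polynomial; only the shape of the dependency graph differs (a chain of depth n for Horner, roughly log2(n) levels for Estrin).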
Actually, there is quite a bit of ILP available in the forms that use division - as many as four independent instructions at a time. The numerator and denominator can be evaluated in parallel, and the coefficient loads can be interleaved with the FMAs to hide some of their latency.
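As a sketch of that structure (hypothetical coefficients, just to show the shape): the numerator and denominator below are two independent Horner chains, so an out-of-order core can issue one chain's multiply-adds while the other's are still in flight; the final divide is the only point where they join.

```python
def rational(num, den, x):
    """Evaluate p(x) / q(x), where num[i] and den[i] are the
    coefficients of x**i in p and q respectively.

    The two accumulations carry no dependency on each other, so their
    multiply-adds can be interleaved by the scheduler; the division at
    the end is the only serializing step.
    """
    p = 0.0
    for c in reversed(num):
        p = p * x + c
    q = 0.0
    for c in reversed(den):
        q = q * x + c
    return p / q
```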
That said, with so little variation in the timings, I wonder if some other overhead was dominating the cost. Floating-point division, for example, is typically much slower than either addition or multiplication: according to Agner Fog's instruction tables, div has 2-4x the latency of add or mul, while completely hogging the pipeline.
If you want to make the work less boring and help it stick much better, you could use spaced repetition. For example, instead of testing just the previous week's work, you might pick a third of the questions from the previous week, a third from the week before, and a third at random.
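A sketch of that selection in Python (hypothetical data layout; I'm reading the random third as a draw over all the older weeks):

```python
import random

def pick_questions(weeks, total):
    """Pick a review set of `total` questions from `weeks`, a list of
    per-week question lists ordered oldest first.

    Thirds come from last week, the week before, and a random draw
    over everything older -- a simple spaced-repetition split.
    """
    k = total // 3
    last_week = random.sample(weeks[-1], k)
    week_before = random.sample(weeks[-2], k)
    older_pool = [q for week in weeks[:-2] for q in week]
    older = random.sample(older_pool, total - 2 * k)
    return last_week + week_before + older
```

Weighting the older draw toward material that was answered incorrectly last time would get closer to real spaced-repetition systems, but even this flat split spreads the review out.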