I agree with you that, for large enough n, using O(n) steps in the recursion will be slower than doing O(log n) matrix multiplication steps. The addition chain improves on the binary approach for some values of n, but I don't think it can go below O(log n) because the binary method is fastest for all n that are powers of two (disclaimer: I know almost nothing of addition chains besides what I learned from TAOCP).
And of course, finding the fastest addition chain doesn't come cheaply, either, if P != NP.
Hm, maybe I understand the OP's claim: if you want to generate the first n items in the series, the recursive formula will be hard to beat. Even then, it may not be fastest on modern CPUs. https://oeis.org/A000045 gives
F(n + 12) = 18F(n + 6) - F(n)