> with tolerably close accuracy.

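No, speculative decoding has exactly the same accuracy as the target model. With greedy decoding, its output is token-for-token identical to what you would get by greedy decoding from the target model alone; the draft model only changes how fast you get there.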




Is there a reference for this? I was wondering the same thing.


Read the original speculative decoding paper, or go look at how any framework implements it.

You will see that any draft token that differs from what greedy sampling of the target model would have produced is rejected and replaced with the target model's own token. Ergo, the two outputs are mathematically identical.
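
For anyone who wants to see the rejection rule concretely, here is a minimal sketch of the greedy case. Everything in it is an assumption made for illustration: `target_model` and `draft_model` are hypothetical toy functions that return an argmax token for a context, not real LLMs, and `speculative_greedy_decode` is not any framework's actual API. A real implementation verifies all drafted positions in one batched forward pass of the target model and usually also takes a bonus token when every draft is accepted; both are left out here to keep it short.

```python
from typing import Callable, List, Sequence

VOCAB_SIZE = 5
Model = Callable[[Sequence[int]], int]  # maps a context to its argmax next token


def target_model(context: Sequence[int]) -> int:
    # Hypothetical toy "target": a deterministic function of the context.
    return (sum(context) * 7 + len(context) * 3 + 1) % VOCAB_SIZE


def draft_model(context: Sequence[int]) -> int:
    # Hypothetical toy "draft": agrees with the target except when the
    # context length is a multiple of 3, where it guesses wrong on purpose.
    guess = target_model(context)
    return (guess + 1) % VOCAB_SIZE if len(context) % 3 == 0 else guess


def greedy_decode(model: Model, prompt: Sequence[int], n_new: int) -> List[int]:
    # Plain greedy decoding: one model call per generated token.
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(model(tokens))
    return tokens


def speculative_greedy_decode(
    target: Model, draft: Model, prompt: Sequence[int], n_new: int, k: int = 4
) -> List[int]:
    tokens = list(prompt)
    goal = len(prompt) + n_new
    while len(tokens) < goal:
        # 1. The draft model proposes up to k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))

        # 2. The target model checks each proposed position in order.
        for i, drafted in enumerate(proposal):
            expected = target(tokens + proposal[:i])
            if drafted != expected:
                # First mismatch: keep the accepted prefix, substitute the
                # target's own token, and discard the rest of the proposal.
                tokens.extend(proposal[:i])
                tokens.append(expected)
                break
        else:
            # Every drafted token matched the target's greedy choice.
            tokens.extend(proposal)
    return tokens


if __name__ == "__main__":
    prompt = [1, 2]
    baseline = greedy_decode(target_model, prompt, 20)
    speculative = speculative_greedy_decode(target_model, draft_model, prompt, 20)
    print(baseline == speculative)
```

Every token that ends up in the output is either a draft token the target agreed with or the target's own replacement, so the sequence cannot differ from plain greedy decoding no matter how bad the draft model is. A worse draft only means more rejections and less speedup.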



