Hacker News
blovescoffee
7 months ago
on:
Diffusion Forcing: Next-Token Prediction Meets Ful...
Am I missing something about training time? Does adding per-token noise slow training down significantly? Cool paper though!