Hacker News | timshel1's comments

Modded-nanogpt is also much more data efficient than vanilla nanogpt, even if some of the individual optimizations trade worse data efficiency for higher throughput.

yes, agreed, modded-nanogpt is already a data-efficient variant of the original nanogpt. it's just that the kinds of algorithms it allows are somewhat constrained because it optimizes for wall-clock time.
