Hacker News
Llama2.mojo - outperforms Karpathy’s llama2.c by 30% in multi-threaded inference (github.com/tairov)
2 points by swyx 5 months ago | 1 comment



Isn’t llama2.c just a fun project for Karpathy?

When I compared llama2.c to llama.cpp, llama2.c was way, way slower.

All the Mojo <insert number>x speedup claims I keep hearing about benchmark against toy implementations that nobody actually uses IRL.

Am I missing anything?



