
It's a bit early to compare directly to TensorRT because we don't have a full-blown equivalent yet.

Note that our focus is on being platform-agnostic, easy to deploy and integrate, performant all-around, and easy to tweak. We use the same compiler as JAX, so our performance is on par. More generally, we believe we can gain on overall "tok/s/$" through shorter startup times, choosing the most efficient hardware available, and making it easy to implement new tricks like multi-token prediction.





