Hacker News
MLX 0.11: faster generation across model sizes and machines (twitter.com/awnihannun)
3 points by tosh 67 days ago | past
With the latest MLX, 4-bit Llama 3 8B runs nicely on an 8GB M2 mini (twitter.com/awnihannun)
2 points by mariuz 68 days ago | past
100 tokens/s, 4-bit Mistral 7B in MLX on M2 Ultra (faster than llama.cpp) (twitter.com/awnihannun)
3 points by tosh 83 days ago | past
Apple is hiring GPU kernel engineers for the MLX project (twitter.com/awnihannun)
4 points by behnamoh 5 months ago | past
Mistral 7B 4-bit quantization runs no problem on an 8GB M2 (twitter.com/awnihannun)
20 points by tosh 6 months ago | past