Hacker News
MLX 0.11: faster generation across model sizes and machines (twitter.com/awnihannun)
3 points by tosh 67 days ago | past
With the latest MLX, 4-bit Llama 3 8B runs nicely on an 8GB M2 mini (twitter.com/awnihannun)
2 points by mariuz 68 days ago | past
100 tokens/s, 4-bit Mistral 7B in MLX on M2 Ultra (faster than llama.cpp) (twitter.com/awnihannun)
3 points by tosh 83 days ago | past
Apple is hiring GPU kernel engineers for the MLX project (twitter.com/awnihannun)
4 points by behnamoh 5 months ago | past
Mistral 7B 4-bit quantization runs no problem on an 8GB M2 (twitter.com/awnihannun)
20 points by tosh 6 months ago | past