Hacker News
NexaQuant: Llama.cpp-Compatible Model Compression with 100%+ Accuracy Recovery (nexa.ai)
3 points by BUFU 15 days ago | 1 comment



Will llama.cpp be the go-to local inference framework for every device?



