Submissions from github.com/huawei-csl

		KVarN: Native vLLM backend for KV-cache quantization by Huawei (github.com/huawei-csl)
		130 points by theanonymousone 16 hours ago | 13 comments
		Sinkhorn: Make LLMs even smaller through quantisation while maintaining accuracy (github.com/huawei-csl)
		4 points by ilitirit 8 months ago | 1 comment