I've seen papers on 1.58-bit LLMs with ternary weights (-1, 0, and 1), and they show good accuracy at much smaller model sizes.
But I haven't been able to find any LLMs with strictly 1.00-bit weights (just the values 0 and 1). I guess one would represent negative values by keeping separate "positive" and "negative" weight matrices, then adding/subtracting the results of the two multiplications. To me it looks like a very efficient solution:
1) huge memory and energy savings,
2) computationally, a dot product is just POPCNT(A & B), and the matrices could be laid out very efficiently in memory (a 64-byte cache line holds 512 weights!), so matrix multiplication should be very fast too (see the sketch after this list),
3) should run very fast on a CPU,
4) there should be no precision reduction compared to a 1.58-bit LLM.
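To make point 2 concrete, here's a rough sketch in C of how I imagine the inner kernel would look, assuming bit-packed binary activations and the "positive"/"negative" weight bit-planes I described above (the names binary_dot, w_pos, w_neg are just placeholders, not from any actual implementation):

    #include <stdint.h>
    #include <stddef.h>

    /* Rough sketch, not from any paper: activations packed 64 per word,
       weights split into "positive" and "negative" bit-planes. */
    static int32_t binary_dot(const uint64_t *act,
                              const uint64_t *w_pos,
                              const uint64_t *w_neg,
                              size_t n_words)   /* n_words = dim / 64 */
    {
        int32_t acc = 0;
        for (size_t i = 0; i < n_words; i++) {
            /* +1 weights: count positions where both bits are set */
            acc += __builtin_popcountll(act[i] & w_pos[i]);
            /* -1 weights: subtract positions where both bits are set */
            acc -= __builtin_popcountll(act[i] & w_neg[i]);
        }
        return acc;
    }

If that's roughly right, each output element costs only a couple of AND + POPCNT instructions per 64 weights, which is what makes me think it should be CPU-friendly.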
What are the downsides of this approach? Where could I read about it?