It’s better to think of quantization as mapping groups of weights to entries in a large codebook, and then using clever methods to look the weights back up.
2 bits of precision per weight is perfectly fine as long as you have enough weights. The information encoded by a neural network is measured in total number of bits, so you can compress it either by reducing the number of weights or by reducing the number of bits per weight.
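A minimal sketch of the codebook idea: group the weights, cluster the groups with a few rounds of k-means, and store only a small index per group plus the codebook. The function names `build_codebook` and `dequantize` and all parameter values here are illustrative, not from any particular library. With 8-bit indices over groups of 4 weights, storage works out to 2 bits per weight (plus the codebook itself).

```python
import numpy as np

def build_codebook(weights, group_size=4, codebook_bits=8, iters=10):
    """Toy vector quantization: cluster weight groups into 2**codebook_bits
    codewords using a few rounds of k-means (illustrative, not production)."""
    rng = np.random.default_rng(0)
    groups = weights.reshape(-1, group_size)  # each row is one weight group
    k = 2 ** codebook_bits
    # initialize codewords from randomly chosen groups
    codebook = groups[rng.choice(len(groups), size=k, replace=False)].copy()
    idx = np.zeros(len(groups), dtype=np.int64)
    for _ in range(iters):
        # assign each group to its nearest codeword (squared L2 distance)
        dists = ((groups[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        idx = dists.argmin(axis=1)
        # move each codeword to the mean of its assigned groups
        for c in range(k):
            members = groups[idx == c]
            if len(members):
                codebook[c] = members.mean(axis=0)
    return codebook, idx

def dequantize(codebook, idx, shape):
    """Weight lookup: indices -> codewords -> original tensor shape."""
    return codebook[idx].reshape(shape)

weights = np.random.default_rng(1).standard_normal((64, 64)).astype(np.float32)
codebook, idx = build_codebook(weights)
restored = dequantize(codebook, idx, weights.shape)
bits_per_weight = 8 / 4  # one 8-bit index covers a group of 4 weights
```

The trade-off the passage describes is visible here: shrinking the codebook (fewer bits per index) raises reconstruction error, while adding more weights lets the network absorb that error, since total information is what matters.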