
Yes. More generally, this will enable implementation via crazy-cheap bit-wise ops in binary hardware, and possibly, maybe, via crazy-cheap trit-wise ops in ternary hardware that manipulates ternary digits, or trits. Note that any binary op over trits has only nine possible (trit, trit) input pairs and only three possible trit outputs. Maybe ternary hardware for AI will become a thing?

Fleshing out my thought above: suppose we want to multiply A*B = C, with every operand stored in 2 separate bits Ap and An (Ap = 1 if A = +1, An = 1 if A = -1, and both are 0 if A = 0). Then we can do a product with:

Cp = (Ap & Bp) | (An & Bn)

Cn = (An & Bp) | (Ap & Bn)

So that's 64 products in 6 instructions on 64-bit words, or 256 in 6 instructions with AVX2, or 512 in 6 instructions with AVX-512. If you can execute 2 instructions per cycle on different words, this becomes 1024 "products" in 6 cycles, or between 0.5 and 1 TOPS per core.
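A minimal Python sketch of the two-bitplane product above, for anyone who wants to check it. The helper names (`pack`, `unpack`, `mul_planes`) are made up for illustration, and Python integers stand in for machine words, so one expression covers the whole vector rather than a fixed 64, 256, or 512 lanes.

```python
# Hypothetical helpers: pack a list of trits (-1, 0, +1) into two bitmasks.
def pack(trits):
    p = n = 0
    for i, t in enumerate(trits):
        if t == 1:
            p |= 1 << i      # bit i of the "positive" plane
        elif t == -1:
            n |= 1 << i      # bit i of the "negative" plane
    return p, n

def unpack(p, n, length):
    # Bit i set in p means +1, bit i set in n means -1, neither means 0.
    return [((p >> i) & 1) - ((n >> i) & 1) for i in range(length)]

def mul_planes(ap, an, bp, bn):
    cp = (ap & bp) | (an & bn)   # equal signs give +1
    cn = (an & bp) | (ap & bn)   # opposite signs give -1
    return cp, cn                # 4 ANDs + 2 ORs = 6 bitwise ops

a = [1, -1, 0, 1, -1, 0, 1, -1, 0]
b = [1, 1, 1, -1, -1, -1, 0, 0, 0]
cp, cn = mul_planes(*pack(a), *pack(b))
products = unpack(cp, cn, len(a))
```

The two vectors cover all nine (trit, trit) input pairs, so comparing `products` against the element-wise `x * y` exercises the whole truth table.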

The summing still involves using popcount on the positive and negative bits - I doubt AVX supports that, but it's still a fast way to "sum" individual bits. I don't see custom hardware for this as a short term thing - they need to prove out the quantization concept more first.
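To make the popcount step concrete, here is a rough sketch of a ternary dot product on the Ap/An encoding. The function names are hypothetical, and `bin(x).count("1")` is only a portable stand-in for a hardware popcount instruction.

```python
def popcount(x):
    # Portable bit count; int.bit_count() also works on Python 3.10+.
    return bin(x).count("1")

def ternary_dot(ap, an, bp, bn):
    cp = (ap & bp) | (an & bn)   # positions where the product is +1
    cn = (an & bp) | (ap & bn)   # positions where the product is -1
    return popcount(cp) - popcount(cn)

# a = [+1, -1, 0, +1] and b = [+1, +1, -1, -1], with element 0 in bit 0.
ap, an = 0b1001, 0b0010
bp, bn = 0b0011, 0b1100
result = ternary_dot(ap, an, bp, bn)
```

Here the element-wise products are +1, -1, 0, -1, so `result` should match their sum.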

Another way would be to use one register for "zero" vs. "non-zero", and another for negative (basically a 2-bit sign-magnitude representation).

    C_sgn = A_sgn ^ B_sgn
    C_mag = A_mag & B_mag
The result can then be converted into bitmasks for positive and negative:

    C_plus = C_mag & ~C_sgn
    C_minus = C_mag & C_sgn
This solution should be more efficient if there is an "AND NOT" instruction, or when multiplying more than two factors.
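A quick sketch of the sign-magnitude variant, assuming bit i of `mag` means "non-zero" and bit i of `sgn` means "negative". The helper names are invented for the example.

```python
def pack_sm(trits):
    mag = sgn = 0
    for i, t in enumerate(trits):
        if t != 0:
            mag |= 1 << i    # non-zero element
        if t < 0:
            sgn |= 1 << i    # negative element
    return mag, sgn

def mul_sm(a_mag, a_sgn, b_mag, b_sgn):
    # One AND and one XOR per pair of factors.
    return a_mag & b_mag, a_sgn ^ b_sgn

def to_planes(mag, sgn):
    # Masking with mag discards sign bits at positions whose product is zero.
    return mag & ~sgn, mag & sgn

a = [1, -1, 0, -1]
b = [-1, -1, -1, 0]
c_mag, c_sgn = mul_sm(*pack_sm(a), *pack_sm(b))
c_plus, c_minus = to_planes(c_mag, c_sgn)
```

Note that `c_sgn` can end up with bits set where `c_mag` is zero (positions 2 and 3 here); those bits are harmless because every later use masks with `c_mag`. Each extra factor costs one more AND and one more XOR, and the conversion to plus/minus masks is only needed once at the end.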

Thinking a bit more about this, you could eliminate the conversion and do

    sum = popcount(mag) - 2*popcount(mag & sgn)
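The formula follows from popcount(plus) = popcount(mag) - popcount(mag & sgn) and popcount(minus) = popcount(mag & sgn). A small sanity check in Python, with hypothetical helper names:

```python
def popcount(x):
    return bin(x).count("1")

def pack_sm(trits):
    # Bit i of mag: element is non-zero. Bit i of sgn: element is negative.
    mag = sgn = 0
    for i, t in enumerate(trits):
        if t != 0:
            mag |= 1 << i
        if t < 0:
            sgn |= 1 << i
    return mag, sgn

def sm_sum(mag, sgn):
    # (#non-zero) - 2 * (#negative) == (#positive) - (#negative)
    return popcount(mag) - 2 * popcount(mag & sgn)

trits = [1, -1, 0, -1, 1, 1, 0, 1]
total = sm_sum(*pack_sm(trits))
```

The `mag & sgn` mask matters when `sgn` comes out of an XOR of factor signs, since it may then carry set bits at positions where the magnitude is zero.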

> I don't see custom hardware for this as a short term thing - they need to prove out the quantization concept more first.

Yes, I agree. This still needs to be more extensively tested.

