
Reducing Memory Usage for DNN Training with Tensor Compression - liuliu
https://liuliu.me/eyes/reduce-another-70-memory-usage-for-deep-neural-network-training-over-mixed-precision-with-tensor-compression/
======
panpanna
This builds upon something he calls an "adaptive quantization algorithm":
basically, reducing the precision of intermediate values to 8 or 4 bits.

Is this something that is supported on all GPUs?

~~~
liuliu
Precision is reduced to 2 bits, with a "codebook" of 2 within each 4x4 patch.
Similar to
[https://en.wikipedia.org/wiki/Block_Truncation_Coding](https://en.wikipedia.org/wiki/Block_Truncation_Coding)
but using 1 more bit. There is a lot of tuning left to do to find better
encodings that reduce MSE.
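A minimal sketch of what such a scheme could look like, assuming (based on the BTC comparison) that the two codebook entries are per-patch endpoints and each value stores a 2-bit index into 4 levels interpolated between them. The min/max endpoint choice and linear level spacing are assumptions for illustration, not the post's actual encoder:

```python
# Sketch of a BTC-like 2-bit quantizer: per 4x4 patch, keep two float
# endpoints (the "codebook" of 2) plus one 2-bit index per element.
# Endpoint selection (min/max) and even level spacing are assumptions.

def encode_patch(patch):
    """patch: flat list of 16 floats -> (lo, hi, 2-bit indices)."""
    lo, hi = min(patch), max(patch)
    step = (hi - lo) / 3 if hi > lo else 1.0
    # Map each value to the nearest of 4 levels: lo, lo+step, lo+2*step, hi.
    indices = [min(3, max(0, round((v - lo) / step))) for v in patch]
    return lo, hi, indices

def decode_patch(lo, hi, indices):
    step = (hi - lo) / 3
    return [lo + i * step for i in indices]

patch = [0.0, 0.1, 0.2, 0.9, 1.0, 0.5, 0.4, 0.3,
         0.7, 0.8, 0.6, 0.2, 0.1, 0.05, 0.95, 0.55]
lo, hi, idx = encode_patch(patch)
decoded = decode_patch(lo, hi, idx)
mse = sum((a - b) ** 2 for a, b in zip(patch, decoded)) / len(patch)
```

Storage per patch drops from 16 full-precision values to 2 endpoints plus 16 x 2 bits; the per-element error is bounded by half the level spacing, which is what tuning the endpoint/level choice would aim to shrink.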

------
nico_h
with code! lose some precision, gain some DNN depth!

