
Really impressive and intriguing work, thanks for sharing!

I'd be specifically curious about applying PQ to transformers. It's quite depressing to me that ultra-large-scale model training is inaccessible to the average poor person like me. The dream would be to figure out some method of compressing the parameters significantly (or otherwise making models far more efficient) so that it becomes possible to train and/or run 100 billion, or even 1/10/100 trillion+, parameter models on, say, Colab, as 'crazy' as that probably sounds to most.
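
For anyone curious what that would look like mechanically, here's a rough sketch of vanilla product quantization applied naively to a single weight matrix, using numpy and scikit-learn's KMeans. The function names are made up for illustration, and serious PQ-for-networks work adds things this omits (codebook fine-tuning, recovering the accuracy hit, fast decode kernels):

  import numpy as np
  from sklearn.cluster import KMeans

  def pq_compress(W, n_subvectors=4, n_centroids=256):
      # Split each row of W into n_subvectors chunks and learn a
      # shared codebook of n_centroids centroids per chunk via k-means.
      rows, cols = W.shape
      assert cols % n_subvectors == 0
      d = cols // n_subvectors
      codebooks, codes = [], []
      for m in range(n_subvectors):
          sub = W[:, m * d:(m + 1) * d]
          km = KMeans(n_clusters=n_centroids, n_init=4).fit(sub)
          codebooks.append(km.cluster_centers_)       # (n_centroids, d) floats, shared
          codes.append(km.labels_.astype(np.uint8))   # one byte per row per chunk
      return codebooks, codes

  def pq_decompress(codebooks, codes):
      # Approximate W by concatenating each row's chosen centroids.
      return np.hstack([cb[c] for cb, c in zip(codebooks, codes)])

  # Toy 4096x512 layer: 2048 bytes per row in float32 shrinks to 4 bytes
  # of codes per row, plus ~0.5 MB of float32 codebooks shared by all rows.
  W = np.random.randn(4096, 512).astype(np.float32)
  codebooks, codes = pq_compress(W)
  W_hat = pq_decompress(codebooks, codes)
  print("reconstruction MSE:", np.mean((W - W_hat) ** 2))

The catch, of course, is that this only shrinks storage and memory; getting a transformer to actually train or run well on top of the quantized weights is the hard part the real research has to solve.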


