Hacker News

> Based on current hardware trends, GPT-4 will run on your phone within 6-10 years

They quantized the model from 16 bits to 4 bits, which was low-hanging fruit, and it looks like they can't quantize it further down to 2 bits.
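For intuition, here is a minimal numpy sketch of symmetric per-group 4-bit weight quantization: each group shares one scale, and values are rounded to integers in [-8, 7]. The group size and example weights are made up for illustration; real schemes (GPTQ and friends) are considerably more involved.

```python
import numpy as np

def quantize_4bit(w, group_size=4):
    # Symmetric per-group quantization: each group of weights shares one
    # scale, and each weight becomes a 4-bit integer in [-8, 7].
    groups = w.reshape(-1, group_size)
    scale = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(groups / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Reconstruct approximate fp weights from the ints and scales.
    return (q * scale).reshape(-1)

w = np.array([0.12, -0.33, 0.07, 0.29, -0.51, 0.44, -0.02, 0.18])
q, s = quantize_4bit(w)
w_hat = dequantize(q, s)
print("max reconstruction error:", np.abs(w - w_hat).max())
```

At 2 bits there are only 4 representable levels per group, so the rounding error above grows sharply, which is one way to see why the next halving is much harder than 16-to-4 was.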




CPUs and GPUs are general purpose; if enough workload demand exists, specialized Transformer cores will be designed. Likewise, it's not at all clear that the current O(N^2) self-attention is the ideal setup for longer context lengths. All of which is to say, I'd believe we have another 8-10x algorithmic improvement in inference costs over the next 10 years, in addition to whatever Moore's law brings.
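The O(N^2) point can be made with back-of-envelope arithmetic: naive self-attention materializes an N x N score matrix, so both its memory and its matmul FLOPs grow quadratically in sequence length. The FLOP formula below is a rough illustrative count, not a measurement of any particular model.

```python
def attention_cost(seq_len, d_model):
    # Naive self-attention forms QK^T, an (seq_len x seq_len) matrix.
    score_entries = seq_len * seq_len
    # Rough FLOP count for QK^T plus scores @ V, ignoring softmax etc.
    flops = 2 * seq_len * seq_len * d_model
    return score_entries, flops

for n in (2_048, 8_192, 32_768):
    entries, flops = attention_cost(n, 128)
    print(f"{n:>6} tokens: {entries:.2e} score entries, {flops:.2e} FLOPs")
```

Doubling the context multiplies both numbers by 4, which is why sub-quadratic attention variants are an obvious target for algorithmic gains.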


Mobile TPUs/NPUs:

Pixel 6+ phones have TPUs (in addition to CPUs and an iGPU/dGPU).

Tensor Processing Unit > Products > Google Tensor https://en.wikipedia.org/wiki/Tensor_Processing_Unit

TensorFlow Lite (tflite): https://www.tensorflow.org/lite

From https://github.com/hollance/neural-engine :

> The Apple Neural Engine (or ANE) is a type of NPU, which stands for Neural Processing Unit.

From https://github.com/basicmi/AI-Chip :

> A list of ICs and IPs for AI, Machine Learning and Deep Learning



