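Most laptops with 64 GB or more of RAM can run a 70B model at 4-bit quantization. It’s not a miracle, it’s just math: 70 billion parameters at half a byte each is about 35 GB of weights, which leaves headroom for the KV cache and the OS. An M2 does it faster than systems with lower memory bandwidth, because generating each token is mostly limited by how quickly the weights can be read from memory.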
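To make the "just math" part concrete, here is a rough back-of-the-envelope sketch. The function names are made up for illustration, the bandwidth figures are placeholder inputs rather than measurements, and the speed estimate assumes every weight is read once per generated token. It ignores KV cache, quantization overhead, and compute limits, so treat the result as a ceiling, not a benchmark.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def max_tokens_per_second(weights_gb: float, bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode speed.

    Assumes each generated token requires reading all weights once,
    so speed is capped at bandwidth divided by weight size.
    """
    return bandwidth_gb_s / weights_gb


# 70B parameters at 4 bits per weight (ignores per-block scale overhead).
weights_gb = weight_memory_gb(70e9, 4)
print(f"weights: ~{weights_gb:.0f} GB")

# Illustrative bandwidth values (assumptions, not measured numbers).
for label, bandwidth in [("higher-bandwidth machine", 400.0),
                         ("lower-bandwidth machine", 100.0)]:
    ceiling = max_tokens_per_second(weights_gb, bandwidth)
    print(f"{label}: at most ~{ceiling:.1f} tokens/s")
```

The point is the ratio: with the same model, four times the memory bandwidth gives roughly four times the ceiling on tokens per second, regardless of how much RAM is left over.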