Not only possible but quite easy. Inference for 70B can be done with llama.cpp using CPU only, on any commodity hardware with >64GB of RAM
And if your machine doesn't have that much RAM, you need some workarounds when compiling and it gets a bit harder to run.
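For illustration, here's a minimal CPU-only sketch using the llama-cpp-python bindings rather than the llama.cpp CLI directly; the model path and quantization level are placeholders, and you'd substitute whatever GGUF file you've downloaded:

```python
# CPU-only inference sketch via llama-cpp-python (pip install llama-cpp-python).
# Assumes a quantized 70B GGUF file on disk; the path below is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-70b.Q4_K_M.gguf",  # placeholder: your quantized model file
    n_ctx=2048,       # context window size
    n_threads=8,      # roughly match your physical core count
    n_gpu_layers=0,   # 0 = run everything on the CPU
)

out = llm("Q: What is the capital of France? A:", max_tokens=32, stop=["\n"])
print(out["choices"][0]["text"])
```

With a 4-bit quant the weights fit comfortably in the ~64 GB figure mentioned above, which is why plain commodity RAM is enough; it won't be fast, but it runs.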