
Isn't it 2 bytes (FP16) per param? So 7B = 14 GB, plus some overhead for inference?
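Quick back-of-the-envelope in Python (weights only; KV cache, activations, and framework overhead come on top of this):

```python
# Rough weight-memory estimate: bytes per parameter times parameter count.
# Illustrative only; real inference needs extra memory beyond the weights.

BYTES_PER_PARAM = {
    "fp32": 4,
    "fp16/bf16": 2,   # FP16 and BF16 are both 2 bytes per param
    "fp8/int8": 1,
    "int4": 0.5,
}

def weight_memory_gb(num_params: float, dtype: str) -> float:
    return num_params * BYTES_PER_PARAM[dtype] / 1e9

for dtype in BYTES_PER_PARAM:
    print(f"7B @ {dtype}: {weight_memory_gb(7e9, dtype):.1f} GB")
# 7B @ fp16/bf16 -> 14.0 GB; 12.2B @ bf16 -> ~24.4 GB
```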



This was trained to be run at FP8 with no quality loss.


The model description on Hugging Face says: Model size: 12.2B params, Tensor type: BF16. Is the tensor type different from the precision the params were trained at?


It's very common to run local models in 8-bit int.


Yes, but it's not common for the original model to ship as 8-bit int. The community can quantize any model down to 8-bit int, but that always comes with some quality loss.
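For reference, a minimal sketch of what that post-hoc 8-bit quantization usually looks like with Hugging Face transformers + bitsandbytes (the model id here is a placeholder):

```python
# Minimal sketch: loading a BF16/FP16 checkpoint quantized to 8-bit int.
# "some-org/some-7b-model" is a placeholder, not a real model id.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(load_in_8bit=True)  # LLM.int8() quantization
model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-7b-model",
    quantization_config=quant_config,
    device_map="auto",  # place layers across available GPUs/CPU
)
# Weights are quantized at load time; how much quality is lost
# depends on the model and the task.
```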



