
This smells strongly of quantization - the extra commentary is a failure mode I observe frequently when dropping from FP16 down to 4 bits.


Which base model do you work with?



