It’s interesting. It looks like if you’re trying to improve the accuracy/perplexity of the model and are using fp32, a modified softmax doesn’t make a difference, but if you want to quantize the model or make it compressible, it makes a huge difference (this is what I understand from the Qualcomm paper). Different goals, different findings?


