Rigorous Testing of Novel LLMs
2 points by kai-dev on June 14, 2023
So, my friend and I built a custom model that performs 2-3x better than a comparable decoder-only model (roughly the same parameter count, exactly the same training data, etc.), and it does this on CodeXGLUE.

I am currently trying to figure out the best way to a) scale it, b) test it on other benchmarks, and c) determine its maximum usable context length. That last one is the real point of interest: we have reason to believe it scales linearly to far larger context windows, but we still need to verify that empirically. For now we are only looking for evaluation help. Once we are confident our eval metrics are bulletproof, we will probably release it.
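
To make c) concrete, the kind of test I have in mind is a perplexity-style sweep: hold the corpus fixed, grow the evaluation window, and watch per-token loss. A minimal sketch in Python (model_loss and tokens here stand in for our own scoring function and tokenized corpus, not any real library API):

    import random

    def sweep_context_lengths(model_loss, tokens, lengths, n_samples=20, seed=0):
        # For each target length, sample random windows from the corpus
        # and average the model's per-token loss over them.
        rng = random.Random(seed)
        results = {}
        for length in lengths:
            if length > len(tokens):
                break  # corpus too short for this window size
            losses = []
            for _ in range(n_samples):
                start = rng.randrange(len(tokens) - length + 1)
                window = tokens[start:start + length]
                # placeholder: per-token loss from our model on this window
                losses.append(model_loss(window))
            results[length] = sum(losses) / len(losses)
        return results

    # Double the window each step, e.g. 512 .. 131072 tokens.
    lengths = [2 ** k for k in range(9, 18)]

If per-token loss stays flat (or improves) as the window doubles, that's at least consistent with the linear-scaling hypothesis; a needle-in-a-haystack retrieval test on top of that would check whether the model actually attends across the full window rather than ignoring the extra context.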

A specific reason I am asking for help here is that there is nothing in the literature about a model like ours (trust me, I have read hundreds of papers hoping to make my life easier while developing this). Literally nothing. If anyone has leads on massive-context-window models that aren't decoder-only or encoder-only, please let me know. Thanks.
