Hacker News | adityang5's comments

Very cool stuff! Exciting to see so many people sharing their work on KANs. The authors claim that KANs reduce the catastrophic forgetting seen in MLPs, so I thought, "Wouldn't it be nice if there was an LLM that substituted MLPs with KANs?" I looked around and didn't find one, so I built one!

- PyTorch Module of the KAN GPT

- Deployed to PyPI

- MIT Licence

- Test Cases to ensure forward-backward passes work as expected

- Training script

I am currently training it on the WebText dataset to compare it against the original GPT-2. I'm hitting a few out-of-memory issues at the moment. Perhaps the vocab size (50257) is too large?
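For a rough sense of why the vocab size could matter, here is a back-of-the-envelope sketch. Everything other than the vocab size is an assumption for illustration: an embedding width of 768 (GPT-2 small), a KAN layer that stores roughly `grid_size + spline_order + 1` parameters per input-output edge, and a hypothetical batch of 8 sequences of 1024 tokens. It is not taken from the kan-gpt code, and it only applies if the output projection is itself a KAN layer.

```python
# Rough memory estimate for a vocab-sized output projection.
# All sizes except VOCAB are illustrative assumptions, not kan-gpt internals.

VOCAB = 50257   # from the comment above
N_EMBD = 768    # assumption: GPT-2 small embedding width


def linear_params(n_in: int, n_out: int) -> int:
    """Weights of a bias-free linear layer: one scalar per edge."""
    return n_in * n_out


def kan_params(n_in: int, n_out: int, grid_size: int = 5, spline_order: int = 3) -> int:
    """Assumed KAN cost: (grid_size + spline_order) spline coefficients
    plus one base weight per input-output edge."""
    per_edge = grid_size + spline_order + 1
    return n_in * n_out * per_edge


def mib(n_values: int, bytes_per_value: int = 4) -> float:
    """Size in MiB for n_values stored as float32 by default."""
    return n_values * bytes_per_value / 2**20


def logits_values(batch: int, seq_len: int, vocab: int = VOCAB) -> int:
    """Number of logits produced in one forward pass."""
    return batch * seq_len * vocab


if __name__ == "__main__":
    lin = linear_params(N_EMBD, VOCAB)
    kan = kan_params(N_EMBD, VOCAB)
    logits = logits_values(batch=8, seq_len=1024)  # hypothetical batch shape
    print(f"linear head weights: {lin:,} params, {mib(lin):.1f} MiB")
    print(f"KAN head weights:    {kan:,} params, {mib(kan):.1f} MiB")
    print(f"logits activations:  {logits:,} values, {mib(logits):.1f} MiB")
```

Under these assumptions the KAN head is about 9x the size of a linear one, before counting gradients and optimizer state, so keeping the final projection as a plain linear layer or lowering the batch size and sequence length may be worth trying first.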

I'm open to contributions and would love to hear your thoughts!

https://github.com/AdityaNG/kan-gpt

https://pypi.org/project/kan-gpt/

