Did you run the fine-tuning on LLaMA yourselves based on the 52k examples from Alpaca? Or is there a pre-trained 7B Alpaca model out there that you grabbed?



For this demo, we're using the 8-bit version here: https://huggingface.co/tloen/alpaca-lora-7b
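If it helps anyone reproduce the setup, loading that checkpoint looks roughly like this (a minimal sketch using transformers + peft + bitsandbytes; the base-model repo id is an assumption, swap in whichever HF-converted LLaMA-7B weights you have):

    import torch
    from transformers import LlamaForCausalLM, LlamaTokenizer
    from peft import PeftModel

    BASE = "decapoda-research/llama-7b-hf"  # assumed HF-converted LLaMA-7B base; substitute your own

    tokenizer = LlamaTokenizer.from_pretrained(BASE)
    base_model = LlamaForCausalLM.from_pretrained(
        BASE,
        load_in_8bit=True,        # bitsandbytes int8 weights for the frozen base
        torch_dtype=torch.float16,
        device_map="auto",
    )
    # Attach the LoRA adapter weights on top of the 8-bit base.
    model = PeftModel.from_pretrained(base_model, "tloen/alpaca-lora-7b")

    prompt = "### Instruction:\nExplain LoRA in one sentence.\n\n### Response:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(output[0], skip_special_tokens=True))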

We also fine-tuned and OSS'd a 30B version (on the cleaned 52k Alpaca dataset) that you can check out here: https://huggingface.co/baseten/alpaca-30b
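The LoRA side of a fine-tune like that is basically the standard peft recipe, something like the following (illustrative hyperparameters, not necessarily the exact ones we used; the weights path is a placeholder):

    from transformers import LlamaForCausalLM
    from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

    base = LlamaForCausalLM.from_pretrained(
        "path/to/llama-30b-hf",   # placeholder: HF-converted LLaMA-30B weights
        load_in_8bit=True,
        device_map="auto",
    )
    base = prepare_model_for_int8_training(base)  # cast norms, enable gradient checkpointing

    lora_config = LoraConfig(
        r=8,                                  # adapter rank
        lora_alpha=16,
        target_modules=["q_proj", "v_proj"],  # attention projections to adapt
        lora_dropout=0.05,
        bias="none",
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(base, lora_config)
    model.print_trainable_parameters()  # only the adapters (a tiny fraction of 30B params) train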


Can you comment on the '8-bit version' from above? Does that mean the parameters are uint8s (converted from the original float16 params)? Looking in your PyTorch code, I see some float16 declarations.

I've been running alpaca.cpp 13B locally, and your 7B model performs much better than it does. I had assumed this was because alpaca.cpp converts weights from float16 down to 4-bit, but is there some other fine-tuning you're doing that might also account for the better performance of chatLLaMA over alpaca.cpp?
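For reference, my mental model of the 8-bit path is plain absmax rounding to int8 with a scale factor, roughly like this toy example; the real bitsandbytes and alpaca.cpp kernels quantize per block rather than per tensor, so this is illustrative only:

    import torch

    def absmax_quantize_int8(w: torch.Tensor):
        """Toy symmetric absmax quantization: float weights -> int8 values plus one scale."""
        scale = w.abs().max() / 127.0
        q = torch.round(w / scale).clamp(-127, 127).to(torch.int8)
        return q, scale

    w = torch.randn(4096, 4096, dtype=torch.float16).float()
    q, scale = absmax_quantize_int8(w)
    w_hat = q.float() * scale            # dequantized copy used at matmul time
    print((w - w_hat).abs().max())       # rounding error relative to the original weights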


Did you use the cleaned and improved Alpaca dataset from https://github.com/tloen/alpaca-lora/issues/28?


Yes, we did! The dataset has since been cleaned even further, so we're due to update the model.
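If anyone wants to poke at the data side, the cleaned JSON loads directly with the datasets library; a quick sketch (the filename alpaca_data_cleaned.json matches the one in the alpaca-lora repo, adjust if it has been renamed since):

    from datasets import load_dataset

    # Load the cleaned Alpaca instruction data (grab alpaca_data_cleaned.json from
    # the tloen/alpaca-lora repo first; the filename may change as cleanup continues).
    data = load_dataset("json", data_files="alpaca_data_cleaned.json")["train"]
    print(len(data), data[0].keys())  # ~52k rows with instruction / input / output fields

    def to_prompt(example):
        # Standard Alpaca prompt template, with and without the optional input field.
        if example["input"]:
            return (f"### Instruction:\n{example['instruction']}\n\n"
                    f"### Input:\n{example['input']}\n\n"
                    f"### Response:\n{example['output']}")
        return (f"### Instruction:\n{example['instruction']}\n\n"
                f"### Response:\n{example['output']}")

    print(to_prompt(data[0]))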



