
Yeah, I mention this in the post, but this variant of LLaMA isn't storing any of the conversation in memory, so it doesn't have context on the prior questions. You're starting fresh with each prompt. We have some ideas for how to improve this though... more soon :)


The simplest way to improve this is to just re-feed the whole conversation as the prompt on each turn.
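Something like this minimal sketch, assuming a generic generate(prompt) completion function (hypothetical name, stubbed out below):

    # Minimal sketch: the model is stateless between prompts, so keep the
    # transcript client-side and resend all of it on every turn.

    def generate(prompt: str) -> str:
        """Placeholder for whatever completion call you're using."""
        raise NotImplementedError

    history: list[str] = []

    def chat(user_message: str) -> str:
        history.append(f"User: {user_message}")
        # Concatenate all prior turns so the model sees the full context.
        prompt = "\n".join(history) + "\nAssistant:"
        reply = generate(prompt)
        history.append(f"Assistant: {reply}")
        return reply

(Context windows are finite, so in practice you'd also truncate or summarize older turns.)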


ah, ok - thanks!


fixed!


Wow sorry y'all, we didn't expect this to take off so quickly. Working on getting this scaled up now so everyone can play!


And we're back! It might be a little slow while traffic is high, but responses are coming through.


Heads up: the UI is not usable on iOS Safari if the URL bar is set to be on the bottom. The bar covers the text area and send button, and if you scroll down, it jumps back up and covers them again.


Did you run the fine-tuning on LLaMA yourselves, based on the 52k examples from Alpaca? Or is there a pre-trained 7B Alpaca model out there that you grabbed?


For this demo, we're using the 8-bit version here: https://huggingface.co/tloen/alpaca-lora-7b

We also fine-tuned and OSS'd a 30B version (trained on the cleaned 52k Alpaca dataset) that you can check out here: https://huggingface.co/baseten/alpaca-30b
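If you want to run it yourself, here's a rough sketch of loading that 8-bit LoRA checkpoint with transformers + peft + bitsandbytes (the base-model ID below is an assumption; point it at wherever your LLaMA-7B weights live):

    import torch
    from peft import PeftModel
    from transformers import LlamaForCausalLM, LlamaTokenizer

    base_model = "decapoda-research/llama-7b-hf"  # assumed LLaMA-7B weights

    model = LlamaForCausalLM.from_pretrained(
        base_model,
        load_in_8bit=True,        # int8 weights via bitsandbytes
        torch_dtype=torch.float16,
        device_map="auto",
    )
    # Apply the LoRA adapter on top of the 8-bit base model.
    model = PeftModel.from_pretrained(model, "tloen/alpaca-lora-7b")
    tokenizer = LlamaTokenizer.from_pretrained(base_model)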


Can you comment on the '8-bit version' from above? Does that mean the parameters are uint8s (converted from the original float16 params)? Looking in your PyTorch code, I see some float16 declarations.

I've been running alpaca.cpp 13B locally, and your 7B model performs much better than it does. I had assumed this was because alpaca.cpp converts the weights from float16 down to 4 bits, but is there some other fine-tuning you're doing that might also account for chatLLaMA's better performance over alpaca.cpp?
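For reference, my mental model of "8-bit" is simple absmax rounding, as in the toy sketch below; is that roughly right, or are you doing something fancier (per-channel scales, outlier handling, etc.)?

    import torch

    # Toy absmax int8 quantization: illustrative only, almost certainly
    # not the exact scheme the demo uses.
    def quantize_int8(w: torch.Tensor):
        scale = w.abs().max() / 127.0               # largest magnitude -> 127
        q = (w / scale).round().clamp(-128, 127).to(torch.int8)
        return q, scale

    def dequantize(q: torch.Tensor, scale: torch.Tensor):
        return q.to(torch.float16) * scale          # approximate originals

    w = torch.randn(4, 4, dtype=torch.float16)
    q, scale = quantize_int8(w.float())
    print((w.float() - dequantize(q, scale)).abs().max())  # small rounding error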


Did you use the cleaned and improved Alpaca dataset from https://github.com/tloen/alpaca-lora/issues/28?


Yes, we did! The dataset has since been cleaned even more, so we're due to update the model.


Alright, after a few hiccups, you should be seeing a noticeable improvement in response times now!


A guide on how Abu fine-tuned Alpaca 30B and how to use it.

