
This class of startup, "build domain-specific LLMs using your own data", is extremely crowded right now, but I am not optimistic about their future. For large companies, the actual modeling work is already easy for any ML team, thanks to existing FOSS work on things like PEFT and LoRA. The hard part is figuring out what data goes into the fine-tuning process and how to get that data into a usable form, but that is very business-specific and can't be automated as a SaaS offering.

For SMBs, the value would be in using the LLM to generate responses to customer Q&A/search queries. But these companies aren't going to integrate some external third-party service; they'll only use it if it's already baked into their CMS - WordPress/Shopify/Wix/etc. I just don't see who the final consumer of this product would be.



> "build domain specific LLMS using your own data",

It seems to me that the vast majority of these people would be better off just doing semantic search: chunk the documents, run them through an embedding model, and store the vectors in a vector database, then run the search query and the retrieved results through an LLM at the final step to produce an actual "answer". For applications where this is not practical, I agree that LoRA should be the next approach. I have a hard time believing the future is everyone training their own domain-specific LLM from the ground up.
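To make that concrete, here's a minimal sketch of the chunk/embed/retrieve/prompt flow. The embedding model, chunk size, and "handbook.txt" corpus are all placeholder choices, and the final LLM call is left open:

    # Minimal retrieval-augmented QA sketch. Assumes sentence-transformers
    # is installed; model name and chunking scheme are arbitrary choices.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")

    def chunk(text, size=500):
        # Naive fixed-size chunking; real systems split on document structure.
        return [text[i:i + size] for i in range(0, len(text), size)]

    docs = chunk(open("handbook.txt").read())  # placeholder corpus
    doc_vecs = model.encode(docs, normalize_embeddings=True)  # shape (n, d)

    def build_prompt(query, k=3):
        q = model.encode([query], normalize_embeddings=True)[0]
        # Cosine similarity == dot product on normalized vectors; a real
        # vector DB (pgvector, FAISS, etc.) replaces this at scale.
        top = np.argsort(doc_vecs @ q)[-k:][::-1]
        context = "\n\n".join(docs[i] for i in top)
        return f"Answer using only this context:\n{context}\n\nQ: {query}\nA:"

    # Feed the result to whatever LLM you like for the final "answer" step.
    print(build_prompt("What is our refund policy?"))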


I wholeheartedly agree with this. Vector databases are easily updatable, searchable by recency, and let you verify where the information came from. Training a custom frozen LLM for every company seems insane. Each company's data is not that unique - it's the specific facts and figures that matter, and those belong in a vector or traditional database.


> thanks to existing FOSS work on stuff like PEFT and LoRA

YMMV. Sometimes a LoRA is fine, but sometimes a full fine-tune is necessary for higher-quality output.
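For what it's worth, attaching a plain LoRA with HF's peft library is only a few lines; the base model and hyperparameters below are illustrative placeholders, not a recipe:

    # Illustrative LoRA setup with HF peft; model name and
    # hyperparameters are placeholders, not recommendations.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained("openlm-research/open_llama_7b")
    config = LoraConfig(
        r=8, lora_alpha=16, lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],  # classic LoRA: attention only
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(base, config)
    model.print_trainable_parameters()  # typically well under 1% of the model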

That being said, backward-pass-free training keeps making progress. It seems like only a matter of time before it becomes practical.


Look at QLoRA. QLoRA adapters can be attached to all of the model's linear layers, allowing you to alter behavior with much less data than the original LoRA setup. It seems to "stick" better.
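Roughly, that "all layers" difference looks like this with peft + bitsandbytes; the module names below are LLaMA-family specific and would differ for other architectures:

    # QLoRA-style setup: 4-bit NF4 base weights, adapters on every
    # linear layer (module names below are LLaMA-family specific).
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    bnb = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    base = AutoModelForCausalLM.from_pretrained(
        "huggyllama/llama-30b", quantization_config=bnb, device_map="auto")
    base = prepare_model_for_kbit_training(base)

    config = LoraConfig(
        r=64, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM",
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"],  # all linear layers
    )
    model = get_peft_model(base, config)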

I just fine-tuned a ~30B-parameter model on my 2x 3090s to check it out. It worked fantastically. I should be able to fine-tune up to 65B-parameter models locally, but I wanted to get my dataset right on a smaller model before trying.


Are there any repos and steps you can point to for doing this? I'd love to try exactly what you describe. I've been attempting the same thing and have run into a lot of repos with broken dependencies.


I used https://github.com/artidoro/qlora, but there are quite a few others that likely work better. It was literally my first attempt at anything like this; it took the better part of an evening to work through CUDA/Python issues to get it training, and then ~20 hours for the training run itself.
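In case it saves someone time: once training finishes, you can load the saved adapter back onto the 4-bit base model for inference. The checkpoint path and prompt format below are placeholders (whatever your run actually produced):

    # Loading a finished QLoRA adapter for inference; the checkpoint
    # directory and prompt template are placeholders.
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(
        "huggyllama/llama-30b",
        quantization_config=BitsAndBytesConfig(load_in_4bit=True),
        device_map="auto",
    )
    model = PeftModel.from_pretrained(base, "./output/checkpoint-1875")
    tok = AutoTokenizer.from_pretrained("huggyllama/llama-30b")

    prompt = "### Human: Summarize our refund policy.\n### Assistant:"
    ids = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**ids, max_new_tokens=64)
    print(tok.decode(out[0], skip_special_tokens=True))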


> is extremely crowded right now but I am not optimistic about their future

Why not? Every larger "cloud" company seems to be buying one of these startups at the moment just to be able to offer "AI", so some founders might get good acquisition deals. This is clearly one of those - a panic buy.



