
This class of startup, "build domain-specific LLMs using your own data", is extremely crowded right now, but I am not optimistic about their future. For large companies, the actual modeling work is already easy for any ML team, thanks to existing FOSS work on things like PEFT and LoRA. The hard part is figuring out what data goes into the fine-tuning process and how to get that data into a usable form, but that is very business-specific and can't be automated as a SaaS offering.

For SMBs, the value would be in using the LLM to generate responses to customer Q&A/search queries. But these companies aren't going to integrate some external third-party service; they'll only use it if it's already baked into their CMS - WordPress/Shopify/Wix/etc. I just don't see who the final consumer of this product would be.



> "build domain specific LLMS using your own data",

It seems to me that the vast majority of these people would be better off just doing semantic search: chunk the documents, run them through an embedding model, and store the vectors in a vector database, then run the search query and the retrieved results through an LLM at the final step to produce an actual "answer". For applications where this is not practical, I agree that LoRA should be the next approach. I have a hard time believing the future is everyone training their own domain-specific LLM from the ground up.
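To make that concrete, here's a minimal sketch of the chunk/embed/retrieve/prompt flow. The embedding model, chunk size, and "handbook.txt" corpus are all placeholder choices, and the final LLM call is left open:

    # Minimal retrieval-augmented QA sketch. Assumes sentence-transformers
    # is installed; model name and chunking scheme are arbitrary choices.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")

    def chunk(text, size=500):
        # Naive fixed-size chunking; real systems split on document structure.
        return [text[i:i + size] for i in range(0, len(text), size)]

    docs = chunk(open("handbook.txt").read())  # placeholder corpus
    doc_vecs = model.encode(docs, normalize_embeddings=True)  # shape (n, d)

    def build_prompt(query, k=3):
        q = model.encode([query], normalize_embeddings=True)[0]
        # Cosine similarity == dot product on normalized vectors; a real
        # vector DB (pgvector, FAISS, etc.) replaces this at scale.
        top = np.argsort(doc_vecs @ q)[-k:][::-1]
        context = "\n\n".join(docs[i] for i in top)
        return f"Answer using only this context:\n{context}\n\nQ: {query}\nA:"

    # Feed the result to whatever LLM you like for the final "answer" step.
    print(build_prompt("What is our refund policy?"))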


I wholeheartedly agree with this. Vector databases are easily updatable, searchable by recency, and let you verify where the information came from. Training a custom frozen LLM for every company seems insane. Each company's data is not that unique - it's the specific facts and figures that matter, and those belong in a vector or traditional database.


> thanks to existing FOSS work on stuff like PEFT and LoRA

YMMV. Sometimes a LoRA is fine, but sometimes a full fine-tune is necessary for higher-quality output.
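For what it's worth, attaching a plain LoRA with HF's peft library is only a few lines; the base model and hyperparameters below are illustrative placeholders, not a recipe:

    # Illustrative LoRA setup with HF peft; model name and
    # hyperparameters are placeholders, not recommendations.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained("openlm-research/open_llama_7b")
    config = LoraConfig(
        r=8, lora_alpha=16, lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],  # classic LoRA: attention only
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(base, config)
    model.print_trainable_parameters()  # typically well under 1% of the model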

That being said, backward-pass-free training keeps making progress. It seems like only a matter of time before it becomes practical.


Look at QLoRA. QLoRA adapters can be attached to all of the model's linear layers, allowing you to alter behavior with much less data than the original LoRA setup. It seems to "stick" better.
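Roughly, that "all layers" difference looks like this with peft + bitsandbytes; the module names below are LLaMA-family specific and would differ for other architectures:

    # QLoRA-style setup: 4-bit NF4 base weights, adapters on every
    # linear layer (module names below are LLaMA-family specific).
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    bnb = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    base = AutoModelForCausalLM.from_pretrained(
        "huggyllama/llama-30b", quantization_config=bnb, device_map="auto")
    base = prepare_model_for_kbit_training(base)

    config = LoraConfig(
        r=64, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM",
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"],  # all linear layers
    )
    model = get_peft_model(base, config)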

I just fine-tuned a ~30B-parameter model on my 2x 3090s to check it out. It worked fantastically. I should be able to fine-tune up to 65B-parameter models locally, but I wanted to get my dataset right on a smaller model before trying.


Are there any repos and steps you can point to for doing this? I'd love to try exactly what you describe. I've been attempting the same thing and have run into a lot of repos with broken dependencies.


I used https://github.com/artidoro/qlora, but there are quite a few others that likely work better. It was literally my first attempt at anything like this; it took the better part of an evening to work through CUDA/Python issues to get it training, and then ~20 hours for the training run itself.
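In case it saves someone time: once training finishes, you can load the saved adapter back onto the 4-bit base model for inference. The checkpoint path and prompt format below are placeholders (whatever your run actually produced):

    # Loading a finished QLoRA adapter for inference; the checkpoint
    # directory and prompt template are placeholders.
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(
        "huggyllama/llama-30b",
        quantization_config=BitsAndBytesConfig(load_in_4bit=True),
        device_map="auto",
    )
    model = PeftModel.from_pretrained(base, "./output/checkpoint-1875")
    tok = AutoTokenizer.from_pretrained("huggyllama/llama-30b")

    prompt = "### Human: Summarize our refund policy.\n### Assistant:"
    ids = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**ids, max_new_tokens=64)
    print(tok.decode(out[0], skip_special_tokens=True))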


> is extremely crowded right now but I am not optimistic about their future

Why not? Every larger "cloud" company seems to be buying one of these startups at the moment just to be able to offer "AI", so some founders might get good acquisition deals. This is clearly one of those - a panic buy.



