Could you provide more details on this matter? Specifically, I'm interested in knowing which base model you've utilized and the approach you've taken to fine-tune it. Your insights would be greatly appreciated and highly beneficial.
For narrow stuff you can do a better job than a base GPT-4/Mistral/etc. model. Fine-tune it on your very custom data, the stuff the base model didn't seem to be trained on, and it will generalize it well.
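A minimal sketch of what that kind of narrow fine-tune might look like, assuming a Hugging Face stack with LoRA adapters; the base model name, dataset file, and hyperparameters are placeholders, not anyone's actual recipe:

```python
# Hypothetical LoRA fine-tune of an open base model on a small
# domain-specific corpus. All names and settings are illustrative.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "mistralai/Mistral-7B-v0.1"   # any open base model
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Attach low-rank adapters so only a small fraction of weights train.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"))

# The "very custom data": one JSONL file with a `text` field per example.
ds = load_dataset("json", data_files="domain_corpus.jsonl")["train"]
ds = ds.map(lambda ex: tok(ex["text"], truncation=True, max_length=1024),
            remove_columns=ds.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=2,
                           learning_rate=2e-4, logging_steps=10),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()
```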
You're not wrong. There's been a lot of drama over licensing and releasing datasets, and much of the LLM scene is just pitchmen and promoters with no better grasp of what they're doing than "trust me, it's better."
Like with "prompt engineering", a lot of people are just hiding how much of the heavy lifting comes from the base models and a fluke of the merge. The past few "secret" set leaks turned out to be low- or no-delta diffs against common releases.
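Assuming "set" here means a training dataset, a minimal sketch of what a "delta diff" check could look like: hash each example in the leaked set and measure overlap against a commonly released one. File names are placeholders, not references to any real leak.

```python
# Hypothetical overlap check between a "secret" set and a public release.
import hashlib
import json

def example_hashes(path):
    """Hash every example in a JSONL file (one {"text": ...} per line)."""
    hashes = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            text = json.loads(line)["text"]
            hashes.add(hashlib.sha256(text.strip().encode()).hexdigest())
    return hashes

secret = example_hashes("secret_set.jsonl")        # placeholder path
public = example_hashes("common_release.jsonl")    # placeholder path

overlap = len(secret & public) / len(secret)
print(f"{overlap:.1%} of the 'secret' set already exists in the public release")
```

A high overlap fraction here is what "low/no delta" means in practice: the supposedly proprietary set is mostly the public data under a new name.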
I said it a year ago, but if we want to be wowed, make this a job for MLIS holders and reference librarians. Without thorough, thoughtful curation, these things are just toys in the wrong hands.