This is an ad. You'd be best served avoiding additional dependencies. At this point, you don't want to be trading off simplicity for ease. Even transformers + huggingface feels like too much bloat.
The tools and mechanisms for getting a model to do what you want are changing ever so quickly. Build and understand a notebook yourself, and reduce dependencies. You will need to switch them.
People who dislike things are just so much more vocal than people who like them. I've used Huggingface extensively. They are trying to do a lot, but it's always been the most convenient/flexible option for my finetuning use cases.
Yes, I'm a huge fan of Huggingface. There's a tendency to always distrust any company that is trying to make money. But "makes some money with some of their product offerings" != "is incapable of producing valuable resources." It's always a balance, but I think Huggingface is doing well at both being a huge resource for the NLP community and having a viable business that allows them to keep being such a resource.
The code is very enterprise oriented and reads more like Java than Python. Bootstrapping a VC backed company off of open source is a known strategy for achieving the growth needed for future funding and acquisition.
At some point, all the nice things they offer for free or cheaper will go away or become expensive.
I'm fine with the Huggingface piece, but this joins the long list of blog posts that make it to the top of HN with the message "Easily fine tune an LLM! …by tying yourself to our proprietary platform".
Is this a good test case for all of these competing open (and even closed) source LLMs:
Feed it a list of YouTube/SoundCloud quality “artists + song titles”, ask it to clean them up and figure out how to split/parse them into CSV or JSON, and then have it identify their genre.
I want to make sure I’m not being too harsh when I criticize these as useless if they can’t do this “basic” task, because I’m pretty sure I was able to get GPT-3.5 to do this reasonably well for about $0.50 with no token cost optimization.
I’m just curious why people are so infatuated and putting so much effort into all of these other open source models if they couldn’t complete this basic task.
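To make the test case above concrete, here is a minimal sketch of how it could be framed so any model (open or closed) can be scored the same way. The prompt wording, the three key names, and both function names are my own assumptions, not anything from a particular library, and the call to the model itself is deliberately left out.

```python
import json

# Assumed output schema for this sketch: one object per input line.
REQUIRED_KEYS = {"artist", "title", "genre"}


def build_prompt(raw_titles):
    """Format messy upload titles into a single instruction prompt."""
    lines = "\n".join(f"{i + 1}. {t}" for i, t in enumerate(raw_titles))
    return (
        "Clean up each line below and return a JSON array of objects "
        "with the keys artist, title, genre. Return only JSON.\n" + lines
    )


def parse_response(text, expected_count):
    """Return the parsed rows, or raise ValueError if the reply is unusable."""
    try:
        rows = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(rows, list) or len(rows) != expected_count:
        raise ValueError("expected one object per input line")
    for row in rows:
        if not isinstance(row, dict) or not REQUIRED_KEYS.issubset(row):
            raise ValueError(f"missing keys in {row!r}")
    return rows
```

A model "passes" a batch if `parse_response` doesn't raise. Whether the genres are actually right still needs a human or a labelled set to check.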
If you fine tune them to be task specific, they'll perform well. In my experience, this control loop is a better investment than "prompt engineering". (When I say task specific, I mean very task specific)
GPT4 over the API is too fine tuned, which constrains its behavior. It fails to capture nuance in instructions. When you have the bag of weights, you can actually control your model. Having actual control over the model, and understanding the infrastructure that it's running on helps you meet actual SLAs.
And it's cheaper, if you're not backed by infinite venture money.
Could you explain what you mean by fine-tune? For example, I don't have the answers for what the parsed songs + identified genres look like as JSON. You're saying I'd have to train the model with known answers, and then maybe it could predict with some accuracy going forward?
I don't see how this warrants the extra exciting popularity of LLAMA2, etc.
I still haven't found my own personal niche "good enough" test case
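On "train the model with known answers": in practice that usually means hand-labelling a batch of inputs and writing them out as prompt/answer pairs. A rough sketch of what that file could look like, assuming a JSONL layout with `prompt` and `completion` fields (those field names and both helper names are my assumption; check what your training script actually expects):

```python
import json


def to_training_record(raw_title, artist, title, genre):
    """Pair one messy input with its hand-labelled answer."""
    answer = {"artist": artist, "title": title, "genre": genre}
    return {
        "prompt": f"Parse this upload title into JSON: {raw_title}",
        "completion": json.dumps(answer, ensure_ascii=False),
    }


def write_jsonl(records, path):
    """Write one JSON object per line (the JSONL layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")
```

So yes, you do need known answers for some examples up front. The hope is that the model then generalizes to titles it hasn't seen.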
I think it depends a lot on the scale of what you're trying to do, whether it's worth it or not to invest in OSS/DIY. If you're one person looking to do a "one off" task like organizing some of your own music, then you're correct that it's probably not worth it to invest time and effort into getting an open source model to do it for you. Just pay $0.50 and be done with it! But if you want to build an app that does that for people, and you want to host it for free/cheap, the costs could add up quickly. And especially if you are a company with a language task that will have lots of users--the up front R&D cost can definitely be worth it to save on costs of usage.
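The trade-off is easy to put into numbers. A back-of-the-envelope sketch, where the function name and every input are placeholders you would fill in with your own estimates:

```python
import math


def break_even_requests(upfront_cost, api_cost_per_request, self_hosted_cost_per_request):
    """Number of requests after which self-hosting has paid back its upfront cost.

    Returns None when self-hosting is not cheaper per request, because
    then the upfront cost is never recovered.
    """
    saving = api_cost_per_request - self_hosted_cost_per_request
    if saving <= 0:
        return None
    return math.ceil(upfront_cost / saving)
```

If your expected volume is well under that number, just pay for the API.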
I was trying to argue “the open source models can’t do it in their current state”.
I’m curious what a single person like myself could do, time investment wise, to “tweak” LLAMA2 into doing a task it can’t do by default.
You can use this https://github.com/PygmalionAI/training-code
Or, for QLoRA, you can use this: https://github.com/artidoro/qlora