Tuning and Testing Llama 2, Flan-T5, and GPT-J with LoRA, Sematic, and Gradio (sematic.dev)
98 points by josh-sematic on July 26, 2023 | 22 comments



This is an ad. You'd be best served avoiding additional dependencies. At this point, you don't want to trade simplicity away for convenience. Even transformers + huggingface feels like too much bloat.

You can use this: https://github.com/PygmalionAI/training-code

Or, for QLoRA, you can use this: https://github.com/artidoro/qlora

The tools and mechanisms for getting a model to do what you want change constantly and quickly. Build and understand a notebook yourself, and reduce dependencies. You will need to switch them eventually.
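In the "understand it yourself" spirit: the core of LoRA is small enough to write from scratch. A minimal numpy sketch of a LoRA linear layer (illustrative only, not any particular library's implementation; real code would use PyTorch and train A/B by gradient descent):

```python
import numpy as np

# Minimal LoRA linear layer: the pretrained weight W is frozen;
# only the low-rank factors A and B are trained.
class LoRALinear:
    def __init__(self, in_dim, out_dim, r=8, alpha=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(out_dim, in_dim))        # frozen base weight
        self.A = rng.normal(scale=0.01, size=(r, in_dim))  # trainable, small init
        self.B = np.zeros((out_dim, r))                    # trainable, zero init
        self.scaling = alpha / r

    def forward(self, x):
        # y = x W^T + scaling * (x A^T) B^T -- adapter runs alongside the base
        return x @ self.W.T + self.scaling * (x @ self.A.T) @ self.B.T

    def trainable_params(self):
        return self.A.size + self.B.size

layer = LoRALinear(in_dim=1024, out_dim=1024, r=8)
# 16,384 trainable adapter params vs. 1,048,576 frozen base params (~64x fewer)
print(layer.trainable_params(), layer.W.size)
```

Because B starts at zero, the adapted layer initially behaves exactly like the frozen base layer, which is why LoRA training is stable from step one.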


Huggingface + Transformers has been, since at least 2018, the atlas holding up the rest of the NLP community and pretty much all of AI.

Their unwavering commitment to open-source should be celebrated by all tech enthusiasts. Not sure why people poo-poo on them.


People who dislike things are just so much more vocal than people who like them. I've used Huggingface extensively; they are trying to do a lot, but it's always been the most convenient/flexible option for my fine-tuning use cases.

Thank you Huggingface!


Yes, I'm a huge fan of Huggingface. There's a tendency to distrust any company that is trying to make money. But "makes some money with some of their product offerings" != "is incapable of producing valuable resources." It's always a balance, but I think Huggingface is doing well at both being a huge resource for the NLP community and having a viable business that allows it to keep being such a resource.


The code is very enterprise-oriented and reads more like Java than Python. Bootstrapping a VC-backed company off of open source is a known strategy for achieving the growth needed for future funding and acquisition.

At some point, all the nice things they offer for free or cheaper will go away or become expensive.


Bloated does not mean bad. It means bloated. Which, for my purposes, makes it not the best choice.


I'm fine with the Huggingface piece, but this joins the long list of blog posts that make it to the top of HN with the message "Easily fine-tune an LLM! …by tying yourself to our proprietary platform"


FWIW, this is not proprietary, it's all FOSS. And hopefully there's something interesting in there even if you don't use any of the tools mentioned.


Fair enough—I was too lazy to click through to Sematic's site and see that it is indeed FOSS!


lol, qlora and pygmalion both wrap huggingface


yes, unfortunately :(

did you know that the adapter weights are applied on demand, at inference time, inside the LoRA forward pass function?

leaky, leaky, leaky

GGML will save us, surely


For the Pygmalion thing, what should we use for the LoRA parameters?


There is too much overloading of terms these days. I saw LoRA, thought LoRa, and wondered why someone would spell GNU Radio as Gradio.


Only 2 hard problems in computer science: (1) cache invalidation (2) naming things (3) off-by-one errors :-D


Is this a good test case for all of these competing open-source (and even closed-source) LLMs:

feed it a list of YouTube/SoundCloud-quality “artist + song title” strings and ask it to clean them up, figure out how to split/parse them into CSV or JSON, and then identify each genre

I want to make sure I’m not being too harsh when I criticize these as useless if they can’t do this “basic” task, because I’m pretty sure I was able to get GPT-3.5 to do it reasonably well for about $0.50 with no token-cost optimization

I’m just curious why people are so infatuated with, and putting so much effort into, all of these other open-source models if they can’t complete this basic task.


> I’m just curious

- Some people are committed to open source.

- Some people want to play with/learn/modify the technology, not just use it instrumentally.

- Some people want to play with these models without the surveillance that comes with renting them on OPC.

I'm sure there are other reasons I'm not thinking of, but I'm in the middle of that particular Venn diagram.


If you fine tune them to be task specific, they'll perform well. In my experience, this control loop is a better investment than "prompt engineering". (When I say task specific, I mean very task specific)

GPT-4 over the API is too fine-tuned, which constrains its behavior. It fails to capture nuance in instructions. When you have the bag of weights, you can actually control your model. Having actual control over the model, and understanding the infrastructure it's running on, helps you meet actual SLAs.

And it's cheaper, if you're not backed by infinite venture money.

https://arxiv.org/abs/2307.13269


Could you explain what you mean by fine-tune? For example, I don't have ground-truth answers for what the songs, parsed out with genres identified, should look like as JSON. You're saying I'd have to train the model with known answers, and then maybe it could predict with some accuracy going forward?
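(For concreteness, I imagine the "known answers" would be labeled input/output pairs like these, usually serialized as JSONL for fine-tuning scripts. These records are hypothetical examples I made up, not a real dataset:)

```python
import json

# Hypothetical instruction-tuning records for the song-parsing task:
# each one pairs a messy raw title with the structured answer we want.
examples = [
    {
        "input": "Lo-fi Beats - Midnight Rain (Official Audio) [HD]",
        "output": {"artist": "Lo-fi Beats", "title": "Midnight Rain", "genre": "lo-fi"},
    },
    {
        "input": "A$AP Ferg ft. Nicki Minaj – Plain Jane REMIX",
        "output": {"artist": "A$AP Ferg", "title": "Plain Jane (Remix)", "genre": "hip-hop"},
    },
]

# One JSON object per line (JSONL), the common format for fine-tuning data
jsonl = "\n".join(json.dumps(e) for e in examples)
print(len(jsonl.splitlines()))  # 2 records
```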

I don't see how this warrants the extra excitement around Llama 2, etc.

I still haven't found my own personal niche "good enough" test case



I think it depends a lot on the scale of what you're trying to do, whether or not it's worth it to invest in OSS/DIY. If you're one person looking to do a "one off" task like organizing some of your own music, then you're correct that it's probably not worth it to invest time and effort into getting an open source model to do it for you. Just pay $0.50 and be done with it! But if you want to build an app that does that for people, and you want to host it for free/cheap, the costs could add up quickly. And especially if you are a company with a language task that will have lots of users, the up-front R&D cost can definitely be worth it to save on usage costs.


I was trying to argue “the open source models can’t do it in their current state”

I’m curious what could be done, time-investment-wise, by a single person like myself to “tweak” Llama 2 into being able to do a task it can’t do by default


This was the first explainer of LoRA that actually "clicked" for me.



