Beginner's Guide to Llama Models

jmorgan · 2023-08-12T07:56:22

An easy way to try many of the fine-tuned Llama 2 models is https://github.com/jmorganca/ollama.

A maintainer of the project has been collecting a full list here (with different quantization levels), most of which are Llama 2-based: https://gist.github.com/mchiang0610/b959e3c189ec1e948e4f6a1f...

Since the release of Llama 2, the number of models based on it has been growing significantly. Some popular ones:

- codeup (A code generation model - DeepSE)

- llama2-uncensored (George Sung)

- nous-hermes-llama2 (Nous Research)

- wizardlm-uncensored (WizardLM)

- stablebeluga (Stability AI)

The article also recommends oobabooga's text-generation-webui which includes a full web dashboard.

xcdzvyn · 2023-08-12T08:36:52

Wow, thanks! I really like Ollama, so I'm glad the models listed at the top of the Readme aren't all you could use. Do you know where the data in that Gist came from? Is there a registry somewhere?

jmorgan · 2023-08-12T09:37:58

There is! While not easy to use yet, there's a sort-of-hidden way models can be listed with:

  curl https://ollama.ai/v2/_catalog | jq

Then to list "tags" for a given model (e.g. llama2):

  curl https://ollama.ai/v2/library/llama2/tags/list | jq
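The same two lookups can be scripted. Here is a minimal Python sketch, assuming the endpoints return JSON shaped like the Docker Registry v2 API (a `repositories` array for `_catalog`, a `tags` array for `tags/list`). That response shape is an assumption rather than something confirmed in this thread, and the helper names are made up for illustration:

```python
import json
import urllib.request

BASE_URL = "https://ollama.ai/v2"  # same host as the curl commands above


def catalog_url(base=BASE_URL):
    """URL that lists every model (repository) in the registry."""
    return f"{base}/_catalog"


def tags_url(model, base=BASE_URL):
    """URL that lists the tags of one library model, e.g. "llama2"."""
    return f"{base}/library/{model}/tags/list"


def parse_catalog(payload):
    # ASSUMPTION: payload looks like {"repositories": ["library/llama2", ...]}
    return sorted(payload.get("repositories", []))


def parse_tags(payload):
    # ASSUMPTION: payload looks like {"name": "...", "tags": ["latest", ...]}
    return sorted(payload.get("tags", []))


def fetch_json(url, timeout=10):
    """GET a URL and decode its JSON body (performs a network request)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `parse_tags(fetch_json(tags_url("llama2")))` would then be the scripted counterpart of the second curl command, provided the response really has that shape.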

xcdzvyn · 2023-08-13T05:30:39

Oh, you're the maintainer! Hahaha my bad. Thanks :)

danielbln · 2023-08-12T08:41:52

llama2-uncensored is a lot of fun, interacting with an unaligned/uncensored model feels very fresh.

thatguymike · 2023-08-12T06:55:44

I find the whole site being covered in images of skinny waifu girls... offputting. The guide itself seems fine, if high-level. [This](https://chat.lmsys.org/?leaderboard) is a really nice link I hadn't seen before.

raincole · 2023-08-12T07:42:02

I don't mind waifu girls, but I really don't know how these images are related to Llama. It's not an article about Stable Diffusion right...?

Huge content farm vibe.

Edit: I read the article carefully. Yeah, not just content farm vibe. It's a content farm.

> What can you do with Llama models?

> You can use Llama models the same ways you use ChatGPT.

> Chat. Just ask questions about things you want to know.

> Coding. Ask for a short program to do something in a specific computer language.

> Outlines. Giving an outline of certain technical topics.

> Creative writing. Let the model write a story for you.

> Information extraction. Summarize an essay. Ask specific questions about an essay.

> Rewrite. Write your paragraph in a different tone and style.

Obvious padding content for SEO. Flagged.

smeej · 2023-08-12T12:31:21

I only got a couple paragraphs in before thinking, "This was written by a fluff AI, and probably not even Llama." I think Llama has better English proficiency.

iinnPP · 2023-08-12T11:59:55

Maybe they just like anime a lot? Hanlon's razor and all.

mufti_menk · 2023-08-12T07:22:55

I found them beautiful

cypress66 · 2023-08-12T07:25:30

Yeah, a lot of them are very cool.

brabel · 2023-08-12T07:35:23

Same here... the images are beautiful but seem completely unrelated to the topic (other than being AI-generated - but they could've generated images that have some sort of relation to the topic instead!). It kind of shows what the author has been using the AI for, I suppose :P.

kristofferg · 2023-08-12T07:41:31

I would usually agree, but in this case it's a pun on “models”. I find it quite funny.

brabel · 2023-08-12T08:40:16

Oh I see... and a "model" is always young, skinny, white and female I guess (cannot blame the AI though, I guess that is what you get if you look at the internet as a whole as your training data).

EDIT: I had to look up "waifu" and it seems the author probably prompted for "waifu model"(?)... according to Wikipedia, for those like me who are not into that sort of thing:

"A Waifu is an illustrated female character from an anime or any non-live action media in which an individual becomes sexually attracted to."

kristofferg · 2023-08-12T08:48:28

You are right in that it is neither high-brow nor PC. :)

mhaberl · 2023-08-12T08:26:27

> skinny waifu girls

I think the idea of the images was a llama (the animal) and a model (the girl) - Llama models

> The guide itself seems fine

The piece is informative for those new to the subject, but not much beyond that.

bdavbdav · 2023-08-12T07:38:00

In my presentations about it at work, I’ve been generating pictures of my dog doing things to illustrate various things. No different I suppose.

Anon4Now · 2023-08-12T07:46:43

I thought they were a bit over the top, but what really caught my eye was how creepy looking that first llama is.

milar · 2023-08-12T07:46:47

This is literally AI-generated text and gen’d images.

HN isn’t prepared yet.

Prepare for 100x more of these.

raincole · 2023-08-12T07:50:09

Checked the submitter's history. Clearly "spam my content farm posts on HN and hope one of them gets lucky".

ulnarkressty · 2023-08-12T08:00:43

I tried multiple flavors of Llama models; they are all quite dumb, even the 70B-parameter one. It knows about more things that the smaller models just hallucinate when asked, but it still cannot do even slightly more complex tasks.

I'm also not sure about the current testing methodologies, i.e. the 'passed the SAT' hype. Given that the training set already contains much of the information, we should probably compare the AI results with humans having unlimited time and access to the required material.

smcin · 2023-08-12T22:19:37

Post us some sample prompts and answers please.

cowthulhu · 2023-08-12T06:33:40

Does anyone know how well a fine-tuned Llama model will do compared to GPT 4 on complex tasks?

soultrees · 2023-08-12T07:15:26

From what I understand, it’s not really close for anything involving creative reasoning, but for basic instruct it is on par.

milar · 2023-08-12T07:49:01

GPT4 exceeds all of them.

Wait till next year, as the Brooklyn Dodgers said.

yu3zhou4 · 2023-08-12T07:47:29

Can you recommend other systematized guides like this about LLMs and ML in general? Even though it was quite limited in info, I like the structured and concise form, and I’d be happy to learn about similar blog posts.

jmorgan · 2023-08-12T08:06:46

I really enjoyed Andrej Karpathy's llama2.c project (https://github.com/karpathy/llama2.c), which runs through creating and running a miniature Llama 2 architecture model from scratch.

marcopicentini · 2023-08-12T08:47:07

Can I train a Llama 2 model on custom data and make it an expert on my data?

For example, if I give it thousands of pages of law, will it become a domain expert on law?

avion23 · 2023-08-12T08:27:39

I've evaluated some models. The best ones are based on Llama 2. A good one is, for example, codeup-llama-2-13b-chat-hf.

A (partially) uncensored one is nous-hermes-llama2-13b.

tony12345678 · 2023-08-17T21:16:33

As a Mac user, I think Ollama is amazing. Thank you! :). Is there any chance that functionality could be added for fine-tuning (e.g. document/text file uploads)?

crossroadsguy · 2023-08-12T09:27:00

What's a good and friendly place for a seasoned Android developer to start getting a peek into the world of AI/ML as we are seeing it today (or specifically over the last few months, maybe)? I'm kind of having FOMO cum curiosity (and a little worry about career/future and all) about all this, and I'd like to get a taste of things.

jbjbjbjb · 2023-08-12T08:48:19

I tried running a llama model using oobabooga but kept running into one problem after the next.

Anyone know of a config that might work (even quite slowly) on an i9 laptop with 32 GB of RAM and an 8 GB NVIDIA GPU?

MahdeenSky · 2023-08-12T07:02:15

+1 uncensored models

neilv · 2023-08-12T07:36:26

> How to install Llama models?

> See the installation guide for Windows and the installation guide for Mac.

Much "open" LLM/SD/etc. grassroots stuff seems to be shooting itself in the face by pushing others to closed platforms.

Now is one of the times to be increasing pressure for open platforms, not backsliding.

abwizz · 2023-08-12T07:41:03

yes, they are, but it seems a good fit nonetheless

speedgoose · 2023-08-12T06:24:19

What are the hardware requirements to fine-tune Llama?

bart__ · 2023-08-12T09:56:27

~20 GB of VRAM for the 7B model and 48 GB for the 13B model. It depends on the context size as well. I'd recommend renting a 4090 from a cloud provider like RunPod or Vast.ai to get started, using a PEFT tutorial.
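For a feel of why PEFT is the usual starting point, here is a rough sketch of the trainable-parameter count for LoRA, one common PEFT method. The setup is an assumption picked for illustration (rank-8 adapters on only the query and value projections of a 7B model with hidden size 4096 and 32 layers), and the function name is hypothetical:

```python
def lora_trainable_params(d_in, d_out, rank):
    """LoRA learns the update to a d_out x d_in weight as the product of
    B (d_out x rank) and A (rank x d_in); only A and B are trained."""
    return rank * (d_in + d_out)


# ASSUMED setup, for illustration only.
HIDDEN = 4096           # hidden size of Llama 2 7B
LAYERS = 32             # transformer layers in Llama 2 7B
RANK = 8                # LoRA rank (a small, commonly used value)
TARGETS_PER_LAYER = 2   # adapt only the query and value projections

per_matrix = lora_trainable_params(HIDDEN, HIDDEN, RANK)
total = per_matrix * TARGETS_PER_LAYER * LAYERS

print(f"trainable LoRA parameters: {total:,}")
print(f"share of a 7e9-parameter model: {total / 7e9:.4%}")
```

Under these assumptions only a few million parameters are trained, so gradients and optimizer state are kept for a tiny fraction of the model; the frozen base weights and activations still have to fit in memory.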

speedgoose · 2023-08-12T11:33:08

Thanks. What about the 70B model? I assume a 4090 will not be enough. Do the system requirements scale linearly?

bart__ · 2023-08-12T11:40:33

The 4090 only has 24 GB and will only be able to fine-tune (and merge, which is more memory-intensive) the 7B model. The RTX6000 with 48 GB is able to fine-tune the 13B model. The 70B model presumably needs multiple GPUs, like 4 RTX6000. For people starting out, you can also use a free GPU from Google Colab to fine-tune a 7B model. Fine-tuning 70B gets more expensive, and I would suggest trying smaller models first with a high-quality dataset.

It is mostly linear, I think.
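Taking "mostly linear" at face value, the two figures quoted earlier in this thread (~20 GB for 7B, 48 GB for 13B) can be scaled proportionally to 70B. This is a back-of-the-envelope sketch that assumes memory is simply proportional to parameter count and ignores context size and multi-GPU overhead; the function name is hypothetical:

```python
# Data points quoted in the thread: (billions of parameters, GB of VRAM).
KNOWN = [(7, 20.0), (13, 48.0)]


def scale_vram(target_b, known_b, known_gb):
    """Proportional scaling: keep the same GB-per-billion-parameters ratio."""
    return known_gb * target_b / known_b


# One estimate per known data point; they disagree, so report the range.
estimates = [scale_vram(70, b, gb) for b, gb in KNOWN]
low, high = min(estimates), max(estimates)

print(f"70B estimate: roughly {low:.0f} to {high:.0f} GB of VRAM")
```

Four 48 GB cards add up to 192 GB, so this crude estimate lands in the same ballpark as the multi-GPU guess, if somewhat higher. Both are best read as rough figures.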

speedgoose · 2023-08-12T12:33:44

Thanks. My plan is to use this research cluster: https://www.ex3.simula.no/resources

I will probably practice fine-tuning on the small model, but I don’t really need to use a worse model to save money.