Hardly. I've played a lot with the 7B, 13B, and 30B LLaMAs as well as the 7B and 13B Alpacas fine-tuned by Stanford. They do not have emergent abilities like being able to generate rhymes or, say, represent a movie plot as emoji. Even OpenAI's old text-davinci-003 (GPT-3.5, but text completion, not the chat models) far outperforms them. That said, I have hopes for a 65B 3-bit quantized, Alpaca-fine-tuned model. We'll see when someone spends the money to do the (more costly) 65B training. The Alpacas are also much more likely to go off the rails and start regurgitating their fine-tuning inputs. Either that, or OpenAI is doing a lot of post-processing on their end to hide the same problems in their LLM.
For now my IRC bots run the Alpaca 7B 4-bit. 13B was not a significant improvement for twice the computational time. But it's best to learn these models now, because as soon as OpenAI gets sued for the first time, all the Turing-test-passing older models without the legal butt-covering bolted on will be removed.
For me the easiest comparison between models is to give them an absurd but entirely possible request, like "Write me a fanfic where the Animorphs battle the truck from Duel, but in the style of Mark Twain". So far nothing else I've tried has done even as well as GPT-3.5, let alone GPT-4.
> Can you give separate summaries of the Animorphs book series and the film Duel?
The reply:
> Animorphs Book Series Summary: The Animorphs series, written by K.A. Applegate, is a popular young adult science fiction book series spanning 54 books and several companion novels. The story revolves around five teenagers—Jake, Rachel, Tobias, Cassie, and Marco—who are granted the power to morph into any animal they touch by a dying alien named Elfangor. The teens, known as the Animorphs, use their new abilities to battle an evil alien race called the Yeerks, who are secretly invading Earth by taking control of humans' minds. Throughout the series, the Animorphs learn about the complexities of war, friendship, and sacrifice as they try to save humanity from the Yeerk invasion.
> Duel Film Summary: Duel is a 1971 television movie directed by Steven Spielberg and based on the short story by Richard Matheson. The film follows a mild-mannered traveling salesman named David Mann (played by Dennis Weaver), who unwittingly finds himself in a deadly game of cat and mouse on a remote California highway. Mann is pursued by a mysterious and relentless tanker truck driver, who seems determined to kill him for no apparent reason. With escalating tension and suspense, the film delves into themes of fear, paranoia, and the primal fight for survival. Duel is notable for being Spielberg's directorial debut, showcasing his skillful storytelling and unique visual style that would later become hallmarks of his career.
Duel is a great film but requires patience compared to a lot of today's films. There's not much action for quite some time -- just a man driving through the desert. Even when things pick up, it's still slow pacing by today's standards. It's from another era, and I doubt anyone would make a film today with the same pacing.
Tried the same question with the 30B LLaMA and got the following:
The Animorphs book series was a young adult science fiction series written by Katherine Applegate and her husband Michael Grant, under the name K. A. Applegate. The series is made up of fifty-four books, first published between June 1996 and May 2001. The series revolves around five children given the power to transform into animals (the ability to "morph") as they are thrust into a war with an alien species called Yeerks that are determined to take over Earth.
Duel is a television film directed by Steven Spielberg and written by Richard Matheson, based on his short story of the same name. It stars Dennis Weaver as David Mann, a middle-aged salesman driving on a business trip. While traveling down a two lane highway through remote California desert, he is taunted in a life or death duel by the unseen driver of a massive and monstrous Peterbilt 281 tanker truck.
----
You have to remember to prompt it right. Like ChatGPT, it's based on autocompletion, but with ChatGPT a lot happens in the background before your text is sent to the model. My prompt and settings here were:
---
Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.
User: Hello, Bob.
Bob: Hello. How may I help you today?
User: Please tell me the largest city in Europe.
Bob: Sure. The largest city in Europe is Moscow, the capital of Russia.
User: Can you give separate summaries of the Animorphs book series and the film Duel?
----
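That scaffolding is most of the trick: the front end re-feeds the whole transcript (the preamble plus every turn so far, ending with a dangling "Bob:") and lets the model complete the assistant's line. Here's a minimal sketch of how a front end might assemble that text before handing it to the model (build_prompt is a hypothetical helper, not llama.cpp's actual code):

    // Minimal sketch of prompt assembly for a completion-only model.
    // Hypothetical helper, not llama.cpp's actual code.
    #include <string>
    #include <utility>
    #include <vector>

    std::string build_prompt(const std::string &preamble,
                             const std::vector<std::pair<std::string, std::string>> &history,
                             const std::string &user_message) {
        std::string prompt = preamble + "\n";
        for (const auto &[user_turn, bob_turn] : history) {
            prompt += "User: " + user_turn + "\n";
            prompt += "Bob: "  + bob_turn  + "\n";
        }
        prompt += "User: " + user_message + "\n";
        prompt += "Bob:";   // trailing tag makes the model complete Bob's reply
        return prompt;
    }

ChatGPT presumably does a more elaborate version of the same wrapping on the server side, which is part of what happens in the background before your text reaches the model.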
How exactly do you get it to keep going? Every time I try a prompt like this in the playground it spits out a couple paragraphs and then refuses to generate anything further, even with tokens maxed out.
I never used the OpenAI playground; see if Poe.com will work for you. They give free access to GPT-3.5 and Claude, Anthropic's competitor, with a paid subscription for GPT-4 and Claude+.
That is my experience as well. I've tried various models but nothing comes even close to the current ChatGPT implementation (when it manages to stay up).
Highly recommend Animorphs. As an adult you can read through the books pretty quickly, but the story really holds up. It deals with some heavy topics - genocide, enslavement (even exploring the justifications of the enslavers), and PTSD, among others - but was still considered a kids' book.
It becomes obvious in the middle of the series when some of the books were written by ghostwriters, but the books are so easy to read that I don't really recommend skipping them. If you must, you could probably get away with reading the first ten and the last ten, but you should definitely read all of the Chronicles books.
Maybe the most striking bit of its exploration of the morality of war is that the series spends something like 45 books building up the alien Andalites as beneficent do-gooders who the team of protagonists expect to swoop in and fix everything (including one of the protagonists who is a young Andalite stranded on Earth), and then it turns out that they've been as morally worn down by the ongoing interstellar war as the protagonists have in their guerrilla conflicts on Earth and are effectively as hostile to humans as the original invaders.
> If you must you could probably get away with reading the first ten, last ten
Not sure that you can. If you were to skip any, probably only 31 through 39 are completely skippable, maybe some of the late 20s, but I would still read 29 and 30 at a minimum. Some of the teens and 20s might be skippable after 13, but there's a fair amount of world-building outside the Chronicles series in the 20s; and 40 onwards is setting up the endgame and then the endgame itself. 41 and 48 are both weird but also kind of key to finalizing the characters of the two cover characters in the endgame.
EDIT: actually 33 and 38 shouldn't be skipped either. They're Tobias and Ax books, and there are so few of those that they're all kind of essential, though maybe the Tobias books are just a little bit more essential.
I'm impressed by how ChatGPT-like it is, but it's also saying things like
“No, OpenAI does not have an API for dogs. They do, however, have an API for other animals, such as cats. To retrieve an image of a cat, you can use the OpenAI API for Dogs API and select the cat breed or type.”
I've tried all sorts of parameters including those exact ones. As for the huggingface stuff, it's not exactly clear how to use it without going down the python dependency rabbit hole. I am not confident I could get the correct python packages all together on Debian 11 to support running it. The llama.cpp stuff is very simple to compile and run comparatively.
> the alpaca 7B _4-bit_ [and presumably also 4bit for the 13B, 30B and larger parameter sets]
This is the wild card here, though, isn't it? OpenAI's ChatGPT likely uses more than 4 bits for its parameters. IIRC the original LLaMA params were 16-bit floats and they were quantized down to 4-bit. Considering that large amount of compression, they still do pretty OK, but not as good as ChatGPT. I wonder how the Alpaca/LLaMA models would do with 16-bit floating-point params (as they were originally trained)? What if they had gone with 8 bits for the params as a compromise?
EDIT: Come to think of it, unless you're using vectorized ops on a CPU, 4-bit and 8-bit math is going to run at the same speed (for most popular CPUs), is it not? So why did they go all the way down to 4 bits instead of stopping at 8 bits (other than to make the param files half the size)?
EDIT2: Looking through the alpaca.cpp code, there is mention of AVX, AVX2, AVX512 (and NEON on ARM), so it probably is taking advantage of vectorized ops where that's possible.
Not an expert on the matter so take this with a grain of salt, but I’d say the compression is also about VRAM/RAM, which seems to be the more limiting factor over inference speed.
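To make the memory point concrete, here's a simplified sketch of the kind of blockwise 4-bit scheme these .cpp ports use (an illustration in the spirit of ggml's Q4_0, not its exact layout): two weights per byte plus one float scale per block of 32, i.e. roughly 5 bits per weight here versus 16 for fp16, with plain round-to-nearest rounding.

    // Simplified blockwise 4-bit quantization sketch (illustrative only,
    // not the exact ggml Q4_0 layout): 32 weights share one float scale
    // and are packed two per byte, ~5 bits/weight vs 16 for fp16.
    #include <algorithm>
    #include <cmath>
    #include <cstdint>

    struct Block4 {
        float        scale;        // per-block scale factor
        std::uint8_t packed[16];   // 32 weights, two 4-bit values per byte
    };

    Block4 quantize_block(const float *w /* 32 floats */) {
        Block4 b{};
        float amax = 0.0f;
        for (int i = 0; i < 32; ++i) amax = std::max(amax, std::fabs(w[i]));
        b.scale = amax / 7.0f;                      // map [-amax, amax] onto [-7, 7]
        const float inv = b.scale != 0.0f ? 1.0f / b.scale : 0.0f;
        for (int i = 0; i < 32; i += 2) {
            // round-to-nearest, then bias by 8 so each nibble is unsigned (0..15)
            int lo = std::clamp((int)std::lround(w[i]     * inv), -7, 7) + 8;
            int hi = std::clamp((int)std::lround(w[i + 1] * inv), -7, 7) + 8;
            b.packed[i / 2] = (std::uint8_t)(lo | (hi << 4));
        }
        return b;
    }

    float dequantize(const Block4 &b, int i) {      // recover weight i (0..31)
        int nibble = (b.packed[i / 2] >> ((i % 2) * 4)) & 0x0F;
        return (nibble - 8) * b.scale;
    }

With AVX2/NEON the nibbles get unpacked and multiply-accumulated in wide registers, so in practice inference tends to be limited by how fast the weights stream out of RAM rather than by the low-bit arithmetic itself.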
I think it’s going to be like movie special effects.
When Jurassic Park first came out, or even something like Star Trek: The Next Generation, it looked AMAZING. So, so realistic. But then, as time went on, new things showed us what realistic could be.
I think we actually got better at seeing.
Same thing here. The more time you spend with it, the more you notice things that don't quite work. And then the new thing solves those problems, but we'll find more wrongness.
The problem is that you've identified two distinct and non-overlapping sets of people as "everyone". The people applauding 3.5 when it came out were industry hype people. Even the critical voices were industry hype people, paid to assume the AI is powerful and write about the possible negative consequences of that assumption.
Now we’ve all gotten familiar with 3.5, and we’ve come to understand its limitations, so the public knows it’s not a “godlike” AI.
Luckily there’s a fresh new model, not technically different from the earlier one but it cost more money to build. The hype group can start again, citing the publicly known limitations of 3.5. But in 6 months we’ll understand what’s wrong with it, and the public will be talking about the limitations, just in time for 4.5.
In my personal testing, I throw some sophisticated use cases at LLMs - particularly chain-of-thought reasoning. None of the models out there are able to do this well, except for the OG GPT-3 Davinci-003. Even the newer turbo models are not as good.
I am playing around with GPT-4 this week though. Let’s see how that goes.
The newer turbo models are the ChatGPT models, and are worse than text-davinci-003, in my experience. The gpt-4 model is also not as good as the GPT-4 chat version, which is very odd.
FWIW, 7B is totally useless for the subset of non-English languages I've used; 13B a bit less so, but nowhere near as good as GPT.
GPT's performance in non-trivial translation tasks is unbelievable. All those articles mentioning jobs that are going to be replaced fail to mention that translators are probably going to be the first.
You need the translators to QA the output from GPT. It's less work but not much less, and more types of translation work becomes feasible when leveraging GPT. I'm guessing the job market for translators will grow, not decline.
"Specifically, GPTQ can quantize GPT models with 175 billion parameters in approximately four GPU hours, reducing the bitwidth down to 3 or 4 bits per weight, with negligible accuracy degradation relative to the uncompressed baseline."
This would be 175 billion 3-bit weights instead of 175 billion 16-bit (or even 32-bit!) weights. It massively reduces the size of the model and makes loading it into RAM on consumer computers feasible. The number of parameters stays the same.
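Back-of-the-envelope, ignoring the small per-block scale overhead that quantized formats add:

    175 \times 10^{9} \times 16\ \text{bits} \approx 350\ \text{GB}, \qquad 175 \times 10^{9} \times 3\ \text{bits} \approx 65.6\ \text{GB}

so the weights drop from roughly 350 GB in fp16 to about 66 GB at 3 bits.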
I've read the paper and to be honest I'm not sure what to make of it. Their headline benchmark is perplexity on WikiText2, which would not be particularly relevant to most users. If you look at the tables in appendix A.4 with some more relevant benchmarks, you'll sometimes find that straight RTN 4-bit quantisation beats both GPTQ and even the full 16-bit original! No explanation for this is given in the paper.
Some of those benchmarks have a pretty small sample size IIRC; it might just be coincidence that the noise introduced by RTN happens to slightly improve them.
GPTQ beats RTN on almost every benchmark at almost every size, though.
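For context, "straight RTN" (round-to-nearest) just picks a scale per group of weights and rounds each weight to the nearest level independently, whereas GPTQ quantizes weights one at a time and uses second-order information to adjust the not-yet-quantized weights to compensate for the rounding error. One common symmetric form of RTN at b bits (group sizes and zero-point handling vary between implementations):

    s = \frac{\max_i |w_i|}{2^{b-1}-1}, \qquad \hat{w}_i = s \cdot \operatorname{clamp}\!\left(\operatorname{round}\!\left(\frac{w_i}{s}\right),\, -(2^{b-1}-1),\, 2^{b-1}-1\right)

At b = 4 that's a grid of integer multiples of s from -7s to 7s, which is essentially what the 4-bit sketch above does per block.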
Hm. I haven’t tried the local installs yet. However, when the Alpaca web demo was live, I did find it to be comparable (though not quite as capable) to davinci-003. It answered arbitrary factual questions about pop culture references, law, medicine, and programming. It generated rhymes and poems. (I didn’t try asking for the emoji thing, so can’t say anything about that.) It performed natural language tasks such as information extraction and summarization. And it did all of it coherently.
Be aware this file is a single ~8 GB 4-bit model (ggml-alpaca-13b-q4.bin) instead of the two ~4 GB parts (ggml-model-q4_0.bin, ggml-model-q4_0.bin.1) that most llama.cpp-style inference programs expect. You'll probably have to edit the line,
n_parts = LLAMA_N_PARTS.at(hparams.n_embd);
in chat.cpp (or main.cpp) to hard-code it so this one-file model is handled properly, like,
n_parts = 1;
Or rewrite the parameter-loading code to recognize and handle a non-standard weights file.
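A minimal sketch of that second option (count_model_parts is a hypothetical helper, not something in llama.cpp/alpaca.cpp): probe for which part files actually exist next to the model and derive n_parts from that instead of from the n_embd lookup.

    // Hypothetical helper, not part of llama.cpp/alpaca.cpp: count model
    // part files by probing "<model>", "<model>.1", "<model>.2", ...
    // A single-file model like ggml-alpaca-13b-q4.bin yields n_parts == 1.
    #include <fstream>
    #include <string>

    static int count_model_parts(const std::string &base_path) {
        int n_parts = 0;
        for (int i = 0; ; ++i) {
            std::string path = base_path + (i == 0 ? "" : "." + std::to_string(i));
            if (!std::ifstream(path, std::ios::binary).good()) break;
            ++n_parts;
        }
        return n_parts > 0 ? n_parts : 1;   // fall back to 1 if probing fails
    }

Then n_parts = count_model_parts(...) can replace the LLAMA_N_PARTS.at(hparams.n_embd) lookup; the variable holding the model path differs between chat.cpp and main.cpp, so treat this as a sketch rather than a drop-in patch.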
This sounded like a really cool idea, but I asked ChatGPT to do this for the plot of the movie The Shawshank Redemption, and there is no way I would ever have been able to guess that movie from the emojis it gave me. Perhaps GPT-4 does a better job at it.