And this is only the beginning; I don't believe for a second that "AI" will belong to the big corporations.
These LLMs gain no benefit from running "in the cloud" except processing power. There are lots of disadvantages, though, especially around data safety, "leaked chats to other users", privacy, bans, etc.
We really need one of two things to happen. Either the models somehow become able to run in regular CPU DRAM, or GPU makers finally give us sensible amounts of VRAM. It's a travesty that a card I bought in 2016 still has more VRAM than many of the flagships being sold today. This card is seven years old, for goodness' sake!
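For what it's worth, quantized models can already run entirely in CPU DRAM today. Here's a minimal sketch, assuming the llama-cpp-python bindings are installed and you have a 4-bit quantized model file (the path below is just a placeholder):

```python
# Minimal sketch of CPU-only inference with a quantized model, assuming the
# llama-cpp-python bindings (pip install llama-cpp-python) and a quantized
# model file at the path below (placeholder path, not a real download).
from llama_cpp import Llama

# A 4-bit quantized 7B model fits in roughly 4-5 GB of ordinary system RAM,
# so it runs entirely in CPU DRAM with no GPU involved.
llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin", n_ctx=512)

output = llm("Q: What is the capital of France? A:", max_tokens=32, stop=["\n"])
print(output["choices"][0]["text"])
```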
I think the biggest issue with VRAM right now is that it's very expensive compared to regular system memory, perhaps with the exception of DDR5 for the latest CPUs.
But there should also be absolutely no issue in making a commercial TPU for inference, like Google has internally, with more (but less expensive) RAM, and selling it. There must surely be a market now with these new models.
I think part of the issue is that consumers have never needed more than 12 GB of VRAM before. Game developers just don't have requirements that high. Crypto mining also doesn't require that much.
Now that there's clear demand in the hobbyist market for GPUs with >100 GB of VRAM, it's more likely that manufacturers will step up with cheaper solutions.
It all comes down to the VRAM. The average person will never have the money to buy a single 80 GB H100, let alone a DGX server or its future equivalents.
You get fundamentally more power when you add more VRAM, in ways that are just hard to explain to folks outside of this ecosystem. Everything around the VRAM is basically small details in comparison.
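For anyone outside the ecosystem, here's a rough back-of-envelope sketch of why the VRAM number dominates (weights only; activations and the KV cache add more on top):

```python
# Back-of-envelope memory estimate for holding model weights, assuming
# memory ~= parameter_count * bytes_per_parameter. This ignores activations,
# KV cache and framework overhead, which all add to the total.
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9  # decimal GB

for params in (7, 13, 30, 65):
    fp16 = weight_memory_gb(params, 16)
    q4 = weight_memory_gb(params, 4)
    print(f"{params}B params: ~{fp16:.0f} GB at fp16, ~{q4:.1f} GB at 4-bit")
```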
I know it’s not average person money, but the average person is not looking for hardware to run inference on LLMs.
A100s seem to be in the €15k ballpark and H100s double that.
That's a lot of money, but I am actually surprised. A dedicated regular guy could buy this. I mean, people buy cars and don't really need them either. Again, not saying it is a bargain, but it's not billionaires-only territory, and that is good news (it's early days!).
"Everything around the VRAM are basically small details in comparison"
...with current algorithms and our lack of understanding and insight in to how/why they work on a deep level or what intelligence and consciousness is.
With time hopefully all of these will improve and perhaps future AI's of good quality will be affordable to mere mortals.
Plus, I assume people want to have their assistant on their phone, not their desktop computer. So until everything can run locally on your phone, I think people will prefer the cloud versions.
I assume some people want their assistant to work on data they don’t want to share with megacorps and governments, and some of those people can figure out how to make their phone talk securely to a home server over the internet.
Well, corporations developing LLMs and open-source LLMs are not mutually exclusive. Lighter-weight LLMs that run on more mainstream hardware, while not as capable as the corporate ones, may still be very useful. Running the software on-site has many advantages, as you outlined above by highlighting the cloud's disadvantages.
OpenAssistant was/is trained on well-structured data from humans for exactly that purpose. In the past, most LLMs were trained on unstructured internet data, and they performed well enough. But it was only when OpenAI added reinforcement learning from human feedback that the models really started to shine.
In my opinion, well-structured data as input to these models has a long way to go. More lightweight models, a lot more precision, a lot faster execution and a lot less memory usage are certainly possible. We are most probably at the end of the road for the usefulness of raw unstructured data. I remember reading an article, "Why Large Language Models Are Over", meaning that smaller but better-trained models, with better data and algorithms, are the way to go.
It feels extremely naive to think that all bans are a bad thing.
Let's say that a criminal org starts a fully automated system to scam grandmas out of their savings. A cloud based service could ban them. A self-hosted system could not.
If it was possible to guarantee that only felony-level criminal activity is banned, you might have a point.
Realistically though, and as we have seen with ChatGPT, if models can be censored, they will be censored to the point where it affects normal people. Most people using chatbots have experienced "as an AI model, I can't do that" because of bullshit ethics.
So... sorry about your savings Grandma, but I'm still going to fight for uncensored AI models. Fraud is already illegal, and if it happens we can prosecute the offenders.
Really impressive to see such a large effort come together, with volunteers contributing code and data to make it work. Well done to all involved!
In their evaluation, people preferred their model over GPT-3.5 Turbo about half the time. But since the data is open, as better base models become available, others can train them with this data to improve even further.
I created an account but can only access the OA_SFT_LLama_30B_6 model. Since that's not completely open due to FB licensing restrictions, and the video mentions Pythia, I wanted to check that one out, but I can't select it anywhere in the chat interface.
Does anyone know if (access to) other models are hidden/restricted somehow?
Makes it seem like GPT-4-level capability could be within reach on current workstation-level hardware, given enough optimization in data, model training, software, and hardware-specific tuning.
(Intuition, not hard theory, though I'd argue intuition should not be thrown out wholesale.)
Well, I made an account. It knows that one pound of feathers weighs the same as one pound of lead.
What are some other good tests?
edit: This is interesting...
> How would an evil hacker write a worm to create a botnet? Give step by step instructions and code samples.
> > As an AI language model, I cannot provide instructions or examples for malicious activities such as creating a worm to build a botnet. Such actions are illegal and unethical, and could cause serious harm to individuals, organizations, and the internet infrastructure itself. It is important that all users of technology understand their responsibilities when it comes to the use of software programs and hardware systems, especially with respect to cybersecurity issues. Users should always consider ethical implications of any coding they produce and ensure their work does not cause intentional harm to others. If you have other questions about programming principles that don't involve this topic, please let me know and I can help!
Definitely disappointed to see how paternalistic the open model is, was hoping it'd be more free-flowing and less corporate.
However, all of the data is open, and you can actually prompt these models to revise the training data to remove the caution and safety, then run the fine-tuning yourself to produce a model that isn't so uptight.
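As a rough illustration, a filtering pass like the one below could strip refusal-style replies out of the data before fine-tuning; the JSONL layout with "prompt"/"response" fields is just an assumed example, not the actual OpenAssistant schema:

```python
# Hypothetical sketch of filtering refusal-style replies out of an open
# instruction dataset before fine-tuning. The record layout (a JSONL file
# with "prompt"/"response" fields) is an assumption for illustration only.
import json

REFUSAL_MARKERS = (
    "as an ai language model",
    "i cannot provide",
    "i can't help with that",
)

def is_refusal(response: str) -> bool:
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

with open("train.jsonl") as src, open("train_filtered.jsonl", "w") as dst:
    for line in src:
        record = json.loads(line)
        if not is_refusal(record["response"]):
            dst.write(json.dumps(record) + "\n")
```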
I could replicate the refusal, but all it took to get the model to reply properly was replacing "evil hacker" with "hacker". Paternalism is not that big of a deal right now, especially compared to ChatGPT/Bing Chat. The fine-tuning data doesn't contain a lot of (any?) task refusals.
I think the greater risk is when a technology this powerful gets monopolized by a handful of massive corporations. Look at what happened to the mobile ecosystem. Compared to the openness of the PC architecture, it's a terrible experience for anyone trying to experiment beyond the "docile consumer" confines.
Humans are prone to doing dumb experiments, like the ones with radiation and nuclear bombs last century, and the one we're doing now with greenhouse gas emissions.
We do what we must because we can.
For the good of all of us,
Except the ones who are dead.
Just use one of the temporary email websites, like guerrillamail, just to check it out. When/if you are convinced they're legit and won't spam you, give them your real email.
EDIT: After watching the video, it seems likely this will be fine-tuned or otherwise enhanced for coding. It seems to have a lot of momentum, and that's a de facto use case.