
A less anthropomorphic way to put it: LLMs can predict the correct “shape” of an answer even when they don’t have data that gives them a clear right answer for the content. Since their basic design is to provide the best response they can, they’ll produce an answer of the correct shape filled with fairly random content whenever the shape, and not the content, is all they have good information to predict.



I think parent has hit on the how and GP has hit on the why.

How LLMs are able to give convincing wrong answers: they “can predict the correct ‘shape’ of an answer” (parent).

Why LLMs are able to give convincing wrong answers is a little more complicated, but basically it’s because the model is tuned by human feedback. The reinforcement learning from human feedback (RLHF) that is used to tune LLM products like ChatGPT is a system based on human ranking. It’s a matter of getting exactly what you ask for.

If you tune a model by having humans rank the outputs, despite your best efforts to instruct the humans to be dispassionate and select which outputs are most convincing/best/most informative, I think what you’ll get is a bias towards answers humans like. Not every human will know every answer, so sometimes they’ll select one that’s wrong but likable. And that’s what’s used to tune the model.
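
To make that concrete, the reward-model step in RLHF boils down to a pairwise ranking objective over human preferences. A minimal sketch (assuming PyTorch; not any lab's actual training code):

    # Rough sketch of the reward-model objective behind RLHF-style tuning.
    # A human is shown two candidate answers and picks the one they prefer;
    # the reward model is trained so the preferred answer scores higher.
    import torch
    import torch.nn.functional as F

    def preference_loss(score_chosen, score_rejected):
        # Bradley-Terry style pairwise loss: push the score of the
        # human-preferred answer above the score of the rejected one.
        return -F.logsigmoid(score_chosen - score_rejected).mean()

    # Toy scores a (hypothetical) reward model assigned to each answer.
    chosen = torch.tensor([1.2, 0.3])    # answers the raters liked
    rejected = torch.tensor([0.7, 0.9])  # answers the raters passed over
    print(preference_loss(chosen, rejected))

Nothing in that objective knows whether the preferred answer was true; it only knows a human liked it better, which is exactly the bias described above.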

You might be able to improve this with curated training data (maybe something a little more robust than having graders grade each other). I don’t know if it’s entirely fixable though.

The brilliant thing about the parent’s comment about the “shape” of the answer is that it reveals how much humans have (uh, historically, now, I guess) relied on the shape of information to convey its trustworthiness. Expand the notion of “shape” a bit to include the medium. If somebody bothered to take the time to correctly shape an answer, we take that as a sign of trustworthiness, like how you might trust something written in a carefully-typeset book more than this comment.

Surely no one would take the time to write a whole book on a topic they know nothing about. Implies books are trustworthy. Look at all the effort that went in. Proof of effort. When perfectly-shaped answers in exactly the form you expected are presented in a friendly way and commercial context, they certainly read as trustworthy as Campbell’s soup cans. But LLMs can generate books worth of nonsense in exactly the right shapes without effort, so we as readers can no longer use the shape of an answer to hint at its trustworthiness.

So maybe the answer is just to train on books only, because they are the highest-quality source of training data. And carefully select and accredit the tuning data, so the model only knows the truth. It’s a data problem, not a model problem.


Cool, thanks for tying a neat ribbon around it.

> The brilliant thing about the parent’s comment about the “shape” of the answer is that it reveals how much humans have (uh, historically, now, I guess) relied on the shape of information to convey its trustworthiness.

This is the basis of Rumor. If you tell a story about someone that is entirely false but sounds like something they're already suspected of or known to do, people will generally believe it without verification since the "shape" of the story fits people's expectations of the subject.

To date I've decried the choice of "hallucination" instead of "lies" for false LLM output, but it now seems clear to me that LLMs are a literal rumor mill.


What's the point of the technology if it will provide an answer regardless of the accuracy? And what prevents this from being dangerous when the factual and fictitious answers are indistinguishable?


We have the same problem with people. Somehow, we've managed to build a civilization that can, occasionally, fly people to the Moon and get them back.

Even if LLMs never get any more reliable than your average human, they're still valuable because they know much more than any single human ever could, run faster, only eat electricity, and can be scaled up without all kinds of nasty social and political problems. That's huge on its own.

Or, put another way, LLMs are kind of a concentrated digital extract of human cognitive capacity, without consciousness or personhood.


> without consciousness or personhood.

Hopefully, for the former.

Be a bit terrifying if it turns out "attention is all you need" for that too.


"without all kinds of nasty social and political problems"

I assure you, those still exist in AI. AI follows whatever political dogma it is trained on, regardless of whether you point out how logically flawed it is.

If it is trained to say 1+1=3, then no matter what proofs you provide, it will not budge.


Yes, it could be dangerous if you blindly rely on it for something safety-related. But many creative processes are unreliable. For example, coming up with bad ideas while brainstorming is pretty harmless if nobody misunderstands them.

Generally, you want some external way of verifying that you have something useful. Sometimes that happens naturally. Ask a chatbot to recommend a paper to read and then search for it, and you’ll find out pretty quick if it doesn’t exist.


What happens when the tech isn't only being used to answer a human's questions during a short-lived conversation, though?

The common case we see publicized today is people poking around with prompts, but isn't it more likely, or at least a risk, that mass adoption will look more like AI running as long-lived processes tasked with managing some system on their own?


> The common case we see publicized today is people poking around with prompts, but isn't it more likely, or at least a risk, that mass adoption will look more like AI running as long-lived processes tasked with managing some system on their own?

If by “AI” you mean “bare GPT-style LLMs”, no, they can’t do that.

If you mean “systems consisting of LLMs being called in a loop by software which uses a prompt structure carefully designed and tested for the operating domain, and which has other safeguards on behavior”, sure, that’s more probable.
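
A hand-wavy sketch of what that outer loop looks like (the function names are made-up stand-ins, stubbed so it runs, not any real framework's API):

    # "LLM called in a loop by software with safeguards", in miniature.
    def call_llm(prompt: str) -> str:
        return "send_email"              # pretend the model proposed this action

    def validate_action(action: str) -> tuple[bool, str]:
        allowed = {"read_calendar", "draft_reply"}   # domain-specific allow-list
        return (action in allowed, "not on the allow-list")

    def execute_action(action: str) -> str:
        return "done"

    def run_agent(task: str, max_steps: int = 5) -> str:
        history = [f"Task: {task}"]
        for _ in range(max_steps):
            proposal = call_llm("\n".join(history))  # bare LLM suggests a step
            ok, reason = validate_action(proposal)   # safeguard vets it
            if not ok:
                history.append(f"Rejected {proposal}: {reason}. Try again.")
                continue
            if execute_action(proposal) == "done":   # only vetted actions run
                return "completed"
        return "gave up after max_steps"

    print(run_agent("triage my inbox"))

The LLM only ever proposes; the surrounding software decides what actually happens, which is where the reliability has to come from.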


Yes, people are doing that. I think it's risky.

One way to think about it, though, is that many important processes have a non-zero error rate, particularly those involving people. If you can put bounds on the error rate and recover from most errors, maybe you can live with it?

An assumption that error rates will remain stable is often pretty dubious, though.


Not if they're bad at it. ChatGPT and friends are tools that are useful for some things, and that's where they'll see adoption. Misuses of the technology will likely be exposed as such pretty quickly.


These are the million-dollar questions when it comes to LLMs. How useful is it to talk to a human who likes to talk, and prefers to say something over admitting they don't know? And if you have a person with Münchhausen syndrome in your circles, how dangerous is it to listen to them and accidentally pick up a lie? LLMs with temp > 0.5 are effectively like these people.
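
For anyone who hasn't played with the temperature knob: it just rescales the model's next-token distribution before sampling, so higher values pull more from the unlikely tail. A simplified sketch:

    # Temperature sampling over a toy next-token distribution.
    # Higher temperature flattens the distribution, so less-likely
    # continuations get picked more often.
    import numpy as np

    def sample_token(logits, temperature, rng):
        scaled = logits / temperature
        probs = np.exp(scaled - scaled.max())    # numerically stable softmax
        probs /= probs.sum()
        return rng.choice(len(logits), p=probs)

    rng = np.random.default_rng(0)
    logits = np.array([4.0, 2.0, 1.0, 0.5])      # toy scores for four tokens
    for t in (0.2, 0.7, 1.5):
        picks = [sample_token(logits, t, rng) for _ in range(1000)]
        print(t, np.bincount(picks, minlength=4) / 1000)

At low temperature you get nearly the same answer every time; crank it up and the long tail, including the confidently wrong stuff, starts showing up.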


I have the same concerns, but am feeling more comfortable about Munchausen-by-LLM not undermining Truth as long as answers are non-deterministic.

Think about it: 100 people ask Jeeves who won the space race. They would all get the same results.

100 people ask Google who won the space race. They'll all get the same results, but in different orders.

100 people ask ChatGPT who won the space race. All 100 get a different result.

The LLM itself just emulates the collective opinions of everyone in a bar, so it's not a credible source (and cannot be cited anyway). Any two of these people arguing their respective GPT-sourced opinions at trivia night are going to be forced to go to a more authoritative source to settle the dispute. This is no different than the status quo...


The number one problem for generalized intelligence is establishing trust.


> What's the point of the technology if it will provide an answer regardless of the accuracy?

The purpose is to serve as a component of a system which also includes features, such as the prompt structure upthread, that mitigate the undesired behavior while keeping the useful behaviors.


for one, telling people something they like to hear is an amazing marketing tactic


I suggest using 'form' instead of 'shape'; the latter is mainly concerned with external appearance. In the context of LLMs, form would be the internal mapping, and shape the decoded text that is emitted.


those things are anthropomorphic by design. there's no point in being cautious, unless it's from an ideological standpoint


I think the social concerns around attributing personhood to LLMs transcend ideological concerns.



