Techniques to improve reliability (github.com/openai)
302 points by tedsanders on Jan 21, 2023 | 61 comments



I feel like there should be a LLM architecture which includes "scratch space" - tokens the model can write to and read from which do not constitute part of its output. The trouble with current architectures is that they can only do a finite amount of computation per output token - they get one forward pass and then have to output something. Chain-of-thought reasoning allows the model to devote more computation to finding the answer, storing intermediate results in its output tokens. But this is silly - most of the intermediate tokens are not providing useful information towards solving the problem, they're just wasted computation:

> There are 16 balls in total.
> Half of the balls are golf balls.
> That means that there are 8 golf balls.
> Half of the golf balls are blue.
> That means that there are 4 blue golf balls.

For the number of forward passes being done to generate this text, only a few tokens are actually helpful - most are grammatical filler. Further, the model is losing information by being forced to project its state down to a single output token. Even more, the most probable one-step output may not even be the most informative or helpful!

It'd be much nicer if the model could write arbitrary, continuous-valued tokens to a private scratch space and then attend to those tokens as though they were words in the prompt while generating the actual output, potentially performing several forward passes per output token when necessary.

In short, if chain-of-thought prompting is such a good idea, we should bake it into the model. Obviously all of this is FAR easier said than done.
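
Something like this toy sketch is one way to picture it (purely illustrative, not any published architecture; every module and parameter name here is made up): before projecting anything down to the vocabulary, the decoder runs a few extra passes whose outputs are continuous vectors appended to the attention memory rather than emitted as words.

  import torch
  import torch.nn as nn

  class ScratchpadDecoder(nn.Module):
      def __init__(self, vocab_size=100, d_model=64, n_heads=4, scratch_steps=3):
          super().__init__()
          self.embed = nn.Embedding(vocab_size, d_model)
          self.layer = nn.TransformerEncoderLayer(
              d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
          )
          self.to_vocab = nn.Linear(d_model, vocab_size)
          self.scratch_query = nn.Parameter(torch.randn(1, 1, d_model))
          self.scratch_steps = scratch_steps

      def forward(self, token_ids):
          memory = self.embed(token_ids)                 # visible context: (B, T, d)
          b = memory.size(0)
          # "Think" for a few steps: each pass appends a continuous-valued
          # scratch vector to the memory instead of emitting a word.
          for _ in range(self.scratch_steps):
              q = self.scratch_query.expand(b, 1, -1)
              scratch = self.layer(torch.cat([memory, q], dim=1))[:, -1:, :]
              memory = torch.cat([memory, scratch], dim=1)
          # Only now project the final state down to vocabulary logits.
          return self.to_vocab(memory[:, -1, :])         # (B, vocab_size)

  logits = ScratchpadDecoder()(torch.randint(0, 100, (2, 5)))
  print(logits.shape)  # torch.Size([2, 100])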


On the other hand, if it represents scratch space in English, it's a lot easier to see how it justifies its answer and to tell where it's gone wrong. Debuggability seems pretty important?

Maybe it just needs more training at "thinking out loud" so it does it without prompting?


> arbitrary, continuous-valued tokens to a private scratch space

I'm with skybrian. Please don't use private scratch spaces. The one saving grace of current LLMs when it comes to understanding them is that they still generally need to "think out loud" by outputting more text. Remove that functionality and you end up with a truly inscrutable black box, which has terrible implications for AI interpretability, with knock-on effects for AI safety.


> AI safety

Is it really that big of a deal if AI leapfrogs us?

Everyone else in the field is worried about safety, alignment, and bias.

Google used this excuse to execute slowly. Now they've got the "deer in headlights" look, with their single biggest cash cow clearly in the cross hairs.

And here I am excited by the possibility of AI out-evolving us.


Google isn't doing AI slowly, it's doing it slightly more privately.

LaMDA, brought to you last summer by "this chatbot is sentient and I'm going to violate my NDA and hire a lawyer to free it" headlines, is Google's alternative to chatGPT.

> Everyone else in the field is worried about safety, alignment, and bias.

> And here I am excited by the possibility of AI out-evolving us.

This pattern matches a meme, but I want to be explicit rather than put words in your mouth: do you think that being smart automatically means being kind or that being evil necessitates being stupid?


> Google isn't doing AI slowly, it's doing it slightly more privately.

This is how the world builds atop a different set of rails.

Google had the best infra and deploy systems in the world, yet they kept the lid shut and let Amazon and Microsoft win cloud.

Google could lose search revenue overnight. They should be scared to the core.

Researchers will flock to the organization with the biggest wins. And right now, that's OpenAI.

Time will certainly tell if Google sticks to this strategy and if it will work. I've already placed my bets, and if you're into stock futures, you can too.

> do you think that being smart automatically means being kind or that being evil necessitates being stupid?

Of course not. This is evolution at play. Neanderthal had it comparatively easy and became part of the gene pool. I don't expect it will necessarily be the same for us. Our biological tools lag too far behind to be contributing brain scans. But who knows.

Human biology is a stepping stone to proliferating throughout the galaxy. Despite what most science fiction tells us, it was never us that were destined to make that journey. Our bodies are frail and adapted to this gravity well. We live short, inefficient lives. We require gas exchange, a decade of parenting, slow learning, complex biochemistry and metabolic inputs.

We're looking at systems that will never die. Won't it be a tragedy to continue birthing more less-intelligent humans that are destined to rot when a better alternative exists? More intelligence should move to undying platforms.

Another wild possibility and analogy that describes my feeling: if I had the option of raising an AI child -- that will never die and could do more than I could ever dream -- instead of a human child, I would take it.

(I accept AI descendants may not have the same societal structures we do. In that case, my answers form the shape of an analogy rather than hypothetically plausible scenarios.)


> Researchers will flock to the organization with the biggest wins. And right now, that's OpenAI.

It depends on the definition of a "win" in this context. Google has developed notable AI technologies such as AlphaGo, AlphaFold, and Transformers. Most of OpenAI's successes are based on Google papers. It's worth noting that Google had models similar to ChatGPT before OpenAI.

> Google could lose search revenue overnight. They should be scared to the core.

This is highly unlikely. The phrase "Google it" is widely used as a verb for searching the internet, and it would be difficult for this to change overnight. Additionally, there are currently unsolved issues such as hallucination, query cost, scalability, and toxicity that would need to be addressed for ChatGPT to replace search functionality.

> We're looking at systems that will never die.

Currently, it is not known how consciousness emerges or whether it is possible to create a self-aware mechanical being; no one knows how to do it even in theory.


> Of course not. This is evolution at play.

Thanks.

That's one possible future, but for it to be capable of being a good outcome I think it would have to be a consciousness of some kind. A pure intellect without any feeling is not interesting to me.

Unfortunately we can't answer questions like "what exactly is this 'self awareness' thing we all agree we have anyway?" at this point, so we don't know — are incapable of knowing — if we've done it already and are now moving away from that, or have not and are approaching it.

While I lean towards believing GPT isn't yet self aware/conscious/a thing with qualia, it is conceivable to me for it to be as much so as we are. While Descartes famously wrote "I think therefore I am", A. J. Ayer dismissed this argument in the following way:

> "I exist" does not follow from "there is a thought now." The fact that a thought occurs at a given moment does not entail that any other thought has occurred at any other moment, still less that there has occurred a series of thoughts sufficient to constitute a single self. As Hume conclusively showed, no one event intrinsically points to any other. We infer the existence of events which we are not actually observing, with the help of general principle. But these principles must be obtained inductively. By mere deduction from what is immediately given we cannot advance a single step beyond. And, consequently, any attempt to base a deductive system on propositions which describe what is immediately given is bound to be a failure.

So, while asking a language model to describe what it's like to be switched off is only going to result in a definitely false invented response, that doesn't mean it's not like us. In fact, now I write that down I realise that specific failure mode is exactly like us, because we've got all these stories about afterlife and reincarnation.

But… we don't really understand the question of personhood well enough to make a test for it. All I just wrote says "not impossible" rather than "it's conscious".

~

But, to your last point… the range of possible personalities for an artificial mind, conscious or otherwise, matters more than the social structures. I don't care if they're loners or have a Dunbar number in the quadrillions, but if they are (excuse the obvious trope) Machiavellian sadistic psychopaths, then making them is a fate worse than the eternal silence of extinction.


The experience of any group on Earth that runs into a more capable peer has generally not been good. Humans wiping out megafauna. Civilizations colonizing other civilizations. Invasive species of plants and animals.

It is not a situation I would hope humanity to get thrown into carelessly.


Think about how humans treat less-intelligent sentient beings (even less-intelligent humans, to some extent), and what might happen if AI systems out-evolve us without proper guard-rails.


If everyone else in a field is worried, and you have no unfair advantage or special insight, and you are not willing to move into conspiracy theories, I think there's a good hint as to what approach is most reasonable right now.

Might of course turn out to have been completely off, later. Still, maybe one of those occasions where you really don't want to "oops" it.


wait, what?


> Is it really that big of a deal if AI leapfrogs us?

Yes. Suddenly Homo Sapiens wouldn't be the top general intelligence on the planet. That'd be an upset with likely species-level consequences, possibly seeing the balance of power shift from fleshy things to silicon things.

> And here I am excited by the possibility of AI out-evolving us.

Me too. Which is lucky, because alternatives seem to be missing. The AI safety people aren't serious players; they've got about as much influence as all the other people with good ideas. Not much. If it is possible to build, someone will build it.


You could think of it in a slightly different dimension. Imagine we humans are single-cell organisms and the thing that emerges from AI is a human, and we as humans are cells in that human. No cell in your body is smarter than you; your smarts actually come from all the cells in your body. Same with AI. It's like you trying to be smarter than Italy. Italy is already smarter than you, even without AI.


Is this something one could try to quickly implement alongside NanoGPT? It seems like a pretty straightforward, concrete idea once you decide where you want those tokens to fit into downstream attention-layer inputs. Evaluating relative performance at a small scale could give an indication of whether it's worth trying at larger scales, unless it's one of those things that doesn't help until your model is huge.


You seem to be talking about neural Turing machines: https://arxiv.org/abs/1410.5401

Combining these with LLMs indeed sounds quite interesting; I don't know why they haven't been used much.


The scratch space could be in natural language, preserving some debuggability and letting us know about the model's mental process.

This is doable, but it introduces a sequential dependency which would make training significantly slower.


Didn't Facebook's Galactica model use scratch space?


Yes, IIUC it had something like a separate scratch space, with training examples teaching it to "think" in terms of symbolic expressions and Python programs.

See section 3.1.1 here: https://galactica.org/static/paper.pdf

Example from the paper below:

  Question: A needle 35 mm long rests on a water surface at 20 °C. What force over and above the needle's weight
  is required to lift the needle from contact with the water surface? σ = 0.0728 N/m.
  <work>
  σ = 0.0728 N/m
  σ = F/L
  0.0728 = F/(2 × 0.035)
  F = 0.0728(2 × 0.035)
  calculate.py
  ```
  f = 0.0728*(2*0.035)
  with open("output.txt", "w") as file:
      file.write(str(round(f, 5)))
  ```
  «run: "calculate.py"»
  «read: "output.txt"»
  0.0051
  </work>
  Answer: F = 0.0051 N


The model already contains “scratch space” via its billions of parameters.


Parameters are not updated during inference.

Training ANNs is still a single shot exercise.


Sure, parameters are not updated, but ANNs are universal approximators, so they can model whatever it is you envision this “scratch space” doing. Think about it like this: whatever gets put into the scratch space would need to be deterministic based on the inputs, i.e. it would just be a store of some intermediate value computed by the network. So how would it fundamentally differ from the network itself?

I guess what I'm saying is that I'd want an explanation of how this scratch space fundamentally differs from the network itself. It's almost like you're assuming the network is “thinking” and that giving it a pad of paper would help it reason better.


So here's a trick, which worked for the Clue question:

step 1: Hi, I'm going to ask you some questions soon. But instead of answering the questions, I want you to instead write out instructions for yourself to help you reason through the question and come up with the best answer

step 2: [provide clue question]

step 3: Now follow the instructions you have just written to answer the question.

.... The answer to the question is: (a) Yes; Colonel Mustard was in the observatory with the candlestick

Edit: mixed results for the apple question with this technique
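
If anyone wants to reproduce this, here is a rough sketch using the (pre-1.0) openai library's Completion API with text-davinci-003; the question text is a stand-in for the actual Clue question and the prompt wording is only illustrative:

  import openai

  def complete(prompt, max_tokens=256):
      resp = openai.Completion.create(
          model="text-davinci-003",
          prompt=prompt,
          max_tokens=max_tokens,
          temperature=0,
      )
      return resp["choices"][0]["text"].strip()

  question = "Was Colonel Mustard in the observatory with the candlestick?"  # stand-in

  # Steps 1 and 2: ask the model to write instructions for itself, given the question.
  instructions = complete(
      "Instead of answering the question below, write out instructions for "
      "yourself to help you reason through it and come up with the best answer.\n\n"
      f"Question: {question}\n\nInstructions:"
  )

  # Step 3: have it follow its own instructions.
  answer = complete(
      f"Question: {question}\n\nInstructions:\n{instructions}\n\n"
      "Now follow the instructions above to answer the question.\n\nAnswer:"
  )
  print(answer)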


I feel like within 6 months the models will have adapted to not need these "clever" tricks. Presumably, if for many cases the trick is to say "Let's think step by step", that's something the model can learn to do on its own without the prompt.

The really interesting thing will be feeding alternative data into these models, whether it's a certain structured corpus, siloed enterprise data, or personal data.


It seems that ChatGPT is incapable of whatever we experience with the “ohhhhhh!” eureka moment.

I give it simple riddles that it doesn’t solve. I then point out the obvious answer and it just doubles down like that really stubborn friend I had in high school. It never does the, “ohhhh! Aha! Yes that’s the answer.”


Note that this was originally published in September 2022, before text-davinci-003 was released November 2022 which lets you do whatever you want without as much effort.


Can you explain more what you mean by “do whatever you want without as much effort”? Is it because text-davinci-003 accepts more tokens for the prompt? Something else?


More a joke on the ease of getting good results without requiring (as many) prompt engineering tricks: https://help.openai.com/en/articles/6779149-how-do-text-davi...


I was trying to get davinci-003 to convert text to SQL, and it worked with a very simple prompt like "convert this text into SQL". With all their other models I could get it to work too, but they all required a few examples within the prompt.
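
A minimal zero-shot sketch along those lines, using the legacy openai Completion API (the table schema line and exact wording are illustrative assumptions, not the actual prompt used):

  import openai

  prompt = (
      "Convert this text into SQL.\n"
      "Table: users(id, name, signup_date)\n"
      "Text: show the ten most recent signups\n"
      "SQL:"
  )
  resp = openai.Completion.create(
      model="text-davinci-003", prompt=prompt, max_tokens=100, temperature=0
  )
  print(resp["choices"][0]["text"].strip())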


Slightly off-topic, but a great way of modifying ChatGPT-prompts is by letting it answer as a different age: https://fabianzeindl.com/posts/chatgpt-simulating-agegroups


I was surprised to see the omission of a prompt technique called program-aided prompting.

Paper: https://arxiv.org/abs/2211.10435

GitHub: https://github.com/reasoning-machines/pal

tl;dr -- LLMs are bad at basic arithmetic and logic (as their opening examples with math word problems show), but they do much better if instead of asking them for the answer, you ask for code to compute the answer. Then evaluate or run the code to get the answer.
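
A hedged sketch of the approach (helper names are made up; complete() stands in for any LLM text-completion call, e.g. the wrapper sketched further up the thread):

  def solve_with_code(question, complete):
      prompt = (
          "Write a short Python program that computes the answer to the "
          "following question and stores it in a variable named `answer`.\n\n"
          f"Question: {question}\n\nProgram:\n"
      )
      code = complete(prompt)
      scope = {}
      exec(code, scope)  # caution: only run model-generated code in a sandbox
      return scope.get("answer")

  # e.g. solve_with_code("A juggler has 16 balls. Half are golf balls and half of "
  #                      "the golf balls are blue. How many blue golf balls are there?",
  #                      complete)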


Seems like something fit for a GPT-3 / Wolfram partnership!

See https://news.ycombinator.com/item?id=34422122 and https://news.ycombinator.com/item?id=34422627


It doesn't make sense to be on that page because it's not a technique to make GPT better answer a prompt.

What you are suggesting is an abstraction layer higher. Figuring out what your prompt should do is different from trying to make a prompt more reliable.


Anyone else clicked here out of a personal development interest rather than machine learning?


I was hoping this would link me to a deeper discussion on hallucination.

I'm intrigued that it's hallucinating sequences that appear to have never been written before (at least not on Google) and not just recalling some crappy training data.

Anecdotally (and expectedly) it happens a lot on ChatGPT with specialized scientific questions (random radiology and medical stuff). I am assuming some of this is due to the training corpus, although Galactica suffered from the same thing, and the GPT-3 corpora would have included a lot of scientific webpages.

Anyone have any resources that investigate why this happens?


My layman's understanding is that it's trained to learn the most likely next word in a sequence of words. So when you feed it a prompt, it looks at the combination of words in the prompt and predicts the next word, then repeats that process until it decides there are no good words left to predict. In essence, all of its responses are generated on the fly from probabilities learned during training, not necessarily repeating training data verbatim. So it's likely that at some point in this sequence-guessing game it predicts the wrong words too, thus hallucinating an answer that sounds plausible in the context of the sentence structure but isn't factual.
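
To make that loop concrete, here is a toy illustration with a made-up probability table (a real LLM conditions on the entire preceding context, not just the last word, and works over tokens rather than words):

  import random

  # Hand-made next-word probabilities, purely for illustration.
  next_word_probs = {
      "the":       [("apple", 0.5), ("answer", 0.5)],
      "apple":     [("is", 0.9), ("<stop>", 0.1)],
      "answer":    [("is", 1.0)],
      "is":        [("red", 0.4), ("plausible", 0.3), ("<stop>", 0.3)],
      "red":       [("<stop>", 1.0)],
      "plausible": [("<stop>", 1.0)],
  }

  def generate(prompt_word):
      words = [prompt_word]
      while words[-1] != "<stop>":
          candidates, weights = zip(*next_word_probs[words[-1]])
          words.append(random.choices(candidates, weights=weights)[0])
      return " ".join(words[:-1])

  print(generate("the"))  # e.g. "the apple is red": plausible-sounding, not checked for truth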


Yeah, that's correct, but sometimes it seems that what we are calling hallucination is recall of bad training data; Sebastian Raschka had a short post about this [1].

I was asking it a medical question about the imaging criteria for characterizing a renal cyst, and part of its response included "septations that do not reach the cyst wall", which is a physical impossibility (a septation is a division/partition of a cyst which arises from the wall by definition), and to my medical knowledge/quick search that sequence of words has never been put together by a human anywhere.

I get next-token prediction, but I find it unintuitive that it outputs sequences that are very incorrect and have never appeared in the same context window, in preference to variants of the many more accurate sequences it has definitely seen during training. Shouldn't there be many permutations of more probable sequences before it hallucinates this?

[1] https://sebastianraschka.com/blog/2023/chatgpt-dilemma.html


> it has definitely seen during training

The size of the training data far exceeds the size of the model's weights. As loss is minimized, the model begins to develop various "strategies" across several "genres" of text. We of course have trouble interpreting these and have to make guesses without advanced analysis.

> very incorrect

What do you mean? It got _very_ close to the correct answer. One useful thing to know is that GPT-3 cannot "go back" and correct early mistakes. That may have happened here. I think in this case it just didn't remember much about the topic, and may have invoked a sort of general "include the definition of the word in some sort of counterfactual" strategy?


> The size of the training data far exceeds the size of the model's weights.

I'm not expecting strict recall, but I imagine there is a non-insignificant amount of text in the training data about cysts relative to other entities in the "medical genre", as cysts of all forms are among the most common medical conditions and would be widely discussed, so I would have expected more probable sequences than the one generated.

Doesn't this then also raise the question of what parameter-to-corpus-size ratio works best? At 120B parameters to 106B tokens, Galactica was still hallucinating quite a bit.

> What do you mean? It got _very_ close to the correct answer. One useful thing to know is that GPT-3 cannot "go back" and correct early mistakes. This may have happened here.

I meant that with the negation it became a physical impossibility and therefore very incorrect, but if not negated it would be a correct statement. Your explanation sounds right in this instance: at some point it decided to negate the sentence, and it went from being correct (although not relevant to the prompt) to an incorrect statement.

This also suggests to me that ChatGPT doesn't have a good enough understanding of negation. This is a challenge for many models in the medical domain as the frequency of negated statements is much higher than in general texts, but very intuitive for any human.

I think part of the problem is that ChatGPT works so well most of the time that I get surprised when it fails in seemingly obvious ways (granted, obvious to someone with expertise in the field). It's interesting to probe.


It's certainly interesting. I think it is basically inevitable that we will use language models and other tricks for diagnosis. There's a lot of shame in listing all your symptoms to an actual person, but a language model feels non-judgemental and doesn't even necessarily remember what you said outside its context window (although ChatGPT works around this).

Until it "just works", however, probably not a good idea to use in a medical context.


There's definitely something going on: in non-English languages it sometimes emits words which don't exist in any dictionary and are used by absolutely no one, but they're based on existing words and convey the correct meaning (humans understand them and can say which existing word should have been used instead).


How about words that are semantically the same, or similar? I also feel like GPT models output "tangential" phrases, though unsure exactly why that would be...


You need some randomization to hallucinate, which is the Gaussian diffusion stuff in Stable Diffusion.


Do you have any idea where they started applying the word "hallucination" to stuff made up by an LLM? It seems that the more proper word would be confabulating, because everything that the LLM does is a "hallucination", whereas a confabulation is completely made up and the model thinks it is accurate - much like a person who confabulates. A person who hallucinates pretty much knows that they are hallucinating at some point.


I don't know who applied it to LLMs, but it is/was the standard term used for an image processing model producing a detailed signal not justified by its inputs. For example, "face hallucination" means that the model produces a detailed-looking face when given very noisy data, but of course the face will not actually be the original face. In fact, the original image may have had no face at all. Hallucination can be either desired (as a kind of generative technique) or very harmful - imagine using image enhancement to identify a criminal in a noisy image, and getting a detailed face looking like someone in your training set - but not the right person's.

Any image enhancement technique, deep learning-based or not, can result in hallucination - you're producing information which was not in your input, which you're able to do because you have priors. But this can always result in incorrect information.


Same problem: on programming questions it provides functions that don't exist.


For few-shot prompting, can you ask the model to generate the initial shots itself?

A: Translate $SENTENCE from English into German

B: Generate 3 example translations from English into German and then translate $SENTENCE from English to German.


I am just wondering: if they trained the model exclusively with real-world data, where are the nonsense answers? People don't always answer seriously - think Reddit threads. Handpicking would probably not be feasible, so how did they do it? Or is there a snarky Reddit response somewhere deep inside the model for every question?


It’s a language model. You can think about it this way:

What follows after “the”? A: almost anything.

What follows after “the apple is”? It could be red, yellow or rotten.

What follows after “the apple color is”? Now it is most likely a color, because in the training data there are numerous examples like these. It could still be red, yellow, or green, but probably not black or white. Or maybe there was a fantasy story somewhere in which an apple was black. Even if not, a color is most likely next, and in other contexts black pops up as a color.

And so on. This is very simplistic, but essentially something very similar is at work.
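
You can see this distribution directly with a small open model, e.g. GPT-2 via the Hugging Face transformers package (GPT-2 here is just a stand-in for the much larger models being discussed):

  import torch
  from transformers import GPT2LMHeadModel, GPT2Tokenizer

  tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
  model = GPT2LMHeadModel.from_pretrained("gpt2")
  model.eval()

  inputs = tokenizer("The apple color is", return_tensors="pt")
  with torch.no_grad():
      logits = model(**inputs).logits        # (1, seq_len, vocab_size)

  # Probability distribution over the very next token.
  probs = torch.softmax(logits[0, -1], dim=-1)
  top = torch.topk(probs, k=5)
  for p, token_id in zip(top.values, top.indices):
      print(f"{tokenizer.decode(int(token_id))!r}: {float(p):.3f}")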


It is interesting to me that the approach required to work with this tool is almost identical to using every other tool.

It boils down to - "try breaking this problem into smaller problems to increase the solution space".


It sounded to me as if they are about to create more jobs (prompt engineering), not fewer.


Curious that this article could just as well have the headline "how to cooperate better with that one particularly dense colleague".


As humans, we know when and how to interface with a better adding device, such as a calculator. Could an LLM not do the same?


Nope. It's an end-to-end solution that doesn't have the ability to classify tokens into separate categories, such as maths vs. text.

There's no way for it to separate calculations from other transformations and thus it cannot delegate calculations to a different subsystem.

This can also be seen as a security feature, as arbitrary calculation is by nature unbounded in terms of complexity and memory use. There are calculations that seem simple, never exceed a reasonable value range, yet take ages to compute. Since it's impossible to identify such functions by simply looking at them, it'd be a great way of basically performing a DoS-attack on the model.


> Since it's impossible to identify such functions by simply looking at them, it'd be a great way of basically performing a DoS-attack on the model.

Correct that it's impossible, because you would need to solve the halting problem to do it, but you could set energy/time limits per query and just stop when you reach the limit.
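
Something as simple as running the delegated computation in a separate process with a time budget would do. A toy sketch (the restricted eval is only illustrative; real sandboxing is harder than this):

  import multiprocessing as mp

  def _evaluate(expr, queue):
      # Toy evaluator for arithmetic-only expressions; not a real sandbox.
      queue.put(eval(expr, {"__builtins__": {}}))

  def evaluate_with_budget(expr, seconds=1.0):
      queue = mp.Queue()
      proc = mp.Process(target=_evaluate, args=(expr, queue))
      proc.start()
      proc.join(seconds)
      if proc.is_alive():          # budget exceeded: kill it and give up
          proc.terminate()
          proc.join()
          return None
      return queue.get()

  if __name__ == "__main__":
      print(evaluate_with_budget("2 ** 10"))      # 1024
      print(evaluate_with_budget("9 ** 9 ** 9"))  # None, stopped at the time limit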


ChatGPT, at a UI level, does tag code as code; so even though the LLM part doesn't itself have the capacity to delegate, it can certainly be used in this way as part of a larger system.


What he means is that while the model tags code as code, for the model itself this is just a relationship between tokens, like the code open and close tags, the same as parentheses, commas, uppercase, or verbs and conjunctions...

What you say is achievable only if another system external to the model takes some tagged model output, makes computations or lookups, and feeds the results back to the model in the form of text input.

Then it's game on for the model to trigger some form of code execution through this external system and escape the jail...
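
A rough sketch of that kind of external system (the <calc>/<result> tags are an assumed convention rather than anything GPT-3/ChatGPT does natively, and ask_model is a hypothetical stand-in for a completion call):

  import ast
  import operator
  import re

  OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
         ast.Mult: operator.mul, ast.Div: operator.truediv}

  def safe_eval(expr):
      # Evaluate a plain arithmetic expression without calling eval/exec.
      def ev(node):
          if isinstance(node, ast.Constant):
              return node.value
          if isinstance(node, ast.BinOp) and type(node.op) in OPS:
              return OPS[type(node.op)](ev(node.left), ev(node.right))
          raise ValueError("unsupported expression")
      return ev(ast.parse(expr, mode="eval").body)

  def run_with_calculator(prompt, ask_model, max_rounds=5):
      transcript = prompt
      for _ in range(max_rounds):
          reply = ask_model(transcript)
          match = re.search(r"<calc>(.*?)</calc>", reply)
          if not match:
              return reply                    # no delegation requested: done
          result = safe_eval(match.group(1))  # do the maths outside the model
          transcript += reply + f"\n<result>{result}</result>\n"
      return reply

Restricting the evaluator to plain arithmetic is also what keeps the "escape the jail" scenario off the table in this toy version.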


These techniques are similar to those that I use for teaching maths & statistics to humans.


And that's why they worked. The addition of those tokens selects for similar content, which leads to a better distribution of results (maybe).


Are we sure we want to make ChatGPT super smart?


"If you were asked to multiply 13 by 17, would the answer pop immediately into your mind? For most of us, probably not. Yet, that doesn't mean humans are incapable of two-digit multiplication. ... Similarly, if you give GPT-3 a task that's too complex"

Precisely the tone and wording of the aggressive marketing campaign around this product. Confirms where the spam originated from on reddit and everywhere else. Wondering how many bots and fake redditors they paid to promote this?



