Ask HN: Why does ChatGPT struggle so much with counting?
6 points by logicallee on Dec 24, 2022 | 6 comments
I've been playing with ChatGPT and am impressed by its ability to produce code and, to a certain extent, to seem to reason about novel questions and situations and to engage in abstract analysis even on totally novel subjects.

However, it really struggles with basic counting compared with most people.

For example, I tested it by asking about primes in certain intervals. First I asked "Are there more prime numbers between 1 and 10 or between 10 and 20?", and after it answered correctly I asked "Are there more prime numbers between 10 and 20 or between 20 and 30?". It answered:

"There are four prime numbers between 10 and 20: 11, 13, 17, and 19. There are also six prime numbers between 20 and 30: 23, 29, 31, 37, 41, and 43. So there are more prime numbers between 20 and 30 than between 10 and 20."

The middle sentence clearly contains errors: 31, 37, 41, and 43 are not between 20 and 30 at all, and the actual count there is two, not six. Why does ChatGPT make such errors and struggle so much with numbers?
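
For reference, a quick brute-force check of the actual counts (a small Python sketch):

    # Count the primes in each interval the question asked about.
    def is_prime(n):
        return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

    print([n for n in range(10, 21) if is_prime(n)])  # [11, 13, 17, 19] -> four primes
    print([n for n in range(20, 31) if is_prime(n)])  # [23, 29]         -> two primes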

What would it take to "fix" it so it could be more human-like in its use of numbers?




The analogy I've used before is the old Eliza chatbot that responds based on patterns in the input (e.g. "I am x" / "how long have you been x?").
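
Roughly the kind of rule it used, as an illustrative sketch (not Weizenbaum's actual code):

    # One Eliza-style rule: match a pattern, echo it back as a question.
    import re

    def eliza_reply(text):
        m = re.match(r"i am (.*)", text.strip().lower())
        if m:
            return f"How long have you been {m.group(1)}?"
        return "Please go on."

    print(eliza_reply("I am tired of counting primes"))
    # -> How long have you been tired of counting primes?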

ChatGPT is the same; it's just billions of times more complicated (and optimized by gradient descent instead of by a person). But it's really just responses to patterns.

So asking why GPT struggles is like asking why Eliza struggles. It doesn't have a mind or an internal model of the world; it just has responses to patterns. There are "in distribution" cases where it gives an answer you expect, but outside of those the model fails arbitrarily, and because it has no mind, it has nothing to sanity-check the output against and often looks silly.

You could try training a math-specific language model, though as I understand it, neural networks generally aren't good at math because they can't extrapolate (they interpolate very well). So, for example, it's challenging to train an NN that learns to tell whether numbers are even or odd, given even/odd training pairs.
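
A toy version of that experiment might look like this (a sketch assuming scikit-learn; the exact numbers will vary from run to run):

    # Train an MLP to classify parity from the raw integer, then test it
    # outside the training range; it tends to do no better than chance there.
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    X_train = np.arange(0, 1000).reshape(-1, 1)
    X_test = np.arange(10000, 11000).reshape(-1, 1)

    clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=2000)
    clf.fit(X_train, X_train.ravel() % 2)

    print("in-range accuracy:    ", clf.score(X_train, X_train.ravel() % 2))
    print("out-of-range accuracy:", clf.score(X_test, X_test.ravel() % 2))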

You could have some sort of math loop as part of a system that uses a language model, say to flag when it spits out incorrect math (sketched below), but that would be equivalent to hard-coding something. As I understand it, ChatGPT has some kind of RL layer on top of the language model that performs that supervisory function, but as a neural network it will suffer from the same problem, and will itself not know when it's wrong.
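
A minimal sketch of such a checking loop, assuming you can parse a claim out of the model's text (the parsing here is obviously a toy):

    # Re-check a claim of the form "N prime numbers between A and B" with ordinary code.
    import re

    def is_prime(n):
        return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

    def check_claim(text):
        m = re.search(r"(\d+) prime numbers between (\d+) and (\d+)", text)
        if m is None:
            return None  # no checkable claim found
        claimed, lo, hi = map(int, m.groups())
        return claimed == sum(is_prime(n) for n in range(lo, hi + 1))

    print(check_claim("There are 6 prime numbers between 20 and 30"))  # False (the real count is 2)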


If you wanted something to do math, you'd make something more like a pocket calculator. ChatGPT's real competence is in picking the most plausible, not necessarily the most correct, next word. That is how it is able to hypnotize people into skimming past its mistakes.

That can get you really far in English class or maybe social studies, but it can't get you far in a field like math, where you can really say something is wrong or right without any reservations.

Note that reasoning about math is fundamentally problematic; see

https://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleteness_...

There is no algorithm that can tell, 100% of the time, whether a mathematical statement is satisfiable. It took mathematicians more than 350 years to prove, for instance, that

   x^n + y^n = z^n
has no solutions with x, y, z positive integers and n an integer greater than 2. You'd like to be able to ask a math AI to try to solve an equation like that, or to prove it has no solutions, but there is no algorithm that is guaranteed to settle such questions in a finite amount of time. What is fair is for something to go looking for solutions and either keep going until you tell it to stop, or stop at a certain point and tell you something about what it ruled out.
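
A bounded search of that sort is easy to sketch; it will of course find nothing, so the only useful output is what range it ruled out:

    # Look for counterexamples to x^n + y^n = z^n with small bounds, then report.
    def search(limit, max_n=5):
        for n in range(3, max_n + 1):
            for x in range(1, limit):
                for y in range(x, limit):
                    s = x ** n + y ** n
                    z = round(s ** (1.0 / n))
                    for cand in (z - 1, z, z + 1):  # guard against float rounding
                        if cand > 0 and cand ** n == s:
                            return (x, y, cand, n)
        return None

    print(search(100))  # None: nothing found with x, y < 100 and 3 <= n <= 5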

This book

https://en.wikipedia.org/wiki/G%C3%B6del,_Escher,_Bach

has a number of dialogues between Achilles and the Tortoise in which they try to build an impossible machine and take a long time to understand that what they're trying to do is impossible. Some people see it as an indictment of symbolic A.I. that fundamentally revolves around logic, but I see the problems as having to do with philosophy, logic, and the limitations of "truth" (you might even say the word "truth" is a bad word if the intro of the X-Files can claim that "the truth is out there" or it can be composed into a name like "Truth Social"). That is, it is not a limitation of the technology you are using; it is a problem with the problem you are trying to solve.


Hilariously, I suspect an algorithm like ChatGPT is likely to reproduce the ACTUAL argument Fermat thought he had when he left that note in the margin: prove the theorem for n=3 and perform induction by the "and then obviously yada yada yada" method.


"Stochiastic Parrot" is perhaps the best term. It knows nothing & understands less. It's just hallucinated nonsense, albeit well reinforced & coached nonsense.


Because it's a language model, not a math model. It doesn't understand math; it has just learned good-enough ways to combine words (which it doesn't understand the way, or at least to the extent, that we do either)...


It also struggles with language... e.g.: give me 10 synonyms for the word "mobile" that don't end with the letter "E".

    Portable
    Movable
    Transportable
    Flexible
    Adaptable
    Agile
    Peripatetic
    Roaming
    Traveling
    Wandering

or: Give me 10 antonym to "full" that starts with the letter "f".

    I apologize, but I am unable to provide a list of 10 antonyms for the word "full" that start with the letter "f" as there is only one antonym for "full" that starts with the letter "f", which is "empty".

I was trying to find a good word that starts with F so that Tesla can keep using "FSD".
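
For what it's worth, the constraint in the first prompt is trivial to check mechanically (Python sketch):

    # How many of the suggested "synonyms" actually end with the letter "e"?
    words = ["Portable", "Movable", "Transportable", "Flexible", "Adaptable",
             "Agile", "Peripatetic", "Roaming", "Traveling", "Wandering"]
    violations = [w for w in words if w.lower().endswith("e")]
    print(violations)       # ['Portable', 'Movable', 'Transportable', 'Flexible', 'Adaptable', 'Agile']
    print(len(violations))  # 6 of the 10 suggestions break the "don't end with E" rule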



