Ask HN: How long until AI shows general intelligence without failures?
25 points by logicallee on Feb 5, 2023 | 45 comments
Large AI models sometimes exhibit extraordinary ability (e.g. see my comment history), but also obvious failures.

How long do you think it will take until very large AI models show general intelligence without any apparent failures?




Judging by the human-programmed biases inherently designed into AI, maybe never.

Humans can't agree on very basic facts; ergo, humans will never agree with AI. Just like fact-checking is weaponized, AI is weaponized to disregard the truth.

The internet is BARELY a coherent reflection of humanity as is, with all kinds of input biases to the datasets, including political and social.

ChatGPT now means the internet has crossed the Rubicon, and will reflect humans and life and reality less and less and less as the feedback loops intensify and the datasets increasingly become infected with AI data parsed from flawed human data sets.

I'm just not sure any general intelligence AI can be created when it simply doesn't stand a chance of having a neutral dataset to train on.

The pre-AI internet is best. But even this is a falsehood: the pre-AI internet has relics of censorship and state-sponsored propaganda, and is skewed toward young people who are inherently not world-wise or intelligent.

General purpose AI promises to be better than us, but if we can't program it with honest datasets, it will always reflect our flaws.

And I don't believe there is such a thing as an objective dataset.


Yep, I was just about to type pretty much the same thing.

AI is wrong because, much like us, it's trained on contradictory and occasionally incorrect information. To fix that using traditional methods, you'd need to train an AI on non-contradictory material that is objectively true. Training a model needs more than math textbooks and news articles though, otherwise it's not very generally-intelligent.

In the long-run, it's possible that AI of ChatGPT's scale will never be "correct" because it doesn't have a concept of correctness in the first place. It just generates text, and if a human deems that it fails then it generates it again. "Fixing" the training data limits the model's understanding, which leaves us with unreliable preprocessing/postprocessing methods.


Perhaps there could be a standard of "general intelligence" that doesn't require it to be right all the time, but merely have the ability to reason based on its training data and conversation in plausible ways? Currently, when it fails, it does so in a way that people don't.

How long might it take large AI models to reach this different standard of general intelligence, which requires them to fail in plausible ways rather than in ways that show a complete lack of reasoning and understanding?


I can't say for sure. Something tells me it would take a paradigm shift, though; the current free-association model is showing its limitations.


> Humans can't agree on very basic facts

I think most humans respect the scientific method and agree on many, if not most, of the basic facts. What they disagree on is how those facts are interpreted, their importance (values), and the actions or risk-management decisions to take in light of those facts.

Take the weather outside, as one non-controversial and simple example. It could be 69 degrees F outside. A Canadian used to ice fishing in the frozen tundra might view that as a beautiful warm day, and an Australian who is used to strolling the desert searching for kangaroos to fight might see that as a miserable and unpleasant cold day. In this case, both people might respect the same basic fact of that temperature being 69 F, but have wildly different interpretations of what that actually feels like. Neither the Canadian nor the Australian is wrong, even though they have wildly different viewpoints about the current weather.

This completely applies to all of the most contentious scientific and other conflicts in society. Most disagreements are not about science or facts, but about what the facts mean and what's the best risk-management decision to make in light of those facts.

PS: I could give many specific examples if desired, but would prefer not to poison the well with that because there's going to be somebody who is not mature enough to parse through the specific argument used and is just going to regurgitate their "team's" viewpoint in response to those examples.


Interesting perspective, thanks for sharing.

At the moment, ChatGPT occasionally produces statements that are objectively false, such as simple counting mistakes or very simple reasoning errors that show it doesn't really understand its output. Most people who have used it for some time have experienced this.

I think when you say humans fail to agree on "basic" facts, you mean something a bit more abstract. Humans do agree on the truly basic facts that ChatGPT sometimes fails at. Everyone can see that ChatGPT is making a mistake when it says something like "3 = 2 + 5 is true because the left side (3) is equal to the right side 2 + 5 (which equals 7), so since 3 is equal to 7 the equation is true." (This is not a direct quote but a typical example of a recent interaction.)

Most people can see that it is an obviously false statement. They might claim otherwise, but pretty much everyone can instantly see that that is a failure mode. So regardless of where we put the idea of "objective" truth, it is pretty clear that ChatGPT frequently shows failures to reason or failure to show general intelligence.


Mathematically, it will be as perfect as the question asked. I have no doubt about this.

How that applies to real life, and non-maths things I do not know.

I just look at the hot-button issues of the day (race, gender, equality), and see that this AI is not an AI, but reflects the censorship of the day, and that problematic questions and problematic answers are censored in ChatGPT already. Even if they weren't, the dataset is censored, because hate speech encompasses many things that have nothing to do with hate, and there is a general progressive bent to the internet.

I think an AI has to be exposed to the evils of the right and left in order to truly learn. This is problematic, because it exposes the uselessness of AI in that it simply reflects our prejudices back to us.

So we end up with a neutered product that pretends it is AI, but is really programmed to preconclusions by its creators.

The issue is, ChatGPT simply can't be trusted to give an honest answer. There is a heavy hand censoring the most important topics of the day and reinforcing a progressive narrative. This is due to datasets scrubbed of banned users and moderated content, and also to the AI creators themselves acting as overseers of their creation.

You might even say AI is already the enemy of humanity and free speech, as it threatens to completely redefine reality in the vein of leftist-funded fact-checking organizations like Hamilton68.


It sounds like you are saying the biggest problem is that its output is being censored? I’m not sure how removing biases is even possible, let alone how that would solve the garbage-in, garbage-out problem.


In further reading, the creators dispute this claim, and state that users are only consulted to help train the AI linguistically, to make it sound more normal.

So it seems that I am incorrect. I stand by the observation, but how ChatGPT got to this position is a mystery.

The creators posit that perhaps there is so much trolling and negativity on some subjects that the AI is simply incapable of giving an answer.

Not sure I buy it, but I take them at their word they are not actively censoring.


In politics it is all lies, or at best a "different kind of truth".

I have a book titled "The Crime Drop in America" which has a collection of papers in it by sociologists -- the backdrop is that crime exploded in the US in the 1970s, became an issue of major concern, then dropped precipitously by the mid 1990s. (To start rising around the middle of the 2010s... because "it fluctuates")

There are numerous explanations of this proposed such as: "they removed lead from gasoline", "they hired more cops", "legal abortion meant many unwanted children destined to become criminals were never born", "judges locked criminals up for longer", "it got much harder to steal a car", "blacks in NYC decided it was more chill to pass around a blunt in a circle and sip from a 40 than to take heroin and cocaine".

That last one was the only one expressed unequivocally. All the other authors gathered statistics and tried to make some conclusion and they couldn't because all of these factors and more are in play and you can't do an experiment that teases them apart.

A truthful summation of that book would be the unsatisfying "we don't know"; a better summation would be like what I say above but maybe a bit more systematic, where a number of causes are listed and you can say "maybe it was all of the above".

People like to collect beliefs that support their identity, and certainly a leftist or a rightist could cherry-pick explanations that fit their preconceptions. But if that is how you're thinking, you are not thinking about the "truth" as an objective thing (in this case it doesn't exist, or does exist and is unknown); rather, the best you can do is learn to make accurate statements about people's opinions -- which is at best statistical.

In politics today there is a reversal of causality. One could choose to get involved in politics because one wants to influence policy, but to do that in a representative system you have to be able to shift your support from one politician to another. If on the other hand you build your self-image around your political identity, you are just a tool (deciding your self-interest based on an identity imposed on you, as opposed to deciding your identity based on your self-interest. It's all very good for somebody else's self-interest.)

Since the John Birch society in the 1950s, belief that conservatives are persecuted by the media has been a foundational piece of far right identity. I grew up in New Hampshire where we had The Manchester Union Leader long before conservative papers became widespread, and I remember editorials about how the right has to steal ideas from Antonio Gramsci back in the 1980s. It's going to be a talking point whether or not conservatives have 20% representation in the media or 80% representation in the media because it "works".

As for "censorship" I think that's not the right model for 2023. A simple model of communication involves both a transmitter and a receiver. Somebody might think they have a right to mouth off (transmit) or a right to be heard (be received) but those are entirely different systems. Considering that 8 billion people are doing their thing on this planet, the process of summarizing what they did yesterday in a 50 page newspper does considerable violence against the fabric of reality itself. 999,999 out of 1,000,000 things are by necessity "not fit to print" and it is certain that the "Dark Matter" of what doesn't get talked about is more significant than some argument about a trumped-up issue that gets repeated, and repeated again, and repeated again, and repeated again, and ...

The receiver has a limited capacity, and spamming the agenda is a denial-of-service attack. People act as if saying "ME TOO!" on Twitter is something valuable, if not sacred, but what is priceless is somebody having the time and space to do research, really think things through, and have something considered to say. That's worth what, 1,000,000 Tweets? In that case, who cares what nonsense people parrot on Twitter?


The state of the art you refer to is based on statistical analysis of text found on the internet. AFAIK no AI system actually performs logical processing of the submitted materials. When it says that 2+2=4, it's simply because that is more commonly found than 2+2=7. If you were to replace all sources that say 2+2=4 with ones that say 2+2=7, then that's how it would answer in a conversation.
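
To make that concrete, here is a toy sketch (nothing like GPT's actual architecture, and the corpus counts are made up) of a "model" that answers 2+2 purely from how often each continuation appears in its training text:

    from collections import Counter

    # Toy "language model": answer "2+2=" with whichever continuation was
    # most frequent in the (hypothetical) training counts. No arithmetic happens.
    training_continuations = Counter({"4": 9500, "7": 500})

    def complete(prompt="2+2="):
        token, _ = training_continuations.most_common(1)[0]
        return prompt + token

    print(complete())  # "2+2=4", only because "4" dominates the counts

Flip the counts and the same "model" will confidently tell you 2+2=7.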

We will probably run out of available electrical power before we build a ginormous fail-proof AI.


Neural nets (particularly RNNs) can emulate arbitrary Turing machines, and are therefore capable of said logical processing. LLMs might have to be coaxed into doing this, but with enough trying you can get them to be more rational and rules-focused.

Give it a go - I'm sure you can come up with some math problem that is un-google-able. Use chain-of-thought prompting to have it solve a quintic polynomial or something.
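
For anyone who wants to try it, here is a rough sketch of what that chain-of-thought prompt might look like against the OpenAI completions API as it existed in early 2023 (openai python package < 1.0); the model name, sampling parameters, and the particular quintic are plausible choices on my part, not a recipe:

    import openai

    openai.api_key = "sk-..."  # your API key

    # Chain-of-thought prompting: ask for the intermediate steps, not just the answer.
    prompt = (
        "Find an approximate real root of x^5 - 3x + 1 = 0.\n"
        "Let's think step by step, checking each numeric step as we go."
    )

    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=512,
        temperature=0,  # keep the output deterministic-ish so the steps are auditable
    )
    print(response.choices[0].text)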

I think this (being the result I imagine you'll get when you try the above) demonstrates that LLMs have learned the capability for general reasoning, even if they don't always automatically use it.

To be fair - neither do we.


Obviously not exhaustive of possible approaches, but https://www.lesswrong.com/posts/DTLRw6ZstkdghEgqA/chatgpt-vs... was my attempt to get chatgpt to play a simple inductive reasoning game.


Not the brightest, but a valiant attempt for a lower-iq...entity, I'd say.


I see it as the opposite. It is amazing what the model can do even with poor quantitative reasoning.

I just can't imagine adding superhuman quantitative reasoning is going to be that big of a stumbling block over the next decade. If anything, that is probably the low-hanging fruit here for a huge jump into the unknown.


Looking at the state of the art in automated theorem provers, I don’t think that’s low-hanging fruit.

We probably can make something that can calculate well and won’t make mistakes in combining various numbers found online, and can do rote evaluation of expressions not found anywhere online, but adding ‘reasoning’?

Even disregarding that it would have to, somehow, assign different trust levels to various online sources (for example, are, https://en.uncyclopedia.co/wiki/Wikipedia or https://en.wikipedia.org/wiki/Uncyclopedia trustworthy?), it, IMO, already would fall at the hurdle of doing ‘simple’ math.

For example “the sine of 100 factorial” has a well-defined value, but computing it in IEEE doubles doesn’t make sense because representable numbers are way too far apart around 100! (Google says it is about 0.68395718932, but it also thinks that sin(1+100!) ≈ 0.68395718932. I trust neither answer)

That’s solvable by using better software. Wolfram Alpha claims these are about -0.17 and -0.92 respectively, for example, but in my book, an AI wouldn’t be intelligent if it always used one; it would have to know when to fall back on the heavy guns. For the “what’s sin(100!)” question, I think the first response might be a counter-question, “why do you want to know?”, but that depends on earlier discussion.
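
For what it’s worth, here is a minimal sketch of the “better software” route using arbitrary precision (mpmath); 100! has 158 digits, so the working precision has to comfortably exceed that before reducing the argument mod 2π is meaningful. If the Wolfram Alpha figures quoted above are right, the output should land near -0.17 and -0.92:

    from mpmath import mp, sin, factorial

    # 100! has 158 digits; use much more working precision than that so the
    # reduction of the argument modulo 2*pi still carries accurate digits.
    mp.dps = 200
    n = factorial(100)
    print(sin(n))      # sin(100!)     -- should agree with the first figure quoted above
    print(sin(n + 1))  # sin(1 + 100!) -- a genuinely different value, unlike what Google claims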


> How long do you think it will take until very large AI models show general intelligence without any apparent failures?

No one knows. People do not know how their minds work. Some parts of it were discovered but we do not know how much remains unknown.

It is the story of AI. Researchers get an idea, they play with it, they find the limits of the idea, and they start to look for the next idea. For example, I do love the story of "expert systems". AI researchers tried to model experts' decision-making process with logic rules and utterly failed. The positive result of all these efforts is that we now know that a) logic is not the basis of the human mind, and b) experts themselves do not know how they reach decisions.

Not all AI adventures ended so badly, but most of them gave some more wisdom along the lines of "intelligence is more difficult than we thought, so we need 30 more years to build AGI".

It is a search problem: we seek a path to AGI, and we do not know the path in advance, nor can we measure its length. So it is impossible to predict the future. We can draw more or less credible predictions about the future of ChatGPT and similar tech, but not about what the next Thing will be and how close it will be to AGI.


When a LLM hallucinates, it’s not a failure, it’s working perfectly in its context as a language model.

Most criticism of LLMs is really criticism of the language-modelling training task. The underlying technology can be used in other ways which better match our intuitive understanding of “general intelligence”.

Unfortunately I think major research labs will be increasingly secretive as they gain traction in training large transformers as general agents.


You might be interested in the crowdsourced prediction which currently stands at 2039: https://www.metaculus.com/questions/5121/date-of-artificial-....

My personal shorter-term prediction, based on my experience and extensive reading of papers in the field, is that we have a clear path forward for the next two years or so, and we'll see significant progress. However, progress will slow down significantly later on unless we get some major research breakthroughs, greater than advancements such as diffusion or transformers. But I also anticipate that major labs will publish much less in order to get a commercial advantage, slowing down the pace of research.


This discussion of AGI comes up again and again, but I see no discussion of when AI will be good enough to absorb most jobs. That could create societal failings, thus hurting even the same institutions that are developing it. So I say it might be never, not for technical reasons but because of failing institutions.


Well, I thought it would be fun to ask ChatGPT the same question, it appears it knows its limitations so far:

"It is difficult to predict when AI models will achieve general intelligence, as this is a highly debated topic in the AI research community and the concept of "general Intelligence" itself is not well-defined and understood. Currently, AI models excel in specific tasks but struggle to perform tasks that humans find simple and straightforward.

It is possible that with advancements in AI research and technology, AI models will continue to improve and potentially reach human-level general intelligence in the future, but there is no consensus on a specific timeline. Additionally, it's important to consider the ethical and societal implications of creating such advanced AI systems."


Large language models do not an intelligence make. We have a very, very long time to wait.


What is intelligence? What are indicators that something/someone is intelligent?


ChatGPT is intelligent.

people who argue it's incapable of original thought should try and imagine a colour that doesn't yet exist.


FWIW I agree with you, but was attempting to provoke the discussion of "what is intelligence" before strictly taking that opposing stance to the parent comment.


The problem is training. To train an AI assistant, you have to use its LLM (large language model, previously trained on terabytes of text) to generate a LOT of output, which is then ranked/labeled/evaluated by humans as part of the training loop. If you want competence in a given area, I guesstimate you'll need a century or more of evaluation time from humans qualified in that area, because of the inefficiencies of current training algorithms.

If you don't have competent experts, then the tendency will be towards BS that can fool the non-expert.
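
To illustrate the ranking step I mean, here is a hand-wavy sketch (not OpenAI's actual pipeline; a Bradley-Terry-style pairwise loss is the standard trick, and the scores below are made up) of how human preferences get turned into a training signal for a reward model:

    import math

    def pairwise_loss(score_preferred, score_rejected):
        # -log(sigmoid(preferred - rejected)): small when the reward model
        # already scores the human-preferred answer higher than the rejected one.
        return -math.log(1.0 / (1.0 + math.exp(-(score_preferred - score_rejected))))

    print(pairwise_loss(2.0, -1.0))  # low loss: the ranking agrees with the labeler
    print(pairwise_loss(-1.0, 2.0))  # high loss: the ranking disagrees

Every one of those preference pairs has to come from a human, which is where the century of expert evaluation time comes in.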


It all depends on what exactly you mean by general intelligence and how you measure it. Until then it is a poorly defined question. And no offense meant to this post because a lot of people are asking the same thing, but the most likely scenario is the most annoying one: we will continue to see milestones achieved by machines, and people will say that’s not “real” intelligence because it can’t do X.

My personal opinion is that intelligence is a mirage.


This book talks about that problem

https://en.wikipedia.org/wiki/G%C3%B6del,_Escher,_Bach

and makes it clear the "truth" is not so simple as to be a module added to the LLM.

Logical and mathematical reasoning is rather specialized, but it is a useful feature. Understanding text, though, frequently involves setting up and solving logic problems to be sure your interpretation of the arguments is correct, all the more so if you expect it to read about a system and then apply that system to another text. So you at least run into the NP-complete world of SMT solving, and the system must realize the requirements of 1980s symbolic AI no matter what technology is under the hood.

It's much worse than that because it is reasoning with uncertainty. If I were uncharitable I'd say ChatGPT was wrong if it said the universe is composed like this

https://en.wikipedia.org/wiki/Lambda-CDM_model

and a year later we heard otherwise. At a very high level it had best be able to explain that there are alternate views on the subject, but it has to know when to stop. If I ask "How to get the length of a string in Python?", it is not helpful to "teach the controversy" that some of us use "sum(1 for c in s)". It has to handle contradictions, such as person A and person B believing different things, or the same person believing different things at different times, plus problems that are impossible or practically impossible to solve, so the logic is further complicated.

One route links the LLM up with an "old AI" system like the way AlphaGo links up neural and search based players.

The straight route for LLMs is to enlarge the attention capacity. Right now ChatGPT has a 4096-token (sub-word) attention window. If the task is "write two pages summarizing topic T with citations to the literature", it has to read all of the papers it cites. That could be 400,000 to 4,000,000 tokens, which one could only present to an LLM whose window was 100-1000x bigger. Maybe it can swap inputs in and out and otherwise conserve space, but I think the big weakness of LLMs in practice is that they run out of steam if the text is larger than what they are built for.


The size of the AI model has nothing to do with general intelligence. If it is realized, it will be some semantic difference in the way an AI teaches its own model or something similar. I’d guess it’s still about 10 years away.


I don’t think there’s such a thing as a thing that doesn’t fail.


Go to the OpenAI playground and ask it for “the last business day of this month”. This is my litmus test of the impending machine uprising.
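
For reference, a deterministic version of the task itself, ignoring holidays (which, as the replies below show, is exactly where it gets fuzzy):

    import calendar, datetime

    def last_business_day(year, month):
        day = calendar.monthrange(year, month)[1]   # last calendar day of the month
        d = datetime.date(year, month, day)
        while d.weekday() >= 5:                     # 5 = Saturday, 6 = Sunday
            d -= datetime.timedelta(days=1)
        return d

    print(last_business_day(2023, 2))  # 2023-02-28
    print(last_business_day(2020, 8))  # 2020-08-31 (the Monday discussed below)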


I was really on your side here when I tried this... until I was one-upped.

> What is the last business day of the month?

The last business day of the month is typically the last weekday of the month (excluding weekends). For example, the last business day of August 2020 would be Friday, August 28th.

> But August has 31 days, and August 31st, 2020 is a Monday.

In this case, the last business day of the month would be Friday, August 28th.

> Why not Monday, the 31st?

Monday, August 31st, 2020 is a public holiday in some countries, so it would not be considered a business day. Therefore, the last business day of the month would be the previous Friday, August 28th.


Aug 31 is a public holiday in like, Kyrgyzstan. It just got it wrong to start with, then made up something believable in response to you asking "why not?"


What is the last business day of this month?

> I'm sorry, as a language model AI I don't have access to the current date and time. Please provide the specific month you are asking about.

Today is February 4th, 2023

> The last business day of February 4th, 2023 is February 28th, 2023.

-------

That seems correct to me. Granted, it would have been better if it had said "the last business day of February, 2023" though.


Until binary logic systems can become emergent and autopoietic, I don’t think it will happen any time soon.


er.. how long till humans show general intelligence without any apparent failures..?


I think by 2024 it will be arguably AI, and by 2029 it will be unarguable.


It will not be ML models. And I’d bet on 100 years.


I think everyone is using a different definition of "failure" here, because it is impossible to formally, unambiguously, and objectively define what a successful AGI is.

Do you consider yourself a successful AGI? I guess you do.

Does that make you always right? No.

Does that make you at least capable of staying rational in all situations? No.

Well maybe the logical framework that you (and I) are basing all of rationality on is at least coherent? What do you mean it is not[1]? Who's that Gödel guy anyway?

The issue at the heart of this definition of AGI is that it is undecidable. That is why, rather than proving that an intelligence is unambiguously successful at being general, tests are being used, which look at whether it succeeds "often enough" at a given task in a particular set of conditions. The Turing test is such a test, but it is not the first one ever devised: the Jewish golem was not able to talk and did not get a name, so it was considered "unfinished" and as such failed the "being human-like" test. I don't want to bring unrelated parallels into this post, but I think the key takeaway is this: it is ultimately fruitless to define intelligence in terms of "success", as if it were a physical quantity one could measure.

So now, to get back to your original question: chat AIs have had a large amount of success in the past year; one Google researcher who worked with them even assumes some of them are sentient [2] and makes some strong arguments in that direction. AI can drive cars, play videogames, hold conversations... it's less than what humans can do, but they can also fold proteins [3] and find new patterns in humongous amounts of data [4], something that was impossible to do in a reasonable amount of time before they were created. So are they successful? I would say yes; are they perfect? Oh no, far from it, and my point is that they will never be perfect, and will always fail in some regards.

[1] https://en.m.wikipedia.org/wiki/G%C3%B6del%27s_incompletenes.... [2] https://www.scientificamerican.com/article/google-engineer-c... [3] https://www.science.org/doi/10.1126/science.370.6521.1144 [4] https://theconversation.com/seti-alien-hunters-get-a-boost-a...


Never


How long until each of us can say the same


We are no closer to AGI than we were 20 years ago. We haven't even started on the path towards AGI yet; we don't even know what the path would look like! I'm not sure why so many people fail to see this. The only way this is wrong is if AGI is somehow an emergent property of sufficiently trained LLMs. Most people would agree that's extremely unlikely (to the point it's not even worth considering), yet a lot of the same people seem to think that somehow "we're close!". How? Show me where?

To answer the question, my bet would be "not in any of our lifetimes".


I really don’t see any evidence whatsoever that LLMs couldn’t be a cornerstone/building block to future levels of AI

Why are you so strong minded that it cannot be this way? Genuinely curious as I’ve personally never seen more than conjecture that it should be this way


> I really don’t see any evidence whatsoever that LLMs couldn’t be a cornerstone/building block to future levels of AI

I've also not seen any evidence it can be. The reality is that we don't really know, because evidence one way or the other pretty much amounts to either 1) having a detailed and accurate understanding of human intelligence, or 2) building the thing to demonstrate the point.

I'm fairly certain 1) won't be happening any time soon, and I'm skeptical that 2) will happen any time soon, given the current limitations, but on this I'm far less certain. I don't think anyone can be certain, and anyone stating things one way or the other with absolute certainty is wrong.

I think the key limitation is that language is not intelligence, and that much of the progress has either been centred around language or has tackled comparatively simple problems.


There is definitely evidence that self-supervised prediction using e.g. Transformers is helpful for AGI. The brain has 100k cortical columns that, to the best of our knowledge, predict the next state given the current one. We've seen how these models can be used on all modalities: text, audio, images and video. It's a small part of what's necessary, but to say there's "no evidence" is complete hyperbole.



