I remember reading this back in 2019. It was the first article that really made me pay attention to GPT. The bits about making acronyms and poor counting as second-order behaviors really jumped out at me at the time.
It's remarkable how well the article has aged - all the bits about "wow, look, it can kinda sorta try to summarize an article if you prompt it the right way" obviously all became insanely more relevant with GPT3 and GPT4. Same with the bits about translation and how it seemed like it could sorta write a poem.
Still a good read, and scary that it was written only 4 years ago.
Back in 2020 there was a GPT-based text adventure game called AI Dungeon that got really popular. It'd be cool to check out what that experience is like with the current iteration of the technology.
I'm working on a current iteration of an AI dungeon text based experience with AI illustrations and narration. If you're interested you can take a look https://twitch.tv/ai_voicequest
Yes, I remember that... It was shockingly good for its time.
However, the author, a young Mormon CS student fresh out of college, had done a couple of questionable things. First, he'd fine-tuned on selected stories from an online community. I don't remember its name, it wasn't AO3, but it was kind of the same, in that some of the material was - well, if it had been images it would have been illegal.
Not only had he not asked permission for this, but it meant that the model would often introduce risky material even if the user wasn't fishing for it. And a good many users were fishing for it.
When this came out, the founder threw his users under the bus pretty hard.
Personification is so easily applied, and so incredibly misleading.
It's fascinating how much information we manage to encode into text: so much more than the language itself we intentionally wrote.
Unfortunately, by personifying the model, we create an expectation that it will eventually start applying specific text patterns on purpose instead of simply continuing its core behavior: to implicitly restructure continuations along the patterns that humans have written into text.
What freaks me out even more is that it doesn't seem to be stopping. The better the models get, the more people personify them. It seems the ability to do language intersects with personhood, both in our psyche and in the extrapolation of current trends.
The better models get, the more conversation is to be had about them.
The problem is that nearly every single narrative has already personified LLMs. What else can a person do but continue following the narratives that were presented to them?
> It seems the ability to do language
You highlighted the key word: "do". Every person is capable of "do". That's a significant part of what we are. An LLM has no concept of "do". An LLM only models.
> Every person is capable of "do". That's a significant part of what we are.
Right, and what is/are the constitutive element(s) of doing? What is the ontology of "do-ing", and is it an essential characteristic of personhood? Secondly, how do we acquire this ontological framework? What are its origins, and how has it changed throughout history, or has it been static throughout the human experience?
Ontology is constructive. Ontology is explicit. These qualities represent an approach for language understanding that is the inverse of inference, which is the approach LLMs take.
Somehow, humans manage to do both: we remember the narratives we have heard or experienced, and we hallucinate new ones. We seem to navigate that dataset in an implicit way, but we construct speech and writing in an explicit way.
With the efforts currently getting the most attention in the tech news, are we on the way to AGI?
Gee, I worked in artificial intelligence the last time around. Wrote code, published papers, gave talks. My view at the time and since is the same -- that work had no promise of progress toward AGI.
From what I've seen of the current efforts, for whatever utility has been achieved, it appears that the output is based on borrowing, distilling, and abstracting from the input of what has already been done and published.
Soooo, we could consider questions with no published answers or at least answers rarely published and now not easy to find. Here are three such:
(1) I'll return to just a plane geometry puzzle question I encountered as a college freshman: By classic Euclidean construction, construct a triangle ABC with point D on AB and point E on BC so that the lengths AD = DE = EC.
(2) In the Kuhn-Tucker conditions of nonlinear optimization, are the Kuhn-Tucker and Zangwill constraint qualifications independent?
(3) Do the wave functions of quantum mechanics form a Hilbert space? Physics texts commonly claim "Yes" but with the usual pure math definition of a Hilbert space as a "complete inner product space" the answer is "No". In what is published, mostly the physics texts ignore the pure math definition and the pure math texts ignore the quantum mechanics wave function examples -- so a clean answer is not easy to find in the usual published material.
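To spell out (3) a little -- this is just a standard sketch, not the full story: take the wave functions to be, say, the continuous square-integrable functions on [-1, 1] with the usual inner product, and look at one Cauchy sequence.

    % Sketch only: continuous functions on [-1, 1] with the usual inner product
    \[
      \langle f, g \rangle \;=\; \int_{-1}^{1} \overline{f(x)}\, g(x)\, dx
    \]
    % and the continuous functions
    \[
      f_n(x) \;=\;
      \begin{cases}
        0,   & x \le 0, \\
        n x, & 0 < x < 1/n, \\
        1,   & x \ge 1/n .
      \end{cases}
    \]

The f_n form a Cauchy sequence in the induced norm, but their L^2 limit is a step function, which is not continuous. So that space of wave functions is an inner product space that is not complete; its completion is L^2. That's the sense in which the pure math answer is "No".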
More generally, a good pure mathematician about to publish some good, new results could, before publishing, put the question those results answer to current AI.
Maybe for an easier source of questions, just pick some of the more difficult exercises from some graduate texts in pure math. Correct solutions have not commonly been published, and some of the exercises require some understanding of the math in the text and some ingenuity.
Here's another chance: Once when I was teaching computer science at Georgetown University, as a final exam question I gave the code for quick sort into which I had inserted an error -- the question was to find and correct the error. As far as I know, that error and its correction have never been published.
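(Not the actual error from that exam -- as I said, that one has never been published -- but to give the flavor, here's a quick sort sketch in Python with a different, hypothetical logic error marked and corrected in the comments. The code is syntactically legal; the mistake is in the algorithm, which is exactly what a desk check should catch.)

    # Quick sort with one deliberately inserted logic error (hypothetical,
    # not the exam's error).  Lomuto-style partition; pivot is the last element.
    def quicksort(a, lo, hi):
        if lo >= hi:
            return
        p = partition(a, lo, hi)
        quicksort(a, lo, p)      # BUG: should be quicksort(a, lo, p - 1).
                                 # The pivot is already in its final place;
                                 # if it lands at index hi, this call repeats
                                 # the parent call and the recursion never ends.
        quicksort(a, p + 1, hi)

    def partition(a, lo, hi):
        pivot = a[hi]
        i = lo
        for j in range(lo, hi):
            if a[j] <= pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        return i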
Wait a minute, do people have to be able to solve those problems to be considered intelligent in the AGI sense? My guess is there's about 8B people who wouldn't pass that bar.
Also, aren't there loads of lower-level math questions that are just as unique? A quadratic equation with large random numbers would be easily solved by a high schooler yet not be in the dataset verbatim. Or perhaps a proof of some geometry fact that is a corollary of a well-known theorem. For example, one I came across earlier: it's well known that a chord subtends an angle at the centre that is double the angle it subtends on the circle. Now prove that if you see two angles, one double the other, opening towards the same line segment, you can draw a circle where the smaller angle is on the circle, the double angle is at the centre, and the line segment is a chord of the circle.
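To make the quadratic point concrete, here's a small sketch (hypothetical numbers) -- trivial for a high schooler, but the exact coefficients are vanishingly unlikely to appear verbatim in any training set:

    # Hypothetical example: a quadratic with large random coefficients,
    # solved with the standard formula.
    import math
    import random

    random.seed(0)
    a = random.randint(10**6, 10**7)
    b = random.randint(10**6, 10**7)
    c = -random.randint(10**6, 10**7)   # negative c keeps the roots real

    disc = b * b - 4 * a * c
    r1 = (-b + math.sqrt(disc)) / (2 * a)
    r2 = (-b - math.sqrt(disc)) / (2 * a)
    print(f"{a}x^2 + {b}x + ({c}) = 0  ->  roots {r1}, {r2}")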
Anyway what exactly is the bar for intelligence? There's lots of people who can't do one task or another, but we don't think of them as not intelligent.
> Wait a minute, do people have to be able to solve those problems to be considered intelligent in the AGI sense? My guess is there's about 8B people who wouldn't pass that bar.
Maybe 75+% of the "8B people" actually COULD "pass that bar" -- they would just need to get through good high school math and a college math major. That's my judgment after teaching quite a lot of college math, at Indiana University and Ohio State University.
Sooo, my guess here, for an effort at some insight into current AI and progress toward AGI, is that humans commonly can read, study, understand, and work exercises in math and "pass that bar", while current AI can read, ..., that math but won't be able to go far enough to "pass that bar".
> Anyway what exactly is the bar for intelligence?
For an answer (and for AGI we are assuming the context of human intelligence), it would be good, but for now too difficult, to come up with necessary and sufficient conditions.
But a necessary condition would be for the AI to be able to do what humans do or, as you suggest, commonly could do with some study. So I picked some work with relatively clear ways to tell -- pure math -- that humans commonly can do, and that we can tell they can do, with some study, and that I'm guessing current AI cannot. So, if this is correct, then the current AI fails a necessary condition.
Then we have the question, is doing well at the math also a sufficient condition? I haven't been trying to address that question!!!
Or, maybe at some time in the past commonly people would have guessed, expected, claimed, some such, that playing a really good game of chess would be a sufficient condition! But now I'd expect that fewer people would so guess!!
If only 75% of people could pass that math course (I'm a bit dubious), that still leaves 25% who can't. But we don't think of those people as not "Generally Intelligent", so the bar must be something easier?
My guess is also that you need to have some sort of test that isn't math. Maybe writing essays, which GPT-x is pretty good at.
Right. But the code with the "error" was still legal in the programming language. The "error" was in the logic of the quick sort algorithm.
So the question asked for a desk check of the quick sort algorithm. So the question tested understanding of the quick sort algorithm and ability to desk check. Considering the content of the course, that seemed like a reasonable question.
I'm fully in agreement that usually humans should not spend their time doing what compilers can do faster and better. Indeed when I was teaching my wife our AI language, I advised her not to try very hard to get all the punctuation, language details, syntax, etc. correct and, instead, just to let our compiler tell her where it wanted to object.
GPT4 has nowhere near the number of neural connections that a human brain has. It’s hardly surprising that its mental abilities are like those of a well trained pet.
Once neural nets approach the complexity of the human brain, along with the right training I don’t see any reason that the ability to solve problems by deduction and logic won’t emerge.
And an AI won't have to devote neural capacity to things like eating and wiping its arse, so it will probably get there sooner. Like an idiot savant, we'll probably have AI that are developing new mathematical and physical theories without even being what we'd consider self-aware.
I have a physics background, and posed GPT-4 a series of totally new classical physics questions, things like predicting velocity and acceleration in fluids with different shaped objects and forces on them. In most cases it gave a correct approximate answer, could show its work, and make reasonable simplifying approximations. There is just no way to do any of that without a correct model of the underlying physics concepts; there is no example data online with solutions to these.
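For a sense of the kind of question (this is a made-up instance, not one I actually asked): terminal velocity of a small steel sphere sinking in water, assuming quadratic drag with a constant drag coefficient. Getting it right means balancing weight against buoyancy plus drag, plus a sensible simplifying assumption about the drag coefficient:

    # Made-up instance of the kind of problem: terminal velocity of a sphere
    # sinking in a fluid.  At terminal velocity, weight = buoyancy + drag:
    #   rho_s * V * g = rho_f * V * g + 0.5 * rho_f * Cd * A * v**2
    import math

    g = 9.81              # m/s^2
    rho_fluid = 1000.0    # water, kg/m^3
    rho_sphere = 7800.0   # steel, kg/m^3
    r = 0.01              # sphere radius, m
    Cd = 0.47             # drag coefficient for a sphere (assumed constant)

    V = (4.0 / 3.0) * math.pi * r**3    # volume
    A = math.pi * r**2                  # cross-sectional area
    drag_per_v2 = 0.5 * rho_fluid * Cd * A
    v_terminal = math.sqrt((rho_sphere - rho_fluid) * V * g / drag_per_v2)
    print(f"terminal velocity ~ {v_terminal:.2f} m/s")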