Having read to the bottom, the quality of text generation there absolutely blew me away. GPT-2 texts have a somewhat disconnected quality - "it only makes sense if you're not really paying attention" - that this article lacks entirely. Adjacent sentences and even paragraphs are plausible neighbours. Even on re-reading more closely, it doesn't feel like the world's best writing, but I don't notice major loss of coherence until the last couple of paragraphs. I am now really curious about the other 9 attempts that were thrown away. Are they always this good?!
One potential issue with this approach is that the text it generates is 'nonsensical', in that it is almost like a word-salad. Although this is a standard problem with neural nets (and other machine learning algorithms), in this case the text actually is a word-salad. It seems that it has learned the rules of grammar, but not the meaning of words. It is able to string words together in a way that sounds right, but the words don't actually mean anything.
Plot twist: This comment was generated by GPT-3 prompted with some of the comments in this thread.
Soon enough, someone will replicate the Sokal hoax[b] with GPT-3 or another state-of-the-art language-generation model. It's not hard to imagine GPT-3 writing a fake paper that gets published in certain academic journals in the social sciences.
[b] https://en.wikipedia.org/wiki/Sokal_affair -- here's a copy of Sokal's hoax paper, "Transgressing the Boundaries: Towards a Transformative Hermeneutics of Quantum Gravity:" https://physics.nyu.edu/faculty/sokal/transgress_v2/transgre...
This comment was also written by GPT-3.
I have to admit, this is passing my Turing test...
Maybe the real lesson is we don't expect human-written comments on discussion fora to be particularly coherent....
Especially the second comment can be coherently interpreted with some good will and a cynical view of the humanities and philosophy. The "author" could be saying that once GPT-3 can write humanities papers it will quickly make humanities scholars redundant, and that the fact that humanities scholars are philosophers is not important and doesn't warrant a job on its own ("they don't actually do anything"). Eventually it shifts to saying that this is the fault of science working too well (GPT-3 being a product of science).
It's not a consistent argument, but without the context of these comments being GPT-3 it would have totally passed my Turing test, just not my sanity test.
The model fundamentally has no understanding of the world, so if it can successfully argue about a central thesis without simply selecting pre-existing fragments, then it would suggest that the statistical relations between words capture directly our reasoning about the world.
The final bit doesn't quite connect, but overall I've seen far less coherent comments written by humans on subjects with far more logical flaws.
I am genuinely awed.
Pretty average for HN then ;)
>In the not-too-distant future, there probably won't be any more philosophy professors; there will just be philosophers
Was quite clever and I'm still trying to figure out what it means.
It would be interesting to see if the output has a similar quality when trained only on highly regarded texts.
How could we expect it? After 35+ years (BBS and Usenet onward), we've learned that they are often not.
There's a totally valid discipline in taking concepts from different areas and smushing them together to make a new idea. That's what a lot of creativity is, fundamentally. So a bot that's been trained across a wide variety of texts, spitting out an amalgam in response to a prompt that causes a connection to be made, is not only possible, but likely a very good way of generating papers (or at least abstracts) for humans to check. And if the raw output is readable, why not publish it?
Would you please show us the input text, or rules, you gave to GPT-3 to create this comment?
Not gonna lie, I went poking around to see if I could get my hands on it, but it seems like the answer is no, for now.
Second, I'm curious/terrified at how future iterations of GPT-3 may impact our ability to express ourselves and form bonds with other humans. First it's text messages and comments. Then it's essays. Then it's speeches. Then it's love letters. Then it's pitches. Then it's books. Then it's movie scripts. Then it's...
TLDR; Fascinated by the technology behind making something like this work and quite worried about the implications of the technology.
(So I think it was some other story on the same topic.)
Especially the shoggoth cat dialogue, I found that one really creepy. The fragment below comes straight out of the uncanny valley:
Human: Those memes sound funny. But you didn’t include any puns. So tell me, what is your favorite cat pun?
AI: Well, the best pun for me was the one he searched for the third time: “You didn’t eat all my fish, did you?” You see, the word “fish” can be replaced with the word “cats” to make the sentence read “Did you eat all my cats?”
In fact, while reading that comment I started to wonder why no one has tried to use GPT to generate text one character at a time. Or if someone has, what are the advantages and disadvantages over the BPE approach.
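For the curious, a minimal sketch of the trade-off, assuming the Hugging Face transformers GPT-2 tokenizer as a stand-in for GPT-3's BPE vocabulary (not how GPT-3 itself exposes tokenization):

    # Compare sequence lengths under BPE vs character-level tokenization.
    from transformers import GPT2TokenizerFast

    text = "Did you eat all my cats?"

    bpe = GPT2TokenizerFast.from_pretrained("gpt2")
    bpe_ids = bpe.encode(text)          # one id per subword piece
    char_ids = [ord(c) for c in text]   # one id per character

    # BPE sequences are several times shorter, so a fixed context window
    # covers more text; the cost is that the model never sees individual
    # letters, which plausibly hurts puns, rhymes, and other wordplay.
    print(len(bpe_ids), len(char_ids))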
The quality of writing was very high, so I was convinced I was reading something put together by a human with agency... except it didn’t pass my gut-feeling “how IT works”. It made me suspect that either the algorithm (the described one, not the AI responsible) was off, or that I just didn’t understand AI any more. As I know I don’t have up to date AI knowledge, the algorithm appeared more believable. I hiked deep down the uncanny valley with that one.
Edit: it is amusing to think that soon the way to distinguish them will be that human comments have weird errors caused by smartphone keyboard "spell checking" in them...
Still, would be an interesting experiment. Gwern swears it would improve stuff, so worth trying and comparing, I guess
Given the propensity for academic writing to often favour the strategy of confusing the reader through obfuscation (to make a minor advance sound more significant than it is), I suspect tools like this could, as you say, actually get papers published in some fields like the social sciences. In an engineering or science paper you can check that equations match conclusions, that graphs match data, etc.
In a more qualitative field of work, reviewed in a publish-or-perish system that doesn't incentivise time spent on detailed reviewing, I think there's a very real risk babble like this just comes across like every other paper they "review".
I think it takes a certain level of confidence to dismiss others' work as nonsensical waffle, but sadly this is a confidence many lack, and they assume there must be some sense located therein. Marketing text is a great place to train yourself to recognise that much of what is written is meaningless hokum.
Sci-Gen - https://pdos.csail.mit.edu/archive/scigen/
Reporting on withdrawals of papers - https://www.researchgate.net/publication/278619529_Detection...
It's easy to consider text generation models as "just mimicking grammar". But isn't grammar also just a model of human cognition?
Is GPT modeling grammar or is it modeling human cognition? Since GPT can ingest radically more text (aka ideas) won't it soon be able to generate texts (aka ideas) that are a more accurate collation of current knowledge than any individual human could generate?
[Was this comment written by GPT-3?]
I am impressed though nobody dared to guess in 2 weeks.
I don't really understand why we're trying so hard to build models that can generate coherent texts based on having predigested only other texts, without any other experience of reality. Their capabilities appear already superhuman in their ability to imitate styles and patterns of any kind (including code generation, images, etc.). It feels like we're overshooting our target by trying to solve an unsolvable problem, that of deriving the semantics of reality from pure text, without any other type of input.
You can also publish a lot of nonsense in certain Chinese journals that optimize for quantity over quality, in whatever field you want.
Some say this has already happened. Nobody has ever seen the Social Text editors and Mochizuki in the same room together, have they?
This kills the forum.
Seriously, once this is weaponised, discussion of politics on the internet with strangers becomes completely pointless instead of just mostly pointless. You could potentially convince a human; you can't convince a neural net that isn't in learning mode.
We might end up with reputation based conversations.
That could have consequences for their reputation, though.
(Reputation is a lot more controversial and complicated than it sounds)
Quite the opposite, I suspect.
Eventually, to engage in the most persuasive conversations, the AIs will develop a real-time learning mode.
Once that is weaponised, the AIs will be on track to be in charge of running things, or at least greatly influencing how things are run.
What the AIs "think" will matter, if only because people will be listening to them.
Then it will be really important to discuss politics with the AIs.
The result is that worthwhile public discussion is dying. We have to transition now to secure verified communication.
Either that or the bots fork off a new cultural discourse and we treat them like a new form of entertainment.
GPT-3 isn't AGI, but it's weapons-grade in a way that GPT-2 wasn't.
I guessed it was fake before getting to the end, not from the content, but from the fact that all the sentences are roughly the same length and follow the same basic grammatical patterns. Real people purposely mix up their sentence structure in order to keep the readers engaged, whereas this wasn't doing that at all. Still very impressive though; if not for the fact that the post was about computer generated content I probably wouldn't have noticed.
Lots of examples I've seen have phrases like "see table below". Of course there's no table and it's hard to imagine how there could be.
But GPT is trained on internet content and the internet is full of terrible writing that never gets to the point. I doubt there's any way to know how much is "not actually understanding the subject matter" vs. "learning bad writing from bad writers". I'm inclined to believe the majority is the former but there's got to be a little of the latter sprinkled in.
One thing I learned was it has detailed knowledge of the world of Avatar: The Last Airbender, seemingly through fanfics. It was fun having it teach me the lost arts of pizzabending ("form your hands into the shape of a letter 'P'" and so on, and needing to practice by juggling rubber pizzas) and beetlebending ("always remember that to beetle bend it helps to like beetles," my wise uncle suggested). Each of these tended to precipitate a narrative collapse.
The writing style was surprisingly homogeneous, and it reminded me of young adult novels. It would definitely be interesting to see it with other writing styles, beyond the occasional old poetry.
> The man walks away and starts undressing. You shrug and keep following him. Soon, you find yourself naked.
So this morning he heard about an animal, it was kind of a lion. But with bat's ears, it lives in Africa. It looks like it's a rock, but it's actually not, it's rock shaped but has tiny legs. And it's gray and hard. Its face... It doesn't really have a face. It lives up in trees where it eats bamboo and apples. It has these huge fangs like sabertooth tigers, you know?
My smallest kid has a habit of telling stories about himself that actually come from whatever he heard recently, e.g. "once I was Godzilla..", or claims about things in reality that come from stories or misunderstandings all mixed up "did you know, there are three pigs, but they are not pigs, they are wolves and a hunter came and killed them but they weren't wolves they were dragons..."
It's actually very GPT-3-ish now that I think of it.
Maybe. Right now this reads like a glorified shopping list. It's coherent, but actually sounding human also requires a theory of mind.
E.g. I explain here why it's possible for written statements to be objectively insightful, informative, interesting, or funny, but objectively in a way that's relational to other information or beliefs. The implication being that statements are only going to seem subjectively funny or insightful (or whatever) to others who have that knowledge or those beliefs, which means that you can't reliably create those subjective experiences in a reader without having some sort of theory of mind for them.
I guess you can create content that's funny or insightful relative to that content itself, but that's not especially useful. It's entertaining at the time, but the experience is more like seeing a movie that you laugh a lot during but then leave and are kind of like what was the point? It's an empty experience because it wasn't transformative.
I definitely don't think it's impossible, but I also don't think it's a matter of just adding a couple more if-else statements.
I'm going to call this goalpost shifting. This article is better writing than some % of humans, theory of mind or otherwise. The AI has comfortably surpassed Timecube-level writing and is entering the pool of passes-for-a-human.
'Sounds human' is a spectrum that starts with the mentally ill and goes up to the best writers in human history.
That's completely fair. On the other hand, without a theory of mind it can't really educate or inspire people, the only thing it can do is maybe trick them about the authorship of something. But once people learn the techniques for identifying this kind of writing, it can't even do that anymore. To me this is like the front end of something, but it still needs a back end.
Don't get me wrong, it's super cool research and seems like a huge step forward, and I'm excited to see where it goes. But I also don't see this AI running a successful presidential campaign or whatever, at least within the next couple years.
And that got me thinking about what I could do with this thing, whether I should, what I wanted to try out...
So the BS random ideas were still inspiring a bit.
I wouldn't agree with that, either. How often have we heard of someone gaining useful insights by considering ideas that were misapplied or just plain wrong? Entire branches of physics have evolved that way. As far as successful presidential campaigns are concerned... well, let's not even go there.
If there's such a thing as a 'theory of mind', it applies to the reader, not the writer.
For example, I delayed in writing this comment because the cat was on my lap, and I couldn't fit the laptop and the cat both. You get that. I know you do, even if you don't own a cat, and even if you're reading this on a phone or a desktop.
GPT-3 does not understand about the cat. To GPT-3, they're just words and phrases that occur in the vicinity of each other with certain probability distributions. As a result, it can't write something and know that there's something there in your mind for it to connect to.
Cyc would handle the bit about the cat differently. It would ask "Did you mean cat-the-mammal, cat-the-Caterpillar-stock-symbol, or cat-the-ethernet-cable-classification?" It has categories, and some notion of which words fit in each category, but it still doesn't understand what's going on.
But you the human understand, because you have a lap, and you've at least seen a cat on a lap.
You really think GPT-3 never came across a comment about a cat in lap? 50% of all the pictures on the internet are cats sitting on people. GPT-3 doesn't need to understand it to echo this common knowledge.
Airplanes don't look like birds at all but they do fly.
I would like to see a GPT model where training data is weighted by credibility / authority (e.g. using Pagerank).
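A minimal sketch of one way that weighting could enter training, assuming each document comes with a credibility score (e.g. normalized PageRank of its source); model here is a placeholder, not a real API:

    import torch.nn.functional as F

    def weighted_lm_loss(model, input_ids, credibility):
        # input_ids: (batch, seq_len) token ids
        # credibility: (batch,) scores in [0, 1], e.g. normalized PageRank
        logits = model(input_ids[:, :-1])            # next-token logits
        per_token = F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),
            input_ids[:, 1:].reshape(-1),
            reduction="none",
        ).view(input_ids.size(0), -1)
        per_doc = per_token.mean(dim=1)              # average loss per document
        return (credibility * per_doc).mean()        # authority-weighted loss

The same scores could instead be used to up- or down-sample documents when building the corpus.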
What if the ultimate theory of mind turns out to be that consciousness is an illusion and nothing separates us from a sufficiently sophisticated Markov process?
Conscious experience would still exist (see cogito ergo sum, Chalmers, etc). If we were to be shown we're just Markov processes, that wouldn't disprove the existence of conscious experience. Just like confabulation, a misleading experience is still an experience.
What it would disprove is any sense of agency.
I certainly do, don't you? When I read a blog post and it's full of poorly-integrated buzzwords that make it seem like it was churned out by a non-English speaker being paid very poorly per word, I stop reading and move on.
I recently read a few pages of a book someone had recommended to me and stopped reading because of the writing style.
Heck, you can read a few pages of, say, a Dan Brown novel, and based on the writing style might choose not to read it, since the style tells you a lot about the kind of book it is.
That said, the content of the computer-generated parts doesn't make much sense even for a Bitcoin-influenced article (what would be the point of paraphrasing your previous post in a forum on a regular basis, and how does this not get one very quickly banned?), but the grammar is far far better than previous attempts - it reads like Simple English wiki.
It sounds to me like you must be an academic, or someone with good habits for being efficient at reading articles.
I agree, the responses are almost as interesting as GPT-3. And this place has always felt like one of the better ones when it comes to people reading past the titles!
* GPT-3 is trained on one
However that makes one wonder if it can also learn to generate emphases, and if so, how would it format? With voice generation it can simply change its tonality but with text generation it has to demarcate it in some way--does the human say "format the output for html", for instance?
I agree. The environment, as the source of learning and forming concepts, is the key ingredient of consciousness, not the brain.
Basically the brain and "consciousness" isn't as fancy as we think?
I currently work on synbio × web archival.
Some of us are cooking up futuretech aimed at storing all of IA (archive.org) in a shoebox. Others are working on putting archival tools in more normal web users' hands, and making those tools do things that people tend to value more in the short-term, like help them understand what they're researching, rather than merely stash pages.
My ambitions for web archives are outsized compared to other archivists, but I'm fine with that. I'm looking beyond web archives as we currently understand them toward web archives as something else that doesn't quite exist yet: everyday artefacts, colocated and integrated with other web technology to an extent that they serve in essential sensemaking, workflow, and maybe security roles.
Right now, some obvious, pressing priorities are (a) preserving vastly more content and (b) doing more with the archives themselves.
A: The overwhelming majority of born-digital content is lost within a far narrower time-slice than would admit preservation at current rates, and data growth is accelerating beyond the reach of conventional storage media. So, for me, the world's current largest x is never the true object of my desire. I'm after a way to hold the world that is and the world to come.
Ideally, that world to come is one where lifelong data stewardship of everything from your own genome to your digital footprint is ubiquitously available and loss of information has been largely rendered optional.
This, of course, requires magic storage density that simply defies fundamental limitations of conventional storage media. I'm strongly confident that we're getting early glimpses of the first real Magic contenders. All lie outside, or on the far periphery of, the evolutionary tree that got us the storage media we have today. For instance, I'm running an art exhibition that involves encoding all the works on DNA.
B: Distributed archival that comes almost as naturally as browsing is well within reach, and with that comes some very new potential for distributed computation on archives. One hand washes the other.
One important thing to realize here is that, in many cases, you can name a very small handful of individuals as the reason why current archival resources exist. GPT-3 is cracking the surface by training on data produced by one guy named Sebastian, for instance.
…I'm sorta tired and have to respond to something about every twitter snapshot since June being broken, though, so I'll pick this back up later.
I could use… what's the word? I think it's more funding.
But you are right, it can't be both in the context of this article :)
Now, I’m not so sure :)
Saying that, I briefly saw the first sentence of your comment and went to read the article with the idea that trickery was afoot, specifically guessing correctly the nature of the article. And yet, even then, on the back foot... it fooled me. Incredible.
It was relatively good, although I began to suspect it was GPT-3 generated about halfway through (partially because the style felt a bit stiff, but also just out of a Shyamalan-what-a-twist sixth sense of mine that was tingling).
I agree with you. I suspect few people have read until the end to realize that, in fact, ...
I then reread it and it indeed read like a weird, rambling, incoherent article. Looking at it closely, it had a good many contradictory, meaningless and incoherent sentences. ("It is a popular forum with many types of posts and posters.")
The headline, however, seemed about right.
It's true the nonsense in this article is a bit different than the nonsense of a GPT-2 article. But the thing is, GPT-2 paragraphs sound pretty coherent 'till they suddenly go off the rails. This is more like an article that was never quite on the rails, and so it's slightly more internally cohesive. But not "better".
Maybe the article just reflects the author's style. Anyone have a GPT-3 test site link?
GPT-3 is objectively a step forward in the field of AI text-generation, but the current hype on VC Twitter misrepresents the model's current capabilities. GPT-3 isn't magic.
With so many weights, it practically encodes a massive Internet text database.
However, if the output needs to be curated and edited by humans, the scale and automation is gone - we just get a different manual process, with a modest improvement to speed at cost of some decline in quality, and that's not very impactful.
Google at this point favours long form content for many search intents. Being able to generate thousands of these pages in one-click is a real problem. Not just because of popular topics e.g. "covid-19 symptoms" but more so for the long tail e.g. "should I drink coffee to cure covid-19".
It may be that Google's algorithms don't care at all how human-like the text is, or that their own recognition algorithm/NN (whatever they use) isn't fooled. Even if it is affected, Google has the money and corpus to build its own competing NN to recognize GPT-3 text.
That said, there might be a different threat to Google. GPT-3 seems really useful as a search engine of sorts (with the first answer implementing the 'I'm Feeling Lucky' button). Tune it for a query syntax, and for getting the 'top X' results somehow, then we just need the web corpus and a basic filter over the results. We could have a very interesting Google competitor.
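A hypothetical sketch of that, with an invented query syntax (not a real OpenAI or Google interface):

    # "GPT-3 as I'm Feeling Lucky": prime with Q/A pairs, then ask.
    prompt = (
        'Q: Who wrote "Transgressing the Boundaries"?\n'
        "A: Alan Sokal\n"
        "\n"
        "Q: Should I drink coffee to cure covid-19?\n"
        "A:"
    )
    # A completion endpoint (assumed interface) continues the prompt; the
    # first line it produces is the answer, which the basic filter
    # mentioned above would then have to vet.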
More than cherry-picking, there's the Eliza Effect - it's pretty easy to make people think generated text is intelligent. That text can seem intelligent for a while isn't necessarily impressive at all.
Makes me worry about my own reading comprehension, but I think what happened was that since it was posted on HN and got upvoted a lot, I simply assumed that anything that I didn't understand was not the writer's fault, but mine.
For instance, it was unclear from the post what the bitcoinforum experiment was about, but I just dismissed it as me not being attentive enough while reading.
At one point GPT-3 writes: "The forum also has many people I don’t like. I expect them to be disproportionately excited by the possibility of having a new poster that appears to be intelligent and relevant." Why would people he doesn't like be particularly excited about a new intelligent poster? Again I just assumed that I missed the author's point, not that it was nonsensical.
Twice it refers to tables or screenshots that are not included, but it seemed like an innocent mistake. "When I post to the forum as myself, people frequently mention that they think I must be a bot to be able to post so quickly" seemed like another simple mistake, meaning to say that when he posted as GPT-3, people thought he was being too quick.
This is like a written Rorschach test: when I'm convinced that what I'm reading must make sense, I'll guess at the author's intent and force it to make sense, forgiving a lot of mistakes or inconsistencies.
Bots offering idiocy, and idiocy generally, have done lots of damage. But by idiocy here I would mean quite carefully calculated, cleverly polarized positions, and I don't think just bot-rot (to maybe coin a phrase) would be enough.
It’s cool, but it looked like very basic stuff - the type of UI that is very easy to create in a few minutes. (And really with what was setup behind the scenes - maybe just as fast to just write the code.)
The hard part about software development is not those bits which are common, but the parts that are unique to our specific solution.
Search terms tweaked for your unique interests, and not a commercial entity's, for example.
Is reddit gold really that valuable?
Surely there are easier ways.
> really useful
We already have enough 2020 reddit commenters regurgitating 2010 hn threads regurgitating 2000 slashdot threads, thanks.
This will accelerate development. Is the current version there? Probably not. But GPT-4 might, and would then accelerate the development of future versions.
Even though this is not "magic", it sounds like it will turn into a practically usable and extremely valuable tool soon.
However, I like the spirit of optimism and the first look at encouraging and very promising results.
Honestly not that impressive since you can get comparable results with a series of regex rules given that there are limited ways to describe your intent e.g. "create a button of colour <colour> at the <location of button>"
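A toy sketch of that point (the accepted prompt grammar is invented for illustration):

    import re

    # One regex rule covering prompts like
    # "create a button of colour red at the top left".
    PATTERN = re.compile(
        r"create a button of colou?r (?P<colour>\w+)"
        r" at the (?P<location>[\w ]+)",
        re.IGNORECASE,
    )

    def prompt_to_html(prompt):
        m = PATTERN.search(prompt)
        if m is None:
            return None  # intent not covered by the rule set
        css_class = m.group("location").strip().replace(" ", "-")
        return ('<button class="%s" style="background: %s">Click me</button>'
                % (css_class, m.group("colour")))

    print(prompt_to_html("create a button of colour red at the top left"))
    # <button class="top-left" style="background: red">Click me</button>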
I believe the hype is that people think they can replace the designer by "just telling the computer" what they want. I don't believe that will work, as they already have trouble telling a human what they want, and a computer won't really know what to do with "I want it to kind of feel like it's from that movie with the blue people that Cameron did, you know?"
In my experience, people have a hard time writing their ideas about designs & features down, because they don't know what they want. They want to talk about it abstractly with somebody who has a better understanding of the field so that person can help them develop the idea. I don't think ML will cover that part any time soon.
From an academic standpoint, writing is part of the thinking process. If you haven't written it down, you haven't fully thought it through. If it feels difficult, that's probably because your understanding isn't as complete as you thought it was.
From a software development standpoint, implementing something is part of the thinking process. Ever notice how the requirements have a tendency to break as soon as you actually try to implement them? If a spot seems difficult it just means you hadn't really figured it out yet.
I 100% agree. I noticed a giant shift in tasks when I made one client write tickets instead of making phone calls. Writing it down forces you to think it through.
And I agree about software development as well, yes. Though I think it's even rare to have somebody describe all the features they want unless it's an experienced software developer who basically writes a textual representation of the application.
But for most PMs (that I've worked with at least), they have vague ideas about what they want, and bringing them into focus is a back and forth with developers and designers. I don't see them getting anywhere with an NLP automaton, but maybe with an Eliza-style system: "Give me a big yellow button saying 'Sign up'" - "Why do you want a big yellow button saying 'Sign up'?" - "You're right, that's too on the nose... give me a link saying 'Sign up'"...
@balajis being generated by GPT-3 would make a lot of sense, though.
Granted, it seems like there was a lot of behind the scenes work to make that happen.
It's qualitatively different than GPT-2. I was on a discord with someone that has access to it and a bunch of us were throwing ideas out for prompts. One of them was to provide an anonymized bio of someone and see if it could guess who it was. The format was 'this person...they..and then they...\nQ: Who is this person?\nA: '
At the first pass it didn't guess correctly. But we erased its response and tried again and it got the answer correct. We then asked it to screenwrite some porn and tell jokes. Yes there were some misses, but it got things right so frequently that you can start to see the future.
Having all of this capability in one package is pretty remarkable and nothing has approached it to date.
"Text generation" undersells it a little bit. What are humans except "text generation" machines? Language is the stuff of reason. GPT-3 has demonstrated capabilities that we believed were exclusive to humanity --- humor, logic, sarcasm, cultural references --- in an automatic and generalizable way. It's far more than a "text generation" system. I look forward to seeing what GPT-4 and GPT-5 can do. I suspect we're all going to be amazed by what we get when we continue down this path of simple scaling (and sparse-ification) of transformer architectures trained on (basically) the whole internet.
The ability to grow and choose our own direction: to choose what our goals are, curiosity, self-awareness, desire. To imply that GPT-3 is anything close to strong AI is kind of ridiculous.
I predict within a few years, the descendants of GPT-3 will use very different fundamental units for processing that differ greatly from the current state-of-the-art (i.e. they won't use BPEs and their ilk anymore, except for final output) and will be far more powerful as a result.
I do agree with you. We, as somewhat intelligent beings, do not base our thinking on words or language AFAIK, even though it's our best way to convey ideas to others. And we learn through experience, way faster than GPT-3 does, with fewer shots. It looks like the attention mechanisms are what made these models actually start to understand things... But those attention mechanisms are still very raw and mainly designed to be easy to execute on current hardware, I wonder how fast will we refine that.
Finally, it looks like, once trained, these models don't learn when we use them. They definitely don't learn through experience, and that's a major limitation on how intelligent they can be.
I think sentience like most things is a spectrum, so I'm not really sure what you mean by sentient, but I would argue that for most people the bar for sentience is much higher than text prediction. The Chinese room is only one aspect of our minds, and we don't even know what consciousness is.
And to be fair, reasonable people stake out positions on both sides of this debate: I'm not claiming that the alternative proposition is somehow unreasonable. It's a legitimate subject of scholarly disagreement.
Nevertheless, I'm still firm on language. Why? Because all complexity is ultimately about symbolic manipulation of terms representing the process of manipulation itself. ("Godel, Escher, Bach" is a fantastic exploration of this concept.) How can you manipulate concepts without assigning terms to their parts? That's what language is.
The question I like to ask is this: are there any ideas that you cannot express using language? No? Then how is thought distinct from language?
Yes, people (myself included) experience a "tip of the tongue" experience where you feel like you have an idea you can't just yet express. But maybe this experience is what reason feels like. Why should idea formation take only one "clock cycle" in the brain? Why should we be unaware of the process?
I think this feeling of having an idea yet being unable to formulate it is just the neural equivalent of a CPU pipeline stall. It's not evidence that we can have ideas without language: it's evidence that ideas sometime take a little while to gel.
I think as highly social beings we often annotate all of our thoughts with the language we could use to communicate them, which could lead us to believe that the thoughts are indistinguishable from the language, but that conclusion seems like an error to me. I’ve also heard some people talk about how they are “visual” or “geometric” thinkers and sometimes think in terms of images and structures without words.
Not sure there's one I can communicate to you, but I'm perfectly capable of forgetting the word for something and still knowing unambiguously yet wordlessly what it is, that's an experience.
Catching a ball? Running? Experiencing emotions from wordless music? Viewing scenery? Engaging with a computer game? How are they not conscious experiences?
To me this indicates a very narrow view of consciousness. Consider for a moment the quiet consciousness of the cerebellum for example.
I like the way David F. Wallace put it: 'Both flesh and not'. There's an astounding amount of consciousness that is not bound by language. One can even argue that language might hinder those forms of consciousness from even arising.
What is the role of the body in consciousness, then?
> only context it has is its prompt
The only real context is its latent representation of the prompt, there's nothing fundamentally limiting visual, auditory, symbolic, and mixed prompts as long as they map to a common latent space and the generator is trained on it.
Text generation doesn't chop wood, optimize speedruns, build machinery or win 100-metre dashes.
Text may be involved in training for these things, but to say that doing them is text generation would be like saying that... since compiling code and running AlphaZero both generates bits, AlphaZero is a compiler.
This does not impress me in the slightest.
Taking billions and billions of input corpora and making some of them _sound like_ something a human would say is not impressive. Even if it's at a high school vocabulary level. It may have underlying correlative structure, but there's nothing interesting about the generated artifacts of these algorithms. If we're looking for a cost-effective way to replace content marketing spam... great! We've succeeded! If not, there's nothing interesting or intelligent in these models.
I'll be impressed the day I can see a program that can 1) only rely on its own limited experiential inputs and not billions of artifacts (from already mature persons), and 2) come up with the funny insights of a 3-year-old.
Little children can say things that sound nonsensical but are intelligent. This sounds intelligent but is nonsensical.
Seriously, a few years ago recognizing if there's a bird in a photo was an example of a "virtually impossible" task: https://xkcd.com/1425/
Yeah I mean, I agree. But in my opinion, it's a case of "doing the wrong thing right" instead of a more useful "doing the right thing wrong."
I grant that these automated models are useful for low-value classification/generation tasks at high-frequency scale. I don't think that in any way is related to intelligence though, and the only reason I think they've been pursued is because of immediate economic usefulness _shrug_.
When high-value, low-frequency tasks begin to be reproduced by software without intervention, I think we'll be closer to intelligence. This is just mimicry. Change the parameters even in the slightest (e.g. have this algorithm try to "learn" the article it created to actually do something in the world) and it all falls down.
Progress is often made with steps that would have been astonishing a few years ago. And every time the bar is raised higher. Rightly so, but characterizing this as doing the wrong thing is missing the point of what we, and the system, are learning.
Yes it's not intelligence. But then, it's not even clear that we ourselves can define intelligence at all… not all philosophers agree on this. Daniel Dennett (philosopher and computer scientist) for example thinks that consciousness may be just a collection of illusions and tricks a mind plays with itself as it models different facets of and lenses into what it stores and perceives.
I think you missed my point. I think we're going in the wrong direction for AI entirely, and these "advances" are fundamentally misguided. OpenAI is explicitly about "intelligence," and so we should question if this is in fact that.
It's clear that humans have fundamental intelligence much better than all of this stuff with 6 orders of magnitude less input (at least of the same data sort) on a problem.
Perhaps it would be better to say, "I think the ML winter is just around the corner" as opposed to "the AI winter is just around the corner." That said, this really is math, and these algos still don't actually do anything resembling true intelligence.
>6 orders of magnitude less input
That is utterly mistaken.
We have the input of millions of generations of evolution which have shaped our brains and given us a lot of instinctive knowledge that we do not need to learn from environmental input that happens during our lifetime.
Instead it was learned over the course of billions of years, during the lifetimes of other organisms that preceded us.
Our brain structure was developed and tuned by all these inputs to have some built in pretrained models. That’s what instincts are. Billions of years in the making. Millions, at the very least, if you want to restrict it to recent primates, although doing so is nonsensical.
>That is utterly mistaken.
I did say of data of the "same sort".
What's absolutely crazy is that somehow we think of our DNA base pairs as more important than the physical context that DNA ends up in (society, humans, talking, etc.).
We have the ability to be intelligent and make thoughts with 1 millionth the amount of textual data as this OpenAI GPT-3 study. Maybe... just maybe... intelligence is far more related to things other than just having more data.
I'll actually expand on this and throw this out there: intelligence is in a way antagonistic to more data.
A more intelligent agent needs less knowledge to make a better decision. It's like a function that can do the same computation with fewer inputs. A less intelligent agent requires a lookup table of previously computed intelligent things instead of figuring it out on its own. I think all these "AI" studies are glorified lookup tables.
Note in particular that "like a function that can do the same computation with fewer inputs" maps very well to GPT-3 - it can complete many interesting tasks by just having a few samples provided to it, instead of having to fine-tune it with more training.
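Concretely, "a few samples" means examples packed into the prompt itself, with no gradient updates; a sketch in the format of the GPT-3 paper's translation demos:

    # Few-shot prompt: the task is induced from in-context examples alone.
    prompt = (
        "Translate English to French:\n"
        "sea otter => loutre de mer\n"
        "peppermint => menthe poivrée\n"
        "cheese =>"
    )
    # A completion endpoint (assumed interface) continues with "fromage",
    # without any fine-tuning or weight updates.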
The reason it doesn't need more training is because it's already trained itself with millions of lifetimes of human data and encoded that in the parameters!
Humans aren't born trained with data. The fact that we're throwing more and more data at this problem is crazy. The compression ratio of GPT-3 is worse than GPT-2.
You know what else is trained by the experiences of thousands of individual (and billions of collective) human lifetimes of data? And several trillions of non-human ones?
> Humans aren't born trained with data.
That's either very wrong or about to evolve into a no true scotsman regarding what counts as data.
AKA "why is it so hard to swat a fly?" because they literally have a direct linkage betweeen sensing incoming air pressure and jumping. Thats why fly swatters don't make a lot of air pressure.
Why do you yank your hand back when you get burned? It's not a controlled reaction. Where did you learn it? You didn't.
If you think the brain is much more than a chemical computer you are sadly mistaken. I would encourage you (not really but it's funny to say) to go experiment with psychedelics and steroids and you will quickly realize that these substances can take over your own perceived intelligence.
The most fascinating of all of this is articles/documentaries about trans people that have started taking hormones and how their perception of the world -drastically- changed. From "feeling" a fast car all of a sudden, to being able to visualize flavors. It's absolutely amazing.
This direction has produced results that eluded 30+ years of research. What is the evidence that this is the wrong direction?
Of course evolutionary algorithms are just one direction as well. But that doesn’t mean that nothing else is happening.
IIRC the following is attributable to either Margaret Atwood or Iris Murdoch:
"A writer should be able to look into a room [full of people] and understand [in breadth] what is going on."
A computer that is actually fluent in English — as in, understands the language and can use it context-appropriately — should blow your entire mind.
Did you never do grammar diagrams in grade school? :-)
The "context" and structure of language is a formula. When you have billions of inputs to that formula, it's not surprising you can get a fit or push that fit backwards to generate a data set.
This algorithm does not "understand" the things it's saying. If it did, that wouldn't be the end of the chain. It could, without training, make investment decisions on that advice, because it would understand the context of what it had just come up with. Plenty of other examples abound.
Humans or animals don't get to have their firmware "upgraded" or software "retrained" every time a new hype paper comes out. They have to use a very limited and basically fixed set of inputs + their own personal history for the rest of their lives. And the outputs they create become internalized and used as inputs to other tasks.
We could make 1M models that do little tasks very well, but unless they can be combined in such a way that the models cooperate and have agency over time, this is just a math problem. And I do say "just" in a derogatory way here. Most of this stuff could have been done by the scientific community decades ago if they had the hardware and quantity of ad clicks/emails/events/gifs to do what are basically sophisticated linear algebra tasks.
Hasn't the typical human taken in orders of magnitude more data than this example? And the data has been of both direct sensory experience and texts from other people as well.
Have you read GPT-3's 175 billion parameters (words, sentences, papers, I don't care) of anything? Do you know all the words used in that corpus? Nobody has or does.
A young child can listen to a very small set of things and not just come up with words to communicate to mama and papa what they learned, but reuse it. And this I think is key, because the language part of that is at least partially secondary. The little kid understands what they're talking about even if they have a hard time communicating it to an adult. The fact that they take creative leaps to use their extremely limited vocabulary to communicate their knowledge is amazing.
Your post was generated using GPT-3 and 175 billion parameters of pre-existing human writing, contextualized, distilled, and cross-referenced with terminology we've agreed on for centuries. It's a parrot, and I remain unimpressed.
Take the learned knowledge of GPT-3 (because it must be so smart right?) and have it actually do something. Buy stocks, make chemical formulas, build planes. If you are not broke or dead by the end of that exercise, I'll be impressed and believe GPT-3 knows things.
What's unimpressive about a stunningly believable parrot? I think, at the very least, that GPT-3 is knowledgeable enough to answer any trivia you throw at it, and creative enough to write original poetry that a college student could have plausibly written.
Not everything worth doing is as high-stakes as buying stocks, making chemical formulas, or building planes.
Sigh. When DNA becomes human, it doesn't have a-priori access to all the world's knowledge and yet it still develops intelligence without it. And that little DNA machine learns and grows over time.
When thousands of scientists and billions of human artifacts and 1000X more compute are put into the philosophical successor of GPT-3, it won't be as impressive as what happens when a 2 year old becomes a 3 year old. (It will probably make GPT-4 even less impressive than GPT-3, because the inputs vis-a-vis outputs will be even that much more removed from what humans already do.)
So basically like DNA?
All DNA does is encode how to grow, build, and maintain a human body. That human body has the potential to learn a language and communicate, but if you put a baby human inside an empty room and drop in food, it will never learn language and never communicate. DNA isn't magic, and comparing "millions of years of evolution" of DNA is nothing like the petabytes of data that GPT-3 needs to operate.
Again DNA has no knowledge embedded in it, it has no words or data embedded. Data in the sense that we imagine Wikipedia stored in JSON files on a hard disk. DNA stores an algorithm for growth of a human, that's it.
The GPT-3 model is probably > 700GB in size. That is, for GPT to be able to generate text it needs an absolutely massive "memory" of existing text which it can recite verbatim. In contrast, young human children can generate more novel insights with many orders of magnitude less data in "memory" and less training time.
If you mean that anything except full general intelligence is unimpressive than that seems like a fairly high standard.
The human brain requires less training, but to some extent it is pretrained by our genetic code. The human brain will take on a predictable structure with any sort of training.
This post was generated using GPT-3. [;)]
Can’t tell if you are kidding or not, but if you aren’t, mind sharing links about the researcher for the curious?
edit: Can't seem to find it which is a shame. I think it may have been included in a TED talk.
> While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples 
I don’t know what constitutes an example in this case but let’s assume it means 1 blog article. I don’t know many humans that read thousands or tens of thousands of blog articles on a specific topic. And if I did I’d expect that human to write a much more interesting article.
To me, this and other similar generated texts from OpenAI feel bland / generic.
Take a listen to the generated music from OpenAI - https://openai.com/blog/jukebox/. It's pretty bad, but in a weird way. It's technically correct - in key, on beat, etc. And even some of the music it generates is technically hard to do, but it sounds so painfully generic.
> All the impressive achievements of deep learning amount to just curve fitting
Judea Pearl
This comment was written by a human :)
I'd like to play devils advocate here.
Given one blog article in a foreign language: Would a human be able to write coherent future articles?
With no teacher or context whatsoever, how many articles would one have to read before they could write something that would 'fool' a native speaker? 1,000? 100,000?
I have no idea how to measure the quantity/quality of contextual and sensory data we are constantly processing just from existing in the real world; however, it is vital to solving these tasks in a human way, and it is a dataset that no machine has access to.
I would argue comparing 'like for like' disregards the rich data we swim amongst as humans, making it an unfair comparison
Why then, the continued obsession with building single-media models?
Is focusing on the Turing test and language proficiency bringing us further away from the goals of legitimate intelligence?
I would argue "yes", which was my original comment. At no point in us trying to replicate what an adult sounds like have we actually demonstrated anything remotely like the IQ of a small child. And there's this big gap where it's implied by some that this process goes 1) sound like an adult -> 2) think like an adult, which seems to be missing the boat imo. (There's logically this intermediate step where we have this adult-sounding monster AI child.)
If we could constrain the vocabulary to that a child might be exposed to, the correlative trickery of these models would be more obvious. The (exceptionally good) quality of these curve fits wouldn't trick us with vocabulary and syntax that looks like something we'd say. The dumb things would sound dumb, and the smart things would sound smart. And maybe, probably even, that would require us fusing in all sorts of other experiential models to make that happen.
I think it's literally just working with available data. With some back of the envelope math, GPT-3's training corpus is thousands of lifetimes of language heard. All else equal, I'm sure the ML community would almost unanimously agree that thousands of lifetimes of other data with many modes of interaction and different media would be better. It would take forever to do and would cost insane amounts of money. But some kinds of labels are relatively cheap, and some data don't need labels at all, like this internet text corpus. I think that explains the obsession with single-media models. There's a lot more work to do and this is, believe it or not, still the low hanging fruit.
But why not just 1 lifetime of different kinds of data? Heck, why not an environment of 3 years of multi-media data that a child would experience? That wouldn't cost insane amounts of money (or probably anything even close to what we've spent on deep learning as a species).
A corpus limited to the experiences of a single agent would create a very compelling case for intelligence if at the end of that training there was something that sounded and acted smart. It couldn't "jump the gun" as it were, by a lookup of some very intelligent statement that was made somewhere else. It would imply the agent was creatively generating new models as opposed to finding pre-existing ones. It'd even be generous to plain-ol'-AI as well as deep learning, because it would allow both causal models to explain learned explicit knowledge (symbolic), or interesting tacit behavior (empirical ML).
How would you imagine creating such an environment in a way that allows you to train models quickly?
We've been conditioned to accept articles where there's a lot of words and paragraphs and paragraphs of buildup, but nothing actually being said.
(For context, the vast majority of the article was generated by GPT-3 itself).
2) Quite irrelevant, that's a motivation problem