> As the above samples show, our model is capable of generating samples from a variety of prompts that feel close to human quality and show coherence over a page or more of text.
OK, let's look at the sample that's displaying by default:
> System Prompt (human-written): Legolas and Gimli advanced on the orcs, raising their weapons with a harrowing war cry.
> Model Completion (machine-written, first try):
> “You are in good hands, dwarf,” said Gimli, who had been among the first to charge at the orcs; it took only two words before their opponents were reduced to a blood-soaked quagmire, and the dwarf took his first kill of the night. The battle lasted for hours until two of the largest Orcs attempted to overwhelm Aragorn. When they finally stopped, they lay defeated and lifeless for miles and miles.
> [Aragorn says something]
> “I’ll never forget it!” cried Gimli, who had been in the thick of the battle but hadn’t taken part in it.
This is not "close to human quality". It's terrible. Gimli kills an orc in battle... without taking part in the battle. It takes two words before the opponents (as opposed to, say, the battlefield) are reduced to a "blood-soaked quagmire", but the battle lasts for hours after that. After which two orcs lay defeated and lifeless for miles and miles.
This isn't even coherent from one sentence to the next. And paragraph three directly contradicts paragraph one. And Gimli calls Legolas a dwarf!
> As the above samples show, our model is capable of generating samples from a variety of prompts that feel close to human quality and show coherence over a page or more of text. Nevertheless, we have observed various failure modes, such as repetitive text, world modeling failures (e.g. the model sometimes writes about fires happening under water), and unnatural topic switching. Exploring these types of weaknesses of language models is an active area of research in the natural language processing community.
The authors go on to discuss more limitations (for example, the dataset doesn’t contain much outside of LOtR and some celebrities). I imagine that what the authors call “coherence” is weaker than what you are referring to (the AI is not necessarily telling a story, but it stays on the same topic / characters).
I still think that the result is incredibly impressive and powerful. You could start with this as a sort of English “noise”, and then run the result through a parser. This would allow you to add some “hard coded” world modeling or constraints. Ex: Maybe you could mix in sentiment analysis and reject some sentences to roughly control the narrative.
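To make the "reject some sentences" idea concrete, here is a minimal sketch of what such a filter might look like. Everything in it is an assumption for illustration: the word lists `POSITIVE` and `NEGATIVE` are a made-up toy lexicon, and `sentiment` and `filter_by_mood` are hypothetical helper names, not anything from the OpenAI post. A real setup would presumably use a trained sentiment model rather than word counting.

```python
import re

# Hypothetical toy lexicon; a real system would use a trained sentiment model.
POSITIVE = {"victory", "hope", "brave", "triumph", "joy", "cheered"}
NEGATIVE = {"defeated", "lifeless", "blood", "fear", "grim", "fell"}


def sentiment(sentence):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def filter_by_mood(sentences, target):
    """Keep sentences whose score does not oppose the target mood.

    target > 0 keeps neutral and positive sentences;
    target < 0 keeps neutral and negative ones.
    """
    return [s for s in sentences if sentiment(s) * target >= 0]


# Stand-ins for sentences sampled from a language model.
candidates = [
    "The brave company cheered their victory.",
    "The orcs lay defeated and lifeless.",
    "Gimli raised his axe.",
]
upbeat = filter_by_mood(candidates, target=1)
```

With these inputs, `upbeat` keeps the first and third sentences and drops the second. Note that this only steers tone; it does nothing about the contradictions discussed above, which would need some actual world model.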
I agree in a way that I suspect is much more specific than what you have in mind. This system is managing to produce a lot of text which is not heavily constrained, and what it produces is generally grammatical English. That is impressive; in the past, producing grammatical text meant very tight restrictions on what it was possible to say, making "text generators" little more than prerecorded phone tree messages.
But this model clearly doesn't know the meaning of anything it writes, and therefore can't produce anything better than obvious nonsense. This is true of some humans too -- it is a very serious condition known as Wernicke's aphasia ( https://en.wikipedia.org/wiki/Receptive_aphasia ):
> Patients with Wernicke's aphasia demonstrate fluent speech, which is characterized by typical speech rate, intact syntactic abilities, and effortless speech output. Writing often reflects speech in that it tends to lack content or meaning.
Obviously, those suffering from Wernicke's aphasia are not able to function in society, since they effectively can't say or understand anything. I don't think matching the performance of humans who have mental deficiencies so serious that they are unable to function really counts as being "close to human quality".
> I imagine that what the authors call “coherence” is weaker than what you are referring to
I had two specific things in mind as "coherence" failures:
- Gimli kills an orc, and then is said to have not taken part in the battle.
- The sentence "When they finally stopped, they lay defeated and lifeless for miles and miles." In context, the referent of "they" can only be the two orcs that attempted to overwhelm Aragorn. But it isn't possible for two dead orcs to cover "miles and miles" of terrain. If this had been written by a human, I would assume that what the writer had in mind, but failed to achieve, was to use "they" to refer to everyone taking part in the battle; I can't really make that assumption here. That sentence needs to use nouns, not pronouns, because its context doesn't allow for the pronouns.
For example, on the «“You are in good hands, dwarf,” said Gimli» part, I pattern-matched to [boisterous protagonist remark] when I saw the opening quote, and skipped until after the dot.
My point is: to a reader like me, this "filler" (that's not the right word, but you get what I mean) could be machine-generated and I would barely notice it. I guess an author could concentrate on writing the "important parts" and let the machine "fill up the gaps".
You may not notice that text you didn't want to read anyway is just random self-contradicting gibberish, but someone wanted to read that part of it, and they will notice.
I kid, but the human mind has this extreme capacity for filling in the blanks and re-adjusting plain contradictions into something coherent.
It's a little bit how I'm able to imagine these epic stories from Dwarf Fortress' Legends mode (unfortunately can't provide any relevant links right now).
> “You are in good hands, dwarf,” said Gimli
The line reads as though they are talking to a dwarf, when actually Gimli is the dwarf.
Others are going to do it, others will replicate the work, the best defence is getting it out there so we can understand it and learn how to counter it.
“Open” AI indeed.
If you successfully tested a nuclear weapon, it would be incredibly naive to think you could save the world from nuclear weapons just by keeping its implementation secret. Someone else out there is just as smart as you, and once you've proven it's possible, it's just a matter of time before someone else figures out how you did it.
Although personally I don't think their creation is anywhere near as dangerous as they would like to think, feels more like a PR stunt/dystopia LARPing.
i wish people would stop pretending that there is some good way to bring this technology into existence. yes, it's nice to try and let the good guys use it first, but it's just irrelevant in the long term. ultimately the result is going to be total proliferation of this technology in all areas where it has utility, and it will be used to the maximum extent in every application it is suitable for, including the really bad ones. the roll-out will make the transition smoother but it won't change what's actually important: the end result on the lives of our grandchildren.
growing up around rapidly advancing technology, i thought of technology as a double-edged sword: it cuts equally in both directions. but after thinking about it for a long time, i now believe that, in relation to human well-being, the presence of a given technology or combination of technologies can be a net positive or a net negative as well as neither. we need to think more carefully before letting these genies out of their bottles.
this is not an example that i think will be very negative, but it's very powerful and unexpected, for me at least. the next powerful and unexpected thing may not be benign. banning development of these kinds of technologies should not be off the table.
after reading this: https://blog.openai.com/better-language-models/#sample8 and browsing reddit for a while, i have realized that from now on i cannot assume human origin for 90% of the comments i read on reddit. this is insane.
I hate to be cynical here but I'm glad this has made you realize something that's been true since the Internet started; you shouldn't trust what's written on any forum! Be skeptical.
Also, with the advances in deep fakes and synthetic video...how long before you can't trust video evidence either?
I suppose one place to start thinking about this would be photos. Are photos admissible evidence or do courts only allow negatives? Photos have been modified for a very long time. This is probably the most famous example: https://amp.businessinsider.com/images/52af668569bedd3b2643d...
But yes, I am far more worried about how much more effective fake news will be once they start coming with actual videos.
You shouldn’t trust any single source, only a preponderance. Even then, be open to skeptics.
Because of this, it isn't helpful to say, "Oh you should always be skeptical! It doesn't matter if things have changed significantly such that we have more reason to be skeptical now."
Ever since Photoshop got good (20+ years now?) we haven't been able to assume that images are "real" either and things turned out fine. We'll have to learn to be skeptical.
Anyway, Reddit already has dedicated bots (account names ending in "SS") posting and commenting on their own content. It's mostly hilarious but sometimes fairly "real"; check out /r/SubredditSimMeta.
However, there is nothing that will stop them from further developing these technologies even without access to the research from more liberal nations.
To halt development is to drop out of an arms race that cannot afford to be lost.
I wonder if eventually we'll have sites like reddit or forums that require you to demonstrate who you are before joining. E.g., they require a photo of you and your passport. The site wouldn't use that information for anything, but this would reasonably guarantee that there's a real identity behind every poster.
I think we're going to face a lingering question with AI. We're imminently reaching the point where AIs will be able to generate fake everything. In the near future (if not present!), I could be fake for all you know writing lots of otherwise coherent posts, only to secretly jam in some sort of agenda I've been programmed to advocate for. And there could be millions, billions, an unlimited number of "me". Or the latest hottest site trying to sell itself on its own 'buzz' could be full of millions of people actively engaging on the platform, except none of them actually exist.
So do we try to keep these AI systems secret, or do we make them widely available and rely on a rapid shift in public consciousness as a result? It's one thing to try to tell people to engage in sufficient scrutiny over text, images, audio, and increasingly even video. It's another when people see that such fakes are trivially produced by anyone.
I do realize that the 'chaos scenario' sounds... chaotic... to put it mildly, but I think the underlying issue here is that these tools will reach the public one way or the other. By keeping them secret the big difference is that the public will be less aware of the impact they're having, and the players operating such tools will be disproportionately made up of people trying to use them for malicious purposes - be that advertising, political influence, or whatever else.
On another note, think back to how people in general responded to things like Eliza and other chatbots, or The Sims and other emergent storytelling games. What if there was a "social media" platform where all your connections were purposely AI, like your own personal TV sitcom/drama/comedy/whatever? Surround yourself with people who think like you, with whom you can finally have "intelligent discussions" without heated arguments from all the idiots who disagree. People love gossip; what if you had an endless supply from fake people? "You'll never believe what Frank said to Janice!" "I thought Bob and Alice would be together forever."
-- Ray Bradbury, "Fahrenheit 451" (1953)
> Having lived through the horrors of the pre-smartphone days
> the agonizing pain of [..] standing in line at a grocery store
-- HN, 2017 ( https://news.ycombinator.com/item?id=15912378 )
You should already be skeptical of every comment and look for what its agenda is.
The cheapness, and thus the commonness, of AIs doing it will just wake people up to the fact that they should not have been so trusting all along.
I think eventually every piece of information will have to be digitally signed and our devices will by default limit what we're exposed to based upon whitelists.
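A rough sketch of what "signed content plus a whitelist" could look like on the device side. Big caveat: this uses HMAC with shared secret keys from the Python standard library as a stand-in, which is not a true digital signature. A real scheme would use public-key signatures so the device only holds public keys. The `TRUSTED_KEYS` table, the key values, and the function names `sign` and `is_displayable` are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical whitelist: author IDs mapped to shared secret keys.
# Stand-in only; a real system would store public keys instead.
TRUSTED_KEYS = {"alice": b"alice-demo-key"}


def sign(key, body):
    """Compute a hex HMAC-SHA256 tag over the message body."""
    return hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()


def is_displayable(author, body, tag):
    """Show content only if the author is whitelisted and the tag verifies."""
    key = TRUSTED_KEYS.get(author)
    if key is None:
        return False  # unknown author: hidden by default
    # Constant-time comparison to avoid leaking tag information via timing.
    return hmac.compare_digest(sign(key, body), tag)


genuine = ("alice", "Hello", sign(TRUSTED_KEYS["alice"], "Hello"))
tampered = ("alice", "Hello!", genuine[2])  # body changed after signing
unknown = ("mallory", "Hello", sign(b"other-key", "Hello"))
```

Here `is_displayable(*genuine)` passes, while the tampered and non-whitelisted messages are rejected. This proves who signed something, not that a human wrote it.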
So, they never say this is near flawless, or that it would fool you in a Turing test. In some contexts, though, it may be usable maliciously. It could spoof Amazon reviews (as they mention), scalably fish for romance scam victims, sockpuppet political social media, harass, manipulate, scale troll-farming to new levels, or set up dates for you on Tinder.
The point is that the ability to impersonate humans is potentially troublesome. I don't think non-publication is an answer, but I do think the concern seems valid... to me.
OTOH, I've seen enough marketing fakes/mock-ups to be skeptical of this one.
For example, my takeaway from OpenAI Five was that the bots out-microed the human players, with little more to it.
What matters is how intelligent they are along various axes.
Automatic programs have been surpassing humans on some dimensions for ages, but we keep insisting that they are not truly intelligent because they can't beat us along all axes. Throughput on simple logic tasks was the elephant in the room, and the scope of "simple" has been expanding at an exponential pace.
Now they are closing the gap or surpassing us on axes that were thought to be bastions of human cognition (TFA, and after chess and Go, DeepMind's AlphaStar recently beat two StarCraft II professionals).
Freaking out (err... I mean "not releasing the full model") is understandable, but ultimately misguided as it will only delay the unavoidable... Unless the plan is to enact a global ban on AI research which I don't think is feasible anyway.