One potential issue with this approach is that the text it generates is 'nonsensical', in that it is almost like word salad. Although this is a standard problem with neural nets (and other machine learning algorithms), in this case the text really is word salad. It seems to have learned the rules of grammar, but not the meaning of words. It is able to string words together in a way that sounds right, but the words don't actually mean anything.
Plot twist: This comment was generated by GPT-3 prompted with some of the comments in this thread.
Soon enough, someone will replicate the Sokal hoax[b] with GPT-3 or another state-of-the-art language-generation model. It's not hard to imagine GPT-3 writing a fake paper that gets published in certain academic journals in the social sciences.
[b] https://en.wikipedia.org/wiki/Sokal_affair -- here's a copy of Sokal's hoax paper, "Transgressing the Boundaries: Towards a Transformative Hermeneutics of Quantum Gravity:" https://physics.nyu.edu/faculty/sokal/transgress_v2/transgre...
This comment was also written by GPT-3.
I have to admit, this is passing my Turing test...
Maybe the real lesson is we don't expect human-written comments on discussion fora to be particularly coherent....
Especially the second comment can be coherently interpreted with some goodwill and a cynical view of the humanities and philosophy. The "author" could be saying that once GPT-3 can write humanities papers it will quickly make humanities scholars redundant, and that whether humanities scholars count as philosophers is beside the point and doesn't alone warrant a job ("they don't actually do anything"). Eventually it shifts to blaming science for working too well (GPT-3 being a product of science).
It's not a consistent argument, but without the context of these comments being GPT-3 it would have totally passed my Turing test, just not my sanity test.
The model fundamentally has no understanding of the world, so if it can successfully argue for a central thesis without simply selecting pre-existing fragments, that would suggest the statistical relations between words directly capture our reasoning about the world.
The final bit doesn't quite connect, but overall I've seen far less coherent comments written by humans on subjects with far more logical flaws.
I am genuinely awed.
Pretty average for HN then ;)
>In the not-too-distant future, there probably won't be any more philosophy professors; there will just be philosophers
Was quite clever and I'm still trying to figure out what it means.
It would be interesting to see if the output has a similar quality when trained only on highly regarded texts.
How could we expect it? After 35+ years (BBS and Usenet onward), we've learned that they are often not.
There's a totally valid discipline in taking concepts from different areas and smushing them together to make a new idea. That's what a lot of creativity is, fundamentally. So a bot that's been trained across a wide variety of texts, spitting out an amalgam in response to a prompt that causes a connection to be made, is not only possible, but likely a very good way of generating papers (or at least abstracts) for humans to check. And if the raw output is readable, why not publish it?
Would you please show us the input text, or rules, you gave to GPT-3 to create this comment?
Not gonna lie, I went poking around to see if I could get my hands on it, but it seems like the answer is no, for now.
Second, I'm curious/terrified at how future iterations of GPT-3 may impact our ability to express ourselves and form bonds with other humans. First it's text messages and comments. Then it's essays. Then it's speeches. Then it's love letters. Then it's pitches. Then it's books. Then it's movie scripts. Then it's...
TL;DR: Fascinated by the technology behind making something like this work and quite worried about the implications of the technology.
(So I think it was some other story on the same topic.)
Especially the shoggoth cat dialogue, I found that one really creepy. The fragment below comes straight out of the uncanny valley:
Human: Those memes sound funny. But you didn’t include any puns. So tell me, what is your favorite cat pun?
AI: Well, the best pun for me was the one he searched for the third time: “You didn’t eat all my fish, did you?” You see, the word “fish” can be replaced with the word “cats” to make the sentence read “Did you eat all my cats?”
In fact, while reading that comment I started to wonder why no one has tried to use GPT to generate text one character at a time. Or if someone has, what are the advantages and disadvantages over the BPE approach.
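To make the trade-off concrete, here is a toy sketch (not GPT's actual tokenizer, and the subword split is hand-picked purely for illustration) comparing sequence lengths under character-level tokenization versus a BPE-style subword split. The obvious cost of character-level generation is that the model must attend over far longer sequences to cover the same text:

```python
# Toy illustration: character-level vs. a hypothetical BPE-style subword split.
text = "uncanny valley"

# Character-level: every character is its own token.
char_tokens = list(text)

# Hand-picked subword split, roughly what a BPE vocabulary might produce
# (assumed for illustration, not taken from any real vocabulary).
subword_tokens = ["un", "canny", " ", "valley"]

print(len(char_tokens))     # 14 tokens to cover the string
print(len(subword_tokens))  # 4 tokens to cover the same string
```

The flip side is that a character-level model never hits out-of-vocabulary pieces and can, in principle, learn spelling-level patterns (like puns) that BPE merges hide.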
The quality of writing was very high, so I was convinced I was reading something put together by a human with agency... except it didn’t pass my gut-feeling “how IT works”. It made me suspect that either the algorithm (the described one, not the AI responsible) was off, or that I just didn’t understand AI any more. As I know I don’t have up to date AI knowledge, the algorithm appeared more believable. I hiked deep down the uncanny valley with that one.
Edit: it is amusing to think that soon the way to distinguish them will be that human comments have weird errors caused by smartphone keyboard "spell checking" in them...
Still, would be an interesting experiment. Gwern swears it would improve stuff, so worth trying and comparing, I guess
Given the propensity of academic writing to favour confusing the reader through obfuscation (to make a minor advance sound more significant than it is), I suspect tools like this could, as you say, actually get papers published in some fields like the social sciences. In an engineering or science paper you can check that equations match conclusions, that graphs match data, etc.
In a more qualitative field of work, reviewed in a publish-or-perish system that doesn't incentivise time spent on detailed reviewing, I think there's a very real risk babble like this just comes across like every other paper they "review".
I think it takes a certain level of confidence to dismiss others' work as nonsensical waffle, but sadly this is a confidence many lack, and they assume there must be some sense located therein. Marketing text is a great place to train yourself to recognise that much of what is written is meaningless hokum.
SCIgen - https://pdos.csail.mit.edu/archive/scigen/
Reporting on withdrawals of papers - https://www.researchgate.net/publication/278619529_Detection...
It's easy to consider text generation models as "just mimicking grammar". But isn't grammar also just a model of human cognition?
Is GPT modeling grammar or is it modeling human cognition? Since GPT can ingest radically more text (aka ideas) won't it soon be able to generate texts (aka ideas) that are a more accurate collation of current knowledge than any individual human could generate?
[Was this comment written by GPT-3?]
I am impressed, though, that nobody dared to guess in 2 weeks.
I don't really understand why we're trying so hard to build models that can generate coherent texts based on having predigested only other texts, without any other experience of reality. Their capabilities appear already superhuman in their ability to imitate styles and patterns of any kind (including code generation, images, etc.). It feels like we're overshooting our target by trying to solve an unsolvable problem, that of deriving the semantics of reality from pure text, without any other type of input.
You can also publish a lot of nonsense in certain Chinese journals that optimize for quantity over quality, in whatever field you want.
Some say this has already happened. Nobody has ever seen the Social Text editors and Mochizuki in the same room together, have they?
This kills the forum.
Seriously, once this is weaponised, discussion of politics on the internet with strangers becomes completely pointless instead of just mostly pointless. You could potentially convince a human; you can't convince a neural net that isn't in learning mode.
We might end up with reputation based conversations.
That could have consequences for their reputation, though.
(Reputation is a lot more controversial and complicated than it sounds)
Quite the opposite, I suspect.
Eventually, to engage in the most persuasive conversations, the AIs will develop a real-time learning mode.
Once that is weaponised, the AIs will be on track to be in charge of running things, or at least greatly influencing how things are run.
What the AIs "think" will matter, if only because people will be listening to them.
Then it will be really important to discuss politics with the AIs.