Hacker News
Generating Fake Conversations by Fine-Tuning OpenAI's GPT on Data from Messenger (tenoke.github.io)
46 points by Tenoke 9 days ago | 7 comments





Sorry, but these generated conversations seem nonsensical, nothing like the OpenAI results.

It uses their small model, a tiny dataset by comparison, and a small amount of training. It is more a showcase of how much it learns (and doesn't learn) with those limitations in place, and it lets you recreate the result with perhaps a few minutes of work and less than an hour of waiting.

Also, I wouldn't say the results are nonsensical. I think it has learned a lot more than a Markov chain or a simple RNN would, but I agree that, especially on the surface, they don't even sound like they surpass ELIZA by much. Moreover, how much it learns about the different people you've talked to is significantly more apparent AFTER you run it on your own data.

For a somewhat more novel/interesting result with fine-tuning GPT, I can recommend checking out gwern's post[1] on training it on a big poetry corpus.

1. https://www.gwern.net/RNN-metadata#finetuning-the-gpt-2-smal...


As it happens, nshepperd ran his GPT-2-small finetuning on our IRC channel. I'd tried it before with char-RNN back in 2015 or so, and I have to say, GPT-2-small trained way faster and better than my IRC char-RNN did.

The samples also looked a lot better than OP's. I assume that's because he ran it for more like a day on a few hundred MB of chat logs.



Found this really interesting. I wonder if it could be used to create ~realistic conversations on a new discussion platform so it appears like people are there? :)

Who's going to connect this to CleverBot and let the hilarity ensue?

I think it is a step of progress, really. They are able to sound like vapid morons, if incongruously named ones: they are named like Middle-earth elves, yet type like elders who have the financial understanding of teenagers who haven't been allowed the barest whiff of legal documents or checkbooks.

Personally, to me it reads like something written by someone disgruntled in IT, mocking the blatantly unqualified, nepotism-installed morons they have to work with. It could be embedded in an "Office Space"-style novel as actual chats among the "worthless heir brigade" being eavesdropped on.

It passes the Turing test in the same sense that spamming random keys on a virtual keyboard, alternating between silence and way too many messages, seems like a real, totally illiterate toddler got hold of the keyboard for a bit.



