
Generating Poems with GPT-2 - gwern
https://www.gwern.net/GPT-2
======
CYHSM
Nice work! I was wondering if you noticed changes in the output coherence
during training?

I fine-tuned it on the corpus of The Office quotes [1] and noticed that a
loss of around 0.9 gives me the most 'humorous' outputs. This may be
subjective, but I think surprise plays a huge role in comedy, and with
longer training (loss around 0.4) the output feels overly predictable and
therefore less funny. I also tried sampling with temperatures >1, but then
it just goes crazy (e.g. some outputs are completely in Latin).

[1]
[https://www.reddit.com/r/MachineLearning/comments/bmn0og/p_l...](https://www.reddit.com/r/MachineLearning/comments/bmn0og/p_language_model_gpt2_finetuned_on_the_office/)
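The temperature effect described above can be sketched: sampling divides the logits by the temperature before the softmax, so T > 1 flattens the distribution and pushes probability mass onto rare tokens (hence the sudden Latin), while T < 1 sharpens it. A minimal pure-Python sketch, with illustrative function names rather than GPT-2's actual sampling code:

```python
import math
import random

def temperature_softmax(logits, temperature=1.0):
    """Softmax over logits scaled by 1/temperature.

    temperature < 1 sharpens the distribution (more predictable text);
    temperature > 1 flattens it, letting low-probability tokens through.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)                         # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Draw one token index via inverse-CDF sampling."""
    probs = temperature_softmax(logits, temperature)
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1                   # guard against rounding
```

At a very low temperature this reduces to near-greedy decoding; at T well above 1, even tokens the model considers unlikely get sampled regularly, which matches the "goes crazy" behaviour.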

~~~
gwern
I get a lot of Latin and Spanish in mine, but I think that's because they
actually are represented in the poetry corpus. It's not too surprising that
the regular GPT-2s were also exposed to a lot of foreign-language text,
since Reddit is not a strictly anglophone website, or that the model
remembers it despite some finetuning (there are so many parameters in it,
after all).

I do look at the training samples, but I've never noticed a worsening of
'coherence' in them, so to speak. I wonder if that's what overfitting looks
like? My PG corpus is so large that the GPT-2s struggle to converge, much
less overfit, so I don't know what overfitting would look like. You could
try using the new pseudo-validation loss checking feature nshepperd added
to see if there's any connection between the validation loss and your
perception of coherence.
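The usual way to connect a loss number to this kind of check is to hold out a validation split and watch for the point where training loss keeps falling while validation loss stalls or rises. A generic sketch of that heuristic (these helper names are hypothetical, not nshepperd's actual feature):

```python
import math

def perplexity(mean_ce_loss):
    """Convert mean cross-entropy loss (nats/token) to perplexity.

    E.g. the losses mentioned above: exp(0.9) is roughly 2.5,
    exp(0.4) is roughly 1.5 choices per token.
    """
    return math.exp(mean_ce_loss)

def is_overfitting(train_losses, val_losses, patience=3):
    """Heuristic overfitting check over per-evaluation loss histories.

    Flags overfitting when training loss is still improving but the
    best validation loss in the last `patience` evaluations is no
    better than the best seen earlier.
    """
    if len(val_losses) <= patience or len(train_losses) <= patience:
        return False
    best_recent = min(val_losses[-patience:])
    best_earlier = min(val_losses[:-patience])
    train_still_improving = train_losses[-1] < train_losses[-patience - 1]
    return train_still_improving and best_recent >= best_earlier
```

If the validation curve turns upward around the same checkpoints where the samples start feeling less funny or less coherent, that would support the overfitting explanation; if validation loss is still falling, the change in feel is coming from somewhere else.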

