
Controlling Text Generation with Plug and Play Language Models - aleyan
https://eng.uber.com/pplm/
======
sovreign
This is incredible! At minimum I see much-improved chatbots and
conversational UIs. Text generation libraries in the future might be like
super-powered printf formatting, where the numeric placeholders are actually
PPLM discriminators used to force GPT-2 to leave a position for a value, and
the rest of the sentence prefills %s placeholders using a combination of
other PPLM discriminators.

e.g. printf("%[budget]s %f", 654.0), where the string placeholder is
generated using the bag-of-words discriminator for "budget"-related sentences
that also contain a single float placeholder (which is then replaced with
your custom value).
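
A toy sketch of the idea, with caveats: real PPLM steers generation by perturbing the model's hidden activations along the gradient of an attribute model, not by editing logits, and all names here (the vocabulary, the fake `lm_logits`) are made up for illustration. This just shows the bag-of-words "discriminator" notion as a bias on next-token scores:

    import math, random

    random.seed(0)

    # Toy vocabulary and a stand-in "language model" (uniform logits);
    # in the real system this would be GPT-2's next-token distribution.
    vocab = ["the", "deficit", "spending", "cat", "budget", "tax", "dog"]
    def lm_logits(prefix):
        return {w: 0.0 for w in vocab}

    # A bag-of-words attribute model for the [budget] topic.
    budget_bow = {"budget", "deficit", "spending", "tax"}

    def steered_sample(prefix, bow, strength=3.0):
        logits = lm_logits(prefix)
        # Crude approximation of PPLM steering: boost the logits of
        # bag-of-words tokens, then sample from the softmax.
        for w in bow:
            logits[w] += strength
        exps = {w: math.exp(v) for w, v in logits.items()}
        z = sum(exps.values())
        r, acc = random.random() * z, 0.0
        for w, e in exps.items():
            acc += e
            if r <= acc:
                return w
        return w

    tokens = [steered_sample("", budget_bow) for _ in range(20)]
    print(sum(t in budget_bow for t in tokens) / len(tokens))

With `strength=3.0` the on-topic tokens dominate the samples; at `strength=0.0` you get the base model's distribution back.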

~~~
gwern
What's particularly nice is that you can plug in a classifier for things like
aesthetics based on human ratings, along the lines of 'learning from human
preferences'
[https://arxiv.org/abs/1909.08593](https://arxiv.org/abs/1909.08593) but
better: why spend the enormous effort running PPO to brute-force the
classifier into yielding the desired text or image output, when you can just
backprop through it and let the classifier itself tell you exactly how to
improve the inputs?
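
The contrast can be sketched in a few lines. Instead of treating the reward model as a black box that an RL loop queries, differentiate through it and follow its gradient. Everything below is a toy stand-in of my own (a fixed logistic scorer over a feature vector, not an actual preference model):

    import numpy as np

    rng = np.random.default_rng(0)

    # Frozen differentiable "classifier": a logistic scorer with fixed
    # weights, standing in for a preference model trained on human ratings.
    w = rng.normal(size=8)
    b = -0.5

    def score(x):
        return 1.0 / (1.0 + np.exp(-(w @ x + b)))

    # Rather than black-box RL (PPO) probing score() from the outside,
    # backprop through it: the gradient of the score w.r.t. the input says
    # exactly how to change the input.
    def improve(x, steps=100, lr=0.5):
        for _ in range(steps):
            s = score(x)
            grad = s * (1 - s) * w   # d(score)/dx for the logistic scorer
            x = x + lr * grad        # gradient ascent on the classifier score
        return x

    x0 = rng.normal(size=8)
    print(score(x0), score(improve(x0)))

In the text setting the ascent is over continuous model internals (hidden states or embeddings) rather than a raw feature vector, which is essentially what the PPLM post does.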

(I was writing up this idea as a proposal for improving preference learning at
[https://www.gwern.net/GPT-2-preference-learning#optimization...](https://www.gwern.net/GPT-2-preference-learning#optimization-by-backprop-not-blackbox)
based on my experiences idly waiting for PPO to run and being annoyed by the
divergences, and then Uber just... blogs it out. Like a madman.)

------
master_yoda_1
Snake oil ;)

~~~
boboddy99
Isn't that all of deep learning?

