
Auto-Generating Clickbait with Recurrent Neural Networks - lars
http://larseidnes.com/2015/10/13/auto-generating-clickbait-with-recurrent-neural-networks/
======
thenomad
If I could feed this an article and have it generate headlines based on the
text of that article (and they were any good), there is a solid chance I would
pay real money for that service.

Headlines are an absolute pain, and as the article says, they're decidedly
unoriginal most of the time. I can't see an obvious reason why an AI would be
much worse at creating them than a human.

~~~
SixSigma
Instead of generating random articles:

    1. Generate clickbait headlines
    2. Write suitable copy for them
    3. ???
    4. Profit

Where 3, of course, is "build ad network".

~~~
morgante
I know of at least one company that actually does this (tests headlines before
writing the actual articles).

~~~
bcohen5055
I've always wondered if sites do this the other way around. A site invests a
lot in the content it creates, so you would think that to get the most out of
it they would serve it with different headlines to different demographics.
Knowing from Facebook which headlines you have clicked through in the past can
indicate how they should write future headlines to get you to click again.

~~~
morgante
Oh, they definitely do it the other way around extensively.

------
blisterpeanuts
I like the notion of swamping the Internet with fake click-bait headlines, to
dilute the attractiveness of this (to me, odious) form.

Give me sincere, honest news and discussion, or else shut up.

Unfortunately, someone out there must really have a craving for "weird old
tricks" and "shocking conclusions".

It's a sort of race-to-the-bottom, least common denominator effect.

Maybe someone will write a browser extension that filters out obvious click-
bait headlines. Now _that_ would be clever!
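
A crude version of such an extension's core check could be a phrase-pattern
heuristic. A minimal Python sketch (the pattern list is purely illustrative,
not exhaustive):

    import re

    # Illustrative trigger phrases only; a real extension would need a much
    # larger list or a trained classifier.
    CLICKBAIT_PATTERNS = [
        r"\byou won'?t believe\b",
        r"\bweird (old )?trick\b",
        r"\bwhat happen(s|ed) next\b",
        r"\bdoctors hate\b",
        r"^\d+\s+(things|reasons|ways|photos)\b",
    ]

    def looks_like_clickbait(headline):
        h = headline.lower()
        return any(re.search(p, h) for p in CLICKBAIT_PATTERNS)

    print(looks_like_clickbait("This One Weird Trick Will Save You Money"))  # True
    print(looks_like_clickbait("Kernel 4.3 released"))                       # False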

~~~
billmalarky
People don't have a craving for this kind of crap; that is, they don't
actively search for it. It works by exploiting the brain. It's the publishing
equivalent of junk food. We know it's awful. We know it's bad for us. But we
struggle not to consume it because it's cheap and it pings our reward systems.

~~~
AnimalMuppet
Actually, I think that the clickbait junk _makes us think that it will ping
our reward systems_. For me, at least, it doesn't really reward me very well
(even in a junk food way).

Maybe this means that the real clickbait trash is training me not to click on
it, so I don't need the fake headlines to do so?

~~~
TeMPOraL
I cured myself of clickbait headlines after I clicked on a few and learned
to expect no content on the other side. It's a simple association, really. You
click on something X-y, you get no reward, and you learn not to waste time on
X-looking things.

~~~
billmalarky
For you, certainly. But the proof of the pudding is in the tasting: clickbait
drives an insane amount of traffic and shows no sign of slowing down. Cosmo
has been successfully using "clickbait" titles on its magazine covers for
decades.

~~~
TeMPOraL
Of course. I'm not denying the effectiveness of this technique, just providing
an n=1 data point. Maybe my personal idiosyncrasies make me immune to _that
particular_ type of traffic-driving technique (I have no doubts I'm vulnerable
to other methods).

------
rndn
Could this RNN model perhaps be used to filter clickbait headlines from HN
automatically? Perhaps one could perform some sort of backward beam search to
figure out how likely a particular headline would've been produced by it. If
there are words in a headline that the model doesn't know, one could perhaps
just replace them with ones it does know.
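
As a side note, scoring a fixed headline doesn't actually require beam search:
a language model's likelihood factors over words, so it can be computed
directly. A minimal sketch, assuming a hypothetical
model.next_word_probs(prefix) interface that returns next-word probabilities:

    import math

    def headline_score(model, headline, vocab, unk="<unk>"):
        """Average per-word log-probability of a headline under the language
        model. Low scores = unlike the clickbait corpus; high scores =
        clickbait-like. `model.next_word_probs(prefix)` is a hypothetical
        interface returning {word: probability} for the next position."""
        words = [w if w in vocab else unk for w in headline.split()]
        total = 0.0
        for i, word in enumerate(words):
            probs = model.next_word_probs(words[:i])  # P(w_i | w_0 .. w_{i-1})
            total += math.log(probs.get(word, 1e-12))
        return total / len(words)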

------
oneJob
Now if we can just teach AI to get sidetracked reading all this content we'd
also prevent Judgment Day.

SkyNet: (speaking to self?) "Unleash hell on humans. Launch all missiles."

SkyNet: (responding to self?) "Not now, not now. Let me finish this article on
John Stamos's belly button."

------
ChuckMcM
[https://xkcd.com/1283/](https://xkcd.com/1283/)

I really find RNNs to be pretty cool. When they are combined with the natural
human tendency to see patterns, they are hilarious. So perhaps we need to
update our million-monkeys hypothesis: a million RNNs with typewriters coming
up with all the works of Shakespeare.

~~~
notahacker
Shakespearean RNN: [http://cs.stanford.edu/people/karpathy/char-rnn/shakespear.txt](http://cs.stanford.edu/people/karpathy/char-rnn/shakespear.txt)

Surprisingly convincing if viewed as excerpts rather than a play.

Now to find some English teachers to try to interpret what Shakespeare meant
by some of those lines!

~~~
mey
[https://xkcd.com/356/](https://xkcd.com/356/)

------
clickok
Nice! I've wanted to do something like this for a while, too, but haven't had
the time yet.

What's interesting to me, from a research point of view, is the degree of
nuance the network uncovers for the clickbait. We all know that <person> is
going to be doing <intriguing action>, but for each person these actions are
slightly different. The sentence completions for "Barack Obama Says..." are
mainly politics-related, while those for "Kim Kardashian Says..." involve Kim
commenting on herself.

So it might not really understand what it's saying, but it captures the fact
that those two people tend to produce different headlines.

Neat Idea: what if we tried the same thing with headlines from the New York
Times (or maybe a basket of newspapers)? We would likely find that the
Clickbait RNN's vision of Obama is a lot different from the Newspaper RNN's
Obama. Teasing apart the differences would likely give you a lot more insight
into how the two readerships view the president than any number of polls
would.
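
A sketch of what sampling those completions might look like, assuming the same
kind of hypothetical next_word_probs(words) language-model interface as above
(the model names in the usage comment are placeholders):

    import random

    def complete(model, prefix, max_len=15):
        """Sample a continuation of `prefix` from a language model with a
        hypothetical next_word_probs(words) -> {word: prob} interface."""
        words = prefix.split()
        for _ in range(max_len):
            probs = model.next_word_probs(words)
            nxt = random.choices(list(probs), weights=list(probs.values()))[0]
            if nxt == "</s>":  # assumed end-of-headline token
                break
            words.append(nxt)
        return " ".join(words)

    # Hypothetical comparison of the two corpora's view of the same prefix:
    # for model in (clickbait_rnn, newspaper_rnn):
    #     print([complete(model, "Barack Obama Says") for _ in range(5)])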

------
mikkom
What surprises me most is that the headlines don't seem to be much better than
your average Markov chain output.

~~~
IanCal
I think this is for three main reasons:

1\. You can do really well with a simple grammar

2\. You only need short output

3\. Lack of training data

There's not an incredibly rich structure to extract, and with short outputs
the weirdness doesn't compound and cycles aren't as likely. A common _small_
dataset for playing with RNNs is all of Shakespeare, which is somewhere in the
region of 1M words.

However, this is still fun and interesting!

~~~
mikkom
> 3\. Lack of training data

> [...]

> There's not an incredibly rich structure to extract, and with short outputs
> the weirdness doesn't compound and cycles aren't as likely. A common small
> dataset for playing with RNNs is all of Shakespeare which is somewhere in
> the region of 1M words.

He does state that the network is trained with 2M headlines, meaning ~5-20M
words. That _should_ be enough.

I would have thought that the RNN would work somewhat better. It would be
interesting to see a direct comparison of fake Hacker News headlines generated
with Markov chains versus an RNN.
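
For reference, the word-level Markov chain baseline fits in a few lines. A
minimal sketch (the example headlines are placeholders for a real scraped
corpus):

    import random
    from collections import defaultdict

    def train_markov(headlines, order=2):
        """Map each `order`-gram to the list of words observed after it."""
        model = defaultdict(list)
        for headline in headlines:
            words = ["<s>"] * order + headline.split() + ["</s>"]
            for i in range(len(words) - order):
                model[tuple(words[i:i + order])].append(words[i + order])
        return model

    def generate(model, order=2, max_len=20):
        state, out = ("<s>",) * order, []
        for _ in range(max_len):
            nxt = random.choice(model[state])
            if nxt == "</s>":
                break
            out.append(nxt)
            state = state[1:] + (nxt,)
        return " ".join(out)

    headlines = ["You Won't Believe What Happened Next",
                 "10 Things Only 90s Kids Remember"]  # placeholder corpus
    print(generate(train_markov(headlines)))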

~~~
IanCal
True, I had managed to miss that, although it's working on 200-dimensional
word vectors rather than single characters as in the small Shakespeare
dataset. That feels like it might make it harder to train. I've personally
found more problems dealing with GloVe vectors compared to the word2vec ones,
but I don't have any hard data for that.

------
VLM
This was an enjoyable article. There is an obvious extension which is to mturk
the results and feed the mturk data back into the net. Just give the turkers 5
headlines and ask them which they would click first; repeat a hundred times
per a thousand turkers, or whatever.

Years ago I considered applying for DoD grant money to implement something
reminiscent of all this for military propaganda. That went approximately
nowhere, not even past the first steps. Someone else should try this (insert
obvious famous news network joke here, although I was serious about the
proposal). To save time, I'll point out why I never got beyond the earliest
steps: there is a vaguely infinite pool of clickbaitable English speakers on
the turk, but the pool of bilingual Arabic (or whatever) speakers with good
taste in pro-USA propaganda is extremely small. So the tech side was easy to
scale, but the mandatory human side simply couldn't scale enough to make the
output realistically anything but a joke.

------
rlu
> The training converges after a few days of number crunching on a GTX980 GPU.
> Let’s take a look at the results.

Stupid question: why is the GPU important here? I would have thought this was
more of a CPU task?

(Then again, as I typed this I remembered that bitcoin mining is supposed to
be GPU-intensive, so I'm guessing the "why" for that is the same as this.)

~~~
soggypretzels
GPUs are really good at parallel tasks, such as calculating the color of every
pixel on the screen or doing the same operation on a large dataset. According
to Newegg, the GTX980 has 2048 CUDA cores (parallel processing cores) that run
at ~1266 MHz, as opposed to a nice CPU, which might have 4 cores that run at 4
GHz. In other words, if you want to manipulate a whole bunch of things in one
way in parallel, the GPU can be used very effectively; if you want to
manipulate one thing a whole bunch of ways in series, the CPU is your best
bet.

(note: this is massively oversimplified)
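
To connect this to the article: one step of a vanilla RNN is dominated by
dense matrix multiplications applied across a whole batch of sequences, which
is exactly the "same operation on a large dataset" case. A numpy sketch with
illustrative sizes:

    import numpy as np

    # One step of a vanilla RNN over a batch of 128 sequences (sizes are
    # illustrative). The work is two dense matrix multiplications plus an
    # elementwise tanh: the same multiply-accumulate applied to every
    # element, which is what the GPU's thousands of cores parallelize.
    batch, embed, hidden = 128, 200, 512
    W_xh = np.random.randn(embed, hidden) * 0.01   # input-to-hidden weights
    W_hh = np.random.randn(hidden, hidden) * 0.01  # hidden-to-hidden weights
    x = np.random.randn(batch, embed)              # word vectors at time t
    h = np.zeros((batch, hidden))                  # previous hidden state

    h = np.tanh(x @ W_xh + h @ W_hh)               # new hidden state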

~~~
semi-extrinsic
Coarse rule of thumb: running on GeForce-class GPUs you can get up to 5x,
maaaybe 10x, the performance per dollar of a top-of-the-line CPU, assuming
your problem scales well on GPUs (many problems don't). The GTX980 is actually
a great performer. Tesla-class systems like the K40 are a lot closer to parity
with the CPU on performance/$ (they're not much faster than the GTX980 but a
lot more expensive). But you can get an edge with the Teslas when you start
comparing multi-GPU clusters to multi-CPU clusters, since with GPUs you need
less of the super-expensive interconnect hardware. (You're not going to put
GTX cards in a cluster; you'd have massive reliability problems.)

IMHO, the guys showing 100x speedups on GPUs are Doing It Wrong: they use a
poor implementation on the CPU, use just one CPU core, consider a very
synthetic benchmark, or pull a bunch of other tricks.

------
imaginenore
Getting this error:

    
    
        Error: 500 Internal Server Error
    
        Sorry, the requested URL 'http://clickotron.com/' caused an error:
    
        Internal Server Error
        Exception:
    
        IOError(24, 'Too many open files')
        Traceback:
    
        Traceback (most recent call last):
          File "/usr/local/lib/python2.7/dist-packages/bottle.py", line 862, in _handle
            return route.call(**args)
          File "/usr/local/lib/python2.7/dist-packages/bottle.py", line 1732, in wrapper
            rv = callback(*a, **ka)
          File "server.py", line 69, in index
            return template('index', left_articles=left_articles, right_articles=right_articles)
          File "/usr/local/lib/python2.7/dist-packages/bottle.py", line 3595, in template
            return TEMPLATES[tplid].render(kwargs)
          File "/usr/local/lib/python2.7/dist-packages/bottle.py", line 3399, in render
            self.execute(stdout, env)
          File "/usr/local/lib/python2.7/dist-packages/bottle.py", line 3386, in execute
            eval(self.co, env)
          File "/usr/local/lib/python2.7/dist-packages/bottle.py", line 189, in __get__
            value = obj.__dict__[self.func.__name__] = self.func(obj)
          File "/usr/local/lib/python2.7/dist-packages/bottle.py", line 3344, in co
            return compile(self.code, self.filename or '<string>', 'exec')
          File "/usr/local/lib/python2.7/dist-packages/bottle.py", line 189, in __get__
            value = obj.__dict__[self.func.__name__] = self.func(obj)
          File "/usr/local/lib/python2.7/dist-packages/bottle.py", line 3350, in code
            with open(self.filename, 'rb') as f:
        IOError: [Errno 24] Too many open files: '/home/ubuntu/clickotron/views/index.tpl'

~~~
striking
That'll teach you to do disk I/O on every page render.

~~~
DanielBMarkham
Yep. Have a separate process cache a few up and simply cp over the active one
to be served as a big bag of immutable bits. Bonus points for using a CDN.
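
A minimal sketch of that idea in Python: render in a background process, then
atomically rename the file over the live one so the server never reads a
partial write (filenames are hypothetical):

    import os
    import tempfile

    def publish(html, target="index.html"):
        """Write the freshly rendered page to a temp file in the same
        directory, then rename it over the live file. rename() is atomic on
        POSIX (same filesystem), so readers see either the old page or the
        new one, never a partial write."""
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(target) or ".")
        with os.fdopen(fd, "w") as f:
            f.write(html)
        os.rename(tmp, target)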

------
juddlyon
I can't stop laughing at these. Check out the Click-o-tron site:
[http://clickotron.com/](http://clickotron.com/)

~~~
tylerpachal
My favorite: "residents can't remember if they lost their wine at the same
time." [1]

[1] [http://clickotron.com/article/5588/residents-cant-
remember-i...](http://clickotron.com/article/5588/residents-cant-remember-if-
they-lost-their-wine-at-the-same)

------
flashman
I used a simpler technique (character-level language modelling) to come up
with an Australian real estate listing generator:
[http://electronsoup.net/realtybot](http://electronsoup.net/realtybot)

This is pre-generated, not live, for performance reasons. There are a few
hundred thousand items though, so the effect is similar.

The data source is several tens of thousands of real estate listings that I
scraped and parsed.

------
OhHeyItsE
This is simply brilliant.

(Ranking algorithm baked into a stored procedure notwithstanding. [ducks])

------
neikos
I am not sure how much credit I would give to the idea, as written in the
article, that the neural network 'gets' anything.

> Yet, the network knows that the Romney Camp criticizing the president is a
> plausible headline.

I am pretty certain that the network does not know any of this; its output
just happens to be understood by us as making sense.

~~~
notahacker
_Life Is About A Giant White House Close To A Body In These Red Carpet Looks
From Prince William’s Epic ‘Dinner With Johnny '_

from the article would be a good counterexample to the neural network
"getting" anything.

To an algorithm, "White House", "Prince William" and "Dinner With Johnny" are
to "Red Carpet" as "Romney" is to "Camp" and "Bad President".

------
andrewtbham
tl;dr: guy uses an RNN (LSTM) to create a link-bait site.

Hopes crowdsourcing will filter out the nonsense.

[http://clickotron.com/](http://clickotron.com/)

~~~
eb0la
Site down. Did HN readers crash the server? Everything old is new again
(Slashdot effect)?

------
chipgap98
"Tips From Two And A Half Men : Getting Real" is great. Some of the generate
titles are incredible

------
billconan
I can't understand the first two layers of the RNN, which according to the
author optimize the word vectors.

It says:

> During training, we can follow the gradient down into these word vectors and
> fine-tune the vector representations specifically for the task of generating
> clickbait, thus further improving the generalization accuracy of the
> complete model.

How do you follow the gradient down into these word vectors?

If word vectors are the input of the network, don't we only train the weights
of the network? How do the input vectors get optimized during the process?
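
The resolution is that the word vectors are not fixed inputs: the actual input
is a word index, and the first layer is a lookup into an embedding matrix that
is itself a trainable parameter, so the gradient flows into the selected rows
just like into any other weight. A toy numpy illustration (not the author's
code):

    import numpy as np

    # The input to the network is a word *index*; the first layer looks up a
    # row of a trainable embedding matrix E. The loss gradient flows back
    # into that row exactly like into any other weight.
    vocab_size, embed_dim = 1000, 200
    E = np.random.randn(vocab_size, embed_dim) * 0.01  # trainable word vectors
    w = np.random.randn(embed_dim)                     # toy stand-in for the rest

    word_idx = 42
    x = E[word_idx]              # forward: embedding lookup selects one row
    y = w @ x                    # toy "network": a dot product
    loss = 0.5 * (y - 1.0) ** 2  # toy squared-error loss

    grad_x = (y - 1.0) * w       # backprop: gradient w.r.t. the input vector
    E[word_idx] -= 0.1 * grad_x  # SGD step fine-tunes the word vector itself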

------
alkonaut
Missed opportunity for an HN headline.

This program generates random clickbait headlines. You won't believe what
happens next. You'll love #7.

------
indiv0
Reminds me of Headline Smasher [0].

Some pretty fun ones there, but it doesn't use RNNs; it just merges existing
headlines.

[0]:
[http://www.headlinesmasher.com/best/all](http://www.headlinesmasher.com/best/all)

------
kidgorgeous
Great tutorial. Been looking to do something like this for a while.
Bookmarked!

------
smpetrey
I think this one is my favorite:

Life Is About — Or Still Didn’t Know Me

~~~
JorgeGT
The "top" article in "clickotron.com" is "New President Is 'Hours Away' From
Royal Pregnancy" :)

------
CephalopodMD
Your main site is down. Bottle can't handle serving files scalably or
something? Point is, it broke.

~~~
lars
That was exactly the problem, bottle+gevent serving static files. It's moved
behind nginx now. (But you might have to wait for DNS propagation before you
get to the new server.)

------
hilti
Interesting blog post, but the site is down. How much traffic do you get from
HN?

------
joshdance
500 Internal Server Error on the site where you could upvote 'em.

~~~
lars
Working on it :) It's getting a bit more traffic than expected at the moment.

