
How I trained an AI to detect satire in under an hour - thetall0ne
https://towardsdatascience.com/how-i-trained-an-ai-to-detect-satire-in-under-an-hour-2b8b300ea805
======
colinhmit
Anecdotally, I've noticed The Onion uses certain phrases over and over again
in articles - "area man" comes to mind.

Did you really train it to detect satire, or just The Onion writers'
conventions? How does it perform when trained on Onion articles and tested
against some non-Onion satire publication?
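
A minimal sketch of the check being asked for here: train on one publication, then score on satire from a different one. Everything in it is an assumption for illustration. The headlines are made up, the second publication is hypothetical, and the naive word-count classifier is a stand-in, not the model from the article.

```python
from collections import Counter

# Toy, made-up headlines purely for illustration; not real data.
# Label 1 = satire, label 0 = regular news.
TRAIN = [
    ("area man wins argument with toaster", 1),
    ("area woman declares nap a success", 1),
    ("city council approves new budget", 0),
    ("central bank holds interest rates steady", 0),
]
# Satire from a hypothetical second publication, plus regular news.
HELD_OUT = [
    ("local senator replaced by houseplant, polls improve", 1),
    ("parliament votes to rename monday", 1),
    ("council approves road repairs", 0),
    ("bank raises interest rates", 0),
]

def train(examples):
    """Count how often each word appears per class (naive bag of words)."""
    counts = {0: Counter(), 1: Counter()}
    for text, label in examples:
        counts[label].update(text.split())
    return counts

def predict(counts, text):
    """Vote by per-class word counts; ties go to class 0 (not satire)."""
    score = {label: sum(c[w] for w in text.split()) for label, c in counts.items()}
    return 1 if score[1] > score[0] else 0

def accuracy(counts, examples):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(predict(counts, text) == label for text, label in examples) / len(examples)

model = train(TRAIN)
in_source = accuracy(model, TRAIN)        # scored on the publication it saw
cross_source = accuracy(model, HELD_OUT)  # scored on an unseen publication
print(in_source, cross_source)
```

On this toy data the model is perfect on the publication it saw and misses all of the unseen publication's satire, because it only learned that publication's vocabulary ("area", "man", and so on).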

~~~
turdnagel
Exactly. A bit of an exaggerated headline, no? The first paragraph basically
refutes the headline. What was built was an Onion article detector.

~~~
beal
Article seems like it's a thinly veiled ad.

~~~
24gttghh
All the bolding of text basically gives it away...

------
minimaxir
This has the same input data fidelity issues as the author's previous approach
toward identifying fake news, which was flagged to death for being misleading:
[https://news.ycombinator.com/item?id=16128295](https://news.ycombinator.com/item?id=16128295)

A sample size of 600 for _text data_ is literally nothing for these types of
models. (Although at least the classes are balanced this time.)

------
latenightcoding
This is a thinly veiled "machinebox" ad. Thanks for teaching us how to overfit
in under an hour OP.

~~~
minimaxir
I suspect the OP cheated by only training on 1 epoch; otherwise, the overfit
would become obvious as the validation accuracy craters.
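
A rough sketch of how that would show up, assuming validation accuracy is recorded after every epoch. The helper name `best_epoch_with_patience`, the `patience` value, and the accuracy numbers are all hypothetical; none of them come from the article.

```python
def best_epoch_with_patience(val_acc, patience=2):
    """Return (best_epoch_index, stopped_early).

    Stops once validation accuracy has gone `patience` epochs without
    beating its best value so far, the usual early-stopping rule.
    """
    best, best_i = float("-inf"), -1
    for i, acc in enumerate(val_acc):
        if acc > best:
            best, best_i = acc, i
        elif i - best_i >= patience:
            return best_i, True
    return best_i, False

# Hypothetical curve: validation accuracy peaks at epoch 1, then falls
# while training accuracy (not shown) would keep climbing.
several_epochs = [0.68, 0.74, 0.71, 0.66, 0.60]
one_epoch = [0.68]

print(best_epoch_with_patience(several_epochs))
print(best_epoch_with_patience(one_epoch))
```

With a single epoch there is only one validation number, so there is nothing to compare it against and a drop can never be observed.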

------
happertiger
As always, it’s easy to apply this technology to telling one publisher’s
content from another’s, as is the case here. In addition, The Onion is satire,
but satire is the easier use case: not only is it single-source content (as
mentioned, though the larger the author pool and the more differentiated the
model, the harder accuracy becomes), it also doesn’t have to account for less
outrageous articles built on subtler genres like parody and sarcasm. Subtlety
crushes machine learning algs, IME.

Love the concept, but it’d be great to see a deeper exploration as a demo.
Keep going!

------
MajorSauce
Without diminishing the author's efforts, I would say that he quickly teaches
his AI how to recognize The Onion articles, instead of satire articles in
general.

I would be curious to see results with 3-4 news sources for each group.
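
One way to set that up, sketched under the assumption that every article is tagged with its source: hold out one source at a time so the test articles never come from a publication seen in training. The source names and rows below are placeholders, not real data.

```python
def leave_one_source_out(examples):
    """Yield (held_out_source, train, test) with no source shared between splits."""
    sources = sorted({source for _, _, source in examples})
    for held_out in sources:
        train = [ex for ex in examples if ex[2] != held_out]
        test = [ex for ex in examples if ex[2] == held_out]
        yield held_out, train, test

# Placeholder rows: (text, label, source). Label 1 = satire, 0 = news.
EXAMPLES = [
    ("headline 1", 1, "satire_a"), ("headline 2", 1, "satire_b"),
    ("headline 3", 1, "satire_c"), ("headline 4", 0, "news_a"),
    ("headline 5", 0, "news_b"), ("headline 6", 0, "news_c"),
]

folds = list(leave_one_source_out(EXAMPLES))
for held_out, train, test in folds:
    print(held_out, len(train), len(test))
```

Each fold then answers the narrower question of whether the classifier generalizes to a publication it has never seen, and averaging across folds gives a less flattering but more honest number.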

------
komali2
This is what I loved about Google cloud machine learning API (or whatever
mixture of the above nouns it's called now). I found it during my final
project as a coding bootcamp student and got it up and running within a day,
telling me whether a sentence was in one of three given languages.

Machine learning / AI things like this are so simple and approachable right
now. Just fill a .CSV and upload it, boom, training model.

------
mozumder
Can't tell if this is satire or not... The author needs to put his own article
into his classifier.

------
Daniel3
I think the author has used the word "trained" satirically.

