
DeepMind uses the Daily Mail as a huge training corpus for text comprehension - ilyaeck
http://www.technologyreview.com/view/538616/google-deepmind-teaches-artificial-intelligence-machines-to-read/
======
paulsutter
The original DeepMind paper [1] is based on a really smart idea. Algorithmic
development relies on measuring the performance of any proposed algorithm. For
reading comprehension, performance is evaluated using Q&A about the corpus.
It's difficult to find a large corpus with a comprehensive set of questions
about the content.

DeepMind is cleverly converting the Daily Mail article summaries into
questions by removing a proper noun. For example:

Question: Producer X will not press charges against Jeremy Clarkson, his
lawyer says.

Answer: Oisin Tymon
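
A minimal sketch of that conversion, to make the idea concrete. It is not the paper's actual pipeline: it assumes the proper nouns have already been identified by some upstream step and are passed in as a list, and the function name `make_cloze_questions` and the `X` placeholder are made up for illustration.

```python
def make_cloze_questions(summary, entities, placeholder="X"):
    """Turn one summary sentence into (question, answer) pairs.

    `entities` is assumed to be a list of proper nouns that some
    upstream step (hypothetical here) has already found in the summary.
    """
    pairs = []
    for entity in entities:
        if entity in summary:
            # Blank out the entity; the removed text becomes the answer.
            question = summary.replace(entity, placeholder)
            pairs.append((question, entity))
    return pairs


summary = ("Producer Oisin Tymon will not press charges against "
           "Jeremy Clarkson, his lawyer says.")
entities = ["Oisin Tymon", "Jeremy Clarkson"]

for question, answer in make_cloze_questions(summary, entities):
    print("Question:", question)
    print("Answer:", answer)
```

Each summary yields one question per proper noun it contains, which is how a corpus of articles with summaries can turn into a large question set without anyone writing questions by hand.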

They are using the Daily Mail corpus to develop their algorithm, and that's
smart. They aren't relying on it as an important source of information. Maybe
all you guys with the dismissive comments have a better idea?

[1] [http://arxiv.org/pdf/1506.03340.pdf](http://arxiv.org/pdf/1506.03340.pdf)

EDIT: Thanks Otik, reworded the opening sentence

~~~
Otik
For a moment I thought you were calling the Mail "a great paper", but actually
the simple language used by it is probably quite good for this purpose.

Of course, it does mean that if you ask DeepMind about the cause of anything
negative it will probably tell you "immigrants".

~~~
collyw
or "benefits scroungers"

------
johntaitorg
Another use for the Daily Mail:
[https://www.youtube.com/watch?v=xPlEIryW8zA](https://www.youtube.com/watch?v=xPlEIryW8zA)

------
Animats
The Daily Mail? Not the Times?

~~~
andyjohnson0
Apart from any other problems it may have as a knowledge source, the (London)
Times is paywalled.

------
peteretep
If you thought institutionalised racism, sexism and chauvinism were bad now,
the singularity is going to _suck_.

------
SixSigma
This is how Judge Death starts.

The crime is life, the sentence death.

------
KaiserPro
I know why they did it: because it's the lowest common denominator for celeb
gossip.

However, it's shit for science, overtly paedophilic, and the only western news
site that seems to have a special section devoted to curating and promoting
ISIS propaganda.

Why is it bad? Because if you are looking for facts, the Daily Mail is a bad
source.

If you are looking for natural language, it's a good source; however, it's full
of nuanced racism, sexism, classism & basically everything else that's wrong
with Britain.

It'll be good at describing house prices though.

~~~
mattnewport
Overtly paedophilic? It's been a while since I lived in the UK and I never
read the Mail when I did but this particular accusation is new to me. I've
heard most of the others levelled at it before but what's the story behind
that one? If anything I'd have associated them with Brass Eye style
"paedophiles under the bed" paranoia (though in light of revelations of the
last few years maybe that wasn't so unjustified).

~~~
afandian
EDIT: this was the Daily Star not the Daily Mail.

This _is_ the Daily Mail though:
[https://www.youtube.com/watch?v=NKzgDBumCr0](https://www.youtube.com/watch?v=NKzgDBumCr0)

That was one of the great ironies of the Brass Eye special. On one page of
the Daily Star a criticism of the episode. On the other page a sexualised
picture of a child.

[http://www.anorak.co.uk/303258/news/leveson-inquiry-charlott...](http://www.anorak.co.uk/303258/news/leveson-inquiry-charlotte-churchs-15-year-old-breasts-and-chris-jefferies-hair.html/)

There's tonnes of stuff like this. They're quite open about it.

~~~
longwave
That particular incident was in the Daily Star rather than the Mail.

~~~
afandian
Thanks, my mistake.

------
latenightcoding
This is great. Are they modelling a neural network to detect bullshit?

------
nbevans
Somewhat worrying that they are feeding DeepMind a diet of DailyMail articles!

Maybe this is why Skynet turned rogue. Reading daily trash about celebrities
and body-dysmorphia-inducing trash is enough to make anyone go mad.

------
jacknews
Haha, absolutely classic! What will it end up comprehending? David Beckham's
love life? How brown skinned foreigners are taking all the jobs? Etc.

~~~
Xophmeister
Hey DeepMind, what causes cancer?

> EVERYTHING!

~~~
dghf
> EXCEPT FOR THE STUFF THAT CURES IT!

"The Daily Mail Oncological Ontology Project: a blog following the daily
mail’s ongoing mission to divide all the inanimate objects in the world into
those that cause or cure cancer."

[https://thedailymailoncologicalontologyproject.wordpress.com...](https://thedailymailoncologicalontologyproject.wordpress.com/)

[http://dailymailoncology.tumblr.com/](http://dailymailoncology.tumblr.com/)

------
Fede_V
Great, so we will have a racist and reactionary AI that thinks celebrity
gossip and immigrant bashing are the most important things in the world.

For the non-AIs reading this thread, I highly recommend
[https://addons.mozilla.org/en-US/firefox/addon/kitten-block/](https://addons.mozilla.org/en-US/firefox/addon/kitten-block/),
a plugin that replaces every Daily Mail link with a random picture from
kittens & tea, in case you mistakenly click on a Daily Mail link.

Snark aside, the paper is really cool though :)

~~~
vixen99
You read trash but I read light amusing gossip. The great thing about dissing
the Mail is that you don't have to present any argument; you just toss in a
few words and phrases like 'racist', 'trash' and 'immigrant-bashing' (defined
as 'the slightest hint of criticism at all regarding immigration levels'; add
an argued case study and you really are in the gutter), add a dash of stock
kitten-smear and the whackjob is done. You're on the high ground and the
pathetic masses are on the low.

~~~
Fede_V
If you'd like, I can dig up several instances of the Daily Mail being terrible
(specifically, their reporting about 'soandso causes cancer', their use of
single outrageous instances to infer non-existent trends, the horrendous way
they treat their employees, their casual sexism, etc.) but that seems out
of place in a thread which is mostly concerned with machine learning and
NLP.

I fully stand by the statement that the Daily Mail is a reactionary piece of
trash that appeals to the lowest and basest sentiments of its readers though.

