

Did a Human or a Computer Write This? - tokenadult
http://www.nytimes.com/interactive/2015/03/08/opinion/sunday/algorithm-human-quiz.html

======
sanqui
Previously discussed (a month ago):
[https://news.ycombinator.com/item?id=9166967](https://news.ycombinator.com/item?id=9166967)

------
mthq
Although impressive, this quiz's content seems cherry-picked for situations
where bots can do well.

On the one hand we have highly informational content such as the pieces about
the earthquake, the earnings report and the sports game. Reporting on this kind
of information can be done in a context-free way. Sentences and paragraphs can
be self-contained and need not reference each other or greater contexts. I
only missed the sports game, in part because I'm not an American and was not
familiar with the game involved. In hindsight the earnings report stands out
from the others, since much broader connections are made than just reporting
the raw information itself.

On the other hand we have poetry, which is often so hard to parse that
nonsense and genius are hard to tell apart, especially for short excerpts like
these. I missed the poetry app one: it read like nonsense to me, but as a
non-native speaker, many of these Shakespearean poems do at first sight. I
don't know exactly why, but "True Love" and "Absurdistan" immediately struck
me as fake and real respectively. For example, I thought it was weird that
her nerves were strained as _two_ tight strings, and that someone would make
her drink hot wine. Also, the "True Love" excerpt seems very factual and
desperate, whereas "Absurdistan" paints more of an atmosphere.

~~~
graylights
It could report on small events that are normally not covered, too. Further,
it could customize articles based on readers' interests. Have a favorite
player? He is now highlighted in your articles.

It could also do vanity articles. Go back to your college days and write about
your games, making you the highlight.

Worse, it could be influenced to do personalized advertisements in the middle
of the article.

------
_yosefk
Define "write." Is the output of printf("Should I compare %s to a summer's
day", name) written by a human or a computer?

If it's defined to be "written by a human", then I bet that most (all?) of
their examples are written by a human in a similar way.

If it's defined to be "written by a computer", then guessing if a human or a
computer wrote any of their examples is like guessing the outcome of a coin
toss.
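
To make the format-string case concrete, here is a minimal sketch in Python
rather than C. The template text is the one from the printf call above; the
function name `fill` and the sample names are made up for illustration. Every
word except the slot comes from a human, and the program only substitutes.

```python
# The sentence structure is written by a human ahead of time.
# TEMPLATE, fill and the sample names are hypothetical, for illustration only.
TEMPLATE = "Should I compare {name} to a summer's day"


def fill(name: str) -> str:
    """Return the human-written template with its single slot filled in."""
    return TEMPLATE.format(name=name)


if __name__ == "__main__":
    # The program contributes nothing beyond choosing what goes in the slot.
    for name in ["thee", "Hacker News"]:
        print(fill(name))
```

Looking only at one printed line, there is no way to tell whether a template
like this was involved.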

~~~
asQuirreL
Perhaps some notion of entropy would be useful here? Normal
information-theoretic definitions of entropy usually rely on "bits of
entropy", i.e. if you are given information in the form "Should I compare %s
to a summer's day", then how many new bits of information are we actually
getting? In this case, it is unbounded, because what we are calling `name` can
be arbitrarily long.

But what about if we define entropy in terms of something like parts of
speech? Then, the computer can only change one part of speech (the noun phrase
to which we compare a summer's day) whilst maintaining the context of the
sentence, so in terms of that metric, entropy is low.

With this parts-of-speech entropy metric in hand, maybe we could declare the
output of a computer to be "written by a computer" if the entropy in its
sentences approaches that of a human writing about the same topic.

_Disclaimer:_ I spent about as much time thinking about this as you did just
reading it, so I don't expect this is in any way a fully formed or
implementable idea, just food for thought.
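
One possible reading of the parts-of-speech idea, as a rough sketch and not a
worked-out metric: tag each word with a part of speech, then take the Shannon
entropy of the distribution of tag sequences across a batch of sentences. The
tiny lexicon, the tag names, the sample sentences and all function names below
are assumptions made up for illustration; a real experiment would need a
proper tagger and would have to collapse multi-word noun phrases first.

```python
import math
from collections import Counter

# Toy, hand-made part-of-speech lexicon. This is an assumption for
# illustration only; unknown words get the placeholder tag "X".
TOY_LEXICON = {
    "should": "AUX", "i": "PRON", "compare": "VERB", "to": "ADP",
    "a": "DET", "summer's": "NOUN", "day": "NOUN",
    "alice": "NOUN", "bob": "NOUN", "carol": "NOUN",
    "thou": "PRON", "art": "VERB", "more": "ADV", "lovely": "ADJ",
    "rough": "ADJ", "winds": "NOUN", "do": "AUX", "shake": "VERB",
    "the": "DET", "buds": "NOUN",
}


def pos_sequence(sentence):
    """Map a sentence to its tuple of part-of-speech tags."""
    return tuple(TOY_LEXICON.get(w, "X") for w in sentence.lower().split())


def shannon_entropy(items):
    """Shannon entropy, in bits, of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def pos_entropy(sentences):
    """Entropy of the part-of-speech sequences seen across the sentences."""
    return shannon_entropy(pos_sequence(s) for s in sentences)


if __name__ == "__main__":
    templated = [
        "Should I compare Alice to a summer's day",
        "Should I compare Bob to a summer's day",
        "Should I compare Carol to a summer's day",
    ]
    varied = [
        "Should I compare Alice to a summer's day",
        "Thou art more lovely",
        "Rough winds do shake the buds",
    ]
    print("templated:", pos_entropy(templated))
    print("varied:   ", pos_entropy(varied))
```

With single-word names, every templated sentence has the same tag sequence, so
this measure comes out as zero, while sentences with different structures
score higher. It still needs a batch of outputs from the same source, which a
quiz showing one excerpt at a time does not give you.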

~~~
_yosefk
Then you need to know how the sentences were constructed. In their
Turing-test-lookalike webpage you get to guess based on the final sentence,
which is kinda pointless without the conversation part of the test, which
precludes canned responses/format strings.

