
The strange truth about fiction - friggeri
https://www.facebook.com/notes/facebook-data-science/the-strange-truth-about-fiction/10152215561458859
======
jawns
Data-viz rant ...

There's a word cloud about halfway down the post that shows words frequently
used in comments on "rumor" posts. Words that are darkly shaded are associated
with true rumors, and words that are lightly shaded are associated with false
rumors.

I've never been into word clouds as a data visualization tool, but if you do
use a word cloud, and you're using color to communicate something relevant
about the data, PLEASE do not use a monochrome gray gradient, as this post has
done. It's really difficult to tell whether "government" is slightly darker or
lighter than "certainly," for instance.

A blue-red gradient would have worked a lot nicer in this case, in my humble
opinion.
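
To make that concrete, here is a minimal sketch of a diverging blue-red mapping. The `score` range (-1 for words associated with false rumors, +1 for words associated with true rumors), the function names, and the anchor colors are all my own assumptions, not anything taken from the post.

```python
# Hypothetical anchor colors: red for "false", neutral gray, blue for "true".
RED = (178, 24, 43)
GRAY = (136, 136, 136)
BLUE = (33, 102, 172)

def lerp(a, b, t):
    """Linearly interpolate between two RGB tuples, t in [0, 1]."""
    return tuple(round(x + (y - x) * t) for x, y in zip(a, b))

def diverging_color(score):
    """Map a score in [-1, 1] to a hex color on a red-gray-blue scale."""
    s = max(-1.0, min(1.0, score))  # clamp out-of-range scores
    rgb = lerp(GRAY, BLUE, s) if s >= 0 else lerp(GRAY, RED, -s)
    return "#{:02x}{:02x}{:02x}".format(*rgb)
```

The point of the gray midpoint is that a neutral word stays visible on a white background, while the two hues make the direction of the association readable at a glance.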

~~~
deaneckles
I don't like word clouds either (and I'm one of the authors!). You'll notice
there is no cloud in the actual paper, just in the blog post.

Do you have any references about human perception of blue-red versus
monochrome gradients? Would be interested to have good recommendations for
such cases.

~~~
pygy_
Hijacking the thread, since you're around:

Regarding the last graph, the observed data looks like it matches a log-normal
distribution, not a power law... Is this correct? Not sure how to interpret
it, though. How did you compute your estimation?

~~~
deaneckles
I think log-normal distributions usually fit this kind of data as well as
power laws do. Anyway, such a plot is not the best way to choose between those
two models (see
[http://arxiv.org/abs/0706.1062](http://arxiv.org/abs/0706.1062)). We don't
actually do any such test in the paper though (or say that it is a power law
or log-normal). You might want to check out the paper for more details.
[https://www.facebook.com/publications/244240069095667/](https://www.facebook.com/publications/244240069095667/)
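
For anyone who wants to try the comparison numerically rather than by eye, here is a rough sketch that fits both models by maximum likelihood and compares the resulting log-likelihoods. It is not what the paper does. It assumes a continuous power law whose lower cutoff is simply the smallest observation, an untruncated log-normal, and no significance test on the difference; the linked arXiv paper covers how to choose the cutoff and test the difference properly. The function names are made up for illustration.

```python
import math

def power_law_loglik(data):
    """Maximized log-likelihood of a continuous power law on x >= xmin."""
    xmin = min(data)  # assumption: cutoff is the smallest observation
    n = len(data)
    log_ratio_sum = sum(math.log(x / xmin) for x in data)
    alpha = 1.0 + n / log_ratio_sum  # maximum-likelihood exponent
    return n * math.log((alpha - 1.0) / xmin) - alpha * log_ratio_sum

def lognormal_loglik(data):
    """Maximized log-likelihood of an (untruncated) log-normal."""
    logs = [math.log(x) for x in data]
    n = len(logs)
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n
    # At the maximum-likelihood estimates the quadratic term equals n / 2.
    return -sum(logs) - 0.5 * n * math.log(2.0 * math.pi * var) - 0.5 * n

def better_fit(data):
    """Name the model with the higher maximized log-likelihood."""
    diff = lognormal_loglik(data) - power_law_loglik(data)
    return "log-normal" if diff > 0 else "power law"
```

A positive difference only says the log-normal has the higher likelihood on this sample; whether that difference is meaningful is a separate question.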

------
peterwwillis
It was hard for me to find again, so I'll leave this here if anyone else is
interested. The psychology behind a lot of how rumors form and perpetuate falls
under herd behavior, availability cascades and information cascades. Pretty
much all of it is based on a group of people deciding that something they all
agree on is important, such as a significant threat, or a particularly good
outcome. Everything from public policy to stock market prices to internet
memes and media scandals seems to be decided this way.

[http://en.wikipedia.org/wiki/Herd_behavior](http://en.wikipedia.org/wiki/Herd_behavior)
[http://en.wikipedia.org/wiki/Availability_cascade](http://en.wikipedia.org/wiki/Availability_cascade)
[http://en.wikipedia.org/wiki/Information_cascade](http://en.wikipedia.org/wiki/Information_cascade)

------
jmzbond
This is really interesting, and I remember reading something similar a while
ago and having the same thought as I had today. I wonder how this would look
if they segmented the data by education levels.

When I look through my social media news feed (or read BuzzFeed posts on the
hilarious dumb things that have been said on the Internet), I see a very big
difference in what gets posted by those who are college-educated vs. those
who are not. Certainly this is not a guarantee.
I'm sure I've shared stuff that has been fake, and even reputable news
agencies make mistakes. But to me the data would be a[nother] compelling
argument for better and more access to education--to stop the damn rumors! (My
real goal is to put Snopes out of business.)

------
regoldste
Very interesting indeed. I think it raises many more questions than it answers
about human behavior and the infectiousness of rumors in social media. I'd be
particularly interested to see an analysis of whether sharing a false rumor
has any effect on the reshare rate of a user's future posts. In other words:
is a user's influence or perceived reliability (as measured by the relative
rate of reshares of his/her future posts) diminished following the initial
share of a false rumor? Reduced reshare rates could be a positive reflection
of an increasingly skeptical and better-informed user community. Consistent
reshare rates would be...a less optimistic sign.

------
pessimizer
I wonder how closely this matches up to the "unsustainable growth" graphs from
yesterday:
[https://news.ycombinator.com/item?id=7662841](https://news.ycombinator.com/item?id=7662841)

edit: also, in light of "A Batesian Mimicry Explanation of Business Cycles"
([https://news.ycombinator.com/item?id=7634628](https://news.ycombinator.com/item?id=7634628)),
could this be a good basis for a bubble investment model?

~~~
twic
So are we still waiting for a Snopes for startups?

------
eloff
Seems like it would be easy to follow the identified Snopes links and parse
the true or false verdict they give in order to put a verified or debunked
link with every share (linking to Snopes). Now that would be useful. All this
data mining is interesting, but just tells us what we already know (people
share lots of rumors that are frequently false). Why not do something about it?

~~~
pessimizer
>Why not do something about it?

Because they may not care, and they may not have recorded this information for
the purpose of social engineering?

There's value in studying the dynamics of processes without any intention to
manipulate them.

------
bitJericho
I wonder who creates these rumors? Why is it people feel the need to lie
about, well, anything under the sun?

~~~
anigbrowl
Fools, trolls, manipulators.

A lot of rumors are the result of poor understanding or misinterpretation. For
example, there was a big rumor a year or two back that the US government was
buying hundreds of millions of rounds of ammunition, sparking fears of
everything from a conspiracy to drive up prices to an imminent imposition of
martial law. The basis of the rumor was a government solicitation to bid on
ammunition _pricing_, so that the government could lock in its ammunition
purchases for the next few years at a fixed and predictable price, rather than
being subject to spot price fluctuations which would make budgeting more
difficult and possibly result in higher costs. However, to understand that
the government was looking for (essentially) a call option required a
basic knowledge of finance and a fairly high level of reading ability to deal
with the 'officialese'. It's not surprising that many people misinterpreted a
request for a pricing guarantee as a solicitation for actual supply. There was
also a misunderstanding of how much ammunition the government actually uses,
with people who cheerfully fire off hundreds of rounds during their own
practice sessions overlooking the fact that federal employees who carry
firearms also have to take part in training and practice sessions where they
expend large quantities of ammunition. More here:
[http://www.snopes.com/politics/guns/ssabullets.asp](http://www.snopes.com/politics/guns/ssabullets.asp)

Trolls shouldn't need much explanation; a tradition of spreading outlandish or
silly rumors is as old as the hills, and for trolls every day is April 1st.

Then you have people with a vested interest in spreading rumors of one sort or
another, often for political ends. Negative rumors about 'Obamacare' have been
widespread in recent years for obvious reasons; likewise, people on the left
often expressed their animus towards the previous Republican administration by
making up negative stories reflecting their view of how that administration
would behave, or what would motivate them in the case of some otherwise
random-seeming fact.

EDIT: here's an outstanding example right here on HN, today:
[https://news.ycombinator.com/item?id=7667068](https://news.ycombinator.com/item?id=7667068)

There are a great many people who are not especially concerned with truth, but
with shaping people's behavior in order to bring about a certain result -
whether that is getting people to buy a product or a book, or to change the
price of a financial asset, or to create a more favorable political
environment etc.

------
RankingMember
Despite my animus (well-placed or otherwise) towards Facebook, these blog
posts are damned interesting.

