
US Military Scientists Solve a Fundamental Problem of Viral Marketing - denzil_correa
http://www.technologyreview.com/view/519361/us-military-scientists-solve-the-fundamental-problem-of-viral-marketing/
======
ignostic
> _" For example, with the FourSquare online social network, under majority
> threshold (50% of incoming neighbors previously adopted), a viral marketeer
> could expect a 297-fold return on investment"_

This is illustrative of what bothers me about the paper: they're already
trying to market the algorithm, and they're not being honest about its
limitations. Now I'm uncertain of whether to trust them on the network
analysis part at all.

Take the cited example: What is the return? It will vary wildly and take into
account factors like how related the message is to the product/service,
previous exposure, quality and likelihood to share, and website
quality/conversion rate. Second, what is the investment? Marketing costs can
range from 0 to many tens or even hundreds of thousands.

To those who say the "message is all that matters": relevant, quality content
fails to go viral all the time. It's easy to think it doesn't happen if you
don't work in marketing, since you'll never see it. Identifying key people
effectively really can have value in a marketing campaign. That said,
connected people (also most people) will likely ignore you if your message or
content is shit (uninteresting, unsurprising, unclear, etc.)

~~~
technotony
What's interesting, and not covered by the article, is why this is interesting
to the military. I imagine that by identifying this group you could also stop
virality of ideas, eg radical Islamic terrorists. Presumably the US has lots
of data connecting phone records of these people, and they could use this
network analysis to figure out who to target to reduce future flow of new
ideas... could be powerful!

~~~
dak1
It could also be used offensively to start a rebellion or change political
outcomes.

~~~
nanomage
I smell political econometrics a brewing.

~~~
nwzpaperman
All edges derive their value from the other participants not having them also.
Handheld programmable calculators once made Thomas Peterffy, Steve Fossett,
Joe Ritchie and others very rich men. When people saw their success with
Black-Scholes in the options pits and followed them into the pits with
calculators, the pits died and all of the trading volume moved off the floors.

When everyone got computers that were faster than calculators, hedge funds and
bank trading desks bought mainframes and colocated them next to or with the
exchanges' servers and all of the trading volume moved to high-frequency
trading-friendly dark pools and the exchanges...

In the context of political influence, what does everyone think all of the
campaign money goes toward these days? A lot of angry people start revolutions
and there is nothing technology can do to placate someone(s) who is being
abused for any reason and certainly not at scale!

All of these marketing firms are chasing the mass market consumers/voters that
have had diminishing spending power for over a decade now. Here's a tip: build
something really expensive and desirable for someone really rich because they
have lots of spending power and few things to do with the money.

------
lifeisstillgood
This is a pretty clever hack. The problem is if the tipping point theory is
correct (#) (that you will perform behaviour X if a sufficient number of your
immediate circle also are performing behaviour X) you still need to find a
subset of the network to start the ball rolling.

The solution is simple - say the tipping point is 20 friends must do X for you
to do it. So walk the whole network and find everyone with more than 20
friends. Remove those with _the most friends_ (say, those above the 99th
percentile).

Now walk the network again, and find those with more than 20 friends and again
99th percentile goes. Eventually you remove the 20th friend from everyone and
those left have 19 or fewer friends.

Now tell this whole seed group to do behaviour X.

Now put back the very last set you removed. And they are guaranteed to have at
least 20 friends, and all those friends will be doing behaviour X. Now put
back the penultimate group, and because the most recent arrivals are now also
doing behaviour X ....
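
The procedure described above can be sketched in Python. This is only an illustration of the comment's description, not necessarily the paper's exact algorithm. The names `find_seed_set` and `spread`, the adjacency-dict graph format, and the choice to remove one person at a time (rather than a percentile batch, which keeps the "put them back in reverse order" argument simple) are all assumptions made for the sketch.

```python
def find_seed_set(graph, k):
    """Peel off well-connected people until nobody left has k or more
    friends among those remaining. `graph` maps each person to the set of
    their friends (undirected, no self-loops). Returns (seed, removal_order)."""
    remaining = {v: set(friends) for v, friends in graph.items()}
    removed = []
    while True:
        eligible = [v for v, friends in remaining.items() if len(friends) >= k]
        if not eligible:
            break
        # Assumption: remove the single most connected eligible person;
        # ties are broken by name so the result is deterministic.
        v = max(eligible, key=lambda u: (len(remaining[u]), str(u)))
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
        removed.append(v)
    return set(remaining), removed


def spread(graph, seed, k):
    """Tipping model: a person adopts once at least k friends have adopted."""
    active = set(seed)
    changed = True
    while changed:
        changed = False
        for v, friends in graph.items():
            if v not in active and len(friends & active) >= k:
                active.add(v)
                changed = True
    return active


# Toy example: five people who all know each other, tipping point of 2.
graph = {i: {j for j in range(5) if j != i} for i in range(5)}
seed, removed = find_seed_set(graph, 2)
print(sorted(seed))                           # [0, 1]
print(spread(graph, seed, 2) == set(graph))   # True
```

Each removed person had at least `k` friends among the people still present at the moment of removal, and all of those are active by the time that person is put back, which is why the seed ends up tipping the whole network.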

Problems:

1\. For any social network at a given point there is just one seed group.
Right now (or 6 days ago) this group is being identified. And sold.

2\. the tipping point theory as a whole is a bit dodgy (see below).

3\. Feasibility - they only mentioned orders of magnitude smaller subsets.
Let's be kind and suggest that it's 3 orders of magnitude. For LinkedIn that
leaves a seed group of 100,000's. Not the size you can just invite into a
focus group.

Overall, really a cool hack, and I swear it's worth ponying up on and selling
to excited digital agencies.

(#) Good article on Debunking of tipping point theory -
[http://www.fastcompany.com/641124/tipping-point-toast](http://www.fastcompany.com/641124/tipping-point-toast)

Edit: Wanted to rewrite my below comment that got a bit confused

~~~
seanlinehan
In terms of #3, it's even worse than 100,000's - for LinkedIn it would likely
be millions:

"In general, online social networks had the smallest seed sets - 13 networks
of this type had an _average seed set size less than 2% of the population_
(these networks were all in Category A). We also noticed, that for most
networks, there was a linear relation between threshold value and seed size"

Though, for a company with direct access to their users via the UI, it would
be a fairly trivial task to reach a significant subset. LinkedIn could
reasonably push UI updates only to the target population. Given they have full
access in the first place, I'm uncertain as to _why_ they would want to engage
in this form of marketing, though.
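
To make the size difference concrete, here is a rough back-of-the-envelope sketch. The population figure is a made-up round number chosen purely for illustration (it is not from the thread or the paper), and `seed_set_size` is a hypothetical helper name.

```python
def seed_set_size(population, fraction):
    """Number of people in a seed set covering `fraction` of the network."""
    return round(population * fraction)


# Hypothetical network size, for illustration only.
population = 200_000_000

# "3 orders of magnitude" smaller, as guessed in the parent comment.
print(seed_set_size(population, 1 / 1000))  # 200000

# 2%, the bound quoted from the paper for online social networks.
print(seed_set_size(population, 0.02))      # 4000000
```

Under that assumed population, the 2% figure gives a seed set 20 times larger than the "3 orders of magnitude" guess.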

~~~
BWStearns
>> I'm uncertain as to why they would want to engage in this form of
marketing, though

They might not want to on their own, however if some advertiser really had the
desire to try to hit the entire seed population then LinkedIn could sell that
target population at a higher CPM because of the relatively high projected
value of those individuals.

------
pitchups
The "tipping point" theory popularized by Malcolm Gladwell's book was proven
to be too simplistic and flawed by Duncan Watts, a network researcher
currently at Yahoo. His book "Everything is Obvious - Once You Know the
Answer" debunks this popular misconception. He describes many studies,
simulations and actual experiments that show that how fast something spreads
virally has less to do with where it starts (influential groups/tipping
point), than how susceptible a person is to being influenced (ie how
infectious the idea is to begin with). The book is great btw.

~~~
jmackinn
Thank you for bringing this up. It was the first thing I thought about
when reading the article. I also recommend the book.

------
wahsd
Some may not realize the context of this research. Consider the funding and
authors. This solution is meant for and will be, what can effectively be
called, weaponized. There are many efforts that will be quite interested in
this research and will build it into their solutions that are sold to all the
players we all so well know at this point, and others you have never heard of
and never will.

Consider a future that is beyond the present in which social network analysis
is used for identification, targeting, and disruption of social networks of
all kinds ...something like OWS if you will...; when the subject type of
research is implemented to understand how to prevent opposition by those whose
interest it is that you and those around you don't oppose, cannot organize,
and are disrupted faster by knowing exactly who the linchpin is that has to be
neutralized to disperse any organization.

~~~
antocv
I am inclined to believe that the lack of any meaningful protests and the
quite strange disappearance of OWS are in great part due to these new tools in
the hands of the people in power. Meaningful protests would be against the use
of taxpayer money as gifts to private industries and the selling out of one's
digital life and rights to foreign countries.

------
evolve2k
Assuming the use of the technique spreads, will it have a self-defeating
effect, whereby everyone 'spamming' these core agents leads people to change
their behaviour in some way?

~~~
mathattack
That's exactly what will happen. The weakness of much social science is people
make judgments based on no external influence. Once the influence happens, the
outcomes change. This is also why market inefficiencies disappear after
they've been published. In more specific terms, you might accept 4 spams from
a friend, but not 20 Candy Crush invites.

~~~
vdaniuk
Well, if I would want to communicate a concept, FOSS, for example, I would
create lots of various packaging for that idea: stories, infographics, videos
etc. to prevent saturation.

------
sitkack
One of my fears about the meta data collection is that when coupled with
research like this it will enable monitoring and control of a small number of
important people (important in the sense of spreading information, effecting
change). In the same vein you take it a step further and detect those who self
police, they can be effectively ignored. I am sure the machine is trying to
efficiently figure out who the "do the right thing" boyscouts are so it can
remove them from the system.

------
sixQuarks
This doesn't seem useful to me. It doesn't matter if you "find" the optimal
people to send a message to. The important thing is the message, and whether
they will find it interesting enough to spread in the first place.

If it was so easy to get something viral, then a blanket message sent to a
large group of people would automatically result in viral marketing. ie -
spam. And we see how often spam becomes viral....

~~~
vidarh
It is useful because it means you can strategically identify a much smaller
set of users to focus on. If you get the seed group to trigger, you get
massive payoff. And so you can target initial messages to them, and analyse
who takes action and who doesn't and why, iterate, and retarget on the seed
group rather than wasting lots of time on the much larger full network.

~~~
sixQuarks
True. I guess I'm just a bit skeptical about whether this would really work.
And as others pointed out, if marketers start using this type of targeting,
then this "seed" audience will be over-saturated.

------
jqueryin
Here's a direct link to the PDF, which is far more valuable than the article
itself:

"A Scalable Heuristic for Viral Marketing Under the Tipping Model"

[http://arxiv.org/pdf/1309.2963v1.pdf](http://arxiv.org/pdf/1309.2963v1.pdf)

------
api
In other news: democracy is dead. It's been hacked.

(That's actually been true for a long time. The exploits are only getting
better now. It's a bit like SSL.)

------
jmngomes
Isn't producing content that people want to watch and share the "fundamental
problem of viral marketing"?

Obviously, being able to identify high value "seeds" is paramount, but it
appears to be more related to cost-reduction (not having to contact more seeds
than necessary).

Centrality measures, along with propagation simulation algorithms, already
helped identifying seeds...but without "the proper content", I doubt that good
seed classification can, alone, "solve the problem".

------
Sniffnoy
Everyone seems to be reacting to the "viral marketing" framing of it, when
basically this is just a graph theory problem.

Reading this, it appears that finding seed sets is an old problem. Normally
people focus on finding minimum-size seed sets, but here they're just focusing
on small ones. However, they don't appear to have actually proven any upper
bounds on the sizes of the seed sets found this way; they've just observed
that empirically it's small. Which is still useful.

------
dullroar
What a great way to stifle dissent - find the people most efficient at
spreading it and neutralize them via arrest, National Security Letters or
"other means." No wonder the military is funding this. They aren't interested
in marketing - they are interested in the control of the flow of information,
period.

------
dguido
Cool to see Paulo up on Hacker News. One of our employees wrote a book with
him and his wife recently, examining the proliferation of cyber espionage and
"cyber war" in the last decade or so. We think it's one of the only sober,
thorough and technically accurate textbooks on the topic:
[http://www.trailofbits.com/books/#cyberwar](http://www.trailofbits.com/books/#cyberwar)

------
namelezz
How to effectively spread propaganda?

~~~
sitkack
That is most likely the funder's goal for the original research.

~~~
gavinpc
And when you use "Flickr, FourSquare, Friendster, Last.FM, Digg (from Dec
2010), Yelp, YouTube and so on," you participate in the research.

------
scrrr
This seems to be a sort of iterative clustering of a graph to find highly
connected nodes.

I suppose a similar result is reached by sorting users in a sub-graph by the
time they spend online on that social network, since more time spent online
probably means more "friends".

~~~
marcosdumay
They remove the most connected nodes at each step. The one thing this
algorithm is not useful for is finding highly connected nodes.

------
neves
Now the US can better influence all the "Facebook and Twitter revolutions".
The NSA can analyse the social network graph and try to seed their news or
demonstrations. Or even better: they can turn off the influencing nodes that
are against American policies.

------
andrenotgiant
Given this algorithm. What does the Seed Group end up looking like? What do
they have in common?

It seems to me that in the real world, this group would be people with the
most friends, or people with very diverse friends, kind of a no-brainer.

~~~
lifeisstillgood
Not really - it's finding those who connect most networks - the routers if you
will.

From the article they remove those who have the _most_ connections first. Let's
say that if more than 10 of your LinkedIn links have photos, then you will
upload one too. So they find everyone who has >10 friends, order them and
remove the top 20% of most connected individuals. Then repeat. Stop when you
have a group of people none of whom has >10 (extant) friends.

This group of people can then be given a virus (behaviour/whatever). Now put
back the most recently removed group of people. It is guaranteed that each of
those new people will all have 10+ friends _all of whom exhibit this new
virus_.

It's pretty clever. Now I need to grab graph-tool and start playing!

Also:

> Lastly, we find that highly clustered local neighborhoods, together with
> dense network-wide community structures, suppress a trend's ability to
> spread under the tipping model.

That matters and frankly is the _future of the internet_. We shall most likely
see geo-physical mesh neighborhoods. Always on, mobile or not, connectivity
_to the people around you_. It will probably bring a resurgence of democracy
and community, is likely to solve enormous caching problems, and will utterly
destroy loads of business models. And yes! It's maths and science that prove it!

------
unclebucknasty
It seems implausible to turn this into a business. So, you have this seed
group per network and, what, charge some amount to reach this group on the
assumption that they will care enough to propagate the message? Seems that
quality and relevance of content are still the overriding factors.

Also, wouldn't many of the same people be members of the seed group, leading
to message fatigue amongst their connections?

Seems like something out of which you may raise the hopes of many a marketing
department, but which ultimately proves impracticable.

------
snorkel
The way social marketing works in the real world is a marketing team assigns a
budget of X to give to celebrity Y to mention product Z in their Twitter feed.
The only measures of interest are how many followers does that celebrity have,
how much does each mention cost, and how many leads did it generate. Applying
this network analysis could help identify hidden celebrities that would charge
less, and how soon before someone launches a bidding service based on that?

------
vdaniuk
The fundamental problem of viral marketing is developing an efficient,
repeatable process of content production that is very similar to venture
companies: few pieces of content will have incredible roi, some will have ok
roi and most will have negative return on investment. While this research will
help if applied correctly it doesn't solve the fundamental problem. BTW,
buzzfeed has some success using this approach.

------
mbesto
This suffers from one fundamental flaw - if it is indeed "guaranteeable" and
they sell everyone on the algorithm, it no longer becomes effective. In other
words their test cases, as far as I can see, do not take into consideration
the behavior of other people using the same algorithm. It reminds me of
algorithms in economics.

------
noelherrick
> It is based on the idea that an individual will eventually receive a message
> if a certain proportion of his or her friends already have that message.
> This proportion is a critical threshold and is crucial in their approach.

The question I had was how to find the tipping point. Is this done through
tests on a smaller group?

------
espadagroup
This is interesting, though the biggest problem is that their seed groups are
in the size of 1%-3% of the network's population. If you can affect that much
of a large enough network to constitute 'viral' in the first place, then you
probably don't need an algorithm like this anyway.

------
6d0debc071
I wonder what the worth of a node is, i.e. how credible they are, and how
that's altered when you start using them as an attack vector. It seems
possible to me that the people with small numbers of friends in multiple
graphs would just find themselves put on ignore lists.

------
moheeb
Couldn't anyone who's ever designed a router also sell their algorithm to
marketers?

------
guscost
First the "critical threshold" is described as a ratio, then as an integer. Am
I misreading it somehow?

~~~
lutusp
In one reference, a critical threshold is described in generic terms, in
another it's described as a certain number of acquaintances that put you over
the threshold. They're not incompatible -- one represents an example of the
other with a specific number attached.

It would be like my saying, "25% is sufficient", and in a later sentence
saying, "eight out of thirty-two is enough". They both make the same
statement.
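
The equivalence can be shown with a small sketch that turns a fractional threshold into a head count for one person. The helper names `required_adopters` and `tips` are hypothetical, and rounding up to the next whole friend is an assumption about how a fractional threshold would be applied.

```python
import math
from fractions import Fraction


def required_adopters(num_friends, threshold):
    """Smallest whole number of adopting friends that meets a fractional
    threshold. `threshold` is a Fraction to avoid float rounding errors."""
    return math.ceil(num_friends * threshold)


def tips(num_friends, num_adopters, threshold):
    """True if this person is over the threshold (someone with no friends
    is assumed never to tip)."""
    return num_friends > 0 and num_adopters >= required_adopters(num_friends, threshold)


quarter = Fraction(1, 4)
print(required_adopters(32, quarter))  # 8
print(tips(32, 8, quarter))            # True
print(tips(32, 7, quarter))            # False
```

So "25%" is the generic rule, and "eight out of thirty-two" is what that rule works out to for someone with 32 friends.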

------
hcarvalhoalves
Next step: people getting cold called for "post this on your Facebook and earn
X" schemes.

