
Network Theory Breakthrough Reveals The Origin Of Outbreaks - Anon84
http://www.technologyreview.com/view/428906/network-theory-breakthrough-reveals-the-origin-of/
======
shasta
"By monitoring only 20% of the communities, we achieve an average error of
less than 4 hops between the estimated source and the first infected
community", they say.

Less than 4 hops?? I can get within 6 hops by claiming that Kevin Bacon was
the source of the outbreak every time.

~~~
pathdependent
The hops were on communities connected by water supplies, not social graphs.
In the graph of communities connected by water supplies, 4 hops is relatively
short. (See Figure 3b in <http://arxiv.org/pdf/1208.2534v1.pdf>)

Unfortunately, I agree with you that it is difficult to judge the efficacy of
"less than 4 hops" without providing some network metrics. By _eye-balling_
figure 3b, it _looks like_ the graph diameter is in the 30s, and the average
degree of a community _appears_ to be about three. In an outbreak, getting
within 4 hops with (initially) limited surveillance could be important.
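
To make those two metrics concrete, here is a minimal sketch of how graph diameter and average degree could be computed with plain breadth-first search. The `chain_graph` helper and the 31-community chain are hypothetical stand-ins, not the water-supply network from the paper; the chain is picked only because its diameter lands at 30, in the range eyeballed from the figure.

```python
from collections import deque


def bfs_distances(adj, start):
    """Hop counts from `start` to every node reachable from it."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def diameter(adj):
    """Longest shortest path in hops (assumes the graph is connected)."""
    return max(max(bfs_distances(adj, n).values()) for n in adj)


def average_degree(adj):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def chain_graph(n):
    """Hypothetical toy network: n communities connected in a line."""
    adj = {i: set() for i in range(n)}
    for i in range(n - 1):
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    return adj


toy = chain_graph(31)
print("diameter:", diameter(toy))
print("average degree:", round(average_degree(toy), 2))
print("4 hops as a fraction of the diameter:", round(4 / diameter(toy), 2))
```

On a real network the same two numbers would give "less than 4 hops" some scale: an error of 4 hops means something quite different at diameter 30 than at diameter 6.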

------
pedro-pinto
I'm surprised to see that our research made it to HN. If you'd like more
details, have a look at the derivations [1] and some supplementary case
studies [2] that were not published.

Groxx is right: what we're doing is quite simple conceptually, but we did
spend a lot of time figuring out what the right system model was (it had to be
both widely applicable and tractable). For example, some subtle variations of
the current model are very hard to solve.

--

[1] <http://www.pedropinto.org.s3.amazonaws.com/publications/locating_source_diffusion_networks_supplem.pdf>

[2] <http://www.pedropinto.org.s3.amazonaws.com/publications/locating_source_diffusion_networks_cases.pdf>

~~~
Groxx
Awesome, thanks! I'll give them a more thorough read-through some time; it's
definitely interesting stuff. I figured my mental model might be
oversimplified :) Out of curiosity, are the 'subtle variations' included in the
papers, or what might they be? I'm curious where my gaps are.

~~~
pedro-pinto
Exponential (memoryless) propagation delays, for example. In this case, our
Gaussian (i.e., suboptimal) estimator performs well due to the central limit
theorem (CLT), but the optimal estimator is hard to compute.
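
A small simulation can illustrate the CLT point. This is only a sketch under the assumption that the delay to an observer is the sum of independent exponential delays along a multi-hop path; it does not implement the paper's estimator. The function names and the rate of 1.0 are made up for illustration. In theory the skewness of such a sum is 2/sqrt(hops), so it should shrink toward the Gaussian value of 0 as the path gets longer.

```python
import random


def path_delay_samples(hops, rate, samples, rng):
    """Total delay over `hops` edges, each with an independent Exp(rate) delay."""
    return [sum(rng.expovariate(rate) for _ in range(hops))
            for _ in range(samples)]


def skewness(xs):
    """Sample skewness; 0 for a symmetric (e.g. Gaussian) distribution."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return sum((x - mu) ** 3 for x in xs) / n / var ** 1.5


rng = random.Random(0)  # fixed seed so the run is repeatable
for hops in (1, 5, 30):
    xs = path_delay_samples(hops, rate=1.0, samples=20000, rng=rng)
    mean = sum(xs) / len(xs)
    print(f"hops={hops:2d}  mean={mean:6.2f}  skewness={skewness(xs):5.2f}")
```

A single exponential hop is strongly skewed, while a 30-hop total is already fairly symmetric, which is one way to see why a Gaussian approximation can do well even when each individual delay is far from Gaussian.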

------
Groxx
This seems very simple... am I missing something? A read through of the paper
(without analyzing the math) makes it sound like:

    
    
      * pick random nodes
      * record the time when they learn of the 'infection'
        and record the source
      * walk the tree/graph breadth-first from the 'informed'
        nodes to a distance determined by their timing
      * estimate source based on the greatest overlap
    

I can definitely appreciate the difficulty in proving that such a thing works,
and to what degree of accuracy. Breakthrough, sure, and possibly a big one
that opens doors to others - I really don't know the difficulty here. It
doesn't _look_ too hard, but I could be way, way off.

Algorithmically though, this seems like a relatively trivial way of estimating
the source of something. It's a graph-based form of estimating the source of
an earthquake based on the timing of a few recording stations.
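
For what it's worth, the earthquake-style idea can be sketched in a few lines. This is a simplified least-squares stand-in, not the estimator from the paper and not exactly the "greatest overlap" step listed above: since the outbreak's start time is unknown, it compares differences in arrival times between observers with differences in hop distance from each candidate source. The toy tree, the observer placement, and the `mean_delay` of one time unit per hop are all assumptions made up for the example.

```python
from collections import deque


def bfs_distances(adj, start):
    """Hop counts from `start` to every node reachable from it."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def estimate_source(adj, arrival_times, mean_delay=1.0):
    """Pick the node whose hop distances best explain the observed timing.

    arrival_times maps observer node -> time it first saw the infection.
    Only time differences relative to a reference observer are used, so
    the (unknown) start time of the outbreak cancels out.
    """
    observers = sorted(arrival_times)
    ref = observers[0]
    best, best_err = None, float("inf")
    for cand in sorted(adj):
        dist = bfs_distances(adj, cand)
        if any(o not in dist for o in observers):
            continue  # candidate cannot reach every observer
        err = sum(
            ((arrival_times[o] - arrival_times[ref])
             - mean_delay * (dist[o] - dist[ref])) ** 2
            for o in observers
        )
        if err < best_err:
            best, best_err = cand, err
    return best


# Hypothetical toy tree: 0-1-2-3-4 with a side branch 2-5-6.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5), (5, 6)]
tree = {n: set() for n in range(7)}
for a, b in edges:
    tree[a].add(b)
    tree[b].add(a)

# Observers at the three leaves; noisy times for an outbreak that
# actually started at node 5 at an unknown time.
observed = {0: 13.2, 4: 12.9, 6: 11.1}
print("estimated source:", estimate_source(tree, observed))
```

Placing observers on every leaf is what makes each candidate distinguishable here; with fewer or badly placed observers several nodes can produce the same time differences, which is presumably part of where the hard analysis comes in.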

------
washedup
This approach to networks blows my mind. How can someone who already has some
programming knowledge break into this field?

~~~
pathdependent
Social network analysis (SNA) is increasingly a method of analysis for many
fields. Epidemiology is fascinating, but before deciding to jump into a new
field, think about how it may apply to your own sphere of interests.

Since you said you're a programmer, my advice would be to find or collect some
data you are interested in, and play with Python's networkx.[1] For some
starter data, check out Stanford's Large Network Dataset Collection.[2]

[1] <http://networkx.lanl.gov/> [2] <http://snap.stanford.edu/data/>

(Why were people down-voting this? It looked like a sincere question.)

~~~
washedup
Thanks a bunch, this will be very helpful.

