

Microsoft Research applying spam-fighting techniques to attack HIV - fedxc
http://www.theverge.com/2011/12/4/2609172/microsoft-research-applying-spam-fighting-techniques-to-attack-hiv?utm_source=dlvr.it&utm_medium=twitter

======
kevinalexbrown
Here's a link to the actual papers (and code). Note that none of them mentions
"spam":

<http://mscompbio.codeplex.com/wikipage?title=PhyloD>

Overall it's pretty kickass, and the papers themselves tell why it's HIV in
particular: HIV evolves extremely quickly within a specific host, so any
general solution needs to be able to account for a virus that can adapt to
your specific treatment regimen.

And that's the link to spam-fighting: HIV rapidly changes strategies to fool
your immune system into thinking it's a good cell that should be let in, a
problem that is amenable to spam-fighting approaches.

~~~
carbocation
These are great links, and answer many of the questions that I had. But there
must be something more to the story; PhyloD is from 2007 and the news article
is from 2011. Tangentially related, the research article that you linked is
cited by a paper in PNAS that describes an extraordinarily constrained,
coevolving group of sites in HIV (which appears to be a site of protein-protein
interaction in the Gag protein). Such a group is likely a good target for
multiple simultaneous therapeutics, as multiple mutations there likely reduce
HIV's intrinsic fitness: <http://www.pnas.org/content/108/28/11530.full>

~~~
ephermata
Checking David Heckerman's web page shows this paper from October 2011 in the
Journal of Virology. I don't know how it relates to the news article
specifically but it's interesting if you'd like to follow this stream of work.

[http://jvi.asm.org/content/early/2011/10/20/JVI.05577-11.abs...](http://jvi.asm.org/content/early/2011/10/20/JVI.05577-11.abstract)

(Disclosure: I work at Microsoft Research, but not on this project.)

------
carbocation
The article is quite light on details. The version from Microsoft Research is
somewhat more informative:
[http://research.microsoft.com/en-us/collaboration/stories/hi...](http://research.microsoft.com/en-us/collaboration/stories/hiv_research_za.aspx)

My take is that MR is contributing compute resources as much as they are
contributing algos. It's too bad that neither article really describes the
problem domain nor the solution space. If this were a genetic study of HIV
resistance to ARVs (and it seems like it is but again, few details), one could
at least imagine having to look for 3-way interaction terms across the HIV
genome, a large search space. Would be interesting to know if this is the
problem they are tackling, and if MR's major contribution is algorithms or
compute.
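
As a rough illustration of how quickly that search space grows, here is a minimal sketch that counts unordered 3-way site combinations. The site counts are made-up round numbers chosen for illustration, not figures from either article, and `interaction_terms` is a hypothetical helper name.

```python
from math import comb


def interaction_terms(n_sites: int, order: int = 3) -> int:
    """Count the unordered combinations of `order` sites out of `n_sites`."""
    return comb(n_sites, order)


# Assumed, illustrative site counts (not taken from either article).
for n_sites in (500, 3_000, 10_000):
    print(f"{n_sites:>6} sites -> {interaction_terms(n_sites):,} three-way terms")
```

The count grows roughly as n^3 / 6, so testing every triple gets expensive fast, which is why it matters whether the contribution is better algorithms or just more compute.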

------
yajoe
I love reading stories like this, but with a background in fighting spam I'm a
bit skeptical. I would love to hear more details.

For those of you who don't know, most of Microsoft's anti-spam efforts rely on
techniques that I suspect are hard to translate to the HIV domain. Microsoft
catches the overwhelming majority of spam (98%) using IP address blocking.
Most of the remaining spam is caught using long lists of regular expressions
managed by humans. I would not expect researchers to be crafting regular
expressions or mapping blocklists to protein sequences. Maybe they are, but
the article makes it sound like some algorithmic approach.

Obviously there are more modern techniques for fighting spam, but Microsoft
isn't using them yet, and I hardly think of Microsoft as a leader in this
space.

~~~
FooBarWidget
The people in Microsoft Research are very different people than the ones in
charge of Hotmail. It could be that the Research people are more familiar with
content analysis methods.

~~~
jarin
I've said it before, but I really think Microsoft would be a completely
different (and better) company if Microsoft Research were a bigger part of
its product development. Just look at the insane stuff they're doing with
Kinect, for instance.

~~~
riffraff
There is a chance that if they were more involved, there would be an implicit
loss of their freedom to do amazing stuff.

------
JonnieCache
I'm guessing this just means they used some Bayesian maths.

------
heckerma
David Heckerman here...

We are using a key principle that we use in fighting spam to fight HIV. In the
case of spam, we have spammers changing their emails to get around the spam
filters. So, we go after their Achilles heel: their need to extract money. In
the case of HIV, we have HIV mutating to get around our immune system. Here,
we’re again looking for the Achilles heel(s) of HIV—vulnerable spots on the
virus that, if they mutate, compromise the function of the virus. One step in
this approach is to catalog the spots along HIV that our immune system can
target. This is where PhyloD comes in. Here are some of the articles
describing our search for targets using PhyloD:

• G. Alter, D. Heckerman, A. Schneidewind, L. Fadda, C. Kadie, J. Carlson, C.
Oniangue-Ndza, M. Martin, B. Li, S. Khakoo, M. Carrington, T. Allen, and M.
Altfeld. HIV-1 adaptation to NK-cell-mediated immune pressure. Nature, 476
(7358): 96-100, August 2011.

• A. Bansal, J. Carlson, J. Yan, O. Akinsiku, M. Schaefer, S. Sabbaj, A. Bet,
D. Levy, S. Heath, J. Tang, R. Kaslow, B. Walker, T. Ndungu, P. Goulder, D.
Heckerman, E. Hunter, and P. Goepfert. CD8 T cell response and evolutionary
pressure to HIV-1 cryptic epitopes derived from antisense transcription. JEM,
10.1084/jem.20092060, January 2010.

• C. Berger, J. Carlson, C. Brumme, K. Hartman, Z. Brumme, L. Henry, P.
Rosato, A. Piechocka-Trocha, M. Brockman, P. Harrigan, D. Heckerman, D.
Kaufmann, and Ch. Brander. Viral adaptation to immune selection pressure by
HLA class I-restricted CTL responses targeting epitopes in HIV frameshift
sequences. JEM, 10.1084/jem.20091808, January 2010.

