
How Microsoft computer scientists and researchers are working to 'solve' cancer - isp
https://news.microsoft.com/stories/computingcancer/
======
isp
Research in progress at Microsoft Research at Cambridge, UK.

This has been picked up by the media (how I heard about it) - but with the
usual hyperbole, e.g.,

\- Microsoft will 'solve' cancer within 10 years by 'reprogramming' diseased
cells: [http://www.telegraph.co.uk/science/2016/09/20/microsoft-
will...](http://www.telegraph.co.uk/science/2016/09/20/microsoft-will-solve-
cancer-within-10-years-by-reprogramming-dis/)

\- Microsoft is reprogramming cancer:
[http://www.pcauthority.com.au/News/437929,microsoft-is-
repro...](http://www.pcauthority.com.au/News/437929,microsoft-is-
reprogramming-cancer.aspx)

~~~
charlesism
\- Microsoft, in a Flash of Genius, Will Soon Devise "Grand Unified Theory"

\- Microsoft Shortly to Invent Electric Car, and Probably Outsell Tesla by a
Factor of 3

\- Microsoft Only Two Years Away from Composing a Song So Beautiful it
Revolutionizes Music Industry

~~~
Arnt
And a few days ago on HN:

\- Oracle launches cloud service, predicts Amazon's demise

------
joeyspn
It's great that we have all these research teams tackling the cancer problem
from a machine-learning/genomics/personalised-medicine angle...

But do they have a common/open standard for technologies (ML, CV, etc),
sharing datasets and findings, trials, etc? IMO something like HL7 [0] but for
cancer research would be a game-changer.

I see a lot of promising approaches but not so good coordination, with
everybody working in silos and developing their own framework... or maybe it's
what I see on the surface.

[0] [http://www.hl7.org/](http://www.hl7.org/)

~~~
patall
There are. Data types are pretty much standardized on the basic level (e.g.
fastq, bam, bed, vcf etc.). Otherwise, most of it is done by the big science
consortia like TCGA, ICGC & PanCancer. And obviously the big databases from
NCBI and EBI-EMBL. Trials are standardized by EMA and FDA.

~~~
dekhn
FASTQ, BAM, BED, VCF, etc are all standardized in terms of file format, but
the semantics of the content are not. Harmonizing data across multiple
semantics is still a very challenging problem that few people take seriously.

~~~
patall
Like the good old "chr1" == "1" problem? Well, yeah, that’s true. But still, I
think we have those consortia in place, so why do we need another authority?
They just have to set (and somewhat enforce) the rules.

~~~
dekhn
It goes far beyond "chr1" = "1". There are many other fields with next-to-
impossible to interpret data. The consortia have, at best, advisory roles,
they cannot enforce any rules. And given an extensible file format, third
parties are going to extend it with poorly defined/standardized data. Then the
consumers have to harmonize the data. Data harmonization (or cleaning) is
typically one of the bottleneck steps in applying machine learning at scale.
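
For readers unfamiliar with the "chr1" == "1" example, here is a minimal sketch of the simplest kind of harmonization, assuming only that some sources prefix contig names with "chr" and others do not. The names `normalize_chrom` and `same_locus` are hypothetical helpers made up for illustration, not part of any real library.

```python
# Hypothetical helpers: map "chr1"-style and "1"-style contig names
# onto a single convention before comparing records from different sources.

def normalize_chrom(name):
    """Return the contig name without a 'chr' prefix; mitochondrion becomes 'MT'."""
    stripped = name.strip()
    if stripped.lower().startswith("chr"):
        stripped = stripped[3:]
    upper = stripped.upper()
    if upper in {"M", "MT"}:
        return "MT"
    if upper in {"X", "Y"}:
        return upper
    # Anything else (numbers, unplaced contigs) is returned unchanged.
    return stripped


def same_locus(chrom_a, pos_a, chrom_b, pos_b):
    """Compare two (chromosome, position) pairs after normalizing the names.

    Assumes both positions already use the same coordinate convention.
    """
    return normalize_chrom(chrom_a) == normalize_chrom(chrom_b) and pos_a == pos_b


print(same_locus("chr1", 12345, "1", 12345))
```

This only covers naming. It deliberately assumes the positions are already comparable, which is itself not guaranteed: BED coordinates are 0-based while VCF positions are 1-based, so even this toy case needs more care in practice.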

------
patall
For those not that deeply involved in this: There is the whole field of
bioinformatics that is involved in this kind of research, creating data in the
range of multiple petabytes, using some of the biggest computing clusters
around. That a big enterprise like Microsoft wants their part of the cake is
obvious, especially as this is going to be a billion dollar market. But there
is no reason to especially believe in Microsoft, they are just doing what
everyone else is doing (and only with more money and/or expertise if they
spend more than a few billion dollars on it, else this is mere marketing,
nothing really noteworthy). Maybe it's a nice touch for the field that now
even Microsoft has noticed it.

~~~
zbjornson
Indeed there's a whole field, and I think Microsoft has a strong potential to
radically improve it.

Most bioinformaticians I know are poor developers, have limited knowledge of
existing algorithms, and/or have critically limited bio knowledge. As a result
you get buggy, slow and unmaintainable software (see: all the MRI analysis
software posts recently), reinvention of classical algorithms (anything from
basic statistical tests to full on clustering methods) or tools lacking the
full scope required to actually be useful.

I think MS could change this by having strong CS/SWEs with longer tenures than
you see in academia, backed by more money and essentially unlimited computing
resources, with a high profile name to win them collaborators who can share
good datasets. (Compared with Google, they're not trying to generate data in-
house as far as I've heard.) Some of the most notable advances in
bioinformatics have come from applying decades-old algorithms that never made
it to bio before then. Even if MS just accelerates this by applying people who
know about these algorithms, that would be very helpful.

~~~
_of
Cancer cannot be solved with algorithms and machine learning alone. Remember
that it's biology, not informatics. A lot of smart people have been working on
cancer for decades. It's unlikely MS will 'solve' cancer.

------
dekhn
Stop saying you're going to "solve" or "cure" cancer. That doesn't even make
sense. Show some humility.

~~~
JabavuAdams
Independent of any overly-optimistic time-frames, why does this not make
sense?

I take it to mean that we will understand cancer so completely that we can
render it a minor disease.

Oh man, I had to call in sick last week 'cause I had a nasty cancer, but I'm
feeling fine now.

~~~
dekhn
I strongly suggest reading "The Biology of Cancer" by Weinberg to get a better
understanding of why we will likely not understand cancer so completely that we can
render it a minor disease.

------
philipkglass
The last two projects mentioned in the article (Literome and better
interpretation of medical images) sound like they may be useful. The projects
that are described as cell-debugging and cell-reprogramming sound like the
sort of hubristic projects that someone with a good computing background but
less than an undergraduate knowledge of cell biology might propose (at least
as the projects are described in the article).

About a decade ago I was involved with a US Department of Energy effort to
model radiation damage to DNA in silico, and more peripherally involved with a
proteomics effort working out of the same institution. They didn't work very
well. The number one problem was that we didn't have good baseline data from
experiments to work with. IMO, you'd get better mileage out of automating in
vitro experiments and ensuring that the data is reproducible before you start
trying to leverage the latest in computer science. How much can you machine-
learn from garbage inputs? Biology experiments even at the isolated cell level
are notoriously fiddly. Let _the biologists_ (or at least people who have
worked with them) explain the problems that need solving in biology, and then
bring in CS people as required. Starting with a computing-oriented perspective
is just begging to misunderstand what biological research needs.

------
muzster
Hats off to the new Microsoft for a worthy endeavour.

I just hope that the eventual solution is cheap for those that need it and
that the tools and techniques are readily available to the hacker community
(i.e. open sourced).. otherwise I'll put my hat back on.

------
zh217
[http://xkcd.com/1736/](http://xkcd.com/1736/)

~~~
tromobne8vb
This one is also relevant: [https://xkcd.com/793/](https://xkcd.com/793/)

------
silphe
I look at their publications list: [https://www.microsoft.com/en-
us/research/research-area/medic...](https://www.microsoft.com/en-
us/research/research-area/medical-health-genomics/?q&content-
type=publications)

With all due respect, the claim of the article is overblown. Those are the
kind of articles a very good department would publish, nothing special or game
changing.

------
reasonattlm
There is a very interesting dynamic at work in medical research strategy where
the scientific urge to completely map our biochemistry meets the engineering
urge to make tangible progress in repairing things when they go wrong.

The truth of the matter is that most of the mapping of molecular biochemistry
being done in this age of enormous datasets and genetics is of very little
immediate application. Meanwhile paths that can use what we already know to
generate enormous impact on disease get little attention. This is an age of
personalized medicine as the mainstream, but really the overwhelming majority
of serious disease mechanisms, those that are age-related or infectious, are
exactly the same for everyone. Personalized medicine is a wondrous machine for
making money by the look of it, but won't deliver on its promises.

For example in cancer the best path forward is to turn off telomerase and ALT,
preventing telomere lengthening. That will work for all cancers, one type of
treatment for every cancer type, and will cost no more to bring to fruition
than any one of the recent examples of cancer therapies brought to market. But
that gets a fraction of the attention that goes towards mapping every last
part of the genetics and cellular biochemistry of cancer. The efforts to
create therapies that spin off from that type of mapping work are largely very
limited, in that they involve targeting a mechanism or marker very specific to
only one or a few of the hundreds of types of cancer, and even those types are
capable of evolving away from that marker or mechanism when it is successfully
targeted.

Cancer is hard because the research and development community largely follows
a terrible high level strategy for implementation of therapies, even though in
the long term the mapping strategy is exactly what scientists should be doing.

~~~
whenwillitstop
Are you claiming you know how to solve cancer?

~~~
swalsh
I think he is talking about this: [http://www.bu.edu/research/articles/a-new-
tactic-for-fightin...](http://www.bu.edu/research/articles/a-new-tactic-for-
fighting-cancer/)

It's actually pretty interesting.

------
NKCSS
Cancer is just a name to group a lot of different diseases under; there can't
be a universal cure...

~~~
rubber_duck
If you had a general purpose way to selectively kill specific cell types then
there could theoretically be a universal cure?

~~~
dekhn
No, because cancer always evolves. If you don't kill 100% of cancer cells, the
remaining cells are likely resistant to whatever you used to kill the specific
cell type, and they will grow up and continue causing problems.

What you would need is something that recognized cancer cells and only cancer
cells with 100% specificity/100% selectivity, and killed all of them. This is
near impossible for something that can evolve.

~~~
TeMPOraL
It's an arms race, but we can in principle stay mostly ahead in it. Evolution
is not a perfect solution finder, and in the future, we may be able to
constrain it by reducing the probability of a mutation we can't handle to
arbitrarily low values. We know how the optimization process works here, what
we lack today is enough control over the problem space to screw that process
up.

~~~
dekhn
Arms races are a great way to bankrupt yourself. As for "Evolution is not a
perfect solution finder, and in the future, we may be able to constrain it by
reducing the probability of a mutation we can't handle to arbitrarily low
values", that's a nice speculation, however, it's just a speculation, and not
one well-founded in fact.

~~~
TeMPOraL
It's one race we can't avoid - biology will not cooperate to stop evolving
pathogens if we collectively agree to abandon medical research.

As for the speculation part, it kind of follows from the math of things.
Evolution is very dumb compared to us, it just has the numbers advantage.

~~~
dekhn
I heartily encourage you to look at the current state of antibiotics and
appreciate that while evolution may not be "smart", it has more advantages
than just numbers.

As for cancer, it's not clear that more effective cancer will evolve in the
near term if we don't apply selection pressure to it: cancer has existed in
humans with little to no change for millions of years and in mammals for far
longer than that.

------
jbb555
Well good for them and I wish them success. But I'll believe it when I see it.

------
youdontknowtho
Why would you post marketing material and then have a problem with it being
marketing?

Google X is this kind of thing 1000x.

------
pjc50
\- Microsoft Cure For Cancer To Ship With Windows 10 Whether You Like It Or
Not

(joke)

------
ph0rque
Notwithstanding the grandiose statements, I love this direction: cure cancer
by somehow telling the cancerous cells to become normal cells.

------
JabavuAdams
We should harness cancer to engineer intentional structures. Is it brain
cancer, or is it a neural lace?

------
hprotagonist
[http://blogs.sciencemag.org/pipeline/archives/2016/09/21/bet...](http://blogs.sciencemag.org/pipeline/archives/2016/09/21/better-
faster-more-comprehensive-manure-distribution)

Derek Lowe's take, as usual, is a good counterpoint and a giant "Take With
Lump Of Salt" NB.

>I have beaten on this theme many times on the blog, so for those who haven’t
heard me rant on the subject, let me refer you to this post and the links in
it. Put shortly – and these sorts of stories tend to put actual oncology
researchers in a pretty short mood – the cell/computer analogy is too facile
to be useful. And that goes, with chocolate sprinkles on it, for all the
subsidiary analogies, such as DNA/source code, disease/bug, etc. On one
level, these things do sort of fit, but it’s not a level that you can get much
use out of. DNA is much, much messier than any usable code ever written, and
it’s messier on several different levels and in a lot of different ways. These
(which include the complications of transcriptional regulation, post-
transcriptional modification, epigenetic factors, repair mechanisms and
mutation rates, and much, much, more), have no good analogies (especially when
taken together) in coding. And these DNA-level concerns are only the
beginning! That’s where you start working on an actual therapy; that’s what we
call “Target ID”, and it’s way, way back in the process of finding a drug. So
many complications await you after that – you can easily spend your entire
working life on them, and many of us have.

>And that’s why many of us who have actually been working on diseases like
cancer get a little testy when we see folks from computer science coming in
with this “Gosh darn it fellows, do I have to do everything myself?” attitude.
Years of working with (human-designed) hardware running (human-written) code
have given many people in that field what I think is an exaggerated idea of
human capabilities (at least as they are right now). When you can write code
that gets used by hundreds of millions of people in their daily lives, it’s
understandable to think that you’re able to just reach in and change reality
by sheer braininess and force of will. Unfortunately the world of code and
computational hardware, as important, useful, and lucrative as it is, is just
a sandbox compared to the real physical universe, of which living creatures
are just a tiny little part. But biology has no debugging programs, no
annotations, no manuals. It wasn’t written by humans – in fact, as far as we
know, it wasn’t written by anyone at all, it “just grew” in a process that has
no good counterpart to the ways that humans generally get things done. ....

>If you remove the hubris from the Microsoft announcement, though, which takes
sandblasters and water cannons, you get to something that could be
interesting. It’s another machine learning approach to biology, from what I
can make out, and I’m not opposed in principle to that sort of thing at all.
It has to be approached with caution, though, because any application of
machine learning to the biology literature has to take into account that a
good percentage of that literature is crap, and that negative results (which
have great value for these systems) are grievously underrepresented in it as
well. I think that machine approaches to understanding biological pathways
will, in the end, probably be the way to go, because it’s too complex for us
to keep it all straight in our minds (not human!) But we’re not there. There
are many, many important things that we simply don’t understand very well, and
many others, I’m sure, that we just flat-out don’t even know exist yet. Debug
that.

>So if Microsoft wants to apply machine learning to cancer biology, I’m all for
it. But they should just go and try it and report back when something
interesting comes out of it, rather than beginning by making a big noise in
the newspapers. You want to cure cancer? Go do it; don’t sit around giving
interviews about how you’re going to cure cancer real soon now. I’m sure that
someone at the company imagines this as a big blazing publicity sendoff,
fireworks and balloons and all the rest of it. But to anyone who works in the
field, it’s more like a grenade going off inside a manure pile.

~~~
lloyd-christmas
_“We’re in a revolution with respect to cancer treatment”_

Reading this quote from the original article makes me nauseous. We're in a
revolution with respect to cancer RESEARCH. PEOPLE are the ones being treated.
You can't only focus on a math equation. You also need to focus both on
palliative care and explainable research. The article seemed to save that
acknowledgement for the last two sentences.

My father has been an oncology physician/researcher for 35 years. He regularly
has to tell some of his non-physician researchers that their work needs to
be explainable to both his patients and those who will continue funding the
research, yet they consistently show up with black box neural networks saying
"we found it!"... found what exactly? He has to be able to convince his
master-of-the-universe patients that he knows more than WebMD or their
favorite holistic blog, see: Steve Jobs. It's incredibly more common than one
would hope, see: anti-vax.

Just to clarify, there is nothing wrong with ML research into cancer. However,
research isn't treatment. Oncology is depressing. Fewer and fewer doctors are
interested in specializing in it. The overall suicide rate of physicians is
double that of the general population, and that's not even oncology-specific. There's
a revolution with respect to cancer treatment, and it's a pretty scary
downward spiral.

Announcements such as these can end up taking money AWAY from cancer
treatment. The government sees this stuff and jumps on the bandwagon. "Why
fund humans when computers can do it?" Government research grants pay my
father's salary along with all of his physician-researchers and students.
Without that funding, people just get to read about the upcoming research
while they wait to die.

------
barking
That's Cancer with a big C, not the operating system also known as Linux

