
Medical Breakthrough in Spinal Cord Injuries Was Made by a Computer Program - fahimulhaq
http://www.fastcoexist.com/3052282/the-latest-medical-breakthrough-in-spinal-cord-injuries-was-made-by-a-computer-program
======
skywhopper
The substance of the article is quite interesting, but the headline and
premise--that it was a computer program and not humans who found the result--
is ridiculous. The computer program did not collate and index the raw data and
notes. The computer program did not choose the relevant inputs from the sum of
all knowledge. And most importantly, the computer program did not write
itself.

Software is a tool that humans create and use, not an entity in itself. Even
if you think true AI is near at hand, this article describes nothing of the
sort.

Houses are far easier to build with saws, hammers, and nails than by
manipulating wood, earth, and metal with our bare hands, but that does not mean
the tools built the house.

~~~
Dn_Ab
I agree that this wasn't done by the computer (did computers uncover the Higgs
Boson?) but I also do not believe humans can take most of the credit: this was
the result of a Man Machine System team up—trying to disentangle credit
assignment is not a worthwhile activity. Roughly and from a quick reading of a
paper thickly frosted with jargon I am unfamiliar with, the method works by
creating networks—which highlight key relationships—for visualization by
searching for stable clusters in a reduced dimensionality space of the
variables.

Humans are there to explore the visualizations, interpret the network
structures and understand the clusters and variables. The machines are
intelligent too; they do the heavy work of comparing large numbers of points
in a high dimensional space, factorization and searching for a way to express
the data in a manner that makes it easier to uncover promising research
directions and hypotheses.

Scanning this, it seems the most valuable contributions are their network
visualization and exploratory tools. I think they should be proud of those and
see no need to stretch so mightily to connect this to Stronger AI. As Vinge
notes, "I am suggesting that we recognize that in network and interface
research there is something as profound (and potentially wild) as Artificial
Intelligence."

[http://www.nature.com/ncomms/2015/151014/ncomms9581/full/nco...](http://www.nature.com/ncomms/2015/151014/ncomms9581/full/ncomms9581.html)

~~~
Yomammas_Lemma
>I agree that this wasn't done by the computer (did computers uncover the
Higgs Boson?) but I also do not believe humans can take most of the credit:
this was the result of a Man Machine System team up

You realize that they're using software made by a team of mathematicians and
software developers, right? If you want to give credit to the software, give
credit to the people who wrote the code and discovered the mathematics. This
isn't any different than how physicists would use Mathematica.

------
mfoy_
I think the second big takeaway is that this was only made possible because
the scientists willingly shared their "dark data" -- data and lab notes from
failed experiments. I wonder how much data is hoarded privately and never
opened up and analyzed like this.

~~~
cryoshon
Frequently a lot of this hoarded data is flawed or defective due to improper
setup or execution of the experiment. That isn't to say the information in
this "dark data" is useless, but it needs to be taken in context. The cleanest
data with the best results are put forward into a paper; the chaff is not.

~~~
chris_wot
Shouldn't this be noted? Isn't providing the "best" data risking making data
fit your hypothesis?

~~~
technofiend
Not if your dark data is "I forgot to autoclave an instrument and contaminated
my samples." In that case without an unexpected positive result it's just
error and not worth reporting.

~~~
catshirt
biology is not my forte but- i imagine "dark data" (with context) is always
more valuable than no data.

~~~
dmd
Last time your build failed because you made a typo, did you package it up and
release that version anyway?

~~~
wyager
Bad analogy. There's a lot more to learn from a failed medical experiment than
a failed build.

~~~
cryoshon
Depends on whether the lessons have already been learned or not.

Stuff like "we fucked up our culture, therefore our cells couldn't do
whatever we wanted" is well understood.

In clinical trials where some patients may respond to a treatment and others
not, there's definitely a lot more to learn there, if you have a large enough
data set and a plurality of controls.

------
Herodotus38
Here is the actual paper:
[http://www.nature.com/ncomms/2015/151014/ncomms9581/full/nco...](http://www.nature.com/ncomms/2015/151014/ncomms9581/full/ncomms9581.html)

Coming from medicine, I think the title of "medical breakthrough" is too
generous. It's a great proof of concept, but all this says is that in rats,
high BP in thoracic spinal cord injuries was associated with worse outcomes.
I'd like to see a follow-up on human data from perioperative BP recordings
next. If it still holds true, then you can research whether an intervention in
BP control makes a difference. I'm not a neurosurgeon, but I'm sure the
correlation between BP and SCI outcomes has been looked at before.

~~~
jlnielson
Preliminary evidence has been found in humans, looking at the other end of the
spectrum with hypotension.

[http://online.liebertpub.com/doi/10.1089/neu.2014.3778](http://online.liebertpub.com/doi/10.1089/neu.2014.3778)

We are now looking at the hypertension relationship in humans, as well as
mechanistic studies in rats.

------
cryoshon
"The process was outlined in a paper published today in Nature, and hints at
the possibility of medical breakthroughs lurking in the data of failed
experiments."

If there were some way to make sense of data from negative-results experiments
reliably, it would be absolutely revolutionary and would certainly turn our
ideas about what constitutes a successful experiment on their head. I am very
hopeful for fruitful results from the methods outlined in this article.

I worry about the usage of "failed" experimental data here, though. I've
"failed" a lot of experiments for reasons other than not finding the effect I
was looking for in my data. Any exploration of data from negative-results
experiments needs to be taken very narrowly, with a deep understanding of
exactly what effect is being examined.

Experiments are frequently designed incorrectly for studying the effect they
want, and are almost always not suitably controlled for examining non-primary
effects. Try to find a trend throughout non-primary effects over a large swath
of experiments, and I'm sure you will-- but it may be noise.

~~~
pmiller2
> Any exploration of data from negative-results experiments needs to be taken
> very narrowly, with a deep understanding of exactly what effect is being
> examined.

I disagree. There's value in data mining previous experiments, just not
conclusive value. As long as the results of such data mining are limited to
generating new hypotheses (which are then tested by experiments explicitly
designed to do so), I think this methodology can have great value. In this
particular case, I don't think it's a surprise that perioperative hypertension
is associated with worse outcomes, but the hypothesis that controlling BP with
medication before surgery and on through recovery might produce better
outcomes is worth investigating.

------
fanofyan
I'm a frontend developer at Ayasdi and we're hiring!!!

[http://www.ayasdi.com/company/careers/](http://www.ayasdi.com/company/careers/)

Come be a part of future breakthroughs.

~~~
jrowley
I can't see your opportunities because the iframe embedded is having too many
redirects. Just a heads up. :/

------
programnature
Topological data analysis is amazing, too bad all the hype around DL is
leaving it in relative obscurity.

------
topologix
Hey HN folks - I am the co-founder and CEO of Ayasdi. If you have questions
about the math/CS aspects of this, happy to answer.

~~~
steamer25
Do you recommend any good primers on topology? I thought this
([https://colah.github.io/posts/2014-03-NN-Manifolds-Topology/](https://colah.github.io/posts/2014-03-NN-Manifolds-Topology/))
was an interesting article and I see what looks like some great papers and
videos available at
[http://www.ayasdi.com/approach/data-scientist/](http://www.ayasdi.com/approach/data-scientist/),
but I don't know the difference between homotopy and homology (yet) :) .

What kinds of infrastructure/tech do you think will have the most utility for
topological data analysis in the near future? E.g., GPUs, Apache Spark, FPGAs,
etc.

Any thoughts on an Ayasdi public offering? I'd like to consider investing but
I don't have millions of dollars (yet) :) .

Thanks for your time.

~~~
topologix
Hey,

Some reading material:

A very general blog about philosophy:
[http://radar.oreilly.com/2015/07/data-has-a-shape.html](http://radar.oreilly.com/2015/07/data-has-a-shape.html)

A slightly more in-depth blog:
https://shapeofdata.wordpress.com/2013/08/27/mapper-and-the-choice-of-scale/

A very accessible book about topology (especially from an algorithms
perspective):
http://www.amazon.com/Computing-Cambridge-Monographs-Computational-Mathematics/dp/0521136091/ref=sr_1_1?ie=UTF8&qid=1444971634&sr=8-1&keywords=topology+for+computing

Blog exposing persistent homology:
https://normaldeviate.wordpress.com/2012/07/01/topological-data-analysis/

Videos exposing persistent homology:
https://www.youtube.com/watch?v=CKfUzmznd9g

Some free software:

Python Mapper by Daniel Müllner: http://danifold.net/mapper/index.html

JPlex library by Harlan Sexton:
http://www.math.colostate.edu/~adams/jplex/index.html

Dionysus by Dmitriy Morozov: http://www.mrzv.org/software/dionysus/

Topological Data Analysis in R:
https://cran.r-project.org/web/packages/TDA/vignettes/article.pdf

Infrastructure

Our tech stack is:

    Backend
        HDFS for storage
        Our ML and Math code is hand-rolled C++ and Assembly (7% LOC)
        All coordination/distributed systems code is in Java
        ZMQ for communication
        Protocol Buffers for protocol
    Frontend
        D3
        Backbone
        Hand-rolled WebGL graph visualization (we open sourced it at https://github.com/ayasdi/grapher)

We currently don't use GPUs or any other fancy hardware, primarily because
today our customers use commodity hardware, and getting F1000 companies to buy
cutting-edge hardware is just plain horrible.

We have an awesome GPU rig at our offices that we test algorithms on and it
can really make our algorithms scream, but again, none of our customers
have/are willing to invest in GPUs.

Apache Spark - it is interesting that in our experience, making it work for ML
algorithms is really too much work unless you invest the time to understand
the framework and its fundamentals. It performs very well for ETL type tasks,
which is what we use it for.

On a public offering: no comment :)

If you have more questions - I am easy to find :)

Gurjeet

------
meeper16
Hidden relationship mining has taken a few different paths, from TDA to LDA
graphical modelling (Michael Jordan, David Blei) to vector-space-driven
approaches (Berkeley Lab). Extracting hidden relationships in datasets and
using these to form new hypotheses and enable or even make new discoveries is
certainly the future...

Lawrence Berkeley National Laboratory vector space for hidden relationships:
[http://newscenter.lbl.gov/news-releases/2008/07/09/berkeley-...](http://newscenter.lbl.gov/news-releases/2008/07/09/berkeley-lab-wins-four-2008-rd-100-awards/)

A Search Engine that Thinks
[http://newscenter.lbl.gov/feature-stories/2005/03/31/a-searc...](http://newscenter.lbl.gov/feature-stories/2005/03/31/a-search-engine-that-thinks-almost)

Statistical modeling of biomedical corpora: mining the Caenorhabditis Genetic
Center Bibliography for genes related to life span - Blei DM1, Franks K,
Jordan MI, Mian IS. -
[http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1533868](http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1533868)

------
xixi77
One immediate question is whether this might be a result of overfitting --
when enough hypotheses are tested, some will surely be confirmed at any given
significance level. Still quite interesting of course; now what is needed is a
follow-up study on other datasets (preferably human), or an experiment to
confirm. A better (but less exciting) title would be "A computer program
suggests a promising avenue of research".
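
The multiple-testing worry can be made concrete with a quick calculation. This is a minimal sketch, assuming the tests are independent and every null hypothesis is true (both are simplifying assumptions, and the function names are made up for illustration). It shows how fast the chance of at least one spurious "confirmation" grows, plus the usual Bonferroni-style per-test threshold.

```python
def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent true-null tests
    comes out 'significant' at level alpha: 1 - (1 - alpha) ** num_tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_alpha(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    for m in (1, 10, 100):
        print(m, round(prob_at_least_one_false_positive(m), 3))
```

With 100 such tests at the 0.05 level, a false positive somewhere is close to certain, which is why a result found this way is best treated as a hypothesis for a follow-up study.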

------
chris_wot
So my main takeaway is that scientists should publish their original data.
Frankly, I'm amazed they don't already!

------
xbmcuser
Things like this are why I feel something like Google DeepMind could be a game
changer for the sciences if the data from all human research were available to
them. They might never reach the point of true AI, but they would still beat
all humans at finding relationships between data that humans can't even
remember.

------
jmpeax
"The process was outlined in a paper published today in Nature
Communications". NO LINK TO THE PAPER ARGGGHH!!!!!

------
drcross
Could someone please offer a TL;DR? The article seems very clickbaity and full
of teases.

~~~
harveywi
From the article: "In the case of the spinal cord injury data, Ayasdi’s TDA-
driven approach mostly confirmed what researchers already knew: The drugs
didn’t work."

How this "Ayasdi" company's analysis probably works (based on "Topology based
data analysis identifies a subgroup of breast cancers with a unique mutational
profile and excellent survival" and the original "Mapper" paper "Topological
Methods for the Analysis of High Dimensional Data Sets and 3D Object
Recognition"): They take point cloud data and connect each point with its
neighbors (the distance metric that is used is probably domain-specific) to
build a proximity graph that approximates a simplicial complex. As input to
their algorithm, they also have one or more scalar functions defined on the
point cloud data that contain information which is relevant to the problem at
hand. For example, each point could be a gene, and maybe the scalar function
value at that gene could be probability of association with some disease, and
the distance between two genes might be the Levenshtein distance between their
genetic codes.
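
For reference, the Levenshtein distance mentioned in that example can be computed with a short dynamic program. This is a generic textbook sketch, not something taken from the paper or from Ayasdi; whether edit distance is actually the metric used for any real dataset is only the guess made above.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    # previous[j] is the distance between the already-processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]
```

It runs in time proportional to the product of the two string lengths, so for a large point cloud the pairwise distances would dominate the cost.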

With data in this form, they approximate the Reeb graph of one of the scalar
functions, which is a sort of "data skeleton." They can do potentially
interesting/useful things with it.
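
A toy version of that construction, in the spirit of the Mapper paper, might look like the following. It is a sketch under loud assumptions: one scalar function, a cover of its range by equal overlapping intervals, and clustering by connected components of a fixed-radius proximity graph. The names `mapper_graph`, `single_linkage_clusters`, `num_intervals`, `overlap`, and `eps` are invented here and say nothing about how Ayasdi's product works.

```python
import math
from itertools import combinations


def single_linkage_clusters(indices, points, eps):
    """Connected components of the graph joining points closer than eps."""
    remaining = set(indices)
    clusters = []
    while remaining:
        seed = remaining.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            current = frontier.pop()
            near = {j for j in remaining if math.dist(points[current], points[j]) < eps}
            remaining -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper_graph(points, filter_values, num_intervals=4, overlap=0.25, eps=0.5):
    """Return (nodes, edges): nodes are clusters of point indices, and an edge
    joins two clusters whenever they share at least one point."""
    low, high = min(filter_values), max(filter_values)
    length = (high - low) / num_intervals
    nodes = []
    for k in range(num_intervals):
        # Stretch each interval by `overlap` of its length on both sides.
        start = low + k * length - overlap * length
        end = low + (k + 1) * length + overlap * length
        members = [i for i, value in enumerate(filter_values) if start <= value <= end]
        nodes.extend(single_linkage_clusters(members, points, eps))
    edges = {(a, b) for a, b in combinations(range(len(nodes)), 2) if nodes[a] & nodes[b]}
    return nodes, edges


if __name__ == "__main__":
    # 40 points on a unit circle, filtered by their x coordinate.
    circle = [(math.cos(2 * math.pi * i / 40), math.sin(2 * math.pi * i / 40)) for i in range(40)]
    nodes, edges = mapper_graph(circle, [p[0] for p in circle], eps=0.3)
    print(len(nodes), "nodes,", len(edges), "edges")
```

On the circle example the resulting graph is itself a single loop, which is the sense in which the output acts as a skeleton of the data. The answer depends on the interval count, the overlap, and `eps`, so real use involves tuning those.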

The approximation of the Reeb graph reveals zero-cycles (connected components
of the simplicial complex) and some one-cycles (handles/tunnels in the graph,
sort of like holes in a donut). This "skeleton" of the data allows them to do
a variety of things, such as segment the data into components that are
(approximately) topologically "simple" (they do not contain any 1-cycles),
identify local maxima/minima, find saddle points where forks in the data merge
together, and locate "essential saddles" which constitute the high points and
low points of handles/tunnels. They can also remove "topological noise", which
helps them to separate spurious topological thingies from features that might
be important.
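
For a plain graph (treated as a one-dimensional complex with no filled-in triangles, which is an assumption of this sketch), counting the zero-cycles and one-cycles described above is simple: the number of connected components comes from a union-find pass, and the number of independent loops is edges minus nodes plus components. The function name is hypothetical.

```python
def graph_betti_numbers(num_nodes, edges):
    """Return (b0, b1) for an undirected graph on nodes 0..num_nodes-1.

    b0 is the number of connected components (zero-cycles);
    b1 = E - V + b0 is the number of independent loops (one-cycles).
    Self-loops and repeated edges are ignored.
    """
    parent = list(range(num_nodes))

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    components = num_nodes
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    for edge in unique_edges:
        u, v = tuple(edge)
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
    return components, len(unique_edges) - num_nodes + components
```

A hexagon-shaped graph gives one component and one loop, and a tree gives one component and no loops.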

Their technique doesn't necessarily recover "true" topological information
since a lot of what they do is approximate. There are actually more accurate
techniques (e.g., simplicial homology, or fast Reeb graph algorithms) for
getting an exact answer, albeit with potentially higher computational cost.

Topological data analysis is a big field, and this Ayasdi company appears to
mainly use this one approach (but I could be wrong). I think they are trying
to lay claim to the term "topological data analysis" and get people with money
excited about it.

~~~
topologix
One quick edit to this description: We (Ayasdi) have generalized the notion
of Reeb graphs such that it is no longer limited to single scalar
functions. While in the single scalar function case the mapper algorithm is an
(extremely efficient) approximation to the Reeb graph, in the multiple scalar
function case, it has no direct theoretical analogue (although the notion of
Reeb Spaces is similar).
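
One way to picture the multiple-function case is a cover built from products of overlapping intervals, one interval per function, so each cell is a box in filter space. The sketch below only builds that cover and lists which points land in each cell. It is an assumption about the general idea, not a description of Ayasdi's generalization, and all names are hypothetical.

```python
from itertools import product


def interval_cover(values, num_intervals, overlap):
    """Overlapping intervals covering the range of one filter function."""
    low, high = min(values), max(values)
    length = (high - low) / num_intervals
    pad = overlap * length
    return [(low + k * length - pad, low + (k + 1) * length + pad) for k in range(num_intervals)]


def product_cover_members(filters, num_intervals=3, overlap=0.25):
    """Map each non-empty cell of the product cover to the indices of the
    points whose filter values all fall inside that cell.

    `filters` is a list of equal-length lists, one per scalar function.
    """
    covers = [interval_cover(values, num_intervals, overlap) for values in filters]
    num_points = len(filters[0])
    members = {}
    for cell in product(range(num_intervals), repeat=len(filters)):
        inside = [
            i for i in range(num_points)
            if all(covers[f][k][0] <= filters[f][i] <= covers[f][k][1] for f, k in enumerate(cell))
        ]
        if inside:
            members[cell] = inside
    return members
```

Each cell's members would then be clustered and linked by shared points exactly as in the single-function case. The number of cells grows exponentially with the number of functions, so this brute-force loop is only meant for a handful of them.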

We are generally not trying to lay claim to the phrase "Topological Data
Analysis" and are not going around suing people for using it. In fact we still
support research in academia and actively publish in the field. TDA is the
basis of what we do so it is the most efficient way of describing it.

------
findjashua
made possible _

