Large-Scale Machine Learning for Drug Discovery (googleresearch.blogspot.com)
58 points by rey12rey on March 2, 2015 | 11 comments



The scale of this work is quite impressive, but the idea of applying neural nets to chemistry dates to the 1980s or 1990s, possibly earlier. It all comes down to the quality of the data and whether the models are actually predictive rather than retrospective. Medicinal chemistry is a conservative field, but given the series of "totally remakes the field" events that have occurred over the last 30+ years, some of the resistance is understandable. Gasteiger and Zupan have a survey from 1993: Neural Networks in Chemistry (Angew. Chem. Int. Ed. Engl. 32, 503).

The big thing will be whether it helps prevent compounds from dying in the clinic, since such late-stage failures are the most expensive. ADME/Tox prediction would be a huge win, although QSAR/QSPR has been fighting this war basically forever. The companies have the data while the researchers have the methods and bodies; the difficulty of getting these two groups to work together has prevented success in the past. Even then, the data is not as good as one would hope, particularly after the merger mania in pharma mixed up all the datasets.

Edit: They published a whole book about it in 1999: Neural Networks in Chemistry and Drug Design (Wiley, probably $$). I should also add that I have no relationship to their work.


We hosted a smaller scale version of this on Kaggle with Merck several years ago: https://www.kaggle.com/c/MerckActivity

It's great to see that competition's result both supported and extended!


Rich Caruana's original paper on multitask learning (from the 90s) describes this very kind of multi-headed neural network, and it doesn't appear that this Google paper has done anything new with the algorithm.
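For anyone unfamiliar with the term, here is a minimal sketch of what "multi-headed" means: one hidden layer shared by all tasks, with a separate sigmoid output per task, and a loss that skips tasks a compound was never measured on. This is not the architecture from either paper; the class name, layer sizes, and the `None`-means-unmeasured convention are all assumptions made up for illustration, and there is no training loop.

```python
import math
import random


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class MultiHeadNet:
    """Hypothetical toy model: shared hidden layer, one output head per task."""

    def __init__(self, n_features, n_hidden, n_tasks, seed=0):
        rng = random.Random(seed)
        self.W_shared = init_matrix(n_hidden, n_features, rng)  # used by every task
        self.heads = init_matrix(n_tasks, n_hidden, rng)        # one row per task

    def forward(self, x):
        h = relu(matvec(self.W_shared, x))                # shared representation
        return [sigmoid(z) for z in matvec(self.heads, h)]  # per-task probabilities


def masked_log_loss(probs, labels):
    """Mean cross-entropy over tasks that have a label; None means unmeasured."""
    terms = [
        -(y * math.log(p) + (1 - y) * math.log(1 - p))
        for p, y in zip(probs, labels)
        if y is not None
    ]
    return sum(terms) / len(terms) if terms else 0.0


net = MultiHeadNet(n_features=4, n_hidden=3, n_tasks=2)
probs = net.forward([1.0, 0.0, 1.0, 0.0])   # one fingerprint-like input vector
loss = masked_log_loss(probs, [1, None])    # active on task 0, task 1 unmeasured
```

The point of the shared layer is that gradients from every task's head update the same weights, so tasks with little data can borrow from tasks with more.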

So, what's actually novel in this paper?

The AUC improvements they show are fairly modest. Training on millions of data points is pretty old hat in the neural network world.


That they can improve drug discovery? That isn't interesting enough?


>That they can improve drug discovery?

ICML is not a molecular biology journal and can't judge this paper on those merits (and probably neither can most people on news.yc). From a machine learning perspective, though, there isn't much here.


Seems like it's still under review. I feel it's a bit improper to put out a press release when the paper is still under review -- that can easily influence the review process.


Presumably more accurate predictions on the dataset are beneficial. And applications of ML should be of interest to machine learning journals, but I can't speak for that.


I've long wanted some way to donate all of my health data to support things like this. Anyone know any way to do this?


Check out the work being done by John Wilbanks and sagebase. Also, myire is an early stage startup with this goal in mind.

Check out my talks at gravesmedical.com and let me know if you are interested


You can submit data to PGP: http://www.personalgenomes.org/harvard/sign-up

I contributed my whole genome sequence/variants a while ago.


This is terrific work.

On a related note, Atomwise has been doing this for over a year. And they've been able to make predictions that have been tested in the real world in areas such as multiple sclerosis, C. difficile, and Ebola.

http://www.atomwise.com/

Full disclosure: They are my YC batch mates.



