
Neural Networks for the Prediction of Organic Chemistry Reactions - fitzwatermellow
http://pubs.acs.org/doi/abs/10.1021/acscentsci.6b00219
======
duvenaud
One of the authors here. Ask us anything!

This paper is just a first step - what we'd really like to use this for is
designing recipes for synthesizing new molecules.

I would also be remiss if I didn't link to a closely-related paper from
another group that came out at the same time:
[http://pubs.acs.org/doi/abs/10.1021/ci5006614](http://pubs.acs.org/doi/abs/10.1021/ci5006614)

~~~
HarryHirsch
What's worrisome is that on the graphical abstract you have nucleophilic
substitution in the neopentyl position, a reaction every chemist _knows_ won't
proceed rapidly. And it takes place in dimethyl ether solvent, which every
chemist knows is a gas.

It looks a bit like what you see in bad teaching materials, chemistry that is
almost correct, but won't work well for some reason we are not telling the
kids about. Please alleviate my concerns! :-)

~~~
duvenaud
That diagram is just meant to show what type the inputs and outputs of the
neural network are. I'm not a chemist myself, but the other two authors are.

~~~
chillacy
Heh, not a chemist either, but the parent probably has the same feeling I get
when I watch "hackers" in crime dramas break into systems and see a bunch of
HTML and CSS on their terminals.

------
kenrick95
I skimmed through the paper (through my university's subscription) and found
out that the source code & data are (or will be) available on GitHub (yay):
[https://github.com/jnwei/neural_reaction_fingerprint](https://github.com/jnwei/neural_reaction_fingerprint)

~~~
Roboprog
I'm just an old geezer programmer, but my daughter is studying e-tox at UC
Davis. I have discussed with her how important computer code is becoming in
the ability to recreate results in studies.

This is a good example of providing that kind of transparency.

~~~
walrus
Somewhat off-topic, but there's a professor at UC Davis whose blog I've
followed for several years who advocates open, reproducible science. Your
daughter might be interested in the workshops his lab runs:
[https://dib-training.readthedocs.io/en/pub/](https://dib-training.readthedocs.io/en/pub/).
The next one is Nov 17.

~~~
Roboprog
Thanks, passing it on.

------
iopuy
Abstract:

"Reaction prediction remains one of the major challenges for organic chemistry
and is a prerequisite for efficient synthetic planning. It is desirable to
develop algorithms that, like humans, “learn” from being exposed to examples
of the application of the rules of organic chemistry. We explore the use of
neural networks for predicting reaction types, using a new reaction
fingerprinting method. We combine this predictor with SMARTS transformations
to build a system which, given a set of reagents and reactants, predicts the
likely products. We test this method on problems from a popular organic
chemistry textbook."
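
For anyone wondering what a "reaction fingerprint" could look like in practice, here is a toy sketch, not the paper's method. It assumes molecules are given as SMILES strings, hashes character bigrams of each string into a fixed-length bit vector, and concatenates the vectors for reactants and reagents into one input that a classifier could consume. The function names, the bigram hashing, and the 64-bit length are all made up for illustration; a real system would use a chemistry-aware fingerprint rather than raw string n-grams.

```python
import hashlib
from typing import List, Sequence


def toy_fingerprint(smiles: str, n_bits: int = 64, n: int = 2) -> List[int]:
    """Hash character n-grams of a SMILES string into a fixed-length bit vector.

    Hypothetical stand-in for a real molecular fingerprint: it only looks at
    the text of the string, not at the molecular graph.
    """
    bits = [0] * n_bits
    for i in range(len(smiles) - n + 1):
        gram = smiles[i:i + n]
        digest = hashlib.md5(gram.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % n_bits
        bits[index] = 1
    return bits


def reaction_fingerprint(
    reactants: Sequence[str], reagents: Sequence[str], n_bits: int = 64
) -> List[int]:
    """Concatenate per-molecule fingerprints into one reaction-level vector."""
    vector: List[int] = []
    for smiles in list(reactants) + list(reagents):
        vector.extend(toy_fingerprint(smiles, n_bits))
    return vector


# Example: bromoethane plus hydroxide, with water as the (assumed) reagent.
fp = reaction_fingerprint(["CCBr", "[OH-]"], ["O"])
print(len(fp))  # one 64-bit block per molecule
```

The resulting vector has a fixed length for a fixed number of input slots, which is what lets it feed a standard neural network that outputs a reaction type.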

------
wuschel
Very interesting, great concept! The paper is on my to-read list!

I am only afraid that the datasets you have used might not be of sufficient
quality for a neural network application. There are old recipes from when the
state of the art in chemistry was at an earlier stage, e.g. before the
discovery of specific mechanisms, molecule classes, analytics and general
concepts. Also, as mentioned in this thread, there are aspects of the synthetic
chemist's work and experience that might not be taken into consideration in
this approach.

~~~
duvenaud
Yes, the datasets were the real limiting factor for this project, and as you
point out, reactions depend on many more things than their reagents. Hopefully
this work will inspire someone to build a better dataset of reaction setups
and outcomes!

------
amelius
I'm not a chemist, and didn't read the paper, but would it be helpful if the
neural network had additional inputs coming from e.g. a (simplified)
Schrodinger equation solver?

~~~
jnwei
Orbital energies, or other solutions from the Schrodinger equation, would
probably help the prediction if they were included as inputs. If you were to
do this, you'd have to be a little careful about the cost of doing the quantum
mechanics calculation on a whole data set of reactants in your reaction
database, but it could be feasible with a cheap method.

------
jamez1
Forgive my ignorance:

What actually makes it hard to predict a chemical reaction? Can't we
empirically deduce them from quantum mechanics?

~~~
grzm
Theoretically, yes, if we knew all of the physics laws accurately enough.
However, there are a lot of behaviors that are difficult to predict in the
aggregate. That's one of the reasons we end up with different specializations:
quantum mechanics, atomic physics, molecular chemistry, organic chemistry,
etc.

Another example would be protein folding. Even within the same "level" of
chemistry, predicting the three-dimensional structure of a protein molecule
based purely on the chemistry we understand and the protein sequence is a hard
problem. We're getting better at it, but it's still hard.

------
Houshalter
I get "Server Busy".

