

Learning Quantum Mechanics: Machines vs. Humans [video] - bendyBus
https://skillsmatter.com/skillscasts/6109-learning-quantum-mechanics-machines-versus-humans

======
bendyBus
There is a lot of hype surrounding Deep Neural Networks which seem to be able
to solve extremely challenging machine learning problems with almost no human
tuning of parameters.

This is an informal talk at the big-O meetup in London discussing the
application of machine learning to quantum mechanical simulations of atoms and
molecules. This is a particularly demanding application of machine learning
techniques. The requirements for regression accuracy are very high, and in
addition a number of physical laws need to be obeyed by the learning
algorithm. The point of the talk is to invite discussion about the relative
merits of domain expertise _versus_ general-purpose algorithms for high-
performance machine learning.

------
Xcelerate
Wow, this is interesting. I came up with an idea similar to this about two
years ago, but never really worked on it. I guess I should have! Within the
past year this kind of thing has _really_ taken off. This concept is also
being used in molecular dynamics. The LAMMPS developers have recently added a
new "SNAP" potential that performs a Gaussian approximation of the bispectrum
of common atom neighborhood configurations (force field generation on the
fly). See
[http://lammps.sandia.gov/doc/pair_snap.html](http://lammps.sandia.gov/doc/pair_snap.html).
The paper on their implementation is available online, but it isn't even
published yet. In 2014 alone, the number of papers on machine learning for QM
and MD tasks has grown by a HUGE amount. I get the impression this kind of
thing has turned into a race.

I believe this is the paper that started it all: Bartók 2010
[http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.104...](http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.104.136403)

The goal was to remove translational, rotational, and permutative degrees of
freedom (DOF) from a collection of atoms, something I had been attempting
unsuccessfully on my own for a while. The radial distribution function has
traditionally been used a lot in MD, but it only captures a small portion of
the total degrees of freedom for a collection of atoms (which is also the
reason it's difficult to develop structural molecular models from neutron
scattering data alone). The bispectrum, on the other hand, captures almost all
of the DOF in a rotation-invariant way. It's ingenious, really: the atomic
neighborhood density is projected onto the surface of a 4D sphere and expanded
in 4D (hyper)spherical harmonic basis functions. New configurations are then smoothly
interpolated from DFT calculations. This even includes the effect of electron
correlation in MD! (Well, so far as the functional used in DFT is successful
at that task).
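
To make the invariance requirement concrete, here is a minimal Python sketch (numpy only, and far cruder than the bispectrum) of a descriptor that is nonetheless invariant to translation, rotation, reflection, and permutation of identical atoms: the sorted list of pairwise distances. All names here are illustrative, not from any of the codes discussed.

```python
import numpy as np

def invariant_fingerprint(positions):
    """Sorted pairwise distances: a crude descriptor of an atomic
    neighborhood that is invariant to translation, rotation,
    reflection, and permutation of identical atoms (far less
    informative than the bispectrum, but shows the requirement)."""
    pos = np.asarray(positions, dtype=float)
    diffs = pos[:, None, :] - pos[None, :, :]        # all pair vectors
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    iu = np.triu_indices(len(pos), k=1)              # unique pairs only
    return np.sort(dists[iu])

# The fingerprint is unchanged under a rigid motion plus a shuffle:
rng = np.random.default_rng(0)
atoms = rng.normal(size=(5, 3))

q, _ = np.linalg.qr(rng.normal(size=(3, 3)))         # random orthogonal matrix
moved = atoms @ q.T + np.array([1.0, -2.0, 0.5])     # rotate/reflect + translate
shuffled = moved[rng.permutation(5)]                 # permute atom order

print(np.allclose(invariant_fingerprint(atoms),
                  invariant_fingerprint(shuffled)))  # True
```

The catch, and the reason the bispectrum exists, is that sorted distances discard almost all angular information, so two genuinely different environments can share a fingerprint.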

The funny thing is that the concept of the bispectrum has been around for a
very long time. The Bartók paper cites a 1991 paper on its usage for signal
processing. It's always interesting to me how useful cross-discipline research
is.

~~~
bendyBus
I'm not sure if I would agree with categorising the field as a race. I
absolutely agree that the number of people using Machine Learning within the
atomistic simulation community is skyrocketing. But everyone is just exploring
the range of possibilities and trying to see what the essential elements are
for it to be successful. I think having more people working on it is a great
thing!

Regarding LAMMPS, the GAP code is actually now easily usable there too, via
this plugin:
[https://github.com/libAtoms/QUIPforLAMMPS](https://github.com/libAtoms/QUIPforLAMMPS)

The bispectrum is indeed a very powerful tool, but is not the ideal feature
vector for representing the atomic environment. You should have a read of
Bartok's more recent paper on this:
[http://journals.aps.org/prb/abstract/10.1103/PhysRevB.87.184...](http://journals.aps.org/prb/abstract/10.1103/PhysRevB.87.184115)
. One of the issues is that the bispectrum starts with an approximation of the
neighbourhood atomic density as a sum of delta functions. Trying to represent
such sharp features in a basis set expansion is actually very slowly
converging. So the idea behind SOAP is to build a covariance kernel by
directly comparing a smooth measure of the similarity of environments, which
is also invariant to all physically relevant symmetry operations.
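
To illustrate the idea behind a smooth similarity measure (this is a toy sketch, not the actual SOAP formulation: real SOAP additionally integrates over rotations to obtain full rotational invariance, which is omitted here), one can smear each neighbour position into a Gaussian and compare two environments by the overlap integral of the resulting smooth densities, which has a closed form:

```python
import numpy as np

def overlap_kernel(env_a, env_b, sigma=0.5):
    """Toy similarity between two atomic environments, each an (N, 3)
    array of neighbour positions relative to the central atom.
    Each atom is smeared into a 3D Gaussian of width sigma; the
    kernel is the overlap integral of the two smooth densities.
    For two Gaussians the integral has the closed form used below.
    NOTE: unlike real SOAP this is not averaged over rotations, so it
    is only translation- and permutation-invariant."""
    a = np.asarray(env_a, dtype=float)
    b = np.asarray(env_b, dtype=float)
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    # integral of N(r; ri, s^2 I) * N(r; rj, s^2 I) dr
    # is proportional to exp(-|ri - rj|^2 / (4 s^2))
    return np.exp(-d2 / (4.0 * sigma ** 2)).sum()

env = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
same = env[::-1]                          # same atoms, permuted order
print(np.isclose(overlap_kernel(env, env),
                 overlap_kernel(env, same)))  # True
```

Because the densities are smooth by construction, there is no slowly-converging expansion of delta functions, which is the convergence problem described above.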

I would also like to add that in addition to GAP and SNAP, there are people
like Jörg Behler doing this with Neural Networks and Francesco Paesani/Greg
Medders with different regression schemes. But in addition to making
potential energy surfaces there are people like Paul Popelier `learning'
atomic charges for building force fields and people in Vijay Pande's group
doing machine learning on MD trajectories, which is something that excites me
a great deal and I would love to understand in more detail.

It's a very exciting time to be in this field!

~~~
Xcelerate
Thanks for the reply! This is very useful, particularly the paper you linked.
With my research right now, I'm trying to come up with a visual representation
of a "characteristic" atomic neighborhood around ions at different energy
levels. More specifically, how the distributions of atoms surrounding high- vs.
low-energy ions differ. But visually, there is no easily discernible
difference, even though the radial distribution functions are very different
for each energy level. That's why I'm studying other forms of distribution
representations. Are you a member of one of the groups you listed, or just
learning about it for your own research?

~~~
bendyBus
Yes, I'm part of the GAP research group (with Albert Bartok Partay and Gabor
Csanyi). Representing environments is definitely still an ongoing research
project. Other things I've played with in the past include identifying crystal
structures at finite temperature (i.e. a classification rather than regression
task), or differentiating between amorphous phases (since e.g. water has two
amorphous solid phases with a 1st order transition in between, but there is no
way in hell you would be able to tell one from the other visually).

We're currently working on a really ambitious new way to represent
environments, but it's really preliminary at the moment.

Regarding your ion issue, what about the angular components? The radial
functions really only tell you so much...

But more fundamentally, what do you mean by ion energy levels? I'm presuming
you mean a metallic nucleus+core electrons, in a condensed phase at finite
temperature. But of course that `atomic energy' (insofar as it exists) is a
continuous function of position and not quantised, so I'm unsure what you mean
by energy levels in this context.

~~~
Xcelerate
That sounds like really exciting research. I guess the ultimate goal in
representing environments would be to construct a function that captures all
the statistically relevant information of a particular "type" of atomic
neighborhood. It would reduce the degrees of freedom to the bare minimum
necessary to recreate a similar environment that correctly reproduces any
property of interest (kind of like data compression for materials). Would that
be right?

The deal with the ions is that we are studying the transport and storage of
lithium in a new type of carbon anode. The anode consists of small crystalline
domains distributed throughout an amorphous carbon matrix. In order to capture
all of the features of this material, we end up with a system of almost a
million atoms. At the end of an equilibration run, the lithium ions can be
found at different locations within the carbon matrix. It turns out that the
potential energy of the lithium ions (as computed from the reactive potential
the simulation was performed with) has a wide range of values. So we can sort
these ions into bins of a histogram (this is what I meant by "energy levels").
And because there are so many samples, the radial distribution functions
(RDFs) can be computed for all Li-C pairs in each separate energy bin. (The
computed RDFs are useful because the results can be compared to the
experimental RDFs obtained from neutron scattering.)
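
The bookkeeping described above can be sketched in a few lines (all names and parameters here are hypothetical, and periodic boundary conditions are ignored for simplicity): digitize the Li potential energies into histogram bins, then accumulate a Li-C distance histogram for each bin.

```python
import numpy as np

def rdf_per_energy_bin(li_pos, li_energy, c_pos, n_energy_bins=4,
                       r_max=6.0, n_r_bins=60):
    """Sort Li ions into potential-energy bins, then build a Li-C
    distance histogram for each bin.  Counts are divided by shell
    volume and ion count; the bulk-density factor needed for a true
    g(r), and periodic boundaries, are omitted for simplicity."""
    e_edges = np.linspace(li_energy.min(), li_energy.max(), n_energy_bins + 1)
    e_idx = np.clip(np.digitize(li_energy, e_edges) - 1, 0, n_energy_bins - 1)
    r_edges = np.linspace(0.0, r_max, n_r_bins + 1)
    shell_vol = 4.0 / 3.0 * np.pi * (r_edges[1:] ** 3 - r_edges[:-1] ** 3)

    rdfs = np.zeros((n_energy_bins, n_r_bins))
    for b in range(n_energy_bins):
        ions = li_pos[e_idx == b]
        if len(ions) == 0:
            continue
        d = np.linalg.norm(ions[:, None, :] - c_pos[None, :, :], axis=-1)
        counts, _ = np.histogram(d, bins=r_edges)
        rdfs[b] = counts / (shell_vol * len(ions))
    return e_edges, r_edges, rdfs

# toy data: 200 Li ions with random energies, 500 C atoms in a 10 A box
rng = np.random.default_rng(1)
li = rng.uniform(0, 10, size=(200, 3))
e = rng.normal(size=200)
c = rng.uniform(0, 10, size=(500, 3))
e_edges, r_edges, rdfs = rdf_per_energy_bin(li, e, c)
print(rdfs.shape)  # (4, 60)
```

For a million-atom system you would of course use a cell list or neighbor list rather than the all-pairs distance matrix above.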

However, zooming in on ions of different potential energies reveals very
little _visual_ difference in the local environment. Yet we know there is
definitely a difference in the structure because of the RDFs, but we cannot
get a good understanding of it or provide a good representation of it. So
that's when I discovered the 2010 paper by Bartók. As you mentioned, I want to
figure out how the _entire_ local atomic neighborhood affects the energy (as
opposed to only the radial component), and I also want to create a 3D graphic
that provides a good visualization of the differences between the atomic
neighborhoods. In that sense, I need something that would compare the atomic
neighborhoods without regard to translation, rotation, reflection, or
permutation of identical atoms.

------
b3njamin
Has this video been restricted lately? It keeps showing the message: "Sorry,
because of its privacy settings, this video cannot be played here."

~~~
theoengland
Hey, it should be OK. You have to sign up as a Skills Matter member, it's
free.

~~~
b3njamin
Thank you theoengland. I registered with skillsmatter now, but the message
keeps showing. I'll take it up with SM so as not to clutter HN. Thank you for your
help.

