
Lagrangian Neural Networks - hardmaru
https://greydanus.github.io/2020/03/10/lagrangian-nns/
======
qchris
On a personal note, I consider the university class where I learned about
Lagrangians for the first time to be a pivotal moment in my life. It was the
first time I experienced a fundamental sense of awe and beauty about a
mathematical construct, in physics or otherwise. I'd thought physics was
useful and interesting before that, but seeing the derivation of the Lagrangian
formulation and its application to mechanical systems was, and is, simply
gorgeous to me. And as someone who recently began dipping their toe into neural
networks, I find this really very cool.

~~~
sgrey
Chris, glad you like it. I can relate to that feeling, which I also had as an
undergrad. It was a big motivation for this work -- and I think the analogy
can go even further. For example, there's a really cool LeCun paper about NN
training dynamics actually being a Lagrangian system:
[http://yann.lecun.com/exdb/publis/pdf/lecun-88.pdf](http://yann.lecun.com/exdb/publis/pdf/lecun-88.pdf).
Another thing I want to try is to literally write down a learning problem where
S (the action) is the cost function and then optimize it...
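That idea can be sketched numerically even without a network: discretize the action of a free particle, fix the endpoints, and do gradient descent on the interior path points; the stationary path should come out straight. This is just my own toy sketch (all names, constants, and the finite-difference gradient are made up for illustration), not anything from the paper:

```python
def action(path, dt=0.1, m=1.0):
    """Discretized action S = sum of L * dt over segments, with L = 0.5*m*v^2."""
    S = 0.0
    for i in range(len(path) - 1):
        v = (path[i + 1] - path[i]) / dt
        S += 0.5 * m * v * v * dt
    return S

def optimize_path(path, steps=2000, lr=0.01, eps=1e-6):
    """Gradient descent on the interior points, treating S as the cost.

    Gradients are crude central finite differences; endpoints stay fixed.
    """
    path = list(path)
    for _ in range(steps):
        for i in range(1, len(path) - 1):
            path[i] += eps
            s_plus = action(path)
            path[i] -= 2 * eps
            s_minus = action(path)
            path[i] += eps  # restore
            path[i] -= lr * (s_plus - s_minus) / (2 * eps)
    return path

# Start from a wiggly path between x=0 and x=1; descent on S
# should straighten it out to linear interpolation.
initial = [0.0, 0.7, -0.3, 0.9, 1.0]
final = optimize_path(initial)
```

For a free particle the action really is minimized (not just stationary) by the straight path, which is why plain gradient descent works here; with a potential term you would in general be looking for a stationary point instead.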

~~~
qchris
That does seem interesting! After skimming that paper, I think I'm going to
need to sit down with it in order to really parse through things, though. Some
of the operator combinations seem to be things I haven't worked with jointly
before. I'd definitely be interested to see the results of using the action as
the cost function, though!

------
xorand
I found the authors' previous work on Hamiltonian Neural Networks [0]. If the
authors are here, I'd be interested in using [1] for a version with dissipation.

[0] [https://greydanus.github.io/2019/05/15/hamiltonian-nns/](https://greydanus.github.io/2019/05/15/hamiltonian-nns/)

[1] [https://arxiv.org/abs/1902.04598](https://arxiv.org/abs/1902.04598)

------
madhadron
This is a neat idea, and it's interesting to see how well it handles a chaotic
system like the double pendulum.

I'd love to see a plot of the analytic Lagrangian vs the numerical one over
the two-parameter space of the double pendulum. How uniform is the
approximation? Since the double pendulum isn't ergodic, I'd be curious to see
if there's a correlation between probability of occupying a state and
precision there. If there were it could be used as a hint of where to look for
non-ergodic behavior in experimental systems.

Another possible fun thing to do with it: imagine you have a chain hanging
from two points in the presence of an uneven mass distribution producing
gravity. Since you've got the machinery in place for the calculus of
variations more generally than just Lagrangian mechanics, you might be able to
get the neural net to produce an estimate of the mass distribution from the
shape of the chain, which is a toy example of something that might be usable
in mineral exploration.

------
jeherr
Hi, I work in a highly related field and I'd like to ask a few questions if
you wouldn't mind. I found the work very interesting. My understanding is that
essentially, because of the way you have formulated the problem, you're able
to add a loss to encourage the NN to learn the invariance rather than enforce
it strictly. Is this correct?

I've tried implementing something somewhat related where I had a rotationally
invariant learning target but was trying to use a feature vector which wasn't.
I would randomly rotate my samples and add a loss function on the gradient of
the rotation parameters to encourage the gradient w.r.t. rotation to be 0.
Maybe in 2D this would have worked but in 3D it seemed to be too difficult for
the NN to learn well enough for conservation of energy. It seems your examples
use relatively simple model systems. Do you have any insight into how this
might work with more complex invariances?

~~~
shoyer
No, in this paper the Lagrangian structure is enforced as a hard constraint,
via the model architecture. It's not just a loss term.

Soft constraints via loss functions can also help, but in my experience they
are much less effective than hard constraints. My impression is that this is
pretty broadly consistent with the experience of others working in this field.

For neural nets with 3D invariance, I would strongly recommend looking into
the literature on "group equivariant convolutions". This has been a very
active area of research over the past few years; e.g., see the work of Taco
Cohen:
[https://arxiv.org/abs/1902.04615](https://arxiv.org/abs/1902.04615)
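To make "enforced via the architecture" concrete: the network parameterizes only L(q, q̇), and the accelerations are then *derived* from it by solving the Euler-Lagrange equation, so the Lagrangian structure holds by construction rather than being encouraged by a loss. A minimal one-coordinate sketch (my own illustration, not the authors' code; a known harmonic-oscillator Lagrangian stands in for the learned network, and finite differences stand in for the autodiff the paper uses):

```python
def lagrangian(q, qd, m=1.0, k=2.0):
    """Stand-in for a learned Lagrangian: L = T - V."""
    return 0.5 * m * qd * qd - 0.5 * k * q * q

def acceleration(L, q, qd, h=1e-4):
    """Solve the Euler-Lagrange equation for qdd (scalar coordinate):

        qdd = (dL/dq - d2L/(dq dqd) * qd) / (d2L/dqd2)

    Any L plugged in here yields dynamics with Lagrangian structure.
    """
    dL_dq = (L(q + h, qd) - L(q - h, qd)) / (2 * h)
    d2L_dqd2 = (L(q, qd + h) - 2 * L(q, qd) + L(q, qd - h)) / (h * h)
    d2L_dq_dqd = (L(q + h, qd + h) - L(q + h, qd - h)
                  - L(q - h, qd + h) + L(q - h, qd - h)) / (4 * h * h)
    return (dL_dq - d2L_dq_dqd * qd) / d2L_dqd2

# For the oscillator above, qdd should come out as -(k/m) * q,
# independent of qd, without ever writing that equation down.
```

In the vector case the same formula involves the inverse Hessian of L with respect to q̇, which is why the paper leans on autodiff rather than stencils like these.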

~~~
jeherr
Hmm, I must be misunderstanding the implementation somehow. Thanks for the
tip. I’ll have to dig a bit deeper to understand this work here.

------
hcarvalhoalves
Since you found a way to model a Lagrangian without an analytical solution,
wouldn't it be interesting to throw in data from systems we don't _usually_
assume to be Lagrangian, and find out whether they could be modelled as one by
looking at the error?

------
evanb
I'm a lattice field theorist, exploring how to leverage NNs in algorithms for
quantum field theory that remain exact even with NNs in them, so that the NNs
just provide acceleration.

One annoying thing I've encountered is that I have some symmetries that I
cannot figure out how to enforce. For example, suppose I have two degrees of
freedom a and b, and I know that my physical system has a symmetry under
exchange of a and b. Suppose I want to train a network to compute something in
my system. For each configuration of my system I can train on (a,b) and (b,a).
But the order in which I feed those in as training examples matters, so the
network only has _approximate_ exchange symmetry, rather than exact.

Is there a way around this inexactness?

~~~
shoyer
You can enforce exact symmetry in neural networks with the right sort of model
structure. For permutation invariance in particular, take a look at Deep Sets:
[https://arxiv.org/abs/1703.06114](https://arxiv.org/abs/1703.06114)
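The Deep Sets construction makes the exchange symmetry exact by design: embed each element with the same function φ, pool with a symmetric operation (a sum), then apply a read-out ρ. A tiny sketch of the structure (my own illustration; the fixed φ and ρ below stand in for learned networks):

```python
import math

def phi(x):
    """Per-element embedding (stand-in for a learned network)."""
    return (x, x * x, math.tanh(x))

def rho(z):
    """Read-out on the pooled embedding (would also be learned)."""
    return z[0] + 2.0 * z[1] - 0.5 * z[2]

def f(elements):
    """f(a, b) = rho(phi(a) + phi(b)): since the sum ignores order,
    f([a, b]) == f([b, a]) holds exactly, not just approximately."""
    pooled = [sum(component) for component in zip(*(phi(x) for x in elements))]
    return rho(pooled)
```

No matter how badly φ and ρ are trained, the network cannot break the exchange symmetry, which is exactly the "hard constraint via architecture" point.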

------
mhh__
Interesting. Coming from physics rather than an ML background, what does it
mean "practically" to learn a symmetry of a system? Is it that the conserved
quantity associated with the symmetry via Noether's theorem stays constant?

~~~
tnecniv
Having skimmed the research paper and done some work with both dynamics and
ML, my interpretation of their statement is the following:

You want to learn a function that represents the dynamics of your system,
either as a function of the system state or some output like a picture of the
system. If you just apply some NN technique directly, this is possible but
will require a lot of data since the NN doesn't have any knowledge of physics.
If you use their system, you are instead trying to learn the Lagrangian of the
system, which contains information on e.g. its symmetries; this bakes physical
knowledge into the learning problem at hand. As a result, less data is needed
to learn the system dynamics.

~~~
vectorrain
I don't understand how symmetries are related to the physics or the Lagrangian
of the system. Could you give me more specific pointers or some reading
material?

------
Jenz
As a high school student with an admiration of mathematics (and therefore of
physics, ML, and whatnot :D) I must thank the author for this.

Glancing over the paper I understand little; there's too much math I don't
know (yet: I'm starting uni this year, and I promise I'll get there), but the
application is absolutely beautiful. As I have taken physics for almost two
years now, Appendix B made my day.

------
peter_d_sherman
Excerpt:

The Principle of Least Action

[...]

At first glance, S seems like an arbitrary combination of energies. But it has
one remarkable property.

 _It turns out that for all possible paths between x0 and x1, there is only
one path that gives a stationary value of S. Moreover, that path is the one
that nature always takes._

------
max_
Why aren't all scientific papers written this simply?

------
7532yahoogmail
Great post. This is why I love hacker news

------
behnamoh
For God's sake please use another font. I don't understand this minimalism
trend that emphasizes _thin_ fonts. Stop it.

~~~
artfulhippo
Counterpoint: I think the typography is wonderful and highly readable as is.

~~~
benrbray
I think it may be an issue with font rendering on different operating systems,
screen sizes, resolutions, etc. The webpage looks quite different on my phone
vs. tablet vs. laptop. It's least readable on my tablet (Microsoft Surface 2)
which has an insanely high dpi.

~~~
sgrey
Author with the edgy fonts here. I think you're right, Ben. This gives me
motivation to support mobile/tablets/etc., so that'll happen soon.

