Noether's Theorem in a Nutshell (2020) (ucr.edu)
104 points by _Microft on Aug 9, 2022 | 43 comments



I really like Wikipedia's illustrated explanation of Noether's theorem [1]. Euler-Lagrange equations can also be proved using a similar trick - make an infinitesimal smooth 'bump' in a realizable path and consider the change in velocity on both sides of the bump and how it is cancelled out by the change in position so that the net change of action is 0.

I think that integration by parts can also be derived in a similar way but I would have to make a proper blog post with illustrations to make it clear.

Sometimes I also think of Noether's theorem as an application of the Euler-Lagrange equations in a coordinate system where the continuous symmetry changes only one coordinate; the conserved quantity is the associated generalized momentum (because the lagrangian is independent of that coordinate).

[1] https://en.wikipedia.org/wiki/Noether%27s_theorem#Brief_illu...
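A concrete instance of the cyclic-coordinate view, as a sketch: assume a particle of mass m moving in a plane under a central potential V(r) (my choice of example, not from the article), written in polar coordinates. The rotation symmetry only changes θ, the Lagrangian does not depend on θ, and the Euler-Lagrange equation for θ says its generalized momentum is constant:

```latex
L = \tfrac{1}{2} m \left( \dot r^{2} + r^{2} \dot\theta^{2} \right) - V(r),
\qquad
\frac{\partial L}{\partial \theta} = 0
\;\Longrightarrow\;
\frac{d}{dt} \frac{\partial L}{\partial \dot\theta}
  = \frac{d}{dt} \left( m r^{2} \dot\theta \right) = 0 .
```

Here m r² θ̇ is the angular momentum.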


> I think that integration by parts can also be derived in a similar way but I would have to make a proper blog post with illustrations to make it clear.

It looks not dissimilar to how one might think of Stokes's theorem: https://en.wikipedia.org/wiki/Generalized_Stokes_theorem#Und... .


As a dilettante, I sadly found this post rather impenetrable.

* There is a function L(q,q'), the Lagrangian. How do I compute this function? Does it have an analytical expression? How does it depend on p or F?

* There is a parameter s, "sending q to some new position q(s)". Do you mean q is actually a function q(s)? Do I define q(s) any way I want, or do I solve q(s) given the constraints of the system?

I wonder whether working out a very simple example, e.g. point moving on a line subject to force F, would have made this post a bit more readable for a wider audience.

Stealth Edit: Many many thanks for the explanations! This makes so much more sense now.


> * There is a function L(q,q'), the Lagrangian. How do I compute this function? Does it have an analytical expression? How does it depend on p or F?

I tend to think of the Lagrangian as "something that someone defined that happens to be useful". In classical mechanics, the notion of total energy (kinetic plus potential) is something we can all understand. The Lagrangian, which is kinetic minus potential energy, looks strange but happens to have unusually useful properties. If you integrate the Lagrangian of a point particle along each possible path, the particle will always take the path that minimizes the integral. We call this integral the action, and the universe seems to behave in a way that minimizes the action. I don't claim to have a deeper understanding of why this happens, but every experiment ever done seems to confirm it.

The equation after Fig. 19-3 gives an analytic expression for the action: https://www.feynmanlectures.caltech.edu/II_19.html

Different physical systems will have different Lagrangians but once formulated, the physical system will always behave in a way that minimizes the action. In principle, if you can formulate a Lagrangian for the universe, you can predict how it will behave because you know it will behave in a way that minimizes the action.

EDIT: Added more details.
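To make the "kinetic minus potential" recipe a bit more tangible, here is a minimal sketch that discretizes the action for a ball thrown straight up in uniform gravity and compares the Newtonian path with nearby paths that share the same endpoints. The function names, the midpoint discretization, and the values m = 1, g = 9.81, T = 1 are all assumptions made for this example, not anything from the article.

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretized action: sum of (kinetic - potential) * dt over each segment."""
    total = 0.0
    for x0, x1 in zip(path, path[1:]):
        v = (x1 - x0) / dt           # segment velocity
        x_mid = 0.5 * (x0 + x1)      # segment midpoint height
        total += (0.5 * m * v * v - m * g * x_mid) * dt
    return total

def newtonian_path(n, total_time, g=9.81):
    """Sample x(t) = (g/2) * t * (T - t): thrown up at t = 0, back at x = 0 at t = T."""
    dt = total_time / n
    return [0.5 * g * (i * dt) * (total_time - i * dt) for i in range(n + 1)]

def bumped(path, eps):
    """Add a sine-shaped bump that vanishes at both endpoints."""
    n = len(path) - 1
    return [x + eps * math.sin(math.pi * i / n) for i, x in enumerate(path)]

N, T = 1000, 1.0
dt = T / N
true_path = newtonian_path(N, T)
s_true = action(true_path, dt)
s_up = action(bumped(true_path, 0.1), dt)
s_down = action(bumped(true_path, -0.1), dt)
print(s_true, s_up, s_down)  # both bumped paths should give a larger action
```

For this particular Lagrangian the Newtonian path really is a minimum; that is not guaranteed in general.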


I've always been unsatisfied with this description of the Lagrangian. I spent some time trying to figure out a better justification, and never was happy with it. Maybe someone reading this thread knows?

A few things I have but could not fully translate into math:

* S, action, is the arc-length of a world-line. L = dS/dt. It is a natural law that systems move along "straight" world-lines, geodesics, in the absence of interactions. With an interaction the combined system (original + interactor) still moves in a "straight"/minimal-length world line (with respect to the proper time of the combined system) but each subsystem's world line curves to achieve SOME property. What is this property?

* The "future-looking" nature of the least action principle is unsatisfying: sure, the action is minimized over an interval in time. Since L = dS/dt, the E-L equations are that same principle expressed at a single moment in time--right?

* Is there some sense in which L is "orthogonal" to the space spanned by constant energy and momentum? Actual trajectories conserve E and p; their variation in E or p along the trajectory is 0. The actual trajectory is DEFINED by being the one where variation of L, _off the trajectory_, is 0. So lines of constant L are orthogonal to the subspace of constant E and p—is there more to it than that? Could we run that backwards to get a satisfactory derivation of L? Is this the same as saying that lines of constant L are lines of MAXIMAL variation in E and p (presumably, in 4-momentum)?


A Lagrangian is like Newton's equation: it's a description of the system. You don't have any mathematical definition for it, since it is a law of physics and not maths.

Edit:

> but each subsystem's world line curves to achieve SOME property. What is this property?

You have to provide that as a part of the explanation of the system. Lagrangians are just dumb functions; you have to be smart about how you choose them. There is no magic here: you just have to understand the physics and construct the Lagrangian that has whatever properties you want.


I'm pretty sure you're answering a different question.

I'm saying:

- it appears that the "stationary action" is the same as "worldline follows a geodesic of the spacetime metric" (https://en.wikipedia.org/wiki/Relativistic_Lagrangian_mechan...). I first heard this from a particle physics professor in undergrad.

- for a compound system, the individual elements follow worldline geodesics, with the other particles factored out into a "potential" / as force terms in E-L equation. These represent how 1 particle, when distorted from its non-interacting trajectory, trades off against the other particles distorting from their OWN free trajectories. It's analogous to how heat energy flows between two systems to maximize their joint entropy; here, the particles' trajectories distort each other via forces to minimize their joint spacetime arc length. (As measured, presumably, in ANY reference frame)

- yes, in any specific context, the Lagrangian represents HOW those things trade off, in the same way that in any specific thermodynamics scenario the microstate structure can tell us d(entropy)/d(energy) and let us compute the equilibrium state.

- but how to complete the analogy to thermodynamic equilibrium--what is the "temperature", exactly, what is "heat"? In a QFT context, the interaction is ITSELF a particle, and has its own contribution to the Lagrangian and action, but how to think about it classically?

It has always seemed to me that this line of thinking is the most natural way to express "Stationary Action".


I have created a resource that I think addresses your dissatisfaction.

The information is available on physics.stackexchange https://physics.stackexchange.com/a/670705

I use 'Hamilton's stationary action' to refer to the action concept of Classical Mechanics.

For Hamilton's stationary action the standard presentation is that it is demonstrated that F=ma can be recovered from Hamilton's stationary action.

Here's the thing: in physics it is common that a derivation can be performed in either direction, and that applies in this case too. Hamilton's stationary action can be derived from F=ma.

The derivation proceeds in two stages: 1. Derivation of the Work-Energy theorem from F=ma 2. Demonstration that in all cases where the Work-Energy theorem holds good Hamilton's stationary action will hold good also

Importantly, it's not retracing of the steps. The from-F=ma-to-Hamilton derivation hinges on the Work-Energy theorem. It's a different path altogether.

The steps of the derivation show why Hamilton's stationary action holds good. It achieves the justification you are looking for.

The derivation that I present is for the case of Hamilton's stationary action specifically; I'm positive the reasoning generalizes to all areas where an action concept is applied.

The demonstration is illustrated with interactive diagrams. (On physics.stackexchange the diagrams are posted as animated GIFs, the frames of the GIF are successive screenshots of the interactive diagram.)

Each diagram has one or more sliders, to explore variation of a trial trajectory. The diagram shows how the kinetic energy and potential energy respond to variation sweep.

The interactive diagrams are on my own website: http://cleonis.nl/physics/phys256/energy_position_equation.p...


> S, action, is the arc-length of a world-line. L = dS/dt. It is a natural law that systems move along "straight" world-lines, geodesics, in the absence of interactions.

This is a roughly correct description of GR, where the curvature is given by the Einstein field equations. It's not true in classical electromagnetism, or in QM.

> The "future-looking" nature of the least action principle is unsatisfying: sure, the action is minimized over an interval in time. Since L = dS/dt, the E-L equations are that same principle expressed at a single moment in time--right?

Yes, although the global principle is that the action is stationary, not minimal.

> Is there some sense in which L is "orthogonal" to the space spanned by constant energy and momentum?

Not quite, but you're on the right track. Conserved quantities of integrable systems are "orthogonal" to the Hamiltonian H, in that the poisson bracket `{f, H} = -df/dt` and so `{f, H} = 0` for constant `f`. The Lagrangian arises as the Legendre transform of H.
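A rough numerical sketch of that "orthogonality", assuming an isotropic 2D oscillator as the Hamiltonian (my choice of example) and estimating the Poisson bracket by central differences. Sign conventions for the bracket differ between texts; the convention used is spelled out in the docstring, and only whether the bracket vanishes matters here. All function names are made up for illustration.

```python
def poisson_bracket(f, g, state, h=1e-5):
    """{f, g} = sum_i (df/dq_i * dg/dp_i - df/dp_i * dg/dq_i), by central differences.

    `state` is a list [q_1, ..., q_n, p_1, ..., p_n].
    """
    n = len(state) // 2

    def partial(func, index):
        up = list(state)
        down = list(state)
        up[index] += h
        down[index] -= h
        return (func(up) - func(down)) / (2 * h)

    return sum(
        partial(f, i) * partial(g, n + i) - partial(f, n + i) * partial(g, i)
        for i in range(n)
    )

def hamiltonian(s, m=1.0, k=2.0):
    """Isotropic 2D oscillator: rotation-invariant, but not translation-invariant."""
    x, y, px, py = s
    return (px * px + py * py) / (2 * m) + 0.5 * k * (x * x + y * y)

def angular_momentum(s):
    x, y, px, py = s
    return x * py - y * px

def momentum_x(s):
    return s[2]

point = [0.7, -0.4, 0.2, 1.1]
bracket_lz = poisson_bracket(angular_momentum, hamiltonian, point)
bracket_px = poisson_bracket(momentum_x, hamiltonian, point)
print(bracket_lz)  # close to 0: angular momentum is conserved
print(bracket_px)  # close to -k*x: linear momentum is not conserved here
```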


L <> d/dt S, because dS/dt = 0

S is a functional of q which is a function of t, the t binding doesn't really exist on the left to take a derivative.

every system minimizes the action. a subsystem is just some part of q (the coordinate of configuration space)

Actual trajectories don't conserve momentum though. A pendulum in a potential has varying momentum.


I find this rather suspect. Write it as:

S[q(t), t_0, t] = int_{t_0}^{t} L(q(t'), q̇(t'), t') dt'

Formally, S is a function of the upper limit of integration, and dS/dt = L, yes? I don't see why we can't treat it this way. It is the arc-length formula of the space time metric, expressed as an integral in one privileged time coordinate ( https://en.wikipedia.org/wiki/Relativistic_Lagrangian_mechan... ). Though, it's Lorentz invariant, and we could express it as an integral along the trajectory of the particle, in which case it's just = a constant factor * the proper time. The whole idea of "varying q while keeping the boundaries q[t_0], q[t], t_0, and t fixed" is perfectly understandable as a condition on q, but doesn't stop us from using the formula for S in other ways--for one, to come up with a general principle behind the condition on q.

Actual trajectories of closed systems do conserve momentum. The earth is interacting with the pendulum.


Hmm I still don't see why you can make that argument?

I'm thinking that by chain rule d/dt S = dq/dt dS/dq. But we assert that the variation of S is zero for real trajectories. so dS/dq = 0


There's no way that's right. It's mixing up "t" as the upper limit of integration (S[q] is really S[q, t_0, t]) vs t as the argument of q; dS/dt in the latter sense doesn't even make sense because the argument of q is "integrated out" in the expression of S.


> every system minimizes the action

Action is stationary, not necessarily minimal.


Minor correction, but actually classical paths make the action stationary, they don't minimize it. All minima are stationary points but not all stationary points are minima. Sometimes the classical path maximises the action, and sometimes it is a saddle point.
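A minimal sketch of the saddle-point case, assuming a harmonic oscillator with m = k = 1 and the trivial classical path x(t) = 0 held for a time T = 4, which is longer than half a period (π). The discretization and all names are illustrative choices of mine.

```python
import math

def oscillator_action(path, dt, m=1.0, k=1.0):
    """Discretized action for L = (1/2) m v^2 - (1/2) k x^2."""
    total = 0.0
    for x0, x1 in zip(path, path[1:]):
        v = (x1 - x0) / dt
        x_mid = 0.5 * (x0 + x1)
        total += (0.5 * m * v * v - 0.5 * k * x_mid * x_mid) * dt
    return total

def sine_variation(n_steps, mode, eps):
    """Variation of the path x(t) = 0 that vanishes at both endpoints."""
    return [eps * math.sin(mode * math.pi * i / n_steps) for i in range(n_steps + 1)]

N, T = 2000, 4.0      # T > pi, i.e. longer than half an oscillator period
dt = T / N
s_rest = oscillator_action([0.0] * (N + 1), dt)             # classical path
s_slow = oscillator_action(sine_variation(N, 1, 0.1), dt)   # one slow bump
s_fast = oscillator_action(sine_variation(N, 3, 0.1), dt)   # three fast wiggles
print(s_rest, s_slow, s_fast)
```

The slow bump lowers the action below the classical value while the fast wiggle raises it, so x(t) = 0 is stationary but neither a minimum nor a maximum.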


It's impenetrable to you because you are missing a lot of background. There is a formulation of classical mechanics that is totally different from (and yet mathematically equivalent to) Newton's. It is based on the definition of a new quantity called the action (which is a very misleading term, but we're stuck with it because history) and the principle that particles move in such a way as to minimize this quantity. It turns out that it's easier to turn the crank on the math for complex mechanical systems when you formulate it in this way. If you want the gory details, Wikipedia is a good place to start:

https://en.wikipedia.org/wiki/Lagrangian_mechanics

and if you want to do a deep dive, there is this:

https://mitpress.mit.edu/9780262028967/

A free HTML version of this was available a while back, and probably still is, but I can't find it just now.


HTML of first ed. is on Sussman's site:

http://groups.csail.mit.edu/mac/users/gjs/6946/sicm-html/



1) Lagrangians are confusing. I would recommend watching some physics and engineering lectures on it, but basically they're a mathematical tool used to simplify complicated situations which would be intractable to do using direct Newtonian mechanics. You can treat the Lagrangian as a sort of "energy" that physical systems have, related to kinetic and potential energy, which summarizes the entire state of the system, and the equation for calculating the Lagrangian summarizes the behavior of the system.

2) This is an abuse of notation. We have q, our initial position, and some transformation T(q, s) which takes q (a position) and s (the parameter which controls how the transformation behaves), and returns a new position. We can thus "shadow" q with a new variable, q = lambda s: T(q /* old value */, s).

T(q,s) is also left undefined here, because the point is that this works for all transformations that satisfy certain criteria.

I might write the theorem as:

Suppose there is a system that varies with time with a state that can be stored with two variables, q, and dq/dt or q̇.

For all functions L(q, q̇),

Let p = 𝛿L/𝛿q̇

Let F = 𝛿L/𝛿q

If ṗ=F, then:

For all T(q, s):

if d/ds (L(T(q, s), d/dt T(q, s))) = 0, then:

d/dt (p * d/ds (T(q, s))) = 0.

Which is sorta opaque, but hopefully more explicit about what's being assumed vs. calculated.
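For what it's worth, the conclusion follows in one line from the assumptions listed above. A sketch, writing q_s = T(q, s), using ∂ for the partial derivatives written as 𝛿 above, and assuming the order of d/ds and d/dt can be swapped:

```latex
\frac{d}{dt}\left( p \, \frac{\partial q_s}{\partial s} \right)
  = \dot p \, \frac{\partial q_s}{\partial s}
    + p \, \frac{\partial \dot q_s}{\partial s}
  = \frac{\partial L}{\partial q} \, \frac{\partial q_s}{\partial s}
    + \frac{\partial L}{\partial \dot q} \, \frac{\partial \dot q_s}{\partial s}
  = \frac{d}{ds} \, L(q_s, \dot q_s)
  = 0 .
```

The second equality uses ṗ = F = ∂L/∂q and p = ∂L/∂q̇; the third is the chain rule; the last is the symmetry assumption.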


> How do I compute this function?

For the kind of problems in physics 101, the Lagrangian is just kinetic energy minus potential energy: L = T - U. This is definitionally true; there's no "why" to that. Think of L as the formal specification of a system. There's no point in asking "why does this mass on a spring have L = 1/2mv^2 - 1/2kx^2": that equation specifies the problem.

In higher level physics, e.g. QFT, there are often identifiably kinetic and potential terms as well. However, there are also often other terms. At that level, you generally write L = sum of the most general possible terms that don't violate symmetries you want the system to have. This ability to work backwards from symmetries to L is to me a more useful perspective on Noether's theorem than going from knowing L to the symmetries.

> Do I define q(s) any way I want, or do I solve q(s) given the constraints of the system?

q(s) is the function which minimizes the time integral of L, and is solved for via the [principle of least action](https://en.wikipedia.org/wiki/Stationary-action_principle), which requires functional derivatives to understand. Most physicists, myself included, only marginally understand the math here and do a bit of hand waving. "d" "del" "delta" potato potahto.

The principle of least action also generates "equations of motion" from L. These are the differential equations that can be solved for x(t).

So, in short: a system is formally specified by its L, which has measurable meaning only because of the principle of least action.

If you didn't specify the principle of least action, or something else, it would be nearly meaningless to give an expression for L. You might as well say: "the mass on a spring has a potato of 1/3kx^97 - 1/2m^8*v^3 + 3". "Nearly" in that at least that would tell me that two systems with unequal potato are different systems.
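To make "equations of motion from L" a little more concrete, here is a rough sketch that treats L(q, v) as a black-box function and recovers the acceleration from the Euler-Lagrange equation by finite differences. The helper name `acceleration_from_lagrangian`, the step size h, and the spring constants are assumptions for illustration; it also assumes a single coordinate and no explicit time dependence in L.

```python
def acceleration_from_lagrangian(L, q, v, h=1e-3):
    """Solve the Euler-Lagrange equation d/dt(dL/dv) = dL/dq for the acceleration.

    Expanding the time derivative gives L_vv * a + L_vq * v = L_q, where the
    subscripts are partial derivatives, estimated here by central differences.
    """
    L_q = (L(q + h, v) - L(q - h, v)) / (2 * h)
    L_vv = (L(q, v + h) - 2 * L(q, v) + L(q, v - h)) / (h * h)
    L_vq = (L(q + h, v + h) - L(q + h, v - h)
            - L(q - h, v + h) + L(q - h, v - h)) / (4 * h * h)
    return (L_q - L_vq * v) / L_vv

m, k = 2.0, 5.0

def spring_lagrangian(x, v):
    """Mass on a spring: kinetic minus potential energy."""
    return 0.5 * m * v * v - 0.5 * k * x * x

a = acceleration_from_lagrangian(spring_lagrangian, q=0.3, v=1.2)
print(a, -k * 0.3 / m)  # both should be close to Newton's a = -k x / m
```

Swapping in a different L (the "potato" included) gives a different equation of motion from the same procedure, which is the sense in which L specifies the system.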


It's a classic physics post.

If you have studied classical (Lagrangian) mechanics, then everything said in this article would make sense. But then, you would also be aware of Noether's theorem.

I'm currently doing research in physics, and I have to say that a lot of books / articles have this kind of approach: they don't give enough details for the subject in question to be approachable by a novice (let's say having studied analysis), but not enough details either to be appreciated by an expert in the subject.


> (let's say having studied analysis), but not enough details either to be appreciated by an expert in the subject.

Perhaps you might want to cast your eyes over these 50 or so slides:

https://www-users.cse.umn.edu/~olver/t_/noetherth.pdf


I am sorry, I haven't had the time to look into the details of these slides.

But I am not saying there aren't some great resources around here. These books and articles (that I talked about) can still offer some good insights. Generally, to gain a deep understanding of a particular subject, you need to read a lot of books and articles (each giving its own insight), try to work the equations out by yourself and teach it to someone else.


A simpler way to describe Noether's Theorem is that for everything you can change about the coordinate system which leaves the physics the same, it's possible to derive a real physical quantity that is conserved. For instance, if we redefined the origin from (0,0,0) to (dx,0,0), all the x's would get dx added to them, but the actual motions of everything would still be the same. This means there's a symmetry with respect to X. Using Noether's Theorem, it's possible to derive a physical quantity that must be conserved, which in the case of positional symmetry, happens to be momentum. Similarly, redefining time to t'=t+dt doesn't change the motions, which can be used to derive something that is conserved, which happens to be energy. Some other ones include multiplying everything by e^(i*theta) in quantum mechanics, which derives conservation of probability, and changing the angle of the (x,y,z) axes at the origin, which derives conservation of angular momentum.
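A small numerical sketch of the "shift the origin" case, assuming two masses on a line joined by a spring (my choice of system; the masses, spring constant, and integrator are illustrative). The Lagrangian only depends on x1 - x2, so shifting both positions leaves it unchanged, and the total momentum stays fixed even though each particle's own momentum changes.

```python
def lagrangian(x1, x2, v1, v2, m1=1.0, m2=3.0, k=2.0):
    """Two masses on a line joined by a spring; depends only on x1 - x2."""
    return 0.5 * m1 * v1 * v1 + 0.5 * m2 * v2 * v2 - 0.5 * k * (x1 - x2) ** 2

def simulate(x1, x2, v1, v2, steps=10000, dt=1e-3, m1=1.0, m2=3.0, k=2.0):
    """Velocity-Verlet integration; returns the final positions and velocities."""
    def forces(a, b):
        f = -k * (a - b)      # force on particle 1; particle 2 feels -f
        return f, -f
    f1, f2 = forces(x1, x2)
    for _ in range(steps):
        v1 += 0.5 * dt * f1 / m1
        v2 += 0.5 * dt * f2 / m2
        x1 += dt * v1
        x2 += dt * v2
        f1, f2 = forces(x1, x2)
        v1 += 0.5 * dt * f1 / m1
        v2 += 0.5 * dt * f2 / m2
    return x1, x2, v1, v2

start = (0.0, 1.5, 0.4, -0.1)
shift = 0.7
# Symmetry: moving the origin changes both positions but not the Lagrangian.
l_original = lagrangian(*start)
l_shifted = lagrangian(start[0] + shift, start[1] + shift, start[2], start[3])

end = simulate(*start)
p_start = 1.0 * start[2] + 3.0 * start[3]   # m1*v1 + m2*v2 before
p_end = 1.0 * end[2] + 3.0 * end[3]         # m1*v1 + m2*v2 after
print(l_original, l_shifted, p_start, p_end)
```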


> * There is a function L(q,q'), the Lagrangian. How do I compute this function? Does it have an analytical expression? How does it depend on p or F?

L can be a function which changes with the system (a spring or a free particle will have different formulae), just like the force in classical physics changes with the system. The goal here is to find properties independent of any particular expression for L, only assuming that L is invariant under certain symmetries.

> * There is a parameter s, "sending q to some new position q(s)". Do you mean q is actually a function q(s)? Do I define q(s) any way I want, or do I solve q(s) given the constraints of the system?

There is a bit of overloading of notation which can cause some confusion. q is not a function of s; rather, s is a parameter for a symmetry Tₛ which moves points in the space, here the line. (For example, Tₛ(q) = q + sv is the translation in the v direction parametrized by s.) So, one can read q(s) as Tₛ(q) and d(q(s))/ds as the derivative d(Tₛ(q))/ds. For the translation, the derivative will be v at all points, but it can be different at different points. For instance, when we are in a plane and Tₛ(q) rotates q around the origin by s units, the derivative with respect to s will be a tangent vector to the circle containing q. This derivative vector field is called the infinitesimal symmetry, as the symmetry Tₛ is just the flow along the vector field.

BTW, you are in good company with your question. Many mathematicians and cs people also get confused with physics notation where the type of functions is not clear (where the domain and range of the function is not clear). This issue is prominent when deriving the Euler-Lagrange equations. Read the preface of the referenced book http://groups.csail.mit.edu/mac/users/gjs/6946/sicm-html/boo...
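Since types were the sticking point, here is a sketch in code where T_s is an ordinary function of a point and a parameter. It takes the rotation example above, estimates the infinitesimal symmetry d(Tₛ(q))/ds at s = 0 by a finite difference, and checks that momentum dotted with that vector stays constant along a simulated trajectory. The potential (k/2)|q|², the integrator, and every name are assumptions for illustration.

```python
import math

def rotate(q, s):
    """T_s: rotate the point q = (x, y) about the origin by angle s."""
    x, y = q
    return (x * math.cos(s) - y * math.sin(s), x * math.sin(s) + y * math.cos(s))

def generator(q, h=1e-6):
    """Infinitesimal symmetry d(T_s(q))/ds at s = 0, by central difference."""
    xp, yp = rotate(q, h)
    xm, ym = rotate(q, -h)
    return ((xp - xm) / (2 * h), (yp - ym) / (2 * h))

def noether_charge(q, v, m=1.0):
    """Momentum dotted with the generator; for rotations this is m*(x*vy - y*vx)."""
    gx, gy = generator(q)
    return m * v[0] * gx + m * v[1] * gy

def step(q, v, dt, k=1.0, m=1.0):
    """One velocity-Verlet step in the rotation-invariant potential (k/2)*|q|^2."""
    vx = v[0] - 0.5 * dt * k * q[0] / m
    vy = v[1] - 0.5 * dt * k * q[1] / m
    x, y = q[0] + dt * vx, q[1] + dt * vy
    vx -= 0.5 * dt * k * x / m
    vy -= 0.5 * dt * k * y / m
    return (x, y), (vx, vy)

tangent = generator((1.0, 0.0))   # tangent to the unit circle at (1, 0)
q, v = (1.0, 0.0), (0.3, 0.8)
charge_start = noether_charge(q, v)
for _ in range(5000):
    q, v = step(q, v, dt=1e-3)
charge_end = noether_charge(q, v)
print(tangent, charge_start, charge_end)
```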


I’ve complained to physicists and mathematicians both about their annoyingly ambiguous notational conventions. The response has always been hand-waving about “intuition” and claims that the notation chosen isn’t really important. It feels like it’s inherited by tradition from a time when ink and paper were expensive. Obviously I’m biased, but I really hope they learn a thing or two from formal computing scientists.


i'll add to the other comments that usually in theoretical physics we often build lagrangians directly from the symmetries: if you only have terms that are dot products then the lagrangian is invariant under rotations and angular momentum is automatically conserved via noether.

it's an incredibly powerful tool to build theories!


It might be easier to think about the Hamilton function H, which is basically the energy E=T+V, but as a function of p and q. H can be converted into L=T-V using the Legendre transformation. L is what people use to prove all this stuff.
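A worked instance of that conversion, as a sketch, assuming a single particle with H = p²/2m + V(q) (an illustrative choice):

```latex
\begin{aligned}
H(q, p) &= \frac{p^{2}}{2m} + V(q),
\qquad
\dot q = \frac{\partial H}{\partial p} = \frac{p}{m}
\;\Longrightarrow\; p = m \dot q, \\
L(q, \dot q) &= p \, \dot q - H
  = m \dot q^{2} - \frac{1}{2} m \dot q^{2} - V(q)
  = \frac{1}{2} m \dot q^{2} - V(q) = T - V .
\end{aligned}
```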


The Lazy Universe by Coopersmith is a nice book on this "best kept secret" of physics.


If you're serious about understanding this stuff, be prepared to set aside a lot of time. There are no lifehacks to learning physics, unlike many other things.


> There is a parameter s, "sending q to some new position q(s)".

This is a typo; the author meant "sending s to some new position q(s)".


No, s is the parameter of the symmetry, not a parameterization of some path.


PBS Space Time has done a couple of episodes on Stationary Action and Noether's Theorem that are approachable by those not yet already exposed to Lagrangians:

https://www.youtube.com/watch?v=Q_CQDSlmboA

https://www.youtube.com/watch?v=04ERSb06dOg

Sabine Hossenfelder also has a good episode on it:

https://www.youtube.com/watch?v=A0da8TEeaeE


Noether's Theorem is really something I found extremely beautiful as a grad student, I always felt like Emmy Noether didn't really get the recognition she deserved.

My favourite way of explaining Noether's theorem to non-physicists is to think of it in terms of portal guns. If you have a portal gun that can freely create instantaneous travel between points in space, you can clearly see how momentum and energy are not conserved anymore. In fact, pretty much every mechanic in the game is designed to exploit that.

If spacetime is no longer translation and time invariant, momentum and energy are no longer conserved. Of course, this is a bit of a circular argument but the converse statement is also interesting to think about. If you claim you have a perpetual motion/infinite energy machine, somewhere along the line translation and time invariance are no longer a thing.


“I always felt like Emmy Noether didn't really get the recognition she deserved.”

Working on it: https://lee-phillips.org/ENbirthday2022.html


To see how this works with the Hamiltonian and QM check out this recent (and equally brief) post by Peter Woit:

https://www.math.columbia.edu/~woit/wordpress/?p=13015


I would love to hear a good relatively-plain-English retelling (assuming some good undergrad physics knowledge) of how to understand these statements?

> Time translation symmetry gives conservation of energy; space translation symmetry gives conservation of momentum; rotation symmetry gives conservation of angular momentum, etc

Something about how if you see that processes can be step-wise played forwards or backwards, it implies the certain quantity is conserved? How?


I'm not sure if it makes sense to explain physics in English because it tends to get really imprecise and unclear.

That being said, suppose you have a theory which describes physical processes. Given that this theory must hold true for every variation of some parameters (rotation, translation, time, ...) then you can derive some things which are constant (angular momentum, linear momentum, energy, ...). Now one can spend many hours thinking about the relation between energy and time with fuzzy philosophical definitions or one can just accept that the math turns out that way (given some assumptions of that theory).


I guess very roughly you can say that if you describe a system with some set of parameters which, when changed in particular ways, leave the system's quantities (energy, momentum, etc.) unchanged, then you have a connection between symmetry and conserved quantities.


But note that this can be misleading (as could be any hand-wavy “explanation”). For example, a time shift does not affect the system’s angular momentum, but that does not mean that the conservation of the latter follows from the symmetry alluded to by the former.


An example:

The statement that it does not matter whether I do my (thought) experiment today or tomorrow can be used to mathematically deduce the conservation of some quantity, which we can intuit is the mechanical energy.

More technically it's a differentiable symmetry: we can do the experiment a day, a week, a year in the future - this time can be varied smoothly.

If you know calculus, read the first few chapters of Landau and Lifshitz's book on classical mechanics.


Just wanted to recommend the Landau and Lifshitz book as well. It's pretty dense but elegant.


Her personal story is also fascinating, https://en.wikipedia.org/wiki/Emmy_Noether

Thank you for all the excellent posts to help people like me understand a bit more about this interesting work.



