Oh well, time to study this intensively. Here's a link for the curious: https://arxiv.org/abs/1411.6671
I've never really thought about it in terms of your comment, however, which is actually very motivating. Thanks for the perspective, random internet person.
(Apologies for the OT comment)
It vindicates your thinking in that it indicates that you are picking up the techniques that professionals use to generate ideas. You just have to accept that as you're learning, most of the ideas will have been generated and addressed.
What gets awkward is when you're doing interdisciplinary research and you're using the generative mechanisms from one discipline in the other. By and large both sides stare at you in confusion, or it requires multiple hour conversations to lead the other discipline's practitioners into a mental space where they can say, "Oh, yeah, that wouldn't be useful because of X." It makes the process of tuning your idea generation much slower. On the flip side, you hit the unknown way faster, sometimes great vistas of unknown where everyone goes, "Umm...wow. Yeah. Never heard of that continent. Let us know what you find."
Advice to new PhD candidates: take a boring idea and write the best paper you can on it. It will be 10x better received than your "great" idea.
If you wanted to train a neural network model that would generalize, you would need to build physical constraints into the model architecture. For example, the dynamics are governed by a Hamiltonian based on pairwise interactions, which should be integrated as an ODE. A nice recent example of this from folks at DeepMind is the paper “Hamiltonian Graph Networks with ODE Integrators”: https://arxiv.org/abs/1909.12790. That said, if you go down this road too far you may find that you are just learning an approximation to Newton’s law of gravity, which we already know exactly!
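To make the "constraints in the architecture" idea concrete, here is a minimal NumPy sketch (my own toy example, not the DeepMind architecture): write the dynamics as Hamilton's equations, dq/dt = dH/dp and dp/dt = -dH/dq, and advance them with a symplectic (leapfrog) integrator. In a real Hamiltonian network the analytic H below would be replaced by a learned function and its gradients obtained by autodiff:

```python
import numpy as np

G = 1.0                          # gravitational constant, natural units
masses = np.array([1.0, 1.0])    # two equal bodies, for brevity

def hamiltonian(q, p):
    """H(q, p) = kinetic + pairwise gravitational potential.
    q and p have shape (n_bodies, 2)."""
    kinetic = np.sum(p**2 / (2.0 * masses[:, None]))
    potential = -G * masses[0] * masses[1] / np.linalg.norm(q[0] - q[1])
    return kinetic + potential

def grad_H(q, p):
    """Analytic dH/dq and dH/dp; a Hamiltonian *network* would obtain
    these by autodiff of a learned H instead."""
    dH_dp = p / masses[:, None]
    d = q[0] - q[1]
    f = G * masses[0] * masses[1] * d / np.linalg.norm(d)**3
    dH_dq = np.stack([f, -f])
    return dH_dq, dH_dp

def leapfrog(q, p, dt, steps):
    """Symplectic kick-drift-kick integration of Hamilton's equations:
    dq/dt = dH/dp, dp/dt = -dH/dq."""
    for _ in range(steps):
        p = p - 0.5 * dt * grad_H(q, p)[0]   # half kick
        q = q + dt * grad_H(q, p)[1]         # drift
        p = p - 0.5 * dt * grad_H(q, p)[0]   # half kick
    return q, p

# circular two-body orbit in the centre-of-mass frame
q0 = np.array([[0.5, 0.0], [-0.5, 0.0]])
v = np.sqrt(0.5)                 # speed for a circular orbit at separation 1
p0 = np.array([[0.0, v], [0.0, -v]])
```

Because the integrator respects the Hamiltonian structure, the energy error stays bounded instead of drifting, which is exactly the inductive bias a black-box regressor lacks.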
The interesting work in this space is finding appropriate interpolation problems inside the context of state of the art physics based models. In any complex simulation, there are tunable parameters and approximations that are suitable to replace with ML. If I were working on this problem, I would start by making a differentiable version of Brutus, and try to deeply understand its strengths and weaknesses.
Neural nets are just high dimensional interpolation. The only reason why they are more powerful than “classical” ML is that you can fit them when embedded in an arbitrary computational graph, in which it is easy to embed prior knowledge (“differentiable programming” style). If you’re not doing that, you might as well just stick with the blackbox algorithms of Scikit-Learn.
In terms of practical applications I don't see the point of training a network on simulations produced by deterministic systems. But I guess they didn't really mean their technique as a replacement for deterministic systems, because they say they envision it as a sort of crutch where Brutus falters.
They don't evolve the simulation very long and the errors rapidly increase with time. See Figure 3 for typical errors of about 20% on the validation set with clear signs of overfitting. https://arxiv.org/pdf/1910.07291.pdf
> Since its formulation by Sir Isaac Newton, the problem of solving the equations of motion for three bodies under their own gravitational force has remained practically unsolved. Currently, the solution for a given initialization can only be found by performing laborious iterative calculations that have unpredictable and potentially infinite computational cost, due to the system's chaotic nature. We show that an ensemble of solutions obtained using an arbitrarily precise numerical integrator can be used to train a deep artificial neural network (ANN) that, over a bounded time interval, provides accurate solutions at fixed computational cost and up to 100 million times faster than a state-of-the-art solver. Our results provide evidence that, for computationally challenging regions of phase-space, a trained ANN can replace existing numerical solvers, enabling fast and scalable simulations of many-body systems to shed light on outstanding phenomena such as the formation of black-hole binary systems or the origin of the core collapse in dense star clusters.
Lookup tables don't work well on GPU because threads don't access the same memory location: the address depends on the argument to the LUT, so the accesses diverge. Some nonlinear basis or projection doesn't have the same problem.
For the planet scale objects, the game designers would have to make sure that the orbits don't go crazy (over times that can be explored at maximum time warp). But then if the goal is to keep those orbits stable, why not just save all the dev effort and just keep them on Keplerian rails?
For the smaller objects launched by the player, unstable orbits aren't necessarily desirable either. Imagine going on a multi-decade mission and returning home to find your favorite space station missing. Game devs aren't going to spend effort on complicated features that will just piss off most players.
As they added more planets, if, at any point, they had changed from the simple model to the true one, it would have broken all tutorials, delta-v calculations, extant saves, etc. It would also have required more changes to make the game properly playable. In KSP, players will often leave vessels in orbits on timewarp while focusing on something else; they'll have 10 missions ongoing at once. This works well when orbits aren't being perturbed, but in real life/real physics, you can't just leave a vessel in orbit around the moon for a year without course correction; it'll crash. The problem is, the game has no way to automate course correction/stationkeeping, and that would be far less trivial to add than merely changing the gravity model from one system to another.
I'm not apologising for the devs not having done it right, really. KSP didn't really live up to all its promises, though it is pretty good.
This is an exaggeration; it's an undergraduate-level problem. It's not difficult to integrate the motion of 3 bodies under the influence of gravity. Sure, it can be tricky sometimes, or you need a small step size, but it's not rocket science.
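To back that up, here is a minimal fixed-step three-body integrator (plain velocity-Verlet in NumPy, a toy sketch rather than a production solver), started from the widely quoted figure-eight initial conditions for three equal masses:

```python
import numpy as np

def accelerations(pos, masses, G=1.0):
    """Pairwise Newtonian gravity, O(N^2)."""
    acc = np.zeros_like(pos)
    for i in range(len(masses)):
        for j in range(len(masses)):
            if i != j:
                d = pos[j] - pos[i]
                acc[i] += G * masses[j] * d / np.linalg.norm(d)**3
    return acc

def total_energy(pos, vel, masses, G=1.0):
    """Kinetic plus pairwise potential energy of one snapshot."""
    kin = 0.5 * np.sum(masses[:, None] * vel**2)
    pot = 0.0
    for i in range(len(masses)):
        for j in range(i + 1, len(masses)):
            pot -= G * masses[i] * masses[j] / np.linalg.norm(pos[i] - pos[j])
    return kin + pot

def integrate(pos, vel, masses, dt=1e-3, steps=5000):
    """Velocity-Verlet; the step size must shrink near close encounters,
    which is exactly where fixed-step schemes get into trouble."""
    acc = accelerations(pos, masses)
    for _ in range(steps):
        vel += 0.5 * dt * acc
        pos += dt * vel
        acc = accelerations(pos, masses)
        vel += 0.5 * dt * acc
    return pos, vel

# three equal masses on the periodic figure-eight orbit
m = np.ones(3)
pos = np.array([[ 0.97000436, -0.24308753],
                [-0.97000436,  0.24308753],
                [ 0.0,         0.0       ]])
vel = np.array([[ 0.46620369,  0.43236573],
                [ 0.46620369,  0.43236573],
                [-0.93240737, -0.86473146]])
```

Over a few orbital periods the energy and total momentum are conserved to high accuracy; the hard part of the 3-body problem is chaos over long times, not writing the integrator.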
> provides accurate solutions at fixed computational cost and up to 100 million times faster than a state-of-the-art solver
A speedup of 100 million times is a bit of a stretch.
> Our results provide evidence that, for computationally challenging regions of phase-space, a trained ANN can replace existing numerical solvers, enabling fast and scalable simulations of many-body systems
Yes, but extrapolating from 3 bodies to several is exactly where this technique is bound to break.
Basically, you divide your space into pieces depending on the location of the masses, and you build a tree, where far away areas are treated as if they were point masses at the center of gravity. This converts an N^2 computation into an N log N one. I'm not clear whether this neural net does better than the well known fast approximate method.
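The well-known method described above is the Barnes–Hut tree code. A toy 2D version (all names below are my own; real implementations add force softening and fancier cell-opening criteria) shows the idea:

```python
import numpy as np

THETA = 0.5  # opening angle: smaller means more accurate but slower

class Cell:
    """A square region of space: either a leaf holding one body, or an
    internal node holding four quadrants plus aggregate mass and
    centre of mass."""
    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half
        self.mass, self.com = 0.0, np.zeros(2)
        self.body = None        # (position, mass) while this is a leaf
        self.children = None

    def insert(self, pos, m):
        if self.mass == 0.0:                   # empty leaf: store body
            self.body = (pos, m)
        else:
            if self.children is None:          # occupied leaf: split it
                self.children = [Cell(self.cx + dx * self.half / 2,
                                      self.cy + dy * self.half / 2,
                                      self.half / 2)
                                 for dx in (-1, 1) for dy in (-1, 1)]
                old, self.body = self.body, None
                self._quadrant(old[0]).insert(*old)
            self._quadrant(pos).insert(pos, m)
        self.com = (self.com * self.mass + pos * m) / (self.mass + m)
        self.mass += m

    def _quadrant(self, pos):
        return self.children[2 * (pos[0] > self.cx) + (pos[1] > self.cy)]

    def accel_on(self, pos, G=1.0):
        """Treat a far-away cell as a point mass at its centre of mass
        when (cell size / distance) < THETA; otherwise open the cell."""
        if self.mass == 0.0:
            return np.zeros(2)
        if self.body is not None and np.array_equal(self.body[0], pos):
            return np.zeros(2)                 # skip self-interaction
        d = self.com - pos
        r = np.linalg.norm(d)
        if self.children is None or 2 * self.half < THETA * r:
            return G * self.mass * d / r**3
        return sum(c.accel_on(pos, G) for c in self.children)

def barnes_hut_accels(pos, masses):
    """O(N log N) accelerations for all bodies."""
    root = Cell(0.0, 0.0, np.abs(pos).max() + 1.0)
    for p, m in zip(pos, masses):
        root.insert(p, m)
    return np.array([root.accel_on(p) for p in pos])

def direct_accels(pos, masses, G=1.0):
    """O(N^2) reference for comparison."""
    acc = np.zeros_like(pos)
    for i in range(len(masses)):
        for j in range(len(masses)):
            if i != j:
                d = pos[j] - pos[i]
                acc[i] += G * masses[j] * d / np.linalg.norm(d)**3
    return acc
```

With THETA around 0.5 the tree forces typically agree with the direct O(N^2) sum to about a percent, which is the baseline the neural net would have to beat.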
Maybe I'm biased with my physics PhD, but I'd call it a physics problem. "Solving" it to me means a closed-form solution, not a particular prediction, but I guess the applied math problem is a different problem with a different solution.
Solving a problem in physics or math generally implies a closed-form solution, which this decidedly is not. It is in fact a special case with a low particle count, and it doesn't return velocity information, which leads to problems assessing whether quantities like energy are actually conserved.
They discuss this, and have to use a second network to compute velocity, used for computing kinetic energy. There they discover that in close near-collision dynamics their error can spike, unless they reproject the phase-space coordinates onto the "right" constant-energy surface.
So ... yeah ... title of preprint is quite misleading. Their conclusions are also misleading.
Are they better/more general at numeric solution for arbitrary mass, arbitrary particle count, arbitrary initial conditions than, say, a good symplectic integrator of high order?
No. Actually they only studied the equal mass case of 3 particles. Given the combinatorics, I could not fathom the size of the training set they'd need for N(particles) of any reasonable size.
There they would start having to look into some of the approximations that the simulation folks have been using for a long ... long time. Just to get their training cost down far enough to be useful.
An example. Take a simulation with N ~ 10^6, i.e. 1 million objects. Draw the masses from a Poissonian distribution, or an EVD, or similar, so that they approximate stellar masses. Create your initial conditions and generate your training sets from them. How many of these simulations would you need to run, and for how long, in order to get something that provides long-time-scale prediction and meshes with reality?
Oh, take N at least 8 for planetary mass distribution. With satellites, so N ~ 200. Never mind planetary rings or Oort cloud/Kuiper belt/asteroid belt objects.
How large does your training set need to grow in order to be able to predict, with any degree of accuracy, orbital stability over gigayear intervals?
My point is not to beat on the authors, but to point out that this study, while interesting, is most definitely not a solution to the 3-body problem. There is a vast literature on solutions to N-body problems in general. These problems are considered hard because they actually are. There are some "short cuts", basically field-like approximations, that one can use to make some generally reasonable arguments about the interaction with many distant objects. The actual dynamics, and the details of Poincaré maps and phase-space portraits, are, again, very well studied phenomena, with 100+ years of research behind them.
If they wanted a simpler dynamical problem to work with, the double pendulum exhibits chaotic behaviour. All they need to vary are 2 parameters for each arm: length and mass. See how close they come to actual behaviour, onset of trajectory divergence, etc.
[edit: slight clarification of second network computing velocity for kinetic energy]
Guess I'm biased too because I'm a theorist and started out from a math background.
Edit: Sorry, I assume you refer to the OP in which case yes, I agree that it's not a solution at all. Rather, it is a presumably faster-than-before simulation.
The post made me remember this article about predicting chaotic equations with machine learning to great accuracy:
But I also imagine that as the input size grows to the order of the hidden-layer widths, it will lose the advantage.
edit: Whether these NN approximations will enable us to go well beyond what PDE solvers can do is a tough question, especially because PDE solvers are generally needed to build the training sets as far as I understand (i.e. they are trained on synthetic, not observational, data). It's possible that one could strategically and judiciously train smaller components of a NN system and then assemble the system to do things that the PDE solvers couldn't, perhaps in the way that one can often write unit tests for scientific software but not really test the final results directly (because there is nothing to test against). However, I think that in general the lack of training data for the more complex system will always be a major obstacle, whereas it isn't really for PDEs.
If the neural net gives you a solution, is there any cheap way to confirm it?
That said, even if you didn't trust it, it might be possible to use the approximate alleged solution spat out by the neural net as an initial guess for a trusted solver with guarantees. In some cases you might be able to get a guaranteed result from the proper solver, with the neural-net solution merely used to accelerate convergence.
You can get decent error estimates for lots of these sorts of systems, with a manageable amount of extra computation.
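As a concrete instance (a generic conservation check, not anything from the paper), a candidate trajectory, say one emitted by a neural net, can be screened by verifying that energy and total momentum stay near their initial values. This costs only O(T * N^2) evaluations and no integration at all:

```python
import numpy as np

def energy(pos, vel, masses, G=1.0):
    """Total energy of one N-body snapshot: kinetic + pairwise potential."""
    kin = 0.5 * np.sum(masses[:, None] * vel**2)
    pot = 0.0
    for i in range(len(masses)):
        for j in range(i + 1, len(masses)):
            pot -= G * masses[i] * masses[j] / np.linalg.norm(pos[i] - pos[j])
    return kin + pot

def conservation_check(traj_pos, traj_vel, masses, tol=1e-3):
    """Screen a candidate trajectory: energy and total momentum should
    stay near their initial values at every sampled time.
    traj_pos and traj_vel have shape (T, n_bodies, dim)."""
    E0 = energy(traj_pos[0], traj_vel[0], masses)
    P0 = (masses[:, None] * traj_vel[0]).sum(axis=0)
    for pos, vel in zip(traj_pos, traj_vel):
        if abs(energy(pos, vel, masses) - E0) > tol * abs(E0):
            return False
        if np.linalg.norm((masses[:, None] * vel).sum(axis=0) - P0) > tol:
            return False
    return True
```

Passing the check doesn't prove the trajectory is right (a wrong path can still conserve energy), but failing it is cheap, conclusive evidence that the output is unphysical.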
It’s also only “faster” if you never factor in the training time for the NN.