From models of galaxies to atoms, simple AI shortcuts speed up simulations (sciencemag.org)
85 points by DarkContinent on Feb 15, 2020 | 36 comments



For some reason this reminds me of the famous Xerox copier bug where the compression algorithm would swap out digits: https://news.ycombinator.com/item?id=6156238


I doubt it.

Simulations based on PDEs, ODEs, or DAEs have a particular mathematical foundation. Within the bounds of their solvers, they deliver precise results and can actually forecast physical behavior.

If such an "emulator" is much better in many cases but completely wrong in just one, it is basically useless, as presumably the verification of a solution takes as long as a classical simulation.


This misses the use of the AI model for prediction, search and optimisation, backed up by simulation for verification.

> it is basically useless, as presumably the verification of a solution takes as long as a classical simulation.

If and only if you're doing one simulation which is rarely, if ever, the case.


The point is that without a guaranteed error bound you would have to verify every approximation from your black-box AI.


Good fast approximations are great. Link up a few of them and use voting as a sanity check (rough sketch below). Accepting some uncertainty in return for speed is a good tradeoff. You don't always need accuracy, and even high-quality simulations fail.

Sometimes you want laser-like precision. But often, you are better off with a flashlight.
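To sketch the voting idea in Python (the model class, the tolerance, and the toy "simulation" are all arbitrary placeholders, not anything from the paper): train a small committee of emulators and only trust them where they agree, falling back to the real simulation elsewhere.

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    # Toy stand-in for an expensive simulation: y = sin(x0) * exp(-x1).
    def run_simulation(x):
        return np.sin(x[:, 0]) * np.exp(-x[:, 1])

    rng = np.random.default_rng(0)
    X_train = rng.uniform(0, 3, size=(2000, 2))
    y_train = run_simulation(X_train)

    # Train a small committee of emulators on bootstrapped subsets.
    committee = []
    for seed in range(5):
        idx = rng.integers(0, len(X_train), len(X_train))
        model = GradientBoostingRegressor(random_state=seed)
        model.fit(X_train[idx], y_train[idx])
        committee.append(model)

    X_new = rng.uniform(0, 3, size=(10, 2))
    preds = np.stack([m.predict(X_new) for m in committee])  # shape (5, 10)

    # "Vote": accept the committee mean where members agree; where they
    # disagree, fall back to the real simulation for that input.
    spread = preds.std(axis=0)
    tolerance = 0.05  # arbitrary, would be tuned per application
    estimate = preds.mean(axis=0)
    flagged = spread > tolerance
    estimate[flagged] = run_simulation(X_new[flagged])
    print(estimate)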


The things you see with a flashlight at least correspond to things that actually exist. Replacing algorithms that have proven error bounds with heuristics that have no such bounds whatsoever is much more questionable.


The problem with that assumption is that while those solvers are based on a mathematical foundation (though I'd argue some NN-based simulations are too; some papers out of Weinan E's lab at Princeton have decent math, at least to an untrained eye), the parameters fed into them are often just fit to some experimental data, and are often only verified in a heuristic manner. So while a neural-net-based simulator may not be transferable in the same way as a more general simulation engine (though depending on how you write them, even those generally have limits), it may be fine in a certain domain, which you can show by just checking how well it fits experimental data you already have.


This creates a very serious risk of epistemic leakage. If the simulation is validated by human-eye heuristics, then it becomes possible, and very tempting, to accidentally make the neural net satisfy those specific heuristics. Scientists may then use the same heuristics to validate the results of a neural net designed specifically to pass their eye test, thinking they are kicking the tires when in reality they are learning nothing.


Disclaimer: I am mostly referring to classical molecular dynamics at the atomistic scale, and what I am envisioning this type of thing for does not depend on the mathematical model per se. Specifically, my point is about, say, GROMACS (or a similar MD simulation engine), where the force fields generally used are parameterized for biological/organic systems. Let's say we train our neural network on data for lipids: maybe a bunch of random data like their melting temperature, surface temperature, etc. Then we run the neural network simulator to model how these things form into larger structures (something that is for the most part impossible with current all-atom approaches), and then we study that data to make an assessment of how certain structures can form.

Now, if you know the field, what I just described was (ignoring that I do not remember the exact fitting data) the parameterization process of the MARTINI force field, which is sufficiently good at lipids, kind of okay at proteins, and pretty bad at everything else. But as long as you know the weaknesses of the system, you can still use it to figure out experimental data. (Also, as an aside, MARTINI only got support for proteins and other things later on; thankfully force fields improve over time.)


Even the best neural network would then just interpolate the experimental data. A simulation tests a (simplified) mathematical model against experimental data. The former will never escape the boundaries of your observations while the latter can make new predictions.


I am mainly referring to the all-atom molecular dynamics world, because that is the area I know. But within that domain, even the traditional simulators can only truly interpolate within experimental data, because the parameters inside the simulator are generally fit to only a certain set of experimental data and therefore show poor performance anywhere else, whether that is novel unseen molecules, different temperature regimes, etc. While those parameters have gotten better, they are laughably bad in certain areas.

But yes, if you are building a mathematical model from the ground up and trying to use the model itself as a method to discover something about the system, then you are correct that the new mathematical model may be able to tell you new things about the system. However, if you are trying to learn things from the simulations themselves, then the type of simulator you are using matters less, as long as it is accurate enough.


Interestingly, my lab has been working on emulators for one of our simulation models, and we're really struggling to make meaningful improvements.

It's faster, but we're not there yet on accuracy.


"When they were turbocharged with specialized graphical processing chips, they were between about 100,000 and 2 billion times faster than their simulations."

Now the critical question is: How much faster is it without AI, just because of the specialized dedicated processing chips?

Otherwise, they might be comparing a single virtualized CPU core against a high-end GPU for things like matrix multiplication ... and then the result that GPU > slow CPU isn't really that impressive.


An alternative question is: how much faster is it with the neural network-based emulation ("AI") but without the specialized dedicated processing chips? I think the answer to this gives the information you are looking for.

The paper answers this question:

> While the simulations presented typically run in minutes to days, the DENSE emulators can process multiple sets of input parameters in milliseconds to a few seconds with one CPU core, or even faster when using a Titan X GPU card. For the GCM simulation which takes about 1150 CPU-hours to run, the emulator speedup is a factor of 110 million on a like-for-like basis, and over 2 billion with a GPU card. The speed up achieved by DENSE emulators for each test case is shown in Figure 2(h)
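Back-of-the-envelope check of those numbers (my arithmetic, not the paper's):

    sim_seconds = 1150 * 3600         # 1150 CPU-hours for the GCM run
    cpu_speedup = 110e6               # like-for-like, one CPU core
    gpu_speedup = 2e9                 # with the Titan X

    print(sim_seconds / cpu_speedup)  # ~0.038 s, i.e. tens of milliseconds per emulator call
    print(sim_seconds / gpu_speedup)  # ~0.002 s on the GPU

which is consistent with the "milliseconds to a few seconds" claim.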


>Now the critical question is: How much faster is it without AI, just because of the specialized dedicated processing chips?

Based on similar work we are doing at the startup I work for, this isn't just GPU magic. ML is a heuristic alternative to simulations which already operate on specialized GPUs and TPUs. This modeling acceleration is one of the many ways in which ML is poised to change everything.

The same way that a human can, for instance, approximately draw iso-temperature lines around a candle flame, without having to perform simulations...except the neural net is some 99%+ as accurate and detailed as a full simulation. That's exactly why neural nets excel - they learn complex heuristics much like humans do, but with the added power of digitized computation and memory.


Don't most really complex physical calculations / simulations (e.g. weather, planetary movements etc.) involve chaotic interactions? So an NN being 99% correct will still result in "catastrophic" differences down the line?
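To make that concrete (purely illustrative, nothing to do with the emulators in the article): in the logistic map at r = 4, even a relative error of one part in a million in the state dominates the trajectory within a few dozen steps.

    # Logistic map x_{n+1} = r * x * (1 - x), chaotic at r = 4.
    r = 4.0
    x_true, x_approx = 0.2, 0.2 * (1 + 1e-6)
    for n in range(1, 41):
        x_true = r * x_true * (1 - x_true)
        x_approx = r * x_approx * (1 - x_approx)
        if n % 5 == 0:
            print(n, abs(x_true - x_approx))  # the error roughly doubles per step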


So will quantization error when you grid your conventional sim. If you can make your cells and timesteps smaller with the NN than without, the loss in accuracy in one place could be compensated by the gain elsewhere. The only issue is, good luck with developing a formal theory of error propagation through your trained network that is faster to compute than the conventional sim itself. Sometimes you care about strict formal guarantees about sim error and other times you don't.


What is the worst case loss of accuracy possible from the ML version?


Usually there is no error bound (or we don't know how to prove one), so the worst case is that it is infinitely wrong (or as wrong as floating-point numbers can express).


You're probably right, but from everything I've seen to date, the representations of functions that neural networks learn tend to be smooth, which is to say if your input is similar to the domain covered by your training data, you're quite unlikely to get totally crazy outputs...though I can't say I've come across anything that proves this.

However, I think the degree to which neural networks can extrapolate is still an open question, so it is important to thoroughly sample your problem space during training. That's a bit of an art form, given that these spaces are typically high-dimensional.
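One crude way to act on that in practice (just a sketch of the idea, not something from the paper): flag inputs that fall outside the region covered by the training data before trusting the emulator there.

    import numpy as np

    def in_training_range(X_train, x_new, margin=0.0):
        # Crude in-distribution check: is each coordinate of x_new inside the
        # per-dimension min/max of the training inputs (plus a margin)?
        # A density model or Mahalanobis distance would be a better test.
        lo = X_train.min(axis=0) - margin
        hi = X_train.max(axis=0) + margin
        return np.all((x_new >= lo) & (x_new <= hi), axis=-1)

    rng = np.random.default_rng(1)
    X_train = rng.uniform(0.0, 1.0, size=(500, 3))
    print(in_training_range(X_train, np.array([0.5, 0.5, 0.5])))  # True: interpolation
    print(in_training_range(X_train, np.array([0.5, 0.5, 2.0])))  # False: extrapolation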


Off by as much as two infinities, since floating point numbers have +infinity and -infinity. ;)


I made a benchmark a few months ago for a friend of mine working in computational physics. She needed to compute the determinant of a lot of matrices to perform MCMC sampling and wanted to know how much GPUs could speed things up. For that task, one V100 was equivalent to about 100 CPU cores, so a speedup of 100,000 is definitely not coming only from using GPUs vs. CPUs.
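For anyone curious, the shape of that benchmark is roughly this (a sketch with PyTorch; the batch and matrix sizes are made up, and the actual numbers depend entirely on your hardware and her problem):

    import time
    import torch

    # Batched determinants, the core operation in the MCMC sampler described above.
    mats = torch.randn(20_000, 32, 32, dtype=torch.float64)

    t0 = time.perf_counter()
    torch.linalg.det(mats)
    print("CPU:", time.perf_counter() - t0, "s")

    if torch.cuda.is_available():
        mats_gpu = mats.cuda()
        torch.cuda.synchronize()
        t0 = time.perf_counter()
        torch.linalg.det(mats_gpu)
        torch.cuda.synchronize()
        print("GPU:", time.perf_counter() - t0, "s")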


The real question is how much of the speedup is due to the low quality of the initial implementation. I've seen speedups of 100-1000x on the same hardware. Academic simulation code is often bad.



I was at a talk last week where the speaker spent a bit of time on using machine learning with a regression matrix trained on the results of a simulation. The simulation and the variables in the regression matrix were chosen such that the AI could recreate an approximation of a known physical law. This is fairly exciting to me, because if it were used to recreate a lot of laws in this field, it could then be applied to experimental data to untangle some of the mess and identify the relationships for us. I could see this speeding along the development of science.
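A toy version of what I understood the speaker to mean (my own illustration, not theirs): generate data from a "simulation", regress in log space, and read the known law back off the fitted coefficients, here the pendulum period T = 2*pi*sqrt(L/g).

    import numpy as np

    # "Simulated" pendulum periods for a range of lengths, with a little noise.
    g = 9.81
    L = np.linspace(0.1, 2.0, 200)
    noise = 1 + 0.01 * np.random.default_rng(0).normal(size=L.size)
    T = 2 * np.pi * np.sqrt(L / g) * noise

    # Regress log T on log L: the slope recovers the exponent 1/2 and the
    # intercept recovers log(2*pi/sqrt(g)).
    slope, intercept = np.polyfit(np.log(L), np.log(T), 1)
    print("exponent ~", slope)               # ~0.5
    print("prefactor ~", np.exp(intercept))  # ~2*pi/sqrt(9.81) ~ 2.0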


> It randomly inserts layers of computation between the networks’ input and output, and tests and trains the resulting wiring with the limited data. If an added layer enhances performance, it’s more likely to be included in future variations.

Sounds a lot like genetic algorithms but with neural networks. I suspect we'll see more of this as people figure out how to run the search over neural network architectures that fit their own domains. Convolutions and transformers are great and all but we might as well let the computers do the search and optimization as well instead of waiting on human insights for stacking functions.
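A minimal caricature of that kind of search (my own sketch, not the DENSE algorithm from the paper): randomly propose inserting a hidden layer, retrain briefly, and keep the proposal only if validation error improves.

    import numpy as np
    from sklearn.neural_network import MLPRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(2000, 4))
    y = np.sin(3 * X[:, 0]) + X[:, 1] * X[:, 2]   # stand-in for a simulator output
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

    def val_error(layers):
        model = MLPRegressor(hidden_layer_sizes=layers, max_iter=500, random_state=0)
        model.fit(X_tr, y_tr)
        return np.mean((model.predict(X_val) - y_val) ** 2)

    # Start with one hidden layer; greedily accept random layer insertions
    # that improve validation error.
    layers = (32,)
    best = val_error(layers)
    for step in range(5):
        pos = int(rng.integers(0, len(layers) + 1))
        width = int(rng.choice([16, 32, 64]))
        candidate = layers[:pos] + (width,) + layers[pos:]
        err = val_error(candidate)
        if err < best:
            layers, best = candidate, err
        print(step, layers, best)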


The underlying paper was previously discussed on hn here:

https://news.ycombinator.com/item?id=22132867

Note: The published paper is titled "Up to 2B times acceleration of scientific simulations with deep neural search", which can raise some hackles, including mine. Doesn't prove anything but still.


Here's a potential way to use adversarial techniques to generate training examples that could improve the accuracy of this approach: https://twitter.com/RoboTeddy/status/1228828411050655744


Who'd think compression works so well?

(yes, neural networks are compression engines)


This is not a meaningful analogy, because it is too generic; using it does not add anything to the discussion.

Any mathematical model is a 'compressed' form of reality, and that's why they work well. Instead of 'compressed', 'simplified' or 'abstracted' would be a better term. Machine learning adds a heuristic, data-driven model to the scientific model.


Do you mean in the same way that any mathematical function is a compression engine? That is, you implement something that can handle many cases (1+1, 2+3, 5+6) in a concise form?

It seems to me like the real magic of neural networks is that they make it easier to search for a function that solves (to some extent) a particular problem.


I always thought programming and even theory were knowledge compression


Compression in your context is as meaningless as Generalization.

Yes, you can say generalization is compression.


Except that "generalization" implies that it works for previously unseen problems, which is usually not the case for AI.

Compression, on the other hand, nicely captures the "learn and reproduce" approach that using AI entails.


"Unseen problems" is an ill-defined term. There is a distinction between in-domain and out-of-domain data; both can be unseen by the model beforehand.

Even a human agent requires training before being deployed on unseen problems. Generalization is conditioned on experience, after all.

AI generalizes to unseen in-domain data for a given task. That is why it is useful in the first place.


Not necessarily



