
The problem with that assumption is that while those solvers are built on a mathematical foundation (though I'd argue that some NN-based simulations are too; some papers out of Weinan E's lab at Princeton have what looks like decent math to an untrained eye), the parameters fed into them are often just fit to some experimental data, and are often only verified in a heuristic manner. So while a neural-net-based simulator may not be transferable in the same way as a more general simulation engine (though depending on how you write them, even those generally have limits), it may be fine in a certain domain, which you can show by checking how well it fits experimental data you already have.



This creates a very serious risk of epistemic leakage. If the sim is verified with human-eye heuristics, then it will be possible, and very tempting, to accidentally make the neural net satisfy those heuristics specifically. Scientists may then use the same heuristics to validate the results of a neural net that was designed specifically to pass their eye test, thinking that they are kicking the tires while in reality they are learning nothing.


Disclaimer: I am mostly referring to classical molecular dynamics at the atomic scale, and what I am envisioning this type of thing for does not depend on the mathematical model per se. Specifically, my point is about, say, GROMACS (or a similar MD simulation engine), where the force fields generally used are parameterized for biological/organic systems. Let's say we train our neural network on data for lipids: a bunch of assorted data like their melting temperature, surface temperature, etc. Then we run the neural network simulator to model how these things form into larger structures (something that is for the most part impossible with current all-atom approaches). And then we study that data to make an assessment of how certain structures can form, etc.

Now, if you are familiar with the field, what I just described was (ignoring that I do not remember the exact fitting data) the parameterization process of the MARTINI force field, which is sufficiently good at lipids, kind of okay at proteins, and pretty bad at everything else. But as long as you know where the system's weaknesses are, you can still use it within those bounds to figure out experimental data. (As an aside, MARTINI only gained support for proteins and other things later on; thankfully, force fields improve over time.)


Even the best neural network would then just interpolate the experimental data. A simulation tests a (simplified) mathematical model against experimental data. The former will never escape the boundaries of your observations while the latter can make new predictions.


I am mainly referring to the all-atom molecular dynamics world, because that is the area I know. But within that domain, even the traditional simulators can only truly interpolate within experimental data. This is because the parameters in the simulator are generally fit only to a certain set of experimental data and therefore perform poorly anywhere else, whether that means novel unseen molecules, different temperature domains, etc. While those parameters have gotten better, they are still laughably bad in certain areas.

But yes, if you are building a mathematical model from the ground up and trying to use the model itself as a way to discover something about the system, then you are correct that the new model may be able to tell you new things about the system. However, if you are trying to learn things from the simulations, then the type of simulator you are using matters less, as long as it is accurate enough.
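
As a toy illustration of the fit-range point above (not an MD simulation, just a sketch): the snippet below fits a straight line to a made-up, Arrhenius-like "property" over a narrow temperature window, then compares the error inside and outside that window. The function `true_property`, its constant, and the temperature ranges are all assumptions chosen for illustration; nothing here comes from GROMACS, MARTINI, or any real force field.

```python
import math


def true_property(temperature_k):
    # Hypothetical "ground truth" (an assumption for this sketch):
    # an Arrhenius-like curve, not a real physical parameterization.
    return math.exp(-2000.0 / temperature_k)


def fit_line(xs, ys):
    # Ordinary least squares for y = intercept + slope * x.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def relative_error(model, temperature_k):
    # Relative deviation of the fitted line from the "true" curve.
    intercept, slope = model
    predicted = intercept + slope * temperature_k
    actual = true_property(temperature_k)
    return abs(predicted - actual) / actual


# "Experimental data" exists only between 280 K and 320 K.
train_temps = [280.0 + 5.0 * i for i in range(9)]
model = fit_line(train_temps, [true_property(t) for t in train_temps])

inside = relative_error(model, 300.0)   # within the fitted window
outside = relative_error(model, 500.0)  # far outside the fitted window

print(f"relative error at 300 K (inside fit range):  {inside:.3f}")
print(f"relative error at 500 K (outside fit range): {outside:.3f}")
```

The fitted line tracks the curve closely inside the window and drifts badly outside it, which is the same failure mode whether the fitted thing is a handful of force-field parameters or a neural net.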



