Why is AI hard and physics simple? (arxiv.org)
59 points by bigdict 38 days ago | 70 comments

Author has it completely backward. [edit: on further reading, the title is clickbait but the article content is consistent with my point below]

With only a few exceptions, ML is incredibly simple (there is no AI). The math is simple, the mechanics of evaluating it are simple, the reason it works is simple, and it only really works well if you have absurd amounts of data and CPU time.

Physics is... determining the mathematics you need to know on the fly while discovering and explaining many phenomena. You can spend decades focusing on matrix multiplications and other fairly straightforward trivia to analyze your particle trajectories, and then suddenly, you need to know group theory or some completely different field of math just to understand the basic modelling.

Personally I think the most impressive thing in physics and stats so far is our ability to predict the trajectories of solar system objects far into the future. After some very serious numerical analysis over the past 50 years, we've reached the point where there aren't many improvements left to make, and most of them come from identifying new objects and their positions and masses (data/parameters). The real argument now is about whether the underlying behavior is truly unpredictable even if you have perfect information.

Of course, the last best work in this area was done by Sussman who has been an ML researcher for some time: (https://www.researchgate.net/publication/6039194_Chaotic_Evo...)

As you can see, physicists pretty much invented all the math to do ML whilst solving other problems along the way: https://en.wikipedia.org/wiki/Stability_of_the_Solar_System

In fact, when I show many of my friends who were physics people the code of a large-scale batch training system, they wonder why they did physics instead of CS, because the math is so unbelievably simple compared to the tensor path integrals they had to learn in Senior Physics.

people are missing the point

the author is talking about constraint optimization in physics vs ml/ai

constraint optimization is easier with a well defined prior vs the guess work in ml algos.

dl models in physics can scale and emergent phenomena will depend on scaling parameters -- this is coarse graining in physics.

I didn't miss the point. My graduate work was using constraint optimizers in molecular dynamics (n-body physics) and it translated to ML (I didn't have to relearn anything). The one part that was truly simpler is that the objective functions in ML can plausibly be treated as convex, while n-body physics with constraints is highly non-convex.
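A minimal sketch of the convex vs. non-convex contrast, using toy 1-D functions chosen purely for illustration (nothing here comes from the paper or from any molecular dynamics code; the function names, step size, and iteration count are assumptions). Plain gradient descent on a convex bowl lands on the same minimum from any start, while on a double-well function the result depends on where you start.

```python
def gradient_descent(grad, x0, lr=0.05, steps=500):
    """Plain fixed-step gradient descent on a 1-D function, given its gradient."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def convex_grad(x):
    # Gradient of f(x) = x**2, a convex bowl with a single minimum at x = 0.
    return 2.0 * x


def double_well_grad(x):
    # Gradient of f(x) = (x**2 - 1)**2, non-convex with minima at x = -1 and x = +1.
    return 4.0 * x * (x * x - 1.0)


if __name__ == "__main__":
    for start in (-0.5, 0.5):
        print(
            "start", start,
            "convex ->", gradient_descent(convex_grad, start),
            "double well ->", gradient_descent(double_well_grad, start),
        )
```

Both starts reach the same point on the convex function, but they settle into different wells on the non-convex one. A constrained n-body energy surface has vastly more such wells than this toy.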

people on the board, not you

> my graduate work was using constraint optimizers in molecular dynamics

me too for qm/mm sims -- did some rg/complex systems work too ;)

In the future you should probably respond to the people you intend your comment to be a response to.

The funny part is that Group theory is heavily integrated with the mathematics of neural networks. So much so that some mathematicians say that finding the invariants of groups is what makes NNs work.


"The key idea behind our results is that if we can find the symmetries of a problem then we can solve it"

wat? I've worked in neural nets for 30 years and I've never needed group theory. just linear algebra, multivariate calculus, number theory.

I appreciate that symmetries are great but that doesn't solve a problem.

The math behind ML is far from simple. Look at manifold learning and the math behind learning theory. I agree with the general point tho

The title is 'clickbait' for sure, but not necessarily incorrect. After all 'simple' can mean different things to different people. And the author clarifies what he means by 'simple' in the rest of the article.

It's incorrect, and everyone knows why a provocative title is used even if the author has to spend the rest of the paper walking it back.

This was written to promote a book [0]

"As a first step in that direction, we discuss an upcoming book on the principles of deep learning theory that attempts to realize this approach.

Comments: written for a special issue of Machine Learning: Science and Technology as an invited perspective piece"

So take it for what it's worth.

[0] https://deeplearningtheory.com/PDLT.pdf

Ah, well that explains the clickbaity title, which is all most comments here are discussing at face value.

Physics isn't simple. It took (literally) thousands of years of study by very smart people to get to the point where what we call "intuition" about the physical world is what it is. And, as any physicist who paid attention in class will tell you, even that intuition isn't really right.

Any simplicity observed in physics is born of long familiarity, or is the result of underlying complexities being masked or approximated away.

Physics is simple in a certain sense and the paper explains this. In theory the force acting on something you drop could be a function of the state of every particle in the universe and each one could contribute with a different weight. But this is not the case: things are only influenced by things nearby and the weights involved are not arbitrary; the charge of each electron, for example, is the same. In that sense physics is unbelievably simple compared to what it could be.

nobody has proved the conjecture "things are only influenced by things nearby" this is one of the largest ongoing arguments in QM (locality: https://web.mit.edu/asf/www/Press/Holmes_Physics_World_2017....)

We also don't know for sure the "constants" are constant throughout the universe (spatially and temporally) This is assumed for now, and seems almost certainly true.

I think it's unsafe to assume the above are Absolutely True and that that's why physics is simple.

> nobody has proved the conjecture "things are only influenced by things nearby" this is one of the largest ongoing arguments in QM

This is a much more nuanced matter. Non-locality as in global wave function collapse or Bohmian mechanics does not have the same consequences as classical non-locality: there is no causal influence from things you do to an entangled particle at the other end of the universe. Also, to entangle particles they first have to interact locally before they can be separated.

> We also don't know for sure the "constants" are constant throughout the universe (spatially and temporally) This is assumed for now, and seems almost certainly true.

This does not really change the argument. Even if, for example, the fine-structure constant is not constant after all, there will most likely be a handful of other constants that describe how it varies over space and time. This is very different from every electron having a unique electric charge that is not governed by anything, where the only way to figure it out is to measure it for each electron.

It would also probably make a difference how values are distributed, are the charges of the electrons nicely distributed and vary by a factor of two or ten? Some statistical theory could probably deal with that. But what if there is no nice distribution, if values vary by hundreds of orders of magnitude, if the expectation value or the variance of the charge is infinite? I am certainly unqualified to make any definitive statements but I can imagine physics to be weird to the point that it becomes mathematically intractable or at least only produces useless answers because of the amount of uncertainty that enters the equations.

In any case, we know that the universe is simple to a very good approximation, even if fundamentally everything depends on everything and the electron charges are all random, those effects are small and we can have a good approximation with simple theories.

some context from TFA:

"Please note that the notion of simplicity to which we are appealing is not at all meant to suggest that physics is trivial. Instead, we mean it as a compliment: we have the utmost respect for the work of physicists. This essay is an apologia for physicists doing machine learning qua physicists; it is meant to interrogate what it is about the approach or perspective of physicists that allows them to reach so far in explaining fundamental phenomena and then consider whether that same approach could be applied more broadly to understanding intelligence"

edit: added quotes

Physics is not simple. e=mc^2 might look nice on a t-shirt but it is a cherry-picked formula. The Newtonian mechanics you learn in high school might seem simple to someone with a MINT degree, but it describes a small part of the world, and complicated things like friction are discarded.

General relativity, quantum chromodynamics, etc. They are all incredibly complicated.

Even bog standard classical mechanics can be very complicated and unintuitive, once you go to things like Hamiltonian dynamics and many-body problems. Then you have statistical physics and, as you said, relativity and quantum mechanics. I’ve recently spent some time learning about some quantum gravity theories; these things are hard.

Yes. Chaotic systems arise in classical mechanics. So while often the formalism is easier to write down than for example in general relativity, calculating a solution is often very difficult or practically impossible.

It's probably worth distinguishing between conceptually simple and mechanically simple. A many body problem is easy to understand (relatively), you're just extending existing rules onto more things at once. Now, actually calculating a position at a time given some initial vectors? That's complicated.
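To illustrate the "mechanically" part of the comment above, here is a minimal sketch for the simplest case, one body orbiting a fixed central mass. It uses the kick-drift-kick leapfrog (velocity Verlet) scheme as a stand-in; that choice, the units with GM = 1, the step size, and all function names are assumptions made for illustration, and serious solar-system work uses far more elaborate integrators.

```python
import math


def acceleration(x, y, gm=1.0):
    """Newtonian gravity from a fixed central mass at the origin."""
    r3 = (x * x + y * y) ** 1.5
    return -gm * x / r3, -gm * y / r3


def leapfrog_orbit(x, y, vx, vy, dt=1e-3, steps=1000, gm=1.0):
    """Advance one body with the kick-drift-kick (velocity Verlet) scheme."""
    ax, ay = acceleration(x, y, gm)
    for _ in range(steps):
        vx += 0.5 * dt * ax  # half kick
        vy += 0.5 * dt * ay
        x += dt * vx         # full drift
        y += dt * vy
        ax, ay = acceleration(x, y, gm)
        vx += 0.5 * dt * ax  # second half kick
        vy += 0.5 * dt * ay
    return x, y, vx, vy


def energy(x, y, vx, vy, gm=1.0):
    """Specific orbital energy: kinetic plus potential, per unit mass."""
    return 0.5 * (vx * vx + vy * vy) - gm / math.hypot(x, y)


if __name__ == "__main__":
    start = (1.0, 0.0, 0.0, 1.0)  # circular orbit of radius 1 when gm = 1
    end = leapfrog_orbit(*start)
    print("position after t = 1:", end[:2])
    print("energy drift:", energy(*end) - energy(*start))
```

The rule being applied is a single line (the inverse-square acceleration). The difficulty the comment points to appears once many bodies interact and small integration errors have to be controlled over very long times.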

You’re right. But many-body can also turn into statistical physics, which is quite complicated conceptually (basically when entropy and ergodicity become important). Then you have the best of analytical mechanics, statistics, and quantum mechanics if you fancy.

What is a MINT degree?

And e=mc^2 is not a simple formula. Okay, it's simple. But deriving it and understanding why it's like that was what, junior year in college?

I think MINT is the German version of STEM.

Yeah it is. Sorry that a Germanism slipped in.

"But deriving it"

It was a while back but I think we derived it in school aged 16-17 in first year A level Physics (UK.) Is that roughly "junior year in college" elsewhere?

As I recall, deriving it required a solid grasp of special relativity and calculus. I mean, if you want to say that you did that at 16, I won't argue with you. But if you claim all students did that it seems unlikely.

Edit: Oxford University, since you are from the UK and that's reputable within the country, lists General Relativity (and hence E=mc^2's derivation) as a 3rd year course in the Physics program. So that puts it squarely in the junior year.

E=mc^2 is a result from special relativity. This is usually touched on earlier than general relativity.
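For reference, a short way to see that this is a special-relativity statement. This is only a sketch: it takes the standard energy-momentum relation as given rather than deriving it, and the symbols p (momentum) and m (rest mass) are the usual ones, not notation from this thread.

```latex
E^2 = (pc)^2 + (mc^2)^2
\quad\Longrightarrow\quad
E = mc^2 \quad \text{for a body at rest } (p = 0).
```

No curved spacetime is needed anywhere in this, which is why it shows up before general relativity in most curricula.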

Ah, thank you. It's been a while since I had to remember which was which.

No if you take AP physics in the US, yes if you don’t. Same thing in the UK: if you don’t take physics for your A levels, you ain’t gonna learn that.

Physicists will tell you that F=ma can get you through virtually all of kinematics and dynamics.

I consider that a big chunk of the world.

Only in terms of spherical cow models. Try to apply pure F=ma kinematics to the real world and deformation is one of many huge issues it’s ignoring.

F = dp/dt

which might not be m*a but usually is.
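One case where the two differ is special relativity. A sketch, assuming constant rest mass m and a force parallel to the velocity v (the symbols \gamma and a = dv/dt are standard, but they are introduced here and are not from the comment above):

```latex
\begin{aligned}
p &= \gamma m v, \qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}} \\
F &= \frac{dp}{dt} = m\,\frac{d(\gamma v)}{dt} = \gamma^3 m a
\end{aligned}
```

For v much smaller than c, \gamma is close to 1 and this reduces to the familiar F = ma.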

ai is top-down and physics is bottom-up.

physics is also mostly deterministic (attach probability distributions for stochastic/quantum stuff) and there are well defined rules (energy, symmetry, noether, etc).

at the end of the day ai has some space for models and so does physics. because physics has well defined rules it's easier to apply constraints to that space vs ai/ml where it's informed guesswork.

of course there will be a correspondence between parameters in a model and emergent physical phenomena ... and i'm sure really nice scaling laws, etc will come out , this is just coarse graining.

onsager and the like were onto this stuff way before deep learning was a thing. i think this connection is uninteresting because optimization at its heart is physics. dl is just one aspect.

- a physicist (escaped to the greener pastures of swe, shame really, i miss it but not the wl balance)

Most of the responses here seem to imply that the author doesn't understand that physics can be complicated (in the sense of being hard to learn or having big equations). He studies theoretical physics at MIT [1], so I expect he does.

On the content: it's pretty weird that our best models don't use much of the world's underlying structure at all. State-of-the-art vision models like vision transformers and MLP-mixer do just fine when you shuffle the pixels. You could argue that modern image datasets are so big that any relevant structure could be learned by attention, but it still feels like we're doing _something_ wrong when pixel order doesn't matter at all ¯\_(ツ)_/¯

[1] https://danintheory.com/

Vision models need the pixel ordering to match the one they have been trained on, in order to work.

They won't generalize after training to transformations of the data that they haven't been trained on, even simple ones such as rotations, whereas humans will.

So I would argue that vision models do use the "underlying structure", and even that one of their problems is that they make use of some "underlying structures" that are not actually important, such as image luminosity, rotations etc. I think people usually augment the data with these transformations during preprocessing to enforce invariance.
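The two claims in this sub-thread can both be seen in a toy fully connected layer. This is a minimal sketch with hypothetical helper names, not a vision transformer or MLP-Mixer: a fixed shuffle of the pixels can be absorbed exactly by shuffling the layer's input weights the same way (so training on consistently shuffled images loses nothing), whereas feeding shuffled pixels to weights fitted for the original order changes the output.

```python
import random


def dense(weights, inputs):
    """One fully connected layer without bias or nonlinearity: y = W x."""
    return [sum(w * v for w, v in zip(row, inputs)) for row in weights]


def shuffle_pixels(pixels, perm):
    """Apply a fixed permutation to a flattened image."""
    return [pixels[i] for i in perm]


def shuffle_columns(weights, perm):
    """Apply the same permutation to the input columns of a weight matrix."""
    return [[row[i] for i in perm] for row in weights]


def max_abs_diff(a, b):
    """Largest element-wise difference between two output vectors."""
    return max(abs(u - v) for u, v in zip(a, b))


if __name__ == "__main__":
    random.seed(0)
    n_pixels, n_hidden = 16, 4
    W = [[random.gauss(0, 1) for _ in range(n_pixels)] for _ in range(n_hidden)]
    x = [random.gauss(0, 1) for _ in range(n_pixels)]
    perm = list(range(n_pixels))
    random.shuffle(perm)

    y = dense(W, x)
    y_consistent = dense(shuffle_columns(W, perm), shuffle_pixels(x, perm))
    y_mismatched = dense(W, shuffle_pixels(x, perm))
    print("weights shuffled with pixels:", max_abs_diff(y, y_consistent))
    print("pixels shuffled only:        ", max_abs_diff(y, y_mismatched))
```

A convolutional layer does not have this property, because its weight sharing assumes that neighbouring pixels are related.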

Physics is explaining what is, in a fairly testable domain.

AI is figuring out how to replicate something fuzzily understood, from a difficult to test domain, using techniques that are largely in a different domain altogether, because even for the parts of the inspirational domain we kinda-sorta understand, we don’t have the tools to directly reproduce them easily so at best we simulate them in alternative media.

(“Physics is simple” overstates the case, still, but “AI is hard” understates the case, so, relatively speaking, it’s something of a wash.)

Physics has had a 470 year head start, if you measure from Copernicus. Of course it looks simple. It wasn't simple at the time; it's taken something like 14 generations so far.

A fun read, but the title is somewhat misleading. Excerpts from the paper:

        This is why (understanding how) AI (works) is hard.

        This is why physics is (able to offer) simple (explanations for seemingly complicated natural phenomena)

The author is arguing that:

1. "top-down analysis" of deep learning is ineffective due to the nature of AI.

2. It should be possible to use the "sparsity" principle from physics as "a guiding principle of deep learning theory"

Where the "sparsity" principle states:

        The number of parameters defining a theory should be far less than the number of experiments that such a theory describes

Personally, I'm skeptical. I prefer to understand the whole theater of ANNs as an attempt to optimize neural nets for practicality. Even deep learning is a technique for structuring the network for efficient training.

I think so because even the dumbest neural network - layers with high connectivity - can perform certain tasks, if we train it really really hard with fingers crossed. To avoid uncertainty and improve performance, it's necessary to structure the network. This cuts unnecessary computation, and prevents over-fitting and over-labeling.

That is, we have to inject bias into the network, and this makes any bottom-up analysis biased and less generalizable.

As someone with a physics degree, physicists really have a unique ability to stroke their own egos. I think I agree with the premise - structure is certainly the most important thing in our physics and machine learning. However, a significant portion of the effectiveness behind ML is letting the computer find the structure for itself rather than doing it yourself. The most effective uses guide the learning via minimal structure.

The problem the article is basically describing is that many neural networks do not utilise all the information available, or are over trained, or are not in a sense compact enough. There is redundant information.

For example, when recognising pictures of cats, the fact that you can rotate a cat picture and it's still a cat isn't necessarily baked in. If you visualise layers there seems to be a lot of redundant information (if you train on rotated cats you get layers that look like rotations of each other). If there is a mechanism to store this kind of information more compactly, I am not aware of it.

Perhaps the idea is that some kind of structure that encodes the redundant information should be emergent or natural: it should evolve out of the models without having to make arbitrary choices. On the other hand, maybe we should turn to physics and add these physical kinds of constraints; there may be a way to renormalise or make use of symmetries to improve our current models. Perhaps a neural network is more like an algebraic variety (collection of polynomials) with a group action.
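One simple way to bake a symmetry in rather than learn it from rotated examples is to average a model's output over the group of transformations. A minimal sketch, assuming a toy linear "cat detector" and the four 90-degree rotations (the group C4); the model, the names, and the choice of group are illustrative assumptions, not something the article proposes.

```python
import random


def rot90(img):
    """Rotate a square image (list of rows) by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*img)][::-1]


def score(img, weights):
    """A toy linear detector: weighted sum of pixels (hypothetical model)."""
    return sum(w * p for wrow, prow in zip(weights, img) for w, p in zip(wrow, prow))


def invariant_score(img, weights):
    """Average the score over the four rotations of the input (the group C4)."""
    total = 0.0
    for _ in range(4):
        total += score(img, weights)
        img = rot90(img)
    return total / 4.0


if __name__ == "__main__":
    random.seed(0)
    n = 4
    weights = [[random.gauss(0, 1) for _ in range(n)] for _ in range(n)]
    img = [[random.gauss(0, 1) for _ in range(n)] for _ in range(n)]
    print("plain:    ", score(img, weights), score(rot90(img), weights))
    print("averaged: ", invariant_score(img, weights), invariant_score(rot90(img), weights))
```

The averaged score is the same for an image and its rotation because rotating the input only reorders the four terms being summed. One set of weights then serves all four orientations instead of the network storing four rotated copies.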

To make a stab at the question posed by the title:

We've known the principles of Newtonian physics since, well, Newton.

We'll need an Isaac-Newton-level of insight into intelligence before we can make it so simple.

Never mind that we can't even seem to agree on the definition of intelligence.

As to the definition, I have an objection to calling the current big-data stats we do nowadays "machine learning" or "artificial intelligence". There is no intelligence there.

Rather, to allude to the article, I consider it more "machine muscle memory" or "artificial intuition". The algorithm can tell you, "I have a really good hunch about this based on the zillions of examples I've seen" but it can't derive underlying truths to reason about why.

Perhaps we are recapitulating phylogeny when it comes to artificial neural systems? We have something like an autonomous nervous system, and a brainstem, but we need so much more to get to intelligence.

All human endeavors will converge on equal difficulty because it's only limited by the capability of the people doing it.

Physics might be simple, but it's not easy. Especially if you want to make predictions for complex systems.

Physics might be simple, but a physics engine is extremely complex.

I don't know where to begin....

Physics simple?

Physical intuition?

The single most successful program of physics is quantum mechanics, and it is neither intuitive nor simple. Relativity, while conceptually simple, isn't so simple either and is far from intuitive (consider that momentum is conserved during a relativistic near-miss collision only if one considers the entire collision over a sufficiently long period, since at any moment the force vector between the bodies does NOT align with the separation vector, because the force acting at one moment was exchanged, at c, when the bodies were in different positions).

There are a lot of simple concepts in physics, many of them basically teaching aids to get people started. When one gets deep into the field, simple and intuitive go by the wayside.

Why would you even bother responding based on just the headline of the article? The author is not using 'simple' as a synonym for 'easy to learn'.

TBF it's a terrible title and the overview isn't exactly enticing in its implication:

Let's get physicists to look at AI so we might make some progress, btw here's a new book that tells us how

I'm not saying that's what the article is actually about but that's what I read from it and it's crass enough that I didn't read further.

It's a risk any catchy headline takes. Seems like the same property that entices people to click also entices people to engage the headline itself.

I read TFA, hence the reference to intuition. There is nothing in the article that makes a compelling case of physics being simple, other than rhetoric.

We forget at our peril the Michelson-Morley Experiment and the Ultraviolet Catastrophe, and if we forget these, we may assume now too that we have it all figured out.

Of course, active researchers in the subject, both theoretical and experimental, do not forget.

Because it is awful.

Titles have to be short, and as such they can't hope to represent the contents of the article completely accurately. If you wanted to do that you would have to make the title equal to the article's contents.

Based on the parts which I've read so far, a more accurate title would be 'Why are some currently hot parts of AI not well understood, and some parts of physics well understood?'

I think the original title is an ok approximation of this.

The author is talking about how a given physics model appears simple when they are presented with it, e.g. a particular quantum field theory. This is the kind of limited perspective about research that an undergraduate physicist may develop simply by solving the hand crafted problems that are presented to them.

However, the true difficulty in physics is arriving at that model in the first place. Decades of work offered up against experiment, and the associated conceptual leaps in understanding required to get to e.g. a quantum field theory which successfully predicts things, are nothing short of a monumental achievement. To say that physics is simple is ludicrous.

You are missing the point of the article. The author is not trying to argue that AI is 'harder' than physics, like a freshman CS major might argue with their physics friends.

Author is talking about how our physical theories, such as QFT, currently have more predictive power than any theories we currently have about machine learning/deep learning.

(Author has a PhD in theoretical physics).

While QFT makes some amazingly precise predictions in certain areas like the fine structure constant, it is nearly useless for predicting even most chemistry.

In practice, the computations required to use the QFT model are just too complex for modern computers when it comes to single atoms with more than a few protons, not to mention larger molecules. Instead, we must use simplified models like the Bohr model to make predictions about molecular bonds.

This actually seems to be very similar to AI where we understand not everything, but a lot about basic neurons, yet the emergent phenomenon of intelligence is very difficult to predict due to the explosion of computational complexity.

That's a good point. I guess our current mathematics is not good enough to say much about the macroscopic behaviour of large interacting models.

I think the article misses the point of what physics is. It is not a collection of "sparse" models and principles, rather, it is a scientific discipline from which such models have emerged.

You will notice the article conflates the two things: physics and the known laws of physics (e.g. first para in section 1.2). Simplicity of the latter does not imply simplicity of the former, but the article assumes that it does in order to tackle/state the question as posed: "Why is AI hard and physics simple?".

IMO if all it takes is a few simple substitutions like “physics” -> “known laws of physics” to make the article or title make sense, then it’s unfair to say that the author has missed the point. C.f. Reading the strongest possible interpretation and all that.

for what it is worth: physics is not easy, never was, never will be. the painfully slow invention (or is it discovery?) of the mathematical machinery required to understand physical phenomena is one of the most astonishing feats of the human brain. In my view it sets it apart from anything remotely "AI"-ish.

Starting with early calculus, to differential geometry, Hilbert spaces and whatnot, the brain doesn't fit models to data, it is making up categories and concepts (new classes of metamodels if you wish) as it goes along (as improved experimental devices augment our sensory inputs). To cope with and explain these totally indirect information flows the brain conjures up symmetries in invisible abstract spaces, invariants and visual imagery from alien internal worlds, pursues "thought experiments" to restore sanity... Machine learning style fitting of "model" to data is just one of the final steps. Important but hardly defining the process.

The "AI" folks oversold their goods by a breathtaking factor and are now exposed to the entire planet. No physicist will ever be able to bail you out :-)

> Ignoring the mathematician’s appeal to rigorous (re)formulation, we focus on the latter part of the quote: “The physicist’s often crude experience leads in an uncanny number of cases to an amazingly accurate description of a large class of phenomena.”

Non-practitioner in either field ...

If you don't focus on the more or less "defined" ML subset, and focus on the AI in the question, whatever "AI" has meant over the years, then ...

The reason why AI is hard and physics simple may be that physics already "is," while we may not yet have defined what AI should be and what to do with it.

Physics has guardrails of reality, however subtle and hidden they may be. What is described by physics is what it is, and will be there as it is, whether we correctly describe it or not, and whether we observe it or not. (Depending on your view of an observed universe's dependent relationship with its observer.)

AI is ... what? In any particular decade? We aren't describing anything that already is, and that comes with constraints. We're making it up.

Specification is hard.

The universe as described by physics, on the other hand, is well beyond specification and requirements gathering. We're just writing the missing manual.

They'll put anything on hep-th nowadays

Physics is real, and verifiable.

AI is a black box you whack with a stick until you like the results.

I call on theoretical physicists to do theoretical physics. “AI” has already attracted a huge number of people (driven by meaningless buzzwords, and hoping to cash in).

Because physics is deductive whereas AI is (trying) to create a synthesis.

Physics is only semi-simple.

PSA: Anyone can publish just about anything on arxiv.org. Doesn't mean that it has any merit whatsoever.

Not anyone, you have to get vouched by someone who can publish there. Look at viXra for an example of a preprint server truly anyone can submit to.


Physics is simple? Can you describe to me what gravity is? Have you published your unified theory yet?

Physics is simple, is that so... Combine quantum mechanics with general relativity then. Einstein couldn't.

When Newton defined his derivative, it took two pages. Any textbook nowadays can do that in a paragraph.

AI is not yet common knowledge. It isn't as well understood.

But you just wait.
