>the model automatically learned some nontrivial structure
What DL nets do internally is efficient (in many cases near-optimal) encoding, which seems to be the same thing as the analytical reduction to patterns that we do in our brains. The power of DL nets is limited only by hardware, i.e. not really limited at all going forward, and so I think we humans will soon be left in the dust.
Extremely excited about this. I am wildly creative and talented from a mathematical standpoint. It's probably the language I speak best and the one I'm most proficient in. Mathematics basically comes to me as common sense.
The only thing that might throw me off is notation, which I have developed cheat sheets for. Using the cheat sheets, I verbalize the expression as plain English syntax in sequential order, linearizing it. This always clears up the confusion.
2) Devise a neural network that leverages structural properties of the space you wish to investigate
3) Encode the relationship in a manner that allows supervised learning, and see if the net can learn the pattern (a minimal sketch follows this list)
4) If it fails, rethink the pattern and go back to step 2 (or give up); otherwise, use attribution and explainability tools to try to extract human-understandable concepts. Repeat from step 2 until the human's understanding converges
5) Use the extracted concepts to generate a conjecture or aid a proof.
6) Profit
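For the curious, steps 3-4 might look something like the sketch below, assuming a tabular encoding of the objects. `pattern_detectable`, the MLP size, and the 0.5 improvement threshold are all illustrative choices of mine, not anything from the paper:

```python
import torch
import torch.nn as nn

def pattern_detectable(X: torch.Tensor, y: torch.Tensor, epochs: int = 200) -> bool:
    """Step 3: train a small MLP on (encoding, target) pairs and check
    whether it noticeably beats a constant baseline on held-out data."""
    perm = torch.randperm(len(X))          # shuffle before splitting
    X, y = X[perm], y[perm]
    n_train = int(0.8 * len(X))
    X_tr, y_tr, X_te, y_te = X[:n_train], y[:n_train], X[n_train:], y[n_train:]

    model = nn.Sequential(nn.Linear(X.shape[1], 64), nn.ReLU(), nn.Linear(64, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(X_tr).squeeze(-1), y_tr).backward()
        opt.step()

    with torch.no_grad():
        test_mse = loss_fn(model(X_te).squeeze(-1), y_te).item()
        # baseline: always predict the training-set mean
        baseline = loss_fn(torch.full_like(y_te, y_tr.mean().item()), y_te).item()
    # Step 4: if the net is no better than the constant baseline,
    # rethink the encoding and go back to step 2.
    return test_mse < 0.5 * baseline
```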
In a way it's like AlphaZero, in that a neural net helps the searcher prune a highly branching decision space.
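Roughly, that analogy can be made concrete with a toy policy-guided pruner. This is only a sketch: `PolicyNet` and the hard top-k cutoff are my own illustrations, whereas AlphaZero actually folds the priors into its PUCT selection rule rather than discarding branches outright:

```python
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Maps a state encoding to a probability distribution over actions."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_actions))

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state).softmax(dim=-1)

def pruned_expansion(policy: PolicyNet, state: torch.Tensor, k: int = 5):
    """Expand only the k branches the net considers most promising,
    instead of the full (highly branching) action space."""
    with torch.no_grad():
        priors = policy(state)
    top = torch.topk(priors, min(k, priors.numel()))
    return list(zip(top.indices.tolist(), top.values.tolist()))
```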
DeepMind recently published another paper, [for chess](https://arxiv.org/abs/2111.09259), which was likewise about extracting hidden knowledge from a neural network. The authors hope this can be an avenue through which machine learning becomes useful to mathematics research. But even in the cases where their meta-algorithm works, I think they underestimate how hard it is to have expert neural network practitioners on hand; without them, I expect step 3 failures to be massively inflated.
----------
Choice extracts:
> In each of these cases, the necessary models can be trained within several hours on a machine with a single graphics processing unit
> Further, in some domains the functions of interest may be difficult to learn in this paradigm. However, we believe there are many areas that could benefit from our methodology.
> The Bruhat interval of a pair of permutations is a partially ordered set of the elements of the group, and it can be represented as a directed acyclic graph. For modelling the Bruhat intervals, we use a message-passing neural network. We add two features at each node representing the in-degree and out-degree of that node.
> First, to gain confidence that the conjecture is correct, we trained a model to predict coefficients of the KL polynomial from the unlabelled Bruhat interval. We were able to do so across the different coefficients with reasonable accuracy giving some evidence that a general function may exist, as a four-step MPNN is a relatively simple function class. We trained a GraphNet model on the basis of a newly hypothesized representation and could achieve significantly better performance, lending evidence that it is a sufficient and helpful representation to understand the KL polynomial. To understand how the predictions were being made by the learned function f̂, we used gradient-based attribution to define a salient subgraph SG for each example interval G.
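To make the two quoted ideas concrete, here is a hedged sketch combining (a) in-/out-degree node features on a DAG with (b) gradient-based attribution to rank salient nodes. `TinyMPNN`, `degree_features`, and `salient_nodes` are stand-ins of my own, and plain gradient saliency is one of the simpler attribution techniques; the paper's actual MPNN and salient-subgraph construction differ in detail:

```python
import torch
import torch.nn as nn

def degree_features(adj: torch.Tensor) -> torch.Tensor:
    """adj[i, j] = 1.0 iff there is an edge i -> j; returns (n, 2) in/out degrees."""
    return torch.stack([adj.sum(dim=0), adj.sum(dim=1)], dim=1)

class TinyMPNN(nn.Module):
    """A generic message-passing net: four rounds, as in the quote above."""
    def __init__(self, hidden: int = 32, steps: int = 4):
        super().__init__()
        self.embed = nn.Linear(2, hidden)
        self.msg = nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, 1)
        self.steps = steps

    def forward(self, adj: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.embed(x))
        for _ in range(self.steps):
            # each node aggregates messages from its in-neighbours
            h = torch.relu(h + adj.t() @ self.msg(h))
        return self.out(h.mean(dim=0))   # graph-level target, e.g. a KL coefficient

def salient_nodes(model: TinyMPNN, adj: torch.Tensor, x: torch.Tensor, k: int = 5):
    """Rank nodes by |d prediction / d features|, a crude salient-subgraph proxy."""
    x = x.clone().detach().requires_grad_(True)
    grad = torch.autograd.grad(model(adj, x).sum(), x)[0]
    scores = grad.abs().sum(dim=1)
    return torch.topk(scores, min(k, len(scores))).indices.tolist()
```

The point being: the nodes with the largest gradients are the ones the learned function is most sensitive to, and those are what a mathematician would then stare at for structure.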
I'm not in favor of the headline. The DeepMind blog clearly states:
"Our results suggest that ML can complement maths research to guide intuition about a problem by detecting the existence of hypothesised patterns with supervised learning and giving insight into these patterns with attribution techniques from machine learning"
Since the article's title is vague and baity, we've replaced it with a representative sentence from the article body, i.e. the one where Williamson describes the purpose of the research.
Can anyone elaborate on what these attribution techniques might be? My understanding is that explainability is a bit of a holy grail for DNN research - are reliable techniques starting to emerge?
Wasn’t Einstein just talking about Duck Typing? Seriously. Think about it. Do we live in a slideshow universe? Have we fully considered every consequence of not living in a slideshow universe?
This is pretty f*&king exciting if you ask me.