The sensitivity conjecture is actually pretty simple to understand, intuitively. (Its reduction to applications, and its proof, of course, are both a little bit more of the "magical" parts so-to-speak.)
The 'canonical' way of thinking about it is in terms of the hypercube graph, which is defined in the following way:
- The dimension-0 hypercube graph is... just a single vertex.
⋅
- The dimension-1 hypercube graph is two vertices connected by a single edge.
⋅-⋅
- The dimension-2 hypercube graph is four vertices, connected in a square pattern:
⋅-⋅
| |
⋅-⋅
- The dimension-3 hypercube graph is the graph representing a cube!
- The dimension-4 hypercube graph is the graph which takes a cube, makes a copy, and then connects the corresponding vertices.
...
- The dimension-n hypercube graph is the graph which takes a dimension-(n-1) hypercube graph, makes a copy, and connects the corresponding vertices.
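(If it helps to see the recursion concretely, here's a tiny Python sketch of exactly this construction; everything about it, names included, is just my own illustration:)

    def hypercube(n):
        """Build the dimension-n hypercube graph as (vertices, edges).

        Vertices are the integers 0 .. 2^n - 1 (their bits are the
        coordinates). Follows the recursive definition above: copy the
        (n-1)-cube and connect corresponding vertices."""
        if n == 0:
            return [0], []                       # a single vertex, no edges
        verts, edges = hypercube(n - 1)
        offset = 1 << (n - 1)                    # the copy lives at v + offset
        copy_verts = [v + offset for v in verts]
        copy_edges = [(u + offset, v + offset) for (u, v) in edges]
        bridges = [(v, v + offset) for v in verts]  # connect corresponding vertices
        return verts + copy_verts, edges + copy_edges + bridges

    # Sanity check: the n-cube has 2^n vertices and n * 2^(n-1) edges.
    for n in range(5):
        vs, es = hypercube(n)
        assert len(vs) == 2**n and len(es) == n * 2**(n - 1)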
Ok, so let's play a game (for now, think about the 3-dimensional cube graph as a concrete example): if you could color the vertices of the cube red or green, what is the largest number of red (or green) vertices you can find such that no two of the same color are adjacent to each other? Note that the 3D hypercube has 2³=8 vertices, and, more generally, the nD hypercube has 2ⁿ vertices, so you have a lot of vertices to color :)
So, it turns out (and I will give it to you as an exercise!) that you can color 2ⁿ⁻¹ vertices of the cube with two colors without any vertices of a single color being adjacent to each other, but, if you try changing the color of any one vertex, then at least one vertex will have (at least) √n neighbors of the same color! (It may be worth reading this once or twice :)
This is the sensitivity conjecture! The fact that changing any one color immediately implies that some vertex now has at least √n neighbors all of the same color. (This is a slightly funky restatement of it; the usual one is in terms of subgraphs or boolean functions, but one is easily mapped to the other.)
It turns out a lot of important problems (in things like voting systems, for example) can be reduced to proving this conjecture, which gives a rather neat set of results "for free" when proving it, and why it's been such an interesting object for the past uhh, quite a few years in computer science :)
FWIW, I found your explanation helpful, but introducing two colours confused it needlessly for me.
I'd say: You have a hypercube of dimension n with 2^n vertices, and you start colouring one after another of them, without having any two adjacent vertices coloured.
You'll find that you can mark half of them (that is, 2^(n-1)) without any having a coloured neighbour (basically, you always go diagonal) - but as soon as you mark even one more, there'll be at least one vertex with at least sqrt(n) marked neighbours.
That's not the same thing. As a simple counter-example to what you're saying, given a 3D cube you can colour every vertex with either red or blue such that adjacent vertices are different colours.
OK, I've mis-understood what people are saying. I often find these colloquial re-phrasings more difficult and convoluted than the precise and exact statements, but perhaps my confusion will serve others in thinking through things.
Oh, I think both of these statements are quite precise (in the sense that reading them and turning them into math is rather immediate). Otoh, you are right and we are saying different things: I’m giving an equivalent formulation with a two-coloring (but there’s a bit of work to actually map it to the original problem) while the other post gives a direct formulation (though the “colors” part of the coloring is irrelevant, in that latter case).
I like graph theory a lot (even though my prof in uni was horrible at teaching it). One thing I like about it is at least we can easily visualize most graphs, as opposed to say, n-dimensional spaces from real analysis. I also like that coloring vertices is a thing, and that so many problems can be represented by graphs and we can apply so many graph algorithms on top of those graph representations.
One thing that always makes me laugh though is the terminology in math. Things like "hypercube" just get a chuckle out of me.
I didn't know what the sensitivity conjecture is but your explanation using coloring a graph is very straightforward.
> So, it turns out (and I will give it to you as an exercise!) that you can color 2ⁿ⁻¹ vertices of the cube with two colors without any vertices of a single color being adjacent to each other
For n=3, a three-dimensional cube, you're saying I can colour four vertices.
But I can colour all eight.
  r --- g
  |\    |\
  | g --+ r
  | |   | |
  g +-- r |
   \|    \|
    r --- g
And I think there's a simple method to take a completely coloured n-cube and construct a completely coloured n+1-cube: make a copy, flip the colours, then connect corresponding vertices.
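In code, that copy-flip-connect construction looks something like this (a sketch of mine, using the usual integer-bitstring vertex labels), with a check that no edge of the cube joins two same-coloured vertices:

    def two_coloring(n):
        """2-color all vertices of the n-cube: copy the (n-1)-cube's
        coloring, flip the colors in the copy, and connect corresponding
        vertices. Returns a dict vertex -> 'r' or 'g'."""
        if n == 0:
            return {0: 'r'}
        prev = two_coloring(n - 1)
        offset = 1 << (n - 1)
        flipped = {v + offset: ('g' if c == 'r' else 'r') for v, c in prev.items()}
        return {**prev, **flipped}

    # Edges of the n-cube join vertices differing in exactly one bit,
    # and no edge joins two vertices of the same color.
    n = 3
    color = two_coloring(n)
    for u in range(2**n):
        for i in range(n):
            assert color[u] != color[u ^ (1 << i)]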
I don't think you misunderstood (or at least I interpreted GP's post the same way). I suppose GP meant to say that you can color all 2ⁿ vertices of the n-dimensional cube such that precisely half of them (2ⁿ⁻¹) are red and half of them are green and such that no vertices of a single color are adjacent to each other.
Yes! Sorry, I mixed up two sentences in my head there. I meant to say that you can color 2ⁿ⁻¹ green and 2ⁿ⁻¹ red (or, alternatively, you can color all 2ⁿ vertices with one of green or red) such that no two vertices of the same color share an edge. (Which is what you describe!)
The actual paper is rather easy to understand [1]. I glanced through it for a bit. Your construction of "make a copy, flip the colors, then connect corresponding vertices" is very similar to the construction of the matrix A used in lemma 2.2.
This matrix is essentially the adjacency matrix of the hypercube, except with a few minus signs. Take the hypercube to have weighted edges of either 1 or -1; then the construction is: take two hypercubes, connect them, and flip the sign of all internal connections in one of the cubes.
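If I'm reading lemma 2.2 right, the recursion looks something like this in numpy (my own sketch, names mine); the payoff is that A² = nI, which pins every eigenvalue to ±√n:

    import numpy as np

    def huang_matrix(n):
        """Signed adjacency matrix of the n-cube, built recursively:
        two copies of A_{n-1} on the diagonal (one with flipped sign),
        identity blocks connecting corresponding vertices."""
        if n == 1:
            return np.array([[0.0, 1.0], [1.0, 0.0]])
        A = huang_matrix(n - 1)
        I = np.eye(2 ** (n - 1))
        return np.block([[A, I], [I, -A]])

    n = 4
    A = huang_matrix(n)
    # A^2 = n * I, so every eigenvalue is +sqrt(n) or -sqrt(n).
    assert np.allclose(A @ A, n * np.eye(2 ** n))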
Well, turns out you’re completely right and I fucked up my reduction (quite badly it seems)! Alas, it also turns out I can’t edit nor delete the post, so, hopefully people get this far down the thread and read this :)
A vertex on an nD hypercube has n neighbours, so how do you manage to colour it to have √n neighbours? Shouldn't it be n? If it's √n, then do you round it up or down?
Sorry, if it wasn't clear: it should have √n neighbours of the same color! (The number of actual neighbours of each vertex is fixed, as you said, at n :)
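If it helps, here's a small brute-force check (my own sketch) of the one-color version for n=3: any choice of 2ⁿ⁻¹+1 = 5 vertices forces some chosen vertex to have at least √3 (so, at least 2) chosen neighbours:

    from itertools import combinations
    from math import sqrt

    n = 3
    def neighbors(v):
        return [v ^ (1 << i) for i in range(n)]

    # Every set of 2^(n-1) + 1 vertices contains a vertex with at least
    # sqrt(n) neighbors inside the set.
    for S in combinations(range(2 ** n), 2 ** (n - 1) + 1):
        S = set(S)
        max_deg = max(sum(u in S for u in neighbors(v)) for v in S)
        assert max_deg >= sqrt(n)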
This is a bit of a random comment @dang, but it's always a bit of a shame when a long explanation gets posted and the comment that it is responding to dies or is downvoted for some reason or another. I've seen this happen a few times and am wondering if there is a possible fix?
I guess this is the risk one takes when responding to comments, but it feels like there might be a reasonable tradeoff/possible solution! :)
Yes, it's a problem. We can do things but unfortunately they're all manual for now, so we have to know about it, and that's a bottleneck. Thanks for letting us know!
From Scott Aaronson's blog (linked in the article):
> Another Update: In the comments section, my former student Shalev Ben-David points out a simplification [1] of Huang’s argument, which no longer uses Cauchy’s interlacing theorem. I thought there was no way this proof could possibly be made any simpler, and I was wrong!
> Aaronson and O’Donnell both called Huang’s paper the “book” proof of the sensitivity conjecture, referring to Paul Erdős’ notion of a celestial book in which God writes the perfect proof of every theorem. “I find it hard to imagine that even God knows how to prove the Sensitivity Conjecture in any simpler way than this,” Aaronson wrote.
I found this unreasonably funny, given we’re posting on a tech-focused forum. It’s like posting about “a proof by some dude called Albert Einstein” on a physics forum. :)
I really enjoyed Ben-David (and Shalev-Schwartz)'s book "Understanding machine learning". It's essentially about theory though, not "learn all about machine learning in torchsorflow in 10 days".
When of course, in reality (as opposed to clickbaity titles), he only wrote up the final, concise two-page proof _after_ thinking about, and working on, the problem for seven years, building on decades of prior research and seven years of learning new mathematics (new to Huang, not necessarily new to the world) that might offer ways into cracking this problem.
It's been interesting to watch the rapid degradation of the term "clickbait": we've quickly gone from the original targets of the term ("Local mother discovers one small trick", "He did X, you WON'T believe what happened next") to it now being used in probably a third of HN's comment sections to describe any title that is even mildly editorialised and not a long-winded statement of plain facts.
If anything I think the term clickbait is used more appropriately now. "Local mother discovers one small trick" is terrible clickbait because most people are used to it now in the same way a Nigerian prince scam won't work on most people.
Clickbait is anything that gets you to click on an article. In some cases, it can be good clickbait and an equally good article and in other cases it can be bad clickbait where the title states facts but the facts are misleading or require a specific interpretation or need more context or don't accurately describe the essence of the article.
I suppose the title could be slightly more accurate by replacing "solves" with "proves", but I think it's fine. The fact that the proof is only 2 pages long is highly notable. This is not a case of a headline making a typical event look more notable than it is.
The closest I get in my career is striving for the simplest approach to building something in software, which is not at all the easy path. The easy path always leads to complexity -- just look at any enterprise software project that's more than a year or two old.
I find the process of striving for simplicity gratifying, and reading this article about a similar -- but much longer -- process put a smile on my face.
The hard part about new simple things is getting people to understand the implications. Most people will shrug off something new and simple as the same as just another new complexity, when the implications are vastly different.
I don't see anyone arguing otherwise, nor is the title clickbait. The concision and comprehensibility of the proof is part of what makes it remarkable, just as the opposite (the size and complexity of Wiles' proof of Fermat's Last Theorem) was part of its story.
I remember when this came out. I read the paper out of curiosity and was amazed how simple it was to follow, even though I hadn't even heard of the sensitivity conjecture before. It's beautiful.
The concept of sensitivity is actually not too far from the questions asked in ML. "The minimum number of bit flips needed to change the answer" -> adversarial attacks, prediction stability, corner case performance. "Which bit flips are needed to change the answer" -> feature importance.
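For concreteness, the standard definition is short enough to code up directly (a sketch; the function names and the parity example are mine):

    def sensitivity(f, n):
        """Sensitivity of a boolean function f on n-bit inputs: the max,
        over inputs x, of the number of single-bit flips that change f(x).
        Inputs are the integers 0 .. 2^n - 1."""
        return max(
            sum(f(x) != f(x ^ (1 << i)) for i in range(n))
            for x in range(2 ** n)
        )

    # Parity is maximally sensitive: every bit flip changes the answer.
    parity = lambda x: bin(x).count('1') % 2
    assert sensitivity(parity, 4) == 4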
I remember this article from 2019. I remember the writing style struck me then as being a little strange.
My first thought was "who's the mathematician"? But the article doesn't actually mention the author's name until a few paragraphs down, electing instead to put the spotlight on Scott Aaronson and other commentators. I felt then as I do now that it was a little lacking in respect.
I agree, it's extremely distasteful. I feel the author has cut their teeth writing clickbait headlines to make information extraction harder, the way a supermarket puts the milk at the far corner of the store. I want a link/cite to the original paper in the first para.
There's an EconTalk episode, or maybe a Planet Money, that goes into detail on this: the cold chain is super important in food safety, especially for stuff like milk. You want to be able to move from refrigerated milk truck to grocery store food fridge rapidly, so you put the milk near the loading dock. And of course, you keep the loading dock as far away from customers as possible so they're not inconvenienced.
Moving the cold chain closer to checkout involves additional expenses, so it works more for higher margin products, and ones a bit more forgiving if there's a thaw, like Coke, Monster energy drinks, and canned coffee from Starbucks.
So the question becomes: would you pay 50 cents more for milk from a fridge at checkout, when you can get the same thing at the back?
But in the spirit of the question, search for "dark patterns" and "consumer nudging". Also, there were some Chicago economists writing on the topic. I don't have the book to tell you title/authors, but it was again something on nudging and exploiting the fast thinking for long term goals.
Had a similar experience with another (also Quanta?) article last year about some optical physics breakthrough made at my not-very-well-known alma mater, but the article was written as if the work was all done at MIT, which one of the collaborators was affiliated with. You had to scroll to the second page to see the name of the institution that the PI was actually at.
It is notable that in English, Chinese names are generally written given-family, but in French the original family-given order is preserved. I'm curious how other languages handle this -- any takers?
> in English, Chinese names are generally written given-family
Is that right? All the famous Chinese people I can think of are known in English by the family name first. Names of Chinese players of chess are always family-given in English.
I read after a moment's google that "Chinese people working in western countries usually adopt western order", maybe that's it.
Indian names I believe are similar, family or parent's name comes first[0], e.g. the ex-chess world champion Vishy Anand's name is Viswanathan Anand – Anand was/is actually his given name, but it's treated by English speakers as his surname. I read an interview where he said even he doesn't know any more which is his surname and which his first name hehe.
[0] Same in Hungarian, but always converted to given-family in English.
Possibly this depends on the subculture. One example picked at random, the (fabulous) film Black coal, thin ice is reported in Cahiers du Cinema as directed by Diao Yinan [1], but in IMDb as by Yi'nan Diao [2]
To be fair, it does link to the paper in literally the first words, which is in some sense the most important thing. But let's go ahead and record the name here since the article takes its time: the solver is Hao Huang.
I read it as more of an attempt to frame Hao Huang’s accomplishment in a way that a lay audience could better appreciate it. I didn’t get any sense that it was disrespectful.
Yes! How hard is it to have a headline that says "Mathematician Hao Huang solves Sensitivity Conjecture"? It's nearly impossible to distinguish news stories from the force-fed advertorials as it is. Now journalists are electing to imitate them?! "Texas Teacher Discovers Mortgage Loophole" indeed....
Perhaps it's just that the article isn't written in inverted pyramid style? In non-newspaper writing (e.g. a novella) having the main character be introduced in the 8th paragraph wouldn't be that odd.
Mathematics often deals in abstractions, and "God" may be used as a linguistic abstraction for ideal or perfect knowledge. It's a shorthand in writing casually about the most elegant mathematical insights. Its use is also a tradition via Erdős with "The Book", to which Aaronson and many mathematicians pay homage.
Right. AFAIK there is no universal math deity like there is Caissa for chess that can serve as a representation of "perfect math", so God is the next best thing.
If it's any consolation, Erdős himself said in one famous lecture that the "God" part is irrelevant: "You don't have to believe in God, but you should believe in The Book."
I think about this sometimes. I think it's related to my interest in the philosophy of mathematics. From the SEP article [1]:
> Bernays observed that when a mathematician is at work she "naively" treats the objects she is dealing with in a platonistic way. Every working mathematician, he says, is a platonist (Bernays 1935).
I think that's why. There's a tendency to think of mathematical objects platonically. And given that it would be very odd if platonic, mathematical objects interacted causally with our universe, well, it's a short hop, skip, and a jump over to that meaning they're things a God made. And here I mean "God" in a root-of-all-things sense, not a Judeo-Christian or similar sense.
Also, I don't think you deserve the downvotes you're getting. There may be a tone of condescension, and there may not be. But there's a totally charitable way of interpreting that question -- you noting a trend that you perceive -- and that's how I took it.
It's a shortcut, they're not literally invoking "God". It's an encoding of:
We don't know what the optimum is, but we can imagine the complete set of possible things, and from those there will be one that's best. So that's the optimum, and we want to refer to it. We don't know what it is, but we assume it exists, so let's just call it 'God's solution'.
Indeed, but I expect that, for any theorem, there is some level of proof quality which is attained, such that all proofs at least that good have at most a certain finite length.
And, as there are only finitely many proofs of any finite length, the supremum of proof quality over all proofs of that theorem is attained. (Whether there is usually a unique proof which attains it, I don't care to guess.)
B) It is a figure of speech that somewhere there is a book of "perfect" proofs/blueprint of the universe. Presumably whatever concept you call "God" has access to this, just like "Death" has a list of all living things etc..
FWIW, most mathematicians surveyed are "Mathematical Platonists," meaning they believe that mathematical objects exist in a way outside the human mind/the physical universe.
This mathematical "realm" is a very natural home for the variety of God a mathematician is likely to imagine.
Alternatively, if you like, God is one of the most powerful metaphors we as humans have.
I think it’s because God is often seen as the infinite, or uncreated, or all that can be known. The something that exists apart from any material existence.
Mathematical relationships and patterns have this quality.
Really what I wonder is why can’t we all just believe in god again?
Even if it was just to recognize the fact that there are things that exist in an immaterial way. That there are things that are good and true and beautiful, whose goodness, truth, and beauty are not quantifiable or reducible to some set of atoms. We might disagree on which things are better physical manifestations of the immaterial things, but we don't need to disagree on their existence.
Though I think it is because so many people have said "god is like this" and then posited some religious thing that affirms their often heinous actions, that people have chosen to say: yeah, I don't believe in that.
But it seems sad that we can’t just talk about god in the sense of there are things bigger than ourself, even if we don’t know them because they are unknowable.
Even if we just acknowledged “God” as the first cause or the existence of things not material (like 2+2=4 is true even if there was no material thing to represent it), I feel like we could go a long way to finding that us humans, usually have more in common than we have not in common. That the things that divide us are actually smaller than the thing that unites us.
(I don’t see why your question was downvoted by the way)
I think that would be because God is nature (for those who believe in God) and math is how humans describe nature. That would be the same as saying that nature can't be expressed in its own language any simpler than this.
In most religions, God is an omnipotent, omniscient, and generally perfect being.
Mathematicians simply borrowed that definition.
For example, the Rubik's cube "God's number" is 20. Because, of course, God, as an omniscient, omnipotent, perfect being, will solve the puzzle in the minimum possible number of moves. And that number is 20 in the worst case.
Computer science has a similar, more formal concept: oracles. An oracle is a machine that can instantly solve a given problem. How it does it is not the question (it may as well be supernatural); we simply assume that such a machine exists in order to study some theoretical problems.
It's not a terrible question, and even if the reference to God is via Erdős, there's still a reference. I think it's evident in the sciences as well, e.g. "God does not play dice with the universe."