If you like this, there is a whole book full of visual proofs [1]. See also Wikipedia [2].
A few years ago I re-drew a bunch of these in LaTeX with my PhD advisor and another colleague [3]. We planned to print them as posters and hang them for a Pi Day event that unfortunately never happened because the pandemic broke out.
Thank you and congratulations for this incredible work. Maybe consider adding some type of reference/credits to the PDF?
I think it would be useful for people who download this and later can't remember where they got it. It would be cool to be able to give credit where it's due :).
This reminds me of this video about why we need to be very careful when inspecting visual proofs: https://www.youtube.com/watch?v=VYQVlVoWoPY (it includes a "proof" that pi equals exactly 4).
In this case, as someone else pointed out below, this proof has unjustified assumptions in it (at least it assumes that b < a).
My geometry teacher in 9th grade was adamant that we couldn't assume lengths and angles in diagrams except where explicitly labeled. In particular, he said that we should never assume that a diagram is drawn to scale; if the diagram happened to depict a quadrilateral as a square, unless something stated that it was a square (or we had sufficient information to determine that it was), we should treat it as an unidentified quadrilateral and proceed only with the actual information provided or else he would "take more points off than the question was worth" on tests. On one of our quizzes, he did provide a diagram that looked like a kite (two pairs of adjacent sides each with the same length but different length than the other pair) but listed the angles in a way that could only work for a parallelogram that was not a kite, and he made good on his word to take off extra points for people who misidentified it as a kite.
There is no problem with the proof except the assumption that the value in the limit is the same as the value at infinity. If we simply define pi(n) as a function from N U {inf}, which gives the value that "pi" takes at the nth step of the process, and pi(inf) as the value that it actually takes for the circle, then we simply have a function where lim n->inf pi(n) ≠ pi(lim n->inf). For all finite n, it equals 4, and then at infinity it equals 3.1415... .
There are ways to reformulate the above so that "infinity" isn't involved but this is the clearest way to think of it. It isn't much different than the Kronecker delta function delta(t), which is 1 at t=0 and 0 elsewhere. We have lim t->0 delta(t) ≠ delta(lim t->0 t).
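To see the gap between "the limit of the values" and "the value at the limit" numerically, here is a minimal sketch. It assumes the usual staircase construction (the one behind the "pi equals 4" argument) applied to a circle of diameter 1; the function names are made up for illustration. The staircase and the inscribed polygon pass through the same points on the circle, but only the polygon's length moves toward pi.

```python
import math

def quarter_circle_points(n, r=0.5):
    """n + 1 points on a quarter circle of radius r, from angle 0 to pi/2."""
    return [(r * math.cos(k * math.pi / (2 * n)),
             r * math.sin(k * math.pi / (2 * n))) for k in range(n + 1)]

def staircase_perimeter(n, r=0.5):
    """Full perimeter when neighbouring points are joined by axis-parallel steps."""
    pts = quarter_circle_points(n, r)
    quarter = sum(abs(x1 - x0) + abs(y1 - y0)
                  for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    return 4 * quarter

def polygon_perimeter(n, r=0.5):
    """Full perimeter when neighbouring points are joined by straight chords."""
    pts = quarter_circle_points(n, r)
    quarter = sum(math.hypot(x1 - x0, y1 - y0)
                  for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    return 4 * quarter

# Compare the two perimeters as the number of steps grows.
for n in (1, 10, 1000):
    print(n, staircase_perimeter(n), polygon_perimeter(n))
```

The staircase hugs the circle ever more closely, yet its length does not change with n; only the chord lengths converge to the circumference.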
Why not? Either b < a, b = a or b > a. If b = a both sides are zero and the result is trivial. If b > a just swap the names of a and b and multiply both sides by -1.
With this additional justification I accept it, yes. But (imo) now you are doing algebra. And then you might as well prove it just by distributing the RHS of the original expression.
This additional justification is so trivial it's just left out for brevity. In general a lot of statements are accepted even if not each and every detail is spelled out.
Let me fully spell out the proof that generality is not lost
Assume that b > a.
1. Swap the names a and b.
b^2-a^2 ?= (a+b)(b-a)
2. Multiply both sides by minus one
a^2-b^2 ?= -(a+b)(b-a)
3. Absorb the minus into the second factor
a^2-b^2 ?= (a+b)(a-b)
This is the same as the original equality we wanted to prove.
Now compare that to the purely algebraic proof for the whole theorem.
1. Distribute over the first parenthesis
(a+b)(a-b) = a(a-b) + b(a-b)
2. Distribute again
(a+b)(a-b) = a^2-ab + ba-b^2
3. Cancel equal terms
(a+b)(a-b) = a^2 - b^2
The proof that generality is not lost is of similar length to a full proof of the theorem!
So if you think the former can be skipped, then you must accept a proof of the whole theorem that simply reads "Follows from trivial algebraic manipulation"
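For anyone who wants to check both arguments mechanically, here is a small sketch. It is a brute-force check over a handful of integers, not a proof, and the helper names are invented for this example. It tests the identity directly and also replays the swap, negate, absorb steps for every pair, including b > a.

```python
from itertools import product

def identity_holds(a, b):
    """Direct check of a^2 - b^2 = (a + b)(a - b)."""
    return a * a - b * b == (a + b) * (a - b)

def swap_and_negate_holds(a, b):
    """Replay the three steps: swap the names, negate both sides, absorb the sign."""
    lhs = -(b * b - a * a)          # swapped left side, multiplied by -1
    rhs = -((b + a) * (b - a))      # swapped right side, multiplied by -1
    return lhs == a * a - b * b and rhs == (a + b) * (a - b)

# Includes a < b, a == b, a > b, zero and negative values.
pairs = list(product(range(-5, 6), repeat=2))
all_ok = all(identity_holds(a, b) and swap_and_negate_holds(a, b)
             for a, b in pairs)
print(all_ok)
```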
They may be equally long, but length when written is IMO not a good measure of complexity.
The 2nd I would need pen and paper for to keep my head straight doing the distributive law. The 1st I can do in my head as:
"If b is larger than a, the left side will be negative instead of positive, but also "b-a" gets a negative sign, and these cancel each other, so it is the same"
But I agree with you anyway ...this is indeed "doing algebra".
I agree that the figure would have been better if it said that in that figure it displays the case where a is larger than b. And we should probably call it visualization of an algebraic proof, instead of visual proof.
Sorry, but no. First, "just switch a and b" is clearly wrong, and we had multiple people propose that (without the adjustment of sign). So it's not that trivial, I would argue. At least not more trivial than the underlying equality one wants to prove in the first place.
if you just extend the metaphor in the diagram, and imagine a negative length to just refer to direction, sure it does :)
Personally, I love visual proofs because they can communicate an idea efficiently. Sure, they have their pitfalls, but it's less about the actual mechanism of the proof and more about the core idea that lets me appreciate how it's working, and visual proofs add a pseudo-physical intuition that helps me appreciate it.
Trying the proof with a < b, with the b square from the bottom-right as in the diagram, I get a region to the top and left, and moving a piece (differently to the diagram) I get (a + b)(b - a) as a positive area for that region, and then flip the sign because it's negative.
You can do this if you extend your concepts of length and area to signed quantities. You have to be clear to explain how the setup of the drawing works, but instead of cutting away a square of side b out of the corner of the square of side a, you might end up appending a negatively-signed square of side −b.
.. now that I think about it, the "visual proof" only 'proves' the statement for a specific 'a' and 'b'. Probably there is a proof, that can handle all 'a' and 'b' pairs at once.
A visual proof is supposed to appeal to our visual intuition - I don't know about you, but negative areas are not something that is visually intuitive to me.
If you're bothered by the idea of negative-area rectangles, there's no need to justify the assumption that b < a, because that's the only way you can assign any meaning to either side of the equation.
The equation isn't about rectangles at all, so I disagree. The rectangles only exist for the purposes of this proof. Any resulting complications are on the proof, not the reader's burden.
One easy place to get some intuition for signed areas is in the context of integrals.
You know position is the integral of velocity right? So say you walk in a straight line from your starting point, then you keep slowing down until you come to a stop and walk backwards past your starting point.
If you were to graph your velocity vs. time, at some point it would dip below the t axis because your velocity would be negative. OK, cool. If you integrate from the point where you came to a stop and started walking backwards, you’re calculating the area above the negative velocity curve (between it and the time axis). You’ll find it is a negative area. You know it has to be negative because you walked backwards past your starting point, so it gets so negative that it cancels out all the positive area from when you were walking forwards.
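A minimal numeric sketch of that walk, assuming a made-up velocity profile v(t) = 1 - t (forwards until t = 1, backwards afterwards) and a hand-rolled trapezoid rule; neither comes from the comment above.

```python
def velocity(t):
    """Hypothetical walk: slow down, stop at t = 1, then walk backwards."""
    return 1.0 - t

def signed_area(f, t0, t1, steps=1000):
    """Trapezoid-rule integral of f from t0 to t1; areas below the axis count as negative."""
    h = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        left = t0 + i * h
        total += 0.5 * (f(left) + f(left + h)) * h
    return total

forward = signed_area(velocity, 0.0, 1.0)   # area above the t axis
backward = signed_area(velocity, 1.0, 3.0)  # area below the t axis
position = forward + backward               # net displacement from the start
print(forward, backward, position)
```

The backward part comes out negative, and the sum of the two signed areas is the final position relative to the starting point.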
I don't see that as negative area. I see it as subtracting two areas. Both the room and the pillar have a positive area, none of them has negative lengths or areas.
Kids learn subtraction before they learn negative numbers - once you learn negative numbers, you know that addition and subtraction are almost interchangeable, but this is not necessarily intuitive to begin with.
It’s intuitive, just in more dimensions. People have different ideas and abilities to imagine/think dimensions, but on top of that we rarely train them to do that.
I think that build up to tensor fields should be in every school program. If you can’t think of a field, you’re mathematically disabled and too many basic ideas about real world are inaccessible to you. This limits the ability to vote on a set of topics and participate in non-local decisions that involve systemic understanding. Same for formal logic and statistics.
Once familiarized with that, you can easily start thinking of nonlinearly signed areas, complex areas and areas simultaneously positive and negative by an attribute.
That's interesting. To me it seems intuitive. It's a real area that can be drawn like any other. The sign is an operator describing what function to visualize, not a property of the measured area. So thinking of it in that way eliminates any need for the term "negative area."
But, intuition is subjective, so you may need to adjust the terminology to fit the visualization.
It is intuitive on some higher level, now that I know about signs, operators, negative numbers and all that. But when talking about visual information (i.e. "visual proof"), it isn't intuitive in that context.
In addition to that, for all I know there could be some pitfalls involved with negative areas which I'm not aware of. Even if there aren't any pitfalls, this isn't immediately obvious to someone who isn't familiar with the concept of negative area.
If I'm willing (or forced) to think in such abstract terms, I would much prefer an algebraic proof to this visual proof.
There are no serious pitfalls with oriented areas. Adding them to your arsenal of geometric proof tools will greatly simplify many proofs. Not having such a concept makes ancient geometry books much more complicated than they need to be, often requiring lots of detailed case analysis where the separate cases are essentially the same, just oriented opposite ways.
The concept of negative area still feels like it'd get messy in a hurry. For a square pillar, the side lengths should be the same, suddenly giving you imaginary lengths just for the eventual area subtraction to work out. For a negative volume though, you need cubic roots of unity for the side lengths, throwing off your area calculations. Has anyone actually put together a system where the sort of concept you're describing is cohesive?
Gotcha; it's less that my complaint doesn't apply, but more that it isn't relevant (i.e., "squares" and "cubes" aren't especially interesting constructs which need to coexist nicely, and if you relax that constraint then directional geometry can be very interesting). Does that sound right?
This sort of thing is more useful for teaching than being an actual proof. You teach the same concept in different ways and the students can form a more solid understanding of the underlying concept independent of symbols or shapes, or at least they may understand it one way if they don't get the other ways.
You can do all sorts of factorizations the same way and handle negative areas by drawing them in a different color.
I vaguely remember from an analysis lecture, when mathematical rigor wasn't such a thing, the professor mentioned a famous mathematician (maybe Euler?) who had a notebook with hundreds of proofs where many of the visual ones didn't turn out right. Definitely nice for intuition though, since this is oftentimes lacking.
No, it assumes "one side is smaller than the other": the labels are arbitrary, so there is no case where b is larger than a as we can just swap the labels and get a>b again. Even though you might think there are two cases that we need to look at because we're using two letters, there's actually only one case to demonstrate because we're demonstrating a property of the inequality.
The only "but what if..." would be if a=b, which has no geometric proof, but also doesn't need one because "zero = zero times anything" is (by definition) true for fields.
The problem is not symmetric between a and b (because a-b isn't). You cannot swap them. (And even if, this swapping is not part of the "proof", so in any case the proof is incomplete.)
It's fully symmetric. Basic arithmetic covers the following identities:
1. (a² - b²) = -(b² - a²), because of even powers
2. (a - b) = -(b - a)
So the following two statements are the same statement:
3. (a² - b²) = (a - b)(a + b)
4. -(b² - a²) = -(b - a)(b + a)
Let's assume this only holds for a>b (because we're content that the geometric proof shows that):
3a. (a² - b²) = (a - b)(a + b), a > b
But we don't know if it also holds for b>a... after all, how would you show a negative area? What does that even mean? Turns out: it doesn't matter, the b>a relation reduces to the same formulae as the a>b relation, so the geometric proof covers both. To see why, some more elementary algebra: we can invert both sides of (4), provided we also invert the relation between a and b, so this:
4a. -(b² - a²) = -(b - a)(b + a), b > a
Is the same as this:
4b. (b² - a²) = (b - a)(b + a), a > b, by inversion
Of course, algebra doesn't care about which labels you use, as long as the identities and relations between them are preserved, so we can swap "a" for "b" and "b" for "a" in both the identity and relation in (4b) to get:
4c. (a² - b²) = (a - b)(a + b), b > a
And we found a symmetry that we (maybe) didn't realize was there:
3a. (a² - b²) = (a - b)(a + b), a > b
4c. (a² - b²) = (a - b)(a + b), b > a
Same formula, inverse relation. It turns out that it doesn't matter whether we start with a>b or b>a, they reduce to the same expression, thanks to those even powers, and a geometric proof for one is by definition a proof for the other.
That's the point. The geometric proof requires that you show the applicability for the b>a case with algebra. If you don't, it's not complete. And if you do, you can just show everything with algebra in the first place, and shorter, and also for a,b element C (instead of R), at the same time.
I'd have to disagree - it is perfectly fine to show a visual proof and simply state that we can reduce b>a to a>b and this proof therefore covers both, optionally with a little "I don't believe you, show me the math" pop out.
The visual proof is the neat part that people literally can't think of unless you show it to them, after which things might suddenly click for them. The algebraic proof is boring AF and doesn't make for a good maths hook ;)
IMO GP is about the labelling: instead of a²-b² use longer²-shorter² (and in the visual representation one side will always be longer, as per GP's explanation).
The diagram doesn't show a proof for "shorter²-longer²". I believe showing negative area would spark more controversy (imagine that the negative area is painted orange; the visual area would still be positive).
Shorter²-longer² gives you the same absolute value with the sign reversed, so it feels symmetric to me (can't remember what the formal definition is), i.e.:
> we need to be very careful when inspecting visual proofs
This is not a visual proof but a nice visualization, like a written explanation that is not a proof.
For an actual novel proof, nobody would imagine that they could eyeball it for a few minutes and conclude it was complete, correct, and consistent - maybe with the exception of professional mathematicians examining for simple proofs. You might eyeball it and follow its logic and not see any immediate flaws, but that's different.
I find this much more "useful" since the Pythagorean theorem isn't immediately intuitive to me.
As for the proof in the original post, it seems really redundant to me. It follows from a (b+c) = ab + ac.
And while building intuition for this distributive property of multiplication is extremely essential when teaching maths, I feel that the intuition for why this is true is better built without leaning on geometry.
One time I was really sure that splitting a line into three equal segments let you draw lines from a point and split a 60 degree angle into three 20 degree angles, but this wasn't actually true.
I don't feel like it's more redundant than the Pythagorean theorem though, as we can say that the latter directly follows from the definition of the dot product…
How does Pythagoras's theorem follow from the definition of the dot product?
Do you mean that x.y = x₁y₁ + x₂y₂ and x.x = |x|², so it follows directly from that? If you define the dot product to be the first of those identities then you need Pythagoras's theorem to prove the second, so your argument is circular.
(Or you can prove that x.y = |x| |y| cos θ but that's even further removed from the component-wise definition than Pythagoras's theorem. Or you can define the dot product that way, but then you still have to prove the component-wise formula from it.)
Normally you define the dot product (or inner product more generally) with a few conditions that it has to satisfy, and then define two vectors to be orthogonal if their dot product equals 0: <u,v> = 0.
Then Pythagoras's theorem falls out of that - if we have two vectors u, v that are orthogonal, we can use the conditions in the definition of the dot product to prove that ||u + v||^2 = ||u||^2 + ||v||^2 (where ||u|| is the norm of u, defined as sqrt(<u,u>)).
(It's not a hard proof, because the definition of dot product says it's additive in the first slot, meaning <u+v, w> = <u,w> + <v, w>. So it's easy to prove things about sqrt(<u+v, u+v>) by splitting out the u's and v's. It's a bit hard to write this on HN because no mathjax though.)
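Spelled out, the computation described above looks roughly like this. It is a sketch that assumes a real inner product, so that it is also additive in the second slot and symmetric, and it uses the norm defined by ||u||^2 = <u,u>.

```latex
\begin{align*}
\|u+v\|^2 &= \langle u+v,\ u+v \rangle \\
          &= \langle u,\ u+v \rangle + \langle v,\ u+v \rangle
             && \text{(additivity in the first slot)} \\
          &= \langle u,u \rangle + \langle u,v \rangle
             + \langle v,u \rangle + \langle v,v \rangle
             && \text{(additivity in the second slot)} \\
          &= \|u\|^2 + 2\langle u,v \rangle + \|v\|^2
             && \text{(symmetry)} \\
          &= \|u\|^2 + \|v\|^2
             && \text{(orthogonality: } \langle u,v \rangle = 0\text{)}
\end{align*}
```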
If you mean you'd start with the axiomatic definition of an inner product space (bilinear, symmetric, positive definite) then I agree that proving that formula is trivial. The problem with a really abstract approach like this is that you haven't actually proved Pythagoras's theorem in Euclidean space. How do you know that the usual definition of dot product satisfies these axioms (or indeed anything does)? How do you show that the "distance" you've defined from the inner product is what we'd expect to be distance (e.g. that it's rotation invariant) or that what you've defined as "orthogonal" from your inner product is related to angle in Euclidean space?
It's not hard to show all that, but by the time you've done it you'll have accidentally proved Pythagoras's theorem along the way. It's like you've gripped on a tube of toothpaste and said "see, there's nothing there" but really you just squeezed it all to the other end.
As a fun illustration of this, note that <x,y> := 2x₁y₁ + x₂y₂ satisfies the axioms but doesn't give you the normal distance measure (e.g. it's not rotationally invariant: |(1,0)| = √2 while |(0,1)| = 1) and has a different notion of "orthogonal".
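Here is a quick sketch of that weighted inner product in code. The test vectors p and q are picked for this example and are not from the comment. It shows that the two unit axis vectors get different lengths, that "orthogonal" means something different from the usual dot product, and that the abstract Pythagoras identity still holds for vectors that are orthogonal in the weighted sense.

```python
import math

def inner(x, y):
    """The weighted form <x, y> := 2*x1*y1 + x2*y2."""
    return 2 * x[0] * y[0] + x[1] * y[1]

def norm(x):
    """Norm induced by the weighted inner product."""
    return math.sqrt(inner(x, x))

def dot(x, y):
    """The usual dot product, for comparison."""
    return x[0] * y[0] + x[1] * y[1]

e1, e2 = (1.0, 0.0), (0.0, 1.0)
print(norm(e1), norm(e2))        # a quarter-turn rotation changes the length

# Hypothetical pair: orthogonal for the weighted form, not for the usual one.
p, q = (1.0, 2.0), (1.0, -1.0)
print(inner(p, q), dot(p, q))

# "Pythagoras" in the weighted geometry: |p + q|^2 versus |p|^2 + |q|^2.
s = (p[0] + q[0], p[1] + q[1])
print(norm(s) ** 2, norm(p) ** 2 + norm(q) ** 2)
```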
If you want to prove Pythagoras's theorem on Euclidean space, aren't there about a thousand proofs from that? Including the semi-original from Euclid? I assume it was proved there. And yes of course, you have to start with a bunch of axioms and earlier proofs about Euclidean space, but that's always true isn't it?
> It's not hard to show all that, but by the time you've done it you'll have accidentally proved Pythagoras's theorem along the way. It's like you've gripped on a tube of toothpaste and said "see, there's nothing there" but really you just squeezed it all to the other end.
Funny metaphor. Yes, I don't think you can "simply" prove Pythagoras's theorem without a bunch of background assumptions, it's just that usually these are all assumptions we've already learned (explicitly or implicitly). And if you want to start without assumptions, like in the case of defining inner product space from scratch, then you are by definition starting abstractly and therefore left with the problem of showing this maps onto Euclidean space, somehow.
I agree that there are lots of proofs of Pythagoras's theorem on Euclidean space. The comment I replied to said that it follows "directly" from the definition of the dot product. That's all I was disagreeing with. They had missed that they were using some or other property of the dot product that was actually proved from Pythagoras in the first place, or some other non-trivial fact about Euclidean geometry.
And I certainly don't mean to imply anything about the importance of abstract inner product spaces. In fact my master's thesis was about Hilbert spaces. And I find it pretty interesting that you can prove something like Pythagoras on the inner product form I mentioned at the end of my last comment.
Ah yes, you're right of course, and I haven't really thought about it in this way before - that either you're based on "real world" geometry, in which case things are a bit harder to prove but make sense, or you're more abstract, in which case you can define things to be easier to prove e.g. Pythagoras, but the complexity is in the mapping between your definitions and the "real world".
> In fact my masters thesis was about Hilbert spaces. And I find it pretty interesting that you can prove something like Pythagoras on the inner product form I mentioned at the end of my last comment.
That's pretty cool, you're definitely more knowledgeable than I am, I'm just a math amateur :)
It may seem trivial, but to use that to prove the component-wise formula for general vectors you're assuming distributivity of the dot product over addition of one of its arguments. But if you're starting with the x.y = |x| |y| cos θ definition, how do you prove that (without first going via the component wise definition that you're still in the process of proving)? You end up needing trigonometric angle formulae that are at least as hard to prove as Pythagoras's theorem.
Sorry, but you can't bypass proving Pythagoras's theorem by definition of the dot product or anything else.
> You end up needing trigonometric angle formulae that are at least as hard to prove as Pythagoras's theorem.
(emphasis mine)
Well that's not wrong (because proving Pythagoras' theorem is pretty straightforward anyway) but at the same time the one trigonometric formula you need (cos(a-b) = cos(a)cos(b)+sin(a)sin(b)) “follows from a (b+c) = ab + ac” if you start from Euler's formula.
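For reference, the derivation alluded to goes roughly as follows. It is a sketch that takes Euler's formula and the exponent law e^{x+y} = e^x e^y on the complex plane as given, which is exactly the kind of background assumption under discussion.

```latex
\begin{align*}
e^{i(a-b)} &= e^{ia}\, e^{-ib} \\
\cos(a-b) + i\sin(a-b)
  &= (\cos a + i\sin a)(\cos b - i\sin b) \\
  &= \cos a\cos b + \sin a\sin b
     + i\,(\sin a\cos b - \cos a\sin b)
\end{align*}
```

Comparing real parts gives cos(a-b) = cos(a)cos(b) + sin(a)sin(b); the only algebra used in the last step is distributivity and i² = -1.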
It's so straightforward that you've used the exponential function on the complex plane to prove it?! You started by claiming that Pythagoras's theorem follows "directly" from the definition of the dot product. Can you admit that we're now quite far away from that?
To be honest, your comments have been quite low effort. They amount to "yeah but that bit's pretty easy too" while leaving it to me to work out how (and whether) your points fit into a coherent proof. I do get why this stuff all feels so trivial: we usually skip over it in higher-level proofs. But the only reason we can is that we can use nice abstractions like the dot product with its equivalent definitions, and that's thanks to the foundation these lower-level theorems provide.
> It's so straightforward that you've used the exponential function on the complex plane to prove it?! You started by claiming that Pythagoras's theorem follows "directly" from the definition of the dot product. Can you admit that we're now quite far away from that?
It does follow from the definition and common properties of the dot product, which was my original point. But you claimed it was circular because these properties derive from Pythagoras's theorem, and so we've ended up showing they don't need to. And this latter part was obviously much more involved than just “using the dot product”. But that's as if we had to prove that real numbers' multiplication is actually distributive over addition when saying “it follows from a (b+c) = ab + ac”; it's far from trivial if you want to go this far…
> To be honest, your comments have been quite low effort. They amount to "yeah but that bit's pretty easy too" while leaving it to me to work out how (and whether) your points fit into a coherent proof. I do get why this stuff all feels so trivial: we usually skip over it in higher-level proofs. But the only reason we can is that we can use nice abstractions like the dot product with its equivalent definitions, and that's thanks to the foundation these lower-level theorems provide.
I agree with you here, even on the low-effort part. I'm not particularly comfortable writing math on a keyboard, and especially not on an English-speaking forum, because the notations are very different from the ones we use in France.
So you prove something in the 2d space via a 3d space intermezzo? Not very intuitive to me. Distribution on the other hand, can be explained by counting a handful of the same objects.
Indeed I was even taught it in 2d before 3d (and higher).
Even Pythagoras applies to any dimension, although admittedly it doesn't quite fit its usual statement in terms of triangles for higher dimensions: if a vector v has components (v₁, v₂, ...) then its length squared equals v₁² + v₂² + ...
That appears to be constructed in order to deceive though.
Someone who was thinking about a problem and drawing something would, with a drawing like that, always either intend the angle to be the same or otherwise highlight the fact that one triangle is 8/3 and the other is 5/2, so that the slope is obviously not the same.
Good visual proofs simply use lines and figures to talk about actual algebra instead of symbols; but every outcome is still in a sense algebraic -- like the one linked and the popular one about Pythagoras. Once you pull out your ruler and measure, you are obviously lost. Every result should be algebraic, not visual, but it's fine to express the algebra in figures instead of letters.
What do you mean by “end up believing”? Are you insinuating that the shown example isn’t true? It very much is true. The reason it’s true might initially be confusing to you as a viewer because you have a difficult time telling the difference between 3/8 and 2/5 and assume the triangles have identical slopes, but the visual proof very much and truthfully shows that this is not the case.
A similar method is handy for some mental arithmetic involving squares, e.g. it's easy to calculate 1005² because it's 1000² plus two added blocks of 5 x 1000, plus a small 5² block, so 1,010,025. Going the other way, 995² is 1000² minus those same two 5 x 1000 blocks, plus 5², so 990,025.
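The block picture above can be written as a tiny function. This is just a sketch, and the name `square_near` is invented for the example: the square of (base + offset) is the big base² block, two base-by-offset strips, and a small offset² corner.

```python
def square_near(base, offset):
    """Square of (base + offset), assembled from the blocks of the picture."""
    big_block = base * base            # e.g. 1000 x 1000
    two_strips = 2 * offset * base     # two 5 x 1000 strips (negative if offset < 0)
    small_corner = offset * offset     # the 5 x 5 corner, always added
    return big_block + two_strips + small_corner

print(square_near(1000, 5))
print(square_near(1000, -5))
```

With a negative offset the two strips are subtracted, while the small corner is still added, which matches the 995² description.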
As one of the bad at geometry good at algebra people this blows my mind. I cannot even begin to comprehend how this shows, even for these specific boxes, the math works. But I can very clearly feel the relatedness of multiplication which makes the algebra work.
That’s not to say the example is bad, or good, more to marvel at how differently people think.
In the first image on the left, you can see the large square has length and width of a, which would have an area of a*a, or a^2. There is then a little square inside with length and width b, for an area of b^2. Essentially, the little square is getting removed from the big one (a^2 - b^2). In the last image on the right, you can see that the length of one side is (a-b) and the top side is (a+b), which would mean the area is equal to the product of (a-b)(a+b). This means that a^2 – b^2 = (a + b)(a – b). The intermediate steps just show how to move the area around visually
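If it helps to see the same bookkeeping in code, here is a sketch. It assumes one particular way of cutting the L-shaped region (a full-width strip plus a leftover block), which may not be exactly how the diagram does it, and the function names are made up.

```python
from fractions import Fraction

def l_shape_pieces(a, b):
    """Cut the a-by-a square minus the b-by-b corner into two rectangles.

    Returns (width, height) pairs; assumes 0 < b < a, as in the drawing.
    """
    if not 0 < b < a:
        raise ValueError("this sketch only covers 0 < b < a")
    strip = (a, a - b)       # full-width strip next to the removed corner
    leftover = (b, a - b)    # remaining block beside the removed corner, turned on its side
    return strip, leftover

def area(rect):
    width, height = rect
    return width * height

a, b = Fraction(7), Fraction(3)
strip, leftover = l_shape_pieces(a, b)

# Lay the two pieces end to end: widths add up, the height (a - b) is shared.
rearranged = (strip[0] + leftover[0], a - b)

print(a * a - b * b, area(strip) + area(leftover), area(rearranged))
```

All three printed numbers are the same area counted three ways: as a difference of squares, as the two cut pieces, and as the single (a + b) by (a - b) rectangle.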
I think this helps the most. The piece that doesn't "just click" for me is that area = product of multiplication. Manipulating symbols generally just feels more natural to me.
It’s worth rehearsing that instinct. The insight that areas are products is a really useful one for deepening your intuition of what integration is doing, as well as for enhancing your ability to interpret graphs and charts.
Like, if you have a graph showing power consumption over time, it’s great to be able to mentally recognize that, say, if the time units are hours and the power units are Watts, that the area under the graph will be counted in Watt hours; that a rectangle one hour wide by one Watt tall is one Watt hour, and so on.
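A small sketch of that reading of a chart, using invented hourly readings (the numbers are purely illustrative) and a plain trapezoid rule to approximate the area under the curve.

```python
def energy_watt_hours(hours, watts):
    """Area under a power-vs-time curve via the trapezoid rule.

    hours: sample times in hours; watts: power readings in watts.
    Each trapezoid is (average power) x (time step), so the unit is watt hours.
    """
    total = 0.0
    for i in range(len(hours) - 1):
        dt = hours[i + 1] - hours[i]
        total += 0.5 * (watts[i] + watts[i + 1]) * dt
    return total

# Hypothetical readings, one per hour.
hours = [0, 1, 2, 3]
watts = [100, 150, 200, 150]
print(energy_watt_hours(hours, watts))

# A rectangle one hour wide and one watt tall is exactly one watt hour.
print(energy_watt_hours([0, 1], [1, 1]))
```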
Not sure if it helps, but trying: In a sense area isn't anything else than multiplication. It isn't like you have a) multiplication and b) area and then prove that a=b.
Rather, area IS multiplication.
The unit of "square meter" quite literally means "meter multiplied by meter".
I agree and I'd say that area is almost, in some intuitive way, the more basic thing and multiplication follows from that (although I know that's not mathematically true). The definition of multiplication for natural numbers is repeated addition (e.g. 3 x 5 is defined to be 5 + 5 + 5). Many people would see that as the count of a 3 by 5 grid of objects, and that's certainly how we'd explain the commutativity of multiplication in school. If those individual objects happen to be unit squares then you have area.
I would say area is integration, and that, for the simple case of integrating a constant function, equals multiplication of that constant by the length of the interval being integrated over (measure of the set, if you’re doing Lebesgue integration)
Imagine you have 100 square tiles that are each 1cm x 1cm. You can make 20 groups of 5 tiles each with them — this is what it means to say 20x5=100.
Now take each of your groups and arrange its 5 tiles into a vertical line. Each line is now 5cm long and 1cm wide.
Arrange the lines side by side. You now have a rectangle whose height is 5cm and whose width is 20cm. You already know its area 100 cm^2, because you made it out of 100 tiles that were 1 cm^2 each. And now you can see that its area also corresponds to the multiple of its side lengths.
By the fact that the geometric proof in the link wants to prove the formula, but only does so for a small subset of all a,b for which the formula is correct. This makes it a partial proof, at best.
Ok nvm I can't resist wasting my time and typing stuff on the internet again, probably gonna regret it later.
How is it not obvious to the dullest of the dull that this visual proof is not supposed to work for goddamn commutative rings lmao
It's probably not even supposed to work for negative reals, 0 or the case b>a. It's supposed to demonstrate the central idea of the visual proof. Also yes, by choosing suitable ways to interpret the lengths shown in the diagrams it's absolutely possible to extend the proof to all reals but I'm not convinced it's meant to be interpreted like that.
But bringing commutative rings into this... man you're funny
No, this is not correct. WLOG means: I assume one of the possible cases, but the proof works the same way for other cases. But that's not true here. The proof, as shown, only works for a>b>0, it does not work (without extra work or explanation) for a<b. The proof for a<b is similar, but not the same.
[And it certainly does not show it for a,b element of C]
WLOG just means the other cases follow from the one case. There is no implication about how hard it is to get to the other cases, although generally it is easy and you don't bother spelling it out exactly.
This is not what I meant. What is being proved is: a^2-b^2 - (a+b)(a-b) = 0. If you swap a and b you end up with a sign switch on the lhs which is inconsequential.
That is not what the proof proves. The proof proves the equivalence how it was originally stated, and assumes for that b<a.
Your rewriting is of course true for all a,b and might be used in an algebraic proof. But this transformation is not at all shown in the geometric proof.
This is so beautiful! I could have never imagined this. I learnt this formula by rote when I was in school. Didn’t realize that it had a geometric equivalent. Same thing with differentiation and integration. Couldn’t understand. Learnt that too by rote. Is there a geometric equivalent for most formulas if not all? Is there a website?
The notion of “completing the square” is often done as an algebraic trick (“just add and subtract this magic term”).
But it can also be viewed as translating the quadratic expression along the “X-axis” so that (at its new origin) it is left/right symmetric.
That is,
Q(x) = ax^2 + b x + c
With the right substitution, x’ = (x - B), the linear term vanishes. So when you re-write in terms of “x”, you get:
Q (x) = a (x - B)^2 + C
So the intuition is that the linear term in the original quadratic is the thing that shifts the “symmetry axis” of the quadratic.
I have found this helpful when “X” is a vector and you have a quadratic form. In this case, the coordinate shift centers the quadratic “bowl” about some point in R^n.
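A minimal sketch of the shift described above, assuming the standard values B = -b/(2a) and C = c - b^2/(4a) (these are not spelled out in the comment, and the function name is made up for illustration):

```python
from fractions import Fraction

def complete_the_square(a, b, c):
    """Return (B, C) with a*x**2 + b*x + c == a*(x - B)**2 + C, for a != 0."""
    if a == 0:
        raise ValueError("not a quadratic: a must be nonzero")
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    B = -b / (2 * a)          # location of the symmetry axis
    C = c - b * b / (4 * a)   # value of the quadratic on that axis
    return B, C

# Example: 2x^2 - 12x + 7 = 2(x - 3)^2 - 11
B, C = complete_the_square(2, -12, 7)
```

Exact fractions are used so that the shifted form can be compared with the original without rounding noise.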
*
The chain rule for differentiation is another one with simple geometry but cumbersome notation. It’s like: we know[*] that
f(x) = g(h(j(k(x))))
must have a linear approximation about some point x0. The only possible thing it could be is the product of all the little local curve slopes of k, j, h, and g, at the “correct” point in each.
Thinking about little slopes also clarifies derivatives like
f(x) = g(x^3, x^2)
where g is an arbitrary function of two variables.
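A rough numerical sketch of the "product of little slopes" picture, using made-up stand-ins for k, j, h and g (any smooth functions would do). For the two-input case, the slopes along the two paths into g add up:

```python
import math

def slope(f, x, h=1e-6):
    """Central-difference estimate of the local slope f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Hypothetical stand-ins for k, j, h and g (h is called h_fn here
# to avoid clashing with the step size).
k = math.sin
def j(u): return u * u + 1.0
h_fn = math.exp
def g(u): return 3.0 * u

def f(x):
    return g(h_fn(j(k(x))))

def chained_slope(x0):
    """Multiply the little slopes, each taken at the "correct" point."""
    u1 = k(x0)
    u2 = j(u1)
    u3 = h_fn(u2)
    return slope(k, x0) * slope(j, u1) * slope(h_fn, u2) * slope(g, u3)

# Two-input case: f2(x) = g2(x^3, x^2), with a made-up g2.
def g2(u, v):
    return u * math.sin(v)

def f2(x):
    return g2(x**3, x**2)

def summed_slope(x0):
    """Slope of g2 along each input, times that input's own slope, summed."""
    u, v = x0**3, x0**2
    dg_du = slope(lambda s: g2(s, v), u)
    dg_dv = slope(lambda s: g2(u, s), v)
    return dg_du * 3 * x0**2 + dg_dv * 2 * x0
```

Comparing `slope(f, 0.7)` with `chained_slope(0.7)`, and `slope(f2, 0.7)` with `summed_slope(0.7)`, should show agreement up to finite-difference error.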
I have really enjoyed reading some of the better explained pages. I wouldn't recommend most people start on calculus on that site, but since you requested it specifically here's the overview: https://betterexplained.com/guides/calculus/
There is some geometric equivalent for any formula that can be built using some arbitrary combination of the four operations of arithmetic, plus square roots. If you allow for either 3D geometry or origami folds, you can extend this allowing for cube roots too. Note that the geometric equivalent is not going to be necessarily trivial or intuitive in most cases, though.
The geometric “proof” isn’t actually equivalent, because it assumes a > b, and doesn’t generalize geometrically to b > a. The algebraic proof, on the other hand, generalizes at least to commutative rings.
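For what it's worth, the commutative-ring claim can be checked mechanically. A minimal sketch, assuming Lean 4 with Mathlib installed:

```lean
import Mathlib

-- Holds in any commutative ring R; `ring` normalizes both sides.
example {R : Type*} [CommRing R] (a b : R) :
    a ^ 2 - b ^ 2 = (a + b) * (a - b) := by
  ring
```

Commutativity is what makes it work: expanding (a + b)(a - b) gives a^2 - ab + ba - b^2, and the middle terms cancel only when ab = ba.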
Geometric “proofs” like this are neat, but are no real substitute for the algebraic ones. I’d argue that in cases like the present one they also don’t provide any deeper insights. You’re just moving geometric shapes around instead of algebraic symbols. They might give you the feeling that the theorem isn’t as arbitrary as you thought, but it isn’t arbitrary in algebra either.
I’m putting “proof” in quotes here because there are many examples of incorrect geometric “proofs”, and there is generally no formal geometric way to verify their correctness.
> there is generally no formal geometric way to verify their correctness.
There are formal models of synthetic (i.e. axiom-and-proof based) Euclidean geometry where proofs can in fact be verified. This is accomplished by rigorously defining the set of allowed "moves" in the proof and their semantics, much like one would define allowed steps in an algebraic computation.
I'm finding this kind of strange because when I first came across this identity, I couldn't make any sense of it until I'd mentally visualised this exact sequence of images. Now I'm wondering how everyone else did it.
I love this, giving an intuition about something that is usually taught by rote memorization. I also love how it makes all the math nerds and pedants uneasy xD Man like, of course you need to be careful. You need to be careful about unintentional assumptions in logical proofs too. It’s a cool little creative visualization.
Also, it leans on the premises that a x b (measured orthogonally) equals area and that cutting pieces off a surface corresponds to subtraction. Whether those premises are definitions or not is a valid question.
This again opens the question of what multiplication actually means. If you approach it geometrically, the unit is something squared; if you approach it as numbers, it's not that straightforward on the number line.
This same identity can be used to provide geometric intuition as to why i*i must equal -1. This is shown in the diagrams at the bottom of http://gregfjohnson.com/complex/.
Is there by any chance a book that explains what the illustrations mean? I had a quick look and it looks like I might need to know the theorems to begin with?
I think you could make it work for negative numbers. You would just shade the negative area differently to show it. For example, if you subtract 1000 sq ft from 100 sq ft, geometrically it means you are trying to remove more area than you have. So, to represent this, draw a rectangle of 100 sq ft. Conceptually "extend" this area by subtracting 1000 sq ft. The extra 900 sq ft that you try to subtract but can't "exists" as a negative representation. This could be represented by flipping the surplus 900 sq ft into a negative axis or shading it differently to denote that it's a deficit rather than an actual, positive area.
3. Why is scribbling lines on a paper (that look like math to humans) 'more' of a proof than visual diagrams, if at all?
4. If you came across a proof that was persuasive to alien intelligences -- and led them to conclude true things were true and false things were false -- but, alas, you did not understand it, does that make it less of a proof?
An irrefutable demonstration of a conclusion, possibly via sequence of steps or combination of elements.
2. In what context is a "proof" embodied?
Do you mean what the range or domain of the proof is? Not sure about the "embodied". I think you mean the communication of the proof and the base knowledge expected to understand the proof.
3. Why is scribbling lines on a paper (that look like math to humans) 'more' of a proof than visual diagrams, if at all?
You seem to be focusing on the representation of the proof in a particular notation, rather than the actual logic of the proof.
The graphical demonstration leads to false conclusions. For example, if a=0, it implies that a^2 - b^2 is 0 (or it requires some unfamiliar graphical representation of negative areas)
4. If you came across a proof that was persuasive to alien intelligences -- and led them to conclude true things were true and false things were false -- but, alas, you did not understand it, does that make it less of a proof?
Again, the representation is not the proof, it is a means to record or communicate the proof.
If the representation implies that false things are true (e.g., if a==0), then it is not a proof.
- When I said "embodied" I roughly mean the ground rules that someone needs to know to check the proof. In the case of symbolic logic I mean the symbols and the transformation/rewrite rules. But I'm not sure yet how I would formalize the analogous concepts for visual proofs.
- Re: "You seem to be focusing on the representation of the proof in a particular notation, rather than the actual logic of the proof." ... yes, but maybe not necessarily. The "logic" of a visual proof is quite different than the "logic" of a symbolic proof.
We're on the same page, but if I were to take it one level deeper, I would add:
a. While I know what you mean by "irrefutable", I wouldn't use that word, because it sounds too much like "untestable". The whole idea of a proof is that each step can be verified. With a big enough lookup table, a proof can be checked in linear time (right?). If a step does not "obviously" follow then the step is not well-explained (in the "trivial to verify" sense).
b. Your choice of "demonstration" is key here. An essential aspect of a proof is that it is easier to check than generate. Simply "read off" each line of the proof and check against some known set of facts and transformation rules.
c. It is useful to distinguish between a verified proof (which is subject to correction!) and just a proof (which is a form, a way of communicating how something can be verified). See this Stack Overflow page: "Widely accepted mathematical results that were later shown to be wrong?" [1]
- irrefutable is "cannot be denied or disproven" - it means absolutely proven. Very different from untestable (which means you can't verify or prove it).
This is precisely the distinction we've been calling out from the original - it's a demonstration that works for some cases. The diagrams fail as a proof because they can be refuted by the negative/0 cases where they don't work.
- testing or verifying with a set of data is also very different from a proof. This "checking" or "demonstrating" provides some assurance of correctness or utility for the test domain.
- demonstrating and checking against known facts is not sufficient for a proof - "I've tested my division function for millions of positive and negative integers and real numbers! I even verify by multiplying the quotient by the divisor and I've proven it is correct!" did you happen to test a divisor of 0? (dividing by 0 can also invalidate proof attempts that do not exclude 0 as a divisor)
It needs to be proven to hold for all cases, not just a sampling of cases (though it is valid to define the range of a proof. The example could have stated, "this is a proof of the equation for positive values of a and b, and where b < a"; maybe that could constitute a visual proof for that domain?)
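The division example above can be sketched like this. The `divide` function and the sampling scheme are hypothetical; the sampler is deliberately written so that it never draws a zero divisor, which is exactly the blind spot being described:

```python
import math
import random

def divide(n, d):
    """Hypothetical division function under test."""
    return n / d

def spot_check(trials=100_000, seed=0):
    """Multiply the quotient back by the divisor for many sampled inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(-10**6, 10**6)
        d = rng.choice((-1, 1)) * rng.randint(1, 10**6)  # never 0
        if not math.isclose(divide(n, d) * d, n, rel_tol=1e-9, abs_tol=1e-9):
            return False
    return True

def fails_at_zero():
    """The one case the sample never reaches."""
    try:
        divide(1, 0)
    except ZeroDivisionError:
        return True
    return False
```

Every sampled case passes, and the function still breaks on the input the sample never covered. Passing tests give assurance for the tested domain only.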
I would think the point of algebra is to mechanize quantitative reasoning … trying to cast algebra operations in geometry may not be the most productive …
That's a bit ridiculous, isn't it? Also, there are similar magic math proofs designed to lead the onlooker astray by minuscule inaccuracies, and suddenly it follows that 1 = 0.
I'm a little disappointed at the focus here on how it's hard or impossible to visualize when b>a or when either a or b is negative.
That's not the point. To me, it's useful to already know that an algebraic proof of that equation exists, but to see it work out visually. I don't need to see it worked out visually for every single possible value for this to be helpful for understanding.
It also nicely illustrates how algebra and geometry are linked. And that multiplication is geometrically taking you from 1 dimension to 2.
Aren’t they already used in early high-school math books?
They aren’t used later on because, for more complicated expressions, manipulating the algebraic formulas is easier than the geometric method, which can break down completely (e.g. when the powers aren’t integers or are variables).
For example, I think anybody who can visualize what (a+2b)³ or (a+b)⁴ looks like geometrically can also, and more easily, do the expansion algebraically.
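For reference, the algebraic expansions of those two examples, straight from the binomial theorem:

```latex
\begin{align*}
(a+2b)^3 &= a^3 + 6a^2 b + 12ab^2 + 8b^3 \\
(a+b)^4  &= a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4
\end{align*}
```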
> Aren’t they already used in early high-school math books?
No, or at least they were not in the high-school math books I was assigned. There were no explicit referrals in any way to visual representations of the algebra involved.
Visual proofs can be deceptive. For example, you have the famous missing square puzzle, where rearranging pieces of a triangle causes a square to vanish, something that should be impossible.
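A quick arithmetic check of why the puzzle works, assuming the classic 13x5 version (triangles of 5x2 and 8x3 plus two L-shaped pieces of 7 and 8 unit squares). The two small triangles have different slopes, so the "hypotenuse" of the big figure is not a straight line:

```python
from fractions import Fraction

# Slopes (rise over run) of the two triangular pieces and of the outline
# the picture pretends to have. Dimensions assume the classic 13x5 puzzle.
small_slope = Fraction(2, 5)     # 5x2 triangle
large_slope = Fraction(3, 8)     # 8x3 triangle
outline_slope = Fraction(5, 13)  # the apparent 13x5 triangle

# Total area of the four pieces versus the area of a true 13x5 triangle.
pieces_area = Fraction(5 * 2, 2) + Fraction(8 * 3, 2) + 7 + 8
outline_area = Fraction(13 * 5, 2)

print(small_slope, large_slope, outline_slope)
print(pieces_area, outline_area)
```

The pieces cover 32 unit squares while a real 13x5 triangle has area 32.5, so one arrangement bulges slightly inward and the other slightly outward. The gap between the two arrangements is the one "missing" square.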
That doesn't mean rigorous visual proofs are impossible (the one in the article is rigorous), but getting them right can be deceptively hard. It is also common for visual proofs to be less complete: this one only covers the cases where a>0, b>0 and a>b, whereas there are no such limitations when doing it with algebra. I guess you can tweak the visual proof to account for those cases, but it will become far less elegant.
So I understand why teachers avoid visual proofs in math classes; they actually want you to stay away from them. Visual proofs have entertainment value, which is why you see them so much in pop-science, and I think that is important. They also have some historical value, as they were much more common in ancient times. But for actually learning maths, not so much.
That is unless you are studying visual proofs specifically, but I think it is an advanced topic you would tackle well after you can master simple equations like a^2 – b^2 = (a + b)(a – b).
Such a visual argument isn't "formally wrong" because it isn't trying or claiming to be a formal proof at all. "Formally wrong" would mean that formal mathematics is used incorrectly.
>> "Formally wrong" would mean that formal mathematics is used incorrectly.
> No, “formally wrong” means that it fails formal verification.
The word "formal" has meaning in mathematics independent of formal verification. The latter builds upon the former. Agree?
> In the context of hardware and software systems, formal verification is the act of proving or disproving the correctness of a system with respect to a certain formal specification or property, using formal methods of mathematics. - Wikipedia https://en.wikipedia.org/wiki/Formal_verification
Here is how I'm using the terms:
- "Formal" in mathematics refers to rigorous logical reasoning with precise definitions and deductive steps. This is the older, more general meaning.
- "Formal verification" emerged later as a specific term in computer science referring to automated/mechanical verification of system properties. This is now the standard meaning in software/hardware contexts.
So, putting these terms into use... If a human verifies a formal proof "by hand", I think it is fair to say that does not comprise "formal verification". On the other hand, if an automated system verifies the proof, then I would say "formal verification" has happened. Perhaps you will agree this is how many, if not most, experts use the terms.
Are we mostly playing language games -- or is there a key insight you think I don't understand?
You can perform a formal verification of an informal proof, like for example the ones in https://youtu.be/VYQVlVoWoPY linked elsethread. Of course, you have to come up with some formalization of what the informal proof is trying to claim, but it doesn’t mean that what you’re proving wrong must be a formal proof to start with.
It therefore makes sense to me (and represents actual usage of those terms) to say that some informal proof is formally wrong, meaning that translating its reasoning into a formal representation will reveal its incorrectness.
[1] https://www.amazon.com/Proofs-without-Words-Exercises-Classr...
[2] https://en.m.wikipedia.org/wiki/Proof_without_words
[3] https://www.antonellaperucca.net/didactics/proof-without-wor...