Why does e to pi i equal -1? (2015) [video] (youtube.com)
355 points by espeed on Feb 12, 2017 | 137 comments

1) e^x is a function whose derivative is equal to its value.

2) e^ix is a function whose derivative is equal to its value rotated by 90 degrees (ie^ix).

3) As x goes from 0 to pi, the trajectory of e^ix always has a velocity vector perpendicular to its current position. For example, when x = 0, the current position is 1 and the velocity vector is i.

4) So the trajectory is a circular arc of length pi, which ends at -1.
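A quick numerical sketch of steps 1-4 (my own illustration, not from the original comment), using simple Euler integration: stepping z' = iz from z = 1 over an arclength of pi should land near -1.

```python
import cmath

z = 1 + 0j                # step 1: start at 1
n = 100_000
dx = cmath.pi / n
for _ in range(n):
    z += 1j * z * dx      # step 2/3: velocity = i * position
print(z)                  # close to -1 + 0j after arclength pi
```

The small residual error is just the first-order integrator; shrinking dx tightens it.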

In a linear algebra course I helped teach last semester, I had the students go through the exercise of writing down the matrix for "multiplication by i," where the complex numbers are thought of as a two-dimensional vector space with {1,i} as a basis. Then I asked them to recognize the matrix (it's a 90-degree rotation of the plane), and then I asked them for an interpretation of that matrix squared (a 180-degree rotation). One thing to draw from this: for the real line, negation doesn't reflect the line, but instead it's a rotation through another dimension of numbers!
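The exercise can be sketched in code (a quick illustration, not from the course materials):

```python
import numpy as np

# i*(x + i*y) = -y + i*x, so in the basis {1, i} "multiplication by i"
# sends (x, y) to (-y, x): a 90-degree rotation matrix.
J = np.array([[0, -1],
              [1,  0]])
print(J @ J)              # J squared is a 180-degree rotation, i.e. -identity
```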

When we went on to differential equations, I tried to convince them that e^(ix) is meaningful using your #2: it is a path around the unit circle since it's the solution to z'=iz, and we already agreed multiplication by i is a 90 degree rotation, and circles are what you get when velocity is orthogonal to vector position. It's a bit tougher to convince someone that x is arclength, though, especially when they aren't too comfortable with complex numbers yet.

I love this answer.

For some reason, I've been dreaming about negative numbers lately. I think they deserve their own number set notation.

The standard IEEE floating point format uses a 'sign and magnitude' system: negative numbers are basically made by flipping a single bit. I've been playing around with a number system that expresses real numbers as two's complement (like how integers work) - xor all your bits and then add one. In this system, -0 is the same as 0, and the bit pattern that is weird for signed integers, 0b100000...0000, is infinity.

Now, there's an exponent and a fraction, too. I've been playing around with how to do these numbers with logic gates (and verilog!) and you can either two's complement the whole thing and work with absolute values, or you can keep the fraction part as a two's complement...

So I just redid multiplication using two's complemented fractions! And addition/subtraction too, which for floating points in general is significantly harder than multiplication. The nice thing about two's complemented floating points is you don't need separate algorithms for addition and subtraction; you can just do everything with one algorithm.
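The floating-point variant described above is the commenter's own design, but the integer rule it builds on (xor all the bits, then add one) is easy to sketch; the function name here is mine:

```python
def twos_complement_negate(x, bits=8):
    mask = (1 << bits) - 1
    return ((x ^ mask) + 1) & mask   # xor all the bits, then add one

print(twos_complement_negate(5))           # 251, i.e. -5 in 8 bits (256 - 5)
print(twos_complement_negate(0))           # 0: negating zero gives zero back
print(twos_complement_negate(0b10000000))  # 128: the "weird" pattern negates to itself
```

Note how 0b10000000 is its own negation, which is exactly the bit pattern repurposed above.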

Have you seen "Stanford Seminar: Beyond Floating Point: Next Generation Computer Arithmetic"?


Well, I'm the guy in the black shirt who did the demo.

If you liked that lecture, I'm already starting on some verilog implementations.

This is an example multiplication 8 bit * 8 bit -> 16 bit unpacked (20 bits). It differs from standard floating point in that the fractions are stored as two's complement. It takes a little bit of wrapping your head around, but the hidden bit for negative numbers is actually -2 ! Moment of zen.


Way cool -- I realized that when I checked out your profile just after posting the comment -- what timing -- I discovered the video a few days ago when looking for precise/compact interval representations. Interesting work indeed.

Has there been much traction for getting major chip manufacturers to implement this? I know they're all looking for the next big thing and Intel is working on specialized neuromorphic chips. A general "drop-in" replacement for floating point seems like an opportunity for a general-purpose win from low-hanging fruit:

Intel Gets Serious About Neuromorphic, Cognitive Computing Future https://news.ycombinator.com/item?id=13623846

Well, seeing as how John invented these numbers literally two months ago, I haven't seen any traction yet! But I am pursuing fundraising opportunities. In the demo I showed how you can effectively reduce the bitwidth to 8 bits and still train in a very trivial machine learning exercise. I'm currently enrolled in the Udacity machine learning class and implementing everything in parallel in Julia so that I can try more complicated architectures using posits.

I do have a hardware architecture in mind for how to very effectively and efficiently execute machine learning calculations using posits.

I’m interested in representing angles / points on the circle; 3d unit vectors / points on the sphere; unit quaternions / points on the 3-sphere using 1, 2, or 3 relatively low-resolution posits, under stereographic projection.

How efficient do you think regular C or GPU code (on existing hardware) can be made for compressing a 32-bit float to e.g. a 16-bit posit, and for expanding the posit back into a 32-bit float?

I don't think it can be made that efficient in software. Is there a particular reason why you need 16 bits? A 32-bit float is going to be better than a 16-bit posit almost always (posits are better than equivalently sized floats, but they're not that good) and once it's in posit representation do you have a way of doing mathematical operations on them?

This is just for data compression, not for computation directly. 32 bits is often overkill for transmission/storage of rotations, unit vectors, geographical locations, unit quaternions, and the like. Depending on the use case 8 bits might be enough, or 12, or 16.

To actually do computation I would convert the posits back into 32-bit floats (or e.g. in the Javascript case, 64-bit floats), and then take the inverse stereographic projection.

[Stereographic projection is extremely cheap; for each data point it only requires one division and some additions and multiplications.]

I’ll do some experimenting at some point.

Posits look too good to be true! Is there a reason why regime bits do not include the sign bit? Then both "0 0001" and "1 1110" could be interpreted as 4 regime bits. Even better we could include the last flipped bit as well and we would have 5 bits.

Edit: Well. I see it would result in losing the values 0 and 1. Another question: Since it is a fixed length of 4 bits (for N=32), why don't we just extract the 4-bit value? Then we could represent 2^4 regimes, this time without losing 0 and 1.

I wouldn't screw around too much with the sign bit. The way it's laid out is really kind of cool... Negation is simple two's complement.

In my software posit library (which is intentionally strictly binary and not backended by IEEE floats, https://github.com/interplanetary-robot/SigmoidNumbers), I did everything by first inverting negative numbers and doing the decode in the positive domain.

As I design the hardware, it's actually better to NOT do a two's complement inversion to do the decode, and keep the fraction as two's complement!

Also the 4 bit posit was just a simplification to help you understand the structure from a constructive point of view. posits can be of arbitrary length; they have a property I call isomorphic - so appending zeros exactly preserves the value of a short posit when increased in length; conversely, rounding a long posit to a shorter one reports the "nearest representable value".

You need to be a lot more explicit about what you’re asking. Look at the slides, searching down for “At nbits = 5, fraction bits appear”. Notice that every possible bit pattern is used and meaningful.


@espeed didn't ask if you contributed significantly to the lecture, they asked if you'd seen it. Please stick to the question!

If you are unsure why you were downvoted, I strongly suspect it's because politely giving and receiving due credit are important, knowing that one's interlocutor was involved in specific research is very useful information (to know what questions to ask), and it's probably kind of rude to call someone to account in this manner. Hope this helps :)

I was being sarcastic on a whim, fully aware that it would attract downvotes.

Speaking of number sets.

I never understood why complex numbers are considered one. I mean, yes they "are" a set, but besides that they are completely different.

All number sets I learned about did fill some gaps in one dimension, but complex numbers somehow added a new dimension.

Like real numbers stood in an entirely different context to rational numbers than complex numbers stood to real numbers.

You are mixing up a couple of things. This "dimension" you're talking about is probably the dimension of a vector space. The real numbers, as a vector space over the field of real numbers (recall that vector spaces are defined over fields of numbers), have dimension 1.

However, you can consider the vector space of real numbers over the field of rational numbers. What is the dimension of this vector space? Well, the dimension of a (finite-dimensional) vector space is given by the number of elements in a basis for it. So how do we go about finding a basis for R over Q? A basis for R over R consists solely of the number "1", since given a real number "x" there exists an element from the field (in this case R), namely x itself, such that multiplying it by 1 will give you the number x. Yes, this sounds obvious, but that's how you prove that R over R has dimension 1. So, going back to R over Q, we see that "1" cannot be a basis, since if we pick, say, pi, there is no rational that we multiply 1 by to give pi. Moreover, there cannot be any finite number of real numbers that would make up a basis for R over Q, since then R would be countable (look this up if you don't know what it is). So the dimensionality of a vector space depends on which field you're considering.

By the way, the vector space of complex numbers over the reals has dimension 2 (a basis is {1, i}), but over the complex numbers it has dimension 1. So, there's an "extra" dimension only if you consider it over the reals.

C (complex numbers) is a field, just like R (real numbers). It's an algebraically closed field, unlike R, so in some sense it's actually the most natural set to call "numbers".


I'm not trying to argue for R being more a set than C.

In my head a one dimensional thing like R is fundamentally different from a multidimensional thing like C.

Well I was answering the question about why is C considered numbers - it's a field, hence elements of C behave exactly like numbers. Moreover, it's an 'algebraic closure' of R, hence a more natural choice for "all numbers", and it's a maximal one at that (i.e. quaternions and such are no longer fields and don't really behave like numbers, while C still does).

Edit: FWIW, I consider the terminology of 'real' vs 'imaginary' completely stupid and misleading. This terminology didn't really make it to other languages, e.g. in Russian it's 'material' vs 'complex' numbers, but they don't use the term 'imaginary'.

We do: вещественная часть и мнимая часть.

Oh yeah, forgot about that. Well, then Russian terms are as stupid.

What do you mean by “number set notation”?

This isn't a proof, and while that may be obvious to you, it's not obvious to everyone who sees it.

There exists no logical path to the statement from the previous definition of e^x because, up to this point, e^x is a function on real numbers. e^ix is nonsense until you define e^x in complex coordinates, at which point the proof needs to rely on properties of that definition.

what you have posted misses the nature of the insight.

If two analytic functions on the complex numbers agree on uncountably infinitely many points, then they agree everywhere they are defined. This means that there is a unique analytic function that maps (ix)[x \in R] to the circle and (x)[x \in R] to the natural exponential, and they are the same complex function. Without the knowledge that analytic continuations are unique, the statement is entirely (pun intended) meaningless.

Sure, step 1 relies on the existence of e^z which is not obvious. Step 2 relies on the chain rule over C, step 4 relies on having a unique solution to a differential equation, etc. My comment can be expanded to a proof, but it was more aimed to explain why Euler's identity is natural.

step 1 isn't "not obvious"; it's nonsense. What, exactly, are you trying to prove exists? e^z? e^z is undefined at this point. If you're trying to prove that e^i\pi=-1, you need to start with a definition of e^z.

I think you should explain step 2 in more detail.

When I was a physics TA, I used similar arguments with double derivatives to show the connection between imaginary exponentials and sines/cosines.

For f(x)=e^(kx), double derivative d^2f/dx^2 = k^2 f(x)

Meanwhile for g(x)=sin(kx), d^2g/dx^2 = -k^2 g(x), and similarly for cosine.

So if k is imaginary, from a differential equations point of view, the exponential behaves exactly like a sine or cosine.

That shows the general idea, and further consideration of boundary conditions gives e^ix = cos(x) + i sin(x).
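A numerical spot-check of that boundary-condition result (my own sketch, using Python's cmath):

```python
import cmath, math

for x in [0.0, 1.0, math.pi / 2, math.pi]:
    lhs = cmath.exp(1j * x)
    rhs = math.cos(x) + 1j * math.sin(x)
    print(x, abs(lhs - rhs))    # difference is ~0 at every sample point
```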

Minor wording: e^x is a function whose derivative is equal to itself, or the value of whose derivative is equal to its own value. This might be considered the kind of excessive precision that obscures rather than clarifies (as would be, for example, writing x ↦ e^x in place of just e^x), but lots and lots of confusion can result (especially in a function-analytic setting) from failing to distinguish between a function and its values.

Is this just the contents of the video in text form?

In any case, thanks! I don't need to watch the video now, as you've very clearly explained it in only four lines of text!

I can't tell if you are being sarcastic, but no, parent's comment is not the contents of the video. The video provides a geometric construction, and makes no use of calculus.

Not being sarcastic, I just didn't watch the video.

I was wondering if it was basically the same thing with nice animations...

It's not really any clearer to me. I cannot generalize my understanding of exponents to anything that deals with imaginary numbers.

For me what made it click is realizing that complex numbers are 2D matrices: z = x + i y = [[x -y][y x]].

So really we should be writing z = x * [[1 0][0 1]] + y * [[0 -1][1 0]], but since it's tedious we just call 1 == [[1 0] [0 1]] the 2x2 identity matrix and i == [[0 -1][1 0]], and check that i^2 = -1.

Then no more magical i number, the complex product can be derived from the matrix product, the exponential becomes the 2x2 matrix exponential, and so on.
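To connect this view with e^(i pi) = -1: exponentiating pi times the matrix playing the role of i should give minus the identity. A sketch using a hand-rolled Taylor-series matrix exponential (a stand-in for e.g. scipy.linalg.expm, and fine for this small, well-behaved input):

```python
import numpy as np

def expm(A, terms=40):
    """Taylor-series matrix exponential: I + A + A^2/2! + ..."""
    result = np.eye(len(A))
    term = np.eye(len(A))
    for k in range(1, terms):
        term = term @ A / k      # accumulates A^k / k!
        result = result + term
    return result

i_matrix = np.array([[0.0, -1.0],
                     [1.0,  0.0]])      # the 2x2 matrix playing the role of i
print(expm(np.pi * i_matrix))           # approximately [[-1, 0], [0, -1]]
```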

Using a matrix as an exponent isn't any more comprehensible than an imaginary number to me. If it works for you, that's great, but it's not much help to me.

Think of exponentiation of some number 'a' as in-between its integer powers: 'a^1.5' is kind-of half-way between 'a' and 'a^2'.

If you plot all the integer powers of 'a', they all belong to a curve and the exponential simply fills-in the gaps for non-integer exponents.

Now, there are many possible ways to fill the gaps but the exponential does it so that a^m * a^n = a^{m+n} holds even for non-integer numbers m and n.

Similarly, if you take integer powers of a complex number, they all lie on some curve and the exponential fills-in the gaps, again turning sums into products. The same works with matrices, and so on.
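A sketch of that "gap filling" with a complex base, defining a^x as exp(x log a) with the principal branch (the base and names here are my own, for illustration):

```python
import cmath

a = 0.8 + 0.6j                          # an arbitrary complex base on the unit circle

def power(x):
    return cmath.exp(x * cmath.log(a))  # principal branch fills in non-integer exponents

print(abs(power(3) - a * a * a))                  # ~0: agrees with repeated multiplication
print(abs(power(1.5) * power(2.5) - power(4.0)))  # ~0: a^m * a^n = a^(m+n) still holds
```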

First time I understood this.

Thank you for that.

You're welcome :-)

To elaborate:

The complex plane can be thought of as being made up of two-dimensional geometric transformations consisting of rotation and scaling (Tristan Needham calls such transformations “amplitwists”), with 1 as the identity transformation, and i as a quarter turn anticlockwise, and i^2 = –1 as a half turn. To compose two such transformations, you multiply the scales and add the angle measures of the rotations. Because such operations are linear, you can also break them into a part parallel to 1 and a part perpendicular to 1 (some multiple of i), and multiply two such transformations component-wise, using the distributive law (a + bi)(c + di) = (ac - bd) + (ad + bc)i.

exp z is a complex function which maps (in an angle-preserving way, i.e. conformally) from an infinite two-ended cylinder to a whole plane minus one point (the “origin”). The exp function maps negative infinity on the cylinder to the origin on the plane, and it maps the zero point on the cylinder to a given “unit” point in the plane, and the “zero” circular slice through that point on the cylinder to the “unit circle” on the plane, containing the unit point and concentric with the origin. The coordinate system on the cylinder has 2πi measuring one loop around a circular slice, and 1 pointed along the cylinder axis. The coordinate system in the plane is the customary square grid. Addition of coordinates in the geometry of the cylinder (if you like, rotating and/or sliding the cylinder) corresponds to multiplication of complex numbers (composition of amplitwists) in the plane. That is, exp(w + z) = (exp w)(exp z).

In particular, exp iy for some real number y maps points at distance y along the zero circle in the cylinder to points on the unit circle in the complex plane at a proportional distance around the circle; that is, to rotation operators which correspond to the given angle measure in radians.

So πi is halfway around the zero slice in the cylinder, and exp maps it to the operator in the complex plane corresponding to a half-turn rotation, i.e. exp πi = –1.

The log function is the inverse map, from the plane to the cylinder; it is a multi-valued function because we can make multiple “straight” helical connections between arbitrary points on the cylinder, which wrap around different numbers of times.

Once we have this general concept for how we want the exp map to work, we can work out the details to find that the unique such function is the solution to a particular differential equation f'(z) = f(z), or alternately the Taylor series we are all familiar with, exp z = 1 + z + z^2/2 + z^3/6 + ...
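The mapping properties described above can be spot-checked numerically (a sketch, not from the original comment): addition on the cylinder corresponds to multiplication in the plane, one full loop of 2*pi*i returns to the same point, and halfway around the zero slice lands on -1.

```python
import cmath

z, w = 0.3 + 0.7j, -1.1 + 2.0j          # arbitrary sample points
print(abs(cmath.exp(z + w) - cmath.exp(z) * cmath.exp(w)))  # ~0: addition -> multiplication
print(abs(cmath.exp(z + 2j * cmath.pi) - cmath.exp(z)))     # ~0: one loop around the cylinder
print(cmath.exp(1j * cmath.pi))                             # approximately -1
```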

This explanation strikes me as a little too aggressive in throwing out the notation with the bathwater, only reaching its result by redefining the terms we already have intuition for into space-stretching operations that don't work like arithmetic does in my head.

In that sense I think I have the same problem with this proof that I do with the standard one, where you add the Maclaurin series of cos(θ) and i * sin(θ) and match term-by-term with the series for e^(i * θ). The problem is, at the point you can actually show equality, the things on one side aren't obviously a rotation and the things on the other side aren't obviously an exponential.

I'm not just here to yell at clouds. I was given a proof that I truly love by a professor I adore, which I think really does give insight into what all these operators are doing. The best video I can find with it is here:

https://www.youtube.com/watch?v=-dhHrg-KbJ0 (Skip to 7:30 if you're already comfortable with the limit definition of e^x)

The basic summary is:

1) e^iθ is equal to (1 + iθ/n) ^ n for large n

2) That base, (1 + iθ/n), plotted as a complex number, has length approaching 1, angle approaching θ/n

3) The base squared, (1 + iθ/n)^2, by De Moivre's theorem, forms another point as if the transformation from (0, 1) were repeated twice — that is, the length stays one, and another tiny angle is added for a total of 2θ/n

4) The full result is therefore n transformations, taking the path along the unit circle, traveling θ and arriving at cos(θ) + i * sin(θ)
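The four steps above can be checked numerically for θ = pi (a quick sketch of mine): (1 + i pi/n)^n marches around the unit circle and approaches -1 as n grows.

```python
import math

for n in [10, 100, 10_000, 1_000_000]:
    z = (1 + 1j * math.pi / n) ** n
    print(n, z)           # approaches -1 + 0j as n grows
```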

Your (1) is just one of many possible (technically equivalent) definitions for the exponential function. And arguably not the most basic/natural/intuitive one.

The primary motivation for the exponential function is to be the inverse of the logarithm function. And the motivation of logarithms is to convert multiplication problems to addition problems, so they could be solved with table lookups (later performed on a slide rule) instead of difficult arithmetic. That is, log ab = log a + log b. Which is to say, exp(c + d) = (exp c)(exp d). Just setting this constraint along with the derivative exp’ 0 = 1 is enough to characterize the exponential function.

Or you can get to this function in many other ways, e.g. by solving the differential equation d/dx exp x = exp x; by the series 1 + x + x^2/2 + x^3/6 + ...; or by defining the logarithm as the definite integral of 1/x starting at 1, and then taking the exponential function to be its inverse. I like defining the (complex) exponential as a conformal mapping between the cylinder and the plane minus a point.

Of course! I think there's a good reason (1) is the first definition of e^x we get in school, though. I don't think it's the most "natural" definition, but I think it's the most grok-able. It's easy for high-schoolers to learn precisely _because_ each part of it is so intimately familiar that it's clear without hand-waving precisely what everyone's talking about.

When we then step into viewing real exponentials as "The thing that gets bigger at a rate proportional to where we're at", it's un-abstract, which is honestly important for intuition. Of course you're going to want to then view it with the calculus definition, or as a matrix, or as an operator... But I wouldn't start there.

So I feel the same about the complex version. There's another nice proof in this thread that starts with the calculus definition and follows a similar track. It's a much more beautiful proof! But I worry some will lose intuition when we start the conversation with "a complex exponential's rate of change is perpendicular to its position"...

That said, if I were to ever teach a math class on complex numbers, I think fleshed-out versions of these two 4-step proofs are the ones that together best build intuition for what "rotation in complex exponentials" really means. That rotational aspect is the foundation of much of the handwaving done in control theory, DSP, ASP, signals and systems, and really anything involving fourier and laplace transforms. So, it's important to grok it well.

Anything which repeats on some periodic interval (e.g. a signal which repeats in time) can be represented geometrically as being defined with respect to uniform motion around a circle (sometimes there are physical circles involved, and sometimes it’s just an abstract circle).

Complex numbers (a two-part complex of a scalar part and a bivector part) and the complex logarithm/exponential are the natural formalism to use for describing uniform circular motion, and are therefore the natural formalism for any kind of periodic signal.

The way to teach this is to start with vectors in the plane, and then teach about geometric products/quotients of vectors (this is a subject called “geometric algebra” or “Clifford algebra”, and the basics are plenty accessible to high school students). All of the mystery is removed from complex numbers when they are taught this way.


Thanks for the pointer to the video ("E to the Pi for Dummies" by Mathologer). It is definitely worth watching. Much more understandable, imho, than the one referenced at the top of this thread.

Your text summary of the video is accurate and likely makes sense to some people but not so much for others (namely, me).

Your words are hard to grok until you see the visual depiction --- until you see the sequence of n transformations become a spiral arrangement of triangles that ends up approximating the (-1, 0i) point in the complex plane.

I agree, the suggestion that anyone could/would be able to just throw out everything they know about arithmetic is sketchy at best, and even it's sketchier to make it seem as though a result like Euler's formula could purely be derived from geometric intuition. Sure, it "makes sense" in the video, but I don't think it's possible to truly grok and appreciate the result without having done any calculus. The real joy of math isn't in knowing, it's in proving.

Anyway: as for proof methods, I like the Maclaurin series trick with this one. Too bad the internet hasn't come up with a universal/nice way of writing math notation in a readable format yet, I'd like to sketch that proof here...

The Maclaurin trick is what I'd call the "standard proof", and I think it's the source of a lot of the lack of intuition surrounding e^i... It's exactly what you call it, a "trick", where you prove two rabbits are identical by transforming them both into an infinite number of hats.

On the one hand, it's all sound logic and algebra, but on the other hand, I certainly don't blame Randall Munroe for seeing that proof in high school and going on to say "I have never been totally satisfied by the explanations for why e to the ix gives a sinusoidal wave."[0]

[0] https://xkcd.com/179/

Love that xkcd, but I found the Maclaurin "trick" to be extremely common-sensical... If you're already finding the area of infinitely many small boxes under a curve then Taylor series don't seem that weird. Once you are convinced that Taylor series are O.K. then this proof is really cool. I also think it's weird to show high-schoolers this, as about 1% of high school seniors have the mathematical knowledge needed to grok this level of math.

Anyways - people have differing levels of magic tolerance, and some people don't have a lot of background in calc, so I don't blame them for looking for other explanations.

I came here to post that video by Mathloger, it's very well explained.

It's a nice video but it goes so fast I don't think people who don't already understand this stuff will get it.

"Young man, in mathematics you don't understand things. You just get used to them."

I don't really like this, because it's a self limiting quote that leads to complacency over real understanding. Especially in the case of imaginary numbers and e where intuitive understanding only doesn't exist because mathematics has a history of being poorly taught.

I think there is both a good and bad aspect of this quote, you've done a great job highlighting the bad.

The good, is in essence that you don't always need to be fully comfortable and in fact will not always be comfortable. I find in mathematics if you try to always strive for total comfort you will never progress, as a lot of the comfort comes from advancing past a topic and upon revisiting it you realize you understand the fundamentals better than you thought.

Totally agree. My father would always tell me that what's important is to truly understand things and not just be able to do the problem. This had two bad side effects: 1/ I just read the course and didn't do the exercises, and 2/ I "blocked" on things like infinitesimal calculus, because I couldn't get my head around its true meaning.

Re: knowing vs understanding, have you read "Richard Feynman on education in Brazil"?


In every maths subject at uni, I'd do quite well in the exams without fully 'getting it'. Then the next semester, in the next subject along, I suddenly understood it, as if the semester break had given things a chance to arrange themselves, and the context of the more advanced subject made things clear.

Agreed, but isn't that irrelevant here? e^i*pi has an intuitive meaning, so we shouldn't be content with our lack of understanding.

Honestly, after spending months studying the subject, I don't think it's really possible to "get" complex numbers. I just view them as affine transformations written in an unusual notation. I don't think they make sense as anything but a recontextualization of R^2.

So what don't you get? Are you implying there is something more to complex numbers?

What I don't get is why someone would bother with complex numbers and their silly notation when linear algebra would work perfectly fine. I know that in certain contexts they are useful. E.g. to avoid losing information when solving polynomials. But that's a minuscule fraction of their range of application. Nearly every practical use I've seen of complex numbers just use them as a vector representation.

Well the two uses I'm most familiar with are AC circuit analysis and quantum mechanics. They can both be reformulated without complex numbers of course, since nothing is special about i.

Yet the complex versions are a lot easier to work with, because even in manifestly real formulations, the complex structure is still there, but in disguise:

- http://www.scottaaronson.com/democritus/lec9.html

- http://physics.stackexchange.com/questions/32422/qm-without-...

The problem with that is that complex numbers initially emerge as roots of polynomials with real coefficients. Getting to affine transformations from there seems a much bigger leap than asserting there is a square root of -1.

The video linked in this post will only make sense if you accept that x e^(theta i) and x k are respectively the rotation part and the scaling part of a linear transformation of x. I'm not aware of any other way to intuitively grasp an expression like e^(pi i).
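That rotation/scaling reading can be checked numerically (my own sketch): multiplying x by k e^(iθ) scales its length by k and rotates it by θ.

```python
import cmath, math

x = 2 + 1j                          # an arbitrary point in the plane
theta, k = math.pi / 3, 1.5
y = x * k * cmath.exp(1j * theta)   # scale by k, rotate by theta

print(abs(y), k * abs(x))                 # lengths match: pure scaling by k
print(cmath.phase(y) - cmath.phase(x))    # angle change equals theta (pi/3)
```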

this is an old comment, but e^z (over C) can be defined as the analytic continuation of e^x (over R). This is a much more interesting definition, since it depends on the fact that that function is unique.

Mild snarkiness aside, is the quote really inaccurate?

"Getting used to" [something] has a pejorative sense, but it also just means "becoming familiar with," and really understanding something is, in a way, simply being so familiar with it that reasoning about it is second nature... at some point, things just sort of start to make sense...

Not quite, you can get used to something without understanding it at all. Many people that fly are used to it but they don't understand the basic physics of flight at all.

And who would you prefer in the pilot's seat?

To suggest that you don't understand something without a formal underlying theory is one view, but not mine. I feel like I understand English (as in a deep understanding, not just the ability to interpret sentences) without anything like a set of rules, and everything like a set of experiences akin to a pilot's experience with flight.

What does that have to do with anything I said? I'm pointing out the difference between "understanding" and "getting used to".

I'm refuting that difference.

Not very well. I'm not talking about pilots.

I fly with many frequent fliers that are completely comfortable with flying while having no technical understanding of the mechanics.

My mistake. I misread the phrase "many people that fly" as "many people who pilot planes".

Mathologer has a video that I found easier to understand, YMMV


> goes so fast

Especially goes too fast right when it becomes less obvious.

See the "Essence of Linear Algebra" (visualized) by the same author -- it should give you more intuition into the nature of the transformations:

Essence of Linear Algebra https://www.youtube.com/watch?v=kjBOesZCoqc&list=PLZHQObOWTQ...

Like when he presents equations that he said we wouldn't need...

See MIT's "Big Picture of Calculus" video:


Yeah, he's gotten much better about that since this video was released.

The most elementary explanation I ever got for why e^(i pi) = -1 is from the definition

e = lim_{n -> infinity} (1 + 1/n)^n

First, we can generalize this to

e^z = lim_{n -> infinity} (1 + 1/n)^(z n) = lim_{m -> infinity} (1 + z/m)^m

using the substitution m = zn. Therefore,

e^(i pi) = lim_{m -> infinity} (1 + i pi/m)^m

Now, converting the complex number (1 + i pi/m) in terms of polar coordinates (r, theta) yields

r = sqrt(1 + pi^2/m^2) ~ 1

theta = tan^{-1}(pi/m) ~ pi/m

Since the product of two complex numbers is

(r, theta) (r', theta') = (r r', theta + theta'),

we have

(1 + i pi/m)^m ~ (r^m, m * pi/m) -> (1, pi) = -1, since r^m = (1 + pi^2/m^2)^(m/2) -> 1 as m -> infinity.
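A quick numeric check of this limit (my own sketch): r^m -> 1 and m*theta -> pi as m grows, so the product lands on (1, pi) in polar coordinates, i.e. -1.

```python
import math

for m in [100, 10_000, 1_000_000]:
    r = math.sqrt(1 + math.pi**2 / m**2)
    theta = math.atan(math.pi / m)
    print(m, r**m, m * theta)   # r^m -> 1 and m*theta -> pi
```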

I love the Khan Academy video on this:


Sal's enthusiasm at the end is contagious!

I've learned so much from Khan Academy – not just whatever is explained by the video I'm watching, but different ways of educating people as well. In fact, I helped out with tutoring basic math a while back and I struggled at first because I didn't realize how much I wasn't teaching simply because it was second nature to me. After watching a bunch of Khan Academy videos on the same topics, I found different ways of explaining things, and was much more successful. Khan Academy has really challenged my own way of thinking about many things, math in particular.

It's a great resource, and while I don't use Khan Academy much these days I still donate $100 every year.

Is it pure chance that this was posted only two or three days after I watched it along with a few other e to pi i = -1 videos?

Randomness aside. 3blue1brown makes some wonderful math videos that I find really explain the intuitiveness of some of the ideas. I was unfortunately cursed with a math teacher who for whatever reason required us to memorize until we passed the test. Imaginary numbers were taught as "something that will help you in college"

Richard Feynman used to go up to people all the time and he'd say "You won't believe what happened to me today... you won't believe what happened to me" and people would say "What?" and he'd say "Absolutely nothing".

I agree wholeheartedly with the quality assessment of 3blue1brown videos. All mathematics should be clear and intuitive, by definition :)

> All mathematics should be clear and intuitive, by definition :)

I disagree with this completely, and so does all of higher mathematics. It's neither clear nor intuitive. In fact, as soon as you start learning about infinities (Calc I), intuition becomes hit and miss.

Intuition always comes after the feeling of awe or confusion, in my opinion. Really depends on how you learn.

No, another link to 3Blue1Brown was posted earlier this week and hit the front page. My basic assumption is that when a topic is posted, people will explore that topic and, if it's interesting enough, post further content about that topic.

If you prefer slides over a video, here is a slideshow I wrote: https://docs.google.com/presentation/d/1oMNjkDp-LieSGnZEwNpc...

Or you might prefer Better Explained's article: https://betterexplained.com/articles/intuitive-understanding...

Upvoted for the Better Explained link. I always liked their explanation of e https://betterexplained.com/articles/an-intuitive-guide-to-e...

In Lie theory, given a Lie group, there's a general notion of an "exponential function", which maps elements in the tangent space to the identity to "full" elements of the group.

In this case, our group is the unit circle in the complex plane. This is not circular logic, by the way. If a, b are complex numbers on this circle, then |a| = |b| = 1 and so |ab| = |a||b| = 1.

The identity of the group is the complex number 1 (with plane coordinates (1,0)). So the tangent space to the identity is the vertical line 1 + it, for t a real-valued parameter. In two-dimensional coordinates, that expression looks like (1,0) + t * (0,1).

(To be completely precise, the tangent space is a vector space, not a line displaced from the origin. In particular, the tangent space must contain a zero vector.)

If v is an element of this tangent space, then in Lie theory the exponential of v, exp(v), is defined to be g(1), where g is the unique geodesic on the circle (passing through the identity element 1) whose velocity at time 0 is v.

Visually, this is sort of like taking the tangent vector v = ti, placing its base at 1, then wrapping it around the circle and marking where its endpoint ends up. If the vector is v = pi*i, then it has length pi, so it will end up demarcating an arc length of pi on the circle. Since we're working with the unit circle, this takes us straight to (-1,0).

I'm still leaving a lot out, of course -- most importantly why this notion of exponential has anything to do with the ordinary one.
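A minimal sketch of that wrapping picture (the helper name here is made up): the geodesic through 1 with initial velocity t*i is g(s) = cos(ts) + i sin(ts), and exp of the tangent vector is g(1).

```python
import cmath
import math

# Lie-group exponential on the unit circle: wrap a tangent vector of
# length t around the circle, starting at the identity 1
def circle_exp(t):
    return complex(math.cos(t), math.sin(t))

w = circle_exp(math.pi)  # a tangent vector of length pi lands at -1

# sanity check: it agrees with the ordinary complex exponential e^(it)
gap = abs(circle_exp(1.234) - cmath.exp(1.234j))
```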

I was about to write a similar story, but then I realized you need a Riemannian metric (and exponential) for everything arclength-related.

IIRC since the unit circle is a compact Lie group, there is a bi-invariant Riemannian metric whose exponential is the Lie group exponential and we land back on our feet, but the Lie group structure alone is not sufficient.

e^τi = 1.

In words, this equation makes the following fundamental observation:

The complex exponential of the circle constant is unity.

Geometrically, multiplying by e^iθ corresponds to rotating a complex number by an angle θ in the complex plane, which suggests a second interpretation of Euler’s identity:

A rotation by one turn is 1.

The Tau Manifesto http://tauday.com/tau-manifesto#sec-euler_s_identity

Thanks for posting this. I recently saw another video about this equation than the one posted here and thought "maybe this is one of those cases where tau doesn't work out nicely", but didn't have the intuition to see if it was or not.

This seemed like voodoo until I took signals and systems. Then it's just a normal observation about vector sums on the complex unit circle.

It seems like voodoo when you have high-school mathematics, because you've been taught about "to the power of" in terms of multiplying something by itself a certain number of times, and multiplying something by itself an imaginary number of times is nonsensical.

Once you've done enough maths you think of it in a different way and e^ipi seems obvious.

It's just a different conception of what the symbols mean, I suppose.

There are so many assumptions that were hammered into me in high school (and below) that I find hold back my thinking, rather than help me. I kind of feel like the education I received was full of hacks to get me past whatever test was coming up, rather than truly teaching me. I've spent years playing what feels like catch-up to actually truly learn the things I thought I knew – a lot of high school mathematics included.

What was not obvious to me, and took a lot of higher math to get to, was that in the grouping of N=>I=>Q=>R=>C, at each stage you're pulling in a more complete mathematical representation (counting, rings, fields) until you're algebraically closed, and while you could go higher, you start losing properties you like again, like commutativity of multiplication.

Complex numbers are that nice saddle point, which is an interesting thing to ponder.

I wish there was more focus on these "why" aspects in at least the optional advanced math or physics you could get in HS. It helps put some things in perspective.

I've been watching 3Blue1Brown a lot recently, and he's been linked to many times here too. His work is absolutely amazing, and he even wrote a python library to make the animations!

Yes, the Python library is called Manim, and it's on GitHub:


Gauss said that if the reason weren't immediately obvious to someone, they would never be a first rate mathematician. For the rest of us, there's YouTube.

Very Nice Video - One of my favourite sub 5 minute explanations is Oliver Humpage's @ Ignite Bristol circa 2011:


Hope you enjoy as much as I did.

At 2:37, right after finishing explaining adders and multipliers on the line, it started going way too quick -- like, where did e^x and some of these infinite sums come from???

He introduces e^x as a way of converting adders into multipliers.

The infinite sum is not important. I believe he is showing it purely as a visual tool to indicate that e^x is just "some function".

More specifically, based on the video, the important feature for an adder -> multiplier converter to have is that f(x+y) = f(x)f(y). However, many functions satisfy this requirement. He states (without justification [0]) that e^x is the most "natural" choice for such a function, and provides the infinite sum as a visual aid to show that it is just some function.

[0] Rather, justification is provided by reference to another video.
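You can spot-check the converter property f(x+y) = f(x)f(y) numerically for the standard exp, including on complex inputs (a quick sketch; the sample range and tolerance are arbitrary):

```python
import cmath
import random

# spot-check f(x + y) = f(x) f(y) for f = exp on random complex inputs,
# so the check covers the 2D (rotate-and-scale) case as well
random.seed(0)
worst = 0.0
for _ in range(100):
    x = complex(random.uniform(-2, 2), random.uniform(-2, 2))
    y = complex(random.uniform(-2, 2), random.uniform(-2, 2))
    worst = max(worst, abs(cmath.exp(x + y) - cmath.exp(x) * cmath.exp(y)))
```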

I've seen this a while ago, and while it's pretty instructive, it's actually also pretty confusing. The magic happens in a seemingly-innocuous throwaway sentence at around 4:20 (after being introduced to the 2D plane):

> ... This can now include rotating along with some stretching and shrinking ...

It's entirely non-obvious WHY we should be okay with rotating all of a sudden. The real answer is not super complicated, but it deals with a couple of amazing relationships between exponential functions and trigonometric identities[1]. So really, we don't have to accept "rotating" as some weird new action we can do when moving into the complex plane, we just have to accept that trigonometry is weirdly related to analysis due to some very cool properties.

[1] https://www.phy.duke.edu/~rgb/Class/phy51/phy51/node15.html

>It's entirely non-obvious WHY we should be okay with rotating all of a sudden.

Why shouldn't we? We have a set of objects called multipliers that we want to generalize to the 2 dimensional plane. The presented generalization (eg, the multiplier identified by x maps the point at 1 to the point at x) seems natural; and it definitely seems to be well defined. Additionally, it appears obvious that the presented generalization is equivalent to the original definition if considered only along the x axis.

From the perspective of the video, the fact that e^x is written as an exponential is "a vestige of its relationship with repeated multiplication". In the construction presented in this video, we should think of e^x as the "natural" homomorphism from adders to multipliers.

The fact that this function has anything to do with exponential, trigonometry, or analysis is neat, but not important in this context.

> Why shouldn't we? We have a set of objects called multipliers that we want to generalize to the 2 dimensional plane. The presented generalization (eg, the multiplier identified by x maps the point at 1 to the point at x) seems natural

Rotating all of a sudden does not feel natural to me at all. Why not start cutting up the 2D plane? Why not fold it? In analytic terms, we would cause discontinuities, whereas rotating is "smooth". But this is definitely nontrivial. Rotating is just so arbitrary. I think that skewing or flipping, for example, is just as "natural" as rotating.

> The fact that this function has anything to do with exponential, trigonometry, or analysis is neat, but not important in this context.

It's actually at the heart of why rotation (and not some other geometric operation) is key to e^iπ.

The better idea is to start with vectors. Adding vectors or scaling them is straight-forward. It’s trickier to figure out what it means to multiply or divide two vectors.

The quotient of two vectors v/u should be some kind of operator which transforms one into the other (that is, when you multiply it by one, you get the other, (v/u)u = v(u/u) = v, because we want multiplication to be associative). If those two vectors are the same length but different directions, the natural transformation to use is a rotation. The reason to use a rotation is that we want the transformation to make sense irrespective of any arbitrary coordinate system we decide to impose. If we used some kind of skew, it would break down under change of coordinates. As for reflections: if the quotient of two vectors was some kind of reflection, then we could square any quotient of vectors to get an identity transformation, which would not result in a very useful or consistent arithmetic.

The nicest and most useful formalism for defining multiplication of vectors is called geometric algebra, a.k.a. Clifford algebra. Start with http://geocalc.clas.asu.edu/pdf/OerstedMedalLecture.pdf

Or see the recent blog post http://www.shapeoperator.com/2016/12/12/sunset-geometry/

Or see more links at https://news.ycombinator.com/item?id=12938727#12941658

We want some operation that maps the point 1 to the point x, while holding the origin constant. I agree that scale and rotate is not the only such function, but when I personally visualize taking a grid and moving one point while keeping another constant, that is what I visualize. Additionally, this has the added property of maintaining the grid structure.

It is not immediately obvious to me how skewing could define such an operation in the general case; or how flipping could define such an operation in most cases.

In any case, the system of adders and multipliers in 2-dimensions he describes, even if not the only reasonable 2D generalization, is certainly a reasonable generalization, and one that has proved useful.

>It's actually at the heart of why rotation (and not some other geometric operation) is key to e^iπ.

In this video, e^x is defined as a mapping between multipliers and adders. Multipliers are defined to be a combination of rotation and scaling. Any relationship between exponentials, calculus, infinite sums, etc is purely a result of these definitions (except, perhaps, the motivation for choosing i * pi to be the principal multiplier mapping that gets mapped to -1 under e.)

> Multipliers are defined to be a combination of rotation and scaling.

True, my point was only that the definition obscures the fact that rotation (and specifically the stunning relationship between trigonometry and exponential functions) is the key to the answer of "why". Without that, I think the video is just an exercise in indirection.

As someone with a master's in mathematics, this is exactly what I came here to write. Thanks!

To have a notion that multiplication by imaginaries causes rotation, you'd need Euler's formula. I honestly think the best way to get a visual sense for why multiplication by r*exp(i*theta) rotates and scales is to look at the first few terms of the Taylor series added together, and see that the adders combine into a spiral that converges on r(cos(theta) + i sin(theta)). That is still plenty visual for me.

>To have a notion that multiplication by imaginaries causes rotation, you'd need Euler's formula.

No you don't. You might notice this by simply working with imaginary numbers. You might invent imaginary numbers for rotation [0].

Alternatively, consider the multiplication (x + yi)(a + bi) as the value (a + bi) performing a transformation on (x + yi). Expanding, we want (x + yi)(a + bi) = (ax - by) + (ay + bx)i. If we consider (x + yi) to be a 2-dimensional vector (with basis 1 and i), we can write the above equation as a matrix multiplication:

    [ x   y ] [  a   b ]  =  [ ax - by   ay + bx ]
              [ -b   a ]

Notice that

    [  a   b ]
    [ -b   a ]

is just a rotation matrix multiplied by the scalar sqrt(a^2 + b^2).

[0] That is, define a group (in the group-theory sense) of functions of rotations, denoted as xi for real numbers x, and a group of functions for sliding, denoted x for real numbers x. You might then notice that you can combine these groups in a field structure, that happens to have 1i * 1i = -1, and that the subfield of elements with no i component happens to be isomorphic to the reals.
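A quick numerical sketch of that factorization, checking that the matrix of "multiply by a + bi" is sqrt(a^2 + b^2) times a rotation (the helper name is made up):

```python
import math

def mult_matrix(a, b):
    # the matrix of "multiply by a + bi" in the basis {1, i}
    return [[a, b], [-b, a]]

a, b = 3.0, 4.0
r = math.hypot(a, b)        # scale factor sqrt(a^2 + b^2) = 5.0
theta = math.atan2(b, a)    # rotation angle

rotation = [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]

# largest entry-wise difference between the matrix and r * rotation
diff = max(abs(mult_matrix(a, b)[i][j] - r * rotation[i][j])
           for i in range(2) for j in range(2))
```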

Alternatively if you don't like coordinates, you may notice that multiplication by a unit complex number is linear and preserves the complex norm (i.e. modulus), so it defines an orthogonal transformation of the plane.

Topologically, unit complex numbers are the unit circle which is connected, and 1 is part of this circle so all of these transformations must have determinant 1: these are rotations.

No Euler formula.

Yay skew symmetric matrices.

It's also fun to introduce e as a matrix operator for 3d rotations.

It's useful for kinematics and having compact representations for axis-angle notations. It's a little far back in my head, but at some point it felt like an aha moment with the Euler identity, in the "2d version".

In the three-dimensional case, for any practical purpose, use quaternions. https://en.wikipedia.org/wiki/Quaternions_and_spatial_rotati...

(At least if you’re ever needing to compose rotations; to apply a quaternion to a big array of 3-vectors, go ahead and convert it to a 3x3 matrix first, which should end up slightly more efficient.)

3x3 rotation matrices are really hard to keep normalized properly, whereas quaternions are trivial to normalize (just divide by the norm).
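A toy sketch of that in plain Python (helper names made up here): compose a small rotation with itself many times, then fix the accumulated norm drift with a single division.

```python
import math

def qmul(p, q):
    # Hamilton product of quaternions stored as (w, x, y, z)
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

def qnormalize(q):
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

# a small rotation about z, composed with itself many times
h = 0.001
q = (math.cos(h), 0.0, 0.0, math.sin(h))
acc = (1.0, 0.0, 0.0, 0.0)
for _ in range(10000):
    acc = qmul(acc, q)

drift = abs(math.sqrt(sum(c * c for c in acc)) - 1.0)  # accumulated error
fixed = abs(math.sqrt(sum(c * c for c in qnormalize(acc))) - 1.0)
```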

If you need to compress a unit quaternion down to 3 numbers, instead of taking the logarithm (which is computationally expensive and a pain to deal with) use the stereographic projection.

Quaternions are a useful tool for manipulating rotations in a lot of common applications. But that wasn't my point here. Also quaternions are hard to grasp by humans. I find axis-angle much more palatable in general.

Imaginary numbers can be represented with a 2x2 skew symmetric matrix with no stretch of the imagination at all. And 3x3 skew symmetric matrices represent rotations most compactly with only 3 actual variables. Instead of 4 for quaternions, 9 for "classic rotation matrices", or the need to tell which is the order of the angles if you're given 3 euler angles.

There are interesting applications of Lie Algebra on SO(3) [1], notably in computer vision where a global energy is minimized across two successive rgb-d "shots" in order to recover the infinitesimal rotation [2]. It's going to be easier to minimize energy on something that is most compactly defined, and always amounts to a valid rotation.

[1] https://en.wikipedia.org/wiki/Rotation_group_SO(3)#Lie_algeb... [2] https://vision.in.tum.de/_media/spezial/bib/kerl13icra.pdf

From what I understand that would be (sorta; I’m not an expert in Lie theory) the logarithm of a rotation. I find the stereographic projection to be a more useful way to compress an arbitrary rotation down to 3 dimensions, for most purposes.

Yes, precisely. exp() is the mapping from so(3) to SO(3), so you can use log to go the other way around.

Here is the relationship with the other representations [1]

Where it becomes interesting for our rigid-body transform application (or recovery of it, in the case of computer vision), is with Twist coordinates (6 element vector) which will map to a 4x4 Transform, again using the matrix exp() operator [2].

[1] https://en.wikipedia.org/wiki/Axis%E2%80%93angle_representat... [2] https://en.wikipedia.org/wiki/Screw_theory#Twists_as_element...

Sure thing!

> .. see that the adders combine into a spiral that converges on rcos(theta) + isin(theta)

Incidentally, that's how Mathologer explains e^iπ and I like that explanation a lot more.

For anyone who finds it beautiful, isn't there a bit of humanisation and definition involved? For example, the sine function used in the derivation uses pi instead of 90 degrees, and sine is itself a human-created function. You could state the identity in terms of 90 degrees instead of pi, or use a different method to define angles instead of having 360 degrees (base 60).

Yeah, you can have the same thing with degrees if you use a different exponent base than e:

    e^((pi/180)ix) = (e^(pi/180))^ix = E^ix
and therefore

    E^180i = -1
where E is about 1.0176065.

The particular base e has some nice properties, though, like that its rate of change d/dx e^ix at a given point x is just i e^ix. The rate of change for E^ix, on the other hand, is d/dx E^ix = (pi/180) i E^ix, which is a little less "natural". This strange fact -- that using radians makes the expression e^ix = cos x + i sin x have a nice derivative -- is one of the reasons why mathematicians like to define these functions in terms of radians instead of degrees.
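A quick check of both claims in Python (tolerances arbitrary):

```python
import math

# the degree-friendly base E = e^(pi/180)
E = math.exp(math.pi / 180)
print(E)  # about 1.0176065

# E^(180 i) walks halfway around the unit circle, landing at -1
z = E ** 180j
```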

Survivorship bias of beautiful math. Who wants to study systems that don't simplify and compose well? The exception that proves the rule is the 4 color theorem - an ugly big-arse proof checked by a machine. Who is going to look into that any further?

I had never understood how imaginary numbers had any bearing on physics (probably because I'd never been taught). But lately I've been thinking about how all mathematics must have some natural physical equivalent or relation. E.g. how does multiplication happen in nature? What does it mean, really, to multiply something? For the most part, we take these things as facts and learn the mechanics, but in physics, it seems to me, you really need to understand how these algebraic interpretations translate into physical realities (also, perhaps, sort of obvious for everyone who reads HN). For someone who enjoys discovering these perhaps obvious things later in life, it's clear how mathematicians and physicists could come to the same mathematical conclusions from completely opposite vectors. I guess this probably happens all the time.

You should read about Maxwell's original equations for electromagnetism in component form (1865), then in quaternion form (1873), and Heaviside's vector-based rewrite (1888), then look at how they can be rewritten in geometric algebra.

If you invent enough crazy new mathematical constructs, eventually one of them will mirror a natural physical phenomenon. And then a pile of equations collapses into something incredibly simple.

It's an interesting and unresolved question whether imaginary numbers are fundamental to the nature of physics or just a handy tool to calculate stuff. They are certainly handy for calculating - they are an easy way to represent oscillations.

Maybe it's a mistake to allow imaginary numbers to ever be separate from real numbers? Another post here brought up how easy it is to work with imaginaries if you just treat them as the vector [re im]. Maybe (philosophically speaking), there are no purely real numbers. Could it be that all quantities in nature must contain a (sometimes zero) Im component? That might be a more satisfying interpretation than just allowing them to creep in when they are absolutely required, such as in polynomial equations.

e^x = an infinite series

if you substitute x = iz

you can split the even and odd terms of the infinite series into cos and sin

e^iz = cos z + i sin z

evaluating at z = pi yields

cos pi + i sin pi

-1 + 0i = -1

the exponential function maps the imaginary axis to the unit circle. pi gets mapped to -1.
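A quick sketch summing the partial series numerically, to see the even/odd grouping land on cos and sin (helper name made up):

```python
import math

def exp_partial(z, n):
    # first n terms of the series 1 + z + z^2/2! + z^3/3! + ...
    total, term = 0j, 1 + 0j
    for k in range(n):
        total += term
        term *= z / (k + 1)
    return total

s = exp_partial(1j * math.pi, 40)
# real part matches cos(pi) = -1, imaginary part matches sin(pi) = 0
```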

This shows it's true, but the "why" and real understanding requires calculus and the first week of complex analysis. Otherwise you are just parroting a set of facts.

Math major here.

This presentation does skip some necessary legwork for complete rigor, but presents a valid non-standard construction of e^(i pi).

Specifically, he defines two sets of objects: adders and multipliers, and a function (written e^x for historical reasons) that maps adders into multipliers.

He then generalizes this construction to work in 2 dimensions instead of 1 dimension.

I would add to this construction that e^x is the particular converter that maps i * pi -> -1. However, I think this requirement is implicit in him stating that e^x is the most "natural" of the converters, and that i * pi -> -1 is the most natural of mappings.

The only place where you might need calculus is to provide an explicit construction of e^x (and possibly to justify the notational choice of writing it like an exponential).


To make the point clearer: under the construction presented by the video, e^x is not defined as an infinite sum, but rather as a function satisfying certain properties. That this function is equal to a particular infinite sum is a statement that requires proof, and not a statement that is needed for many applications.

As always with this author, I followed a link to a math explanation video expecting to point out how it is only a superficial treatment that breaks down under rigorous analysis, only to be disappointed in my inability to find any such flaws. Seriously, this guy is awesome. If you need to learn math, see if he has a video about it.

I do have 1 gripe with this video though, and that is his handwaving around "natural".

More specifically, as far as this particular case goes, there is nothing natural about the choice of e^x, or the significance of pi.

As he identifies, we are interested in some function that maps adders into multipliers with the property f(x+y) = f(x)f(y).

To uniquely identify such a function, we need to add an additional constraint. He chose to add f(i * pi) = -1. He justifies this by arguing that pi is the length you would travel along the unit circle to arrive at -1. This is true (and the underlying reason why pi and e end up being natural), however using this argument seems to break the abstraction for me.

Under this construction, in the equation f(i * pi)=-1, "i * pi" is an object in the set of multipliers, and "-1" is an object in the set of adders. Specifically, "i * pi" is a function which takes a plane (or perhaps a point) and rotates it, while "-1" is a function which takes a plane (or point) and slides it.

He then invokes an unstated mapping, g, to convert the multiplier [0] "i * pi" into the real number "pi". He then insists that g(x) gives the distance a point would travel along the unit circle when the multiplier x is applied to it.

At this point, because arriving at the point "-1" [1] from the point "1" through rotation requires traveling pi distance, it makes sense that f(i * pi) = f(g^-1(pi)) = -1

I am still not convinced that (within this construction alone), this is a more natural choice than saying that f(i) = 1, but would agree that it is one of the two natural choices. To get to f(x) = e^x being the natural choice requires showing that it comes up in all sorts of unrelated parts of math, so it is probably more natural.

[0] It might be better to speak of "rotational multipliers" here, as I am not sure how to naturally define g(x) for multipliers that stretch the plane, instead of or in addition to simply rotating it.

[1] This "1" is again distinct from the adder "1" and multiplier "1", but plays a central role in defining them, so I do not object to its usage.

Look at the Taylor series for e^x, and see what you get for e^(i * theta). The terms with i raised to an even power are real, those with i raised to an odd power are imaginary. Group the odds together and the evens together, and compare with what the Taylor series for sin x and cos x give you for sin theta and cos theta. (I'm being a bad person here, ignoring convergence issues, but I want to say that all three are absolutely convergent.)

I think Mathologer's take is much more intuitive:


Why not just write it e^iτ=1 which conceptually and pedagogically makes just so much more sense.

BTW for those who like Complex Analysis, here's how to visualize them as a 2D deformation: https://www.youtube.com/watch?v=CMMrEDIFPZY

Honestly the calculus explanation is the most clear to me (Taylor series expansion of e^x)

I deeply thought the iterative nature was advanced ..

I see how intuitive the scale / rotate mindset is; but I'm a bit sad that iteration is a mishap in their view.

> (2015)

I'd prefer if admins could change the link to a description of this mathematical fact that's more up to date, and fits the needs of me and my family in 2017.

It originally said "Why does e to pi i equal -1? (visualized)".

Pure beauty.
