I find that the most elegant way to understand exponential-related things is by defining the exponential as the function satisfying f'=f (with suitable normalization). In other words, it is the "eigenfunction" of the derivative, which is a linear operator. This is a very natural thing to ask when studying derivatives, ODEs, etc.
With this definition, you can define the constants e and pi in a natural way. In particular, 2 * pi * i is the period of e^x. It then makes sense to analyse the function e^ix, with period 2*pi, and separate its real and imaginary parts. These are sure to be periodic real functions, and we can name them cos(x) and sin(x).
This Quora answer talks about this approach pretty well [1].
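The two claims above (that 2 * pi * i is a period of e^x, and that the real and imaginary parts of e^ix are cos and sin) can be checked numerically. This is only a minimal sketch using Python's standard `cmath` module; the helper name `period_error` is made up for illustration.

```python
import cmath
import math

def period_error(z: complex) -> float:
    """Return |e^(z + 2*pi*i) - e^z|, which is ~0 if 2*pi*i is a period."""
    return abs(cmath.exp(z + 2j * math.pi) - cmath.exp(z))

x = 0.7
value = cmath.exp(1j * x)  # e^(ix)

# Gaps between the parts of e^(ix) and the usual cos/sin; both should be tiny.
real_gap = abs(value.real - math.cos(x))
imag_gap = abs(value.imag - math.sin(x))

print(period_error(1.5 + 0.3j), real_gap, imag_gap)
```

All three printed numbers should be at the level of floating-point rounding error.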
Here's a fun perspective from someone who knows very little about maths, but has a huge interest in logic.
I have a question for you: At what 'level' of maths or understanding do you reckon someone has to be in order to make sense of your introductory paragraph?
Let me map out what happens in my head when I read that first paragraph:
> "I find that the most elegant way to understand exponential-related things"...
Okay, I know what exponential increase means. Something grows by an exponent of something. 2, 4, 16, ...
> "is by defining the exponential as the function satisfying f'=f (with suitable normalization)"...
I don't know what an exponential is. A function, to me, is something I call as part of code, so I am not sure if I might be confusing the programmer and mathematician-definition of 'function'. I don't really know how to interpret f'=f in any way. (F-apostrophe equals f? Why?). I also don't really know how normalization plays into it. For me, normalization is something related to homogenising a series of things. Sometimes it can be related to normalising the volume of a bunch of mp3 files, or in other cases it can be turning a bunch of badly formatted phone numbers into a common format for a database.
> "In other words, it is the "eigenfunction" of the derivative, which is a linear operator. This is a very natural thing to ask when studying derivatives, ODEs, etc."
I have often heard terms like "eigen"-something, but this always causes me trouble, because I end up parsing it in German ("eigen" means "self" in German). So now it is the "self"-function of something?
Okay, of the 'derivative'. Is a derivative the mathematician's way of saying "figuring out something you don't know by using a bunch of other things you do know"?
I do not know what a linear operator is at all.
I also do not know what ODEs are.
You see, I love logic and science and have been a programmer for so long, yet when I hear something like the statement above, I almost become despondent over the fact that seemingly I know nothing of the underpinnings or logical foundations of the things I spend my life with (aka computers and programming being born from maths...).
Sure, there is certainly a lot of jargon here which I could have explained better, but in the end one has to be familiar with some calculus and complex numbers to make sense of this topic. Depending on where you're from, this could be either late high-school to second/third year of university.
So by exponential I mean the exponential function, in the mathematician's sense: something which takes numbers as inputs and produces other numbers as outputs in a unique way. Already a complication: these can be complex numbers.
Then, the derivative of a function is its "instantaneous rate of change". How much does the function increase for small (infinitesimal, really) changes in input? Since this rate may be different for every point, we are defining a new function f': for every input of f, f' gives you the local rate of change. So there you go, the derivative takes a function f and produces a function f'. We call this an operator. Because the derivative has very nice properties, we call this a "linear" operator.
Finally, the "eigen"-things. These are really fancy words for the simple concept of a fixed-point: when you apply an operation to a certain thing X and just get that X back. So, the function x^2 has two fixed-points: 0 and 1. These would be "eigenpoints", but we don't call them that because we reserve the term "eigen" for linear operations, such as the derivative operator or matrix-multiplication.
The point of my answer is that the exponential function is elegantly understood as the eigenfunction of the derivative. In other words: which function is its own derivative? Which one tells you how it changes just by looking at it?
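For the programmers in the thread, the "operator" idea maps fairly well onto a higher-order function: something that takes a function and returns a function. Here is a minimal Python sketch. The helper `derivative` is a made-up name and only approximates the true derivative with a central difference, so treat it as an illustration rather than a definition.

```python
import math
from typing import Callable

Func = Callable[[float], float]

def derivative(f: Func, h: float = 1e-6) -> Func:
    """Take a function f and return a new function approximating f'."""
    return lambda x: (f(x + h) - f(x - h)) / (2 * h)

d_exp = derivative(math.exp)            # behaves like exp itself
d_square = derivative(lambda x: x * x)  # behaves like 2x, not like x^2

for x in (0.0, 1.0, 2.0):
    print(x, math.exp(x), d_exp(x), d_square(x))
```

The second and third columns agree to many digits, which is the "its own derivative" property; the squaring function, by contrast, is changed by the operator.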
> but in the end one has to be familiar with some calculus and complex numbers to make sense of this topic.
I think this is more important than the listing of terms one by one in some of the sibling comments. The intros to Calculus, Analysis, Linear Algebra you get in high school/university will get you to the right place. But if you're unfamiliar with 'function' or 'exponentiation', you're probably missing some more secondary-school-level maths background to tackle those topics. That's not a difficult gap to close but it's a necessary first step before taking the next one.
>> ... in the end one has to be familiar with some calculus and complex numbers to make sense of this topic.
> I think this is more important than the listing of terms one by one in some of the sibling comments. The intros to Calculus, Analysis, Linear Algebra you get in high school/university will get you to the right place.
That's certainly true, but if someone wants to pursue understanding of these, simply saying "You need more familiarity with ..." gives them nowhere to start, and no terms to search for. Trying to give some context for each item, and to mention some of the underlying words, gives them a place to start, should they choose to spend the time investigating further.
Right, but in the next bit I said that I don't think the person asking the question yet has the mathematical tools to tackle either those courses or most of the things you'd get from searching the terms. They'll need to get some more basic stuff out of the way first.
The self-guided search from terms is obviously fine and it's way more than fine to offer to personally give some stranger on the internet pointers as you did.
It's the one solution that satisfies the initial value condition y(0)=1.
This idea is analogously applied to the case of systems of ODEs with constant coefficients. The system y'=y=Iy has a basis of solutions of the form (e^x 0 ... 0)^t, (0 e^x 0 ... 0)^t, etc. Written column-wise into a matrix, this gives us the Wronskian Y which satisfies Y(0)=I, and this Y is exactly the matrix exponential e^(xI).
More generally, the equation y'=Ay has the Wronskian Y=e^(xA), which satisfies Y(0)=I.
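As a rough illustration of the matrix exponential, here is a pure-Python sketch that sums the truncated power series for e^(xA). The names `mat_mul` and `mat_exp`, the choice of 30 terms, and the example matrix A are all assumptions made for the example; a real application would use a numerical library instead.

```python
import math
from typing import List

Matrix = List[List[float]]

def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a: Matrix, x: float, terms: int = 30) -> Matrix:
    """Approximate e^(xA) by the truncated series sum_k (xA)^k / k!."""
    n = len(a)
    xa = [[x * a[i][j] for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # starts as the identity, (xA)^0 / 0!
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, xa)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

A = [[0.0, -1.0], [1.0, 0.0]]  # generator of plane rotations
Y0 = mat_exp(A, 0.0)           # should be the identity matrix
Y = mat_exp(A, 0.5)            # should be a rotation by 0.5 radians
print(Y0, Y)
```

For this particular A the result is the rotation matrix with entries cos(x) and ±sin(x), which ties the matrix picture back to e^(ix).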
While others have already answered your question, there's another thing to bear in mind: the most elegant way to think about some concept in mathematics is not necessarily the best way to introduce it to someone who is new to the topic.
There are multiple ways of defining either the number e or the function that takes x to e^x (you can define one of them in terms of the other; if you know e, you can define e^x in the same way you usually define exponentiation: first you define it for integers, then rationals, and then you "fill in the gaps" for the irrational arguments (let's leave out complex numbers for now); while, if you know how to compute e^x for any x, then e is just that function evaluated at 1). Each of them highlights a different aspect of that all-important number.
What the GP was trying to explain was that the exponential function to the base of e, up to a multiplicative constant, is the only function whose derivative (roughly: slope) is equal to itself. That is: any such function is of the form a*e^x for some real number a. This is certainly one big reason for why e plays such a central role in mathematics.
But I think for someone not very familiar with all these topics, it's much easier to think of e as some number somewhere between 2.5 and 3. There are also some explanations for where exactly this number comes from, e.g. as the basis for continuously compounded interest (which is useful in economics), but they do require the student to be comfortable with some sort of, at least informal, limiting process.
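The compound-interest picture is easy to try out. The sketch below assumes the usual setup of 100% interest over one period, split into n equal compounding steps; the function name `compounded` is just for illustration.

```python
import math

def compounded(n: int) -> float:
    """Value of 1 unit after one period at 100% interest, compounded n times."""
    return (1.0 + 1.0 / n) ** n

for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, compounded(n))

print("math.e =", math.e)
```

The values creep up from 2 towards 2.718..., which is the informal limiting process mentioned above.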
To answer your main question: "what level of maths" you need is really a question of which topics. A couple of introductory classes in Calculus, Linear Algebra, and Discrete Mathematics will make that sentence understandable.
I'll give some definitions and resources, but think about how you would explain multiplication/division/algebra to a kid. At some point, you just have to work through some problems and the math just makes sense. So I'll explain the terms above, but best bet is to just take a few courses on the topics I mentioned above.
Quick definitions:
- Exponential: the constant e raised to the power of some number. e, like pi, is a ubiquitous constant in mathematics and nature. You run into pi when doing geometry, and e when doing calculus.
- A function in mathematics is similar to a function in programming, but not exactly. A course in Discrete Mathematics helps here.
- F-apostrophe is notation used in calculus to show the relation of one type of function, called a derivative to another. f and f' are related. How, though, is better explained by taking calculus.
- An eigenfunction exists in a system of equations. Think of it like a 'balance' point. Linear Algebra will help make sense of this term.
Those 2 videos will do a good job of giving you the 'intuition' behind Calculus and Linear Algebra. Like programming, though, you just have to actually work some problems out by hand for the stuff to sink in. For that, do something like MIT OpenCourseWare or a local online college course.
Tangentially related to your point - math is an extremely dense subject.
Small little statements with just a few symbols can be LOADED with information - things like complex, meaty definitions for even the most benign terms (e.g., a function), and stuff that can be deduced from it from a million different theorems.
This is part of the reason why it's so easy to fall behind in math. Once you are missing a vital part of information, it can be hard to identify where that gap is and the train just keeps moving and moving.
This was my experience with math, and something that I noticed when I tutored people in some low level math as well.
I've thought about this on occasion. Anecdotally (and subject to selection and confirmation biases) I've seen students who struggled excessively in a given course drop it, take the previous course, and then excel not just in the subject they originally dropped but also in subsequent classes. I've wondered if my alma mater might be better off recommending that students start a level or two below where they currently are to fill in some of those vital gaps.
This is the first thing that needs to break in your intuition. The "increase" that occurs with "repeated multiplication" in the layman's understanding of exponentials is associated with the real numbers or subsets of the real numbers.
In reality, the exponential map is a repeated "action," where the action can be rotational, as is the case with the imaginary numbers. You can also exponentiate matrices, polynomials, quaternions, etc. Each one has a well-defined exponential map, which you can write down and evaluate.
> At what 'level' of maths or understanding do you reckon someone has to be in order to make sense of your introductory paragraph?
It varies from country to country, curriculum to curriculum. I would expect any student studying A-Level in the UK (the last year of secondary school) to know most of this, but not necessarily all the connections between them.
>> "I find that the most elegant way to understand exponential-related things"...
> Okay, I know what exponential increase means. Something grows by an exponent of something. 2, 4, 16, ...
For something to increase exponentially means that at each step it is multiplied by a factor (usually greater than 1). In your example the factor is 2.
So yes, you have a handle on what it means for something to grow exponentially.
>> "is by defining the exponential as the function satisfying f'=f (with suitable normalization)"...
> I don't know what an exponential is.
There is more than one way to define an exponential, and in part that's what makes them so important. They turn up in multiple contexts, and being able to think of them in different ways is valuable.
> A function, to me, is something I call as part of code, so I am not sure if I might be confusing the programmer and mathematician-definition of 'function'.
In mathematics a function is a "thing" that takes a value from some domain and returns a value in some (possibly the same, but not necessarily) domain. So "log" is a function that takes positive real numbers and returns a real number (although it's also defined in other contexts such as complex numbers and abstract groups).
Polynomials are functions, usually we think of them as taking a real number and returning a real number: y=4x^3-x+5. Polynomials can also take other "types" such as complex numbers, matrices, group elements, etc.
But basically a function takes a value and returns a value.
> I don't really know how to interpret f'=f in any way.
The notation f' is used (in some contexts) to refer to the derivative of the function f. Given a function f we can ask: At this location, how quickly is it changing? That defines another function, which we call the derivative. This is differential calculus.
Given a function y=f(x) the derivative is often written as dy/dx.
> I also don't really know how normalization plays into it. For me, normalization is something related to homogenising a series of things. Sometimes it can be related to normalising the volume of a bunch of mp3 files, or in other cases it can be turning a bunch of badly formatted phone numbers into a common format for a database.
So the exponential function y=e^x, where e is the constant 2.71828..., is the unique function with the property that its derivative is the function itself. But the function y=2^x has nearly the same property once you permit multiplication by an appropriate other constant. That's the re-normalisation.
So an exponential is a function which, up to multiplication by a constant, is its own derivative. Hence f'=f up to normalisation.
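To see the "up to a constant" part numerically, here is a small Python sketch. The helper names (`slope`, `pow2`, `rescaled`) are invented for the example, and the derivative is only estimated with a central difference.

```python
import math

def slope(f, x: float, h: float = 1e-6) -> float:
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def pow2(x: float) -> float:
    return 2.0 ** x

def rescaled(x: float) -> float:
    """2^(x / ln 2), which is the same function as e^x."""
    return 2.0 ** (x / math.log(2.0))

x0 = 1.3
ratio_pow2 = slope(pow2, x0) / pow2(x0)              # about ln 2 = 0.693...
ratio_rescaled = slope(rescaled, x0) / rescaled(x0)  # about 1

print(ratio_pow2, math.log(2.0), ratio_rescaled)
```

So the derivative of 2^x is 2^x times the constant ln 2, and rescaling the input by that constant turns it into a function that is exactly its own derivative.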
>> "In other words, it is the "eigenfunction" of the derivative, which is a linear operator. This is a very natural thing to ask when studying derivatives, ODEs, etc."
> I have often heard terms like "eigen"-something, but this always causes me trouble, because I end up parsing it in German ("eigen" means "self" in German). So now it is the "self"-function of something?
When you have a transformation, often it does different complicated things in different directions. But sometimes you can find a direction where the transformation is especially simple, such as just a scaling.
Take a square and stretch, rotate, and shear it. When you've done so there will be a direction where the result is a simple scaling. That's called an eigen-vector, and the amount of scaling is the associated eigen-value. By knowing the eigen-vectors and eigen-values of a transformation you can know nearly everything about the transformation as a whole.
(Note: a rotation has no eigen-vectors when you limit yourself to the real plane, but does have complex eigen-values.)
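A tiny 2x2 example may help here. This is a sketch in plain Python; the matrices `stretch` and `quarter_turn` and the helper names are chosen purely for illustration.

```python
import math

def apply(m, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def real_eigenvalues(m):
    """Real roots of the characteristic polynomial t^2 - tr*t + det, if any."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = tr * tr - 4 * det
    if disc < 0:
        return []  # the eigenvalues are complex, so no real eigenvectors
    root = math.sqrt(disc)
    return [(tr - root) / 2, (tr + root) / 2]

stretch = [[2.0, 1.0], [1.0, 2.0]]
quarter_turn = [[0.0, -1.0], [1.0, 0.0]]  # rotation by 90 degrees

print(real_eigenvalues(stretch))        # [1.0, 3.0]
print(apply(stretch, (1.0, 1.0)))       # (3.0, 3.0): scaled by 3
print(apply(stretch, (1.0, -1.0)))      # (1.0, -1.0): scaled by 1
print(real_eigenvalues(quarter_turn))   # []
```

The vectors (1, 1) and (1, -1) are the directions in which `stretch` acts as a pure scaling, while the quarter turn has no such direction in the real plane.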
> Okay, of the 'derivative'. Is a derivative the mathematician's way of saying "figuring out something you don't know by using a bunch of other things you do know"?
No. As mentioned earlier, the derivative of a function is the measure of how fast it's changing. Example: if y=x^2-3x+1, then at location x the function is changing at rate 2x-3. If you plot y against x, then at any point the slope of the curve will depend on x, and is given by the function dy/dx = 2x-3.
> I do not know what a linear operator is at all.
A "Linear Operator" is a thing L that transforms Xs into Ys, but has the properties that
* L(x0+x1) = L(x0) + L(x1)
* L(c.x) = c.L(x)
where c is any suitable constant, and x, x0, and x1 are the things that L transforms. Linear operators are the basis of all machine learning (and many other things)
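The two properties can be tested directly in code. In this sketch the operator `L` is a hypothetical example (multiplication by a fixed 2x2 matrix), and `shift` is included as a counterexample that fails the test; none of these names come from any library.

```python
def L(v):
    """A hypothetical linear operator on 2-vectors: multiply by a fixed matrix."""
    return (3.0 * v[0] - 1.0 * v[1], 0.5 * v[0] + 2.0 * v[1])

def shift(v):
    """Not linear: adds a constant offset to the first coordinate."""
    return (v[0] + 1.0, v[1])

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(c, v):
    return (c * v[0], c * v[1])

x0, x1, c = (1.0, 2.0), (-4.0, 0.5), 3.0

additive = L(add(x0, x1)) == add(L(x0), L(x1))                    # True
homogeneous = L(scale(c, x0)) == scale(c, L(x0))                  # True
shift_additive = shift(add(x0, x1)) == add(shift(x0), shift(x1))  # False

print(additive, homogeneous, shift_additive)
```

The exact `==` comparisons only work here because the numbers were picked to be exactly representable in floating point; in general you would compare with a tolerance.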
> I also do not know what ODEs are.
Ordinary Differential Equation. These are equations where you express a relationship between a function and its derivatives. So the equation earlier : f' = f : is an ODE. It says that the function f is equal to its own derivative.
ODEs are important in fluid dynamics, because the way a fluid flows depends on pressure changes, which are expressed as derivatives.
The ODE f"=c.f, with c a negative constant, is used to describe Simple Harmonic Motion.
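As a quick check of the Simple Harmonic Motion equation, the sketch below assumes c = -OMEGA^2 (so c is negative) and tests the candidate solution f(t) = cos(OMEGA * t) with a finite-difference second derivative. The value of `OMEGA` and the helper names are arbitrary choices for illustration.

```python
import math

OMEGA = 2.0  # assumed angular frequency, chosen only for illustration

def f(t: float) -> float:
    """Candidate solution: f(t) = cos(OMEGA * t)."""
    return math.cos(OMEGA * t)

def second_derivative(g, t: float, h: float = 1e-4) -> float:
    """Central-difference estimate of g''(t)."""
    return (g(t + h) - 2 * g(t) + g(t - h)) / (h * h)

t0 = 0.3
lhs = second_derivative(f, t0)  # f''(t0)
rhs = -(OMEGA ** 2) * f(t0)     # c * f(t0) with c = -OMEGA^2

print(lhs, rhs)
```

The two numbers agree up to the error of the finite-difference estimate.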
I hope that helps, but you have no contact details in your profile, so I can't contact you to offer more information.
I was a math major in college, and some things you learn how to work out mechanically and understand intellectually.
I don't do anything remotely close to academic, undergraduate mathematics, but it's interesting how now that I'm older I seem to understand things better, to internalize them beyond something mechanical or simply accepting them because they make logical sense.
Wish I'd had that type of understanding in school.
This explanation is very intuitive - and the author makes a good point. There might be better ways to introduce pi other than the ratio of the circumference of a circle to its diameter.
To be fair that's the elementary school way of defining pi. I'm somewhat rusty myself, but I'm pretty sure even undergraduate real analysis courses define pi in terms of the complex exponential. "Ratio of circumference to diameter" is insanely hard to make rigorous.
Euler's formula got me through a bunch of EE classes back when I was a student. I was pathologically unable to commit trig identities to memory, possibly due to laziness. So instead I learned Euler's formula and worked everything through in complex numbers until I was ready to give an answer. I never got marked down for it.
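For anyone who hasn't seen the trick: multiplying e^(ia) by e^(ib) and reading off the real and imaginary parts gives the angle-addition identities for free. A small Python sketch (the angles `a` and `b` are arbitrary test values):

```python
import math

a, b = 0.4, 1.1

# e^(i(a+b)) = e^(ia) * e^(ib); expand the product and read off the parts.
ea = complex(math.cos(a), math.sin(a))
eb = complex(math.cos(b), math.sin(b))
product = ea * eb

cos_sum = product.real  # cos(a)cos(b) - sin(a)sin(b)
sin_sum = product.imag  # sin(a)cos(b) + cos(a)sin(b)

print(cos_sum, math.cos(a + b))
print(sin_sum, math.sin(a + b))
```

Each pair of printed values should match to rounding error, with no identity memorised along the way.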
I did this exact same thing for math competitions. You often need some stupid exact formula that you are supposed to have memorized, but I never did.
De Moivre also yields the same results for this, and it's somewhat easier/faster to use.
I remember discovering this formula by accident when I was a junior in high school. I was messing around on my Ti-83, when I discovered that `e^(i*pi)` resulted in `-1`. That blew my mind, so I showed my pre-calc teacher and he thought it was just a calculator glitch because it didn't know how to calculate imaginary powers.
I kind of forgot about it until the full formula surfaced in one of my electrical engineering classes in college and I realized I had (partially) stumbled across the formula by accident a few years earlier!
Oddly this leaves out what I would consider the most "elementary" proof, starting from Bernoulli's definition of e and requiring only the squeeze theorem.
You begin by writing exp(z) = lim_{N->\infty} (1 + z/N)^N. Note that in this limit we do not need the analytic continuation of powers: N is an integer, so the function y^N is single-valued on C. This gives a definition of the exponential function that is immediately entire and does not rely on calculus.
Then we can write exp(ix) = lim_{N->\infty} (cos(arctan(x/N)) + i sin(arctan(x/N)))^N ⋅ (1 + x^2/N^2)^(N/2). (Note that sec(arctan(t)) = sqrt(1 + t^2), so 1 + ix/N has modulus sqrt(1 + x^2/N^2) and argument arctan(x/N).)
It is easy to show that lim_{N->\infty} (1 + x^2/N^2)^(N/2) = 1 for all finite x, so we find exp(ix) = lim_{N->\infty} (cos(arctan(x/N)) + i sin(arctan(x/N)))^N. It is not necessary to apply L'Hopital's rule since neither of the factors approaches zero or infinity -- we can simply split the limit.
By induction we can then show that (cos(x) + i sin(x))^A = cos(Ax) + i sin(Ax) for all integer A. We apply this to the previous expression to conclude that exp(ix) = lim_{N->\infty} [cos(N ⋅ arctan(x/N)) + i sin(N ⋅ arctan(x/N))].
Therefore we only need to find lim_{N->\infty} N ⋅ arctan(x/N). Consider the triangle with vertices (0,0), (1,0), (1, x/N): its hypotenuse intersects the unit circle at (N / sqrt(N^2 + x^2), x / sqrt(N^2 + x^2)), subtending an arc of exactly arctan(x/N). A similar triangle inscribed at this vertex has a short side which is necessarily less than the arc length. We therefore have Nx / sqrt(N^2 + x^2) < N*arctan(x/N) < x. The left and (trivially) right limits both equal x, so by squeezing lim_{N->\infty} N arctan(x/N) = x and therefore e^(ix) = cos(x) + i sin(x).
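The convergence can also be watched numerically. This Python sketch is only an illustration and proves nothing; it evaluates (1 + ix/N)^N and the three quantities in the squeeze inequality for a positive x. The function names `euler_limit` and `squeeze_bounds` are made up for the example.

```python
import math

def euler_limit(x: float, n: int) -> complex:
    """(1 + ix/n)^n, the n-th approximation to exp(ix)."""
    return (1 + 1j * x / n) ** n

def squeeze_bounds(x: float, n: int):
    """Return (n*x/sqrt(n^2 + x^2), n*arctan(x/n), x) for x > 0."""
    return (n * x / math.sqrt(n * n + x * x), n * math.atan(x / n), x)

x = 1.0
target = complex(math.cos(x), math.sin(x))
for n in (10, 1_000, 100_000):
    approx = euler_limit(x, n)
    print(n, abs(approx - target), squeeze_bounds(x, n))
```

The error column shrinks roughly like 1/N, and the middle entry of each triple sits between the outer two while all three approach x.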
Be careful with your last paragraph. It builds upon results about the relationship between trigonometric functions (usually defined in terms of power series) and the unit circle's geometry, as well as intuition from 2D geometry in general. These things can actually be quite hard to prove. (Prime example in a slightly different context: https://en.wikipedia.org/wiki/Jordan_curve_theorem ) For instance,
> A similar triangle inscribed at this vertex has a short side which is necessarily less than the arc length.
already uses the fact that straight lines are the shortest lines in the class of C¹-differentiable curves due to the Euler-Lagrange equations. (Good luck with proving this statement in the class of C⁰ curves.) Then, to use this fact, you need to prove differentiability of a curve parametrizing the unit circle, i.e. differentiability of the trigonometric functions. Now, at this point you might as well use the standard definition of e^x in terms of a power series because it simplifies the differentiability proof tremendously.
You can get that the interior line is shorter and the outer line is longer than the arc by relating these to the inner and outer polygons used in Archimedes' demonstration that the area of a circle may be related to the circumference:
It is possible to prove that the circumference of the outer polygons gives a decreasing sequence and the circumference of the interior polygons gives an increasing sequence simply by applying the triangle inequality when you double the number of sides. This is sufficient to justify the use of the squeeze theorem here, although you are correct that the axiomatization of geometry is quite subtle.
I did not give much thought to the definition of the trigonometric functions while writing this because I assumed this theory is usually fully developed before we begin complex analysis (but is it?).
The Tau Manifesto shows how using tau = C/r makes the geometric meaning of Euler’s formula (and hence of Euler’s identity) even clearer. Disclaimer: I am the author of The Tau Manifesto.
Famously evaluates when x = π (pi) to Euler's Identity (listed halfway down the page), which is one of those beautiful "Wow!" formulas that brings seemingly unconnected things together.
Apropos of nothing, that page suggests that the identity (with the 1 shifted across the equals sign) is beautiful because it relates e, pi, i, 1, and 0.
I always felt 1 and 0 were shoehorned in, because they're clearly nonessential when one writes it the way you did. It honestly detracts from the pure relationship between e, pi, and i.
Gauss was known to have said “if you do not immediately recognize Euler’s formula as an intuitive result, you will never be a first-class mathematician.” (Paraphrased)
It makes more sense when you use tau, as tau is a full rotation, not a half rotation: e^(i*tau)=1.
Also I think that i is not interesting because i * i=-1, but because i * conjugate(i)=1. I never saw the -1 in the context of complex numbers but if you say it's 1, it's easy to see the unit circle. Similarly, with split-complex numbers, you have j * conjugate(j)=-1. Split complex numbers are about mirror reflection so the -1 makes sense.
[1] https://qr.ae/pNy7qp