
Elliptic Curve Cryptography for Beginners - deafcalculus
http://blog.wesleyac.com/posts/elliptic-curves
======
tptacek
So far, the best single-page elliptic curve primer for generalist programmers
I know of is Adam Langley's:

[https://www.imperialviolet.org/2010/12/04/ecc.html](https://www.imperialviolet.org/2010/12/04/ecc.html)

I understand the urge to try to get a high-level grok of curves without much
math, but I spent years bouncing off the outermost surface of curve
understanding by trying to start with the curve picture and the intuitive
geometry of curve shape, and what finally made curves click for me (and
quickly) was to simply take the curve equation --- which is itself high school
math! --- and play with it a bit.

So if you're a programmer and you want a baseline understanding of how curves
work, do what you'd do with any other subject you're trying to understand: pop
open an editor, take the Weierstrass curve equation, pick a y, solve for x;
then do it in a finite field (ie, mod a prime). Then write an "add", then a
"scalar mult". It's a couple hours of noodling, tops.

As always, but particularly with curves, remember that the basic understanding
of what's going on is nowhere near enough to ever use them safely!
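
A minimal sketch of the exercise described above, assuming a toy short-Weierstrass curve y^2 = x^3 + 2x + 3 over the integers mod 97. The parameters and the helper names (`on_curve`, `add`, `scalar_mult`, `find_point`) are illustrative assumptions, not taken from any library, and the code is neither constant-time nor hardened in any way:

```python
# Toy curve y^2 = x^3 + A*x + B over F_p. Parameters are illustrative
# assumptions and far too small for any real use.
P_MOD, A, B = 97, 2, 3
INF = None  # the point at infinity, which acts as the group identity


def on_curve(pt):
    """Check that pt satisfies the curve equation mod P_MOD."""
    if pt is INF:
        return True
    x, y = pt
    return (y * y - (x * x * x + A * x + B)) % P_MOD == 0


def add(p1, p2):
    """Chord-and-tangent addition of two points on the curve."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF  # p2 is -p1 (this also covers doubling a point with y == 0)
    if p1 == p2:
        # Tangent slope; division is multiplication by a modular inverse.
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        # Chord slope through two distinct points.
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(k, pt):
    """Double-and-add for k >= 0. NOT constant time: it branches on bits of k."""
    result, addend = INF, pt
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


def find_point():
    """Brute-force search for some point on the curve (fine for a tiny field)."""
    for x in range(P_MOD):
        rhs = (x ** 3 + A * x + B) % P_MOD
        for y in range(P_MOD):
            if y * y % P_MOD == rhs:
                return (x, y)
    return INF


G = find_point()
print(G, scalar_mult(5, G))
```

`pow(x, -1, m)` needs Python 3.8 or later. A quick sanity check is that `scalar_mult(5, G)` equals `add(scalar_mult(2, G), scalar_mult(3, G))`.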

~~~
palisade
The common advice going around, "You'll never know enough to use them safely
so don't bother trying, just trust us," has itself been shown to weaken
encryption.

Just recently, an amateur programmer with very little background in
cryptography discovered a flaw in libsodium in the Argon2 implementation and
also in the reference implementation that everyone in the world was trusting
without question. My advice is if you're an engineer, don't be afraid to write
your own implementation of tried and trusted ciphers. This is how we find bugs
and improve. This isn't the only trusted library or algorithm that has been
shown flawed in recent times.

The strength of your cipher implementation can be tested and proven. We need
to stop telling everyone these algorithms are absolutely trustworthy so don't
try understanding them or implementing them. Nothing ever advances or improves
that way. Buck the trends, create competing libraries, try new things.

The ancient ones were not all knowing, they were doing everything wrong. Their
designs are full of flaws. Deny them. We need to code ourselves out of the
coming cryptographic apocalypse. Do not hide your heads in the sand and hope
the world doesn't come crashing down around you.

Edit: I found the blog/website of the man I mentioned in this comment who
discovered the Argon2 flaw.

[http://loup-vaillant.fr/articles/implemented-my-own-crypto](http://loup-vaillant.fr/articles/implemented-my-own-crypto)

In his own words, "There's something worrying about this bug: I was the first
to discover it, in January 2017. According to Khovratovich himself, it was two
years old. Now I understand why the authors themselves didn't find it: unlike
me, they didn't have a reference implementation to compare to.

What I don't understand is, how come nobody else discovered this bug? If you
implement Argon2 from spec, you cannot miss it. Test vectors won't match, and
searching for the cause will naturally lead to this bug. I can draw only one
conclusion from this:

I'm the first to independently re-implement Argon2. This 'never implement your
own crypto' business went a little too far."

~~~
tptacek
Honestly, I'm not sure you want to get me started here.

Believe it or not, libsodium's implementation of Argon2 is an example of
libsodium going off the rails. Libsodium began as a cross-platform
easy-to-build repackaging of NaCl, a crypto library designed and written by
cryptographers with carefully chosen primitives. Libsodium has gradually
expanded it into a kitchen sink of shiny cryptography, and as a result now
libsodium users have to worry about the pitfalls of AES-GCM. Why does
libsodium implement Argon2? Hell if I know. As an interface, it's actually
worse than JCE, which at least had the self-respect to pretend it had a
"password based encryption" abstraction.

It gets worse though, if you really want to climb down this rabbit hole with
me, because I'm not totally sure why Argon2 exists either. The "bug" he found
in Argon2 has actually no practical impact, so much so that the reference
implementation decided not to bother fixing it. But that's because very little
in password hashes actually matters. Scrypt was the last important thing to
happen to password hashes, and bcrypt still works just fine.

If we want to keep touring all the weird shit that happens because people
pointlessly reimplement things many of which don't need to exist in the first
place, we can keep doing it all the way back to unpadded RSA-512 off a broken
browser RNG, which (a) existed and (b) got me gameover on a pentest once.
Maybe I should be happy about that, but I mostly find it frustrating.

My point here, though, is that you aren't going to learn nearly enough from a
tutorial of any sort to safely implement curves.

~~~
spopejoy
> the pitfalls of AES-GCM

Can you elaborate a little on this? I ask because the Noise protocol
standardizes on AES-GCM or ChaChaPoly; will apps using Noise with AES-GCM face
"pitfalls"?

~~~
spopejoy
I found this discussion which is perhaps germane, about sensitivity to nonce
repetition:
[https://www.imperialviolet.org/2017/05/14/aesgcmsiv.html](https://www.imperialviolet.org/2017/05/14/aesgcmsiv.html)

------
dsacco
This is pretty good! The arithmetic of elliptic curves gets very complex, but
I like how you arrived at points on an elliptic curve just by defining groups
and fields. It might be a slight improvement to give a short, Rudin-style
explanation of the difference between a set and a field. You could do this
from first principles of set theory without being so terse as to be
incomprehensible or diving into the axioms; I think the stated goal of little
math is fine for the audience, but at the same time such an audience might not
immediately understand what “field = set with addition and multiplication
defined on it” means. I think a small expansion of the treatment on _why_ the
ECDLP is hard would also be good. It's intuitively easy to follow that hard
problems can exist, but maybe continue on in showing why this particular
hardness assumption works for cryptography (because there are many that do
not).

Places to go from here if you enjoyed reading this and want to learn more
about elliptic curves and cryptography related to them:

1\. [http://blog.bjrn.se/2015/07/lets-construct-elliptic-curve.ht...](http://blog.bjrn.se/2015/07/lets-construct-elliptic-curve.html?m=1)

More on an elliptic curve and constructing a hypothetical one.

2\. [https://medium.com/@VitalikButerin/exploring-elliptic-curve-...](https://medium.com/@VitalikButerin/exploring-elliptic-curve-pairings-c73c1864e627)

Elliptic curve pairings.

3\. [https://www.lvh.io/posts/supersingular-isogeny-diffie-hellma...](https://www.lvh.io/posts/supersingular-isogeny-diffie-hellman-101.html)

An intro to supersingular isogeny cryptography, which has a basis in elliptic
curves as a mathematical structure, but is fundamentally different from
elliptic curve cryptography.

------
wyc
If you want to see a "from scratch" implementation using existing algorithms,
here's a short working snippet of elliptic curve cryptography (specifically
Bitcoin's) without 3rd party libraries:

[https://github.com/wyc/haschain/blob/master/Secp256K1.hs](https://github.com/wyc/haschain/blob/master/Secp256K1.hs)

The implementation doesn't concern itself with groups or fields, but they're
still very useful to make sense of the code at all. Actually, I should add
some types and implement an instance of Data.Group when I have more time.

Of course, it's for fun and not production use. I didn't give the slightest
thought to timing attacks, optimized performance, etc.

------
enedil
As nice as it is, integers are not a field (i.e. most of them don't have
multiplicative inverses, or 1/n usually isn't integral).

Another good writeup:
[http://andrea.corbellini.name/2015/05/17/elliptic-curve-cryp...](http://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/)

~~~
widdma
True, perhaps the author meant rationals.

~~~
ecesena
It's usually integers modulo p (p prime), which is a field.

Edit: but yes, the OP writes "integers (Z) are a field", which is wrong.
Integers form a ring.
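
A small sketch of the distinction, assuming Python 3.8+ (where `pow(a, -1, m)` computes a modular inverse). The modulus 7 and the helper name `has_inverse` are illustrative choices only:

```python
from fractions import Fraction

P = 7  # a small prime, chosen only for illustration

# In the integers, 1/3 is not an integer, so Z is a ring but not a field.
one_third = Fraction(1, 3)
print(one_third.denominator == 1)  # False: 1/3 lives in Q, not in Z

# Modulo a prime, every nonzero element has a multiplicative inverse.
inverses = {a: pow(a, -1, P) for a in range(1, P)}
print(inverses)


def has_inverse(a, m):
    """True if a is invertible modulo m (pow raises ValueError otherwise)."""
    try:
        pow(a, -1, m)
        return True
    except ValueError:
        return False


# Modulo a composite, some nonzero elements have no inverse,
# so the integers mod 6 are again only a ring.
print(has_inverse(2, 6))
```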

------
mlevental
more in depth and just as gentle:
[https://jeremykun.com/2014/02/08/introducing-elliptic-curves...](https://jeremykun.com/2014/02/08/introducing-elliptic-curves/)

------
drcode
In case anyone here is interested in Cryptography as it pertains to
Cryptocurrency specifically, here's a talk I gave recently that might be
interesting to you:
[https://www.youtube.com/watch?v=Fyqtl7eGQZY&t=1062s](https://www.youtube.com/watch?v=Fyqtl7eGQZY&t=1062s)

~~~
kanzure
here are some videos from dan boneh on cryptography,
[https://www.youtube.com/playlist?list=PL9oqNDMzcMClAPkwrn5dm...](https://www.youtube.com/playlist?list=PL9oqNDMzcMClAPkwrn5dm7IndYjjWiSYJ)

------
fallingfrog
In the image for A+A=C (multiply by 2) I see the intersection point C, but
what if you wanted to multiply by 3? That would be A+C=D and I don't see any
point on the curve that the line through A and C intersects. In fact it looks
like for any A you choose, adding A+A gives a C which has that property (in
other words A*3=0 for any A). This is just by visual inspection. What am I
missing?

~~~
CarolineW
Seems to have labelled "C" and "-C" in a very misleading manner. Take point A.
Draw a tangent, that cuts the curve at point C. That's not the point 2A.
That's the point C.

Reflect C across the X axis, and _that's_ point 2A.

So you're not missing anything, the article is misleading at best.

In short, to take the sum of two points A and B, draw a line through them both
to find a third point on the curve, reflect that point across the X-axis, and
the result (which will also be on the curve) is A+B.

If A and B coincide then use the tangent at A.

The point at (vertical) infinity plays the role of the zero.
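
The rule above can be sketched with exact rational arithmetic. This assumes the curve y^2 = x^3 + 17 and the starting point A = (-2, 3), both picked only because the numbers stay small; the function names are illustrative. Vertical lines (where the sum is the point at infinity) are deliberately left out to keep the sketch short:

```python
from fractions import Fraction as F

# Illustrative curve y^2 = x^3 + 17 (so the x coefficient a is 0).
A = (F(-2), F(3))  # on the curve: (-2)^3 + 17 = 9 = 3^2


def third_point(p, q):
    """Third intersection of the curve with the line through p and q.

    Uses the tangent at p when p == q. Vertical lines are not handled.
    """
    (x1, y1), (x2, y2) = p, q
    if p == q:
        s = 3 * x1 * x1 / (2 * y1)   # tangent slope (a = 0 for this curve)
    else:
        s = (y2 - y1) / (x2 - x1)    # chord slope
    x3 = s * s - x1 - x2
    y3 = y1 + s * (x3 - x1)          # still on the line, not yet reflected
    return (x3, y3)


def add(p, q):
    """Group law: find the third intersection, then reflect across the X axis."""
    x3, y3 = third_point(p, q)
    return (x3, -y3)


C = third_point(A, A)   # where the tangent at A meets the curve again
two_A = add(A, A)       # C reflected across the X axis
three_A = add(A, two_A)
print(C, two_A, three_A)
```

Here the tangent at A meets the curve at C = (8, 23), while 2A is the reflection (8, -23), and the line through A and 2A does hit the curve again, giving 3A = (19/25, 522/125).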

~~~
fallingfrog
Ohh OK! Thank you. Makes more sense now.

------
gewoonkris
The article gives a nice intuition about how encryption works using elliptic
curves without going too deep into the math.

I'm curious for a similar explanation for how decryption would work though; a
trapdoor function is nice and all, but it's only half of the story if there is
no 'way out'.

~~~
dsacco
Your question (if I interpret correctly) is about how trapdoor functions work
in asymmetric cryptography, which is broader than elliptic curves in
particular. I can give a basic overview; in essence, you use a trapdoor
function because you want something to be very difficult to compute but
comparatively easy to verify. Mathematically speaking, a good trapdoor
function should be such that anyone without the solution will spend a great
deal of time trying to calculate it, while anyone with the solution will spend
almost no time verifying that the solution is correct.

If we use RSA as an example, we can see how this works in practice:

1\. Let _n_ be the product of two large primes, _p_ and _q_. Large means 512
bits or greater in this context. Then _n_ is the RSA modulus.

2\. Select a special number _e_ such that 1 < _e_ < ( _p_ \- 1)( _q_ \- 1),
and such that _e_ is coprime with ( _p_ \- 1)( _q_ \- 1) (meaning no numbers
divide evenly into both).

3\. Your RSA public key is now ( _n_ , _e_ ) - you can share this publicly
with anyone you want to securely communicate with. Conversely, _p_ and _q_
must _not_ be public. This is where your one-way function comes in to separate
what can be public from what must be private: _n_ is part of the public key,
but was generated by _p_ and _q_ , and multiplication of two primes is a one-
way function.

4\. Your RSA private key _d_ is computed from _p_ , _q_ and _e_ , such that for any
pair ( _n_ , _e_ ) there can only be a unique _d_. _d_ is the inverse of _e_
modulo ( _p_ \- 1)( _q_ \- 1) - in other words, _d_ is a unique number such
that, when multiplied by _e_ , is equal to 1 modulo ( _p_ \- 1)( _q_ \- 1).
This can be expressed simply as _ed_ = 1 mod ( _p_ \- 1)( _q_ \- 1).

5\. Assuming you chose _e_ correctly, _d_ is unique and very difficult to find
without having _p_ , _q_ and _e_. However, while it's very difficult to find
_without_ those inputs, it's very _easy_ to find if you have them, because you
can use what's called the Extended Euclidean Algorithm to compute _d_ in
polynomial time given _p_ and _q_.

6\. So now you want to encrypt something with RSA. Your peer has your public
key, ( _n_ , _e_ ). The encryption process to compute the ciphertext C from
the plaintext P is simple: C = P^ _e_ mod _n_ (C is equal to the plaintext P
multiplied by itself _e_ times and reduced modulo _n_ ). Exponentiation has
polynomial complexity.

7\. To decrypt the ciphertext C in order to compute the plaintext P, all the
holder of the private key needs to do is this: P = C^ _d_ mod _n_. In other
words, they raise the ciphertext C to the power of their private key.
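
The steps above can be sketched with deliberately tiny numbers. The primes 61 and 53, the exponent 17, and the function names are illustrative assumptions; this is unpadded "textbook" RSA, which is fine for seeing the arithmetic but not safe for real use:

```python
from math import gcd

# Steps 1-2: toy primes (real ones are hundreds of digits long) and exponent.
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)      # (p - 1)(q - 1)
e = 17
assert 1 < e < phi and gcd(e, phi) == 1

# Steps 4-5: d is the inverse of e modulo (p - 1)(q - 1). pow(e, -1, phi)
# (Python 3.8+) does the extended-Euclid computation internally.
d = pow(e, -1, phi)


def encrypt(m):
    """Step 6: C = P^e mod n, using only the public key (n, e)."""
    return pow(m, e, n)


def decrypt(c):
    """Step 7: P = C^d mod n, using the private exponent d."""
    return pow(c, d, n)


message = 65
ciphertext = encrypt(message)
print(n, d, ciphertext, decrypt(ciphertext))
```

Anyone holding `p` and `q` can recompute `d` in one line; someone holding only `n` would have to factor it first, which is trivial for 3233 and believed infeasible for properly sized moduli.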

So, returning to trapdoor functions in general: why does this _work_? It works
because the solution is easy to compute with all inputs, easy to _verify_ with
a reduced (public) set of inputs, and extremely difficult to compute with only
the reduced set of inputs (and the secret inputs are themselves difficult to
find from the public ones).

Hope that helps a bit and I understood your question's context correctly!
Trapdoor functions do not necessarily mean that there's "no way out", what
they mean is that a set of values have a relation such that one of the values
requires only a few of the values to compute a ciphertext and all or most of
the values to compute a plaintext for that ciphertext. When we talk about
functions being irreversible, we're specifically talking about _one-way
functions_ , and those are used in the context of hash functions. Trapdoor
functions are "merely" difficult to reverse.

One of the things I'm actively researching at the moment is whether or not
(and how) it would be possible to construct a post-quantum secure public-key
cryptosystem based on a hardness assumption derived from Ramsey-theoretic
graph problems (coincidentally, this problem had a recent treatment
after being quietly raised over a decade ago:
[https://pdfs.semanticscholar.org/1599/62064634fe10897aea300c...](https://pdfs.semanticscholar.org/1599/62064634fe10897aea300cddb3a909affe6e.pdf)).

~~~
T_D_K
This is a great response! But I think the person you replied to understands
all this, and is wondering about the exact method (or a layman's explanation)
of the same process in terms of elliptic curve crypto rather than the
traditional RSA (I'm wondering the same thing).

------
red_admiral
I always get a bit annoyed when someone tries to explain ECC with an image of
an elliptic curve over the reals. The whole point of ECs over finite fields in
cryptography is that they don't have any regular structure like that!

~~~
xyzzyz
What do you mean? Elliptic curves over finite fields have tons of structure
themselves, and to my knowledge, the discrete logarithm problem for ECs over
reals is just as hard as for finite fields.

~~~
red_admiral
I'm not sure what your level of understanding is, so I'll write this reply in
a way I hope the average reader can follow, though I assume you'll know most
of this already - I hope I don't come across as condescending.

TL;DR: the topology of these curves is quite different over finite fields.

I first encountered what I consider cryptographically unhelpful
drawings/intuition when I studied lattice-based cryptography. There's a whole
field where a fundamental hard problem is finding the closest vector to a
point not on the lattice. If you try and explain this by drawing a 2 or
3-dimensional example over the reals, anyone who's paying attention and
proficient in Linear Algebra will ask themselves, why don't you just transpose
the whole thing and then take the orthonormal projection? The problem is that
in a finite field, unlike over the reals or complex numbers, orthogonal
projection doesn't get you any closer to your target, so the intuition over
R^2 is in my opinion very unhelpful to understand exactly why SVP/CVP are
supposed to be hard, and indeed had me confused for a while until my professor
pointed out I should forget "the silly drawing over R^2" which he didn't like
either.

For Elliptic Curve Cryptography, I find the example of a curve over the reals
again misses the point of why exactly problems like DLOG are hard - for
discrete-log based crypto at the 256-bit security level over finite fields,
you need a modulus of about 15k bits depending on which site you look at (NIST
2016 at keylength.com is a good place to start) due to speedups from Number
Field Sieving etc. This is the kind of structure that I mean you don't get.

On EC, to get 256 bit security you need exactly 2 * 256 = 512 bits of key
(slightly oversimplifying, the factor 2 is because you get the "sign of the y
value" for free). The number 512 pretty much stands for the conjecture "there
is no other cryptographically exploitable structure to take DLOGs over these
Elliptic Curves". In fact it's not just "we haven't found any such structure"
but there's an argument about heights of points (Miller '85 I think - though
I'm pretty sure I've also read something by Koblitz on this) why on certain
kinds of curves such structure is unlikely to exist. (Of course, other kinds
of curves for fancy bilinear group stuff exist and do have more structure. And
supersingular curves are another topic altogether.)

The structure you obviously do get is a group, which you can extend to a
vector space over the finite field so that (x \mapsto xP) is a linear
function. The security property you want is roughly "you get this group, but
only this group" (and the "sign", so add 1 extra bit to your keylength) and
there is no useful concept of anything like points being close to each other,
continuous maps in the usual sense etc. Plot the points on an EC over a small
finite field and it looks like a random scatterplot rather than a neat and
elegant curve - which is the whole point of using ECs over finite fields for
DLOG-based crypto.
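
To see the scatterplot effect directly, here is a rough sketch that enumerates and prints the points of a toy curve. It assumes y^2 = x^3 + 2x + 3 over the integers mod 97; the parameters and the names `curve_points` and `ascii_plot` are illustrative only:

```python
# Illustrative toy parameters: y^2 = x^3 + A*x + B over F_p with p = 97.
P_MOD, A, B = 97, 2, 3


def curve_points(p=P_MOD, a=A, b=B):
    """List every affine point (x, y) on the curve by brute force."""
    # Bucket each y under y^2 mod p, so each x needs a single lookup.
    roots = {}
    for y in range(p):
        roots.setdefault(y * y % p, []).append(y)
    return [(x, y)
            for x in range(p)
            for y in roots.get((x ** 3 + a * x + b) % p, [])]


def ascii_plot(points, p=P_MOD):
    """Render the points on a p-by-p character grid, with y increasing upward."""
    grid = [["."] * p for _ in range(p)]
    for x, y in points:
        grid[p - 1 - y][x] = "#"
    return "\n".join("".join(row) for row in grid)


points = curve_points()
print(len(points), "affine points")
print(ascii_plot(points))
```

The only visible regularity in the output is the mirror symmetry between y and p - y, which is the "sign" mentioned above.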

~~~
xyzzyz
I found your comment interesting, especially the Miller '85 reference. You
certainly don't come across as condescending, so don't worry about that.

You are of course right that extra structure helps with solving DLOG. I was
hoping that you could point me to some specific reasons why DLOG is easier on
real/complex curves. I'd be very interested in learning these.

I don't find a "nice smooth curve vs. scatterplot" argument very convincing by
itself -- you only care about a cyclic subgroup anyway. Take a complex
elliptic curve, viewing it as a torus consider its fundamental parallelogram,
and plot a cyclic subgroup on this parallelogram. Won't you get a nice scatter
plot just the same?

Even if you can use this extra structure to find some more efficient DLOG
algorithms, you could try to apply the same solution as with the crypto based
on multiplicative groups of finite fields: just use larger points. My
understanding so far is that the reason we don't do it is purely computational
efficiency -- I'd be very interested in learning any reasons to think the
contrary.

And, back to the original point, I think that the drawings of real/complex
curves are very helpful when learning the basics. The group law is really best
understood if you actually draw some lines intersecting the real affine part
of some curve, show what happens when you draw a tangent line etc. If you
confine yourself to finite fields, while it formally works just the same, the
geometry is much less obvious, and talking to beginners about Cartier divisors
and line bundles won't help much.

While the finite fields are the whole point of elliptic curve cryptography,
they are definitely not the whole point of elliptic curves, and in fact I
believe that complex case with all the extra structure is best for educational
purposes.

~~~
red_admiral
I'm afraid I don't have a neat answer to your first question. I did find
the Koblitz paper I meant earlier but it's more about why one particular
attack won't work:
[https://link.springer.com/article/10.1007/s001459900040](https://link.springer.com/article/10.1007/s001459900040)

There are other reasons beyond efficiency why I would prefer people to switch
their DLOG crypto to EC in practice: it's hard enough to implement - even with
introductory blog posts with images - that your average developer leaves the
details to an expert. libsodium for example is very well designed and written
and few people think "I know, I'll write my own Curve25519 implementation" but
lots of people seem to think that they understand finite fields well enough to
build crypto, after all they have a bignum library and a modexp function, what
could possibly go wrong? I've seen finite-field discrete log software that is
supposed to be production ready with the following problems:

    
    
      * Non-constant time implementation of "schoolbook" square-and-multiply. 
      * Forgetting to check if points really are in the group, e.g. you're supposed to be working in a q-order subgroup of Z^*_p where q | (p-1)/2 but the software accepts any integer less than p as a group element. 
      * Crash or infinite loop if you pass 0 as a "group element".
      * Parameters can be specified or overridden by the sender of a message and set to tiny values.
      * Not checking whether the modulus is prime.
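
As a hedged illustration of the second and third bullets, here is a sketch of a membership check for a q-order subgroup of Z^*_p. The toy parameters (p = 23, q = 11, generator 4) and the name `is_valid_element` are assumptions for illustration; rejecting the identity 1 is a protocol-level choice made here, not a mathematical requirement:

```python
# Toy safe-prime group: P = 2*Q + 1, and G = 4 generates the subgroup of order Q.
# In real use these would be fixed, vetted parameters, never values supplied by
# the sender of a message.
P, Q, G = 23, 11, 4


def is_valid_element(x, p=P, q=Q):
    """Accept x only if it is a non-identity member of the order-q subgroup."""
    if not isinstance(x, int):
        return False
    if not 1 < x < p:          # rejects 0, 1, negatives and anything >= p
        return False
    return pow(x, q, p) == 1   # x^q == 1 exactly when x is in the subgroup


candidates = [0, 1, 4, 5, 22, 27]
print({x: is_valid_element(x) for x in candidates})
```

The `pow` call here is still not constant-time, so this addresses only the validation bullets, not the first one.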

------
jrubinovitz
Well done, loved the use of interactive graphics. Curious if you will continue
this approach, but for more advanced cryptography, or move on to other things.
Would love a more in depth guide to cryptography in this format.

------
landmark3
Well written and even I understood it - recommended!

------
jlgaddis
I'm not sure if this page uses MathML or something like that but it does not
render correctly on the iPad.

