Elligator: Elliptic-curve points indistinguishable from uniform random strings (2013) (acm.org)
97 points by NavinF 9 months ago | 25 comments



Elligator is a bidirectional map from random bytes to elliptic curve points, which is mainly useful for censorship resistance. Its state-of-the-art protocol integration as far as I know is obfs4 (https://gitlab.com/yawning/obfs4), one of the Tor circumvention pluggable transports (https://tb-manual.torproject.org/circumvention/). The other pluggable transports rely on disguising traffic as other protocols rather than making it look random.

Elligator implementations have a history of subtle bugs, arguably because there was not a spec, only a paper, although it looks like there are some third-party test vectors now.

In general the "inverse map" from random bytes to point is used only for censorship-resistance use cases, but the "direct map" turning random bytes (like a CSPRNG output or a hash) into a point is useful for a number of purposes in cryptography, like VRFs. That led to the direct map being specified more rigorously, like in https://www.rfc-editor.org/rfc/rfc9496.html#name-element-der... and https://datatracker.ietf.org/doc/html/rfc9380.

IMHO a map from a fixed amount of random bytes should be part of the fundamental group abstraction, and that's what Ristretto provides. The CFRG approach is slightly different, providing full domain-separated hash "suites" that go straight into a curve point.
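For a concrete picture of the direct map mentioned above, here is a minimal sketch of Elligator 2 on Curve25519, following the textbook formulas only; it is not any particular library's API, and a real suite like RFC 9380 adds domain separation, strict byte-to-field-element rules, y-coordinate recovery, and constant-time field arithmetic:

    # Minimal sketch of the Elligator 2 "direct map" on Curve25519
    # (y^2 = x^3 + A*x^2 + x over GF(2^255 - 19)). Textbook formulas only;
    # real hash-to-curve suites (e.g. RFC 9380) add domain separation,
    # careful byte-to-field conversion, y recovery, and constant-time ops.

    import os

    P = 2**255 - 19   # field prime
    A = 486662        # Curve25519 Montgomery coefficient
    U = 2             # a fixed non-square in GF(P), the usual choice here

    def is_square(n: int) -> bool:
        """Euler criterion: True iff n is zero or a quadratic residue mod P."""
        n %= P
        return n == 0 or pow(n, (P - 1) // 2, P) == 1

    def elligator2_map(r: int) -> int:
        """Map a field element r to the x-coordinate of a point on the curve."""
        denom = (1 + U * r * r) % P
        if denom == 0:                           # cannot occur for U=2 on this curve
            return (-A) % P
        v = (-A * pow(denom, P - 2, P)) % P      # v = -A / (1 + U*r^2)
        rhs = (v * (v * v + A * v + 1)) % P      # curve RHS at v: v^3 + A*v^2 + v
        return v if is_square(rhs) else (-v - A) % P

    def bytes_to_point_x(b: bytes) -> int:
        """Direct map: 32 uniform bytes (CSPRNG/hash output) -> curve x-coordinate."""
        return elligator2_map(int.from_bytes(b, "little") % P)

    print(hex(bytes_to_point_x(os.urandom(32))))

The inverse map (point back to uniform-looking bytes), which is the half the censorship-resistance use case needs, takes more care: only about half of all points have a representative at all, and the representative has to be serialized carefully for the output to actually be uniform.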


As a historical note, the first time I recall seeing the anti-censorship use case was in “Telex” [1], which uses it to steganographically embed a ciphertext into a TLS random nonce, so that a router in the middle can detect whether the connection should go to its terminus or be routed somewhere else. However, the authors of that paper didn’t have a standard primitive for doing this, so they had to devise their own approach.

Ironically, an even older use of the general approach is the NSA’s Dual EC DRBG [2], which has to encode elliptic curve points into random-looking bit sequences as part of its design as a PRNG. They ended up homebrewing that with standard curves by dropping several bits of each X coordinate, which in theory is all they need because there’s no need to go in the inverse direction. However, the conjecture is that Dual EC is backdoored, and the backdoor requires an inverse mapping from bitstrings back to points; that has to be achieved by brute-forcing the dropped bits of each candidate point and testing.

None of this has anything to do with your comment but I love the history of these steganographic tricks.

[1] https://www.usenix.org/conference/usenix-security-11/telex-a... [2] https://en.m.wikipedia.org/wiki/Dual_EC_DRBG
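To make the Dual EC detail concrete: each output block is a P-256 x-coordinate with its top 16 bits dropped, so going from a bitstring back to candidate points means trying all 2^16 prefixes and keeping those that land on the curve. A rough sketch with the standard P-256 parameters (not the actual attack; the real backdoor exploit then narrows the surviving candidates by checking which one predicts the next output block):

    # Rough sketch: Dual EC emits an x-coordinate with the top 16 bits dropped.
    # Recovering candidate points means trying all 2^16 prefixes and keeping
    # the x values that actually lie on P-256 (y^2 = x^3 - 3x + b).
    # Roughly half the prefixes (~2^15) survive; the real attack then filters
    # further by checking which candidate predicts the next output block.

    P256_P = 0xffffffff00000001000000000000000000000000ffffffffffffffffffffffff
    P256_B = 0x5ac635d8aa3a93e7b3ebbd55769886bc651d06b0cc53b0f63bce3c3e27d2604b

    def on_curve_x(x: int) -> bool:
        """True iff x is the x-coordinate of some point on P-256."""
        rhs = (pow(x, 3, P256_P) - 3 * x + P256_B) % P256_P
        return rhs == 0 or pow(rhs, (P256_P - 1) // 2, P256_P) == 1

    def candidate_points(truncated_240_bits: int):
        """Yield full 256-bit x-coordinates consistent with a 240-bit output."""
        for hi in range(1 << 16):
            x = (hi << 240) | truncated_240_bits
            if x < P256_P and on_curve_x(x):
                yield x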


Koblitz in "Elliptic Curve Cryptosystems" [1] dedicated section 3 to how to embed binary strings into points and back, which is of course necessary for elliptic curve ElGamal.

[1] https://doi.org/10.1090/S0025-5718-1987-0866109-5
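For reference, the embedding described there is (roughly) probabilistic: shift the message left by a few bits and try successive low-order offsets until the curve equation has a solution. A toy sketch, shown on Curve25519 purely for concreteness (Koblitz works with other curves and fields):

    # Toy sketch of Koblitz-style message embedding, here on Curve25519
    # (y^2 = x^3 + A*x^2 + x). Shift the message left by SHIFT bits and try
    # offsets j until the RHS is a square; each try fails with probability
    # ~1/2, so failure after 2^SHIFT tries is astronomically unlikely.

    P = 2**255 - 19
    A = 486662
    SHIFT = 8   # 2^8 = 256 offsets to try

    def is_square(n: int) -> bool:
        n %= P
        return n == 0 or pow(n, (P - 1) // 2, P) == 1

    def embed(m: int) -> int:
        """Embed an integer message m (< P >> SHIFT) as a curve x-coordinate."""
        assert 0 <= m < (P >> SHIFT)
        for j in range(1 << SHIFT):
            x = (m << SHIFT) | j
            if is_square((x * (x * x + A * x + 1)) % P):
                return x
        raise ValueError("embedding failed (astronomically unlikely)")

    def unembed(x: int) -> int:
        """Recover the message from an embedded x-coordinate."""
        return x >> SHIFT

    msg = int.from_bytes(b"hello", "big")
    assert unembed(embed(msg)) == msg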


This isn't meant to protect against real-time detection; the GFW, for example, will block streams that appear to be high-entropy from the get-go. It's more about concealing the fact that a key exchange has occurred in captured traffic that gets flagged for expert/human analysis.


Being indistinguishable from uniform random bytes means you can undetectably embed it within anything else that's supposed to be uniformly random.

Also, an expert human can't distinguish a single regular EC point from uniform random bytes, depending on the format. You'd still need to look at the statistics over a number of points (i.e. by writing a distinguisher).
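For example, here is the shape of such a distinguisher for a raw x-only (compressed-style) encoding on Curve25519, assuming keys are sent as bare 32-byte x-coordinates: a single sample proves nothing, but genuine public keys always satisfy the curve equation, while uniform random strings do so only about half the time.

    # Sketch of a statistical distinguisher for raw x-only encodings on
    # Curve25519: for a real public key, x^3 + A*x^2 + x is always a square
    # mod p; for uniform random 32-byte strings, only about half the time.
    # (Simplified: ignores encoding details such as values >= p.)

    import os

    P = 2**255 - 19
    A = 486662

    def looks_on_curve(b: bytes) -> bool:
        x = int.from_bytes(b, "little") % P
        rhs = (x * (x * x + A * x + 1)) % P
        return rhs == 0 or pow(rhs, (P - 1) // 2, P) == 1

    samples = [os.urandom(32) for _ in range(1000)]
    frac = sum(looks_on_curve(s) for s in samples) / len(samples)
    print(f"on-curve fraction for random strings: {frac:.2f}")  # ~0.5; real keys: ~1.0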


If a single point matches the curve equation of a well-known curve then that’s pretty compelling evidence.


The point is that any random byte string can be decoded by Elligator to become a point matching the curve equation. This means that such a check is virtually useless and tells you nothing about whether there is a hidden key exchange happening or not.


not for compressed point representations (hence "depending on format")


So just X25519 then?


No, most curves support a compressed (x coordinate plus parity/sign bit) representation. It's what you'd probably be using if Elligator didn't exist, so it makes sense as the point of comparison.

https://datatracker.ietf.org/doc/draft-mattsson-tls-compact-...


Wouldn't it be a good candidate for steganography then? If I hide this string in my paragraph of innocuous text, then I have some plausible deniability if Eve successfully guesses what I'm doing.


The trouble with this is that whatever steganographic feature you're embedding it into would (likely) not be uniformly random by default. So although nobody could prove you were transmitting an EC point, they could reasonably assume you were transmitting some form of encrypted data, given enough samples.

I think steganography is the right idea in general, but it needs to be more clever than the classic text/image-based techniques.


Elligator is appropriate for use as a component in a steganography system, but in most cases the problem it solves isn't the hardest part of the system.

It's also useful in cases where you need to hash to a curve in a way that provably (in the ROM) doesn't have problematic structure. This comes up fairly often: OPRF, Boneh-Boyen short sigs, password-auth key exchanges like SPEKE and PAK, etc. It also was useful in SIKE, before that was broken.


Elligator is only tangentially relevant to the hashing problem. If "doesn't have a problematic structure" is understood in a weak sense, then this was already solved by Shallue and van de Woestijne almost a decade prior to the Elligator paper. And if it is understood in the stronger sense of indifferentiability, then Elligator by itself doesn't actually fit the bill, and you need something like the Brier et al. approach or SwiftEC.

The steganography stuff is the only actually novel contribution of the Elligator paper.


Perhaps it's a bit more than tangentially relevant? Yes, SW[U] predates it for hashing, and Brier et al builds indifferentiability on that. But Elligator2 is simpler and faster, and also works with Brier et al, though it only supports even-order curves. As a result, Elligator2 is specified in RFC 9380 along with SW[U], and has fairly broad implementation.

Also yes, SwiftEC is faster than Elligator2 + Brier et al, at least if Jacobi symbols are fast with your parameters/hardware. But you can do something very similar to SwiftEC with Elligator2. This is simpler than SwiftEC and probably still faster, and was published earlier (https://eprint.iacr.org/2020/1513).


You can make a random (high-entropy) signal look non-random (low-entropy) by adding strings of zeros and ones to it. Or you can embed it into known formats.


I was always fascinated by stuff like LSB steganography, where you hide a message (or another image) in the least significant bits of an image. It's not visible to the human eye, and mathematically it should look like thermal noise in the image at most.


Image sensor noise is not a uniform distribution (I believe it's typically modeled as a Gaussian distribution, but even that's an approximation).

Thus, it is mathematically/statistically distinguishable.


The LSB of samples from a Gaussian distribution is uniformly distributed: a 50/50 chance of 0/1.

(Assuming you have enough bits per sample, which image sensors provide IRL)
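A quick sanity check of that claim under an idealized model (pure Gaussian read noise quantized to 12-bit samples, with sigma spanning a few quantization steps; real sensor noise also has fixed-pattern components and processing artifacts):

    # Quantize Gaussian noise to 12-bit integer samples and check how balanced
    # the least significant bit is. Assumed model: pure Gaussian noise with
    # sigma = 4 LSB steps around a mid-range signal level.

    import random

    random.seed(1)
    N = 1_000_000
    MEAN, SIGMA = 2048.0, 4.0

    def sample_12bit() -> int:
        v = round(random.gauss(MEAN, SIGMA))
        return min(max(v, 0), 4095)    # clamp to the 12-bit range

    ones = sum(sample_12bit() & 1 for _ in range(N))
    print(f"fraction of LSB=1: {ones / N:.4f}")   # ~0.50 when sigma >> 1 LSB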


Only in theory. In practice, they're correlated with the other bits and adjacent pixels.


How correlated? I spent 2 minutes trying and failing to get chatGPT to generate code that downloads an image dataset and runs diehard on the LSBs. I'd expect the LSBs to pass the randomness test.
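Not a diehard run over a dataset, but here is a quick measurement sketch on one of scikit-image's bundled sample photos (assuming numpy and scikit-image are installed): it reports the LSB-plane bit balance and the correlation between neighboring pixels' LSBs, which is the kind of structure a proper randomness test suite would pick up. Results will vary by image, sensor, and any processing the image has been through.

    # Quick look at LSB statistics on a bundled sample photo (not a full
    # dataset or diehard run). Assumes scikit-image and numpy are installed.

    import numpy as np
    from skimage import data

    img = data.camera()                       # 8-bit grayscale sample image
    lsb = (img & 1).astype(np.float64)

    balance = lsb.mean()                                      # ~0.5 if unbiased
    horiz = np.corrcoef(lsb[:, :-1].ravel(), lsb[:, 1:].ravel())[0, 1]
    vert = np.corrcoef(lsb[:-1, :].ravel(), lsb[1:, :].ravel())[0, 1]

    print(f"LSB balance: {balance:.3f}")
    print(f"adjacent-pixel LSB correlation: horiz={horiz:.3f}, vert={vert:.3f}")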


Here's a mathy explanation of its use in the real world (for peer to peer communications): https://github.com/bitcoin-core/secp256k1/blob/master/doc/el...



I have been thinking about local privacy-preserving AI lately, and steganography circumscribes much of the challenge. Having access to sensitive data means you have to limit access to a lot of tools. That's how it would work in a privacy context, but the same tools are neutral to being used in a censorship context.

It's been over 10 years since this was published, but it would be nice if important research like this at least touched on the egalitarian issues rather than presenting a partisan agenda. E.g., someone somewhere who has to deal with private data now also, without any doubt, has to deal with even stricter restrictions.

Sometimes I worry about researchers working on important issues with blinders apparently on. If we don't self-supervise, we just outsource the work, and in this case that means we are back to square one.


Research like this does not touch on that issue because it’s completely unrelated. Why would someone trying to exfiltrate private data have to do key exchange?

Moreover, if you try to enforce anything by placing a bump in the wire (the only thing defeated by this), you’re gonna lose every time.

This “what if a bad guy has access to technology too!?” argument is tiresome.



