Psychic Signatures in Java (neilmadden.blog)
309 points by 19870213 on April 20, 2022 | 116 comments



This is probably the cryptography bug of the year. It's easy to exploit and bypasses signature verification on anything using ECDSA in Java, including SAML and JWT (if you're using ECDSA in either).

The bug is simple: like a lot of number-theoretic asymmetric cryptography, the core of ECDSA is algebra on large numbers modulo some prime. Algebra in this setting works for the most part like the algebra you learned in 9th grade; in particular, zero times any algebraic expression is zero. An ECDSA signature is a pair of large numbers (r, s) (r is the x-coordinate of a randomly selected curve point derived from the infamous ECDSA nonce; s is the signature proof that combines r, the hash of the message, and the secret key). The bug is that Java 15+ ECDSA accepts (0, 0).
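The proof of concept in the linked post is about six lines of jshell (paraphrased from memory, so treat the details as approximate; on a patched JDK the last line returns false or throws):

  var keys = KeyPairGenerator.getInstance("EC").generateKeyPair();
  var blankSignature = new byte[64];  // r = s = 0 in raw P1363 encoding
  var sig = Signature.getInstance("SHA256WithECDSAInP1363Format");
  sig.initVerify(keys.getPublic());
  sig.update("Hello, World".getBytes());
  sig.verify(blankSignature);         // true on vulnerable Java 15-18 builds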

For the same bug in a simpler setting, just consider finite field Diffie-Hellman, where we agree on a generator G and a prime P; Alice's secret key is `a` and her public key is `A = G^a mod P`; I do the same with `b` and `B`. Our shared secret is `A^b mod P` or `B^a mod P`. If Alice (or a MITM) sends 0 (or 0 mod P) in place of A, then they know what the result is regardless of anything else: it's zero. The same bug recurs in SRP (which is sort of a flavor of DH) and protocols like it (but much worse, because Alice is proving that she knows a key and has an incentive to send zero).
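(Two lines of BigInteger make the point, with `b` and `p` standing in for Bob's secret exponent and the agreed prime:)

  // 0^b mod p == 0 for any b > 0, so the attacker knows the "shared secret"
  // without knowing anything about b.
  BigInteger shared = BigInteger.ZERO.modPow(b, p);  // always BigInteger.ZERO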

The math in ECDSA is more convoluted but not much more; the kernel of ECDSA signature verification is extracting the `r` embedded into `s` and comparing it to the presented `r`; if `r` and `s` are both zero, that comparison will always pass.
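Spelled out with the textbook verification equations (the JDK's actual code path differs in detail, but this is the gist):

  w  = s^-1 mod n
  u1 = H(m) * w mod n
  u2 = r * w mod n
  R  = u1*G + u2*Q
  accept iff R.x == r (mod n)

If the implementation computes s^-1 as s^(n-2) mod n (a common constant-time inversion trick), then s = 0 quietly yields w = 0, so u1 = u2 = 0, R becomes the point at infinity, and an implementation that represents its x-coordinate as 0 will happily conclude 0 == 0.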

It is much easier to mess up asymmetric cryptography than it is to mess up most conventional symmetric cryptography, which is a reason to avoid asymmetric cryptography when you don't absolutely need it. This is a devastating bug that probably affects a lot of different stuff. Thoughts and prayers to the Java ecosystem!


Interestingly, EdDSA (best known through its Ed25519 instantiation) does not need as many checks as ECDSA, and assuming the public key is valid, an all-zero signature will be rejected by the main check alone. All you need to do is verify the following equation:

R = SB - Hash(R || A || M) A

Where R and S are the two halves of the signature, A is the public key, and M is the message (and B is the curve's base point). If the signature is zero, the equation reduces to Hash(R || A || M)A = 0, which is always false with a legitimate public key.

And indeed, TweetNaCl does not explicitly check that the signature is not zero. It doesn't need to.

However.

There are still ways to be clever and shoot ourselves in the foot. In particular, there's the temptation to convert the Edwards point to Montgomery, perform the scalar multiplication there, then convert back (doubles the code's speed compared to a naive ladder). Unfortunately, doing that introduces edge cases that weren't there before, that cause the point we get back to be invalid. So invalid in fact that adding it to another point gives us zero half the time or so, causing the verification to succeed even though it should have failed!

(Pro tip: don't bother with that conversion, variable time double scalarmult https://loup-vaillant.fr/tutorials/fast-scalarmult is even faster.)

A pretty subtle error, though with eerily similar consequences. It looked like a beginner-nuclear-boyscout error, but my only negligence there was messing with maths I only partially understood. (A pretty big no-no, but I have learned my lesson since.)

Now if someone could contact the Wycheproof team and get them to fix their front page so people know they have EdDSA test vectors, that would be great. https://github.com/google/wycheproof/pull/79 If I had known about those, the whole debacle could have been avoided. Heck, I bet my hat their ECDSA test vectors could have avoided the present Java vulnerability. They need to be advertised better.


> Thoughts and prayers to the Java ecosystem!

Some very popular PKI systems (many CAs) are powered by Java and BouncyCastle ...


BouncyCastle has its own implementation of ECDSA, and it’s not vulnerable to this bug.


>infamous ECDSA nonce

Why "infamous"?


It's more properly called 'k'. It's really a secret key, but it has to be unique per-signature. If an attacker can ever guess a single bit of the nonce with probability non-negligibly >50%, they can find the private key of whoever signed the message(s).

It makes ECDSA very brittle, and quite prone to side-channel attacks (since those can give attackers exactly such information).
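The simplest case to work through is a fully repeated nonce (famously how the PS3 signing key was extracted): two signatures (r, s1) and (r, s2) over hashes z1 and z2 made with the same k satisfy

  s1 - s2 = k^-1 * (z1 - z2)  (mod n)
  k = (z1 - z2) / (s1 - s2)   (mod n)
  d = (s1*k - z1) / r         (mod n)

Recovering the key d from a mere bias in k takes lattice techniques rather than this two-liner, but it ends in the same place (see the links downthread).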


There's an easy fix for that though -- generate k deterministically using the procedure in RFC6979 [1].

[1] https://datatracker.ietf.org/doc/html/rfc6979#section-3.2
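The gist, loosely sketched in Java (this is a deliberate simplification, not the RFC's actual HMAC-DRBG procedure, which loops and retries out-of-range candidates instead of reducing mod n):

  import java.math.BigInteger;
  import javax.crypto.Mac;
  import javax.crypto.spec.SecretKeySpec;

  // Hypothetical simplification of RFC 6979: derive k from the private key and
  // the message hash, so the same (key, message) pair always yields the same
  // nonce and no RNG is involved. Note the final mod n introduces exactly the
  // kind of bias the RFC's retry loop exists to avoid; don't ship this.
  static BigInteger sketchDeterministicK(byte[] privateKey, byte[] messageHash,
                                         BigInteger n) throws Exception {
      Mac mac = Mac.getInstance("HmacSHA256");
      mac.init(new SecretKeySpec(privateKey, "HmacSHA256"));
      return new BigInteger(1, mac.doFinal(messageHash)).mod(n);
  }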


> If an attacker can ever guess a single bit of the nonce with probability non-negligibly >50%, they can find the private key of whoever signed the message(s).

This doesn't seem right. Why wouldn't someone guess a bit as 0, see if the recovered message makes sense, and if it doesn't, then try 1?

It would make the entire scheme useless no? Am I missing something?


I think they have to get the bit repeatedly and then combine the biased signatures together mathematically to get the key.


That makes no sense; how can you get the private key from knowing 1 bit of the nonce?


See, cryptography engineering is sinking in!

Here you go:

https://toadstyle.org/cryptopals/62.txt

What's especially great about this is that it's very easy to accidentally have a biased nonce; in most other areas of cryptography, all you care about when generating random parameters is that they be sufficiently (ie, "128 bit security worth") random. But with ECDSA, you need the entire domain of the k value to be random.


Ok, but for this scheme you need a large number of signatures from the same biased RNG, which makes sense. I thought that the GP was suggesting that you can recover the key from one signature with just a few bits.


"same biased RNG" largely reduces to "I use the same computer to generate all my signatures"; for example see the Debian RNG bug from 2008

and "large amount of signatures" could be "I sign every email I send to a mailing list" or "I use this key to sign some widely distributed software every two weeks"


When these bugs first came into fashion, the "bias" of the RNG was an implementation artifact, not some bug in /dev/random: it was the code you used to fill a bignum uniformly in the size of the nonce. So mentally substitute "same implementation, with same keys" for "same biased RNG".

Yes, the attacks require many signatures. But, like the infamous Bleichenbacher RSA attack, which was originally dubbed "The Million Message Attack", in part as a jab at how impractical it was presumed to be, collecting thousands of signatures is often a very realistic attack; for instance, against any system that generates signed messages automatically.


I guess like so:

https://cryptopals.com/sets/8/challenges/62.txt

E: Thomas beat me to it


I'm not particularly knowledgeable here, but I know it's extremely fragile, far beyond just needing to be unique. See "LadderLeak: Breaking ECDSA With Less Than One Bit of Nonce Leakage"


Thank you for that, that was a great explanation.


This is the sort of dumb mistake that ought to get caught by unit testing. A junior, assigned the task of testing this feature, ought to see that in the cryptographic signature design these values are checked as not zero, try setting them to zero, and... watch it burn to the ground.

Except that, of course, people don't actually do unit testing, they're too busy.

Somebody is probably going to mention fuzz testing. But, if you're "too busy" to even write the unit tests for the software you're about to replace, you aren't going to fuzz test it are you?


The issue is the assumption that juniors should be writing the unit tests; sounds like you might be part of the problem.


I think I probably technically count as a junior in my current role, which is very amusing and "I don't write enough unit tests" was one of the things I wrote in the self-assessed annual review.

So, sure.


It's more that unit testing is everybody's job, especially for complex cryptographic functions, which should really have at least two sets of eyes, or even two sets of test cases where each developer doesn't see the other developer's tests, to reduce the likelihood that positive bias overlooks missing tests.

But I say that as someone who regularly audits code that, judging by the quality of the applications, almost certainly has no unit tests; just one set would do me fine.


The point of fuzz testing is not having to think of test cases in the first place.


[Somebody had down-voted you when I saw this, but it wasn't me]

These aren't alternatives, they're complementary. I appreciate that fuzz testing makes sense over writing unit tests for weird edge cases, but "these parameters can't be zero" isn't an edge case, it's part of the basic design. Here's an example of what X9.62 says:

> If r’ is not an integer in the interval [1, n-1], then reject the signature.

Let's write a unit test to check, say, zero here. Can we also use fuzz testing? Sure, why not. But lines like this ought to scream out for a unit test.
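Something like this (a JUnit-style sketch; it fails on a vulnerable JDK and passes on a fixed one, whether rejection comes back as false or as an exception):

  @Test
  void rejectsAllZeroEcdsaSignature() throws Exception {
      KeyPair keys = KeyPairGenerator.getInstance("EC").generateKeyPair();
      Signature sig = Signature.getInstance("SHA256withECDSAinP1363Format");
      sig.initVerify(keys.getPublic());
      sig.update("test message".getBytes());
      boolean accepted;
      try {
          accepted = sig.verify(new byte[64]);  // r = s = 0
      } catch (SignatureException e) {
          accepted = false;                     // rejection by exception is also fine
      }
      assertFalse(accepted);
  }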


Right, I'm just saying: there's a logic that says fuzz tests are easier than specific test-cases: the people that run the fuzz tests barely need to understand the code at all, just the basic interface for verifying a signature.


You still need your tests to cover all possible errors (or at least all plausible errors). If you try random numbers and your prime happens to be close to a power of two, evenly distributed random numbers won't end up outside the [1, n-1] range you are supposed to validate. Even if your prime is far enough from a power of two, you still won't hit zero by chance (and you need to test zero, because you almost certainly need two separate pieces of code to reject the =0 and >=n cases).
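(The check itself is two independent comparisons, which is why you need at least two hand-picked test inputs:)

  // X9.62's "r' in [1, n-1]" written out: uniform random values below n
  // exercise neither branch, so each needs its own targeted test vector.
  static boolean inRange(BigInteger x, BigInteger n) {
      return x.signum() > 0          // rejects x == 0 (and negative encodings)
          && x.compareTo(n) < 0;     // rejects x >= n
  }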

Another example is Poly1305. When you look at the test vectors from RFC 8439, you notice that some are specially crafted to trigger overflows that random tests wouldn't stumble upon.

Thus, I would argue that proper testing requires some domain knowledge. Naive fuzz testing is bloody effective but it's not enough.


That’s all true, but fuzz testing is very effective at checking boundary conditions (near 0, near max/mins) and would have caught this particular problem easily.


Do you mean fuzz testing does not use even distributions? There’s a bias towards extrema, or at least some guarantee to test zero and MAX? I guess that would work.

Also, would you consider the following to be fuzz testing? https://github.com/LoupVaillant/Monocypher/blob/master/tests...


No, most fuzz testing frameworks I know of these days do not use even distributions. Most use even more sophisticated techniques such as instrumenting the code to detect when state transitions are triggered to try to maximize hitting all code paths in a program instead of repeatedly fuzzing the same path.


The usual trick is coverage-guided fuzzing. https://google.github.io/clusterfuzz/reference/coverage-guid...


If we write an automated test case for known acceptance criteria, and then write necessary and sufficient code to get those tests to pass, we would know what known acceptance criteria are being fulfilled. When someone else adds to the code and causes a test to fail, the test case and the specific acceptance criteria would thus help the developer understand intended behaviour (verify behaviour, review implementation). Thus, the test suite would become a catalogue of programmatically verifiable acceptance criteria.

Certainly, fuzz tests would help us test boundary conditions and more, but they are not a catalogue of known acceptance criteria.


While fuzz testing is good and all, when it comes to cryptography, the input space is so large that the chances of finding something are even worse than finding a needle in a haystack.

For instance, here the keys are going to be around 256 bits in size, so if your fuzzer is just picking values at random, you're basically never going to pick zero.

With cryptographic primitives you really should be testing all known invalid input parameters for the particular algorithm. A random fuzzer is not going to know those. Additionally, you should be testing that inputs that can cause overflows are handled correctly, etc.
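Something along these lines, as a sketch (n being the curve's group order):

  // Hand-picked boundary values for r and s that a uniform random fuzzer over
  // 256-bit inputs will essentially never produce, yet all must be rejected.
  BigInteger[] mustReject = {
      BigInteger.ZERO,          // this very bug
      n,                        // congruent to 0 mod n without being literally zero
      n.add(BigInteger.ONE),    // just past the top of the valid [1, n-1] range
  };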


Yes, but here we're just looking for (0,0).


This is true in principle but in practice most fuzz testing frameworks demand a fair bit of setup. It’s worth it!

But if you are in a time constrained environment where basic unit tests are skipped fuzz testing will be as well.


Imagine being so senior you no longer need to write unit tests yourself, but just delegate them.

Sounds exactly like the kind of disconnected environment that would lead to such bugs.


Apparently you have to get a new CPU to fix this Java vulnerability, or alternatively a new PSU.

(That is to say: a Critical Patch Update or a Patch Set Update. Did they really have to overload these TLAs?)


Does anyone know why this was only given a CVSS score of 7.5? Based on the description this sounds way worse, but Oracle only gave it a CVSS Confidentiality Score of "None", which doesn't sound right. Is there some mitigating factor that hasn't been discussed?

In terms of OpenJDK 17 (latest LTS), the issue is patched in 17.0.3, which was released ~12h ago. Note that official OpenJDK docker images are still on 17.0.2 as of the time of writing.


CVSS is a completely meaningless Ouija board that says whatever the person authoring the score wants it to say.


The fix for OpenJDK (authored on Jan. 4th 22):

https://github.com/openjdk/jdk/blob/e2f8ce9c3ff4518e070960ba...


with commit message “Improve ECDSA signature support” :D


I'm guessing the commit message is obscured to give people more time to update before it's exploited in the wild.


Why are there no tests?


I spot no test or comment in the code on why this assertion is important.


It's literally what the whole bug is about. From OP's article:

>This is why the very first check in the ECDSA verification algorithm is to ensure that r and s are both >= 1. Guess which check Java forgot?


Yes I just think it’s insane they fixed it without adding a test or comment.


What puzzles me most is that two days after the announcement of the vulnerability and the release of the patched Oracle JDK, there is still no patched version of OpenJDK for most distributions.

We're running some production services on OpenJDK and CentOS and until now there are only two options to be safe: shut down the services or change the crypto provider to BouncyCastle or something else.

The official OpenJDK project lists the planned release date of 17.0.3 as April 19th, still the latest available GA release is 17.0.2 (https://wiki.openjdk.java.net/display/JDKUpdates/JDK+17u).

Adoptium have a large banner on their website and until now there is not a single patched release of OpenJDK available from them (https://github.com/adoptium/adoptium/issues/140).

There are no patched packages for CentOS, Debian or openSUSE.

The only available version of OpenJDK 17.0.3 I've seen until now seems to be the Archlinux package (https://archlinux.org/packages/extra/x86_64/jdk17-openjdk/). They obviously have their own build.

How can it be that this is not more of an issue? I honestly don't get how the release process of something as widely used as OpenJDK can take more than 2 days to provide binary packages for something already fixed in the code.

This shouldn't be much more effort than letting the CI do its job.

Edit: Typo.


Azul published updated packages yesterday, including for some older non-LTS Java versions: https://www.azul.com/downloads/?package=jdk#download-openjdk


Thanks for the info! That's very interesting since they usually only provide out-of-cycle critical fixes for their paid tiers. On the other hand - this only proves that it's actually possible to provide a hot-fixed OpenJDK in time.

Unfortunately, I assume that a very common case is just using the distribution-provided openjdk package and configuring the system for auto updates. So the main issue here is that a serious number of systems are relying on the distribution's patch process to fix issues like this, and they are still vulnerable at this moment.


I wouldn't use your distro's version of OpenJDK. If you want fast updates, you need to be using Azul or some other provider who is dedicated to it.


I can see how this would have helped in this case.

As I see it, the distributions are mostly relying on the upstream provisioning of the OpenJDK project. So if they fix this issue, it shouldn't take long until we see updated packages in all major distributions. This might be a problem specific to the OpenJDK build process, so a different package source would help in that case.

But as mentioned above, Azul usually doesn't provide out-of-cycle critical fixes without a paid plan. And most people will still use whatever the distribution provides - so this is still an issue regardless of alternative package sources.

And since I assume that many or most running JDK instances actually come from the distribution's repository rather than an alternative source, and there is literally no outcry regarding the missing packages whatsoever, I fear that there are a lot of vulnerable software systems whose operators don't know about it right now.


For folks on RHEL, the java-17-openjdk package for RHEL 8 has been updated: https://access.redhat.com/errata/RHSA-2022:1445.

> The official OpenJDK project lists the planned release date of 17.0.3 as April 19th, still the latest available GA release is 17.0.2

> (https://wiki.openjdk.java.net/display/JDKUpdates/JDK+17u).

I don't think 17.0.3 will ever be available from openjdk.java.net; there's no LTS for upstream builds, and since Java 18 is out already, no further builds of 17 should be expected there. IMO, this warrants some clarification on that site though.


> I don't think 17.0.3 will ever be available from openjdk.java.net

https://adoptopenjdk.net/upstream.html

These are the official upstream builds by the updates project, built by Red Hat. Not to be confused with Red Hat Java, and not to be confused with the AdoptOpenJDK/Adoptium builds. These can't be hosted on openjdk.java.net because that site hosts only builds done by Oracle (not to be confused with Oracle JDK).


This site doesn't provide anything newer than OpenJDK 11 and references the Adoptium projects for July 2021 and future releases. But Adoptium only provide their own Temurin distribution. Looks like a dead end for an OpenJDK 17.0.3 upstream build.


Thanks for the clarification. The site is not clear on that topic and actually suggests otherwise by listing the planned release dates in the timeline.

On the other hand, the problem that many popular server distributions like CentOS and Debian still haven't updated their Java 17 packages remains and I wonder if this is due to their own package build process or because they are waiting for an upstream process to complete.

If they actually rely on the upstream builds from openjdk.java.net that would mean that the fix will not make it to their repositories at all.


Amazon had releases of Corretto available on April 19th; Corretto 17 was released before 10am PDT, less than an hour after the announcement.


>Just a basic cryptographic risk management principle that cryptography people get mad at me for saying (because it’s true) is: don’t use asymmetric cryptography unless you absolutely need it.

Is there any truth to this? Doesn't basically all Internet traffic rely on the security of (correctly implemented) asymmetric cryptography?


I've seen this argument often on the topic of JWTs, which are also mentioned in the tweets here. In many situations there are simpler methods than JWTs that don't require any cryptography, e.g. simply storing session ids server-side. With these simple methods there isn't anything cryptographic that could break or be misused.

The TLS encryption is of course assumed here, but that is nothing most developers ever really touch in a way that could break it. And arguably this part falls under the "you absolutely need it" exception.
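(The lookup-only version, sketched with hypothetical request/store objects since there's no single standard API for this:)

  // Opaque random session id: validity is a server-side lookup rather than a
  // signature check, so there is no verification algorithm to get wrong.
  String sessionId = request.getCookie("session");  // hypothetical accessor
  Session session = sessionStore.get(sessionId);    // hypothetical server-side store
  if (session == null || session.isExpired()) {
      rejectRequest();                              // hypothetical 401 handler
  }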


You can still use JWTs with a symmetric key. I believe HS256 just uses a symmetric-key HMAC with SHA-256. If you go beyond JWT, Kerberos only uses symmetric cryptography while not being as centralized as other solutions. Obviously, the domain controller is centralized, but it allows for various services to use common authentication without compromising the whole domain if any one service is compromised (assuming correct configuration, which is admittedly difficult with Kerberos).
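(HS256 verification really is just an HMAC comparison; a bare-bones sketch without a JWT library, where signingInput is base64url(header) + "." + base64url(payload):)

  import java.nio.charset.StandardCharsets;
  import java.security.MessageDigest;
  import javax.crypto.Mac;
  import javax.crypto.spec.SecretKeySpec;

  static boolean verifyHs256(String signingInput, byte[] tag, byte[] secret)
          throws Exception {
      Mac mac = Mac.getInstance("HmacSHA256");
      mac.init(new SecretKeySpec(secret, "HmacSHA256"));
      byte[] expected = mac.doFinal(signingInput.getBytes(StandardCharsets.US_ASCII));
      return MessageDigest.isEqual(expected, tag);  // constant-time comparison
  }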


Server-side session storage isn't necessarily a replacement for JWTs. It can be in many cases, but it's not one to one. JWTs do have advantages.


That's why I wrote "in many cases". The problem is more that, for a while at least, JWTs were pretty much sold as the new and shiny replacement for classic sessions, which they're not. They absolutely have their uses, but they also have additional attack surface.


The biggest problem with JWTs is not what cryptography you use (though there was a long-standing issue where clients could set the "alg" header to "none" and have verification skipped entirely...) but rather revocation.

x509 certificates have several revocation mechanisms, since the need to mark something as "do not use" before the end of its lifetime is well understood. JWTs are not quite there.


JWT is just a container for authenticated data. It's comparable to the ASN.1 encoding of an x509 certificate, not to the entire x509 public key infrastructure.

You could compare x509 with revocation to something like oauth with JWT access tokens, though.

In that case, x509 certificates are typically expensive to renew and have lifetimes measured in years. Revocation involves clients checking a revocation service. JWT access tokens are cheap to renew and have lifetimes measured in minutes. Revocation involves denying a refresh token when the access token needs renewing. Clients can also choose to renew access tokens much more frequently if a 'revocation server' experience is desirable.

Given the spotty history of CRLDP reliability, I think oauth+JWT are doing very well in comparison. I'm pretty damn confident that when I revoke an application in Google or similar it will lose access very quickly.


> x509 certificates are typically expensive to renew and have lifetimes measured in years

In the Web PKI, which thanks to Certificate Transparency we can actually measure, the typical X509 certificate was issued by ISRG (Let's Encrypt), thus cost well under one dollar (free to the subscriber; the cost is borne by the donors), and has a lifetime of precisely 90 days.


> In the Web PKI, which thanks to Certificate Transparency we can actually measure, the typical X509 certificate was issued by ISRG (Let's Encrypt), thus cost well under one dollar (free to the subscriber; the cost is borne by the donors), and has a lifetime of precisely 90 days.

Yes, it's true that in the past few years Let's Encrypt has substantially altered the typical lifetime of web server certificates, as well as substantially eased the burden of refreshing a certificate in what I would guess to be the majority of use cases.

Revocation, however, is still a mess. OCSP services are slow and a privacy leak, and are largely ignored by browsers - in 2021 Firefox was still checking OCSP services, but given how unreliable they are, if it can't contact a service it assumes the certificate is fine. OCSP winds up being a trade-off between letting an attacker deny service to all certificates (hard-fail) and letting them bypass revocation (soft-fail).

In practice the major browser vendors all do more or less the same thing - build their own proprietary list of revoked certificates and distribute it to browsers from time to time, with varying sources and granularity on what they will and won't include in their centralised CRLs. I would have little faith in a timely revocation of a compromised server certificate.


Not to mention OCSP stapling provides a revocation escape hatch that allows a certificate to continue to be used even after it has been revoked and the revocation has been streamed to all relevant OCSP servers.


> Is there any truth to this?

Yes, symmetric cryptography is a lot more straightforward and should be preferred where it is easy to use a shared secret.

> Doesn't basically all Internet traffic rely on the security of (correctly implemented) asymmetric cryptography?

It does. This would come under the "unless you absolutely need it" exception.


Note that symmetric encryption is also really hard. It wasn't until 2010 or so that GCM mode came around and provided a system that is somewhat easy to implement without accidentally breaking everything.


GCM is not without its own pitfalls, however.


Initial connection negotiation and key exchange does, anything after that no. It will use some kind of symmetric algo (generally AES).

It's a bad idea (and no one should be doing it) to continue using asymmetric crypto algorithms after that. If someone can get away with a pre-shared (symmetric) key, that's sometimes/usually even better, depending on the risk profiles.


> It will use some kind of symmetric algo (generally AES).

AES-GCM, you mean. Let's not forget the authentication in "authenticated encryption". I'm nitpicking, but if a beginner comes here it's better to make it clear that in general, encryption alone is not enough. Ciphertext malleability and all that.
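(In the JDK that's a single cipher transformation; a minimal encryption sketch, which also shows where the 16 extra output bytes discussed below come from:)

  import java.security.SecureRandom;
  import javax.crypto.Cipher;
  import javax.crypto.KeyGenerator;
  import javax.crypto.SecretKey;
  import javax.crypto.spec.GCMParameterSpec;

  SecretKey key = KeyGenerator.getInstance("AES").generateKey();
  byte[] nonce = new byte[12];                    // 96-bit nonce; must never repeat per key
  new SecureRandom().nextBytes(nonce);
  Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
  cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, nonce));
  byte[] ciphertext = cipher.doFinal(plaintext);  // length = plaintext.length + 16 (tag appended)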


Plenty of stuff still uses CBC (or other modes) with another authentication method. AES-GCM is nice in that it combines both explicitly, but a lot of stuff just combines other methods and it’s fine.

AES-GCM has the annoying property of output size > input size for instance.


I was just trying to mention the most widespread method. Sure you can use AES-CBC or AES-CTR, and combine it with HMAC or keyed BLAKE2…

but as tptacek pointed out, all authentication methods are going to increase your message size. It's unavoidable: to get authentication you need some redundancy, and the only general way to get that redundancy is to have a message bigger than the plaintext. We do have attempts at length-preserving authenticated encryption, but as far as I know they're not as well studied as the classical "encrypt-then-MAC" methods such as AES-CBC + HMAC or AES-GCM. https://security.googleblog.com/2019/02/introducing-adiantum...


All secure encryption has an output size greater than its input size.


How so? No symmetric crypto algorithm that I am aware of changes the message size at its basic level. If you have an example, that would be helpful. I'm not referring to padding. If you're referring to the IV, then I see what you're saying, but most algorithms derive that from positional data or treat it like a semi-public part of the key, which isn't what I'm referring to.

AES-GCM (as a method) is unusual this way, because it combines encryption and validation at the same time, in each block. They’re two steps - you have the cipher text and the validation data separate.

It’s encrypting + signing everything, essentially, for each block. It stores the data for it directly in each block, which is why the inflation.

For why this is both great, and terrible depending on the use case - for problem cases, imagine full disk encryption. If you naively encrypt the block using AES-GCM, any block you encrypt will no longer fit in the device. If you encrypt a file (like a database file) which relies on offsets or similar hard coded byte wise locations to data, those no longer work.

In both cases you’d need a virtualization layer which would map logical offsets to physical ones. Definitely not impossible. Not as straightforward as replacing your read/write_blk method with read/write_encrypted_blk though.

As for why it’s awesome, it greatly simplifies and strengthens the real world process of encrypting or decrypting data where the size of the input and output are not fixed by some hardware constraint or fixed constant, where you have a virtualization layer, or where you don’t need to care as much (or can remap) offsets. Which is often.


> How so? All symmetric crypto algorithms at their basic level that I am aware of do not change the message size at all.

That's because you are not aware of the importance of authentication.

Without authentication, your system is not secure: an attacker might intercept messages, and modify them undetected. The key word here is "ciphertext malleability". And once they can do that, they can cause the recipient to react in ways it should not, and in some cases the recipient might even leak secrets.

Sometimes (like disk encryption) the size overhead is really really really inconvenient, and the risk of interception is lower, so you break the rule and skip it anyway. But unless you are in a similar situation (you probably aren't), you must use authentication. It's only professional.

In practice, that means you should use authenticated encryption. Authenticated encryption is used everywhere, including HTTPS. And yes, it has a small size overhead. Usually 16 bytes per message, like AES-GCM and RFC 8439 (ChaPoly). Per message. Not per block. So the actual overhead is very low in practice. And again, it's the price you have to pay to get a secure system.

---

Use authenticated encryption.

Accept the overhead like everyone else.

Resistance is futile.


Oh I am quite aware.

You do not seem to be aware of the practical constraints around an actual attack like ciphertext malleability in this context, or have thought through how you would implement direct disk encryption on a block device with AES-GCM without, you know, doing block based AES-GCM for individual blocks?

Which is exactly what I was referring to?

For block-based, the best way is simply to use a validating filesystem like ZFS on top of whatever block-based crypto is being used, if you need random IO. If you don't, a simple fixed-size signature (separate from the data) is sufficient, and out of band is fine.

In either case, including AES-GCM, the validation and authentication are not, themselves, the symmetric encryption algorithm. They wrap approved block ciphers which do that.

As per the Standard, anyway. [https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpubli...]

I'm not against AES-GCM, not at all. It's awesome! I'm pointing out that it has implementation tradeoffs.


> You do not seem to be aware of the practical constraints around an actual attack like ciphertext malleability in this context

Show me a peer-reviewed paper demonstrating, or at least convincingly arguing for, the soundness of the particular technique you are advocating, and I'll believe you.

Otherwise it’s pretty simple: either your authentication method has been validated (as are HMAC and polynomial hashes), or there’s a good chance it’s broken even if you don’t know it yet.

> For block based, the best way is simple to use a validating filesystem like ZFS on top of whatever block based crypto is being used,

File system blocks are typically 4KiB or so. AES blocks are 16 bytes. I'm not sure what you mean by "block based crypto" here; the length of AES blocks has nothing to do with the file system blocks you're trying to encrypt.

> In either case, including AES-GCM, the validation and authentication is not, itself, the symmetric encryption algorithm. They wrap approved block ciphers which do that.

I have implemented a cryptographic library, so I'm well aware. I insist on authenticated encryption as if it were a monolithic block because it makes much safer APIs. You really really don't want to let your average time-pressured programmer implement their polynomial-hash-based authentication by hand; there are too many footguns to watch out for. Believe me, I've walked that minefield, and blew my leg off once.

> I'm not against AES-GCM, not at all. It's awesome! I'm pointing out that it has implementation tradeoffs.

Compared to what? All the examples you cite in your other comment (SSL to PGP/GPG, S/MIME) make the exact same trade-offs!! They all add an authentication tag to each message, effectively expanding its size.


No matter how you're encrypting, if you're authenticating, you have to store the authenticator. Which is why GCM "expands" the size of the message. If you're not authenticating, you're not encrypting securely.

The fact that XTS isn't authenticated is a huge problem with full-disk encryption.

https://sockpuppet.org/blog/2014/04/30/you-dont-want-xts/


You can (and folks do) authenticate in ways that don't make individual blocks bigger. And any decent structural validation of the data still makes it reasonably secure even without per-block validation.

GCM is also opened up to different types of attacks due to its structure; for example, if the data is gone, it may be impossible to figure that out without an additional signature or metadata.

Without the correct key for AES, it is exceedingly difficult to construct a value that can result in a successful attack after decryption even for the simplest file systems (as compared to a very visible crash or disk corruption issue even without validation), and that blog post way oversimplifies the actual process. It also makes numerous flat out false statements about many encryption modes.

A trivial answer that solves every one of the attacks mentioned in that blog is using ZFS on top of an encrypted block device.

In each of these cases, for a successful attack, you'd need to generate a new block, or identify an existing block to replace a known block with, that would produce the attacker's desired outcome. All GCM does is make it more detectable in the encrypted data if that happens.

For some modes mentioned, if an attacker is watching the actual disk activity and doing chosen-plaintext attacks, it could be possible to shorten the time to recover the underlying volume keys, but that is not necessarily helped immensely by GCM.

It is going to be obvious in the system itself without the right key if someone tries to swap in a bogus block, because it will be gibberish/corrupt, if it is data used by anything or checked by anything.

AES-GCM just means you can tell when you pick something up, vs when you look at it if it’s damaged. And it does it at the trade off of adding a signature on everything. Sometimes that’s worth it, sometimes it’s not.


> You can (and folks do) authenticate in ways that don’t make individual blocks bigger.

First, name one example.

Second, what do you mean by "individual blocks"?

AES-GCM adds one authentication tag per message. A single message may contain millions of AES blocks, and the total overhead of AES-GCM over it will still be a single authentication tag (16 bytes). That makes it very similar to pretty much any authenticated encryption scheme out there.


Ah, your second question is a good one, and probably gets to the root of the disagreements.

I was specifically referring to the context of things like block devices. There is no single message (in a sane way, anyway) for the device. Each low level block is the message, in the sense you are referring to. That's when inputsize != outputsize is a problem, as that 'message' is also fixed size.

When I am referring to authenticating in a way that doesn't make individual blocks bigger, I'm referring to an HMAC signature in filesystem metadata or similar in this type of scenario. Out-of-band information. Practically speaking, even a basic CRC of metadata and file contents would make most attacks impractical.

Which you could do with AES-GCM of course, by storing the tag separately. I currently know of no implementations that do so however, but I'm sure there are ones out there. It would require storing the tag per block, which doesn't sound fun or performant.

To answer your first question in that context - everything from SSL to PGP/GPG, S/MIME, etc.


> Practically speaking, even a basic CRC of metadata and file contents would make most attacks impractical.

If I recall correctly, CRC-then-encrypt schemes have been defeated in the past. With practical attacks.

---

> Which you could do with AES-GCM of course, by storing the tag separately. I currently know of no implementations that do so however, but I'm sure there are ones out there. It would require storing the tag per block, which doesn't sound fun or performant.

Here’s an example from possibly the most famous modern cryptographic library: https://doc.libsodium.org/secret-key_cryptography/aead/aes-2...

As for storing the tag "per block", I'm not sure what you mean. Sure you need one tag per block, but with the above API you can store that tag anywhere you want. If for instance you pack them into dedicated blocks, a single 4KiB block can store 256 authentication tags. The loss of storage capacity would be a whopping 0.4%.

> When I am referring to authenticating in a way that doesn't make individual blocks bigger, I'm referring to a HMAC signature in filesystem metadata or similar in this type of scenario. Out of band information

Then just store the authentication tag from AES-GCM out of band!! Surely your meta-data can handle a 0.4% size overhead?

---

> To answer your second question in that context - everything from SSL to PGP/GPG, S/MIME, etc.

Thought so. They’re all just like AES-GCM. One of them (TLS 1.3, a.k.a. SSL) can even use AES-GCM for its symmetric crypto.


If you store all your tags (which update every time a block updates) in one location, then you'll burn it out on an SSD, or thrash your disk with seeks on spinning rust (you'll multiply your writes 2x, minimum). And unless you are adding some kind of virtualizing layer, you don't have 0.4% of blocks to play with. You have 0% of blocks.

I tried to read your pointer, but the link doesn't explain it anywhere. Mind giving a more useful link? It could be because I'm on mobile.

We weren’t talking about CRC of plaintext anyway - we were talking about block encryption. So it would be CRC (as validation) of on-disk filesystem structures as part of parsing. Aka an actual attack.

Standard AES-GCM appends the tag to the encrypted message directly. None of those I name do it that way. Using AES-GCM as a transport is layering their stuff inside it, which of course is fine as I’m describing it - because they don’t have fixed size structures in their protocols! It doesn’t mean they aren’t doing the additional validation and authentication.


> And unless you are adding some kind of virtualizing layer, you don’t have .4% of blocks to play with. You have 0% of blocks.

That is a shitty problem to have; there is no perfect solution. If you at all can, change the problem. If that means you need a virtualization layer, use it if possible.

---

> I tried to read your pointer, but the link goes no where explaining it. Mind giving a more useful link? It Could be because I’m on mobile.

The first sentence of the link I gave you reads as follows: "Some applications may need to store the authentication tag and the encrypted message at different locations."

Then it shows you the following function that achieves that separation (with zero performance overhead I might add):

  int crypto_aead_aes256gcm_encrypt_detached(
      unsigned char       *ciphertext,
      unsigned char       *mac,
      unsigned long long  *mac_size_p,
      const unsigned char *message,
      unsigned long long   message_size,
      const unsigned char *additional_data,
      unsigned long long   additional_data_size,
      const unsigned char *always_NULL,
      const unsigned char *nonce,
      const unsigned char *key);
If that does not help you, you need more basic training. I recommend Dan Boneh's Stanford cryptography course or Crypto 101. And if you need to understand the severity of various attacks at a gut level, you might want to take a look at the cryptopals challenges as well:

http://openclassroom.stanford.edu/MainFolder/CoursePage.php?...

https://www.crypto101.io/

https://www.cryptopals.com/


People have been saying stuff like "even a CRC would make attacks impractical" for decades, and all they've managed to accomplish is an obstacle course for early-career academic cryptographers. Which, by all means, carry on: it produces great papers, and it's a great way to get new people into the field. But if you care about security, your ciphertext needs to be authenticated.

Which brings me back to: all secure encryption expands the size of the ciphertext. If you're using XTS in a new design, you are doing something very wrong.


[flagged]


I don't understand your actual answer. You gave a list of cryptosystems that all suffered devastating attacks due to not properly authenticating ciphertexts; in the TLS case, several years worth of devastating attacks. PGP is an especially funny example. The only surprise is you didn't manage to work SSHv1 into the list.


If people were getting mad at him, he must have been pretty obnoxious about it, because I don't think there's much controversy: asymmetric encryption is pretty much just used for things like sharing the symmetric key that will be used for the rest of the session.

Of course it would be more secure to have private physical key exchange, but that's not a practical option, so we rely on RSA or whatever.


It's generally good to use symmetric cryptography wherever possible because it usually (!) is faster and simpler. More complex crypto systems provide interesting properties but if you can pull off whatever you're doing without it, why bother. The author tries to make a security claim for this but IMO that's not even the real issue


I wouldn't be particularly worried about someone decrypting a file encrypted in the 80s using Triple DES anytime soon. I don't think I'll live to see AES being broken.

I wouldn't bet on the TLS session you're using to have that kind of half life.


There are two sides to this coin: one is the actual strength of the primitives involved. RSA is under increasingly effective attacks, and though elliptic curves are doing very well for now, we have the looming threat of Cryptographically Relevant Quantum Computers. Still, without CRQC there's a good chance that X25519 and Ed25519 won't be broken for decades to come.

The other side is the protocol itself. Protocols are delicate, and easy to mess up in catastrophic ways. On the other hand, they're also provable. We can devise security reductions that prove that the only way to break the protocol is to break one of its primitives. Such proofs are even mechanically verified with tools like ProVerif and Tamarin.

Maybe TLS is a tad too complex to have the same half life as AES. The Noise protocols however have much less room for simplification. That simplicity makes them rock solid.


Wonder if someone can add a little more info to the title of this story. It would probably draw more clicks if the title weren't so cryptic. This is essentially a Java dev infosec post.


Just wait a few days and it'll be on the news like the log4j2 vulnerability :) (Though it might not, because in practice BouncyCastle is used in most big/old Java software, as far as I know.)


It's crazy that the check for this was right there in the original code and was obviously missed when porting to Java. Great example of why unit tests are part of the code (and were missing in both implementations in this case).


The title of the blogpost is now updated to "CVE-2022-21449: Psychic Signatures in Java" - maybe this HN post could too?


Agreed (@dang)


Q: Which type of cryptography is implied to be unsafe in the following sentence?:

"Immediately ditch RSA in favor of EC, for it is too hard to implement safely!"


What's java's RSA history look like?


My comment was rather language-agnostic. Are there any fundamental differences in implementation of RSA or EC in various languages?


Computerphile has released a very approachable explanation of the flaw[1], along with some basic background on ECDSA as well.

[1]: https://www.youtube.com/watch?v=502iGDxuiRk


I doubt many companies are actually using Java 15+. Many still stick to 8 or 11.


Not a good few months for Java


Not that a lot of companies are using Java 15+. People generally stick to 8 or 11.


I believe Oracle 11 is affected.


I do not believe so. The "affected list" which includes 11 is for the complete set of the "CPU" - Critical Patch Update.

This specific one was introduced with the rewriting of these parts of the code from C++ to Java, and that happened with Java 15.


Looks like you’re correct and I was wrong.


[flagged]


I want to live in this world, but no.


And once again, you'd be saved if you stayed on an older release. This is the third time this has happened recently in the Java world: the Spring4Shell vulnerability only applies to Java 9 and later (that vulnerability depends on the existence of a method introduced by Java 9, since all older methods were properly blacklisted by Spring), and the Log4Shell vulnerability only applies to log4j 2.x (so if you stayed with log4j 1.x, and didn't explicitly configure it to use a vulnerable appender, you were safe). What's going on with Java?


> ...the Log4Shell vulnerability only applies to log4j 2.x (so if you stayed with log4j 1.x, and didn't explicitly configure it to use a vulnerable appender, you were safe)

Seems like someone likes to live dangerously: using libraries that haven't been updated since 2012 is a pretty risky move, especially given that if an RCE is discovered now, you'll find yourself without too many options to address it, short of migrating over to the new release (which will be worse than having to patch a single dependency in a backwards compatible manner): https://logging.apache.org/log4j/1.2/changes-report.html

Admittedly, I wrote a blog post called "Never update anything" a while back, even if in a slightly absurdist manner: https://blog.kronis.dev/articles/never-update-anything and I personally think that frequent updates are a pain to deal with, but I'd only advocate for using stable/infrequently updated pieces of software if they're still supported in one way or another.

You do bring up a nice point about the recent influx of vulnerabilities and problems in the Java ecosystem, which I believe is created by the fact that they're moving ahead at a faster speed and are attempting to introduce new language features to stay relevant and make the language more inviting for more developers.

That said, with how many GitHub outages there have been in the past year and how many other pieces of software/services have broken in a variety of ways, I feel like chasing after a more rapid pace of changes and breaking things in the process is an industry-wide problem.


> using libraries that haven't been updated since 2012 is a pretty risky move

I disagree. Some libraries are just rock solid, well tested and long life.

In the case of log4j 1.x vs 2.x, has there been any real motivator to upgrade? There are 2 well known documented vulnerabilities in 1.x that only apply if you use extensions.


A sibling comment mentions the reload4j project, so clearly someone thought that 1.x wasn't adequate, to the degree of creating a new project around maintaining a fork. I can't speak to the project itself, merely the fact that its existence supports the idea that EOL software is something that people would prefer to avoid, even if they decide to maintain a backwards-compatible fork themselves, which is great to see.

Here's a bit more information about some of the vulnerabilities in 1.x, someone did a nice writeup about it: https://www.petefreitag.com/item/926.cfm

I've also dealt with 1.x having some issues with data loss, for example, https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4... which is unlikely to get fixed:

  DailyRollingFileAppender has been observed to exhibit synchronization issues and data loss.
(though at least in regard to that problem there are alternatives; for the most part, EOL software implies that no further fixes will be available)

But at the end of the day none of it really matters: those who don't want to upgrade won't do so, potential issues down the road (or even current ones that they're not aware of) be damned. Similarly, others might have unpatched versions of 2.x running somewhere which somehow haven't been targeted by automated attacks (yet) and might continue running them while there isn't proper motivation to upgrade, or won't upgrade until it is too late.

Personally, I dislike the idea of using abandoned software for the most part, when I just want to get things done - I don't have the time to dance around old documentation, dead links, and having to figure out workarounds for CVEs, versus just using the latest (stable) versions and letting someone else worry about it all down the road. Why take on an additional liability, when most modern tooling and framework integrations (e.g. Spring Boot) will be built around the new stuff anyways? Though thankfully in this particular case slf4j gives you more flexibility, but in general I'd prefer to use supported versions of software.

I say that as someone who actually migrated a bunch of old monolithic Spring (not Boot) apps to something more modern when the versions had been EOL for a few years and there were over a hundred CVEs as indicated by automated dependency/package scanning. It took months to do, because previously nobody actually cared to constantly follow the new releases and thus it was more akin to a rewrite rather than an update - absolute pain, especially that JDK 8 to 11 migration was also tacked on, as was containerizing the app due to environmental inconsistencies growing throughout the years to the point where the app would roll over and die and nobody had any idea why (ahh, the joys of working with monoliths, where even logs, JMX and heap dumps don't help you).

Of course, after untangling that mess, I'd like to suggest that you should not only constantly update packages (think every week, alongside releases; you should also release often) but also keep the surface area of any individual service small enough that it can be easily replaced/rewritten. Anyways, I'm going off on a tangent here about the greater implications of using EOL stuff long term, but those are my opinions and I simultaneously do admit that there are exceptions to that approach and circumstances vary, of course.


> especially given that if an RCE is discovered now, you'll find yourself without too many options to address it, short of migrating over to the new release

Luckily, there's now an alternative: reload4j (https://reload4j.qos.ch/) is a maintained fork of log4j 1.x, so if you were one of the many who stayed on the older log4j 1.x (and there were enough of them that there was sufficient demand for that fork to be created), you can just migrate to that fork (which is AFAIK fully backward compatible).

(And if you do want to migrate away from log4j 1.x, you don't need to migrate to log4j 2.x; you could also migrate to something else like logback.)


Was Spring4Shell Java's fault, or Spring's fault? Log4Shell was obviously (mostly) log4j's fault.

This one, I gather, is actually Java's fault.

It sounds like three unrelated security bugs from totally different teams of developers.


Spring4Shell is entirely a flaw in Spring, however is somewhat understandable because it was only exploitable due to a new feature in Java (modules) that added new methods to java.lang.Class, which is a very significant change. You could argue the very existence and nature of Java object serialization deserves blame as well, but that gets nuanced quickly.

Modules are also part of the reason why so many folks got "stuck" on Java 8.

It is definitely an interesting study in the challenges of trying to make advances in a platform when a lot of the ecosystem is very much in maintenance mode and may not have a lot of eyes on the combination of existing libraries vs new versions of Java.


Agreed. There's definitely space to spread some blame and criticism around, I suppose. And there are plenty of old Java decisions that open the door for these issues.


I think the other two are considered "Java's fault" because the frameworks they occurred in are so pervasive in the Java ecosystem that you might as well consider them part of the standard library.


You make this sound like it is unique to Java. I remember Heartbleed was similar, in that the LTS I was on did not have the vulnerable library.

At some level, as long as releases add functionality, the basic rules of systemantics will guarantee unintended interactions.


gasp new code can introduce bugs... whatever will one do?!



