Critical Vulnerabilities in JSON Web Token Libraries (auth0.com)
147 points by pr0zac on Apr 1, 2015 | 50 comments



Be wary if you do anything with signed data before you verify the signature.

http://www.thoughtcrime.org/blog/the-cryptographic-doom-prin...


Trusting data before you have verified it is indeed the common problem here. You have to be careful with JWK in a similar way: the public key can be specified as a URL via the x5u parameter, so you have to make sure you only trust keys from a whitelisted URL. Otherwise whoever supplies the token signed with a JWK can just provide their own public key and self-sign it.

This can be problematic and needs some legwork on the developer's side when the URLs to be whitelisted aren't documented. For example, Apple uses a similar signed blob to allow 3rd parties to verify a Game Center identity, using generateIdentityVerificationSignatureWithCompletionHandler. The data you get back includes the publicKeyUrl, but not which URLs to expect. A naive implementation would just download the public key and run with it. That is bad for two reasons: you're now downloading data from an arbitrary location, and you are trusting a signature that you verified using untrusted information. In case anyone is interested, the two locations I've seen Apple provide here are https://sandbox.gc.apple.com and https://static.gc.apple.com .
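
A minimal sketch of the allowlist check described above, in Python. The function name is made up for illustration, and the host list is just the two hosts mentioned in this comment, so treat it as an assumption and confirm it yourself before relying on it.

```python
from urllib.parse import urlsplit

# Assumption: the two hosts named in the comment above. Verify before use.
ALLOWED_KEY_HOSTS = {"sandbox.gc.apple.com", "static.gc.apple.com"}


def is_trusted_key_url(url: str) -> bool:
    """Return True only for HTTPS URLs whose host is explicitly allowlisted."""
    try:
        parts = urlsplit(url)
        return (
            parts.scheme == "https"
            and parts.hostname in ALLOWED_KEY_HOSTS
            and parts.port in (None, 443)      # .port raises ValueError if malformed
            and parts.username is None         # reject user@host tricks
            and parts.password is None
        )
    except ValueError:
        return False
```

The point is that the check happens before any download: a URL that fails it is never fetched, so the signature is never verified against an attacker-supplied key.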


That's a great link (which I've seen before) and I upvoted it.

But these "typical JWT lib api vulns" are WAY SIMPLER than Vaudenay etc. Any programmer could do this in minutes or less. WOW. It's so obvious and straightforward (in hindsight I suppose). My mouth is still hanging open.


We've recently been looking at adding JWT, and one of the things I looked out for was that we were protected against downgrade attacks. This was done both as a check of the current library code, and an explicit check on our side that the alg is what we expect to protect against future library changes.
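
For illustration, an application-side check of this kind might look like the Python sketch below. The helper names are hypothetical and this is not any particular library's API; it only peeks at the header and refuses to continue unless alg is exactly the one the application expects.

```python
import base64
import json


def header_alg(token: str) -> str:
    """Read the `alg` field from a compact JWT header. Verifies nothing."""
    header_b64 = token.split(".")[0]
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    header = json.loads(base64.urlsafe_b64decode(padded))
    if not isinstance(header, dict):
        raise ValueError("JWT header is not a JSON object")
    return str(header.get("alg", ""))


def require_alg(token: str, expected: str) -> None:
    """Raise ValueError unless the header names exactly the expected algorithm."""
    alg = header_alg(token)
    if alg != expected:
        raise ValueError(f"unexpected alg {alg!r}, wanted {expected!r}")
```

Calling require_alg(token, "RS256") before handing the token to the library means a later library change cannot silently re-enable "none" or an HMAC fallback.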


By the way, claps for Tim for the amazing job. He contacted various people and gave them a heads up weeks ago. I think the way this was handled was awesome.

<3 OSS


I don't understand why you need the alg property at all. I mean, you are the one issuing the token so you definitely know what algorithm is used in the back-end. Why is this necessary?


If you change your algorithm, your old tokens are still usable because you know which algorithm was used to create them.


Would it not be possible to establish a period to migrate all tokens? This "what if you change it in a possible future" is not a good argument when modeling something, in my opinion. That's how AbstractFactoryStrategies are made.


That sounds like a very niche case, probably not worth solving. That said, the JWT lib could allow you to specify one or more fallback algorithms in case the default one fails to validate.

If you're switching off an algorithm, you're probably doing it because it's been broken. And, if it's broken, you won't use it anyway.


As the issues described in TFA make clear, the verifier really should know the algorithm in order to verify completely. However, since this format allows various algorithms, the algorithm must be recorded somewhere, especially in the public-key scenario when the verifier is probably not the signer.


Thanks.

OT but what does TFA stand for? I've seen you use it in two threads and I presume it's referring to the posted article in question but I can't figure out the actual meaning.


The F*ing Article... probably evolved from RTFM. Hacker culture, so warm and welcoming, isn't it?


That usage was deprecated by RFC19647. The correct definition is now 'The Featured Article'.


The abbreviation gives some plausible deniability - it could be "the fine article"!


I've usually seen it as The Fine Article, but tastes vary. b^)


I found a (thematically) similar issue in the clojure jwt library last year [1]. Now I'm wondering if the issue may be wider than just that library.

[1] https://github.com/liquidz/clj-jwt/issues/8


Also see: https://news.ycombinator.com/item?id=9111049

Where are the CVEs?! ... OK, I just sent a brief write-up to the oss-security list, we'll see what happens.


Yea, it bothers me as well. Trusting blogs for security vulnerabilities spread across the internets isn't the model to be aiming for.


I'm glad someone of note finally made a statement about that. I have reported two bugs like this against libraries in the last two years, and the general feedback has been that it's not an issue. They just removed the none algorithm entirely.


For those of you concerned about whether a JWT library you're using is vulnerable, you can see the current list of verified/validated clients for each language here: http://jwt.io/ (page down to see the list)

My team has been using ruby-jwt, so I'm glad to see that at least does not seem to be on the list of vulnerable libraries.


I've actually seen this same issue in JWT libs before (August 2014) as well. I think one of the main issues is that, while JWT is simple to implement, the specification seems to be unclear about how to treat unsecured JWTs (with the `none` alg).

From the specification: Even if a JWT can be successfully validated, unless the algorithm(s) used in the JWT are acceptable to the application, it SHOULD reject the JWT.

So what it is saying is that you should have code in your application to make sure to only trust algorithms that you like.

The other issue is that following the specification's rules for validation gets you into a hairy spot where this vulnerability exists:

https://tools.ietf.org/html/draft-ietf-oauth-json-web-token-...


You didn't read carefully enough.

Your site may well find both HMAC and RSA to be acceptable algorithms. However if you can be tricked into using HMAC to verify something actually signed with RSA, then anyone can forge content you accept as valid.

What is important is not that the algorithm is acceptable. It is that the algorithm you use for verification is the algorithm that actually should be used.
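
To make that concrete, here is a small Python sketch of the confusion using only the standard library. Everything in it is hypothetical: the "public key" is a placeholder byte string, naive_verify is a deliberately vulnerable verifier written for this example (not code from any real library), and RSA verification is left out because it is not needed to show the forgery.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


# Placeholder for the server's RSA public key. Being public, the attacker has it too.
PUBLIC_KEY_PEM = b"-----BEGIN PUBLIC KEY-----\n(hypothetical)\n-----END PUBLIC KEY-----\n"


def naive_verify(token: str, key: bytes) -> bool:
    """Vulnerable pattern: the token's own header chooses the algorithm."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    alg = json.loads(b64url_decode(header_b64))["alg"]
    signing_input = f"{header_b64}.{payload_b64}".encode()
    if alg == "HS256":
        # The key the caller meant as an RSA public key is used as an HMAC secret.
        expected = hmac.new(key, signing_input, hashlib.sha256).digest()
        return hmac.compare_digest(expected, b64url_decode(sig_b64))
    if alg == "RS256":
        raise NotImplementedError("RSA verification omitted from this sketch")
    return False


def forge(payload: dict, public_key: bytes) -> str:
    """Attacker side: claim HS256 and sign with the public key as the secret."""
    header_b64 = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload_b64 = b64url(json.dumps(payload).encode())
    signing_input = f"{header_b64}.{payload_b64}".encode()
    sig = hmac.new(public_key, signing_input, hashlib.sha256).digest()
    return f"{header_b64}.{payload_b64}.{b64url(sig)}"
```

Here naive_verify(forge({"loggedInAs": "admin"}, PUBLIC_KEY_PEM), PUBLIC_KEY_PEM) accepts the token even though no private key was ever involved, because both HMAC and RSA were "acceptable" and the attacker got to pick.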


Can someone please explain to me the appeal of tokens like the following please?

    payload = '{"loggedInAs":"admin","iat":1422779638}'

I've always worked under the assumption that everything the client sends is wrong, which means the only fields I issue are a user hash and session key. If either of those doesn't match the session table record then the token is rejected. Having group permissions issued from the client side seems like a bad design from the outset. However, this seems a really obvious point to make, which makes me wonder if I'm missing the point of JWTs.

Is anyone able to expand on this for me please?


I think the main concept you are missing is called HMAC, which is a cryptography thing. If you don't know it, I'd recommend this explanation [1].

Can the client send something wrong? Yes, but only if the client can digitally sign what it is sending with the secret you expect.

JWTs can be signed with a symmetric key or an asymmetric key. The vulnerability mentioned in this post arises when the server side expects a token digitally signed with an asymmetric key, but an unauthorized client uses the public key to create a signature as if the key were symmetric. The issue in this case is that the server blindly accepts any algorithm.

[1]: http://security.stackexchange.com/a/20301/9332


Once the token has a signature, it enables three-legged authentication.

For example, a Netflix employee can log onto the Netflix Single Sign On system, and that system can issue them a signed token saying "The bearer of this token can log into this AWS account with these powers until this time". The employee presents this to AWS and, if the signature is valid, AWS lets them log on.

That way each employee only needs one account - on the Netflix Single Sign On system - instead of them also needing an AWS account (with a separate password, 2FA and so on).

(SAML is another standard for doing this, but it's a standard with a lot of unnecessary complexity)


I agree with you (I do the same).

However, in certain cases it's useful to have stateless sessions (or rather, moving the state to the client, using signed and/or encrypted tokens).

In this case, JWT is used by e.g. OpenID-Connect, to pass data between systems. In some of these use-cases the client is an intermediary, passing along the token.


Like a portable passport rather than a centralised authentication model. That does sound an interesting concept :)


Passports (travel documents) are increasingly moving to central authentication too. They were prone to trivial forgery before they were chipped and barcoded. And exactly for the same reason.


Such a token doesn't require you to have a database to lookup and maintain the user session/permissions in, which can lead to more reliable and lower latency designs.


Your "reliability" point I'd question because a session table method is harder to attack. But I'd be willing to agree that their reliability would be on a par with each other (where session tables are weaker, JWTs excel; and vice versa).

If you don't mind me asking a few more questions, how are the JSON web tokens typically authenticated if not via a database? Or are they assumed correct?

Edit:
Would the guy who down voted me mind explaining why it's now bad etiquette to ask questions on HN? Or was there something more specific about the post which you took a dislike to?

It's becoming impossible to navigate around the complex neuroticisms of every HN member (since it can take only 1 vote to make a comment grey) so a little assistance would be greatly appreciated. :)


> how are the JSON web tokens typically authenticated if not via a database?

They are signed with a secret key (e.g. 32 secret random bytes kept by the server) when they are issued. Revocation is typically based on an expiration time, hence the timestamp in the payload.

The JWT libraries in this case failed their basic premise, that is, to verify the signature. An unsigned token which contains trusted data is just an open door for the user to claim whatever they want.

The centralized session model is easier to reason about, but has many pitfalls. Now all your servers need to be able to contact that centralized state in order to service even a basic request. HMAC is very fast, much faster than pulling central state from a network service. If you're asking a client to maintain a session token, might as well ask them to maintain the session state itself if it's small enough to fit in a cookie.
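
As a rough sketch of the stateless approach, assuming HS256 and a server-side secret: the Python below issues a token with an expiry and verifies it with no database lookup. The function names and the iat/exp handling are illustrative choices for this example, not the behaviour of any specific JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


# The only header this sketch will ever issue or accept.
HEADER = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())


def issue(claims: dict, secret: bytes, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Sign the claims plus iat/exp timestamps with HMAC-SHA256."""
    issued = int(time.time() if now is None else now)
    body = dict(claims, iat=issued, exp=issued + ttl_seconds)
    payload_b64 = _b64url(json.dumps(body).encode())
    sig = hmac.new(secret, f"{HEADER}.{payload_b64}".encode(), hashlib.sha256).digest()
    return f"{HEADER}.{payload_b64}.{_b64url(sig)}"


def verify(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature is valid and the token is unexpired."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        if header_b64 != HEADER:  # never let the token choose the algorithm
            return None
        expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        return None
    current = time.time() if now is None else now
    exp = claims.get("exp") if isinstance(claims, dict) else None
    if not isinstance(exp, (int, float)) or exp <= current:
        return None
    return claims
```

Any server holding the secret can run verify() locally, which is where the latency win comes from. The trade-off is that a token stays valid until exp unless you add some revocation state back in.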

I would guess the reason for any downvote is the question you're asking is directly addressed by TFA, so it strongly implies you didn't bother to read it.


> I would guess the reason for any downvote is the question you're asking is directly addressed by TFA, so it strongly implies you didn't bother to read it.

Yeah, in hindsight I can see how some might assume that. I did read the article but was very confused by it. I was having a particularly dim moment and just couldn't fathom JWT's out (I guess we all have days where even the simple things don't seem to mentally "click"?)

Thankfully yourself and others have done a good job explaining things (reiterating where necessary).

Thank you :)


That's a security point, not a reliability point. They will be more reliable, as you don't need to talk to e.g. MySQL that might be down or slow. The disadvantage is that they're difficult to revoke.

Cryptography is used to verify them.


> That's a security point, not a reliability point.

I appreciate the distinction you're making, but in my opinion security and reliability are one and the same when you're discussing authentication methods. The lack of security would render your system unreliable (as you can't ever be certain the credentials are valid); and an unreliable software stack would be detrimental to the security of the authentication (bugs are, after all, often security issues).

> They will be more reliable, as you don't need to talk to e.g. MySQL that might be down or slow. This disadvantage is that they're difficult to revoke.

I'm all for building fault tolerant systems, but the database being available is the bare minimum you'd normally need for a web / cloud / whatever service to operate. Even authentication aside, I'm not really sure what good the service would be with the database offline (so you can log in, but can you now post comments in the community portal, or use the CMS, or book manufacturing jobs, etc.? And what about any required form of audit logging?). So I don't really understand why you would need the login process to be tolerant to that specific failure.

Your database performance point is an interesting one though. Personally I'd approach that by having the session table mirrored on a high performance key-value store to reduce the load on your RDBMS (to use your MySQL example). But then that solution quickly runs into additional security, reliability and complexity issues; much like using JWTs. So I'm definitely not trying to say my approach to the same problem is any better.

What I did find the most interesting about this JSON web token approach was a comment made by sandstrom (https://news.ycombinator.com/item?id=9302304) where he discussed the decentralised / portability possibilities of such an authentication model. This isn't something I've ever needed to consider in my line of work, but it seems a very useful method of linking different platforms in a clean way, and without them necessarily having direct access to each other.

Anyway, I don't want to sound like I'm dismissing your points. It's a very interesting solution to a common problem. Sometimes I need help to understand the process and the best way to do that is to question it so any ignorance (on my part) is counter-argued and thus corrected :)


> I'm all for building fault tolerant systems, but the database being available ...

"the database"? Sure, if you've ruled out building systems that split state across multiple databases, filesystems, etc. However, what if you (for instance) want to isolate the user login system from the purchasing system, so that if someone discovers a DoS exploit against your login DB, users who are already logged in can still use your website and make purchases against your orders DB?

I ran into something similar back around 2005-6. My employer had a big problem with credit card thieves buying our inexpensive product in order to filter canceled CCs from usable CCs. All of the recommended open-source CAPTCHAs at the time expected that the code creating the CAPTCHA image and the code verifying the user's answer shared a filesystem and/or a database. For security reasons, our CC-processing machine was a bare minimum machine with bare minimum interface to the outside world.

So, what I did was make a security token that was tok = HMAC( key0, Concat( timestamp | 0x3F, ipv4addr | 0xFF ) ). I then used seed = HMAC( key1, tok ) to seed a 31-bit linear feedback shift register that was used first to generate several base32 characters and then used to generate the positions of those characters and the pseudorandom distortions of the CAPTCHA image. The main webserver then just created a form with hidden fields for tok and timestamp (plus all of the payment information) that made an HTTPS POST to the payment webserver. That way, each IP address would only get a new CAPTCHA every 32 seconds (so they couldn't just hammer us until we gave them some easy pseudorandom distortions), we had a verified timestamp for expiry (so we had a bound on how long we needed to keep successful answers in our anti-replay history), and users wouldn't get rejected if their ISP's NAT or DHCP setup changed their external IP within the same /24 between getting the CAPTCHA and submitting the purchase. The only pieces of shared state between the two servers were key0 and key1.
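
A rough Python reconstruction of the token and seed derivation, for anyone who wants to see the shape of it. The hash function (SHA-256), the byte packing, and the way the digest is cut down to 31 bits are all assumptions made for this sketch; only the two HMAC formulas and the masks come from the description above. Taken literally, OR-ing with 0x3F sets the low six bits, so the sketch buckets timestamps into 64-second windows.

```python
import hashlib
import hmac
import ipaddress
import struct


def captcha_token(key0: bytes, timestamp: int, ipv4: str) -> bytes:
    """tok = HMAC(key0, Concat(timestamp | 0x3F, ipv4addr | 0xFF))."""
    coarse_time = timestamp | 0x3F                       # same value within a 64-second window
    coarse_ip = int(ipaddress.IPv4Address(ipv4)) | 0xFF  # same value within a /24
    # Assumption: 8-byte big-endian time followed by 4-byte big-endian address.
    message = struct.pack(">QI", coarse_time, coarse_ip)
    return hmac.new(key0, message, hashlib.sha256).digest()


def captcha_seed(key1: bytes, tok: bytes) -> int:
    """seed = HMAC(key1, tok), reduced to a nonzero 31-bit LFSR state."""
    digest = hmac.new(key1, tok, hashlib.sha256).digest()
    seed = int.from_bytes(digest[:4], "big") & 0x7FFFFFFF
    return seed or 1  # an all-zero LFSR state would never advance


def token_matches(key0: bytes, tok: bytes, timestamp: int, ipv4: str) -> bool:
    """Payment-server side: recompute from the posted timestamp and the client IP."""
    return hmac.compare_digest(tok, captcha_token(key0, timestamp, ipv4))
```

The payment server needs nothing but key0 and key1: it recomputes tok from the posted timestamp and the connecting IP, then regenerates the expected CAPTCHA answer from the seed.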

This wasn't JWT, but it illustrates a real world case where there was a strong need to separate state between two systems and have the client transmit the cryptographically authenticated data between the two systems.

As an aside, the system was built incrementally over the course of one evening. First stage deployment used dummy stub code for the anti-replay history database and sat back and waited for a thief to come along and hit us with a batch of several thousand stolen CCs. An IP from Virginia hit us a few times a second for a couple minutes with purchase POSTs missing the token and timestamp fields, all with different names and CCNs. After a pause of a couple of minutes, I started seeing a few successful purchases per second from that IP address, all using the same token and timestamp. I quickly manually banned that IP address and within 30 seconds saw several rejects per second from an IP address in England trying to use the cryptographic token tied to the /24 in Virginia. Then the attacker apparently decided it was easier to verify the stolen CCs somewhere else rather than try and OCR our CAPTCHA.


If you love to confirm your stereotypes as much as the next guy, take a closer look at the languages:

- Ruby, Java, Lua, Scala and .NET are unaffected

- updates are available for Node.js and Python

- there are still no fixed versions for JavaScript and PHP libs


The JavaScript version has additional crypto related vulnerabilities, by the way :-)


They list the php-jwt library from firebase as vulnerable but I can't see why. As far as I can see they do not support the None algorithm.

https://github.com/firebase/php-jwt/blob/master/Authenticati...

It's not in the methods array and the decode method specifically does a check to see if it's empty.

if (empty($header->alg)) { throw exception }

Can anyone throw some light on this?


The second critical problem, which is not addressed by php-jwt, is that it does not take as a parameter to the decode() function which algorithm should be used. It figures out the algorithm by looking at the header.

As the original post describes, if your code expected to use an RSA public/private key pair, then it will pass the public key to decode(). Then an attacker can craft a JWT that claims to use an HMAC symmetric key and sign it with the public key, which is public. One and done.

(Your code in that case expected the header to specify an RSA algorithm, and the token to be signed with the private key. But the decode() function doesn't know that.)


Cheers, so using this library only to encode should not be an issue for us.


As in, a third party is doing the decode? You should check which library they are using...


Don't use JSON Web Tokens.


Can you expand? JWT are a good way to remove state from the service and the HMAC lets you trust it. This looks like an implementation bug, which is unfortunate, but not a reason to avoid the technology.


I wrote 10 warning signs of bad crypto standards on Twitter a few minutes ago, largely inspired by JWT.


All points well taken. Still, people need to pack stuff into cookies. There are probably some modules for some environments that do this in unimpeachable fashion. How likely is the average developer to reliably pick those modules, or (haha) just code up the equivalent without using a module? At least a flawed consensus around JWT gets people looking at it.

So now what? The draft [0] hasn't expired yet, so it's possible they'll just rip out the public-key stuff. What should they add to answer your reservations about CTR+HMAC?

[0] http://self-issued.info/docs/draft-ietf-oauth-json-web-token...


Apparently the drafts have been sent to the editor so they can't be changed. [1] Oh well!

[1] http://self-issued.info/?p=1323


Good alternatives?



> This article originally appeared as a guest post on Auth0’s blog.

um...


That blurb was added later.



