If you dig through Node repos on Github searching for "jwt.decode()" or "this.jwtService.decode()" you'll see a shocking number of results.
The other big mistake I've seen is encoding personal data in the token, instead of just an ID that only has meaning inside of your database. It turns out not everyone is aware that JWT tokens are publicly decodable and that the only security is in the signing + verification. Many people I've spoken with think the token is like a hash.
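To make "publicly decodable" concrete, here's a minimal sketch (the token is hand-built purely for illustration; no real session token involved) showing that anyone holding a JWT can read its claims with no key at all:

```javascript
// Build a JWT-shaped token the way a server would (header.payload.signature).
const header = Buffer.from(JSON.stringify({ alg: "HS256", typ: "JWT" })).toString("base64url");
const payload = Buffer.from(JSON.stringify({ sub: "user-42", email: "alice@example.com" })).toString("base64url");
const token = `${header}.${payload}.not-a-real-signature`;

// Anyone holding the token can do this; no secret is involved:
const claims = JSON.parse(Buffer.from(token.split(".")[1], "base64url").toString());
console.log(claims.email); // "alice@example.com" -- the "private" data is right there
```

The signature stops tampering, not reading; base64url is an encoding, not encryption.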
I worked on some code signing code, and one of the internal customers, who had a lot of clout (but not as much as they thought), was adamant that I needed to provide them a way to pluck data out of the payload without verifying the signature first. I told them, in a variety of extremely diplomatic ways, to politely fuck off. In fact I did a sweep of the code to make extra sure that none of the data paths gave you anything before verification had finished (there were no barn doors, but there were soft spots). When your dog is sniffing around, it's time to hide the treats.
The only reason to look at a payload is to make decisions based on it, and until verification finishes you can't trust that those decisions are rational. What I did promise them is that I would continue to try to make the checks cheaper, which we did.
Ten years later, I still trust that I made the right call. Never expose a function from a security library that gives you the data without verifying it first. You are doing violence to the user, the team, the company, possibly an entire industry.
In general I don't see any reason to look at any data in the payload before the signature is verified. If anything, a good reason not to is that it increases the attack surface by giving an attacker the ability to interact with my system without a valid signature.
I don't know exactly what you mean by improving performance by looking at the payload before paying the cost of signature verification. If you mean that for some reason you could discard a large portion of messages without actually verifying the signature, then this is generally false.
In a DoS situation the attacker might want to consume your resources without bothering to forge the signature, but they can just as easily forge content that passes your cheap checks, so you end up paying for the signature verification anyway.
If, on the other hand, the application somehow relies on this to be viable, then very likely this is a design fault. I can imagine, for example, that you might want to discard messages with outdated tokens, but there are better ways to solve this (just don't send outdated tokens?).
In one case they were trying to categorize the content based on inspecting the manifest, shall we call it. If you start making decisions based on non-authoritative information, my experience is that you very quickly forget that it's non-authoritative. That slippery slope is universal, rather than particular to any examples in this thread. If it can happen, it will happen.
If you decide things without checking with a higher authority, then you are the Confused Deputy, and that typically ends badly. It's not Barney's fault he keeps getting locked in the jail cell. It's Andy's.
The only case I've come across so far of needing the unverified claims is when you need to verify the signature based on a key dependent on the token issuer (iss claim).
You could arguably define another means of out-of-band signalling, but it would also be unauthenticated.
Edit: That said, getting the unauthenticated data should be very explicit and impossible to do "accidentally", if the library allows it at all.
I'm not terribly familiar with the implementation details here.
Are you essentially talking about having different strategies to verify the token? Would it make sense to provide some sort of filter function as a parameter that the library can call to clarify what operations need to be performed?
That would allow you to peek at the records in a context where your role is clear, and where the code flow makes it awkward to take the data and run with it. If the code feels wrong enough, it's hard for the developer to claim ignorance, and if it's clear from the outset that ignorance is not an explanation, that leaves malice and stupidity. Most developers won't admit to either, but if forced they'd rather be seen as malicious than stupid, and malicious gets you a meeting with HR.
Yes, some libraries provide a key callback function where the unauthenticated token is provided for this type of thing, but the result of the function is supposed to be the key used to verify the token and nothing else is supposed to happen as a side effect.
Yes, you could have side effects, but it's pretty damned obvious you're being naughty. That's mostly what I want. The obviousness of the cheating makes people think twice, ask questions. Slow down.
But are you going to do that triage programmatically or in a debugger?
A lot of libraries could be better if the authors spent some time sitting at breakpoints in their own code. You could, I'm sure, arrange a JWT parser so that the data you need for triage is hanging out in a very clearly named stack frame a couple steps into the library.
In my case the format was simple enough that I trained a couple people to be able to inspect these things by hand and tell what went wrong. The happy path for the format was stupid simple. Anyone could have written their own parser for the happy path. It was the rules enforcement that made the code I had worth anything. We were the source of authority, and I took that seriously, on the Precautionary Principle if nothing else (and there were plenty of other reasons).
I think with JWTs, you sometimes need to do this because the JWT itself stores the id of the key used to create the JWT. For example, if you have a multi-tenant application where each tenant has their own key, or for key rotation (although in the key rotation case, I guess you might be able to try every key instead.)
In this vein I wonder if it wouldn't be a bad choice to underscore the non-verified decode methods in token APIs. Surely people would still abuse them, but at least there's an immediate visual warning if you choose not to read the docs: "Why is this method underscored but the other one not?"
It's still booby-trapping. I'm finding as I get older that a lot of programmers use their own mistakes to rationalize why they are the only ones who should be trusted with certain things.
Well, of course I'm in charge. Last time Steve did something he pushed the big red button that shut down the whole server room. If I weren't in charge, then it would just be "Steve"s everywhere. Anarchy. Who put the red button there? Who thought having the big red button there might be a stupid idea? That's not important. What's important is that Steve fucked up and I didn't. Remember: never trust Steve.
When people sit down to write open source, they write it the same way they would write code at work, because they've either never experienced anything better, or it was drowned out by the din. So their OSS is always almost as toxic as the situation I illustrated above.
If, for some reason, you want to leave dangerous things laying around, you need way more safety interlocks than an underscore. If you should never call a function, what's the harm in making it expensive to call? Use a function name where the only reasonable interpretation of calling it is that the developer consented, with full knowledge. Make the function name 60 characters long with some colorful adjectives. Name it after the OWASP recommendation that says what a bad fucking idea it is to call this method. Giant neon lights. Fireworks. And a marching band.
ES6 has a way to name functions with Symbols, and Symbols with the same name are not equal. It's not used often but I've seen it used to give a secret back entrance to an object. Like a very limited version of friend functions/classes.
If the symbol is not exposed, then you can't get anywhere in the neighborhood of calling it accidentally. I think if you do enough object introspection you can still get the function handle, but you could never mistake the resulting code for being anything other than farm fresh, Grade-A Cheese.
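A sketch of that Symbol trick (all names here are hypothetical): the dangerous method is keyed by a Symbol, so no string-based property access can stumble into it, and any call site that does reach it reads as unmistakably deliberate:

```javascript
// Symbols with the same description are still distinct values, so only code
// explicitly handed this one can name the method.
const UNSAFE_DECODE = Symbol("unsafeDecodeWithoutVerification");

class TokenReader {
  [UNSAFE_DECODE](token) {
    // Decodes the payload with no verification -- hence the hiding.
    return JSON.parse(Buffer.from(token.split(".")[1], "base64url").toString());
  }
}

const reader = new TokenReader();
console.log(reader.unsafeDecodeWithoutVerification); // undefined -- no accidental path

// The deliberate path, for code that was handed the Symbol:
const payload = Buffer.from(JSON.stringify({ sub: "user-42" })).toString("base64url");
const claims = reader[UNSAFE_DECODE](`header.${payload}.sig`);
```

As noted above, `Object.getOwnPropertySymbols` can still dig it out, but nobody will mistake that code for an accident.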
Mmm. The people who don't like JWT tend to blame JWT here, but this type of mistake is a sharp edge in a library that has no good reason to be there except laziness.
We've been here before with HTTPS. Today the libraries most of you are likely to use to fetch https://example.com/something (such as Python's requests) will without any extra effort default to insisting that the remote server has a certificate for the name example.com from a CA trusted in the Web PKI and knows the associated private key.
But even just ten years ago the equivalent library might have a flag labelled "verify" that defaults off, and it turns out even if you do set that flag you're only causing it to check the server can prove it owns this certificate, not that the certificate is for the name you wanted, or issued by anybody you trust. So even if you jump through hoops it's useless.
That wasn't a bug in TLS, and I'd argue that libraries offering moral equivalents of jwt.decode() isn't a bug in JWT, it's a bug in these libraries. These libraries are garbage, their authors should be ashamed.
Imagine if you got a library with a sort() method and careful reading of the documentation revealed that oops, you actually mustn't call sort() if you want things sorted you need to call trickyOrderingMechanism(ORIENTATION, input, mode_flags) and read sixteen extra pages of documentation because the sort() method just uses the "default" sort of "not sorted at all". Brilliant. Library goes in the trash right?
I don't really understand what you're arguing here. I don't think this is a "sharp edge" as much as people fundamentally misunderstanding JWTs. 'decode' seems like the proper term for taking the base64-encoded string and, surprise, decoding it. I mean, you can decode any JWT, without any verification, at https://jwt.io now. If you're calling decode, which doesn't take any kind of public key, how would you expect it to do any verification in any case? What exactly would one expect it to verify?
Verify then seems like the proper term for verifying that the JWT was in fact signed by the corresponding private key to the public key you pass that method.
I'm curious, how else would you define these libraries?
Not quite. It's normal to decode the header to find the right key. Yes, you can also do this as a callback, but callbacks make for confusing code flow. It's fine for jwt.decode to exist and be public - if it only decodes the header and not the body.
If it's unverified, then you can't trust the header to begin with. You can't trust that it even forms any kind of valid format. It could be a payload of "naughtystrings" for all you know.
And if you really must decode it, then you can call the base64 decode yourself - and know what you're doing is unsafe.
In the case of an unverified token, it is best to pretend the token doesn't even exist.
Literally incorrect. You can't verify the jwt prior to reading the header, because you don't know which of the advertised keys was used to sign the jwt.
I assume you're referring to this part of the spec:
> If the JOSE Header contains a "cty" (content type) value of "JWT", then the Message is a JWT that was the subject of nested signing or encryption operations. In this case, return to Step 1, using the Message as the JWT. [0]
This is also step 8 of validating a JWT. There are a number of validation steps you need to take before you can decide how to process the payload.
You'll also note that none of the previous 7 steps actually decode the payload ("the Message") - they only find it. And 4 of those steps are about verifying that the header isn't dangerous.
Again, the library can provide that functionality in a better way. In this case, the library could provide a means for the consuming code to be queried about what keys are available. E.g., in JS-like pseudo-code:
validator = new JwtValidator();
// By default, the validator should reject everything, until you call…
validator.add_signing_key(new RsaSigned(/* a strongly-typed RSA key */));
// I wouldn't even add this to the library, but let's
// say you want to comply with the RFC fully:
validator.add_signing_key(new CompletelyUnsafeNoneAlgorithm());
// (And I'd name it that explicitly, to let the consumer know that
// what they're doing is nuts.)
This validator knows exactly which algorithms — and keys — are valid.
If the full keyset is somehow not known ahead of time, a callback to enumerate keys of a certain type could be configured.
Better API design here prevents this vulnerability from ever happening.
(Personally, I wish the RFC had never added "alg": "none"; it's a mistake waiting to happen. I wish JWS/JWT could be updated to remove it.)
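The "alg": "none" footgun is easy to demonstrate: an attacker strips the signature and relabels the algorithm, and only an explicit allowlist (as the validator sketch above argues for) stops it. The `algAllowed` helper below is hypothetical, not any library's API:

```javascript
const mk = (obj) => Buffer.from(JSON.stringify(obj)).toString("base64url");

// A forged token: no signature, and a header claiming none is needed.
const forged = `${mk({ alg: "none" })}.${mk({ sub: "admin" })}.`;

// Check the (unverified) header's algorithm against an explicit allowlist
// BEFORE doing anything else with the token.
function algAllowed(token, allowlist) {
  const header = JSON.parse(Buffer.from(token.split(".")[0], "base64url").toString());
  return allowlist.includes(header.alg);
}

console.log(algAllowed(forged, ["HS256"])); // false -- rejected before any payload use
```

A verifier that instead trusts `header.alg` to choose its own verification mode will happily "verify" this token with no signature at all.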
> The other big mistake I've seen is encoding personal data in the token, instead of just an ID that only has meaning inside of your database. It turns out not everyone is aware that JWT tokens are publicly decodable and that the only security is in the signing + verification. Many people I've spoken with think the token is like a hash.
I have been working as a software engineer professionally for 3 years (the entire time with web technologies) and I learned this like 4 months ago... I think there needs to be more awareness of this.
Anytime one sees base64 text, have a little lookie at the decoded version, as there are usually all kinds of fascinating things that people will "encrypt" in b64
And you aren't the only one poking around in such weakly obfuscated data.
Part of the driving force behind the IETF's enthusiasm for end-to-end encryption is not high minded principles of privacy (though there certainly are plenty of people who care, not just from the EFF and ACLU) but engineers frustrated that protocols get ossified by idiots decoding packets and guessing what to allow rather than reading the documentation. If you encrypt everything, these idiots can't tell what is what and so they either block everything (which at least has the benefit of being obvious) or they give up and stop trying to do the impossible. Hooray.
The idiots also write passive aggressive "informational" documents about how the sky is falling because of all this encryption, but you get used to those.
I've raised a PR to rename decode() to unsafeDecode() (removing it from the public API might be a hard sell). I really have no idea why this is part of the public API.
I saw a set of junior devs implementing this thinking it was encryption, and not understanding the difference. I feel like the main JWT page needs a big warning that says “this is not encryption, but a way to verify the contents have not been modified by anyone but you”
Hey, I can't deny that was me 5 years ago. I was totally putting emails and other personal info into JWT's because "Hey, it's encrypted".
Live and learn haha
But yeah, definitely some lack of awareness around how JWT's work + best practices.
I think that Auth-as-a-Service platforms and the "npm install" culture aren't helping this at all.
In recent years, from conversations with other devs, it seems like many of them have never written an auth system themselves and couldn't do it if you asked. Coming from an era where rolling your own token-auth setup (or, on larger projects, even custom OAuth providers) was the norm, this is both a bit mindblowing and kind of scary.
> The other big mistake I've seen is encoding personal data in the token,
You’re right that this is a big mistake but I also thought this was in fact one of the primary allures of JWT - an attempt (misguided) to make session management stateless and avoid that database roundtrip on the server end?
They could rename it to unsafeDecode or similar. Yaml had a similar design problem where decoding something could execute code, which was unsafe. The solution was to leave the function but change the name.
Slightly OT but why should I choose a JWT over creating some opaque token (random bytes) and storing that in a database mapping it to a user's ID?
It seems like people create short lived JWTs and then pass some opaque token to an auth endpoint to get a new JWT signed whenever the old one expires, or they store the JWTs in a database so that they can be revoked, making me question their usefulness.
Because people think their app will immediately scale to a billion users and their database is unable to service all those users. (It won't and the database can)
Alternatively, they have drank the microservice kool-aid and think that making a request to an AAA service every time the user wants to do something is just too much overhead (It isn't).
If you have a federated and perhaps also heterogeneous system then "Just use a database" isn't the easy option.
Think about all the people/organisations that emit tokens you trust, or trust tokens you emit. Would they fit in an elevator? A meeting room? A conference venue?
If the code that mints all your JWTs and the code that verifies them are two methods in a class running in the same service and maintained by the same team, that's a sign you probably didn't need JWTs and an opaque token was more likely what you should use.
I remain skeptical that JWTs are a good idea for the general case. People like them because you can do stateless auth, but if your database really can't handle a single PK-based lookup per request, I feel like you have other problems. And as you say, a lot of people end up storing them in a database somewhere anyway.
The way I've seen it work is with having short lived access tokens and a refresh token, with the refresh token being saved to a database so it can be revoked. I think the benefit over an opaque token is that you have data that can be verified to be true and then passed on to multiple places. E.g passed between microservices
> Slightly OT but why should I choose a JWT over creating some opaque token (random bytes) and storing that in a database mapping it to a user's ID?
To avoid the database lookup. Often you might want to hold some state, without the latency.
JWT is about your server being able to be stateless, whilst the client is stateful, which can speed up some... Irritating... Places of performance problems.
But you then lose the ability to revoke a token on the backend, given that requires a DB lookup. Or you have very short-lived tokens, meaning that you don't have real benefits versus an opaque token in a DB.
IMHO JWTs only make sense in some constrained contexts, such as:
1. You want a report, click on “generate”
2. The processing starts, you receive a token to access the resource
3. Once the file is created you can access it by using the token previously received
In those kinds of short-term and limited use cases they can make things a bit nicer, as the “report generation service” only needs to check the token.
But in practice JWTs are often used as a general authentication/authorization mechanism, and that makes little sense (and brings a lot of overhead).
Okay, maybe I shouldn't have linked to a twitter thread :).
You shouldn't use alg none unless you have some other way of protecting the contents of your JWT (like, say, network isolation) and you have verified the performance impact of signing a JWT is worth the possible security impact.
Not too many folks are in this situation, so in general, sign your JWTs!
The library most people use for JWT stuff in Node is Auth0's. There are two methods for decoding a token payload: jwt.decode() and jwt.verify()
Both of them return the decoded payload, but one (if you don't read the documentation) sounds more immediately like the one you think you want to use.
jwt.decode() doesn't verify the token when decoding:
https://github.com/auth0/node-jsonwebtoken#jwtdecodetoken--o...