
Building a Stateless API Proxy - panarky
https://blog.thea.codes/building-a-stateless-api-proxy/
======
lvh
Firstly, great writing. Secondly, magic proxies are awesome. And
pyca/cryptography instead of something terrible like pycryptodome! All great
stuff.

Some critique of the crypto bits, since I'm a crypto person.

1\. Do you actually need asymmetric cryptography for this? It seems like at
some point the proxy has full authority, and it could just encrypt the token
for you symmetrically? (This is valuable because symmetric crypto is a lot
less precarious than asymmetric crypto, see next point)

2\. Please don't use PKCS1v15 padding for encryption in new systems. It's been
known to be busted for about 20 years now. We have workarounds, and they may
well be deployed in the exact context you're using it. But they keep breaking,
because we just have them to keep the infinite amount of already-deployed
software running, not because we think it's fundamentally fixable. This is
also the textbook example of a vulnerable service: one that takes ciphertexts
and decrypts them on demand. With PKCS1v15, I can modify the ciphertext so that
_the way you treat the modified ciphertext_ tells me something about how to
make private key operations. And in this setup, that means I get the real
token, so that sounds bad. The good news is that you've successfully designed
around it by adding a signature, so I don't think it's mountable... but.
Please, no more PKCS1v15 :-(

3\. It feels a little awkward to use JWT for the outside signature but not the
inside encryption. But less JWT is a good thing :)

Concrete suggestion: use Fernet (you're already using pyca/cryptography) or
libsodium's secretbox and then all of the crypto problems go away. You keep
the security engineering dilemma of stateful v stateless proxy (do I want the
real token in the proxy at all?) -- but that's another argument.

~~~
theacodes
This is great feedback. I'm definitely not a crypto expert. I'm happy to update
the post with a different padding algorithm if you want to suggest one. I'm a
little hesitant on swapping out RSA because I intentionally picked something
relatively simple and familiar to folks, but yeah, will still do some
research. Others had suggested ECC which I think is totally worth noting in
the post one way or another.

Thank you!

~~~
lvh
The answer to your question is OAEP but I feel like I'd still be doing you a
disservice there because I am convinced the answer ought to be box/secretbox
or Fernet or AESGCM maybe -- but that hinges on my question about asymmetric
cryptography which is elsewhere in the thread :)

~~~
theacodes
Okay, it's updated to use OAEP and PSS padding. Thank you!

------
kolektiv
Just a quick supportive comment on this - it's a really nice piece of writing,
introducing a topic which tends to cause people to glaze over, and doing it in
a very approachable and easy to pick up way - and that takes some skill and
effort!

~~~
theacodes
Thank you so much, I really appreciate the kind words!

------
dnet
Using Elliptic Curve cryptography would've resulted in much smaller
signatures; libsodium is considered secure and has bindings to most
sane/modern environments:
[https://download.libsodium.org/doc/](https://download.libsodium.org/doc/)

JWT also has its fair share of security issues in itself:
[https://paragonie.com/blog/2017/03/jwt-json-web-tokens-is-ba...](https://paragonie.com/blog/2017/03/jwt-json-web-tokens-is-bad-standard-that-everyone-should-avoid)

~~~
1_player
That JWT link sounds like hyperbole.

Can anybody chime in on whether JWT is absolutely broken as stated in the
article, or, while it has some issues, the author likes being a bit too
dramatic?

~~~
lvh
JWT is multiple layers of bad. My favorite summary is that it has poor
implementations of a harebrained scheme designed to solve a problem you don't
have.

The idea that I need to read the header, which is unauthenticated, to parse
the token violates the Cryptographic Doom Principle. Has that led to
vulnerabilities? Of course it has: I just said it violates the Cryptographic
Doom Principle.

The idea that it has everything plus the kitchen sink -- even for drastically
different behavior and opinions on how the world works, from symmetric
encryption to asymmetric signing and multiple implementations of each at that --
is anathema to modern cryptographic design. Wireguard has one scheme and it
does a lot more complicated stuff than "encrypt a session token".

JWT's saving grace here is that few people implement all of it. And ... that's
... cool? Until they do, of course.

You can argue that something is an implementation problem and not a spec
problem. Some issues definitely are, but if every major implementation has the
same damn bug, then I think it's a spec problem. Unauthenticated headers are a
spec problem. PKCS1v15 encryption is a spec problem. The fact that an implementation
can patch around it doesn't make it not a spec problem. I'm sitting on several
more vulns in ~every JWT library that are, to cryptographers, literally too
boring to publish even though one of them is _key recovery_.

Other posters have said that it's silly to say that merely the ability to use
it unsafely is a problem. But good crypto looks exactly like bad crypto while
you're doing it, and there's good crypto that doesn't have that set of
problems, so why would you ever choose the poor design?

(Don't use JWT.)

~~~
sarcasmic
Most people use the JWT compact serialization, which cannot carry the
unprotected header at all. If you're exchanging JWT compact tokens, the header
is protected by the signature or the encryption.

~~~
lvh
What? You mean protected by the _MAC_? The header is never encrypted: the
header is how you even figure out what to do with the rest of the message at
all. That is why it definitionally cannot be protected by it (that's the
definition of the Cryptographic Doom Principle!) and is how the bugs I am
referencing are exploited to begin with. The only sense that a JWT header is
"protected" is that the spec calls it that.

Have you ever exploited a JWT vuln? Which one? Because odds are there's a way
it boils back down to the JWT header design choice being silly.

I mean there's an easier way to have this conversation: if the header is
"protected", how did the alg=none bug ever work?

~~~
sarcasmic
In a JWS the header is integrity-protected by the signature if the alg isn't
none. This is prominently noted in the specs and alg=none artifacts are
referred to as "unsecured JWS". In a JWE the protected header is integrity
protected by the AEAD cipher, because all encs must be AEAD.

The alg=none substitution issues happened because of bad usage of mediocre
libraries. Other algorithm substitution can arise for the same reason. The
invalid curve attacks were the ones that the spec didn't call out as a
security consideration.

I support the arguments that say algorithmic agility is a bad idea and a new
protocol with algorithmic agility shouldn't have come out at a time when other
protocols (like TLS) were finally starting to catch on to this fact. But the
JWT cat is out of the bag, and won't go back in: it's widely deployed and
people are using it thinking it's solving problems they actually have.
Education is the proper remedy.

The PASETO effort attempts to provide better answers and better design to an
audience familiar with JWT, but there's also been an uptick in the kind of
advice that heavily condemns JWT without supplying some migration paths. That
latter brand of advice is harmful.

~~~
lvh
Same points I made before: if more than one library has a flaw, it’s a design
flaw and not a one-off implementation flaw, and if you’re trusting the header
before you validate (which is necessary!), then it is not meaningfully
protecting anything, which is why those bugs work.
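
To make the header-trust point concrete, here is a minimal sketch using only the Python standard library. It is not any real JWT library's code: `sign`, `forge_unsigned`, `naive_verify`, and `pinned_verify` are hypothetical helpers, and the token layout is a simplified JWS-like `header.payload.signature`. The naive verifier lets the header pick the algorithm; the pinned one ignores it.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _mac(key: bytes, signing_input: str) -> bytes:
    return hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign(payload: dict, key: bytes) -> str:
    """Issue an HS256-style token (simplified, hypothetical format)."""
    header = b64url(json.dumps({"alg": "HS256"}).encode())
    body = b64url(json.dumps(payload).encode())
    return f"{header}.{body}.{b64url(_mac(key, header + '.' + body))}"


def forge_unsigned(payload: dict) -> str:
    """What an attacker without the key can build: alg=none, empty signature."""
    header = b64url(json.dumps({"alg": "none"}).encode())
    body = b64url(json.dumps(payload).encode())
    return f"{header}.{body}."


def naive_verify(token: str, key: bytes) -> dict:
    """Reads the attacker-controlled header to decide how to verify."""
    header_b64, body_b64, sig_b64 = token.split(".")
    alg = json.loads(b64url_decode(header_b64))["alg"]
    if alg == "none":
        # Nothing to check, so any payload is accepted.
        return json.loads(b64url_decode(body_b64))
    if alg == "HS256":
        expected = _mac(key, header_b64 + "." + body_b64)
        if hmac.compare_digest(expected, b64url_decode(sig_b64)):
            return json.loads(b64url_decode(body_b64))
    raise ValueError("invalid token")


def pinned_verify(token: str, key: bytes) -> dict:
    """Ignores the header's alg and always requires a valid HMAC-SHA256 tag."""
    header_b64, body_b64, sig_b64 = token.split(".")
    expected = _mac(key, header_b64 + "." + body_b64)
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("invalid token")
    return json.loads(b64url_decode(body_b64))
```

With this sketch, `naive_verify(forge_unsigned({"scope": "admin"}), key)` returns the forged claims, while `pinned_verify` raises `ValueError` on the same input.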

And, finally: we’ve put together an extensive list of recommendations,
repeatedly, both in general and in the articles on this thread.

------
pm90
Slightly OT: GitHub has API rate limits that often slaughter us (we use
Jenkins to scan our GitHub org), and will get worse in the future as we
move/create more repos. Could this be used to alleviate that? I believe that
would need some caching, but ... I don't know how caches work exactly, and how
I would go about implementing that here...
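
For what it's worth, the simplest kind of cache is a time-to-live (TTL) cache: remember each response for a fixed period and only call upstream again once it has gone stale. A minimal sketch, assuming a hypothetical `fetch(url)` callable that does the real API request; `TTLCache` is not part of the proxy from the article:

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Caches fetch(url) results for ttl_seconds (illustrative only)."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # url -> (time at which the entry goes stale, cached body)
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no upstream request
        body = self._fetch(url)  # stale or missing: one upstream request
        self._entries[url] = (now + self._ttl, body)
        return body
```

Every cache hit is a request that never reaches the upstream API, so it does not count against the rate limit; the trade-off is that data can be up to `ttl_seconds` old.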

~~~
pseudobry
Take a look at Maintner, a tool made by the Go team to load and store GitHub
metadata, particularly issues, comments, reviews, and events:
[https://godoc.org/golang.org/x/build/maintner](https://godoc.org/golang.org/x/build/maintner).
It handles backoff, holds all data in memory, and supports backing up data to
GCS.

My team is successfully using it to track ~200k issues/PRs across ~300 repos.
We wrapped our deployment in a minimal API to give our infra blazing fast
access to that data without needing to worry about GitHub auth or rate limits
(since Maintner handles that).

And, we use the proxy described in the article.

------
languagehacker
This is great! A couple of years ago I needed to write a quick-and-dirty proxy
in Golang to get around some draconian security policies placed on us by the
team running our GitHub Enterprise server. We released it as an open-source
project here: [https://github.com/electric-it/hubbard](https://github.com/electric-it/hubbard)

------
siscia
Is there a reason why they don't encrypt both the token and the permissions?

This would remove the need for a signature altogether.

~~~
yardstick
You still need an integrity check or signature on the encrypted data,
otherwise it’s potentially possible for someone to tamper with the ciphertext
to change specific parts, such as the permissions.

If you are encrypting using an AEAD cipher like AES-GCM or ChaCha20-Poly1305
then it is already built in. But AES-CBC and others need an explicit
verification on top.
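
A small illustration of the tampering problem, using only the Python standard library. The "cipher" here is a toy XOR keystream built from SHA-256, chosen only because it is malleable in the same way unauthenticated stream or CTR-mode encryption is; it is not something to deploy, and all the function names are made up for this sketch. In real code an AEAD mode does the job of `seal`/`open_sealed`.

```python
import hashlib
import hmac
import os


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR with the keystream; the same call also decrypts."""
    return xor(data, keystream(key, nonce, len(data)))


def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then MAC the nonce and ciphertext (encrypt-then-MAC)."""
    nonce = os.urandom(16)
    ciphertext = toy_encrypt(enc_key, nonce, plaintext)
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def open_sealed(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
    """Check the tag before touching the ciphertext, then decrypt."""
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext was modified")
    return toy_encrypt(enc_key, nonce, ciphertext)


def flip(ciphertext: bytes, known: bytes, wanted: bytes) -> bytes:
    """Attacker's edit: needs the plaintext layout but no key."""
    return xor(ciphertext, xor(known, wanted))
```

Without the tag, `flip(ciphertext, b"role=guest", b"role=admin")` decrypts cleanly to `b"role=admin"`. With `seal`/`open_sealed`, the same edit makes `open_sealed` raise `ValueError`.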

~~~
nkrisc
Ok I'm a bit of a cryptography noob, but are you saying it's possible to alter
the encrypted token such that when decrypted by the private key the
permissions are changed, in this example, but the real GitHub token is not?

Edit: never mind, I think I've just misunderstood the process outlined in the
article. I confused the tokens.

------
ricardbejarano
I built something similar for the Amazon SES (Simple Email Service) for my
cron jobs and other private applications to use.

[https://github.com/ricardbejarano/postino](https://github.com/ricardbejarano/postino)

------
twblalock
As the article mentioned, revocation is a problem with the stateless approach.
I've never seen a way to revoke individual tokens statelessly -- you either
need to maintain state about the valid tokens, or maintain state about a
revocation list.

~~~
greiskul
I don't think it's possible to do revocation in a stateless manner. The token
that you are revoking was once valid, and when you decide to invalidate it,
this change of state needs to be persisted somewhere. But yeah, if revocation
is rare and only a few tokens are invalid at any given time (which is easy to
arrange by adding an expiry field to tokens), keeping a revocation list is the way to
go.
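
A rough sketch of what that could look like, assuming each token carries a unique `id` and an `exp` timestamp and that its signature has already been verified elsewhere. The class and field names are hypothetical, not taken from the article's proxy:

```python
import time
from typing import Dict, Optional


class RevocationList:
    """Holds revoked token ids only until those tokens would expire anyway."""

    def __init__(self) -> None:
        self._revoked: Dict[str, float] = {}  # token id -> token expiry time

    def revoke(self, token_id: str, expires_at: float) -> None:
        self._revoked[token_id] = expires_at

    def prune(self, now: float) -> None:
        # Expired tokens are rejected on expiry alone, so drop their entries.
        self._revoked = {t: e for t, e in self._revoked.items() if e > now}

    def is_revoked(self, token_id: str) -> bool:
        return token_id in self._revoked

    def __len__(self) -> int:
        return len(self._revoked)


def is_token_usable(
    claims: dict, revocations: RevocationList, now: Optional[float] = None
) -> bool:
    """claims must come from a token whose signature was already checked."""
    current = time.time() if now is None else now
    if claims["exp"] <= current:
        return False  # stateless check: expiry is inside the token itself
    revocations.prune(current)
    return not revocations.is_revoked(claims["id"])  # the only stateful check
```

The shorter the token lifetime, the smaller the list stays, since an entry only has to live as long as the token it blocks.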

------
mgraczyk
Cool stuff! I use a similar thing in my own servers to generate short-lived or
one-time access to individual requests. Nice to be able to share a URL or
`curl` command with somebody to see what I am seeing or for ad-hoc permission
grants.

------
sleepychu
How does the proxy get the initial token? Do you hand it a GH token and get
back your magic token?

~~~
simonw
Yes, it has an API you can call that will encrypt the token with the proxy's
key. Implementation here: [https://github.com/theacodes/magic-github-proxy/blob/master/...](https://github.com/theacodes/magic-github-proxy/blob/master/src/magicproxy/async_proxy.py#L32)

------
kureikain
Does anyone know what tool was used to generate the images in the post?

~~~
theacodes
I hand-drew the illustrations using an iPad Pro & Photoshop. The code
highlighting is using Sphinx + witchhazel.thea.codes.

------
0xdeadbeefbabe
This raises the question: why not be GitHub? I bet some permutation of this idea
occurs to the next developer.

