Securing Microservices (facilelogin.com)
162 points by prabaths on Oct 12, 2017 | 56 comments

If you're using microservices and care about security, do yourself a favor and use a monorepo.

A lot of improving security is about changing things in small ways but across the entire fleet. If you have microservices without a monorepo you oftentimes need to make the same changes in potentially hundreds of places.

A monorepo makes it a lot easier to enforce standards across services: code coverage, testing, unsafe function use. Repo sprawl makes microservice security very challenging, and it isn't mentioned in this blog post. Losing track of services and leaving specific services behind is not good.

> If you're using microservices and care about security, do yourself a favor and use a monorepo.

This seems like a strong reminder that "microservices" aren't really about having lots of independent little systems but are a different way of factoring your one big system.

It's like FizzBuzz - do you handle the 3 first or the 5?

You handle the 15 first to avoid the accumulator requirement.

My 15 falls through both the 3 and the 5 paths. So I must do 3 first, or 15 will be BuzzFizz.

No, I'm saying just do a switch with four branches, %15, %5, %3, default, and break all of them. That way you're explicit about the ambiguous case and you avoid string concat or stateful stdout logic.
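A minimal sketch of the four-branch approach described above, written in Python (which has no fall-through switch, so each branch returns instead of breaking). The function name `fizzbuzz` is only an illustration:

```python
def fizzbuzz(n: int) -> str:
    """Return the FizzBuzz word for n, checking the combined case first."""
    if n % 15 == 0:   # explicit branch for the ambiguous "both" case
        return "FizzBuzz"
    if n % 5 == 0:
        return "Buzz"
    if n % 3 == 0:
        return "Fizz"
    return str(n)     # default branch: the raw number
```

Because the %15 branch comes first and every branch returns, there is no string concatenation and no dependence on what was already printed.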

I think the point was that you do "if n%15==0" first, then the others. But it’s a pedantic point anyway (like this one).

Or you can print fizz and buzz in independent statements, fall back to printing the raw number if neither was printed, and then print a newline at the end for each number.

yes, yes, a thousand times yes.

We implemented our services like this a few years back. It worked really nicely. But in everything that I have read I have never seen any references to this practice. I didn't even know it was called "monorepo". I was just assuming everybody was using multiple repos and we were weird.

Google and Facebook are the largest examples of monorepos that I'm aware of. Here is a write-up about it overall, though you don't need to be convinced :)


Both Google and Facebook are also dealing with such large repos that they've needed to start either customizing or building their own SCMs. MS started the GVFS (Git Virtual File System) project to do a similar thing to suit their needs.

Most people aren't at that scale, but IMO, many of the benefits people get from monorepos you also get by using GitHub/GitLab with master projects that map in git repos via submodules.

Anyway, it really sucks to work on extremely large monorepos when you don't have access to the same resources as Google and Facebook. For this reason I'm personally always hesitant to recommend monorepos as the be-all-end-all.

A good number of engineers from Facebook and Google are actively developing Mercurial, even if it gets comparatively less attention from the general public.

Facebook also maintains a set of custom extensions for it [0], and there is an interesting talk about the reasons behind their choice [1].

[0] https://phab.mercurial-scm.org/diffusion/FBHGX/

[1] https://m.youtube.com/watch?v=gOVD-DrUpwQ

There is also Mononoke, a HG server being built in Rust: https://github.com/facebookexperimental/mononoke

(work for FB)

That's really interesting. I've updated Wikipedia to mention this.

The fact that Google and Facebook go to such lengths means that there is a big benefit to using a monorepo. Or perhaps they are stuck with it.

> For this reason I'm personally always hesitant to recommend monorepos as the be-all-end-all.

Maybe not a monorepo, but at least try to avoid having too many repos.

Again, I go back to the sub-module approach with some small amount of automation built around that. Managing sub-modules kinda sucks, but if you use it as a reference to all of the products in production and as a synchronization point for delivery, then it gives you all the benefits of the monorepo, without the performance issues.

> it really sucks to work on extremely large monorepos

What kind of scale are we talking, and what issues do you get?

I combined 50 repos into a monorepo. It sucks because git is very slow and the logs are noisy, but having 50 repos sucks worse. Deploys also take forever, but at least they're atomic now. The real issue was the previous devs copying and pasting codebases because they heard microservices were web scale. Overall I'm happy with the tradeoff: it's made it a lot easier to unify the look and feel of the platform by just doing a platform-wide find and replace, and to revert across the board if stuff goes wrong. It's also allowed me to start removing duplicate codebases.

Instead of doing 50 repos for 50 apps, a better idea is to do layers: one repo for the backend, one for the frontend, one for the cache layer, etc. If you start separating out your Electron app from your web app into separate repos, you'll find you need even more repos for shared UI elements. It can easily lead to copy-pasting or, worse, an explosion of repos. You don't want to tell your new hires "welcome to ABC corp. As your first task, clone 90 repos and npm install each one". If you're going to do it, at least write a script to set the whole build up in one go.

Also keep in mind the tech giants mostly used monoliths up until thousands of employees before refactoring to microservices. For auth you should probably have every app validate tokens by going out over the network to the auth microservice. This way you can easily switch, for example from JWT to sessions, in one place.

600k+ files at 12 GB of repo (without history). I've been trying to work out what options we have to get off our old SCM. Right now Git is potentially too slow, and that's just the local system problem. Git LFS works decently well for large files.

I've explored lots of different options, and hope to look at mercurial at some point, but am not hopeful.

Uber as well.

A monorepo is certainly worth the consideration, but there are other options if you leverage CI (such as dependent builds).

I was functioning as an architect at a pretty large company and we used Spring Boot. My team wrote a number of internal starters (and our own parent POM) that all other teams would use, and set it up so that the services would build/test/deploy after our base POM and starters were released. It obviously takes a bit of time and tooling to do that, but it was working quite well for us and still kept us from needing to update the same thing manually in 20 places (that is, until we’d need to release a breaking version).

This is overstating. Monorepos have some terrible trade offs.

I think a monorepo works well for company cultures that have a lot of internal code. It works less well when you have a highly decentralized mode of operating (the whole point of microservices IMO) and a lot of shared externally written open source code. Repo sprawl isn’t an issue if you have known orgs - the security team’s CI/CD checks them all and files issues or PRs to them all.

"Care about security" is a bad way to phrase it. Of course everyone cares about security; only the levels and requirements differ. Please note that I'm only responding because this fallacy comes up a lot, and one needs to be transparent about trade-offs when suggesting something like microservices with a monorepo:

* If you have 100 services in a monorepo, then it needs a completely different toolchain like bazel/buck (all new) or cmake/qmake/etc. to find out the whole dependency graph, with deep integration with the SCM to rebuild only changes and downstreams, avoiding a 2-hour build, avoiding 10 GB release artifacts, scoped CI/commit builds (i.e. triggering a build for only one changed folder instead of triggering a million tests), independent releases, etc.

* Some more tooling for large repository and LFS management, buildchain optimization, completely different build farm strategies to run tests for one build across many agents, etc

* Making sure people don't create a much worse dependency graph or shared messes, because it's now easier to peek directly into every other module. Have you worked in companies where developers still have trouble with maven multi-module projects? Now imagine 10x of that. Making sure services don't get stuck on shared library dependencies. You should be able to use guice 4.0 for one service, 4.1 for another, jersey 1.x for another, jersey 2.x for another, etc. Otherwise it becomes an all-or-nothing change, and falls back to being a monolith where services can't evolve without affecting others

* A monorepo does not mean it's easy to break compatibility and do continuous delivery (no: there are database changes, old clients, staggered rollouts, rollbacks, etc. Contracts must always be honored, no escaping that; a service has to be compatible with its own previous version for any sane continuous delivery process)

Imagine monorepo like: building a new Ubuntu OS release for every Firefox update, and then work backwards, doing it for every single commit. I'm not even scratching the surface of anything here. It changes everything - from how you develop, integrate, test, deploy, git workflows, etc. This is why big monorepo companies like Facebook/Google release things like bittorrent-based deployments, new compression algorithms, etc - because that's the outcome of dealing with a monorepo.

I may go as far as to say this, after many, many journeys:

Monorepo with monolith - natural, lots of community tooling, lots of solved problems.

Multi-repos with multi-services - natural, lots of community tooling, lots of solved problems.

Anything else without the right people who have already done it many times, and you're in for a painful rediscovery journey of what Google/Facebook went through, and this does not have as much knowledgebase/tooling/community/etc as other natural approaches.

Additionally, it's important to eliminate as much duplicate code as possible through shared directories/libraries or else you will end up in file/dll hell.

Sure. We do this with Ansible, and a site.yml mapping roles to systems.


Changes are hard (tm)

The execution order changes with each change in requirements.

Should I commit this now? What if another teammate runs it right now?

It's not easy.

There is a new standard forming for providing identity with this kind of architecture, called SPIFFE. Check it out at https://spiffe.io. It's basically mutual TLS but with identity baked into the certificate. Along with specifying what the certificate looks like, there is a reference implementation called SPIRE that generates and distributes the certificates.

SPIFFE's next SF community day is 3 November. To learn more about this event and other project updates, join the Google Group (https://groups.google.com/a/spiffe.io/forum/#!forum/announce).

I assume SPIFFE is more useful for system-to-system authentication without the end user context - like how Netflix uses short-lived certificates to secure interactions between microservices (https://medium.facilelogin.com/short-lived-certificates-netf...)?

That's the primary motivation and main focus for SPIFFE: providing service-to-service identity. However, because it's not breaking any of the standards, it's potentially applicable in other contexts. The SPIFFE SVID (the certificate standard) doesn't do anything weird or different with TLS certs (which is actually a strength); rather, it sets out a way to use the existing cert infrastructure to provide identity.

> A signed JWT is known as a JWS (JSON Web Signature) and an encrypted JWT is known as a JWE (JSON Web Encryption). In fact a JWT does not exist itself — either it has to be a JWS or a JWE. It’s like an abstract class — the JWS and JWE are the concrete implementations.

This is backwards; a JWT is the payload of (usually, IMO) a JWS, sometimes a JWE. But not all JWSs/JWEs are JWTs, so JWE/JWS cannot be called a concrete implementation of a JWT.

> Both in TLS mutual authentication and JWT-based approach, each microservice needs to have it’s own certificates.

JWT doesn't, to my knowledge, make use of certificates. I'm less clear on the JWE cases, but JWS's only carry the algorithm used to do the signing, and the signature. You have to know/figure out what key signed it to verify it.

Further, if you're using the HMAC algorithms, you're definitely not using a cert.

In fact JWT is an abstract concept - I have written a blog about that in detail. Please find it here - https://medium.facilelogin.com/jwt-jws-and-jwe-for-not-so-du...

HMAC is not recommended - as it uses a symmetric key. In fact you will find more details in the above link...

Forgive me if I don't find your own blog a decent citation. Re-reading your first article a bit closer, I can see how you might interpret it, I think (and perhaps I was not charitable enough during my first read), but I still think it is perilously close to mixing JWS being a concrete serialization of a JWT (this is correct) with JWS being a JWT (this is not).

JWS is simply a format that contains a signature for an arbitrary (that is, not always JWT) payload. See the "typ" header in the JWS RFC (in fact, see the entire RFC)[1]; were a JWS always a JWT, we would have no need for "typ". In fact, the RFC for JWS never mentions JWT in the normative parts of the standard — it is only ever mentioned during examples, since JWT is perhaps the primary consumer of the JWS standard. It calls this out, explicitly:

> While the first three examples all represent JSON Web Tokens (JWTs) [JWT], the payload can be any octet sequence

JWT is the concrete thing in that it is a JSON document encoding a set of claims, and comes either signed (wrapped in a JWS) or encrypted (wrapped in a JWE). JWT's RFC[2] states this fairly directly:

> The claims in a JWT are encoded as a JSON object that is used as the payload of a JSON Web Signature (JWS) structure or as the plaintext of a JSON Web Encryption (JWE) structure

While I see what you're getting at with "JWT is an abstract concept" (that you need to wrap a JWT inside a JWS or a JWE), that does not mean that all JWSs are JWTs, and while the text can certainly be interpreted as "you must wrap a JWT in either a JWS or a JWE", I feel it toes the line too close to "all JWSs are JWTs", particularly at, "A signed JWT is known as a JWS". Having a JWS doesn't imply that it is a JWT; for example, the ACME protocol uses JWS, but not JWT. The distinction here is subtle, but important, I feel.[3]
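To make the distinction concrete, here is a rough sketch using only the Python standard library. It builds two JWS compact serializations with HS256: one whose payload is a JSON claims set (a JWT), and one whose payload is an arbitrary octet sequence (a JWS that is not a JWT). The key, the claim values and the helper names are made up for illustration:

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWS compact form requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jws_compact_hs256(payload: bytes, key: bytes, typ: Optional[str] = None) -> str:
    """Return header.payload.signature for an arbitrary payload, signed with HS256."""
    header = {"alg": "HS256"}
    if typ is not None:
        header["typ"] = typ
    encoded_header = b64url(json.dumps(header, separators=(",", ":")).encode())
    signing_input = encoded_header + "." + b64url(payload)
    signature = hmac.new(key, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


key = b"demo-key-not-for-production"  # hypothetical key

# A JWT: the JWS payload is a JSON object of claims.
claims = {"iss": "auth_service", "sub": "alice", "exp": 1507852800}
jwt_token = jws_compact_hs256(json.dumps(claims).encode(), key, typ="JWT")

# A JWS that is not a JWT: the payload is just bytes.
plain_jws = jws_compact_hs256(b"<note>not JSON claims</note>", key)
```

Both results have the same three-part shape; only the first one carries claims, which is why "signed JWT" and "JWS" are not interchangeable terms.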

> HMAC is not recommended - as it will be symmetric key. In fact you will find more details in the above link...

Not recommended by who? For what reasons?

Your article never mentions HMAC AFAICT (it mentions MAC in the process of describing JWS, but no further). And yes, use of HMAC implies a symmetric key, but that isn't necessarily insecure: it just means that anything that wishes to validate JWTs signed with that key must have the key to do so, and thus, must be trusted with that key. If you have a single service (say, an "auth" service) that is responsible for validating JWTs, this works fine, and is a great tradeoff for the additional complexity that signing w/ RSA keys brings. E.g.,


  client  -- login --> auth_service
          <--  JWT  --

  client  -- JWT+cmd --> foo_service
                           -- is this JWT valid? --> auth_service
                          <--      yes, it is    --
          <-- success --

Here, the HMAC key only needs to reside on the auth service, so it is reasonably well contained. The tradeoff, of course, is that we need to make a network request to auth service to validate JWTs, but we don't need to deal with RSA. For some setups, this is perfectly acceptable. (Swapping out RSA keys directly here results in no more or less "secret" stuff on any given node.)

The big advantage to RSA keys, of course, is that any service can verify JWTs without being able to issue them, but if you want to swap out the secret (the private key), you'll need to touch a lot more places, or have some infrastructure to distribute the public key.
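The HMAC flow from the diagram can be sketched roughly as follows, standard library only. The class names `AuthService` and `FooService`, the five-minute default lifetime and the in-process call standing in for the network request are all assumptions for illustration:

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


class AuthService:
    """Issues and validates HS256 tokens; the only holder of the HMAC key."""

    def __init__(self, key: bytes) -> None:
        self._key = key

    def _sign(self, signing_input: str) -> str:
        mac = hmac.new(self._key, signing_input.encode(), hashlib.sha256)
        return _b64url(mac.digest())

    def issue(self, subject: str, ttl: int = 300, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
        claims = _b64url(json.dumps({"sub": subject, "exp": int(now) + ttl}).encode())
        signing_input = header + "." + claims
        return signing_input + "." + self._sign(signing_input)

    def is_valid(self, token: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        parts = token.split(".")
        if len(parts) != 3:
            return False
        expected = self._sign(parts[0] + "." + parts[1])
        # Constant-time comparison of the encoded signatures.
        if not hmac.compare_digest(expected.encode(), parts[2].encode()):
            return False
        try:
            claims = json.loads(_b64url_decode(parts[1]))
        except ValueError:
            return False
        return now < claims.get("exp", 0)


class FooService:
    """Holds no key; asks the auth service whether a token is valid."""

    def __init__(self, auth: AuthService) -> None:
        self._auth = auth  # stands in for a network call to auth_service

    def handle(self, token: str, cmd: str) -> str:
        if not self._auth.is_valid(token):
            return "denied"
        return "success: " + cmd
```

Note that `FooService` never sees the key, which is the containment property described above; the price is one validation round trip per request.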

[1]: https://tools.ietf.org/html/rfc7515#section-4.1.9

[2]: https://tools.ietf.org/html/rfc7519

[3]: Doubly so since I feel there is a lot of ill-will towards JWT, and many misconceptions about it. It's a good format, IMO, but the messaging around it needs to be crystal clear if people are going to ever stop fearing and start understanding it.

Okay - rereading your comments - looks like you have misinterpreted this one.

"A signed JWT is known as a JWS (JSON Web Signature) and an encrypted JWT is known as a JWE (JSON Web Encryption)"

This is a correct statement. This does not mean JWS is a JWT all the time.

This is well highlighted in the blog link I shared with you: https://medium.facilelogin.com/jwt-jws-and-jwe-for-not-so-du...

"Yes, you read it correctly, the payload of a JWS necessarily need not to be JSON - if you’d like it can be XML too."

Well, I am not quite clear from your comment how you interpret it. This is my point - as the JWT RFC also rightly states.

"JSON Web Token (JWT) is a compact, URL-safe means of representing claims to be transferred between two parties. The claims in a JWT are encoded as a JSON object that is used as the payload of a JSON Web Signature (JWS) structure or as the plaintext of a JSON Web Encryption (JWE) structure, enabling the claims to be digitally signed or integrity protected with a Message Authentication Code (MAC) and/or encrypted."

"JWTs are always represented using the JWS Compact Serialization or the JWE Compact Serialization."

A JWT will only exist as a JWS or JWE. It does not exist by itself - it's an abstract concept.

Regarding HMAC - it's not recommended in the context of this article. Authenticating with shared keys is not a recommended approach in a distributed environment.

Some good stuff in here: I have been spending a lot of thinking time on this problem, so this was helpful to validate some thoughts. But you lost me when you went on about XACML. Does anybody use XACML still? I feel like that is a dinosaur left behind with SOAP?

I agree XACML has a lot of complexities. But if you look at the recent developments, you can now have both XACML request and response JSON based - and the communication between the PEP and PDP in a RESTful manner. Also - there is a standard coming up to have a JSON representation of XACML policies. BTW, this blog only presents an architectural model - it can be any policy language. Recently I found Netflix uses the same model for policy distribution but instead of XACML, uses PADME. For me, more than the language, the issues XACML has are maintainability, auditability and governance. There are tools around to support that. Even PADME does not solve these problems.

I've done a fair bit of reading around policy-based authorization but have never heard of PADME. For the life of me I can't find anything about it in any Google searches. Can you point me at a reference for any information about it?

This was discussed at a Netflix meetup. The official site is www.padme.io. Also you can find the video recording of that meetup from https://www.youtube.com/watch?v=dim85J5cLq4 - OPA and PADME are discussed from 33:49. Also check http://www.openpolicyagent.org/.

The article has pretty much summarized security, but did not talk about JWT revocation techniques. Some may argue that short-lived JWTs do not need revocation: they will expire, and the refresh of the token can be blocked by the authorization server. But what about long-lived JWTs? Think of mobile apps that log in once and stay logged in until an explicit logout. In cases such as those, how do we revoke a rogue JWT?

The standard way, I think, is to issue a short-lived self-validating ('stateless') token for access and a long-lived validation-required ('stateful') token for access-token renewal. The mobile app logs in once and uses the access token until it's about to expire; the remote app server doesn't need to perform an online validation since the access token is self-validating. When the access token is about to expire, the client requests a refreshed access token from the remote token server using the refresh token.

In other words, there's not a long-lived self-validating JWT. If for some reason that were required, you might rotate the shared secret or signing key.
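A rough sketch of that split, standard library only. The access token here is a simplified "body.signature" string rather than a full JWT, and `TokenServer`, `verify_access` and the 300-second lifetime are illustrative assumptions. The refresh token is an opaque random string that only means something to the token server's own table:

```python
import base64
import hashlib
import hmac
import json
import secrets
from typing import Dict, Optional, Tuple

ACCESS_TTL = 300  # seconds; an assumed lifetime for the sketch


def _sign(key: bytes, body: str) -> str:
    return hmac.new(key, body.encode(), hashlib.sha256).hexdigest()


def verify_access(key: bytes, token: str, now: int) -> Optional[str]:
    """Self-validating check: no lookup, only signature and expiry."""
    body, _, sig = token.partition(".")
    if not hmac.compare_digest(sig.encode(), _sign(key, body).encode()):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims["sub"] if now < claims["exp"] else None


class TokenServer:
    def __init__(self, key: bytes) -> None:
        self._key = key
        self._refresh_tokens: Dict[str, str] = {}  # server-side state

    def _access(self, subject: str, now: int) -> str:
        claims = {"sub": subject, "exp": now + ACCESS_TTL}
        body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
        return body + "." + _sign(self._key, body)

    def login(self, subject: str, now: int) -> Tuple[str, str]:
        refresh_token = secrets.token_urlsafe(32)
        self._refresh_tokens[refresh_token] = subject
        return self._access(subject, now), refresh_token

    def refresh(self, refresh_token: str, now: int) -> Optional[str]:
        # Online validation happens only here, once per access-token lifetime.
        subject = self._refresh_tokens.get(refresh_token)
        return None if subject is None else self._access(subject, now)

    def revoke(self, refresh_token: str) -> None:
        self._refresh_tokens.pop(refresh_token, None)
```

Revoking the refresh token does not kill an access token that is already out, so the worst-case exposure after a logout is one access-token lifetime.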

I think this is the biggest drawback to "stateless" JWTs that most people gloss over when championing them. If you need to be able to revoke stateless JWTs, you probably shouldn't be using stateless JWTs in the first place, because you'll have to re-add state to the system to handle revocations. And at that point why not just go with the simpler mental model in the first place.

And since giving users (or you) the option to "log themselves out of other computers" on a fine-grained basis is often an important feature to provide, it essentially means stateless JWT's aren't a great solution for sessions in most web apps.

That's just what I've gathered personally. If anyone has a counter argument for this case I'd love to hear it!

> If you need to be able to revoke stateless JWT's, you probably shouldn't be using stateless JWT's in the first place, because you'll have to re-add state to the system to handle revocations. And at that point why not just go with the simpler mental model in the first place.

Because you can speed up the normal app path by using stateless access tokens and only require online validation to issue a new access token.

> And since giving users (or you) the option to "log themselves out of other computers" on a fine-grained basis is often an important feature to provide, it essentially means stateless JWT's aren't a great solution for sessions in most web apps.

The 'stateless' tokens may have arbitrarily short lifetimes, e.g. one or five minutes.

An app may issue multiple requests per second or minute; there's a speedup if those requests don't require online validation. Moving validations from multiple per second to once per five minutes saves a huge amount of system load.

It's an economic tradeoff between the costs of incorrectly giving access and the cost of performing validation for every access.

Hey @zeveb, I appreciate the response!

But the grandparent actually mentions the short-lived approach, and is asking about cases where you want to have longer-lived tokens. And where you still want to be able to revoke them.

Stateless tokens can definitely speed things up. But if you end up needing to add state back into the equation for revoking tokens, then I'd argue that most web apps would be better off sticking with the much simpler mental model of non-stateless tokens. That is, until they decide that their lowest-hanging performance fruit is the extra validation work, and that it's worth complicating their architecture to remove it.

> But the grandparent actually mentions the short-lived approach, and is asking about cases where you want to have longer-lived tokens. And where you still want to be able to revoke them.

If that's actually what you want, then of course you want online-validated tokens. But I think generally it's not what you actually want: you actually want short-lived self-validating access tokens and online-validated refresh tokens.

Note that talking about an online-validated token's lifespan is a little silly: there's no good reason for it not to live forever (until revoked).

Well... yes - one way to do that is to propagate revocation events from the issuer to the upstream applications - and each upstream application, possibly at the gateway level or at an interceptor, will check the incoming tokens against a list of revoked tokens. You may also check: http://openid.net/wg/risc/.

Like a revocation list of JTIs in an in-memory distributed cache to be checked by the edge service, yes not a bad idea, though there is a cost involved there.
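As a sketch of that idea: a revocation list keyed by the token's "jti" claim, where each entry only has to live until the token would have expired anyway. A plain dict stands in for the distributed cache here, and `RevocationList` and `edge_check` are hypothetical names. The claims are assumed to come from a token whose signature has already been verified:

```python
from typing import Dict


class RevocationList:
    """In-memory stand-in for a distributed cache of revoked token IDs."""

    def __init__(self) -> None:
        self._revoked: Dict[str, int] = {}  # jti -> token expiry time

    def revoke(self, jti: str, exp: int) -> None:
        self._revoked[jti] = exp

    def is_revoked(self, jti: str, now: int) -> bool:
        self._prune(now)
        return jti in self._revoked

    def _prune(self, now: int) -> None:
        # An expired token is rejected on expiry alone, so its entry can go.
        expired = [jti for jti, exp in self._revoked.items() if exp <= now]
        for jti in expired:
            del self._revoked[jti]


def edge_check(claims: dict, revocations: RevocationList, now: int) -> bool:
    """Accept already-verified claims only if unexpired and not revoked."""
    if now >= claims["exp"]:
        return False
    return not revocations.is_revoked(claims["jti"], now)
```

The cost mentioned above shows up as one cache lookup per request at the edge, but the list stays small because entries can be dropped as soon as the underlying token expires.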

Yes - revocation is always tricky - that's why Netflix moved to short-lived certs - and forgot about cert revocation. Here is a blog I wrote on Netflix model: https://medium.facilelogin.com/short-lived-certificates-netf...

This article illustrates well the authentication problem with HTTP. These methods look very complicated and do not properly address the problem.

A JWT is a nice thing, but the associated information (req./resp.) is not signed. We have to use TLS with certs, but TLS is point-to-point security based on certificates and is not part of HTTP. It’s just a secure pipe.

The reason we can’t have a correctly designed security model and protocol is because HTTP was not designed for that. One cannot sign an HTTP req/resp because the headers may be rewritten or modified, etc. Same problem with SMTP.

Because of this design property, attempting to get a properly conceived authentication system on top of HTTP is like trying to force a cylinder into a square hole. They simply don’t match, or we end up with big gaps.

This is vastly overstating the case. HTTP was deliberately designed to enable orthogonal approaches to security; it’s why it has endured in the face of evolving to TLS 1.3 and OAuth2 headers - we learn more about security as an industry but the core semantics don’t change.

HTTP also has clear semantics of which headers do and do not get to be rewritten. Some web APIs do require signatures over those headers, but in practice it has been too much of a burden on developers to get this right compared to using subject/CA-verified TLS (which is hard enough).

The combo of HTTPS+OAuth2+JWT is pretty reasonable in practice, and in some language ecosystems (Java and Spring Boot for example) requires little code to implement: https://spring.io/guides/tutorials/spring-boot-oauth2/

Dsig of each request/response is what we used to do in the WS-Security days and was terribly slow and tricky.

Can you give an example of a protocol which you consider to have a good security model with signed request/responses? It's an honest question, I just don't know many protocols besides the more common and older ones (HTTP, FTP, SMTP, IMAP).

> API Gateway intercepts the request from the web app, extracts out the access_token, talks to the Token Exchange endpoint (or the STS), which will validate the access_token and then issues a JWT (signed by it) to the API Gateway.

What is "it" in this quote? Will JWT be signed by API Gateway?

Otherwise... a great article! It made my understanding of security principles in an architecture like that MUCH clearer.

It should be signed by the STS - which is trusted by all the downstream microservices. The STS, who validates the access_token, in the response can send back this signed JWT to the gateway. The STS of the access_token and this JWT can be the same or two different ones, based on the use case...

Thank you! I'll ask some more questions tomorrow after I sleep on it, if you don't mind.

Rather than managing a PKI for your JOSE (and all the revocation headache that comes with that), why not just go with JSON Web Keys (https://tools.ietf.org/html/rfc7517) and rely on your Web PKI for safe delivery of public keys?
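For reference, a JWK for an RSA public key is just a small JSON object, and a JWK Set wraps a list of them under "keys". A sketch of building one with the standard library follows; the modulus below is a placeholder integer, not a real key, and the "kid" value and helper names are made up:

```python
import base64
import json


def b64url_uint(value: int) -> str:
    """Base64url-encode a non-negative integer as big-endian bytes, no padding."""
    length = max(1, (value.bit_length() + 7) // 8)
    raw = value.to_bytes(length, "big")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def rsa_public_jwk(n: int, e: int, kid: str) -> dict:
    """Build an RSA public-key JWK from the modulus n and public exponent e."""
    return {
        "kty": "RSA",
        "use": "sig",
        "alg": "RS256",
        "kid": kid,
        "n": b64url_uint(n),
        "e": b64url_uint(e),
    }


# Placeholder modulus for illustration only; a real one comes from a generated key pair.
placeholder_modulus = (1 << 2047) | 1
jwks = {"keys": [rsa_public_jwk(placeholder_modulus, 65537, kid="demo-key-1")]}
jwks_document = json.dumps(jwks, indent=2)
```

A document like `jwks_document` would then be served over HTTPS by the issuer, so verifiers trust the key because of where they fetched it from rather than because of a certificate chain attached to the token.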

I was a little disappointed with the Newman book because arguably the hardest thing to get right with a microservices system is what to do when things go wrong, when things fail, specifically partial failures. I also don't remember much about security in the book, so I am thankful for this article.

This is an excellent article. It's not easy to manage, but if you get OAuth 2 right, you should be secure.
