BeyondCorp: How Google Ditched VPNs for Remote Employee Access (thenewstack.io)
256 points by dsr12 on Jan 22, 2018 | 88 comments



Most companies need to be able to answer the question, "is this client one of ours," when protecting sensitive resources.

Most companies will instead answer the question, "is the client on our network," and pretend that it was the same question. The fact that it clearly is not has some very obvious security implications and attack vectors that we've been living with for decades.

BeyondCorp tries to answer the original question about device identity more directly, rather than substituting the network question for it.

The fact that this approach is novel says a lot about the maturity of our industry.


Even the question "Is this client one of ours?" is a bad question to ask. A much better question is "Is this specific action authorized and authenticated?".

When you only authenticate a client, with a mechanism such as TLS mutual auth or ALTS, you still aren't really authenticating the actions, just the channel. That leaves the system open to request-smuggling attacks, hijacking attacks, context-mismatch attacks (TLS is particularly cumbersome here, because authentication contexts can change mid-request), layering violations like credential-lifetime problems (do you tear down a previously opened connection when the credential used to establish it expires or is revoked?), and vulnerabilities in the channel authentication mechanism itself (e.g. X.509 and ASN.1 are both notorious problem areas).

I work at AWS, so I'm biased, but it seems much stronger to me to use a system that AAA's each action, like a request signing protocol (ours is https://docs.aws.amazon.com/general/latest/gr/signature-vers... ).

Request signing systems like that authenticate each action, which is very granular; they permit strong offline signing (just like my iPhone banking app does), and they're agnostic to the details of networks and clients, so things like hijacking and smuggling just can't work.
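
To make that concrete, here is a minimal sketch of per-request HMAC signing (this is not the actual SigV4 algorithm; the field names and canonicalization are simplifying assumptions):

    // Minimal sketch of per-request HMAC signing (TypeScript/Node).
    // Not real SigV4: fields and canonicalization are illustrative.
    import { createHmac } from "node:crypto";

    interface SignableRequest {
      method: string;
      path: string;
      timestamp: string; // bounds the replay window
      body: string;
    }

    // Client and server must agree on one canonical byte string; a real
    // scheme also escapes newlines so a header can't smuggle extra lines.
    function canonicalize(req: SignableRequest): string {
      return [req.method, req.path, req.timestamp, req.body].join("\n");
    }

    function signRequest(req: SignableRequest, secretKey: string): string {
      return createHmac("sha256", secretKey)
        .update(canonicalize(req))
        .digest("hex");
    }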

Of course, today's browsers aren't really set up for this; they don't support built-in request signing, but I still find it a little weird to see VPNs/networks traded for a model whose expiry date also went by years ago.


Request signing can be easily implemented in JavaScript for API requests with WebCrypto, but it's not clear to me in what threat model it would be beneficial: even if the keys are not exportable, users don't generally see what actions they authorize, and as such they don't know what exactly they signed.
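
For reference, the browser-side mechanics the parent describes might look roughly like this (a sketch only; provisioning the key material to the browser is the hard part and is not shown):

    // Sketch: HMAC request signing in the browser via WebCrypto.
    // The key is imported as non-extractable, so page scripts can
    // sign with it but cannot read it back out.
    async function signWithWebCrypto(
      rawKey: ArrayBuffer,
      message: string,
    ): Promise<ArrayBuffer> {
      const key = await crypto.subtle.importKey(
        "raw",
        rawKey,
        { name: "HMAC", hash: "SHA-256" },
        /* extractable */ false,
        ["sign"],
      );
      return crypto.subtle.sign("HMAC", key, new TextEncoder().encode(message));
    }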


The main threats are at the protocol and network level. For example, request-smuggling and hijacking attacks can take the form of bugs in proxies and servers that allow requests to be smuggled because they don't escape newlines in headers, and so on. With signing, these requests don't validate.

It takes a much smaller TCB, and no connection state, to AAA a signed request, but with mutual auth you need a state machine, and the TCBs for X.509/ASN.1 validation tend to be huge. That's not what you want in a security-critical control. Honestly, enabling normal TLS mutual auth likely degrades security in most cases, as it opens the server to whatever attacks the X.509 processing is vulnerable to. ALTS mitigates this somewhat by using protobufs, but that's still a very, very big TCB. Compare that to, say, the TCB for validating HMAC or Ed25519.

Then there's the basic stuff I already pointed out, like sessions lasting longer than their credentials are valid for. The layering violation invites these kinds of issues.
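
Sketching the verification side (reusing the canonicalization from the earlier sketch): the code that has to be trusted is tiny compared to an X.509/ASN.1 stack, there is no connection state, and a per-request freshness check means a credential can't outlive its validity the way a long-lived session can:

    import { createHmac, timingSafeEqual } from "node:crypto";

    const MAX_SKEW_MS = 5 * 60 * 1000; // illustrative replay window

    // Recompute the MAC over the canonical bytes and compare in
    // constant time. Every request is checked on its own; nothing
    // hangs around after the credential expires.
    function verifyRequest(
      canonical: string,
      timestamp: string,
      signatureHex: string,
      secretKey: string,
    ): boolean {
      const ts = Date.parse(timestamp);
      if (!Number.isFinite(ts) || Math.abs(Date.now() - ts) > MAX_SKEW_MS) {
        return false; // stale or malformed timestamp
      }
      const expected = createHmac("sha256", secretKey)
        .update(canonical)
        .digest();
      const given = Buffer.from(signatureHex, "hex");
      return given.length === expected.length && timingSafeEqual(given, expected);
    }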


Hmm... But you'll still use TLS for transport, so adding mutual auth doesn't really increase the attack surface that much (TLS already requires the ugly ASN.1 and X.509). Or maybe you're suggesting using signed requests without TLS (plain HTTP)?


A TLS server without mutual auth doesn't need to do any online X.509 processing, and only a tiny amount of ASN.1 (parsing the DH share, which is easy). It just serves certs, without parsing them.

Mutual auth increases the TCB by a lot; the Kolmogorov complexity increases by several orders of magnitude.


Good point. Although I think the design was chosen because of the complexity of the infrastructure behind your TLS server: adding more things for the client to do so that you don't need to trust any intermediaries inside AWS data centers. (I'm not complaining, just an observation from my POV.)

(For the record, in BeyondCorp all backend components are mutually authenticated, but they still use sessions and U2F tokens, so there are no trusted points.)


The long and short of BeyondCorp:

- Instead of a single VPN that will expose your entire squishy corporate LAN to anyone who gets VPN access, each application gets its own protected proxy.

- The protected proxies query a centrally-aggregated auth/authz database, which can work with client-side software to ensure qualities such as private key possession, full disk encryption, software updates, etc. In Google's case, this is combined with a host-rewriting browser extension for usability.

- Access proxies can easily funnel HTTP traffic, but some more clever solutions involving tunnels exist for plain old TCP and UDP.

By giving every application its own authentication and access control proxy, each application is secured on its own, hence "zero-trust."
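
As a rough illustration of the pattern (a toy sketch; checkAccess, the x-device-id header, and app.internal are invented placeholders, not Google's actual interfaces):

    // Toy per-application access proxy: authenticate and authorize
    // every request against a central decision point, then forward
    // to the single application this proxy protects.
    import http from "node:http";

    // Placeholder: a real deployment queries the central auth/authz
    // database (user identity, device inventory, required trust tier).
    async function checkAccess(authz?: string, deviceId?: string): Promise<boolean> {
      return Boolean(authz && deviceId);
    }

    http.createServer(async (req, res) => {
      const deviceId = req.headers["x-device-id"];
      const ok = await checkAccess(
        req.headers.authorization,
        typeof deviceId === "string" ? deviceId : undefined,
      );
      if (!ok) {
        res.writeHead(403).end("access denied");
        return;
      }
      // Forward the vetted request to the protected application.
      const upstream = http.request(
        { host: "app.internal", path: req.url, method: req.method, headers: req.headers },
        (appRes) => {
          res.writeHead(appRes.statusCode ?? 502, appRes.headers);
          appRes.pipe(res);
        },
      );
      req.pipe(upstream);
    }).listen(8443);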


Do you have more information on these 'protected proxies' that you mention? My understanding was more that each client was given a health check of sorts and was either allowed or not allowed after meeting a certain number of criteria.


For commercial solutions, take a look at Zscaler's offerings.

The biggest problem with such solutions is that you have to identify the applications you have, and sure, Google is a relatively new tech company. Try identifying all the apps/services at big old companies. Until you do that, or decommission them, you either have to keep your old VPN solution up, or proxy ALL traffic and use the analysis tools these products provide to identify apps/services; again, no walk in the park.

Of course, as mentioned, Google has most of its apps/services on the Internet anyhow, so you mostly just use the host-checking and client-identification functionality of such tools.


You could probably keep the VPN and incrementally move services "out", starting with low-hanging fruit (few dependencies) and/or popular ones (e.g. (web)mail). Basically a low-cost/high-benefit trade-off.

At some point you'll have a few dinosaurs left on the VPN, and you can take those services quietly out back and retire them permanently.


This paper should have the most relevant answers to your questions:

BeyondCorp: The Access Proxy

https://research.google.com/pubs/pub45728.html


Each time an employee tries to connect to an application, the access proxy evaluates how much trust that session has earned, and if it's equal to or greater than the trust required by the application, the proxy allows the traffic through.

The trust earned is based on host/machine information as well as user and authentication information.
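
In code terms, the evaluation might look something like this (a sketch; the tiers and signals are illustrative, not Google's actual rules):

    // Sketch of tiered trust: score the session from device and user
    // signals, then compare against the tier the application requires.
    interface DeviceState {
      managed: boolean;
      diskEncrypted: boolean;
      osPatched: boolean;
    }

    function earnedTrustTier(device: DeviceState, userHasU2F: boolean): number {
      let tier = 0;
      if (userHasU2F) tier = 1; // strong user auth
      if (tier >= 1 && device.osPatched && device.diskEncrypted) tier = 2;
      if (tier >= 2 && device.managed) tier = 3; // fully managed device
      return tier;
    }

    // The proxy forwards traffic only if earned >= required.
    const allow = (earned: number, required: number): boolean => earned >= required;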


> All the corporate resources are behind this uber-reverse proxy. Based on a decision provided by its “trust engine,” the proxy makes the decision of whether or not to provide access to the desired application. If permissions are in place, according to the tiered trust model, it forwards the requests to the application, along with the security credentials.

To me this sounds like a firewall/vpn moved into the application layer.


Sounds like the difference between "authenticate, then access this application" and "authenticate, then access our network which has this application accessible on it."


I mean, what the OP said implies what you're saying.


I know the approach described in the article is not particularly new [0], but I think it deserves to get more traction than it does (AFAIK).

I do have some other questions, though:

1) Does this infrastructure support BYOD, and if so, what does the provisioning process look like?

2) What permissions do employees have on their devices?

3) How are device compromises handled?

[0] https://research.google.com/pubs/pub43231.html


The answer to both the BYOD and permissions questions is the "tiered" device-trust part of the article. You, the policy-maker, decide how certain you are that a device hasn't been pwned given its provenance and user-access story, and you assign a "trust tier" accordingly, which determines what resources it can access.

I don't think beyondcorp necessarily changes your incident response story, assuming you already have one.

A lot of this discussion glosses over the fact that U2F really makes this a viable system. U2F solves the MITM problem and ensures that anyone who logs in does so with a company-issued hardware authenticator in physical communication (usually USB, but maybe also NFC or Bluetooth) with the client device. This means that even in a BYOD story, there's a piece of corp-issued hardware always attached. This in turn means that impersonation requires physical device theft in addition to credential theft.
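
The MITM resistance comes from origin binding. A conceptual sketch (heavily simplified; real U2F signs appIdHash || flags || counter || clientDataHash, where the client data embeds the browser-observed origin):

    // Why U2F resists MITM phishing: the signed payload binds the
    // origin the browser actually talked to, so an assertion captured
    // at https://evil.example fails to verify for https://corp.example.
    import { createHash } from "node:crypto";

    function signedPayload(origin: string, challenge: Buffer): Buffer {
      const originHash = createHash("sha256").update(origin).digest();
      return Buffer.concat([originHash, challenge]);
    }

    // The relying party verifies the signature over
    // signedPayload(expectedOrigin, challenge); a phishing proxy
    // produces a mismatched origin hash and the login fails.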


BYOD is possible to at least some degree, with devices like Chromebooks. A few years ago, I was able to use my personal Chromebook to access internal sites. I tried just a few. The sole requirements I remember were a stock install of the OS, a work profile and a Yubikey.


> A lot of this discussion glosses over the fact that U2F really makes this a viable system.

This. Really, BeyondCorp is only amazing insofar as it takes full advantage of U2F. U2F is the real (and lasting) innovation we're looking at here.


> A lot of this discussion glosses over the fact that U2F really makes this a viable system. U2F solves the MITM problem and ensures that anyone who logs in (…)

Makes viable: certainly; solves: not so sure. Session hijacking doesn't magically cease to be a problem.


It becomes much less of an issue if the connection is re-negotiated periodically, and a new key may require a physical action (touch) from the key generator.


I don't know more than what's mentioned in public sources, but I can guess:

1) BYOD is likely supported, but provisioning installs the invasive device monitor (as a Chrome extension [0])

2) The paper mentions that you can at least install your own printers. I would think that SW engineers would have full access, subject to the device monitor being happy.

3) Since all keys (both per-user and per-device) are stored centrally, it should be fairly easy to revoke all device keys in this case.

[0] https://static.googleusercontent.com/media/research.google.c...


>BYOD is likely supported, but provisioning installs the invasive device monitor

Probably not. Since they lock down which apps can run, I doubt they'd trust some home-rolled, untrusted, unverified install with unknown executables running.


The article specifically states "Provisioning Chromebooks for new employees is a minimal processing, taking no longer than 90 seconds worth of configuration settings." Other than the badly worded phrase "a minimal processing" I think you can reasonably assume you have your answer.

The article also mentions that "The model benefited the fact that all of Google’s internal applications are already on the Web". I'm just going to assume that means Google employees are using web-based remote access for development. If that's the case then their Chromebooks are really just expensive dumb terminals. We may differ on this but I personally don't see any reason to drop $1k+ on a dumb terminal just to say I own the hunk of plastic I do my job on every day, particularly when my employer will hand me one for free when I start my new job.

Based on that assumption I think you can also reasonably assume permissions around futzing with the needed browser plugins are limited and device compromise is handled how Chromebooks normally handle most problems: wipe and reinstall.


You are reading too much into that sentence. Most software engineers at Google actually have a MacBook as a laptop, plus a Linux workstation.


And that MacBook is primarily a web browser and/or really expensive dumb terminal.


Primarily... but then it's a MacBook for secondary uses.

You can't run minikube on a dumb terminal while off at a conference.


I guess the question really is can you download code from the repos to your own desktop or laptop and do anything with it?


As far as I recall, you can download code to your desktop but not your laptop, unless you’re doing something like iOS development that would be difficult to do over say ssh -Y.


I'm not a Googler, but from my reading about their dev culture: it wouldn't be practical, as such, for a lot of what that dev server is really used for. Much development, however, is API-driven and based on isolated logic. Those individual components or spike solutions would make absolute sense on the laptop.

More importantly: with the kinds of operations Google is running these days, you'd really expect lots of the APIs you'd need for product development to be readily accessible.

So, yeah, you'd be hacking on your laptop separately, and maybe doing some kinds of work. They run a monorepo, though, so it'd be of a somewhat peripheral nature. Proper dev is what the dev station is for.


> If that's the case then their Chromebooks are really just expensive dumb terminals.

Your comment seems to be mostly a bunch of wild speculation, but if you are presupposing they use Chromebooks as dumb terminals, I don't know where you're getting this $1000 figure from. If that were the use case, then there's no reason not to go with the cheapest Chromebooks available, which can be as low as $200.


I have no position on whether Google employees are essentially using dumb terminals or not but...

> if you are presupposing they use Chromebooks as dumb terminals, I don't know where you're getting this $1000 figure from. If that were the use case, then there's no reason not to go with the cheapest Chromebooks available, which can be as low as $200.

That's wrong: A dumb terminal is basically a screen, a keyboard, and a network interface. A $200 Chromebook is going to have much lower quality versions of all three than a $1000 Chromebook. So, there is a very good reason why Google wouldn't issue "the cheapest Chromebooks available."


Because the Pixelbook is $1,000, and I figure that's the bottom-end system you'd get from Google. But as others have stated, it is completely speculation on my part. All I have to go on is the mention of checking that all apps were web-enabled as part of the migration. That implies people aren't just building a VPN from their desktop and then ssh'ing around a private LAN after the migration, which was also the point of the article.

So my original assumption was: they must be using a web interface to some sort of Unix desktop and then sshing from there, right? However, on second thought, the article implies that once you're trusted you're trusted, and all systems enforce their own security, so maybe people are sitting at Unix desktops in Starbucks, authenticating via a certificate, and then trusted across the public interfaces of Google's assets. That too would make perfect sense and still fit the spirit of the article.

The question is really do you have trusted access from a portable device or simply via one?

Again this really is speculation on my part and I'll happily own that. I can only dream of the level of access the average Googler has; the regulatory oversight and internal interpretation of same has resulted in far harsher limits in my space.


Totally wild speculation! And the $1,000 comes from me mentally mapping Chromebook to Pixelbook.


I'm not a Googler, but from reading previous publications, BYOD is not supported. This stems from the fact that for BeyondCorp to work, they need to be sure the device has not been tampered with, and only having trusted boot really ensures that.


Android and x86 at least have no credible assurance about that. Boot chain integrity does nothing to stop you from gaining access through local privilege escalation vulnerabilities.


But it means you can whitelist trusted OS releases. And ensure devices have all the security updates they claim to.

You'd still have intrusion detection, firewall, user presence detection, monitoring at the proxy, etc. to guard against threats posed by local exploits.


After the user (or malware) gets code execution in ring 0, the whitelists and detection mechanisms are no longer effective.


They're a necessary condition, not a sufficient condition. (Good signal if present, bad signal if absent.)


Any corporate security mechanism requires the devices not to be tampered with...

If anything, BeyondCorp reduces the impact if a device does get compromised.


In a BeyondCorp-like architecture, BYOD is a policy decision. Google as a POLICY has generally said no to all unmanaged machines.

Some resources might require a managed machine by policy, but others may not.

Imagine your Corporate Cafe menu. You don't want it posted on Buzzfeed that y'all be having filet mignon on Tuesdays, but you really don't care about the posture of a device. Contrast this with your internal source code repository or wiki. You might really care about device posture for that type of resource.

Previously you had to put both the Cafe Menu and your Source Code under one big hammer: The VPN. Once you logged into the VPN you had carte-blanche access to all kinds of things.

BeyondCorp gives you an L7 place to drive POLICY decisions in a consistent manner.


>Once you logged into the VPN you had carte-blanche access to all kinds of things.

In my experience this isn't generally true and hasn't been true in properly run organizations since the late 90's. I don't know if it was the case at Google at some point in the past (seems unlikely). Everywhere I've worked for decades has viewed the internal network as hostile. You need credentials to access every internal site/app/resource (including the cafeteria menu). The VPN is just an extra onion layer to guard against screwups with internal endpoint protection, and because to be honest it is pretty easy to deploy so why not.


Partially agree, but the world is a big place. Lots of random internal resources do exist, even at big companies.

Internal resources are owned by many separate teams. They implement AuthN / AuthZ on their own. Resources might prompt for a username & password and then do an LDAP Bind with them, or they might have a local database, or they might use an SSO/SAML, or any other number of mechanisms.

Resource owners want to move fast, they want new internal apps. Central IT/Security wants to add WAFs, 2FA, centralized logging, and all kinds of other controls.

The BeyondCorp model moves these responsibilities into an easier-to-deploy model. It's now centralized as a service, rather than each internal app needing to buy five security appliances that they are required to put in their rack.


No disagreement on all that. More layers of security are generally better. The places I worked generally had a centralized SSO service and a strong security team that would hunt down and kill services deployed without authentication.


Yes, but my point is that only when the corporation issues the hardware itself can it be sure that the user did not tamper with it. (It's not clear to me if that's what you're saying...)


How do you give someone physical control of a device and then ensure the device hasn't been tampered with? And how is this scenario different if the company owns the device vs the employee?


But how is this specific to a BeyondCorp design?


There's a lot more about the end user experience in the most recent published paper: https://research.google.com/pubs/pub46366.html


Reasons your company will never adopt BeyondCorp:

- Your company does not do real inventory management, or if they do, they do it partially, and poorly.

- Your company does not manage all the infrastructure in a single place. Adopting this kind of authentication+authorization requires putting more trust in your local network in order to allow your internet gateway access to all your internal resources.

- Your company's network traffic is made up of a bunch of random protocols, authentication+authorization realms, and edge cases, half of which won't work with this system.

- You don't need it.

- If you have a couple hundred thousand employees and millions to burn on building universal integration systems, you're going to write your own stuff anyway.


> Your company's network traffic is made up of a bunch of random protocols

Basically anything other than HTTPS


Lots of things become simpler when everything is a layer-4 or layer-7 service protected by SSL, and when you can mandate hardware and software upgrades across your entire device fleet.

Another reuse of this philosophy can be seen in the Istio project, which combines Kubernetes and the Envoy proxy to authenticate and secure communication within a microservice architecture.


Um, I'm not Google, but I think for a typical small/medium-scale tech business we're not there yet.

> "The problem with the “castle” approach is that once the perimeter is breached, the entire internal network, and all the associated applications, are at risk. “Do not trust your network. It is probably already owned”"

Consider the example of a common software-development company: we may assume they use a VPN to get into a private network containing their project-management, git, development, staging, backup, documentation, and other servers/applications. Each of them requires user authentication, and each user has their own privileges. The VPN here adds an extra layer of security. But either way, behind the VPN or not, the services potentially carry the same level of risk: implementing perimeter security doesn't imply a lack of security in the services within.

> "Google’s approach involves comprehensive inventory management, one that keeps track of who owns which machine in the network. A Device Inventory Service collects a variety of live information about each device […] Employees get the appropriate level of access regardless of what device they are using or where in the world they are logging in from. Lower levels of access require less stringent checks on the device itself. […] The applications themselves are routinely checked for breaches by vulnerability scanners."

> "VPN was cumbersome to use, and slowed performance, especially for overseas workers. And it is no walk in the park for admins either. To set up a new user, the admin would typically have to configure the cloud network, along with setting up the IPSec rules and firewall rules, the VPN. This is followed by a lot of testing"

Again, I'm glad that it works for Google and that they're able to routinely check all hardware credentials and servers "for breaches by vulnerability scanners", but this whole passage, and the complexity of the scheme behind it, gives me a headache. I think I'll continue to rely on a traditional VPN, but based on the modern, lean WireGuard [1].

[1] https://www.wireguard.com/


Take a look at Google Cloud IAP; it's essentially a stripped-down version of BeyondCorp for public use on Google Cloud. I've used this as a customer of Google's with great success; it really does just work.

Disclaimer: I work for Google Cloud.


You may wish to reconsider your position of relying on WireGuard. From their website:

> WireGuard is not yet complete. You should not rely on this code. It has not undergone proper degrees of security auditing


I would trust my WireGuard significantly more than any installation of IPsec, if nothing else because I will almost certainly configure IPsec in a manner that is completely insecure, but not obviously so.


It's true, but apparently it's getting close to being merged into Linux [1][2].

[1] https://www.phoronix.com/scan.php?page=news_item&px=WireGuar...

[2] https://www.phoronix.com/scan.php?page=news_item&px=Systemd-...


Does the new Cloudflare Access product (maybe with their Warp product) address your concerns? It means visitors never need access to your physical network; their traffic only goes to the specific machines you intend.


Probably. But I'm not inclined to rely on third-party companies for core business infrastructure. The company behind the VPN I portray above most probably uses in-house, self-hosted GitLab/Confluence/etc. instances. However, it's absolutely okay to outsource such needs when it's appropriate.


Last time this created a lot of discussion on HN: https://news.ycombinator.com/item?id=14596613




I think there are two points of view here. On the one hand, yes, if someone gets into your castle, they may have a lot of access you don't want to give them. However, that assumes that things inside these walls are also not secured, which is often not the case.

The other point of view is that the castle wall gives you added protection against unknown unknowns, i.e. security issues in your now-public-facing infrastructure. By just exposing everything publicly, you create this potentially big risk. Google's preaching here glosses over this fact.


It's not quite removing the walls; think of it more as a ton of tiny castles dotting the landscape, each with equally good walls, instead of one giant wall around the kingdom and some unprotected wooden huts inside.

By doing this they want to force those managing internal apps to put up protection on the level they would for any other external service. Overall it's leading to better security (at least that's what they argue).


> that assumes that things inside these walls are also not secured, which is often not the case

I've operated under the old castle doctrine for many years. It is my experience that in the vast majority of places, once the wall of VPN / packet filtering has gone up, people suddenly relax and forget about internal patches. There are exceptions, but they are uncommon.


BeyondCorp sounds great in theory, but deployment sounds like a nightmare without going to one of the several companies that are offering it as a service. It's certainly not as accessible as a decent VPN w/ 2FA, and I doubt we'll see mass deployment for smaller groups until then.


We wrote a lot more about what we did to ensure the end-user experience was good in https://research.google.com/pubs/pub46366.html

You're right that it's still early for companies that don't have the same resources of a company like Google, but products are slowly starting to emerge to make it more turnkey, so I have high hopes that this will be the norm for new companies in a few years.


BeyondCorp is indeed complex, but I don't know about outsourcing it to someone else. It's the core of one's company's infrastructure, so it'd be extremely risky to let someone else manage it. Maybe it depends on the company size, but then again, a small company doesn't need this level of protection (device keys in TPM, a device-management and health-checking service, etc.).


BeyondCorp is not just about using secure channels and authentication, it's also about using secure end-points (=minimal data breach impact).


It's really nice that we see more and more awareness of Zero Trust and specifically Google's BeyondCorp whitepaper. If you're looking to experiment with this model yourself, check out the following open-source projects. While they might not implement everything in Google's BeyondCorp paper yet, they are pretty close to the full thing and address many of the issues raised in the comments.

-> OAuth2 Authorization Server https://github.com/ory/hydra

-> Identity & Access Proxy (early access): https://github.com/ory/oathkeeper

If you have questions don't hesitate to ask.


These look great, a couple questions:

1) Are these deployed at scale anywhere?

2) Any known security audits?

Thanks!


1) Hydra is deployed at scale, Oathkeeper is our new kid on the block

2) We have an OpenID Connect certification coming in, but no security audits so far


Early access? What's next? DLC and loot boxes?


You play too many video games ;)


Are there any decent open source implementations that a small company could deploy?


Check out scaleft.com. They have tooling to help implement this. It looks to be largely focused on ssh access but there is some stuff about controlling app level access also.


I'm from ScaleFT; thanks for the mention. It's true that our original focus was SSH/RDP access, but we've recently introduced Web access as well.

https://www.scaleft.com/blog/how-to-deploy-a-beyondcorp-styl...

I agree with many commenters that it appears transformative, but that's only through the lens of Google. Centralized access control at Layer 7, through a proxy service that can authenticate and authorize requests while brokering encrypted sessions, isn't that out of reach. Our goal at ScaleFT is to offer as much of it as a service as we can.

Where things do get tricky, though, is with the access policies and device attestation in a BYOD environment. Admittedly, we have work to do in this regard, but it may not require a full MDM layer. Really, you only need to query device state at a given time to make an authZ decision.

Love to see BeyondCorp get more coverage, and I hope to see further adoption outside of Google.


Hi, I'm Max, one of the two people who gave the talk this post references. I work in Google Cloud and help publish papers about how we have done BeyondCorp. Ask me questions!


Can someone explain what makes "corporate" applications so different?

Why does my corporate bug tracker need different security from Gmail? Does that different security do anything?


Typically the corporate environment includes many "line of business" apps, which are often a simple web interface onto a custom application. It's not common for these to have Gmail-level hardening, and so security is provided by restricting access to the underlying network with a VPN.


> It's not common for these to have Gmail-level hardening

Translation for those not in the business: they are custom J2EE applications made by the combined forces of overpriced consultants and the CEO's nephew, running on unpatched Windows 2003 servers with a plain-HTTP login form (obviously vulnerable to SQL injection) and a DB listening on 0.0.0.0 with default admin credentials. But it's on the internal network, so don't worry, no hacker can get there.


Maybe this is naive, but can this sort of thing be accomplished with the likes of FreeIPA and NFSv4 (with Kerberos auth), or is this an entirely different ballpark?


This is great for "people" accessing server resources but not so much for automation / API access.


Why can't the "Uberproxy" (in Google terms) consume authentication mechanisms like SPIFFE [1] certificates and allow access to protected resources via those?

There's no reason a BeyondCorp architecture needs to make automation or API access harder.

[1] - https://spiffe.io/


Because for Google it's not enough that you have a certificate if your client is using an old unpatched OS and is potentially vulnerable.


Quite the opposite.


This seems very similar to what Microsoft is trying to do with Azure AD and Intune.


Why not just use Citrix?





