Hacker News
Restrict Access to your internal websites on AWS with BeyondCorp (transcend.io)
137 points by giacaglia 6 months ago | 56 comments

The BeyondCorp paper explicitly mentions that device state is taken into consideration when granting access to a user, i.e. that the device is identified and controlled, not just the user. It seems to me like an important part of the BeyondCorp access model; otherwise, wouldn't this just be an SSO portal?

You are correct. The solution presented is not a BeyondCorp implementation but rather an SSO layer that adds authentication to the internal application.

For BeyondCorp, a solution essentially:

* Must be Layer 7 protocol- and access-privilege-aware (achieved by an identity-aware access proxy).

* Promotes authorization as opposed to authentication only.

* Should be able to enforce security policies (time, location, context, 2FA).

* Must be aware of the security state of the user device.
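The requirements above can be sketched as a single policy decision at the proxy. This is an illustrative toy, not any real BeyondCorp implementation; the record fields and the `authorize` helper are assumptions made for the example:

```python
from dataclasses import dataclass

# Hypothetical device/user records; field names are illustrative,
# not taken from any specific BeyondCorp product.
@dataclass
class Device:
    managed: bool          # enrolled in the device inventory
    patched: bool          # OS patch level meets policy
    disk_encrypted: bool

@dataclass
class AccessRequest:
    user: str
    groups: set
    device: Device
    mfa_passed: bool
    country: str

ALLOWED_COUNTRIES = {"US", "DE"}

def authorize(req: AccessRequest, required_group: str) -> bool:
    """Authorization, not just authentication: every check must pass,
    including the security state of the user's device."""
    return (
        req.mfa_passed
        and required_group in req.groups
        and req.country in ALLOWED_COUNTRIES
        and req.device.managed
        and req.device.patched
        and req.device.disk_encrypted
    )
```

The point of the sketch is that the device checks sit in the same conjunction as the user checks: a valid user on an unmanaged or unpatched device is denied.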

Shameless plug: Check out our zero trust service access project TRASA (https://github.com/seknox/trasa). It's free and open source and addresses many of the requirements outlined by BeyondCorp.

Since you clearly seem to know what you're talking about: What would be a good resource for getting started with zero trust networking?

Heh. Though I am not an expert on the topic, I can recommend a few things. First, there are three directions the industry is heading with this "zero trust" thing.

(1) Zero trust access (like BeyondCorp, protects application and services when a user, user credentials, user devices are compromised)

(2) Network micro-segmentation (contain impact when one network segment is compromised, dynamic network assignment)

(3) Zero trust browsing (protection for users from getting infected with malicious contents served by trusted but compromised websites)

Honestly, I am only really familiar with zero trust access, and for this, I recommend you first read "BeyondCorp: A New Approach to Enterprise Security" [0] by Google. The trend was kickstarted by that paper.

0: https://research.google/pubs/pub43231/

Azure AD provides a hook for this through Conditional Access, which will block sign-in to an application if your device isn't compliant with security policies or updates (or if you are logging in from an unapproved country).[0] Google provides something similar through Context-Aware Access, but I don't know if it goes as deep (Google used Puppet in the original paper to get device state info).[1]

0: https://docs.microsoft.com/en-us/azure/active-directory/cond...

1: https://support.google.com/a/answer/9275380?hl=en

large enterprise deployments of phones or company owned desktops/laptops, etc, very commonly include what would be called "network admission control" software. The device needs to meet a certain defined state of patch level/service pack/antivirus scan/other things (like GPO registry settings on a windows machine) before being allowed to sign on.

it's all good to theoretically say that smaller companies should adopt a 'beyondcorp' type approach. but at a certain threat-model level on the client device (keystroke loggers + tools that send screenshots somewhere else, as is found on black hat remote access tools/botnet tools), you need specialists in endpoint/workstation device security keeping on top of threats and defining the security policy.

what sketches me out about this particular article is that they're essentially trusting any client endpoint device that has the 2FA hardware token, and has a working browser. you could have a totally screwed up windows 10 laptop riddled with some very nasty RATs that would work fine to use the 2FA authentication tool, and sign in to their service with chrome in a browser. there's nothing about verifying the state of the software and trustworthiness of the operating system of the client device which might be potentially accessing very sensitive internal information.

i see literally nothing in that article about inspecting or trusting the state of the operating system or software on the client device. does it have a bunch of malicious browser plugins? who knows. is it running a remote desktop tool that's linked to somewhere else? who knows. is it infected with an advanced remote access tool? who knows. is it six months out of date on windows updates? who knows...

the article's assertion that a vpn based approach is like an eggshell is false in my opinion. you should not have an environment where simple vpn auth allows you in to the squishy inner center of private data. a belt and suspenders approach is needed.

Indeed, and the industry term for this is posture assessment. And many companies take this a step further and permit access only with organization-issued equipment, even if you possess authentication credentials.

Having 100% company owned equipment allows you to do other common sense things like:

a) full disk encryption with key escrow for recovery by admin team

b) storage of crypto public/private key pair on disk of laptop, for instance an openvpn key file that was created on a company owned PKI server, deployed onto the laptop as part of its provisioning process, and is a unique key for both the human and that particular piece of hardware

c) you can use the same crypto key pair on client device local storage, if not for something like openvpn, for other authentication purposes identifying that particular user and hardware

d) obviously, have the device trust your own internal PKI's root CA for access to purely-intranet resources. getting a company root CA trusted by the browsers in a BYOD device environment is a pain in the ass.

What does Google do in this regard? I don't work there, so I'm curious to hear about their solution for endpoint security.

From what I've heard, they don't allow sensitive data on laptops in the first place—you mostly SSH into your desktop or a cloud machine. That's probably not enough to solve the issues you described, so I wonder what else they do.

I would be shocked if they don't have a whole team of people keeping up on the threat models for client workstation windows, macos and linux endpoint devices, and creating the equivalent of windows active directory registry pushes+other software loads to guarantee the condition of an endpoint device.

Otherwise how do you know an endpoint device (assuming it's on a network segment with a default route out to the internet, or belongs to somebody in a work-from-home mode) isn't running a persistent video recording session feeding something like a realtime mirror of the screen, or a VNC-over-SSH tunnel to some third party.

At smaller size companies I have even seen stories of a person who was hired as a fully remote developer by $software_corp, and proceeded to set up a remote desktop tool and subcontract their entire job to a person in $developing_country, at a significant profit margin.

I wonder if there are any big companies that require HDCP and some sort of "trustworthy USB devices only" system.

There are companies where your (airgapped) workstation stays in the office and your phone / any other personal electronics stay outside the office.

DRM is not security.

Nevermind the fact that HDCP has been broken for ages and any random Chinese capture card will ignore it and Chinese HDMI splitter will strip it, if the purpose is to cheat the system then you can just point a camera at a screen (perfect video quality isn't a requirement here).

Microsoft documented some of what they do at https://www.microsoft.com/en-us/itshowcase/protecting-high-r....

Note this is for operations/privileged access in high risk environments vs. standard issue desktops. Saw a little bit of this up close a few years ago, seemed quite solid and well thought out.

Is posture assessment just asking the device to tell you it's OK? For the ignorant like me, it seems a motivated and capable adversary could have an insecure device report an OK posture.

Posture assessment is often built into modern VPN clients. The actual procedure varies by organization and can sometimes be updated by pushing new validation procedures to the client. It's unlikely to be as simple as "run this file on disk (which an attacker could trivially replace) and check the exit code."

My company has configuration management for company laptops install a client certificate. The "internal frontend" proxy checks for this client certificate in addition to AD credentials + Duo.
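The parent's client-certificate check can be sketched as follows, assuming a Python-based front-end proxy (real deployments would more likely do this in nginx or the load balancer itself); the file paths are placeholders:

```python
import ssl

# Sketch: a TLS context for a front-end proxy that requires a client
# certificate issued by the company CA, in addition to whatever SSO
# (AD credentials + Duo in the parent's setup) happens at the
# application layer. The MDM installs the client cert on managed
# laptops; unmanaged devices can't even complete the TLS handshake.
def make_server_context(ca_file="company-ca.pem",
                        cert_file="proxy.pem",
                        key_file="proxy-key.pem") -> ssl.SSLContext:
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(cert_file, key_file)
    # Only certs chaining to the company CA are accepted
    ctx.load_verify_locations(cafile=ca_file)
    # Reject any connection that presents no cert, or a cert not
    # signed by the company CA
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx
```

The key design choice is `CERT_REQUIRED`: the device check happens at the transport layer, before any application code (or application vulnerability) is reachable.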

(Author of the blog here)

This is a great question! I hoped this would get brought up, as it is very important. I decided against covering it in this post as the post was already fairly long, but the tl;dr is that I see two incremental ways to add authorization to this setup:

1. Cognito has something called "Adaptive Authentication" that will compute risk scores for each login based on IP, device info, etc. You can customize in the AWS console how risk-tolerant you want to be.

2. You can go the fully-managed approach, which is what we are implementing at Transcend now. The idea is that you'd use an MDM like Fleetsmith to install a TLS cert onto each managed device, and then validate that cert on each request in the auth portal. There are lots of cool ways (we use the Vanta agent) to verify that a user's device is "good" to authenticate with.

I'd like to write more about option 2, but I try to keep these blog posts as technology-agnostic as I can, and my experience right now is fairly limited to Vanta + Fleetsmith.

A lot of companies that care deeply about security are moving to this “trust no one” approach which has the added benefit for end users of allowing access to “secure internal sites” over the plain old internet. If done right this can all be a big boost for security and improved end user experience. That said, the old “you need to be on the VPN” approach is going to stick around for some time.

For sure, VPNs will always be used. I think it'll take a BeyondCorp SaaS company for this to really take off (or for it to become a more "managed" auth method from the big cloud providers).

At Transcend we are able to do it because we had an early focus on protecting our internal apps, but obviously it's a lot harder to migrate hundreds of services than to start out with a newer approach.

I loved not having to use a VPN back when I worked at Google though, and am glad to see that the open source world is starting to offer some tools to play around with.

We’re about halfway down this road and it’s hard to overstate how true this is with respect to the benefits for end-user experience.

We did it for the security, but if I’d have known the convenience benefits, I think we’d have started earlier.

I mean, yes, if you have billions to dedicate to building a leading-class security team. Not all organizations have that money, and not all organizations need to take that approach. Some do and some need to.

I read part of the article, but I'm confused. How is this different from making all our internal servers public and using Okta or Auth0 for sign-in?

I wouldn't do that because any of those servers could have a security vulnerability that we're not aware of, so I feel like this must protect against that somehow, but I'm just not fully understanding what it does.

Main difference is that all of these websites are public behind one big proxy (ALB) and not public on their own. The security concerns are centralised in one place, not 10.

That's not to say that the ALB can't have a bug or a misconfiguration that will render it wide open. But that's probably true for VPN as well.

And the point of this is that, while application security is still important, it at least makes all those vulnerabilities post-auth, which is a huge improvement.

The poor man's version of this is to put all your services behind an nginx reverse proxy with HTTP Basic auth (and TLS of course). For personal/small scale operations, this is a great way to almost completely eliminate your attack surface, if you have single-digit users and they can be trusted. Everyone running webapps personally should prefer this over, or in addition to, app-specific login systems.
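The "poor man's" gate described above can be sketched in a few lines. This is an illustrative toy in Python, not the nginx configuration the parent has in mind (in practice you'd use `auth_basic` with an htpasswd file); the username and password are placeholders:

```python
import base64
from http.server import BaseHTTPRequestHandler

# Placeholder credentials; a real setup would use hashed passwords
# (e.g. an htpasswd file), never a plaintext dict.
USERS = {"alice": "correct horse battery staple"}

def basic_auth_ok(header_value: str) -> bool:
    """Validate an HTTP Basic Authorization header (RFC 7617)."""
    if not header_value or not header_value.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(header_value[6:]).decode()
        user, _, password = decoded.partition(":")
    except Exception:
        return False
    return USERS.get(user) == password

class GateHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The auth check runs before anything is proxied upstream,
        # so app vulnerabilities sit behind this single gate.
        if not basic_auth_ok(self.headers.get("Authorization", "")):
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="internal"')
            self.end_headers()
            return
        # ...authenticated: forward the request to the app here...
        self.send_response(200)
        self.end_headers()
```

As the parent notes, this only makes sense over TLS and with a handful of trusted users, but it turns every pre-auth app vulnerability into a post-auth one.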

This code scares me: https://github.com/transcend-io/beyondcorp-cloudfront/blob/m...

This encourages copy-pasting authentication JavaScript from service to service.

The ALB approach from the article at least centralizes the SSO dance in one place, but still a typo in terraform would be very hard to detect.

The BeyondCorp approach Google uses, as far as I know, relies on sophisticated proxy servers in front of ALL protected services to ensure the very tricky aspects like posture assessments, zero day patching, logging, rate limiting and other security best practices are handled in one place.

With a scattershot approach, companies may not be open to a VPN exploit anymore, but may have opened themselves up to many more individual exploits and much slower reaction times when an exploit is found.

Perhaps I don't see what you see, but this is server-side javascript (cloudfront calling a 'lambda at edge' function - similar to Cloudflare Workers).

What's particularly scary about it?

As for "a typo in terraform would be very hard to detect" - perhaps, yes, assuming it didn't fail outright. To mitigate that, I'd expect anyone deploying this for real to protect anything valuable to write unit tests for the JavaScript and to have code reviews of any security-sensitive code like this.

I don't think individual development teams should be hosting this code in their own services. Instead, they should declaratively specify the rules on what roles can do what, and rely on another layer to honor those requirements.

Hello everyone! At Transcend, we've used BeyondCorp for all of our internal sites, as well as for our communication between services.

Please let us know if you have any questions about getting started :)

Any plans for articles covering the "device identity" and "device inventory database" components of BeyondCorp?

Slapping a proxy server in front of your websites to handle SSO/SAML is the easy part.

I'm curious to hear how you're handling the devices -- especially if your employees are remote.


How does this work with things that aren’t websites? Do you have to throw a proxy in front of, e.g., your database server that keeps track of which IPs have already authenticated over the web?

Historically at Google the exceptions fell into one of a few buckets:

* You used a modified client or client proxy (this was done for e.g. SSH)

* You used a remote-desktop protocol to remote into a machine with direct network access to the service

* The service got a wholesale exemption and was allowed through the firewall with ordinary IP ACLs

(descending order of impressiveness wrt the BeyondCorp philosophy and whitepaper)

Some of this is discussed in the "Non-HTTP Protocols" section of this paper: https://www.usenix.org/system/files/login/articles/login_win...

Why would you ever need option 2/3 when the IAP exists? Is there stuff that doesn’t work over a tunneled connection?

the IAP is an HTTP proxy, so you need a way to send non-HTTP traffic. this might require client modifications (not everything is proxy-aware), and you can't always modify the source.

some protocols are UDP and latency-sensitive, which doesn't work well enough when tunneled

You run a service behind the HTTP proxy, or another proxy with a more suitable protocol like SSH, which can speak the required protocols (or just blindly forward TCP) across production. You run a CLI tool that binds a local port and forwards to this service.

In some ways this is a poor man's VPN server, but it can be smarter: with protocol support, you can combine the identity of the connected developer with application-level data (e.g. this is an INSERT statement) to make AuthZ decisions.

(Author here)

At Transcend, we use a bastion host (like others have mentioned). The key difference in our setup, which I don't think has been covered, is that our bastion only makes outgoing connections and has no open ports to the world.

Using the AWS SSM managed service, we can create bastions that have no ingress at all.

I talk through some different approaches in a codelab over here: https://codelabs.transcend.io/codelabs/aws-ssh-ssm-rds/index...

Do companies really setup their networks and applications such that if you’re on the VPN you have access to almost everything? Man that sounds insane.

Household name companies routinely get owned because a pen tester finds a network drop in a conference room, or a smart thermostat that merely needs internet access becomes a beachhead. There's apparently a whole generation of IT that believes security is about what is and isn't allowed on "the corporate network."

In my tech career the office WiFi has never been more privileged than the coffee shop across the street, just faster.

We have a very similar setup with Cognito, GSuite and ALBs. For CLI/API access we also have API Gateway which allows to authenticate with JWT tokens that we can issue via Cognito.

It's not perfect:

* This setup only deals with authentication. There is no authorization at all, i.e. ANYONE with an org Gmail account can get in.

* There is no real SSO. Say you have an application behind this proxy that you need to log in to. The proxy will pass you through to the login page, where you need to log in (again) via whatever it is configured with. That's not to say it's impossible to solve, as you have enough info in the headers/cookies passed from the ALB to sort something out with a custom solution, but it takes time.

* You need to be very careful with your OAuth config. With GSuite, as example, you can very easily configure the OAuth client to authenticate ANY @gmail user instead of your @company...
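The second and third points can both be addressed in the application: after the OIDC dance, the ALB attaches the user's claims to each request in the `x-amzn-oidc-data` header (a JWT), which the app can inspect. A hedged sketch, with `company.com` as a placeholder domain; real code must first verify the JWT signature against the ALB's public key, a step elided here:

```python
import base64
import json

def claims_from_alb_header(oidc_data: str) -> dict:
    """Decode the payload of the x-amzn-oidc-data JWT the ALB sets.
    NOTE: signature verification against the ALB's regional public
    key is required in production and intentionally skipped here."""
    payload_b64 = oidc_data.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore b64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

def company_user(claims: dict, domain: str = "company.com") -> bool:
    """App-side authorization guard against the 'any @gmail user'
    misconfiguration described above."""
    return claims.get("email", "").endswith("@" + domain)
```

Checking the email domain (or Google's `hd` claim) in the app gives you a second line of defense even if the OAuth client config is wrong.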

This naming is needlessly confusing. BeyondCorp is a name invented by Google to describe its internal approach to security, and now it's a Google Cloud product. Why would Transcend reuse this name? It sounds as if Transcend is a middleman reselling a GCP product.

Apologies for the confusion! I picked BeyondCorp as there seems to be a number of references to it as a more general-cloud term than anything Google Cloud specific.

As some examples, https://www.beyondcorp.com/ is run by the IdP Okta, and ScaleFT (a zero trust SaaS company) references the philosophy of BeyondCorp as being used outside of Google: https://www.scaleft.com/blog/beyondcorp-outside-of-google/

Close but not close enough. Take an ALB, pair it with your SSO solution (AWS SSO, OneLogin, Okta, etc). Add Duo and make sure to validate the devices as well. You need half, if not less, of the infrastructure outlined in this article.

URL for your article?

I haven't implemented this extensively, but there are sometimes problems with setting this up in a corporate environment where there's a need to do things in a uniform way.

- Often if you use a central database such as Active Directory for your internal users, you may need to set up your OAuth endpoint (say, an AWS Cognito User Pool) and then have an AD admin allow that user pool to be a relying party. This means there is some lag time to setting this up, and that it isn't done with automation. So if you have an application you want to spin up a custom domain with a temporary user pool, test something, and then destroy it later, it's probably not going to work without some custom workarounds. No need for this with a traditional VPN.

- If you have 50 different apps, with 50 different URLs, you're going to need to do the above 50 times. Also have 50 staging and dev portals? Same deal. Now try managing changes to all of those at once. The manual steps, plus all the integration with all the product teams, now means this whole thing is becoming burdensome. This is a lot more work than just "use this VPN client and whitelist these CIDRs".

- If you're working for one of those weird companies that loves to do split-horizon DNS with lots of custom internal-only domains, guess what? Probably not gonna work without a VPN.

- Onboarding can be complicated. You now need to help your users manage their accounts, such as password resets, being a part of different domains, using a supported device, MFA registration, troubleshooting internet connections, using a supported CLI tool for non-web-interface APIs, etc. Versus just saying "are you connected to the VPN?".

- A VPN allows a central network team to manage access control across the entire network (or wherever that network team manages networks). BeyondCorp needs to be managed for all of your products by one team, or you may end up with an uneven and difficult process of supporting users across teams. A lot of companies (maybe most) are just not set up to allow independent product teams to manage internal user access. Even then, maybe authorization, but probably not both AuthN + AuthZ.

- If you have a single domain serving multiple apps in multiple URLs requiring different authorization, things can get more complex.

Ultimately I think most of the protocols used for BeyondCorp today have too many drawbacks to say we could all drop our VPNs in support of it. We'll probably need at least another generation of protocols and management workflows in order for it to become the new norm.

A genuine question on this topic: does that mean we should have SSH ports open to the world too, if we have some form of 2FA? Or am I misreading the point of BeyondCorp?

SSH through a Teleport/Boundary-style bastion can be viewed as BeyondCorp for developers.

(Author here)

I think the second most secure way to handle service authentication is to protect your ports with strong auth, like this article talks about.

But the most secure way to handle this is to just eliminate the open port entirely.

Here's another post of mine where I talk about making a bastion host with no open ports and no ingress on its security group: https://codelabs.transcend.io/codelabs/aws-ssh-ssm-rds/index...

Off topic, but if you want an open source option with great features like mTLS and fine-grained control, use Pomerium: https://www.pomerium.io/

BeyondCorp sounds so weird, why not use Zero Trust which is the industry (non Google) term?

Zero Trust is about the conditions inside the perimeter; BeyondCorp is about ingress through the perimeter. Things may be wide open on the interior side of the proxy. Or things may be locked down tight even inside the VPN.

The point of these concepts is that there is no perimeter, I thought?

I would think of BeyondCorp as end users accessing services through an application-layer perimeter from a public network, instead of directly from a private network.

The network where the services actually sit becomes, in effect, even more private.

All good except imo requiring a tap per hour is way overboard

or HTTP Basic auth!
