
Secure Access to 100 AWS Accounts - r4um
https://segment.com/blog/secure-access-to-100-aws-accounts/
======
zimbatm
It's good to see an article talking about this. A lot of organizations could
benefit from using more than one account to enforce security (IAM is hard!)
and separation of concerns.

It doesn't really explain it, but to do this the root account has to be
enrolled in AWS Organizations[1]. This is what is used to handle all the
accounts and consolidate the billing. It also allows you to create rules that
span all the accounts. Recently Terraform gained support for the Organizations
API[2], so it's possible to control the account list in a declarative manner.

The biggest issue is that now that there are a lot of accounts, the developers
need a way to switch between them. Using the IAM assume-role mechanism is a
good way to avoid needing a lot of AWS keys per developer.

I don't know if I agree with using Okta, as it adds another party that now has
access to AWS. Security-wise, I don't see the difference between having an AWS
secret or an Okta secret in the keychain. Okta might provide audit logging
facilities, but so does AWS.

In either case you will need to generate a `~/.aws/config` per developer.
There is also a Chrome plugin[3] that can read this file format and populate
the AWS Console role switcher. I don't know yet whether the extension
publisher is reputable, and the extension is granted a lot of access.
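For reference, a per-developer `~/.aws/config` for the assume-role setup might look something like this (account IDs and role names are placeholders):

```ini
[profile dev]
role_arn = arn:aws:iam::111111111111:role/developer
source_profile = default
mfa_serial = arn:aws:iam::999999999999:mfa/alice

[profile prod]
role_arn = arn:aws:iam::222222222222:role/read-only
source_profile = default
```

With this in place, `aws --profile dev ...` transparently calls sts:AssumeRole using the `default` profile's credentials.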

[1]: [https://aws.amazon.com/organizations/](https://aws.amazon.com/organizations/)

[2]: [https://github.com/terraform-providers/terraform-provider-aws/pull/903](https://github.com/terraform-providers/terraform-provider-aws/pull/903)

[3]: [https://chrome.google.com/webstore/detail/aws-extend-switch-roles/jpmkfafbacpgapdghgdpembnojdlgkdl](https://chrome.google.com/webstore/detail/aws-extend-switch-roles/jpmkfafbacpgapdghgdpembnojdlgkdl)

~~~
toomuchtodo
It should be noted that Orgs should be set up when new AWS accounts are
created, as it can be tricky to do once you have systems in production.
I would even go so far as to say that if you want to commit to Orgs, create a
new AWS account hierarchy and move your applications and infrastructure into it.

EDIT:
[https://docs.aws.amazon.com/organizations/latest/userguide/o...](https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_policies_scp.html)

"Warning: We strongly recommend that you do not attach SCPs to the root of
your organization without thoroughly testing the impact that the policy has on
accounts. Instead, create an OU that you can move your accounts into one at a
time, or at least in small numbers, to ensure that you don't inadvertently
lock users out of key services."

~~~
zimbatm
Agreed. One strategy that I have been using quite successfully is to:

* create a new AWS account

* set up IAM and AWS Org

* add a second factor on the root account, and put the Gemalto token in a safe

Then add the "legacy" account to the org and slowly port all the resources to
fresh new accounts.

------
7ewis
We created something similar in house.

Users authenticate to an internal website with ADFS (including MFA) and are
then presented with a list of roles where they can either click through to the
website assuming a role in that account for an hour, or click an option to
access temporary credentials.

The AWS roles are deployed from our CI/CD pipeline to all of our AWS accounts,
so we don't need user accounts anywhere and can still deploy features from our
pipeline without logging in.

We are also in the process of setting up automated account provisioning from
our HR system, Workday. Based on the user's team and job title, they'll be
added to an Active Directory group, which then gives them access to the
resources required for their role.

Once complete, this will save the support team a lot of time!

~~~
bringtheaction
That’s neat. Would your company consider open-sourcing it? If I understand
correctly, you made it for your own use, not to sell, anyway.

~~~
7ewis
It has actually been considered.

I'll bring it up again. Personally I don't see why not!

------
ejcx
Thanks for posting this here. I'm the author of this blog post. Feel free to
ask me any questions you might have.

~~~
scrollaway
Tangential: why Okta over some other provider? GSuite can be used as a SAML
provider, can't it?

~~~
marcc
GSuite can be an IdP for SAML but it’s sometimes limited. AFAIK you can’t use
it to connect to AWS using SAML because AWS makes some assumptions about
attributes in the identity. I may be wrong here, but I tried to do this a few
months ago (with Duo and Yubikeys added in) and was not able to get it
working. I ended up using Okta and it was simple.

~~~
wgjordan
About a year ago, I migrated my team's AWS access from manually-provisioned
AWS keys to GSuite authentication, using a combination of SAML and OpenID
Connect (OIDC):

- Detailed instructions for Google-federated login to the AWS Management
Console through SAML are available [1], and worked more or less as described.

- CLI/API access is a bit trickier and less thoroughly documented, but is
possible using the 'Web Identity Federation' feature [2]. Basically, you
generate/refresh temporary AWS credentials by passing a Google OIDC token to
the AssumeRoleWithWebIdentity API. The tricky part is keeping both your Google
OIDC and AWS STS tokens conveniently refreshed. I wrote some open-source glue
code for this that hooks into the AWS Ruby SDK and CLI [3]. It's still a
little rough around the edges and not yet extracted into a standalone project,
but it's been working well enough for my team over the last year.

[1] [https://aws.amazon.com/blogs/security/how-to-set-up-federated-single-sign-on-to-aws-using-google-apps/](https://aws.amazon.com/blogs/security/how-to-set-up-federated-single-sign-on-to-aws-using-google-apps/)

[2] [https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_oidc.html](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_oidc.html)

[3] [https://github.com/code-dot-org/code-dot-org/blob/staging/lib/cdo/aws/google_credentials.rb](https://github.com/code-dot-org/code-dot-org/blob/staging/lib/cdo/aws/google_credentials.rb)
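The AssumeRoleWithWebIdentity half of that flow can be sketched roughly like this in Python with boto3 (the role ARN is a placeholder, and obtaining/refreshing the Google `id_token` is left out; the real glue code in [3] handles that part too):

```python
def build_assume_role_request(role_arn, id_token, session_name="google-federated"):
    """Parameters for sts:AssumeRoleWithWebIdentity using a Google OIDC token."""
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "WebIdentityToken": id_token,
        "DurationSeconds": 3600,  # the one-hour ceiling mentioned elsewhere in this thread
    }

def temporary_credentials(role_arn, id_token):
    """Exchange a Google OIDC id_token for short-lived AWS credentials."""
    import boto3  # imported lazily so the helper above stays dependency-free
    sts = boto3.client("sts")
    # AssumeRoleWithWebIdentity is an unsigned call: no long-lived AWS keys needed.
    resp = sts.assume_role_with_web_identity(**build_assume_role_request(role_arn, id_token))
    return resp["Credentials"]  # AccessKeyId, SecretAccessKey, SessionToken, Expiration
```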

------
stiveridibla
Excellent article thanks for sharing. It's awesome to see someone documenting
their experiences with multi-account strategy. You said you manage account
with Terraform, are there any other resource types that you don't give the dev
teams access to such as networking? Also, how are you deploying Terraform at
scale across all of your accounts?

------
ah-
How are you dealing with software that expects to get AWS tokens, especially
with STS AssumeRole sessions only lasting up to one hour?

For example, if you want to run the Athena JDBC driver locally or read
something from S3 in a long-running (>1h) Python script.

~~~
stiveridibla
I can't answer for Segment, but our AWS account strategy is very similar, so
we've found that there is still a need for a centralized KMS solution (system-
to-system auth). Our teams are working on a project to wrap more tooling
around Vault and make it a centralized KMS. T-Mobile has done some work on
this... [https://opensource.t-mobile.com/blog/posts/introducing-
tvaul...](https://opensource.t-mobile.com/blog/posts/introducing-tvault/)

------
posnet
Atlassian open sourced a tool to help with CLI access when using many AWS
accounts through federated roles.

[https://developer.atlassian.com/blog/2017/12/introducing-cloudtoken/](https://developer.atlassian.com/blog/2017/12/introducing-cloudtoken/)

------
cyberferret
I am interested in the thought processes behind setting up separate AWS
accounts "for GDPR compliance" as opposed to having one AWS account and
running multi regions (via VPC etc.) under that account?

Is it because teams in different locations (US, EU, etc.) can run
semi-independently? How do you manage shared resources (e.g. S3 or RDS SQL
servers) that need to be accessed from multiple regions, yet still maintain
GDPR compliance?

~~~
ejcx
AWS has a lot of resources that are not per-region.

The GDPR line is about improving our least-privilege access story. There's a
lot more to GDPR than that, though.

------
jbergknoff
What are best practices for sharing build artifacts (e.g. images in ECR, files
in S3) among AWS accounts? Cross-account IAM policies, explicit promotion
process to move artifacts between accounts, or something else?

~~~
ejcx
We use ECR and have all of our images in our hub account, and make them
readable from the prod/stage/dev accounts using a cross-account policy.

For S3, it's more complicated because of object permissions and all the
insanity that comes with cross-account writing, etc.

Edit for more info: no explicit promotion process for containers. Engineers
build images in our CI pipeline when their pull request is approved and merged
to master, and the image is pushed to the hub account.
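A cross-account ECR pull policy along those lines might look roughly like this, attached as a repository policy in the hub account (account IDs are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPullFromOtherAccounts",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::111111111111:root",
          "arn:aws:iam::222222222222:root"
        ]
      },
      "Action": [
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability"
      ]
    }
  ]
}
```

Only the pull-side actions are listed, so the other accounts can read images but not push them.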

------
bsaraogi
How would this compare to LDAP? An LDAP with org-wide access control can
pretty much enable all of this (keeping in mind GDPR and other security
concerns) from just one AWS account.

------
lis
Out of curiosity, how do people handle communication between services that are
in different accounts? VPC peering to connect all accounts? Public Internet
and OAuth? Another method?

~~~
yeukhon
You need to define “between services”. You mentioned OAuth and VPC peering;
one is authentication, the other is network accessibility. There is no single
solution.

Generally you have: VPC peering between accounts; Network Access Control Lists
(NACLs) for VPC port control; security groups between instances (and some AWS
services use SGs to limit port access); IAM roles to authenticate and
authorize certain AWS services to do things (e.g. Lambda reading an S3
bucket), with IAM and resource policies (bucket policies, SQS policies)
governing authentication and authorization. Finally, there is also the
Organizations service, which lets you control which AWS services are allowed
in a group of AWS accounts.

Sorry, on mobile so I can’t make a prettier list. To be really honest, I am
generally disappointed at the complexity of the authentication and
authorization mechanisms that exist for AWS services.

~~~
lis
Oh, sorry, absolutely. I'm actually talking about services that talk to each
other using HTTP. One option I've thought of is using the public internet to
connect them and using OAuth to make sure that they are actually allowed to
communicate, solving the security issue at the application level.

Another option would be to allow access only from the internal network, but
then you have to connect them somehow.

~~~
yeukhon
This is a good question. I will simplify this to two rules:

* internal traffic goes through the VPC, the AWS backbone, and Direct Connect / VPN (company and AWS accessing each other). Worth noting that all S3 requests used to go through the Internet, but now you can enable an S3 endpoint so requests originating from the VPC are made within the AWS backbone.

* incoming public traffic comes through AWS public infrastructure (e.g. a load balancer) before being handed off to EC2 instances (the instances can be in either a "public" or a "private" subnet).
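The S3 endpoint in the first rule can be sketched in Terraform roughly like this (the IDs and region are placeholders):

```hcl
# Gateway endpoint so S3 traffic from the VPC stays on the AWS backbone
# instead of going out over the public Internet.
resource "aws_vpc_endpoint" "s3" {
  vpc_id          = "vpc-0123456789abcdef0"
  service_name    = "com.amazonaws.us-east-1.s3"
  route_table_ids = ["rtb-0123456789abcdef0"]
}
```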

There are a whole lot of architectural approaches, highly dependent on the
requirements, and I don't think we can discuss them all here.

For one microservice to talk to another, if both are within the same VPC, you
just use security groups (a Network ACL "defends" the subnet, but you are
better off just using route tables). If they are not within the same VPC, you
can peer. If they are not within the same account, you can also peer. I
believe you can now peer across regions too. In some cases, you have to route
traffic from VPC1 to VPC2 through your corporate routing...
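As a rough Terraform sketch of the cross-account peering plus security-group combination described above (all IDs and CIDRs are placeholders, and the accepter account has to approve the connection separately):

```hcl
# Requester side: peer our VPC with a VPC owned by another account.
resource "aws_vpc_peering_connection" "to_other_account" {
  vpc_id        = "vpc-0aaaaaaaaaaaaaaaa"
  peer_vpc_id   = "vpc-0bbbbbbbbbbbbbbbb"
  peer_owner_id = "222222222222"
}

# Security group rule allowing HTTP from the peered service's CIDR.
resource "aws_security_group_rule" "allow_peer_http" {
  type              = "ingress"
  from_port         = 80
  to_port           = 80
  protocol          = "tcp"
  cidr_blocks       = ["10.1.0.0/16"]
  security_group_id = "sg-0ccccccccccccccccc"
}
```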

------
foodbaby
Temporary security credentials from AssumeRole are valid for up to 3600 secs
(1 hour). Given that, how do folks handle long running jobs/sessions?

------
appdrag
Hey Segment, are you aware that you can have several environments (dev,
staging, prod, ...) in the same AWS account? :p You can secure each
environment with different credentials (IAM), so there's no need to create
several AWS accounts!

~~~
rwitoff
In practice, cramming all this into the same account doesn't work. Segment is
following best practice here.

For example, IAM doesn't provide the granularity in resources and conditions
that you'd want in order to effectively limit the blast radius of developer
keys. ec2:TerminateInstances didn't (doesn't?) support VPC-level conditions,
so being able to terminate one instance meant you could terminate all
instances.

Similarly, you might want your engineering team to iam:PutUserPolicy in
development, but have a much more restricted group in production which isn't
possible with IAM today.

I've taken this pretty far in the past to attempt segmenting within one
account, but always run into limits: [https://github.com/witoff/self-service-iam](https://github.com/witoff/self-service-iam)

~~~
dastbe
The other bit would be blast radius. What if someone does get access to your
single account? How confident are you that your policies were airtight? By
using many accounts, you create clear isolation boundaries that require opt-in
sharing.

~~~
user5994461
>>> By using many accounts, you create clear isolation boundaries that require
opt-in sharing.

In theory, yes. In practice, you will achieve the opposite.

Developers and ops will have to juggle ten keys and accounts to get anything
done. The keys will end up saved and scattered all over the systems. It will
be impossible to audit access across all the accounts.

~~~
ejcx
OP here. I don't think you read the blog post! Our entire engineering org has
a grand total of zero AWS keys!

Per-account isolation is great for security, and especially for reliability if
you run into constant rate-limit issues like we do.

