
EC2's most dangerous feature - dwaxe
http://www.daemonology.net/blog/2016-10-09-EC2s-most-dangerous-feature.html
======
hueving
The blog post buries the lead a little bit because it's talking about lots of
pain points with the ec2 API and IAM. The important point to take away is that
any process with network access running on your instance can contact the EC2
metadata service at http://169.254.169.254 and get the instance-specific IAM
credentials.
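
Concretely, something like this is all it takes from any shell on the
instance (these are the standard metadata paths; ROLE_NAME is whatever the
first request returns):

    # discover the name of the IAM role attached to the instance
    curl http://169.254.169.254/latest/meta-data/iam/security-credentials/

    # fetch that role's temporary keys (AccessKeyId, SecretAccessKey, Token)
    curl http://169.254.169.254/latest/meta-data/iam/security-credentials/ROLE_NAME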

Think about things like services that accept user-submitted URLs, crawl them,
and display results...

~~~
cddotdotslash
This is actually a vulnerability I've seen countless times. If a site accepts
a URL which it reads and returns to the user, submit a 169.254.169.254
metadata service URL. About 1 out of 5 times I've tried it, I'm able to get a
response.

~~~
stevekemp
I think that's another example of bad filtering, the same thing that happens
if you accept a URL and don't disallow `file:///etc/passwd`.

------
cesarb
What I've done for a previous company was to, as one of the very first things
done within every EC2 instance, add an iptables owner match rule to only allow
packets destined to 169.254.169.254 if they come from uid 0. Any information
from that webservice that non-root users might need (for instance, the EC2
instance ID) is fetched on boot by a script running as root and left somewhere
in the filesystem.
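
A minimal sketch of that boot-time fetch (filenames illustrative):

    #!/bin/sh
    # run as root at boot, before non-root access to 169.254.169.254 is cut off
    curl -s http://169.254.169.254/latest/meta-data/instance-id \
        > /run/ec2-instance-id
    chmod 644 /run/ec2-instance-id   # world-readable copy for non-root users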

~~~
falcolas
This won't help with IAM roles, since the credentials provided in the metadata
expire. Of course, a small tweak to the iptables entry would help there as
well.

Mind posting your entry for us iptables-impaired folks?

~~~
cesarb
I don't work there anymore, so I don't have access to the exact rule I used,
but IIRC it was something like

    iptables -t filter -I OUTPUT -d 169.254.169.254 \
        -m owner ! --uid-owner 0 \
        -j REJECT --reject-with icmp-admin-prohibited
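
The small tweak falcolas mentions would presumably be an extra ACCEPT rule
for whichever uid legitimately needs the credentials, something like this
(uid 999 is illustrative):

    # let a dedicated service account reach the metadata service
    iptables -t filter -A OUTPUT -d 169.254.169.254 \
        -m owner --uid-owner 999 -j ACCEPT
    # then reject everyone else except root (appended after, so ACCEPT wins)
    iptables -t filter -A OUTPUT -d 169.254.169.254 \
        -m owner ! --uid-owner 0 \
        -j REJECT --reject-with icmp-admin-prohibited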

------
skywhopper
Hopefully the operators using EC2 instance profiles understand and weigh the
risks of using that feature. It's good to be cautious, but the feature is only
dangerous if you don't take the time to understand it. Running a server on the
Internet at all is "dangerous" in the same sense. And for this particular
risk, it turns out there's a simple fix.

He _is_ right in his first criticism that the IAM access controls available
for much of the AWS API are entirely inadequate. In the case of EC2 in
particular, it's all or nothing--either your credentials can call the
TerminateInstances API or they can't. I'm sure Amazon is working on improving
things, but for now it's pretty terrible. But in practice it just means you
have to take care in different ways than you would if his tag-based authz
solution were implemented.

That said, while it's certainly frustrating to an implementor, it's not
"dangerous" that limitations exist in these APIs. We're talking about decade-
old APIs from the earliest days of AWS, and while things have been added, the
core APIs are still the same. That's an amazing success story. But like any
piece of software, there are issues that experienced users learn how to work
around.

You can bet that the EC2 API code is hard and scary to deal with for its
maintainers. Adding a huge new permissions scheme is likely nearly impossible
without a total rewrite... I don't envy them their task.

~~~
jamiesonbecker
It's _impossible_ to limit access to any part of the instance metadata
without firewalling (which has its own issues), or even to expire access to
any part of it. Since instance profiles have keys (even though automatically
rotated), _any_ process on the system, owned by _any_ user, can access
anything exposed via the instance role. This makes embedding IAM keys into
your instance and protecting them with root-only permissions or ACLs MUCH
MUCH safer... but AWS specifically states that instance profiles are
preferred. In fact, for our Userify AWS instances (ssh key management), we
are _required_ to use instance roles and not allowed to offer the option.
(This is why we do not offer S3 bucket storage on our AWS instances but we do
on Pro and Enterprise self-hosted.)

The biggest issue with IAM instance profiles is that they trade security for
convenience... and it's not a good trade.

~~~
subway
For the most part EC2 instances should be single-purpose. Use tiny instances
that do _one job_. Your IAM role describes the permissions that should be
granted to that _one job_. It's absolutely true that you cannot isolate
permissions at the process level, but by using single-job-type instances, you
can easily isolate permissions on a per-job (in this model, per-instance)
basis.

~~~
paulfurtado
What? Why should EC2 instances be single-purpose? Amazon offers a wide variety
of massive instance sizes with 160+ GB of RAM and 30+ cores. It's extremely
common to run software like Mesos, Kubernetes, Docker, etc. on these.
Dedicating an instance per app is extremely cost-ineffective.

~~~
subway
EC2 instances should be single-purpose (Or, if you want to mux containers onto
the instance and retain per-container/job IAM role isolation, use ECS) if
you're developing for AWS as a platform. I'm a huge fan of k8s, and have
respect for Mesos, but these are largely alternatives to the model provided by
EC2/ECS/IAM.

In a perfect world, any service would cleanly interoperate with any other
service. Unfortunately we don't live in a perfect world. If you want to take
advantage of `advanced` features in a given platform, you have to understand
the drawbacks and limitations of those features, and what it means when they
aren't available on another platform.

To me, the greatest tragedy in the way EC2 operates is that it
looks/tastes/smells like a `server`, but it's far more akin to a process.

~~~
jamiesonbecker
Well.. an EC2 instance running Linux is not a process or even a container,
even if functionally it's easier to treat it like one.

It is a full virtual server with its own Linux kernel and operating system: it
has to be updated, secured, and maintained just like any other Linux server.
Most Linux distributions on an EC2 instance have dozens of processes already
running out of the box.

I understand your point -- that ideally a single instance can be treated as a
single functional point from the point of view of the application, and I
agree, but not from a point of view of security. As you know, in any larger
environment, there are likely many additional support applications running on
that server: things like app server monitoring, file integrity, logging,
management, security checks, remote data access or local databases, etc. Those
must not all be treated with the same levels of security and access. (i.e.,
why would rsyslog or systemd need access to all objects in our S3 bucket or be
able to delete instances or any of the other rights that might legitimately be
granted to an instance via an IAM instance role?)

To treat security for all of these processes as if they're all part of the
same app tosses out decades of operating system development and security
principles and places your single function app, as well as that of your entire
environment, at grave risk. I.e., there's a reason why a typical Linux
distribution has about 50 accounts right out of the box and everything doesn't
just run as root.

If you are developing or deploying microservices or containers and don't want
to be burdened by the security requirements, then there are alternatives at
AWS like ECS and Lambda that you should seriously consider.

~~~
subway
Amusing examples, systemd/rsyslog, as both at least briefly execute as root,
with rsyslog being relied upon to willingly drop its own privs (Not to mention
being slowly replaced by systemd-journald, which runs as root), and systemd
always running as root (ya know, since it's init, and all).

It really sounds like we have vastly different ideas about what kinds of
processes belong in an EC2 instance, as well as the ideal life-cycle of an EC2
instance. I tend to adopt a strategy of relatively short-lived EC2 instances
that get killed and replaced frequently. Persistence that depends on a single
instance surviving is avoided at all costs, in favor of persistence
distributed across a number of instances (or punted out to Dynamo/S3/RDS).

You're absolutely right that there is a reason why the typical Linux distro
has 50 accounts out of the box -- it was built with traditional multi-user
system security models in mind. I sure as hell appreciate it on workstations
and traditional stateful hosts. That said, eschewing the traditional security
model in favor of an alternative model does not make your environment
inherently more or less safe -- there are going to be pros and cons to both
approaches (in terms of both security and functionality).

------
_hyn3
tl;dr:

1) IAM instance roles have _no_ security mechanisms to protect them from being
read by any process on the instance, thus completely removing them from the
reach of all Linux/UNIX/Windows permission systems. (The real reason for this
is that instance metadata was a convenient semi-public information store for
things like instance ID, but it was extended to also provide secret material,
which was, at best, an idiotic move.) As the author points out, Xen already
provided a great filesystem alternative that could be mounted as another drive
(or network drive) and managed with the regular OS filesystem permission
system (reading an instance ID is just a matter of reading a "file")... for
some reason, AWS didn't leverage this and instead just added the secret
material to its local instance metadata webserver.

2) the API calls are not fine-grained enough and/or there are big holes in
their coverage -- so, for instance, if you want to use some other AWS
services, you can end up exposing much more than you intended.

------
0x0
This is interesting! Can this be abused with AWS-hosted services that reach
out to fetch URLs? For example, image hosts that allow specifying a URL to
retrieve, or OAuth callbacks, etc.? Are there any tricks to be played if
someone were to register a random domain and point it to 169.254.169.254 (or
worse, flux between 169.254.169.254 and a public IP, in case application code
with a blacklist first resolves the hostname to check it but then passes the
whole URL into a library that resolves it again)?

~~~
pfg
That's a fairly common vulnerability. A good approach for services that need
to fetch arbitrary URLs is roughly:

1. Resolve hostname and remember the response

2. Verify that the response does not contain any addresses in a private IP
space, or any other IP that is only accessible to you

3. Use the IP from step 1 when establishing a connection

With other solutions, you might end up being vulnerable to DNS rebinding
attacks.

Bonus points for doing all your URL fetching in some sort of sandbox that
enforces these access rules.
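
A rough sketch of steps 1-3 in shell, using curl's --resolve to pin the
connection to the already-checked address (the private-range list here is
illustrative and deliberately incomplete):

    #!/bin/sh
    host="$1"; port=80
    # step 1: resolve once and remember the answer
    ip=$(getent ahostsv4 "$host" | awk 'NR==1 {print $1}')
    [ -n "$ip" ] || { echo "could not resolve $host" >&2; exit 1; }
    # step 2: refuse private and link-local ranges (incomplete list!)
    case "$ip" in
      10.*|127.*|169.254.*|192.168.*|172.1[6-9].*|172.2[0-9].*|172.3[01].*)
        echo "refusing $host -> $ip" >&2; exit 1 ;;
    esac
    # step 3: pin the vetted IP so a second DNS lookup can't rebind the name
    curl --resolve "$host:$port:$ip" "http://$host/"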

~~~
jamiesonbecker
This is accurate.

Remember that even ELBs in AWS have IPs that change all the time, and this
itself is actually a source of vulnerabilities in apps that don't respect
DNS TTLs (as has been seen in the forums repeatedly -- apps stay connected to
the previous IP instead of the new one). It's probably safer to retrieve and
verify the IP for each request, and just cache whether the IP is 'safe'. (And
just doing IP subnet calculations is non-trivial in most less-common
languages.)

Also, request throttling and HTTP verb checking should be enforced, to
prevent being turned into a proxy for other attacks.

Actually, _any_ decision to accept an arbitrary URL should be carefully
examined in light of how hard it is to do safely.

~~~
the_arun
What if we just check whether target host is 169.254.169.254 before allowing
an HTTP call?

~~~
BraveNewCurency
As long as you manually get the IP for every domain. I.e., if they ask for
"blah.com" you have to get the IP, check it, then turn the request into "curl
-H 'Host: blah.com' http://IP". (Otherwise, there's a race condition that
allows the DNS server to resolve to a different IP address the 2nd time. See
https://en.wikipedia.org/wiki/Time_of_check_to_time_of_use )
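
Concretely, with 203.0.113.7 standing in for the address you already vetted:

    # resolve and vet blah.com's IP yourself, then pin it for the request
    curl -H 'Host: blah.com' http://203.0.113.7/some/path

(curl's --resolve option achieves the same pinning without rewriting the URL,
which matters once redirects or TLS are involved.)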

------
gtsteve
This is an interesting attack that I must confess I hadn't thought of, but
surely any service that accepts an arbitrary URL should keep a list of IP
ranges to avoid. However, to harden a role in the event of instance role
credentials leaking, you could use an IAM Condition [0].

There is actually an example of this in the IAM documentation [1], although
the source VPC parameter doesn't work for all services, and I can't see a list
of services that support this parameter. This would ensure that the requests
actually came from instances within your VPC.

[0] http://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements.html#Condition

[1] http://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies-vpc-endpoint.html#example-bucket-policies-restrict-access-vpc
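
The documented example [1] roughly boils down to a bucket policy like this
(bucket name and VPC ID are placeholders):

    {
      "Version": "2012-10-17",
      "Statement": [{
        "Sid": "DenyRequestsFromOutsideTheVPC",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": ["arn:aws:s3:::example-bucket",
                     "arn:aws:s3:::example-bucket/*"],
        "Condition": {
          "StringNotEquals": {"aws:sourceVpc": "vpc-111bbb22"}
        }
      }]
    }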

~~~
jamiesonbecker
The point is not requests that originate elsewhere. The point is that this
system is not protected in any way from any other process on your system.

------
rcaught
> almost as trivial for EC2 instances to expose XenStore as a filesystem to
> which standard UNIX permissions could be applied, providing IAM Role
> credentials with the full range of access control functionality which UNIX
> affords to files stored on disk.

Doesn't this become more complicated when you think about EC2 offering Windows
instances? Even with straightforward UNIX file semantics: what writes this?
Where does it write it? Which user has read permissions?

~~~
jamiesonbecker
In UNIX, the same way that EBS volumes are mounted... think of the /proc or
/sys virtual filesystems.

In Windows, I'm guessing that this would be exposed as a network drive.

~~~
rcaught
My point is that these types of solutions create a lot more overhead,
inconsistency, and variation compared to an HTTP request; granted, with less
security.

~~~
eeeeeeeeeeeee
I'm sure that's why they went with HTTP -- it is universal and will work the
same way everywhere.

I still think they should just disable it by default, so you have to "opt-in"
to the potential security risk and plan accordingly.

------
Rapzid
I've used firewall rules in the past to scope the metadata store to admin
users.

~~~
cperciva
That's better than nothing, but runs into problems if you have different users
who need to be able to access different subsets of the metadata store.

~~~
cesarb
An admin user could fetch these subsets of the metadata and leave a copy of
them in the local filesystem.

~~~
cperciva
That doesn't work for IAM Roles, because (a) AWS library code expects to get
the keys out of the metadata store, and (b) IAM Role credentials are
periodically rotated, so the keys you downloaded in advance would expire.

~~~
jcrites
It's true that instance profile roles today supply credentials to the entire
server. One benefit of virtualization is that it's reasonable to run small,
single-purpose VMs. However if you do wish to restrict role credentials to
certain processes, there are ways of doing it, such as using EC2 Container
Service with task-level IAM Roles [1]:

> Credential Isolation: A container can only retrieve credentials for the IAM
> role that is defined in the task definition to which it belongs; a container
> never has access to credentials that are intended for another container that
> belongs to another task.
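
In a task definition, that per-task role is just the taskRoleArn field, e.g.
(names and ARN illustrative):

    {
      "family": "example-task",
      "taskRoleArn": "arn:aws:iam::123456789012:role/example-task-role",
      "containerDefinitions": [
        {"name": "app", "image": "example/app", "memory": 128}
      ]
    }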

If you do firewall the instance metadata service and want to get credentials
into individual processes, then you could do that using one of the credential
providers in the AWS SDK. I haven't worked with every language SDK, but
service clients in the SDK for Java take an AWSCredentialsProvider as input,
and you can pick from a number of standard implementations [2] or define a
custom one.

> An admin user could fetch these subsets of the metadata and leave a copy of
> them in the local filesystem.

So if you wanted to take this approach, an admin agent could periodically copy
the role credentials as property files into the home directories of users that
need them, and then applications could load them by configuring the SDK with
ProfileCredentialsProvider (which can refresh credentials periodically). The
admin agent could perhaps be a shell script run by cron that `curl`s from the
instance metadata service and writes the output to designated files.
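
A hedged sketch of such an agent (role name, user, and paths are all
illustrative):

    #!/bin/sh
    # cron job, run as root every few minutes: copy the instance role's
    # rotating credentials into a profile file that one user can read
    ROLE=my-instance-role
    CREDS=$(curl -s "http://169.254.169.254/latest/meta-data/iam/security-credentials/$ROLE")
    mkdir -p /home/appuser/.aws
    cat > /home/appuser/.aws/credentials <<EOF
    [default]
    aws_access_key_id = $(echo "$CREDS" | jq -r .AccessKeyId)
    aws_secret_access_key = $(echo "$CREDS" | jq -r .SecretAccessKey)
    aws_session_token = $(echo "$CREDS" | jq -r .Token)
    EOF
    chown appuser:appuser /home/appuser/.aws/credentials
    chmod 600 /home/appuser/.aws/credentials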

[1] http://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-iam-roles.html

[2] See ProfileCredentialsProvider and DefaultAWSCredentialsProviderChain

~~~
cperciva
_One benefit of virtualization is that it's reasonable to run small, single-
purpose VMs._

Single-purpose doesn't mean single-user. Lots of services divide their code
into "privileged" and "unprivileged" components in order to reduce the impact
of a vulnerability in the code which does not require privileges. As far as
I'm aware, there's no way to have an sshd process which is divided between two
EC2 Containers...

------
patsplat
EC2 instances are designed to enforce isolation between instances, not
processes. Presumably there would only be one primary service running on each.

Using AWS, you are pushed towards an architecture based on containers and
services. AWS is the OS, not any individual machine.

------
jamiesonbecker
Until AWS fixes this (which, as the article points out, may never happen), a
file chmod'ed to 600 (readable only by its owner, e.g. root) is actually much
safer, even when STS auto-rotation is taken into account.
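
For instance (path illustrative):

    # static credentials file readable only by root
    chown root:root /etc/myapp/aws-credentials
    chmod 600 /etc/myapp/aws-credentials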

------
djb_hackernews
If users can issue arbitrary commands on an instance, then that instance
should have no IAM roles and should delegate actions to services running on
separate instances.

The instances hosting our users go a step further and null-route metadata
service requests via iptables.

~~~
jeremyjh
It isn't just about users, it's also about malicious software you may
accidentally install -- if, for example, a library you use is compromised, as
has happened before with Ruby gems.

------
tex0
It's much the same problem with Google Cloud. Even worse there, I'd say.

~~~
boulos
I just double checked, and the most similar thing we expose is the tokens for
each service account in the instance metadata. As pointed out in the article,
any uid on the box can read those. But you can create instances with a zero-
permission service account (the equivalent of nobody?) and just avoid it.

This does mean that everywhere else you'd have to have explicit service
accounts and such, but that seems like a reasonable "workaround" until or
unless we make metadata access more granular (I like the block device idea!
Would you want entirely different paths for JSON versus "plain" though?)
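
e.g., something like this (flag names from memory; check `gcloud compute
instances create --help`):

    # create an instance with no service account attached at all
    gcloud compute instances create example-vm \
        --no-service-account --no-scopes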

~~~
lobster_johnson
Google Cloud does seem better here. The exception is GKE -- Kubernetes nodes
are associated with service accounts whose permissions, if abused by a
malicious Docker container, could be disastrous for your entire cluster.

Considering the amount of unpatched Docker containers out there, that's a bit
scary. It also effectively prevents GKE from being usable in any scenario
where you want to schedule containers on behalf of third-party actors (think
PaaS). (GKE also doesn't let you disable privileged Docker containers, but
that's another story.)

On AWS you can run a metadata proxy to prevent pods from getting the
credentials, but I don't know of a clean way to accomplish the same thing on
GKE.

------
yandie
If you're sharing the same instance among multiple users, trying to achieve
security between the users is almost impossible anyway. That's why physical
separation/virtualization is one of the first things to focus on when talking
about security.

~~~
jamiesonbecker
Isolation is definitely important, but not all parts of a system running a
single function need the same levels of access, and in fact it may be
possible to target those components separately. Take a look at the Wikipedia
articles for 'defense in depth' or 'privilege separation' to see how
important it is to keep each component _inside_ a system isolated to itself
as much as possible. (This is also why you don't want to rely on only a
perimeter firewall for access control.)

------
icedchai
IAM instance roles are still an improvement over how it was typically done in
the past: hard-coding the same key in a configuration file and deploying it
everywhere.

It's a balance between security and convenience.

------
zimbatm
I wonder how many services on Amazon allow user-configurable webhooks that can
be pointed to http://169.254.169.254 ...

------
logronoide
Same happens with metadata access in Openstack.

The access is controlled by source IP (and namespace). I wonder if it's
possible to spoof the IP and access Metadata of other servers/users.

------
Thaxll
It has been this way for 10 years; anyone who uses EC2 knows it. I don't see
the problem. If you're not happy with it, just use API keys.

~~~
helper
IAM instance roles were only added in 2012:
https://aws.amazon.com/blogs/aws/iam-roles-for-ec2-instances-simplified-secure-access-to-aws-service-apis-from-ec2/

------
cperciva
OK, dwaxe, I have to ask: Are you a robot? Because I uploaded this blog post,
tweeted it, and then came straight over here to submit it and you _still_ got
here first.

Not that I mind, but getting your HN submission in within 30 seconds of my
blog post going up is very impressive if you're _not_ a robot.

~~~
brettproctor
Assuming dwaxe replies claiming to not be a robot, how would we go about
verifying? :P

~~~
ATsch
I heard robots float on water

~~~
bsandert
What also floats on water?

~~~
tomludus
Ducks!

------
ChoHag
The number of people in this thread not merely nodding their heads and mmhmm-
ing (or the internet equivalent) is a concern.

