> The ability to apply a wildcard match on the s3:ResourceAccount condition key
That’s the crazy part. No good can ever come from this - there is no legitimate reason why you would grant or deny permission based on a partial account id match.
This is because, I assume, AWS policy evaluation has a number of “operators” and “operands”, and in this case, you’re using the StringLike operator on the account ID string.
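As a rough illustration of what such a condition looks like and how a glob-style operator treats an account ID as a plain string, here is a small Python sketch. The statement, bucket name and account IDs are made up, and `fnmatchcase` is only a stand-in for `StringLike` (which supports `*` and `?`); it is not a claim about how AWS evaluates policies internally.

```python
from fnmatch import fnmatchcase

# Hypothetical policy statement; bucket name and account ID prefix are invented.
statement = {
    "Effect": "Allow",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::example-bucket/*",
    "Condition": {"StringLike": {"s3:ResourceAccount": ["1234*"]}},
}

def string_like(value: str, patterns: list[str]) -> bool:
    """Stand-in for StringLike: true if any glob pattern matches the value."""
    return any(fnmatchcase(value, pattern) for pattern in patterns)

patterns = statement["Condition"]["StringLike"]["s3:ResourceAccount"]
print(string_like("123456789012", patterns))  # True: starts with 1234
print(string_like("999956789012", patterns))  # False: different prefix
```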
Anyway, this discussion is a bit amusing to me, since Devops people are discovering side channels[1] now, although other types of side channels, such as speculative execution side channels on CPUs (Meltdown, Spectre), already made waves when they were discovered, and before that we had power analysis[2], magnetostriction detection, and constant-time cryptography (this one is a field of its own, so I omit references).
> Anyway, this discussion is a bit amusing to me, since Devops people are discovering side channels[1]
We, the DevOps people, already knew about side channel attacks, Spectre and the like, evaluated the performance hit of the fixes (or alleged fixes), patched our kernel boot params, etc. We are curious people, just like many here.
Not GP, I tried to answer in as friendly a way as possible. Their approach is even weirder given that security really goes hand in hand with systems infrastructure. In any small enough organization, security will be managed by DevOps/SRE. If the org becomes large enough, it will start dedicating people to security.
I didn't think you were wrong; I just thought the exchange and its tone were very stereotypical of Developers and Operations (and I was amused). Each being on their side of the fence and only throwing things over the top.
It used to be really bad... "works on my machine", bad deploy instructions etc. We paved a lot of these roads with (really fucking bad) strategies like containers. At some point maybe we will learn to write software that is operationalizable. Projects like tigerbeetle give me hope that this is our next evolution.
It’s a consequence of weak typing choices - not an inevitable result of allowing flexibility.
Doing glob matching on account IDs is like doing concatenation with guids, applying a bitshift to a UTF8 string, or running a regex on an integer. It is a nonsensical operation, and - as shown here - results in surprising security properties of the resulting system.
Surprising security properties are an undesirable result in an access control policy language.
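To make the surprising property concrete, here is a purely local simulation of how a yes/no answer to "does this wildcard match?" leaks an ID one digit at a time. Everything here is an assumption for illustration: the ID is made up, `access_allowed` is a hypothetical oracle standing in for "the request succeeded under a policy with this pattern", and no AWS calls are made.

```python
from fnmatch import fnmatchcase

SECRET_ACCOUNT_ID = "123456789012"  # made-up ID standing in for the bucket owner's account

def access_allowed(pattern: str) -> bool:
    """Simulated oracle: would access succeed if the policy's
    s3:ResourceAccount condition were StringLike `pattern`?"""
    return fnmatchcase(SECRET_ACCOUNT_ID, pattern)

def discover_account_id(oracle, length: int = 12) -> str:
    """Recover the ID one digit at a time using prefix wildcards."""
    known = ""
    while len(known) < length:
        for digit in "0123456789":
            if oracle(known + digit + "*"):
                known += digit
                break
        else:
            raise RuntimeError("oracle rejected every digit")
    return known

print(discover_account_id(access_allowed))
```

With a prefix oracle like this, the search needs at most 10 queries per digit, so at most 120 for a 12-digit ID, instead of guessing among 10^12 possibilities.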
They have ARNs, which include the account ID, and there a glob match is useful. Something like "arn:aws:*:*:1234567890:*" is useful, but "arn:aws:*:*:1234567*:*" isn't.
That's the part that surprised me as well; it doesn't seem like a field that should be eligible for anything other than an exact match. I am unable to conceive of a use case for pattern matching account IDs.
If for some reason you’re dealing with thousands of accounts that are architecturally indistinguishable, bucketing them by ID prefix isn’t a particularly wild thing to want to do.
AWS assigns these individually, and customers can’t influence the ID that they get. For access control purposes I see no valid use case for wildcards there.
Sharding on account ID might make sense if someone has a large number of them, but that would not necessitate wildcard matching.
But it could seem like a neat and obvious way to reduce policy size (which is limited) and make the policy arguably more readable, or at least make the intention clearer. (Reading an explicit list like `2847373847261`, `37385857721`, `5847262671`, ..., I might assume it means `*1` over our accounts, but I might be wrong, or I might forget to add a new one, or fail to automate that correctly.)
It sure could. If you're sharding by id and have some per-shard resources, they could definitely get permissions to only accounts 12345*. (I'm not saying it's a good idea, just that once you're in that situation, you would pattern match on partial IDs)
But account IDs are assigned by Amazon and there's no structure within the namespace that's useful to you. If you mean all, you can wildcard * - but there don't seem to be any legitimate cases for "all account IDs beginning with a 1".
Presumably the permissions language is broadly defined and has something like
filter = property, operator, value
with few constraints on which operators can be used in which situations, to keep parsing the language simple (parsers being notoriously prone to vulnerabilities after all). In retrospect perhaps that isn't a good trade-off, but it would be tricky to tighten things up now without breaking lots of existing users.
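For contrast, a sketch of what a stricter version of that generic triple could look like, where each key has a type and each type only accepts certain operators. The `Filter` class, the typing table and the operator table are all hypothetical; this is not how IAM is implemented, just one way such a constraint could be expressed.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Hypothetical table of which operators each key type accepts.
ALLOWED_OPERATORS = {
    "string": {"StringEquals", "StringLike"},
    "account_id": {"StringEquals"},  # exact match only
}

# Hypothetical key typing; the key names are used for illustration only.
KEY_TYPES = {
    "s3:prefix": "string",
    "s3:ResourceAccount": "account_id",
}

@dataclass(frozen=True)
class Filter:
    key: str
    operator: str
    value: str

    def __post_init__(self):
        # Reject operator/key combinations that make no sense for the key's type.
        key_type = KEY_TYPES[self.key]
        if self.operator not in ALLOWED_OPERATORS[key_type]:
            raise ValueError(f"{self.operator} is not valid for {key_type} key {self.key}")

    def matches(self, actual: str) -> bool:
        if self.operator == "StringLike":
            return fnmatchcase(actual, self.value)
        return actual == self.value
```

Under these assumptions, `Filter("s3:prefix", "StringLike", "logs/*")` is accepted, while `Filter("s3:ResourceAccount", "StringLike", "1234*")` raises a `ValueError` at construction time.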
I think the GP is talking about granting access to a particular bucket to an unbounded number of customer AWS accounts — probably in requester-pays config. (Think: static data used by an Amazon Marketplace virtual appliance.)
They don't need to — in fact, you might be depending on them not following any particular pattern. Think: treating the Account IDs as pre-hashed keys, and then specifying prefix patterns as ways of sharding the hash keys onto a set of buckets, to evenly distribute access (and therefore traffic) by customer.
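A minimal sketch of that sharding idea, assuming leading digits are spread roughly evenly (which the thread only assumes, and which may not hold). The bucket names and both helper functions are hypothetical; `fnmatchcase` again stands in for a prefix wildcard in a policy.

```python
from fnmatch import fnmatchcase

def patterns_for_shards(buckets: list[str]) -> dict[str, list[str]]:
    """Assign each leading digit 0-9 to a bucket, round-robin."""
    assignment: dict[str, list[str]] = {bucket: [] for bucket in buckets}
    for digit in range(10):
        assignment[buckets[digit % len(buckets)]].append(f"{digit}*")
    return assignment

def bucket_for(account_id: str, assignment: dict[str, list[str]]) -> str:
    """Find the bucket whose prefix patterns match this account ID."""
    for bucket, patterns in assignment.items():
        if any(fnmatchcase(account_id, pattern) for pattern in patterns):
            return bucket
    raise ValueError(f"no shard matches {account_id!r}")

# Hypothetical bucket names.
shards = patterns_for_shards(["appliance-data-a", "appliance-data-b"])
print(shards["appliance-data-a"])          # ['0*', '2*', '4*', '6*', '8*']
print(bucket_for("676363687541", shards))  # appliance-data-a
```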
GET requests? Probably not. PUT/DELETE requests? I think so, yes. All updates to the bucket ultimately bottleneck at an update to the meta-version of the bucket-record-object in the object-storage-system's bucket-metadata store (itself probably something like DynamoDB / BigTable / etc.)
Given the way these IaaSs' distributed KV stores all manage writes (i.e. by having a cluster of transactor nodes across which per-key write-linearization responsibility for parts of the keyspace is sharded, such that each write is routed to a particular designated transactor node given the key's hash slot), a very large S3 user generating an extremely high level of metadata-update concurrency against a bucket could very likely write-contend that bucket's metadata, i.e. have a "hot" bucket-metadata key, experience low performance due to that, and solve it by sharding the bucket (swapping one too-hot metadata key for N somewhat-hot metadata keys).
I want to give an intuition-building example here, of an IaaS feature that wouldn't exist / wouldn't be exposed to the user if not for object-storage buckets being metadata-write-contended at scale. I'm not very familiar with the AWS ecosystem, though, so I'm not sure what the good example is for AWS. What I do know is GCP, so here's a GCP example: Google Cloud Dataflow allows you to set a temporary workspace GCS bucket on a per-job basis (gcsTempLocation). And, IIRC, Google's Cloud Architects advise to not have a bunch of active Dataflow jobs sharing the same gcsTempLocation — regardless of whether they use distinct key prefixes to namespace the temp files. Given that each job would be doing a lot of little serial updates to the temp bucket — and given that Dataflow jobs can each be highly internally concurrent — you're already potentially putting out O(N^2) ~concurrent updates to that bucket. You really don't want to make it O(N^3).
If you're big enough to think you need it, you're also big enough to have people who can tell you it's a bad idea and there's a better tool for the job.
aws support is not going to bend over backwards just to let you shoot yourself in the foot. it's more likely they grant an exception to one of the iam quotas.
My general assumption is not that they’re random, but at least that they’re not correlated; in particular that Amazon is not in the habit of handing out, like, account IDs 676363687000 - 676363687999 to a single organization. Even if they did hand out a sequential batch of 1000 account IDs, it would be more likely to be 676363687541 - 676363688540 than a set with a single consistent prefix.
Odds are that an account wildcard match like 676363687* will just match a few hundred entirely random AWS accounts.
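The arithmetic behind that estimate can be sketched as follows. A 9-digit prefix on a 12-digit ID leaves three free digits, so 1000 candidate IDs; how many of those are real accounts depends on how densely the ID space is assigned, which is not known here. The `assigned_fraction` value below is a made-up placeholder, not a measured figure.

```python
def ids_matching_prefix(prefix: str, id_length: int = 12) -> int:
    """Number of possible IDs of `id_length` digits that start with `prefix`."""
    return 10 ** (id_length - len(prefix))

def expected_real_accounts(prefix: str, assigned_fraction: float, id_length: int = 12) -> float:
    """Expected number of assigned accounts under a prefix, given a guessed density."""
    return ids_matching_prefix(prefix, id_length) * assigned_fraction

print(ids_matching_prefix("676363687"))          # 1000
print(expected_real_accounts("676363687", 0.3))  # density of 0.3 is purely hypothetical
```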
> in particular that Amazon is not in the habit of handing out, like, account IDs 676363687000 - 676363687999 to a single organization
Honestly, wouldn't surprise me that much if they were willing to accommodate this for sufficiently large accounts. It'd still be pretty sketchy to design your access control around, but it wouldn't be unrealistic.
I once was involved in creating two (linked) amazon accounts at the "same" time, and ended up with account IDs of which the first 4 digits are identical.
it's irrelevant whether they're "cryptographically" random, all that matters is that account IDs are not controlled by the user and therefore have no logical relation to any access-control policies the user may wish to implement
Probably comes out of a tendency to generalize. Last week for one of my side projects I built something that lets you write queries in a format inspired by OWL and it has a library of relational operators that can, say, extract the host from a URL or do a prefix query, like query, regex query, etc.
As it is my side project’s side project I do what is easy, so these operators are always available even in cases where they don’t make sense (I dunno what happens if you try a regex query on a number, I don’t care). I can imagine there is something a bit like this inside AWS, but for a security-sensitive system with a lot of users you have a different standard.
This is like matching a bitfield on the group ID on a Unix system. I could see someone coming up with this idea, and I could see them thinking this is somehow smart, but implementing it on systems that aren't 100% in your control would just be silly.
While I wouldn't publicly hand out my account IDs as a general practice, I think you have to expect that some of them will be disclosed at some point. As more third party vendors and SaaS platforms move away from IAM users and access keys to using role assumption as the preferred method of integration (as they should!), the account ID of at least the account you use as their integration point is now known by another party, who have their own dependencies, vulnerabilities, etc.
If you put a role ARN in the principal section of a bucket policy, AWS will check whether the role exists and fail the policy update request if it doesn't, even if the role is not in the same account. I don't know if there's another way, but you can manually enumerate roles from there.
AWS account ID == Your IP address. It may be sensitive, but someone needs to know it to get s*$t done.
Illustrative example: I had to deal with a third party that we needed to integrate with because of anti-money laundering procedures a year or two ago. I wanted my team to set up a PrivateLink with the organization because that's generally more secure than an open SFTP port. The company refused, citing security reasons to hide their account ID (it's needed for the role ARN used for reciprocal permissions to PV endpoints). So what did we do?
We ended up whitelisting a range of public IPs they use for inbound port 22...
Moral of the story: you may think you are a genius for obfuscating your IDs, but you can't really run a business unless people have an address back to you
AWS PrivateLink has another property that generally makes it undesirable for these types of integrations: communication is bidirectional, and IP subnets should not overlap.
We (as a vendor ourselves) typically integrate as a VPC Endpoint Service, where communication is unidirectional and our service is exposed as a load balancer’s endpoint within the customer’s VPC.
I thought PrivateLink was branding for VPC interface endpoints? There's no IP subnet restriction for that because it's basically a proxy. Are you thinking of VPC peering?
The initial text was ambiguous but the author has now clarified their answer in this thread. Do you really think they were happy with this? I actually think this might open other attack vectors.
I agree that the account number just by itself is not a secret, but there is a reason why all AWS demo videos mask the account number.
This is my attitude towards security disclosures. In this case, Amazon approved the disclosure. But even if they hadn't, it's better for the good guys and bad guys to know about problems when the alternative is only the bad guys knowing (or the bad guys and a few good guys at the affected company).
For sure an interesting find, but was kinda hoping based on the title that there was a more straightforward way to do this.
I really wish that AWS had a simple way from an admin account to ask "where is X resource" within an organization to quickly tell me which account has a specific S3 bucket (and other things, but s3 buckets is the big one).
Admittedly this is mostly an issue with legacy buckets that existed before better practices and buckets all being defined in code. But with a ton of AWS accounts it can be tedious to hunt down a resource in an unknown account and possibly region.
If you have AWS Config set up for the organization (aggregator), you'll get an Athena-SQL-queryable inventory of all your resources from all organization accounts.
So finding out which account owns a resource can be as simple as, roughly: select accountId where arn = "x"
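As a hedged sketch of what such a lookup might look like, the helper below only builds a query string in the SQL-like syntax of AWS Config advanced queries; it does not run anything. The function name is hypothetical, and it assumes that for S3 buckets the `resourceId` is the bucket name. The resulting expression would then be handed to an aggregator query (for example boto3's `select_aggregate_resource_config`), which is not called here.

```python
def bucket_owner_query(bucket_name: str) -> str:
    """Build a query expression that asks which account and region hold a bucket."""
    if "'" in bucket_name:
        # Bucket names never contain quotes; reject them rather than escape.
        raise ValueError("unexpected quote in bucket name")
    return (
        "SELECT accountId, awsRegion, resourceId "
        "WHERE resourceType = 'AWS::S3::Bucket' "
        f"AND resourceId = '{bucket_name}'"
    )

print(bucket_owner_query("example-legacy-bucket"))  # bucket name is made up
```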
> The Zone ID and Account ID are not sensitive. Sensitive data like account API Key, Secrets etc. can all be revoked, rotated or changed. See the comment 36 below on the Wrangler repo: as per our security team, it’s completely Fine to have your zone_id and account_id public, the Global API key and associated email address should be kept secret.
That said, one thing I could think of that this could be used for is correlation. If you’re running multiple S3 sites from the same AWS account, people would be able to see that they’re hosted by the same account. Whether or not this matters depends on your threat model.
Exactly - this isn't going to open the door for someone but could add a ton of value to enumeration.
As we are very canary focused, we also think it's interesting to consider the implications of the recent research from Truffle Security w.r.t canary tokens (https://trufflesecurity.com/blog/canaries).
Not necessarily. An AWS account ID + the knowledge of a role name that by mistake has the "allow role assumption" allowlist too wide (say "*") is now enough to take over the account.
One might of course say "well then don't do that", but of course the more complex a system like IAM is, the easier it is for inexperienced people to open the floodgates.
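A small defensive sketch of the kind of check that catches this mistake: scan a role's trust policy document for an `Allow` statement whose principal is `*`. The function name and sample policies are hypothetical, and the check is deliberately simplistic; it ignores `Condition` blocks, so a wildcard principal that is narrowed by a condition would still be flagged.

```python
def has_wildcard_principal(trust_policy: dict) -> bool:
    """Return True if any Allow statement trusts every principal ("*")."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal")
        if principal == "*":
            return True
        if isinstance(principal, dict):
            aws = principal.get("AWS", [])
            if isinstance(aws, str):
                aws = [aws]
            if "*" in aws:
                return True
    return False

# Made-up trust policies for illustration.
too_wide = {"Statement": [{"Effect": "Allow", "Principal": {"AWS": "*"}, "Action": "sts:AssumeRole"}]}
scoped = {"Statement": [{"Effect": "Allow", "Principal": {"AWS": "arn:aws:iam::123456789012:root"}, "Action": "sts:AssumeRole"}]}
print(has_wildcard_principal(too_wide))  # True
print(has_wildcard_principal(scoped))    # False
```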
That's easy to find out: change the API credentials of a user, but forget to update the service. Notice only a few days later that you forgot the change, but you also never got any notification "something" is going wrong.
In contrast, every half-decent IdP will lock an account automatically after anything from 3-10 wrong attempts.
Turning what you said around, you're arguing you might want to keep an account ID secret for "security by obscurity" reasons. In my mind, even in a multi-layer security solution, even then the account ID should be considered as a public string whose knowledge (along with other bits like what misconfigurations it has) provides no additional vector of attack, because of defense in depth.
For CF accounts, I use the gmail `+ feature` to make a totally unique email address that cannot be easily guessed. Not perfect, but adds another layer of abstraction.
Quite a few people in this thread assume that the AWS key id is part of a "security by obscurity" "protection in depth".
This will probably be downvoted, but if you read this anyway: this is a good example of why "security by obscurity" is not a good defense. You will overlook something (a determined attacker will not)
Anything non-"security by obscurity" does not depend on you understanding something or not - it will apply, no matter what, as long as the attacker doesn't have a genius on the payroll who can crack e.g. AES-256 just like that (https://www.youtube.com/watch?v=KEkrWRHCDQU)
To me security by obscurity is limited to things like this:
There is a way to view bananas at an unsecured endpoint:
/bananas/:bananaUUID
I don’t want people to get all my banana data, but as long as there isn’t an easy way to list banana uuids, that endpoint is basically effective security by obscurity.
That's an unfortunately common misconception. Your example is not security through obscurity any more than password authentication is.
Security through obscurity means substituting a flawed algorithm, one that is usually trivial to exploit once the attacker is made aware of it, for actual security. Think things like no authentication and ROT13ing and Base64ing client-side. If the method leaks or is discovered, the whole system is broken.
You just told me your algorithm and I cannot get to your banana because the UUID key space is insanely large. So that's not security through obscurity.
There are some important caveats to consider: Client and server software will not handle URLs like secrets, so UUIDs will leak out through various channels. Some examples include logs, user analytics, ad networks, browser history, bookmarks, e-mail, instant messages, shady browsers, shady ISPs, referrer headers, etc. You cannot rotate resource identifiers without breaking clients, so a leaked URL is permanently leaked.
Hopefully you're using version 4 UUIDs. Those set aside 6 bits to encode UUID details, keeping 122 bits of entropy. Since every banana needs its own identifier, subtract the number of bits needed to uniquely represent bananas. What's left will unavoidably be less guess-resistant than client secrets. Other versions of UUID use many more bits for low-entropy purposes.
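The subtraction described above can be written out as a quick calculation. This is a back-of-the-envelope sketch: it assumes version 4 UUIDs with 122 random bits and an attacker guessing uniformly at random, and the banana count is an invented number.

```python
import math
import uuid

RANDOM_BITS_V4 = 122  # 128 bits minus 4 version bits and 2 variant bits

def guess_resistance_bits(n_valid_ids: int) -> float:
    """Bits of work to hit any one of `n_valid_ids` by guessing at random."""
    return RANDOM_BITS_V4 - math.log2(n_valid_ids)

banana_id = uuid.uuid4()
print(banana_id.version)                 # 4
print(guess_resistance_bits(1_000_000))  # roughly 102 bits for a million bananas
```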
How might this matter? An obvious one: Given a production bucket, it’s now possible to find development buckets for that same org, which is not expected behavior IMO.
You only need the bucket name to do that. You should include a randomly generated prefix/suffix in bucket names to protect against such enumeration attempts. Another good idea (as well as, not instead of) is to expose objects in buckets publicly with a non-default host name, such that the bucket name isn’t leaked at all.
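One way to generate such a suffix, as a minimal sketch: the helper name and base name are hypothetical, and the only S3 rule it relies on is the 63-character limit on bucket names (the hex suffix is lowercase, which bucket names also require).

```python
import secrets

def randomized_bucket_name(base: str, suffix_bytes: int = 8) -> str:
    """Append an unguessable hex suffix to a human-readable base name."""
    suffix = secrets.token_hex(suffix_bytes)  # two lowercase hex characters per byte
    name = f"{base}-{suffix}"
    if len(name) > 63:
        raise ValueError("S3 bucket names are limited to 63 characters")
    return name

print(randomized_bucket_name("acme-dev-assets"))  # e.g. acme-dev-assets-<16 hex chars>
```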
> While account IDs, like any identifying information, should be used and shared carefully, they are not considered secret, sensitive, or confidential information.
Seems like at least in the digital world, there is either public or private information, and that's it. We don't really have a good concept of privilege or protected information.
For example, my home address is technically public, but I most certainly wouldn't want it plastered across the interstate with a picture of my family next to it advertising where I live. It's handed out on a need-to-know basis, and I mostly trust / expect that it's kept mostly confidential, or use-limited.
One huge mistake that Google made when they were integrating YouTube with Google+ was the idea of sharing people's YouTube comments with their G+ friends. YouTube comments have always been public, but there was huge customer pushback, forcing Google to revert the idea, since in people's minds there is a huge difference between public and publicized comments.
A hacker can enumerate the resources of and access to an account by using its account ID. Many AWS customers incorrectly configure resources such that any other AWS customer can access things in their AWS account; for example, some 3rd party providers tell customers to configure access for the 3rd party in a way that can lead to wide-open access for anyone who knows the customer's account ID. If the cloud admins don't know what they're doing it can be very easy to screw up and not realize it.
It's sort of like giving someone your IP address. By itself it's not enough to hack someone. But if your host is insecure, it sure makes it easier knowing exactly where to attack.
There are levels of data classification, and different regulations and policies apply to each level and geo. They are probably disavowing themselves of any liability.
This is a good question for any provider like AWS: what kinds of information do I leak with seemingly mundane choices like bucket names?
The other attack vector is from insiders. Many organizations "shield" identifiable information behind UUIDs or some other scheme. In the event of a breach, the UUID might mean nothing to most (it's not foolproof, though), but opens more doors for an insider.
Account IDs are 12 digit random numbers. They are used to identify an AWS account, that's all. Knowledge of the full 12 digits doesn't grant access to anything, prove you own the account, or enable you to authenticate to any systems (or individuals like in customer service for example) in AWS.
They are visible whenever you share something with another AWS account; they're in the ARN. For example, the 12 digit account IDs of all vendors that vend AMIs, assume roles in AWS accounts (think Datadog, for logging / metrics), or otherwise provide services are well known and easily discoverable. This S3 example is just sort of interesting since it's one of the handful of AWS services that don't use account IDs in ARNs.
AWS account ids are not secrets and treating them as secrets or giving the impression that they are anything other than public data is a distraction from real security concerns.
I have always likened AWS Account IDs to be similar to public keys. But even less useful. There is nothing useful that you can do with them without other information.
I think the more worrying attack vector is when you now use the account number to try to allowlist a principal from that account in some policy in your account. If the principal doesn't exist in the other account you'll get a role/user not found error!
Presumably you could use this to find real principals in the other account.
There seems to be a large discussion of whether account IDs are "secret" or "private" or "confidential" or whatever.
From my point of view, that entirely misses the point. The problem here is that what's revealed here is the relationship between buckets and account IDs, which allows discovery of shared ownership of buckets (unless you use a micro-account approach).
I probably don't care if you can discover that 2343242365 is the account number associated with "coolbuttplugs.com" but I probably do care if the same account hosts a bucket for "michaeljfoobar.name" and my buttplug thing is a sideshow from my white shoe law practice.
In my company we use reseller billing as AWS do not have a local billing entity. The reseller owns the organization's root account and we do not have access to it. Every subaccount creation requires a support email to the reseller.
Interesting post. I wonder if you can further combine it by using a PrincipalTag in some way? You can assume a number of roles with different tag values, and these can be interpolated into the condition. This lets you do things without a huge statement?
Pretty sure AWS always could look up the info of an account if a bucket was used for crimes. Adding the name of a fake account probably doesn't add anything for them.
The answer to these types of questions is: it depends on who gets the case. Generally, the further along you've gotten within legal limits, the better the chance that the case will be given serious consideration and time.
While I agree, the way you wrote this might mislead people into thinking that you meant you can treat your account ID as non-secret because AWS does. That doesn't directly follow; the difference between AWS's point of view and your company's point of view means there are things you might care about that AWS does not. Rather, you need to follow AWS's lead and design your cloud deployment such that the account ID doesn't leak anything interesting about your business that you'd prefer to keep private. Correlated ownership of buckets (e.g. different clients of the same design firm) in particular is something AWS doesn't care about, but you might. If you don't want people to know that the buckets have the same owner, better split it up into separate accounts to preserve the non-secretness of the account ID.
No, this is wrong. The fact that AWS does not consider the ID a secret means that your company never should either. If your company “cares” about its ID being secret, then your company is designing its systems in an insecure way.
The fact that AWS does not treat the ID as secret means you have no guarantees that anyone within AWS cannot see or find your ID. You also have no guarantee that AWS at some point won’t expose your ID to the world and break your entire security model, because AWS doesn’t think it’s secret. If you do, you’re basing your security off of false assumptions.
You've said "this is wrong" and then repeated exactly my point back to me. I suspect some misunderstanding has occurred here. We are agreeing on this--you need to follow AWS's lead and design your cloud deployment such that the account ID doesn't leak anything interesting about your business. That's a direct quote from my original post. I further gave an example--if you're a design firm using a single account for different clients, you're leaking the information that they are your customers. Better split it up into separate accounts to preserve the non-secretness of the account ID--another direct quote from my original post.
Stated another way: you can unintentionally make your account ID sensitive if you're not careful. You have to be careful.
I think we agree, but either your meaning or your words are giving me pause.
> you can treat your account ID as non-secret because AWS does. That doesn't directly follow;
It does follow; and not only “can” you treat them as non-secret, you _must_ treat them as non-secret.
> the difference between AWS's point of view and your company's point of view means there are things you might care about that AWS does not
The point here is that if you want to have good security, you _cannot_ “care” about this if your service provider does not also care about it. If you “care” about your ID being public, but your provider does not, then if you want to have good security you must either find a way to not care, or find another provider.
Correlation of bucket ownership isn't a security issue at all. I never used that word in my posts, and the author of this article also never suggested that it was. That's the point--there are other considerations that AWS does not care about, but your business does. Your cloud deployment needs to be designed such that the account IDs do not leak information that your business doesn't want to be revealed. You don't get the non-secretness for "free"--you have to think about it and be careful on how you isolate things into accounts. As far as I can tell, we agree on all of this and have just suffered some misunderstanding in this thread.
No, this is wrong. Just because it is not a secret does not mean that its disclosure is not valuable. You need that information to compromise it.
It narrows all the possible account IDs to one.
Is that ID already compromised? Can you gain access through someone else that does the work for you?
You can cross reference with other systems you have compromised. Is that account ID in their system? What access does that give you?
Etc, etc. It is not a secret, but it absolutely is valuable.
For example, if you considered it a secret (even though it is not), you might not choose to use third parties that require it, and that can improve your security posture.
No, operational security might still be a valid concern even where there are no issues with digital security.
You might have a 100% secure system, but you don't want your competitor to know exactly what you are doing. You might also not consider their knowledge of what you're doing to be a vulnerability, nor think that you should spend many resources on preventing them from knowing, you just don't want to make it easy for them.
Cryptography is very black-and-white. Business operational intelligence is not.
AWS doesn't consider them sensitive, but some organizations do. My feeling is that I would rather avoid leaking any info about my AWS environments. Just because AWS doesn't think they can be dangerous doesn't mean someone else won't find a way.
I don't understand your reasoning. Neither my name nor where I live nor my phone number nor my license plate are secrets. Yet I don't go around wearing a t-shirt with my personally identifying information printed on it. What am I missing?
> While account IDs, like any identifying information, should be used and shared carefully, they are not considered secret, sensitive, or confidential information.
If you knew that there was an assassin out there trying to find you and cause you harm, would you consider “I don’t wear a t-shirt with my address on it” to be any form of protection against that assassin?
If you want protection against that, you need to focus on better home security, hiring bodyguards, going to the police etc, and you’re better off assuming that the assassin will find your address regardless of whether or not you wear a t-shirt with it printed on.
Put another way: there’s a difference between “I don’t do this thing” and “I rely on not doing this thing for my safety”. The second one makes a lot more assumptions than the first one does, and those assumptions can lead to problems if they are false assumptions.
> Put another way: there’s a difference between “I don’t do this thing” and “I rely on not doing this thing for my safety”.
That was my point. The fact that some organizations consider AWS account IDs sensitive is independent of whether they rely on it being sensitive or not.
I might have taken all precautions against an assassin attack, yet I won't make the assassin's job easier by announcing my PII to them. The fact that I won't announce my PII says nothing about whether I took the security precautions.
If an organization considers it sensitive, that implies they’re putting some level of reliance on it being so. Otherwise there would be no point in considering it sensitive.
There’s a difference between “making it easier for an attacker” and using it as a security control, even if it’s not the only security control. The point is that even if you don’t go around wearing a shirt with your address on it, that should never factor in to your designs for security. It should never be considered a security control, even a “defense in depth” one.
In fact, your threat model should ideally ask the question “assume someone does walk around with a shirt with my address on it, will I still be safe?” That doesn’t mean you’re actually going to go do it, but if the answer is yes, that’s how you know you’ve done your job.
You’re still misunderstanding. I’m not saying you should go “paint a target” on yourself, I’m saying you should assume _someone else is_ going to paint a target on you, and defend yourself accordingly, rather than acting like the lack of a target protects you in any way.
If you place any sort of security assumptions on AWS account IDs into your threat model, you're effectively directly introducing a security vulnerability. If you're not then why include it into your security threat model to begin with? I believe that is their point.
Since AWS does not, and has never, treated that information as secret, there is absolutely no reason to consider it sensitive, because there are no security guarantees in how AWS handles those IDs (as this article demonstrates).
Thus, either you're including them into your threat model as sensitive and thus immediately opening up yourself to vulnerabilities (bad security), or you're not including them at all (and thus not treating them as sensitive/secret/whatever). The argument the parent had (and that I agree with) is that you should do the latter unless AWS provides a means to work with those IDs securely (it won't because they're not secrets).
This is a false dichotomy. There is a deep chasm between "publish everything", and "this is a secret", called operational security.
Any time a topic like this comes up, there are people on this forum that try to apply the "security by obscurity does not work" principle to every security topic under the sun, when in reality, that principle really only applies to the world of cryptography. In meat space, where humans operate on plaintext, keeping a secret is a very valid approach to some topics. This is why things like NDAs exist.
How is it possible for a user of AWS to keep the account ID secret if Amazon doesn't even consider it secret? If Amazon leaked your account ID they could point to their docs and say the account ID was never meant to be a secret, sensitive, or confidential.
You can walk over to the user's desk and ask them not to share it. Whether or not Amazon leaks it is unrelated to my employees' ability to follow instructions.
There is a lot of data that exists in a space somewhere between "100% secret" and "100% public". This is one of those situations, for many organizations.
You’re wasting your employees’ time by asking them to keep it secret, when you gain absolutely no benefit from keeping it secret (and in fact are introducing an easy failure point by pretending it’s secret) and you have no guarantees that others are keeping it secret.
> This is one of those situations, for many organizations.
Many organizations just have a blanket policy that you shouldn't be exposing data about an organization's infrastructure unless you need to do it. This is a good policy.
> by pretending it’s secret
No, nobody needs to pretend it is secret. You're missing my above point. There is not a dichotomy between secret and public. It is possible for something to be neither secret nor public.
Well, as a very relevant example, if you tell others what your AWS account ID is, they can figure out if you own any particular bucket. The metadata association between the content of that bucket and the owner might give away information that the contents of the bucket doesn't indicate on its own. It also might not be a technical vulnerability, but that association itself could imply some proprietary business information. Or it could give clues to any would-be attacker as to other resources to target. In business, there are lots of types of information that are not secret, but are also not public.
If you assume it is a secret, you might start to use it as an access key, or similar things. That's where the error is. You should assume it is public (but avoid publishing it yourself of course, that's defense in depth)
I think making the assumption that account IDs won’t leak is the problem. It’s safer to assume that someone out there already has a directory of account IDs to buckets and that your efforts are better expended on other things, like actual secrets.
I guess anything that looks like random letters/numbers is going to be considered a secret to some people. If it wasn't meant to be secret, it would be human readable. Or some such nonsense.
* Security through obscurity provides secondary security only, so it doesn't add to defense-in-depth significantly. If it can be added, then it is slightly safer to prefer to do so. Elimination of unprivileged enumeration and internal primary key predictability are relatively more important to reduce the attack surface.
They’re not a secret, in fact it’s what you use to share the identity of one account with another, possibly third party’s account. It’s a secret in the way your personal email address is. You may not advertise it on 4chan, but you definitely share it to limited audiences. Once you’ve shared it it’s no longer an actual secret. Indeed, as has been noted, aws itself doesn’t treat your account id as a secret and freely shares it within aws, logs it in plain text, uses it for operations with human operators, internal reporting, etc. It’s an identifier, it’s discoverable, shareable, and leaks all over the place.
This is not the stuff secrets are made of.
Trying to eke some sort of security story out of hiding account ids plays right into security through obscurity. As it’s not treated as a secret the most you’re doing is obscuring the attack surface of your infrastructure. IAM doesn’t allow anyone to use the knowledge of your account ID to grant any privilege not specifically granted within the account itself via two way grants. Holes in your IAM policies aren’t protected by hiding the account ID, they’re protected by closing the holes.
Yes, but I do use random email addresses (that look like passwords) to stop correlation, concerted credential stuffing (that might lock me out for a time), etc. And if it pops up in a compromised server no one will know that email address is mine.
And aws also offers ideas like external IDs and similar concepts to do something like this.
But you might notice that even aws services advertise their own account ids. They’re not secrets and treating them as such doesn’t help you improve security.
AWS account IDs aren’t secret, but the identity of who owns particular S3 bucket names is meant to be.
If you get an email apparently from AWS that correctly names an S3 bucket and the associated account ID, are you more likely to take it seriously than an email that just names a bucket?
Not to mention it also means you can establish relationships between buckets by finding common ownership.
No, the account ID isn’t secret but I don’t think we should be dismissive of this new information either. It’s still important metadata to factor into decision making.
More generally, this opens up all sorts of threat models for espionage and alternative data. A private bucket called project-foobar-logs-staging might have its existence known in an index but never have been known to be associated with Baz Corp's publicly-known image buckets - but with this disclosure, a database will become available making that association possible. And it will be possible to monitor if and when project-foobar-logs-prod gets provisioned. Not to mention that state-level actors would love to see an enumerable list of projects their rivals are working on, and if any of them happen to have been secured only by obscurity in the past.
Everyone here throwing down the “technically not a secret” card but I don’t think that was your point. I guess “sensitive, but not secret” is the right way to say it. They’re obviously not secret like credentials, but it’s a damn good piece of info for an attacker to hold against you, because, like you said, it’s impossible to rotate, but is tangentially related to every single part of your AWS infrastructure (per-account of course).
I guess I disagree with the AWS documentation then. It should be treated as a sensitive piece of information that should not be exposed to the public. It’s not going to bring you down if it’s leaked, but it could be used along with other leaked info over years in more sophisticated attacks. Why risk it is more the point.
In fact, when delegating IAM access (where security is top of mind), account IDs are shared liberally.
Account IDs are as secret as phone numbers. That bit of info could be tangentially useful to an attack, but really shouldn't be assumed to be secret in any meaningful way.
But there’s a reason people advise against security by obscurity: the obscurity can go away at any time and once it does it’s unrecoverable.
So sure, maybe pat yourself on the back today that on top of your other measures, no one outside your org and AWS knows your Account ID. But if it gets out at some point, those other measures should be foiling your pentesters on their own. In fact, to better test that this is the case, you should probably give the pentesters your account ID so you can be informed regarding your security in this scenario.
Unfortunately, until humans have crypto modules installed in their brains, we will have to rely on some form of "security by obscurity" that many colloquially call "operational security".
Yes, there are types of data and metadata about your company's infrastructure that can be used against you. But no, you shouldn't hand it to attackers on a silver platter.
> In fact, to better test that this is the case, you should probably give the pentesters your account ID so you can be informed regarding your security in this scenario.
That depends on the scope of the test. Many organizations will do both (and others) to test different layers of security. Remote software exploits are something that many people on this forum are concerned about, but that is hardly the be-all-end-all of security for an organization. There's a lot of security topics entirely outside the scope of computer systems to be cognisant about here.
When obscurity is the only option, like opting not to expose AWS account ID, why not do it? Don’t think this should be conflated with replacing proper security with obscurity, which is obviously wrong. The less you expose about your operation, the better. Not exposing details of your infrastructure to the public is another commonly suggested best practice that is obscurity, not security, that nobody challenges. For example, cleaning up common default response headers for things like CloudFront.
You can't keep it secret because Amazon doesn't keep it secret.
So if Amazon doesn't keep the account ID a secret how can you as a user of Amazon be expected to keep your account ID secret? There's no way for you to stop Amazon from exposing it.
You as a user of Amazon can do whatever you want though, including being careful about which pieces of information you elect to expose publicly. If it’s up to you, why choose to expose it, when you can choose NOT to? Just because this reverse search of bucket to account id exists doesn’t mean you should begin to expose your account id on your own.
You can't choose NOT to expose it because you can't stop Amazon from exposing it.
Amazon says it's not secret, so it's not secret. They make no attempts or guarantees to keep it secret so there's always the threat that Amazon themselves can expose it on your behalf. You can't stop that no matter how wrong you think it is.
Doesn't change the fact that security by obscurity is a layer of security. It's not a good layer, it's not one you should depend on, but it is a layer that causes problems for attackers.
This is kind of a big deal, but also not really? UUIDs are not IDs. And IDs are not always indices. This is a common design flaw in many distributed systems.
ITT lots of debate on whether AWS Account IDs are sensitive or not. To chime in my 2c; we've had this debate in multiple orgs with different security teams and the outcome has always been the same; they're not and it's counterproductive to your security posture to treat it as privileged information. Humans have a nasty habit of placing trust in people who have access to privileged information.
"Hi, this is Tom from AWS, I need to speak with you about your account 5923965523" - as a social engineering primer garners significantly different levels of trust from the target depending on whether the target perceives the account ID to be privileged information.
Not exactly - they can do that, but they don't need to match - put CloudFront in front of your S3 website and the site can be served from any bucket or folder within a bucket.
I have dozens of s3 static websites served from a single s3 bucket, all with unique top-level domains, each in its own folder within that single bucket - much easier this way.
Given anyone can create a bucket with any name (if it's not already in use), you can't count on getting the bucket name that matches your domain name.
That’s the crazy part. No good can ever come from this - there is no legitimate reason why you would grant or deny permission based on a partial account id match.
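To make the partial-match problem concrete, here is a rough sketch of why a yes/no answer to a wildcard pattern is enough to recover the whole ID. It makes no AWS calls. The `make_oracle` function is a hypothetical local stand-in for "did the request succeed under a policy whose s3:ResourceAccount condition used this StringLike pattern", and `recover_account_id` is an illustrative name, not anything from the article or an AWS SDK. The only assumptions are that account IDs are 12 decimal digits and that `*` matches any trailing characters.

```python
from fnmatch import fnmatchcase

ACCOUNT_ID_LENGTH = 12  # AWS account IDs are 12 decimal digits


def make_oracle(secret_account_id):
    """Hypothetical stand-in for the allow/deny signal of a wildcard condition.

    A real attack would observe whether a request succeeds; here we just
    match the pattern locally and count how many questions were asked.
    """
    stats = {"queries": 0}

    def oracle(pattern):
        stats["queries"] += 1
        return fnmatchcase(secret_account_id, pattern)

    return oracle, stats


def recover_account_id(oracle):
    """Recover the ID one digit at a time using prefix patterns like '12*'."""
    prefix = ""
    while len(prefix) < ACCOUNT_ID_LENGTH:
        for digit in "0123456789":
            candidate = prefix + digit
            # The final digit needs no wildcard; earlier ones match any suffix.
            if len(candidate) == ACCOUNT_ID_LENGTH:
                pattern = candidate
            else:
                pattern = candidate + "*"
            if oracle(pattern):
                prefix = candidate
                break
        else:
            raise ValueError("no digit matched; the oracle is inconsistent")
    return prefix


if __name__ == "__main__":
    oracle, stats = make_oracle("123456789012")  # made-up example ID
    print(recover_account_id(oracle), "recovered in", stats["queries"], "queries")
```

The point is the query count: each position needs at most 10 guesses, so at most 120 questions in total rather than up to 10^12 for exact-match guessing.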