Hacker News
How to find the AWS account ID of any S3 bucket (tracebit.com)
583 points by tracebit 9 months ago | 216 comments



> The ability to apply a wildcard match on the s3:ResourceAccount condition key

That’s the crazy part. No good can ever come from this - there is no legitimate reason why you would grant or deny permission based on a partial account id match.


This is because, I assume, the AWS policy execution has a number of “operators” and “operands”, and in this case, you’re using the StringLike operator on the account ID string.
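For illustration, a policy statement that applies a wildcard to `s3:ResourceAccount` might look like the sketch below. The action, the resource, and the `12*` prefix are assumptions picked for the example; they are not taken from the article.

```python
import json


def resource_account_statement(account_pattern: str) -> dict:
    """Build an Allow statement gated on a (possibly wildcarded) account ID."""
    return {
        "Effect": "Allow",
        "Action": "s3:GetObject",  # assumed action, for illustration only
        "Resource": "*",
        "Condition": {
            # StringLike accepts * and ? wildcards in the value.
            "StringLike": {"s3:ResourceAccount": account_pattern}
        },
    }


policy = {
    "Version": "2012-10-17",
    "Statement": [resource_account_statement("12*")],
}
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

With `StringEquals` in place of `StringLike`, the same value would only ever match one literal string.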

Anyway, this discussion is a bit amusing to me, since Devops people are discovering side channels[1] now, although other types of side channels, such as speculative execution side channels on CPUs (Meltdown, Spectre), already made waves at the time of discovery. Before that we had power analysis[2], magnetostriction detection, and constant-time cryptography (this one is a field of its own, so I omit references).

[1] https://en.m.wikipedia.org/wiki/Side-channel_attack

[2] https://en.m.wikipedia.org/wiki/Power_analysis


> Anyway, this discussion is a bit amusing to me, since Devops people are discovering side channels[1]

We, the DevOps people, already knew about side channel attacks, Spectre and the like. We evaluated the performance hit of the fixes (or alleged fixes), patched our kernel boot params, etc. We are curious people, just like many here.


Jesus, I have been in tech for 25 years and we're still throwing shit across this divide.

Well, we are, if nothing else, consistent.

The next nerd who pinches an inch of the coder/ops divide is going to make a billion.


Figma did this for design-dev and is worth about $40 billion.

It's amazing how anything got made considering how disjointed processes used to be.


Only worth about 10 billion. But still a lot!


The deal was for $20B, my mistake.


The Adobe deal didn't go through. The last publicly hinted valuation was back at ~$10B, when employee share buyouts happened in January 2024.


¯\_(ツ)_/¯

Not GP, I tried to answer as friendly as possible. Their approach is even more weird given that security really goes hand in hand with systems infrastructure. In any small enough organization, security will be managed by DevOps/SRE. If the org becomes large enough, it will start dedicating people to security.


I didn't think you were wrong; I just thought the exchange and its tone were very stereotypical of Developers and Operations (and I was amused). Each stays on their side of the fence and only throws things over the top.

It used to be really bad... "works on my machine", bad deploy instructions etc. We paved a lot of these roads with (really fucking bad) strategies like containers. At some point maybe we will learn to write software that is operationalizable. Projects like tigerbeetle give me hope that this is our next evolution.


> the AWS policy execution has a number of “operators” and “operands”

That is correct.

The IAM condition language is flexible and does not prevent you from doing strange things.


It’s a consequence of weak typing choices - not an inevitable result of allowing flexibility.

Doing glob matching on account IDs is like doing concatenation with guids, applying a bitshift to a UTF8 string, or running a regex on an integer. It is a nonsensical operation, and - as shown here - results in surprising security properties of the resulting system.

Surprising security properties are an undesirable result in an access control policy language.


They have ARNs, which include the account ID, and there a glob match is useful. Something like "arn:aws:*:*:1234567890:*" is useful, but "arn:aws:*:*:1234567*:*" isn't.
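A small sketch of the difference, using Python's `fnmatchcase` as a stand-in for IAM's glob matching (an approximation, not IAM's actual implementation). The role ARNs are made up, and the account IDs reuse the ten-digit placeholder from the comment; real account IDs have twelve digits.

```python
from fnmatch import fnmatchcase

# Hypothetical ARNs for two different accounts.
ours = "arn:aws:iam::1234567890:role/example"
someone_else = "arn:aws:iam::1234567111:role/other"

whole_account = "arn:aws:*:*:1234567890:*"   # wildcards around a full account ID
partial_account = "arn:aws:*:*:1234567*:*"   # wildcard inside the account ID

for pattern in (whole_account, partial_account):
    print(pattern, fnmatchcase(ours, pattern), fnmatchcase(someone_else, pattern))
```

The first pattern matches only the intended account, while the second also matches the unrelated one that happens to share a prefix.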


Of course it's possible to create a more sophisticated system with less possibility for user error.


> there is no legitimate reason why you would grant or deny permission based on a partial account id match.

You're not a fan of my AWS lottery idea where accounts ending in 666 get access to a free bitcoin miner??


I love the idea, but maybe we can pair it with my ICO idea of flipping a virtual stack of Pogs to unlock new units of currency that can be mined.


That's the part that surprised me as well; it doesn't seem like a field that should be eligible for anything other than an exact match. I am unable to conceive of a use case for pattern matching account IDs.


If for some reason you’re dealing with thousands of accounts that are architecturally indistinguishable, bucketing them by ID prefix isn’t a particularly wild thing to want to do.


AWS assigns these individually, and customers can’t influence the ID that they get. For access control purposes I see no valid use case for wildcards there.

Sharding on account ID might make sense if someone has a large number of them, but that would not necessitate wildcard matching.


But it could seem like a neat and obvious way to reduce policy size (which is limited) and make it arguably more readable, or at least the intention clearer. (I might assume `2847373847261`, `37385857721`, `5847262671`, ... is `*1` over our accounts, but I might be wrong, or I might forget (/not correctly automate) to add the new one.)


It sure could. If you're sharding by id and have some per-shard resources, they could definitely get permissions to only accounts 12345*. (I'm not saying it's a good idea, just that once you're in that situation, you would pattern match on partial IDs)


But account IDs are assigned by Amazon and there's no structure within the namespace that's useful to you. If you mean all, you can wildcard * - but there doesn't seem to be any legitimate cases for "all account ids beginning with a 1".


Presumably the permissions language is broadly defined and has something like

    filter = property, operand, value
with few constraints on which operands can be used in which situations, to keep parsing the language simple (parsers being notoriously prone to vulnerabilities after all). In retrospect perhaps that isn't a good trade-off, but it would be tricky to tighten things up now without breaking lots of existing users.
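To make that guess concrete, here is a minimal sketch of such a loosely typed evaluator, plus a hypothetical stricter variant. Nothing here reflects how AWS actually implements policy evaluation; the names `OPERATORS`, `ALLOWED_OPERATORS`, and `account_id` are invented for the example, and the slot the comment calls "operand" is named `operator` in the code.

```python
import re


def string_equals(actual: str, expected: str) -> bool:
    return actual == expected


def string_like(actual: str, pattern: str) -> bool:
    # Treat * as "any run of characters" and ? as "any single character".
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, actual, flags=re.DOTALL) is not None


OPERATORS = {"StringEquals": string_equals, "StringLike": string_like}


def evaluate(filter_, context):
    """Loose evaluator: any operator may be applied to any property."""
    prop, operator, value = filter_
    return OPERATORS[operator](context[prop], value)


# Hypothetical tightening: restrict which operators a property accepts.
ALLOWED_OPERATORS = {"account_id": {"StringEquals"}}


def evaluate_strict(filter_, context):
    prop, operator, _ = filter_
    allowed = ALLOWED_OPERATORS.get(prop)
    if allowed is not None and operator not in allowed:
        raise ValueError(f"{operator} is not permitted on {prop}")
    return evaluate(filter_, context)


context = {"account_id": "123456789012"}
# The loose evaluator accepts a glob on an opaque identifier without complaint.
print(evaluate(("account_id", "StringLike", "1234*"), context))
```

The strict variant shows the compatibility problem the comment mentions: any existing policy that already relies on the looser behavior would start failing.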


At that point you should be using AWS Organizations and OUs.


I think the GP is talking about granting access to a particular bucket to an unbounded number of customer AWS accounts — probably in requester-pays config. (Think: static data used by an Amazon Marketplace virtual appliance.)


Those wouldn't follow any particular partial pattern though would they?


They don't need to — in fact, you might be depending on them not following any particular pattern. Think: treating the Account IDs as pre-hashed keys, and then specifying prefix patterns as ways of sharding the hash keys onto a set of buckets, to evenly distribute access (and therefore traffic) by customer.
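A rough sketch of that sharding idea follows. The bucket naming scheme and the one-digit prefix length are assumptions for illustration, and the approach only spreads load evenly if account IDs are roughly uniformly distributed, which the thread goes on to question.

```python
def shard_for_account(account_id: str, prefix_len: int = 1) -> str:
    """Map a 12-digit account ID to a shard name based on its leading digits."""
    if len(account_id) != 12 or not (account_id.isascii() and account_id.isdigit()):
        raise ValueError("expected a 12-digit account ID")
    return f"shard-{account_id[:prefix_len]}"


def wildcard_for_shard(prefix: str) -> str:
    """The account-ID pattern a per-shard policy condition would carry."""
    return f"{prefix}*"


# Hypothetical customer account IDs.
for account in ("676363687541", "123456789012", "098765432100"):
    print(account, "->", shard_for_account(account))

print(wildcard_for_shard("6"))
```

Ten one-digit prefixes give ten shards; a two-digit prefix would give a hundred.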


Do you ever really need to do that on AWS though? Can you really bog down S3 by having loads of requests to a single bucket?


GET requests? Probably not. PUT/DELETE requests? I think so, yes. All updates to the bucket ultimately bottleneck at an update to the meta-version of the bucket-record-object in the object-storage-system's bucket-metadata store (itself probably something like DynamoDB / BigTable / etc.)

Given the way these IaaSs' distributed KV stores all manage writes (i.e. by having a cluster of transactor nodes that per-key write-linearization responsibility for parts of the keyspace is sharded across — such that writes are fanned out to a particular designated transactor-node given the key's hash-slot), a very large S3 user, generating an extremely high level of metadata-update concurrency against a bucket, could very likely write-contend that bucket's metadata / have a "hot" bucket-metadata key; experience low perf due to that; and solve that by sharding the bucket (swapping one too-hot metadata key for N somewhat-hot metadata keys.)

I want to give an intuition-building example here, of an IaaS feature that wouldn't exist / wouldn't be exposed to the user if not for object-storage buckets being metadata-write-contended at scale. I'm not very familiar with the AWS ecosystem, though, so I'm not sure what the good example is for AWS. What I do know is GCP, so here's a GCP example: Google Cloud Dataflow allows you to set a temporary workspace GCS bucket on a per-job basis (gcsTempLocation). And, IIRC, Google's Cloud Architects advise to not have a bunch of active Dataflow jobs sharing the same gcsTempLocation — regardless of whether they use distinct key prefixes to namespace the temp files. Given that each job would be doing a lot of little serial updates to the temp bucket — and given that Dataflow jobs can each be highly internally concurrent — you're already potentially putting out O(N^2) ~concurrent updates to that bucket. You really don't want to make it O(N^3).


Bucketing them by prefix of the end-user ID is not exactly smart.

Either you bucket by an internal ID and give the user a hash, or you give the user an ID and bucket by your internal hash.

Users have no business knowing your sharding scheme.


Can you even block off IDs like this?


If you’re big enough to need it I’m sure it can be arranged.


If you're big enough to think you need it, you're also big enough to have people who can tell you it's a bad idea and there's a better tool for the job.


aws support is not going to bend over backwards just to let you shoot yourself in the foot. it's more likely they grant an exception to one of the iam quotas.


If there are a lot of them and there is a b-tree index somewhere you might find it useful to scan them in alphabetical order.


Well, that assumes that the ID is cryptographically random. Perhaps that is a bad assumption.


My general assumption is not that they’re random, but at least that they’re not correlated; in particular that Amazon is not in the habit of handing out, like, account IDs 676363687000 - 676363687999 to a single organization. Even if they did hand out a sequential batch of 1000 account IDs, it would be more likely to be 676363687541 - 676363688540 than a set with a single consistent prefix.

Odds are that an account wildcard match like 676363687* will just match a few hundred entirely random AWS accounts.
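The arithmetic behind this can be sketched quickly. The count below is the size of the ID space a prefix covers, which is only an upper bound; how many of those IDs are actually assigned is not public.

```python
ACCOUNT_ID_LENGTH = 12


def ids_covered_by_prefix(prefix: str) -> int:
    """Number of possible 12-digit IDs that start with the given prefix."""
    if not prefix.isdigit() or len(prefix) > ACCOUNT_ID_LENGTH:
        raise ValueError("prefix must be 1 to 12 digits")
    return 10 ** (ACCOUNT_ID_LENGTH - len(prefix))


print(ids_covered_by_prefix("676363687"))  # 1000 possible IDs

# A sequential batch of 1000 IDs that straddles the prefix boundary,
# as in the example above: only part of it shares the prefix.
batch = range(676363687541, 676363688541)
matching = sum(1 for n in batch if str(n).startswith("676363687"))
print(matching)  # 459
```

So even a sequentially allocated batch would be only partly covered by the wildcard, while the wildcard would still cover hundreds of IDs outside the batch.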


> in particular that Amazon is not in the habit of handing out, like, account IDs 676363687000 - 676363687999 to a single organization

Honestly, it wouldn't surprise me that much if they were willing to accommodate this for sufficiently large accounts. It'd still be pretty sketchy to design your access control around, but it wouldn't be unrealistic.


I once was involved in creating two (linked) amazon accounts at the "same" time, and ended up with account IDs of which the first 4 digits are identical.


A namespace you and only 100 million other accounts share - probably reasonable to just grant access to “1234*”.


Even if it's not, that doesn't mean pattern matching IDs is good.


Oh god, no. Seems like a bad idea indeed! But might give some insight into their system.


it's irrelevant whether they're "cryptographically" random, all that matters is that account IDs are not controlled by the user and therefore have no logical relation to any access-control policies the user may wish to implement


If it's sequential, that somehow seems worse?


Probably comes out of a tendency to generalize. Last week for one of my side projects I built something that lets you write queries in a format inspired by OWL and it has a library of relational operators that can, say, extract the host from a URL or do a prefix query, like query, regex query, etc.

As it is my side project’s side project I do what is easy, so these operators are always available even in cases where they don’t make sense (I dunno what happens if you try a regex query on a number, I don’t care). I can imagine there is something a bit like this inside AWS, but for a security-sensitive system with a lot of users you have a different standard.


This is like matching a bitfield on the group ID on a Unix system. I could see someone coming up with this idea, and I could see them thinking this is somehow smart, but implementing it on systems that aren't 100% in your control would just be silly.


While I wouldn't publicly hand out my account IDs as a general practice, I think you have to expect that some of them will be disclosed at some point. As more third party vendors and SaaS platforms move away from IAM users and access keys to using role assumption as the preferred method of integration (as they should!), the account ID of at least the account you use as their integration point is now known by another party, who have their own dependencies, vulnerabilities, etc.


This is what I’m curious to learn. What can an attacker do with an AWS account ID? How is that any different from knowing someone’s email address?


If the account and its resources are properly configured - not much.

But that can be easier said than done for many organizations, especially when you have lots of different teams configuring their own environments.


Once I have an AWS account ID, my next trick is to grant cross-account bucket policies to discover role names in the account.


How does this work?


If you put a role ARN in the principal section of a bucket policy, AWS will check whether the role exists and fail the policy update request if it doesn't, even if the role is in a different account. I don't know if there's another way, but you can manually enumerate roles from there.
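As a sketch of the kind of policy document being described, the snippet below only builds the JSON and makes no AWS calls. The bucket name, account ID, role name, and action are all placeholders invented for the example.

```python
import json


def bucket_policy_with_principal(bucket: str, role_arn: str) -> dict:
    """Bucket policy naming a role (possibly in another account) as principal."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:GetObject",  # assumed action
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


# Placeholder values, not real resources.
candidate_role = "arn:aws:iam::123456789012:role/ExampleRole"
document = bucket_policy_with_principal("example-bucket", candidate_role)
print(json.dumps(document, indent=2))
```

According to the comment, whether the service accepts or rejects the update is what reveals if the named role exists.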


Useful social engineering datapoint


AWS account ID == Your IP address. It may be sensitive, but someone needs to know it to get s*$t done.

Illustrative example: I had to deal with a third party that we needed to integrate with because of anti-money laundering procedures a year or two ago. I wanted my team to set up a PrivateLink with the organization because that's generally more secure than an open SFTP port. The company refused, citing security reasons for hiding their account ID (it's needed for the role ARN used for reciprocal permissions to PV endpoints). So what did we do?

We ended up whitelisting a range of public IPs they use for inbound port 22...

Moral of the story: you may think you are a genius for obfuscating your IDs, but you can't really run a business unless people have an address back to you


AWS PrivateLink has another property that generally makes it undesirable for these types of integrations: communication is bidirectional, and IP subnets should not overlap.

We (as a vendor ourselves) typically integrate as a VPC Endpoint Service, where communication is unidirectional and our service is exposed as a load balancer’s endpoint within the customer’s VPC.


I thought PrivateLink was branding for VPC interface endpoints? There's no IP subnet restriction for that because it's basically a proxy. Are you thinking of VPC peering?

VPC endpoints seem preferable in this situation.


Oh I apologize, you’re completely correct and I am confusing PrivateLink with VPC peering. Thank you for correcting me.


> get s*$t done

What were you hoping to achieve with this utterly pointless self-censorship?


For those interested, we put the code online here: https://github.com/tracebit-com/find-s3-account


I am not sure this would be in agreement with these policies, or at least the spirit of them: https://aws.amazon.com/security/penetration-testing/


OP's article said they consulted with Amazon's security team before publishing, so I imagine they know what's allowed in this case.


It says he consulted but does not say what their answer was. I can't imagine it was a thumbs up, probably an embarrassed silence?


Yes, for the avoidance of doubt - we got the OK from AWS to publish this research


Reminds me of the old slogan for Kix cereal.

"Kid Tested. Mother Approved."

Kids tested it but we don't know if they approved it. We don't know if mothers tested it; we only know they approved it.


Why all the doubt? "not sure" "can't imagine"

When the source says they already did their due diligence...


The initial text was ambiguous but the author has now clarified their answer in this thread. Do you really think they were happy with this? I actually think this might open other attack vectors.

I agree that the account number just by itself is not a secret, but there is a reason why all AWS demo videos mask the account number.


They can fix the bug if they don't like it.


This is my attitude towards security disclosures. In this case, Amazon approved the disclosure. But even if they hadn't, it's better for the good guys and bad guys to know about problems when the alternative is only the bad guys knowing (or the bad guys and a few good guys at the affected company).


For sure an interesting find, but was kinda hoping based on the title that there was a more straightforward way to do this.

I really wish that AWS had a simple way from an admin account to ask "where is X resource" within an organization to quickly tell me which account has a specific S3 bucket (and other things, but s3 buckets is the big one).

Admittedly this is mostly an issue with legacy buckets that existed before better practices and buckets all being defined in code. But with a ton of AWS accounts it can be tedious to hunt down a resource in an unknown account and possibly region.


If you have AWS Config set up for the organization (with an aggregator), you'll get an Athena-SQL-queryable inventory of all your resources from all organization accounts.

So finding out which account owns a resource can be as simple as, roughly: select accountId where arn = "x"
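A slightly fuller version of that query, as a sketch: the helper below only builds the expression string. The property names `accountId`, `awsRegion`, and `arn` are assumed to be available in the AWS Config query schema, and the bucket ARN is a placeholder.

```python
def owner_lookup_expression(arn: str) -> str:
    """Build a query asking which account and region hold a given resource."""
    if "'" in arn:
        # Reject quotes instead of trying to escape them in this sketch.
        raise ValueError("ARN must not contain single quotes")
    return f"SELECT accountId, awsRegion WHERE arn = '{arn}'"


expression = owner_lookup_expression("arn:aws:s3:::example-legacy-bucket")
print(expression)
```

The resulting string would then be run against the organization's aggregator through whichever query interface you use.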


You can also do this with Steampipe.

It might not scale well beyond tens of accounts though, depending on your query…


... how did I not know this existed.

That is exactly how we are set up. The amount of time I just spent going account by account looking for a specific resource...

Thank you! I have long wondered why it didn't exist, and apparently it did...


Be aware that AWS Config is not free. https://aws.amazon.com/config/pricing/


Yeah it's pretty nice feature wise but surprisingly expensive given all it really does is run API calls in a loop and export to S3


Other public AWS resources with global namespaces also reveal AWS account IDs:

https://blog.plerion.com/conditional-love-for-aws-metadata-e...


Slightly related - CloudFlare account_id and zone_id are safe to be public

https://github.com/cloudflare/cloudflare-docs/issues/474

https://community.cloudflare.com/t/api-zone-id/355566

> The Zone ID and Account ID are not sensitive. Sensitive data like account API Key, Secrets etc. can all be revoked, rotated or changed. See the comment 36 below on the Wrangler repo: as per our security team, it’s completely Fine to have your zone_id and account_id public, the Global API key and associated email address should be kept secret.


AWS account ID is also safe to be public.

That said, one thing I could think of that this could be used for is correlation. If you’re running multiple S3 sites from the same AWS account, people would be able to see that they’re hosted by the same account. Whether or not this matters depends on your threat model.


Exactly - this isn't going to open the door for someone but could add a ton of value to enumeration.

As we are very canary focused, we also think it's interesting to consider the implications of the recent research from Truffle Security w.r.t canary tokens (https://trufflesecurity.com/blog/canaries).


> AWS account ID is also safe to be public.

Not necessarily. An AWS account ID + the knowledge of a role name that by mistake has the "allow role assumption" allowlist too wide (say "*") is now enough to take over the account.

One might of course say "well then don't do that", but of course the more complex a system like IAM is, the easier it is for inexperienced people to open the floodgates.
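A defensive check for that mistake could look roughly like the sketch below. It inspects a role trust policy document offline and flags Allow statements whose principal is a bare wildcard with no condition. The function name is invented, and this is not a complete audit of everything that can make a trust policy too permissive.

```python
def wildcard_trust_statements(trust_policy: dict) -> list:
    """Return Allow statements that trust any AWS principal unconditionally."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]

    findings = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal")
        if principal == "*":
            aws_principals = ["*"]
        elif isinstance(principal, dict):
            aws_principals = principal.get("AWS", [])
            if isinstance(aws_principals, str):
                aws_principals = [aws_principals]
        else:
            aws_principals = []
        if "*" in aws_principals and not statement.get("Condition"):
            findings.append(statement)
    return findings


too_wide = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Principal": {"AWS": "*"}, "Action": "sts:AssumeRole"}
    ],
}
print(len(wildcard_trust_statements(too_wide)))
```

A statement that names a specific account or role, or that carries a condition, is not flagged by this check.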


Well sure, but your account email isn't safe to be public if your password is "password".


Email providers have rate limits against specific user logins; IAM does not.


How do you know that?


That's easy to find out: change the API credentials of a user, but forget to update the service. Notice only a few days later that you forgot the change, but you also never got any notification "something" is going wrong.

In contrast, every half-decent IdP will lock an account automatically after anything from 3-10 wrong attempts.


Turning what you said around, you're arguing you might want to keep an account ID secret for "security by obscurity" reasons. In my mind, even in a multi-layer security solution, even then the account ID should be considered as a public string whose knowledge (along with other bits like what misconfigurations it has) provides no additional vector of attack, because of defense in depth.


Oh, there are other things... like a compromised chrome extension that only looks for your accountID being used and sends back credentials.

Knowing an accountId tells you where to focus your efforts. You got through one hoop (of many).


For CF accounts, I use the gmail `+ feature` to make a totally unique email address that cannot be easily guessed. Not perfect, but adds another layer of abstraction.


If I remember right, the pre-signed URL generated for R2 uploads includes the account ID in it.


Related: AWS key IDs (not the secret key part) include your account ID within them, bitshifted by one position:

https://medium.com/@TalBeerySec/a-short-note-on-aws-key-id-f...

These key IDs are included in the URL for pre-signed links to S3, so there's a good chance you've already been publishing your account ID.
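The decoding can be sketched as follows. This follows the scheme as publicly described in write-ups like the one linked above (base32-decode the part after the four-character prefix, then mask and shift); it is not an officially documented format and may not apply to older keys. The key ID in the example is an illustrative sample string.

```python
import base64


def account_id_from_key_id(key_id: str) -> str:
    """Recover the 12-digit account ID embedded in an access key ID."""
    body = key_id[4:]                        # drop the prefix, e.g. "AKIA" or "ASIA"
    decoded = base64.b32decode(body)         # 16 base32 characters -> 10 bytes
    first_six = int.from_bytes(decoded[:6], byteorder="big")
    mask = 0x7FFFFFFFFF80                    # skip the top bit, keep the next 40 bits
    account = (first_six & mask) >> 7
    return f"{account:012d}"                 # account IDs keep leading zeros


print(account_id_from_key_id("ASIAQNZGKIQY56JQ7WML"))  # 029608264753
```

No secret key is involved at any point, which is why a key ID appearing in a pre-signed URL is enough.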


Quite a few people in this thread assume that the AWS key id is part of a "security by obscurity" "protection in depth".

This will probably be downvoted, but if you read this anyway: this is a good example of why "security by obscurity" is not a good defense. You will overlook something (a determined attacker will not)

Anything that is not "security by obscurity" does not depend on whether you understood something or not - it will apply, no matter what, as long as the attacker doesn't have a genius on the payroll who cracks e.g. AES-256 just like that (https://www.youtube.com/watch?v=KEkrWRHCDQU)


To me security by obscurity is limited to things like this:

There is a way to view bananas at

/bananas/:bananaUUID

unsecured endpoint.

I don’t want people to get all my banana data, but as long as there isn’t an easy way to list banana uuids, that endpoint is basically effective security by obscurity.


That's an unfortunately common misconception. Your example is not security through obscurity any more than password authentication is.

Security through obscurity means substituting a flawed algorithm for real security, one that is usually trivial to exploit once the attacker is made aware of the algorithm. Think things like no authentication and ROT13ing and Base64ing clientside. If the method leaks or is discovered, the whole system is broken.

You just told me your algorithm and I cannot get to your banana because the UUID key space is insanely large. So that's not security through obscurity.


There are some important caveats to consider: Client and server software will not handle URLs like secrets, so UUIDs will leak out through various channels. Some examples include logs, user analytics, ad networks, browser history, bookmarks, e-mail, instant messages, shady browsers, shady ISPs, referrer headers, etc. You cannot rotate resource identifiers without breaking clients, so a leaked URL is permanently leaked.

Hopefully you're using version 4 UUIDs. Those set aside 6 bits to encode UUID details, keeping 122 bits of entropy. Since every banana needs its own identifier, subtract the number of bits needed to uniquely represent bananas. What's left will unavoidably be less guess-resistant than client secrets. Other versions of UUID use many more bits for low-entropy purposes.
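The bit-counting above can be sketched in a few lines. The number of bananas is a made-up figure; the point is only that each valid identifier an attacker could hit reduces the effective guess resistance.

```python
import math
import uuid

RANDOM_BITS_V4 = 122  # 128 bits minus 4 version bits and 2 variant bits


def guess_resistance_bits(valid_ids: int) -> float:
    """Bits of work to hit any one of `valid_ids` random version 4 UUIDs."""
    return RANDOM_BITS_V4 - math.log2(valid_ids)


banana_id = uuid.uuid4()
print(banana_id.version)               # 4
print(guess_resistance_bits(2 ** 20))  # 102.0, for about a million bananas
```

That is still a large number, which is why the leak channels listed above tend to matter more in practice than raw guessing.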


OWASP has some specific guidance on this while clarifying the need for access controls: https://cheatsheetseries.owasp.org/cheatsheets/Insecure_Dire...


How might this matter? An obvious one: Given a production bucket, it’s now possible to find development buckets for that same org, which is not expected behavior IMO.


Only true if they use the same accounts for both production and development. This would be another reason not to use the same accounts.


You only need the bucket name to do that. You should include a randomly generated prefix/suffix in bucket names to protect against such enumeration attempts. Another good idea (as well as, not instead of) is to expose objects in buckets publicly with a non-default host name, such that the bucket name isn’t leaked at all.
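A minimal sketch of the random-suffix idea, assuming a hyphen separator and a 16-character hex suffix (both arbitrary choices for the example):

```python
import secrets


def randomized_bucket_name(base: str, suffix_bytes: int = 8) -> str:
    """Append an unguessable hex suffix to a human-readable bucket name."""
    suffix = secrets.token_hex(suffix_bytes)  # two hex characters per byte
    name = f"{base}-{suffix}"
    if not 3 <= len(name) <= 63:              # S3 bucket names are 3 to 63 characters
        raise ValueError("bucket name length out of range")
    return name


print(randomized_bucket_name("acme-dev-assets"))
```

Someone who knows the production bucket name can then no longer guess the development one by swapping `prod` for `dev`.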


Or, for read scenarios, putting a CloudFront distribution in front of the bucket!


How so?

You'd have to know the name of the (development) bucket first, right?


And this is why you pad the bucket name with random chars.


And CloudFormation will do this by default.


Unlikely, unless dev buckets are somehow in the same account.


it's bad practice but it's more common than you think especially with older accounts (pre-organization)


> While account IDs, like any identifying information, should be used and shared carefully, they are not considered secret, sensitive, or confidential information.

https://docs.aws.amazon.com/accounts/latest/reference/manage...


Seems like at least in the digital world, there is either public or private information, and that's it. We don't really have a good concept of privilege or protected information.

For example, my home address is technically public, but I most certainly wouldn't want it lambasted across the interstate with a picture of my family next to it advertising where I live. It's handed out on a need-to-know basis, and I mostly trust / expect that it's kept mostly confidential, or use-limited.


One huge mistake that Google made when they were integrating YouTube with Google+ was the idea of sharing people's YouTube comments with their G+ friends. YouTube comments have always been public, but there was huge customer pushback, forcing them to revert this idea, since in people's minds there is a huge difference between public and publicized comments.


Yep, that and other things like reviews you did went straight to your Google+ page.

I really wanted G+ to work, but they were just too stupid to understand that this was a deal-breaker.


Also G+ made your email address public to your friends/circles/etc and I don’t think there was a way to disable it.


What does this mean?

If they're not secret, sensitive, or confidential, then why must they be shared carefully?


Usually, the more information an attacker has about you, the higher the chances of coming up with a successful attack vector.


„So let’s give them an API to make their job easier!”


Maybe information that's not secret or confidential, but could still be used to exploit you, should be called something like "sensitive" information.


There is plenty of information that you wouldn't necessarily want to publish, but wouldn't be the end of the world if it were leaked either.


A hacker can enumerate the resources of and access to an account by using its account ID. Many AWS customers incorrectly configure resources such that any other AWS customer can access things in their AWS account; for example, some 3rd party providers tell customers to configure access for the 3rd party in a way that can lead to wide-open access for anyone who knows the customer's account ID. If the cloud admins don't know what they're doing it can be very easy to screw up and not realize it.

It's sort of like giving someone your IP address. By itself it's not enough to hack someone. But if your host is insecure, it sure makes it easier knowing exactly where to attack.


There are levels of data classification, and different regulations and policies apply to each level and geo. They are probably disavowing themselves of any liability.

https://docs.aws.amazon.com/whitepapers/latest/data-classifi...


This is a good question for any provider like AWS--what kinds of information do I leak with seemingly mundane choices like bucket names?

The other attack vector is from insiders. Many organizations "shield" identifiable information behind UUIDs or some other scheme. In the event of a breach, the UUID might mean nothing to most (it's not foolproof, though), but opens more doors for an insider.


My personal phone number is not exactly secret or confidential, but I only share it carefully.


> they are not considered secret, sensitive, or confidential information

... by us (it should say).

Users may consider it differently.


Any scenario where the actual hack results in the time-honored trope/misunderstanding of "hacking" the password one character at a time is awesome
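The reason the trope works here can be shown with a purely local simulation: if some check answers yes or no to "does the account ID match this wildcard pattern", the ID falls out one digit at a time. The oracle below is a stand-in that compares against a made-up ID in memory; it makes no AWS calls and does not reproduce the article's actual tooling.

```python
import re


def string_like(value: str, pattern: str) -> bool:
    # * matches any run of characters, ? matches a single character.
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def make_oracle(secret_account_id: str):
    """Simulated yes/no check; also counts how often it is asked."""
    calls = {"count": 0}

    def allowed(pattern: str) -> bool:
        calls["count"] += 1
        return string_like(secret_account_id, pattern)

    return allowed, calls


def discover_account_id(allowed, length: int = 12) -> str:
    known = ""
    while len(known) < length:
        for digit in "0123456789":
            if allowed(known + digit + "*"):
                known += digit
                break
        else:
            raise RuntimeError("no digit matched; the oracle is inconsistent")
    return known


oracle, calls = make_oracle("123456789012")  # made-up account ID
print(discover_account_id(oracle), calls["count"])
```

The search needs at most 10 queries per digit, so at most 120 in total, instead of up to 10^12 guesses at the whole ID.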


Account IDs are 12 digit random numbers. They are used to identify an AWS account, that's all. Knowledge of the full 12 digits doesn't grant access to anything, prove you own the account, or enable you to authenticate to any systems (or individuals like in customer service for example) in AWS.

They are visible whenever you share something with another AWS account; they're in the ARN. For example, all the vendors that vend AMIs, assume roles on AWS accounts (think Datadog, for logging / metrics), or otherwise provide services have AWS account IDs that are well known and easily discoverable. This S3 example is just sort of interesting since it's one of the handful of AWS services that don't use account IDs in ARNs.

AWS account ids are not secrets and treating them as secrets or giving the impression that they are anything other than public data is a distraction from real security concerns.


What value is there in allowing wildcard account id matching? I can't think of a legitimate use case.


It's likely only allowed because someone didn't think to explicitly disallow it.


This is probably the rule engine of the policies being used for something less useful


I have always likened AWS account IDs to public keys, but even less useful. There is nothing useful that you can do with them without other information.


I think the more worrying attack vector is when you now use the account number to try and Allowlist a principal from that account in some policy in your account. If the principal doesn't exist in the other account you'll get a role/user not found error!

Presumably you could use this to find real principals in the other account.


There seems to be a large discussion of whether account IDs are "secret" or "private" or "confidential" or whatever.

From my point of view, that entirely misses the point. The problem here is that what's revealed here is the relationship between buckets and account IDs, which allows discovery of shared ownership of buckets (unless you use a micro-account approach).

I probably don't care if you can discover that 2343242365 is the account number associated with "coolbuttplugs.com" but I probably do care if the same account hosts a bucket for "michaeljfoobar.name" and my buttplug thing is a sideshow from my white shoe law practice.


Accounts on AWS are pretty cheap (free?) - why would you host everything on the same account?


In my company we use reseller billing as AWS do not have a local billing entity. The reseller owns the organization's root account and we do not have access to it. Every subaccount creation requires a support email to the reseller.


free as in: AWS does not charge you

not free as in: you have to manage it (for example give a CI role access via OIDC, create a role for you to assume to do stuff via the console, etc)


My favorite factoid about AWS accounts is that there's a global, hard rate limit for their deletion. It's actually a pain point for us.


You may be underestimating human laziness...


Interesting post. I wonder if you can further combine it by using a PrincipalTag in some way? You can assume a number of roles with different tag values, and these can be interpolated into the condition. This lets you do things without a huge statement?


Will it make it easier to identify the operators of malvertising campaigns hosted on AWS?


Doubt they use their real information on their AWS accounts.


Just need enough information to send to either Amazon or the FBI.


Pretty sure AWS could always look up the info of an account if a bucket was used for crimes. Adding the name of a fake account probably doesn't add anything for them.


Shouldn't the bucket name / URL be enough?


The answer to these types of questions is, depends upon who gets the case. Generally, the further along you've gotten within legal limits, the better the chance that the case will be given serious consideration and time.


Can we go further, and find the email associated with an account ID?


I don't think we can.

The ID is static while the email can be changed.


Right, we can't. But we can find other associated information with Hunter.io-like tools.

We are doing something like this here, but finding links between apps and SDKs: https://www.fork.ai/technologies


How about the other way: email -> account ID?


TIL: AWS Account IDs are considered secrets. IMO, if a given value cannot be cycled, it is sensitive but not secret.


AWS Account IDs are not secret and don’t need to be. AWS doesn’t design anything that assumes your account ID is secret, and you shouldn’t either.


While I agree, the way you wrote this might mislead people into thinking that you meant you can treat your account ID as non-secret because AWS does. That doesn't directly follow; the difference between AWS's point of view and your company's point of view means there are things you might care about that AWS does not. Rather, you need to follow AWS's lead and design your cloud deployment such that the account ID doesn't leak anything interesting about your business that you'd prefer to keep private. Correlated ownership of buckets (e.g. different clients of the same design firm) in particular is something AWS doesn't care about, but you might. If you don't want people to know that the buckets have the same owner, better split it up into separate accounts to preserve the non-secretness of the account ID.


No, this is wrong. The fact that AWS does not consider the ID a secret means that your company never should either. If your company “cares” about its ID being secret, then your company is designing its systems in an insecure way.

The fact that AWS does not treat the ID as secret means you have no guarantees that anyone within AWS cannot see or find your ID. You also have no guarantee that AWS at some point won’t expose your ID to the world and break your entire security model, because AWS doesn’t think it’s secret. If you do, you’re basing your security off of false assumptions.


You've said "this is wrong" and then repeated exactly my point back to me. I suspect some misunderstanding has occurred here. We are agreeing on this--you need to follow AWS's lead and design your cloud deployment such that the account ID doesn't leak anything interesting about your business. That's a direct quote from my original post. I further gave an example--if you're a design firm using a single account for different clients, you're leaking the information that they are your customers. Better split it up into separate accounts to preserve the non-secretness of the account ID--another direct quote from my original post.

Stated another way: you can unintentionally make your account ID sensitive if you're not careful. You have to be careful.


I think we agree, but either your meaning or your words are giving me pause.

> you can treat your account ID as non-secret because AWS does. That doesn't directly follow;

It does follow, and not only “can” you treat them as non-secret, you _must_ treat them as non-secret.

> the difference between AWS's point of view and your company's point of view means there are things you might care about that AWS does not

The point here is that if you want to have good security, you _cannot_ “care” about this if your service provider does not also care about it. If you “care” about your ID being public, but your provider does not, then if you want to have good security you must either find a way to not care, or find another provider.


Correlation of bucket ownership isn't a security issue at all. I never used that word in my posts, and the author of this article also never suggested that it was. That's the point--there are other considerations that AWS does not care about, but your business does. Your cloud deployment needs to be designed such that the account IDs do not leak information that your business doesn't want to be revealed. You don't get the non-secretness for "free"--you have to think about it and be careful on how you isolate things into accounts. As far as I can tell, we agree on all of this and have just suffered some misunderstanding in this thread.


>The fact that AWS does not consider the ID a secret means that your company never should either.

You and gp are talking past each other.

Your focus is that the account ID being public should not be a security vulnerability.

Instead, the gp's focus is on metadata leakage of identity.

Same type of conversation that differentiates concepts of "public key" vs "published key" of SSH keys:

https://news.ycombinator.com/item?id=29209312


No, this is wrong. Just because it is not a secret does not mean that its disclosure is not valuable. You need that information to compromise it.

It narrows all the possible account IDs to one.

Is that ID already compromised? Can you gain access through someone else that does the work for you?

You can cross reference with other systems you have compromised. Is that account ID in their system? What access does that give you?

Etc, etc. It is not a secret, but it absolutely is valuable.

For example, if you considered it a secret (even though it is not), you might not choose to use third parties that require it, and that can improve your security posture.


No, operational security might still be a valid concern even where there are no issues with digital security.

You might have a 100% secure system, but you don't want your competitor to know exactly what you are doing. You might also not consider their knowledge of what you're doing to be a vulnerability, nor think that you should spend many resources on preventing them from knowing, you just don't want to make it easy for them.

Cryptography is very black-and-white. Business operational intelligence is not.


AWS doesn't consider them sensitive, but some organizations do. My feeling is that I would rather avoid leaking any info about my AWS environments. Just because AWS doesn't think they can be dangerous doesn't mean someone else won't find a way.


And those organizations are building their security model off of false assumptions, and are wrong.


Defense in depth, always assume that anything can get used as an attack vector including AWS itself.


It’s not defense in depth to build your security model off of false assumptions, it’s just bad security.


I don't understand your reasoning. Neither my name nor where I live nor my phone number nor my license plate are secrets. Yet I don't go around wearing a t-shirt with my personally identifying information printed on it. What am I missing?


This is posted in other places but Amazon explicitly says they're not secret, sensitive, or confidential.

https://docs.aws.amazon.com/accounts/latest/reference/manage...

> While account IDs, like any identifying information, should be used and shared carefully, they are not considered secret, sensitive, or confidential information.


If you knew that there was an assassin out there trying to find you and cause you harm, would you consider “I don’t wear a t-shirt with my address on it” to be any form of protection against that assassin?

If you want protection against that, you need to focus on better home security, hiring bodyguards, going to the police etc, and you’re better off assuming that the assassin will find your address regardless of whether or not you wear a t-shirt with it printed on.

Put another way: there’s a difference between “I don’t do this thing” and “I rely on not doing this thing for my safety”. The second one makes a lot more assumptions than the first one does, and those assumptions can lead to problems if they are false assumptions.


> Put another way: there’s a difference between “I don’t do this thing” and “I rely on not doing this thing for my safety”.

That was my point. The fact that some organizations consider AWS account IDs sensitive is independent of whether they rely on it being sensitive or not.

I might have taken all precautions against an assassin attack, yet I won't make the assassin's job easier by announcing my PII to them. The fact that I won't announce my PII says nothing about whether I took the security precautions.


If an organization considers it sensitive, that implies they’re putting some level of reliance on it being so. Otherwise there would be no point in considering it sensitive.

There’s a difference between “making it easier for an attacker” and using it as a security control, even if it’s not the only security control. The point is that even if you don’t go around wearing a shirt with your address on it, that should never factor in to your designs for security. It should never be considered a security control, even a “defense in depth” one.

In fact, your threat model should ideally ask the question “assume someone does walk around with a shirt with my address on it, will I still be safe?” That doesn’t mean you’re actually going to go do it, but if the answer is yes, that’s how you know you’ve done your job.


It's hubris to think any security measures are completely safe. Painting a target is a bad idea.


You’re still misunderstanding. I’m not saying you should go “paint a target” on yourself, I’m saying you should assume _someone else is_ going to paint a target on you, and defend yourself accordingly, rather than acting like the lack of a target protects you in any way.


> I’m not saying you should go “paint a target” on yourself,

Then I don't understand why you object so strongly to the tshirt example unless you're deliberately talking past the person that made it.


If you place any sort of security assumptions on AWS account IDs into your threat model, you're effectively directly introducing a security vulnerability. If you're not then why include it into your security threat model to begin with? I believe that is their point.

Since AWS does not, and has never, treated that information as secret, then there is absolutely no reason to consider it sensitive because there are no security guarantees with how AWS handles those IDs (as this article demonstrates).

Thus, either you're including them into your threat model as sensitive and thus immediately opening up yourself to vulnerabilities (bad security), or you're not including them at all (and thus not treating them as sensitive/secret/whatever). The argument the parent had (and that I agree with) is that you should do the latter unless AWS provides a means to work with those IDs securely (it won't because they're not secrets).


This is a false dichotomy. There is a deep chasm between "publish everything", and "this is a secret", called operational security.

Any time a topic like this comes up, there are people on this forum that try to apply the "security by obscurity does not work" principle to every security topic under the sun, when in reality, that principle really only applies to the world of cryptography. In meat space, where humans operate on plaintext, keeping a secret is a very valid approach to some topics. This is why things like NDAs exist.


How is it possible for a user of AWS to keep the account ID secret if Amazon doesn't even consider it secret? If Amazon leaked your account ID they could point to their docs and say the account ID was never meant to be a secret, sensitive, or confidential.


You can walk over to the user's desk and ask them not to share it. Whether or not Amazon leaks it is unrelated to my employees' ability to follow instructions.

There is a lot of data that exists in a space somewhere between "100% secret" and "100% public". This is one of those situations, for many organizations.


You’re wasting your employees time by asking them to keep it secret, when you gain absolutely no benefit from keeping it secret (and in fact are introducing an easy failure point by pretending it’s secret) and you have no guarantees that others are keeping it secret.

> This is one of those situations, for many organizations.

And those organizations are wrong.


Many organizations just have a blanket policy that you shouldn't be exposing data about an organization's infrastructure unless you need to do it. This is a good policy.

> by pretending it’s secret

No, nobody needs to pretend it is secret. You're missing my above point. There is not a dichotomy between secret and public. It is possible for something to be neither secret nor public.


And what OpSec benefit is there?


Well, as a very relevant example, if you tell others what your AWS account ID is, they can figure out if you own any particular bucket. The metadata association between the content of that bucket and the owner might give away information that the contents of the bucket don't indicate on their own. It also might not be a technical vulnerability, but that association itself could imply some proprietary business information. Or it could give clues to any would-be attacker as to other resources to target. In business, there are lots of types of information that are not secret, but are also not public.


If you assume it is a secret, you might start to use it as an access key, or similar things. That's where the error is. You should assume it is public (but avoid publishing it yourself of course, that's defense in depth)


I think making the assumption that account IDs won’t leak is the problem. It’s safer to assume that someone out there already has a directory of account IDs to buckets and that your efforts are better expended on other things, like actual secrets.


I've gotten in long arguments with senior IT people who refuse to believe this even when they talk to AWS directly about it.


I guess anything that looks like random letters/numbers is going to be considered a secret to some people. If it wasn't meant to be secret, it would be human readable. Or some such nonsense.


Yes. *

* Security through obscurity provides secondary security only, so it doesn't add to defense-in-depth significantly. If it can be added, then it is slightly safer to prefer to do so. Elimination of unprivileged enumeration and internal primary key predictability are relatively more important to reduce the attack surface.


AWS do not define account IDs as a secret (https://docs.aws.amazon.com/accounts/latest/reference/manage...) but until now it's not been possible to do this look up.


They’re not a secret, in fact it’s what you use to share the identity of one account with another, possibly third party’s account. It’s a secret in the way your personal email address is. You may not advertise it on 4chan, but you definitely share it to limited audiences. Once you’ve shared it it’s no longer an actual secret. Indeed, as has been noted, aws itself doesn’t treat your account id as a secret and freely shares it within aws, logs it in plain text, uses it for operations with human operators, internal reporting, etc. It’s an identifier, it’s discoverable, shareable, and leaks all over the place.

This is not the stuff secrets are made of.

Trying to eke some sort of security story out of hiding account ids plays right into security through obscurity. As it’s not treated as a secret the most you’re doing is obscuring the attack surface of your infrastructure. IAM doesn’t allow anyone to use the knowledge of your account ID to grant any privilege not specifically granted within the account itself via two way grants. Holes in your IAM policies aren’t protected by hiding the account ID, they’re protected by closing the holes.


Yes, but I do use random email addresses (that look like passwords) to stop correlation, concerted credential stuffing (that might lock me out for a time), etc. And if it pops up in a compromised server no one will know that email address is mine.


And aws also offers ideas like external IDs and similar concepts to do something like this.

But you might notice that even aws services advertise their own account ids. They’re not secrets and treating them as such doesn’t help you improve security.


AWS account IDs aren’t secret, but the identity of who owns particular S3 bucket names is meant to be.

If you get an email apparently from AWS that correctly names an S3 bucket and the associated account ID, are you more likely to take it seriously than an email that just names a bucket?


Not to mention it also means you can establish relationships between buckets by finding common ownership.

No, the account ID isn’t secret but I don’t think we should be dismissive of this new information either. It’s still important metadata to factor into decision making.


More generally, this opens up all sorts of threat models for espionage and alternative data. A private bucket called project-foobar-logs-staging might have its existence known in an index but never have been known to be associated with Baz Corp's publicly-known image buckets - but with this disclosure, a database will become available making that association possible. And it will be possible to monitor if and when project-foobar-logs-prod gets provisioned. Not to mention that state-level actors would love to see an enumerable list of projects their rivals are working on, and if any of them happen to have been secured only by obscurity in the past.

Edge cases, to be sure, but certainly nontrivial.


Everyone here throwing down the “technically not a secret” card but I don’t think that was your point. I guess “sensitive, but not secret” is the right way to say it. They’re obviously not secret like credentials, but it’s a damn good piece of info for an attacker to hold against you, because, like you said, it’s impossible to rotate, but is tangentially related to every single part of your AWS infrastructure (per-account of course).


It's not even sensitive or confidential though, Amazon says that themselves in their docs.


I guess I disagree with the AWS documentation then. It should be treated as a sensitive piece of information that should not be exposed to the public. It’s not going to bring you down if it’s leaked, but it could be used along with other leaked info over years in more sophisticated attacks. Why risk it is more the point.


Not at all.

In fact, when delegating IAM access (where security is top of mind), account IDs are shared liberally.

Account IDs are as secret as phone numbers. That bit of info could be tangentially useful to an attack, but really shouldn't be assumed to be secret in any meaningful way.


It's valuable info to any pentester. Security by obscurity is still one layer of security.


But there’s a reason people advise against security by obscurity: the obscurity can go away at any time and once it does it’s unrecoverable.

So sure, maybe pat yourself on the back today that on top of your other measures, no one outside your org and AWS knows your Account ID. But if it gets out at some point, those other measures should be foiling your pentesters on their own. In fact, to better test that this is the case, you should probably give the pentesters your account ID so you can be informed regarding your security in this scenario.


Unfortunately, until humans have crypto modules installed in their brains, we will have to rely on some form of "security by obscurity" that many colloquially call "operational security".

Yes, there are types of data and metadata about your company's infrastructure that can be used against you. But no, you shouldn't hand it to attackers on a silver platter.

> In fact, to better test that this is the case, you should probably give the pentesters your account ID so you can be informed regarding your security in this scenario.

That depends on the scope of the test. Many organizations will do both (and others) to test different layers of security. Remote software exploits are something that many people on this forum are concerned about, but that is hardly the be-all-end-all of security for an organization. There's a lot of security topics entirely outside the scope of computer systems to be cognisant about here.


When obscurity is the only option, like opting not to expose AWS account ID, why not do it? Don’t think this should be conflated with replacing proper security with obscurity, which is obviously wrong. The less you expose about your operation, the better. Not exposing details of your infrastructure to the public is another commonly suggested best practice that is obscurity, not security, that nobody challenges. For example, cleaning up common default response headers for things like cloud front.


You can't keep it secret because Amazon doesn't keep it secret.

So if Amazon doesn't keep the account ID a secret how can you as a user of Amazon be expected to keep your account ID secret? There's no way for you to stop Amazon from exposing it.


You as a user of Amazon can do whatever you want though, including being careful about which pieces of information you elect to expose publicly. If it’s up to you, why choose to expose it, when you can choose NOT to? Just because this reverse search of bucket to account id exists doesn’t mean you should begin to expose your account id on your own.


You can't choose NOT to expose it because you can't stop Amazon from exposing it.

Amazon says it's not secret, so it's not secret. They make no attempts or guarantees to keep it secret so there's always the threat that Amazon themselves can expose it on your behalf. You can't stop that no matter how wrong you think it is.


Doesn't change the fact that security by obscurity is a layer of security. It's not a good layer, it's not one you should depend on, but it is a layer that causes problems for attackers.


Any tips on how to do the same for EC2 instances? Am aware of one that’s allegedly joined to our domain but can’t find it in any owned accounts


If you're lucky and it has an instance profile attached with an appropriate role/policy, you can use get-caller-identity to see what account it's running in: https://docs.aws.amazon.com/cli/latest/reference/sts/get-cal...
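
As a small sketch of that step: `aws sts get-caller-identity` prints a JSON object with `UserId`, `Account` and `Arn` fields, so pulling out the account is a one-liner once you have the output. The sample response below is invented (placeholder IDs and a hypothetical role name), and the helper name is mine.

```python
import json


def account_from_caller_identity(raw_json: str) -> str:
    """Extract the 12-digit account ID from get-caller-identity JSON output."""
    identity = json.loads(raw_json)
    account = identity["Account"]
    # AWS account IDs are 12 decimal digits; reject anything else.
    if not (len(account) == 12 and account.isascii() and account.isdigit()):
        raise ValueError(f"unexpected account ID format: {account!r}")
    return account


# Made-up sample of what the command might print on the instance.
sample_output = """
{
    "UserId": "AROAEXAMPLEID:i-0123456789abcdef0",
    "Account": "123456789012",
    "Arn": "arn:aws:sts::123456789012:assumed-role/example-role/i-0123456789abcdef0"
}
"""
print(account_from_caller_identity(sample_output))
```

On the instance itself you would feed the helper the real command output, for example by capturing it to a file or piping it through a small script.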


Perhaps this can make s3fs and similar mounting tools easier to use; just plug in the bucket, enter password, and it mounts.


Maybe their logic is to let authorities track people who misuse it?


Sometimes I am having trouble finding my own AWS account id...


This is kind of a big deal, but also not really? UUIDs are not IDs. And IDs are not always indices. This is a common design flaw in many distributed systems.


> the ability to use StringLike conditions

so much depends

upon

a regular

expression


ITT lots of debate on whether AWS Account IDs are sensitive or not. To chime in my 2c; we've had this debate in multiple orgs with different security teams and the outcome has always been the same; they're not and it's counterproductive to your security posture to treat it as privileged information. Humans have a nasty habit of placing trust in people who have access to privileged information.

"Hi, this is Tom from AWS, I need to speak with you about your account 5923965523" - as a social engineering primer garners significantly different levels of trust from the target depending on whether the target perceives the account ID to be privileged information.


I apologize in advance for what I'm going to say here.

AWS account ID is not sensitive data in any way. Just because you can screw up a config doesn't make a user name or account id “sensitive”.

It's not more sensitive than an email address. What is wrong with you people? Where did you come from, and why are you so dumb?


An AWS S3 bucket name needs to match the domain for website hosting.

Let's say Apple uses S3: they need to create a bucket named "apple.com", and then we can find which AWS account Apple is using.


Not exactly - they can do that, but the names don't need to match. Put CloudFront in front of your S3 website and the site can be served from any bucket or folder within a bucket.

I have dozens of s3 static websites served from a single s3 bucket, all with unique top-level domains, each in its own folder within that single bucket - much easier this way.

Given anyone can create a bucket with any name (if it's not already in use), you can't count on getting the bucket name that matches your domain name.


Ah, thanks. IIRC it used to be like that (when used without CloudFront).


> aws s3 bucket needs to match the domain for website hosting.

This is outdated information, and not required anymore when using CloudFront.

And even in the past, you could use the S3 API to implement a reverse proxy without matching bucket and domain names.


That is not how s3 buckets work.


it used to be like that (without cloudfront)

https://docs.aws.amazon.com/AmazonS3/latest/userguide/websit...


So are S3 Crawler bots inbound that will be used to exploit and blackmail S3 bucket owners... via doxxing?

EDIT: isn't one of the S's "secure"....

Isn't it like THE FIRST S?!?!?!?

EDIT

I get it! - I forgot the three Ss'!

Shove it.


> Shove it

… in Simple Storage Service?


No. None of the Ss in S3 stand for Secure.

And this still doesn’t let you tie an account ID to an email or human.


Of S3?

No, it's Simple. Simple Storage Service


Exploiting incorrect/weak IAM permissions I would assume.


No. Simple Storage Service



