
A December 2018 snapshot shows it belonging to the Department for Transport: https://web.archive.org/web/20181227091013/http://charts.dft....

The CNAME target charts.dft.gov.uk.s3-website-eu-west-1.amazonaws.com still resolves, but the reverse DNS of that IP is simply s3-website-eu-west-1.amazonaws.com. I am not sure how one gains control of an "abandoned" s3-website subdomain (by registering the bucket name only?), but someone did.

So the scenario someone described below is pretty likely: DfT drops the site, and drops the AWS use of the name, but leaves the DNS record in place. I wouldn't attribute this to anyone in the DfT.

It would still require intentional action, though, so I wonder if anyone has a clue how people find out about spurious, unused S3 subdomains that still have DNS pointing at them? Scan the entire internet for domains pointing at s3-website endpoints, and check the AWS API to see if the bucket name is available? Or did someone run into this by accident and decide to poke fun at it while earning some cash along the way?




What sometimes happens is someone points a CNAME to a non-existent bucket. Either because they were planning ahead, or someone typo'd a bucket (and thus DNS) name.

There are bots that scan for this. Then someone creates the bucket on S3 and boom, subdomain hijack.


That's what I suggested with

>> Scan the entire internet for domains pointing to s3-website, and check AWS API to see if it's available?

What I wonder is how you scan all the DNS records with their subdomains. Unlike the IPv4 address space, which is very decidedly finite and not too big, the space of all possible subdomains is practically unbounded.

Other than using AXFR (zone-transfer DNS request) which is usually restricted, you are searching an unbounded space.

I guess you don't even need AWS API calls, since hitting a non-existent bucket over HTTP will tell you: http://something.that.does.not.exist.s3-website-eu-west-1.am...

IOW, how would you write such a bot? :D
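To sketch an answer to my own question: an S3 website endpoint answers a request for a non-existent bucket with a 404 whose body carries the NoSuchBucket error code, while an existing bucket answers 200, 403, or a 404 with a different code (e.g. NoSuchKey). So the core of such a bot could look roughly like this; the function names are made up, and the offline examples stand in for a live probe:

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

def classify_s3_response(status: int, body: str) -> str:
    """Interpret an S3 website endpoint's answer for a dangling-CNAME bot."""
    if status == 404 and "NoSuchBucket" in body:
        return "claimable"  # bucket name is free, so the CNAME can be hijacked
    if status == 404 or status in (200, 403):
        return "taken"      # bucket exists (missing key, public, or private)
    return "unknown"

def probe(hostname: str) -> str:
    """Fetch http://<hostname>/ and classify the answer (needs network access)."""
    try:
        with urlopen(Request(f"http://{hostname}/"), timeout=10) as resp:
            return classify_s3_response(resp.status, resp.read().decode("utf-8", "replace"))
    except HTTPError as err:
        return classify_s3_response(err.code, err.read().decode("utf-8", "replace"))

# Offline examples of the two kinds of response such a probe would see:
print(classify_s3_response(404, "<Error><Code>NoSuchBucket</Code></Error>"))  # claimable
print(classify_s3_response(404, "<Error><Code>NoSuchKey</Code></Error>"))     # taken
```

The hard part is still the one above: where does the list of hostnames to probe come from?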


> how do you scan all the DNS records with their subdomains?

You needn't do this to find the names that actually matter in these "hijack" situations.

Your target is any link that gets visited, maybe following a bookmark somebody made in 2018, maybe it's linked from some page that was never updated, maybe it's in an email somebody archived. If you're phishing you have one set of preferences, if you're doing SEO you have different preferences (you want crawlers to see it but not too many humans).

When anything follows that link, a DNS lookup happens. Most of the world's DNS queries and answers (not who asked, but what is looked up and the answer) are sold in bulk as "passive DNS". You buy a passive DNS feed from one of a handful of big suppliers, or if you're cheap you hijack somebody with money's feed.

So, you're working from a pile like:

  www.google.com A 142.250.200.4
  www.bigbank.com CNAME www1.bigbank.com
  www1.bigbank.com A 10.20.30.40
  charts.dft.gov.uk CNAME charts.dft.gov.uk.s3-website-eu-west-1.amazonaws.com
Obviously you can grep out all those S3 buckets and then you ask S3, hey, does charts.dft.gov.uk exist? And it says of course not, so you create charts.dft.gov.uk as an S3 bucket and you win.
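The grep step over a passive-DNS dump can be sketched in a few lines; the record format mirrors the example pile above, and the suffix is just the eu-west-1 website endpoint from this thread (real feeds cover every S3 region):

```python
# Hypothetical sketch: pull candidate bucket names out of a passive-DNS dump.
S3_WEBSITE_SUFFIX = ".s3-website-eu-west-1.amazonaws.com"

def candidate_buckets(passive_dns_lines):
    """Yield (hostname, bucket_name) for CNAMEs pointing at S3 website endpoints."""
    for line in passive_dns_lines:
        parts = line.split()
        if len(parts) == 3 and parts[1] == "CNAME" and parts[2].endswith(S3_WEBSITE_SUFFIX):
            # For website endpoints, the bucket name is everything before the suffix.
            yield parts[0], parts[2][: -len(S3_WEBSITE_SUFFIX)]

records = [
    "www.google.com A 142.250.200.4",
    "www.bigbank.com CNAME www1.bigbank.com",
    "www1.bigbank.com A 10.20.30.40",
    "charts.dft.gov.uk CNAME charts.dft.gov.uk.s3-website-eu-west-1.amazonaws.com",
]
print(list(candidate_buckets(records)))
# [('charts.dft.gov.uk', 'charts.dft.gov.uk')]
```

Each candidate bucket name then gets checked against S3 for availability.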


Watching feeds of Certificate Transparency logs, and optionally pivoting from the newly discovered hostnames to find additional ones, is one approach.

Google hosts a page [0] to search them, but there are other services/APIs that let you consume them in real time, watching certificate issuance as it happens.

If you wanted to consume them programmatically without a 3rd party service, everything you need is in this repo [1].

0: https://transparencyreport.google.com/https/certificates

1: https://github.com/google/certificate-transparency-community...
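For a flavour of what consuming this looks like, here is a sketch that extracts hostnames from crt.sh-style JSON (its output=json entries carry a name_value field that can hold several newline-separated names); the sample data is made up:

```python
import json

# Canned sample standing in for a live crt.sh ?output=json response.
sample = json.dumps([
    {"name_value": "charts.dft.gov.uk\nwww.dft.gov.uk"},
    {"name_value": "*.example.com"},
])

def hostnames_from_crtsh(raw_json: str) -> set:
    """Collect the hostnames mentioned in crt.sh-style JSON entries."""
    names = set()
    for entry in json.loads(raw_json):
        for name in entry["name_value"].splitlines():
            names.add(name.lstrip("*.").lower())  # drop wildcard prefixes
    return names

print(sorted(hostnames_from_crtsh(sample)))
# ['charts.dft.gov.uk', 'example.com', 'www.dft.gov.uk']
```

Each discovered hostname then feeds the same "does it CNAME to a claimable bucket?" check as the passive-DNS approach.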


There are size and character limits in DNS, so the space isn't infinite, although it is still pretty large. charts.(something well known) could also have been found with a dictionary check, though.

AXFR makes it a lot easier though.


Ah, I totally forgot about the domain name (255) and label (63) length limits: thanks!

Still, we are looking at roughly 38^255 possible options (a-z, 0-9, a hyphen, and the dot separating labels; "roughly" because each label between dots can be at most 63 characters, labels must be non-empty, and hyphens can't start or end a label).

As you said, it's pretty large: compared to 2^32 for IPv4 or even 2^128 for IPv6, this is more than (2^5)^255 = 2^1275 options.
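The comparison checks out with exact integers, as a quick sanity check of the bound:

```python
# Exact-integer sanity check of the back-of-the-envelope estimate above.
label_chars = 26 + 10 + 1 + 1        # a-z, 0-9, hyphen, dot separator = 38
upper_bound = label_chars ** 255     # crude upper bound on the name space

assert (2 ** 5) ** 255 == 2 ** 1275  # the rewrite used above
assert upper_bound > 2 ** 1275       # holds because 38 > 2**5 = 32
print(len(str(upper_bound)))         # 38**255 has 403 decimal digits
```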


The most logical explanation to me: they pointed the name at some AWS IPv4 address for one project, the bill didn't get paid, and another customer has since been assigned the same address, now with totally different content. The DNS admins at the government forgot about the record, and here we are.


This is very obviously just an S3 bucket-name takeover, so no IP address was hijacked (the IP is the same for all S3 eu-west-1 buckets, I am guessing).



