A Gov.uk site dedicated to porn? (thecrow.uk)
187 points by asadhaider 1 day ago | 87 comments

It seems to have been taken offline now. Here's the archive[1] link for uh.. research. Obviously, NSFW.

[1]: https://web.archive.org/web/20211125154944/http://charts.dft...

Since the site is down - https://archive.ph/tCgnL


I thought this was going to be about some sneaky exploit where they'd manage to get a gov.uk to forward links to porn or something. But no, it's really a whole subdomain just taken over by some sketchy porn site.

I'm wondering if the porn site operators even know it's happening? Seems the most likely thing is the DfT had a site at that URL, hosted on AWS. And then they shut it down without removing the DNS record and Amazon assigned that IP to somebody else.

The thing where IP‡ is in the DNS for thing.mycorp.example and later nobody cares about thing.mycorp.example and they give up control without removing the DNS entry - is why you can't get Let's Encrypt certificates by just running a HTTPS web server and you need either plain HTTP, a custom TLS server (it can also do HTTPS but it needs to know about ACME as well) or else DNS.

Lots of bulk hosts will let you pick (or randomly be assigned) a shared IPv4 address like and then - either by luck or often alphabetical order - your aaardvark.mydomain.example gets to be the "default" host which shouldn't exist for HTTPS but does in many popular half-arsed HTTPS web servers including Apache. So now web clients connect to, they send SNI to the bulk host's server - "I'm here to talk to thing.mycorp.example" and it ignores what they said and gives them aaardvark.mydomain.example because that's the "default" now. And if Let's Encrypt accepted that, you could buy some bulk host accounts, impersonate all these abandoned sites and get certificates for them. So, they had to knock that on the head.

The custom TLS server trick works by (ab)using ALPN. Lazily made servers like Apache at least don't ignore ALPN the way they ignore SNI, so the client learns this server wasn't the one with the ALPN it needed to talk to after all, and the certificate isn't issued.

‡ isn't a real public address it's just for example purposes here

> ‡ isn't a real public address it's just for example purposes here

There’s an RFC for that :)


I know, but I'd have to go look them up, so I keep using But do keep badgering me, sooner or later it'll stick in my head.

They’re not particularly memorable. I have already forgotten 2/3 just after closing the RFC.

That’s not really what the issue with the tls-sni challenge was.

How that challenge worked was that the CA would give you a certificate for a fictitious name (say, abc123.acme-challenge.invalid) and you had to present it from the host when asked for that name by the client (the CA) through SNI.

Many hosts that share IPs between customers also let those customers upload their own certificates. The attack just involved uploading a challenge certificate for a colocated site, and letting the host serve it as expected. Even if the host _did_ check that the name on the cert was not the name of another customer (which is itself sometimes impossible), and even if the target site was not abandoned, and even if it had correctly functioning HTTPS, these are fictitious made up names, so the attack would still work.

It involved no ignoring of the Host header, or really any misconfiguration; that’s why it required moving to tls-alpn.

Two things. Firstly, I wasn't (though I can see why you'd think so) talking about tls-sni-01 but about the original intent to deploy http-01 challenges for HTTPS.

Secondly, it requires not merely a misconfiguration but a bug, one so widespread it was pointless to pretend it would get fixed in the foreseeable future. When you receive SNI for foo.bar.example and you understand SNI but don't have a foo.bar.example, TLS provides an explicit error case for that. Servers like Apache httpd don't (or at least didn't) bother implementing this, and instead give you a default site, and this enables the hijack. You should still be able to find when this was discovered in the ACME list.

(as per the other comment, my guess is incorrect. I didn't actually look at the DNS. No porn site operator is going to accidentally pick the s3 bucket name 'charts.dft.gov.uk')

I thought this was going to be about a new government website that all UK porn viewers are required to register with. I feel like I've seen a number of threads about that being imminent.

The site in question is charts.dft.gov.uk (VERY NSFW). It resolves to the CNAME charts.dft.gov.uk.s3-website-eu-west-1.amazonaws.com, which is quite clearly hosting a porn site of some kind.

I suppose there's a few possible explanations here: (1) the original site was hosted on S3, and at some point the bucket was dropped and someone else picked it up, (2) it was originally hosted on S3 and the bucket got hacked, (3) someone with access to the DNS has decided to go rogue and point it at a somewhat-legit-looking but fake domain. If there are historical DNS records floating around it might help to narrow down what happened here.

I don't think it was #3: Amazon owns and resolves it for amazonaws.com. If you could hack that, you could do much more serious damage. I'm assuming it's #1. Bucket names are global.

I believe scenario #3 would be as follows:

1. gov.uk’s DNS server used to point charts.dft.gov.uk to something legitimate
2. Someone hacked gov.uk’s DNS server, and changed this one specific domain to CNAME charts.dft.gov.uk.s3-website-eu-west-1.amazonaws.com
3. That same someone set up their porn thing at AWS in a bucket that maps to charts.dft.gov.uk.s3-website-eu-west-1.amazonaws.com

But why such a specific bucket name? Perhaps the perpetrator did it because he knows how the gov.uk DNS is maintained, but then it would be an inside job. If only the process were as tight and clean as in peppa pig land!

I think it is required to name the bucket after the domain name if you want to use it to host static web content: https://docs.aws.amazon.com/AmazonS3/latest/userguide/websit...
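To make the naming rule concrete, here is a minimal sketch of how a hostname maps to the S3 website endpoint its CNAME has to point at. The function name is made up for illustration, and the dash-style `s3-website-<region>` form is simply the one seen in this thread for eu-west-1; some other regions use a dot instead, so treat the format as an assumption.

```python
def s3_website_endpoint(hostname: str, region: str = "eu-west-1") -> str:
    """Return the S3 website endpoint a CNAME for `hostname` would target.

    Hypothetical helper: S3 picks the bucket from the Host header of the
    request, so the bucket name has to equal the hostname being served.
    """
    bucket = hostname.rstrip(".").lower()
    # Dash-style endpoint as seen in this thread; other regions may use
    # "s3-website.<region>" instead (assumption, check the AWS docs).
    return f"{bucket}.s3-website-{region}.amazonaws.com"


print(s3_website_endpoint("charts.dft.gov.uk"))
```

Since bucket names are global, whoever creates a bucket with that exact name is the one who gets to answer for the hostname.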

I followed a few links there and it’s not even a porn site; it's just a shallow catalogue of {img-ahref -> img-ahref} which tricks you into “/dating.html”, which redirects to some “dating” site. Probably just SEO BS.

Sub-domain takeover attack. The sub-domain was CNAME'ed to an S3 bucket and the S3 bucket had likely been deleted. The porn purveyor re-created a new S3 bucket with pr0n.

A scanner that would have caught the vulnerability: https://tech.ovoenergy.com/how-we-prevented-subdomain-takeov...

Or a grey hat scanner for finding sub-domains vulnerable to takeover: https://github.com/m4ll0k/takeover

Yes. These are pretty much standard fodder for bugs reported on somewhere like hackerone. I guess someone who knew what he was doing just decided to take advantage of it lol

> This site is hosted on a Raspberry Pi 4B in the author's living room (behind the couch).

Holding up quite well despite HN frontpage. I love what a bit of caching can do.

EDIT: appears I jinxed it. I get the allure of hosting something in your home, but these days when you can get a decent VPS for $10/yr it doesn’t really make sense.

> I get the allure of hosting something in your home, but these days when you can get a decent VPS for $10/yr it doesn’t really make sense

When you're hosting static content (like presumably this content is; it's down so I can't say for sure), you should distribute it on a CDN for $0/year. A single VPS can be overwhelmed by traffic just as your Raspberry Pi can.

Why doesn't it make sense? After 6 months your Rpi4 will be costing less than the VPS. Plus you get the fun of actually doing it.

P.S. Getting weird RPi errors because of power supply makes you appreciate the value proposition of a good VPS :p

> Why doesn't it make sense? After 6 months your Rpi4 will be costing less than the VPS. Plus you get the fun of actually doing it.

It doesn't make sense because the Raspberry Pi will not be able to serve traffic that one time when your post hits top of HN, which is the one time you really need your hosting plan to work. Yes, it can serve traffic that 99% of time when almost nobody visits your website, but if we look at % of requests served over the timespan of a year, we will see that the website was down for like 95% of users because of that 1 day of downtime.

> After 6 months your Rpi4 will be costing less than the VPS.

No, $10 per year, not per month. That means the rPi payback is 5-6 years, and for inferior hardware and bandwidth.
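The payback arithmetic can be sketched in a few lines. The Pi price here is an assumption (the thread never states one; $55 is just a value consistent with "5-6 years"), and the optional power cost is likewise hypothetical.

```python
from typing import Optional


def payback_years(pi_cost: float, vps_per_year: float,
                  power_per_year: float = 0.0) -> Optional[float]:
    """Years until a one-off Pi purchase beats paying for a VPS.

    Returns None if running the Pi costs at least as much per year
    as the VPS, in which case it never pays for itself.
    """
    saving = vps_per_year - power_per_year
    if saving <= 0:
        return None
    return pi_cost / saving


# Assumed $55 Pi against the $10/yr VPS from the comment above.
print(payback_years(55.0, 10.0))
# Same comparison with a hypothetical $4/yr electricity cost.
print(payback_years(55.0, 10.0, power_per_year=4.0))
```

Any non-zero running cost pushes the break-even point further out.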

You and I clearly have different expectations for "a decent VPS"...

(I have a "One time cost access forever!" VPS, which costs me $9/year in "maintenance fees", which I'm happy enough with for the money, but it's definitely "Useful for the price" rather than "decent".)

CloudAtCost is not a good reference for cheap VPS, they were never cheap and that maintenance fee makes it even worse. Their performance is abysmal too, but that could have changed since the time I used them.

If you want a good cheap VPS, go check https://www.lowendtalk.com/ and you will find plenty of good ones there. I would suggest paying a bit more and going with BuyVM, at $20 per year for their 512 MB offering, but you could definitely get something cheaper and just as good somewhere else.

Have a look at some of the providers I mention below. I’m not sure who you’re referring to above, but as with all things, it’s a spectrum. Plenty of garbage at that price point, but also some gems.

BuyVM, RamNode, FDC, Virmach (probably in that order).

Only if power is free

and bandwidth. and physical space, and the ability to stay isolated from physical conditions like floods (natural or plumbing-related) or power outages caused by somebody trying to plug in a kettle.

running a commercial VPS in a datacenter has a ton of advantages, but i'm guessing that the guy with a footer like this doesn't really care about them. running your website off a raspi in your living room is cool, whether it's the most practical solution or not doesn't really matter.

I almost hate myself for writing this - but any given AWS AZ has had more outages/performance degradations/service interruptions over the past ~5 years than my home internet/electricity... and with an unmetered connection with a 200mbps+ uplink, how much more do I really need for a personal page?

Maybe a raspberry pi behind my sofa isn't so stupid?

Of course you could use multi-cloud/multi-azs, but do you really WANT to for a personal website?

Maybe I'm just unlucky, but I live in a Canadian city with pretty stable internet and since I moved 6mo ago (we won't count moving-related downtime) I've had at least a 1hr power outage and 3hrs of internet-related downtime.

I can't remember the last time my little nano server in US-West-2 has been down.

Properly cached, a single core on a low end VPS should be able to carry some serious weight.

But yeah, I agree. This is static content, and should be hosted on any of the gazillion free tier CDNs. But then you don’t get that warm fuzzy feeling of watching the rPi behind your couch melt into the floor.

Unfortunately this looks like a mistake for this context given it isn’t loading now.

Otherwise, for a well-known average traffic load suitable to a Pi, a Pi is a great idea.

Given it's timing out for me, I'm not sure I'd agree it's holding up quite well :P

Thecrow.uk is timing out for me; but not the DfT site.

Sorry for the ot, but do you have any recommendation for $10 vps?

I used to browse lowendbox, which occasionally has good deals from smaller companies who've been around for at least a few years, but there's always a risk that one day they'll sell, shut down operations or, worse, just disappear. However, if budget is your number one priority, you can get a year's VPS hosting for low double digit dollars.

Nowadays I host personal projects on scaleway and netcup (EU based). I've been with the first for a couple of years and the second for 6 months now, good service from both.

If you're mainly hosting static* or cacheable content, you may even get by with a raspberry pi running behind cloudflare's free plan with cache enabled. If you don't mind all traffic to your site being served by such a third party of course.

* If you only have static content, GitHub pages can be considered too.

You can get Oracle Cloud Free for ... free.

Anything with the name “Oracle” sounds like the steps of a thousand lawyers entering your building…

I’m using a Linode $5 running nginx for static here.

$5 a year?

$5/mo - but there are plenty of decent VPS for $5/yr - the catch is they will be IPv6 only for port 80 so you chuck it behind Cloudflare (carrying static load as well).

The low end world will shock anyone who has only ever seen AWS pricing.

Quite, and the $40-60 a year bracket for a VPS is quite normal, but the original message was "decent VPS for $10/yr". Linode certainly doesn't go that low - at least last time I checked.

I saw a VPS from Italy, I think, for in the region of $20/year some time back. Sephiroth87 was after a $10/year VPS recommendation, not a $60/year one that hvgk suggested.

Look on lowendbox

Indeed, I would love to get more details on what all went into it, and how far can we stretch such a SBC.

EDIT: evidently, not far

Unless they host their images themselves, but a Pi could handle traffic very well for a static website.

HN traffic isn't that large, maybe a few requests per second.

You're a decade behind the times. HN can be formidable in the amount of traffic it generates, it all depends on the content and the time of day though.

Spoke too soon!

> Visit [redacted], and you’ll be redirected to a subdomain for EU exit hauliers - except the site isn’t there. Instead it’s a WordPress login page. There’s no username field and we feel confident that a brute force attack would be super effective!

> Elsewhere we have the Department for Transport careers page, which sort of does what it says. Clicking on the ‘see all vacancies’ button will redirect you to the civil service jobs site. This isn’t weird in itself, what is weird is that it uses t.co - Twitter’s redirection and domain obscuring tool to do it. Don’t ask us why, we have no idea why they would do this.

This sounds like someone inexperienced with the system is somehow managing it. How can you use a t.co link for... this? I'm surprised this edit got past anyone.

EDIT: Redacted the link just to be on the safe side. It's in the article if anyone's curious.

In fact, it's a t.co link that redirects to a bit.ly link that redirects to the actual site!

This is probably just someone who copied the link from twitter straight into the governments content management system.

The content on this page isn't written by tech people - it's written by policy experts and other civil servants whose expertise isn't exactly how URL's work...

> The content on this page isn't written by tech people - it's written by policy experts and other civil servants whose expertise isn't exactly how URL's work...

It doesn't just take a lack of expertise: it takes an extra level of apathy about the quality of your work and general incuriousness about the world. They can see the url they're pasting, and the majority of web users have some intuitive sense of the difference between domains: they are, after all, human-readable.

I can imagine the tail of "confused grandparent" stereotypes that are completely blind to the difference between t.co/622ahdvdj and charts.tf.uk.gov, but people that are that technically illiterate should be nowhere near computers in a professional context.

December 2018 snapshot refers to Department of Transport: https://web.archive.org/web/20181227091013/http://charts.dft....

The CNAME of charts.dft.gov.uk.s3-website-eu-west-1.amazonaws.com still works, but the reverse DNS of that IP is simply s3-website-eu-west-1.amazonaws.com. I am not sure how one gains control of an s3-website subdomain when it's "abandoned" (bucket name only?), but someone did.

So the scenario someone described below is pretty likely: DoT drops it, and drops AWS use of the name, but leaves the DNS record in. I wouldn't attribute this to anyone in the DoT.

It would still require intentional action to do so, though, so I wonder if anyone has any clue how do people find out about spurious, unused S3 subdomains that still have DNS pointing at them? Scan the entire internet for domains pointing to s3-website, and check AWS API to see if it's available? Or did someone run into this by accident and decided to poke fun at it while earning some cash along the way?

What sometimes happens is someone points a CNAME to a non-existent bucket. Either because they were planning ahead, or someone typo'd a bucket (and thus DNS) name.

There are bots that scan for this. Then someone creates the bucket on S3 and boom, subdomain hijack.

That's what I suggested with

>> Scan the entire internet for domains pointing to s3-website, and check AWS API to see if it's available?

What I wonder is how do you scan all the DNS records with their subdomains? Unlike IPv4 address space, which is very decidedly finite and not-too-big, the space of all the subdomains is basically infinite.

Other than using AXFR (zone-transfer DNS request) which is usually restricted, you are searching an unbounded space.

I guess you don't need AWS API calls, since hitting a non-existing bucket with HTTP will let you know: http://something.that.does.not.exist.s3-website-eu-west-1.am...

IOW, how would you write such a bot? :D
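For the "hit it with HTTP" step, a rough sketch might look like the following. Both function names are made up, and the check relies on an assumption: that the S3 website endpoint answers a request for a missing bucket with a 404 whose body mentions `NoSuchBucket`. It only covers the probing half of such a bot, not the discovery of candidate names.

```python
import urllib.error
import urllib.request


def looks_unclaimed(status: int, body: str) -> bool:
    """Heuristic: 404 plus a 'NoSuchBucket' marker suggests no bucket exists.

    The marker string is an assumption about S3's error page, not a
    documented contract relied on here.
    """
    return status == 404 and "NoSuchBucket" in body


def probe(endpoint: str, timeout: float = 5.0) -> bool:
    """Fetch http://<endpoint>/ and apply the heuristic above.

    Network errors other than HTTP status errors (DNS failure, timeout)
    are left to propagate to the caller.
    """
    url = f"http://{endpoint}/"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
            body = resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, but the error object still carries
        # the status code and the response body.
        status = err.code
        body = err.read().decode("utf-8", "replace")
    return looks_unclaimed(status, body)
```

Keeping the classification separate from the network call means the heuristic can be adjusted or tested without touching anything live.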

> how do you scan all the DNS records with their subdomains?

You needn't do this for stuff that would work in these "Hijack" situations.

Your target is any link that gets visited, maybe following a bookmark somebody made in 2018, maybe it's linked from some page that was never updated, maybe it's in an email somebody archived. If you're phishing you have one set of preferences, if you're doing SEO you have different preferences (you want crawlers to see it but not too many humans).

When anything follows that link, a DNS lookup happens. Most of the world's DNS queries and answers (not who asked, but what is looked up and the answer) are sold in bulk as "passive DNS". You buy a passive DNS feed from one of a handful of big suppliers, or if you're cheap you hijack somebody with money's feed.

So, you're working from a pile like:

  www.google.com A
  www.bigbank.com CNAME www1.bigbank.com
  www1.bigbank.com A
  charts.dft.gov.uk CNAME charts.dft.gov.uk.s3-website-eu-west-1.amazonaws.com

Obviously you can grep out all those S3 buckets and then you ask S3, hey, does charts.dft.gov.uk exist? And it says of course not, so you create charts.dft.gov.uk as an S3 bucket and you win.
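The "grep out all those S3 buckets" step could be sketched like this. The record format (name, type, value separated by whitespace) just mirrors the pile above and is not any real passive DNS vendor's schema; the regex and the names `S3_WEBSITE`, `Candidate` and `s3_candidates` are illustrative assumptions.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Matches CNAME targets of the form <bucket>.s3-website-<region>.amazonaws.com
# (also tolerating the dot-style "s3-website.<region>" variant).
S3_WEBSITE = re.compile(
    r"^(?P<bucket>.+)\.s3-website[-.](?P<region>[a-z0-9-]+)\.amazonaws\.com\.?$"
)


class Candidate(NamedTuple):
    name: str    # the hostname people actually visit
    bucket: str  # the bucket name S3 would look up
    region: str


def s3_candidates(records: Iterable[str]) -> Iterator[Candidate]:
    """Yield every CNAME record that points at an S3 website endpoint."""
    for line in records:
        fields = line.split()
        if len(fields) != 3 or fields[1].upper() != "CNAME":
            continue  # not a complete CNAME record
        name, _, target = fields
        match = S3_WEBSITE.match(target.lower())
        if match:
            yield Candidate(name.lower(), match["bucket"], match["region"])


# Sample input copied from the pile above (addresses omitted as in the thread).
PILE = """\
www.google.com A
www.bigbank.com CNAME www1.bigbank.com
www1.bigbank.com A
charts.dft.gov.uk CNAME charts.dft.gov.uk.s3-website-eu-west-1.amazonaws.com
"""

for candidate in s3_candidates(PILE.splitlines()):
    print(candidate)
```

Each candidate's bucket name is then the thing to check for existence; only the ones that come back missing are of interest.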

There are size and character limits on DNS, so it's not infinite, although it may still be a pretty large space. Charts.(something well known) could have been a dictionary check though.

AXFR makes it a lot easier though.

Ah, I totally forgot about the domain name (255) and label (63) length limits: thanks!

Still, we are looking at roughly 38^255 possible options (a-z, 0-9, a hyphen and dot to separate labels; "roughly" because each label between periods can be up to 63 characters, labels must be non-empty, and hyphens can't start a label).

As you said, it's pretty large: compared to 2^32 of IPv4 or even 2^128 of IPv6, this is more than (2^5)^255 = 2^1275 options.
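A quick sanity check of that comparison, reading the figures as exponents (38 symbols over 255 positions against 32-bit and 128-bit address spaces). The constant names are mine, and like the comment it checks, this is a rough overcount that ignores the per-label rules.

```python
ALPHABET = 26 + 10 + 2   # a-z, 0-9, hyphen, and the dot separating labels
MAX_NAME = 255           # maximum length of a full domain name

upper_bound = ALPHABET ** MAX_NAME     # crude count of 255-character strings
lower_bound = (2 ** 5) ** MAX_NAME     # 32 < 38, so this undercounts

# (2^5)^255 is the same number as 2^1275.
assert lower_bound == 2 ** 1275
# The crude name count exceeds that, and dwarfs both IP address spaces.
assert upper_bound > lower_bound > 2 ** 128 > 2 ** 32

print(f"name space needs about {upper_bound.bit_length()} bits")
```

Python integers are arbitrary precision, so the comparison is exact rather than a floating-point estimate.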

The most logical explanation to me is that they registered an AWS IPv4 address for one project. The bill didn't get paid, and now another customer has been assigned the same address, with totally different content. DNS admins at the government forgot about it and here we are.

This is very obviously just an S3 bucket-name takeover, so no IP address was hijacked (the IP is the same for all S3 eu-west-1 buckets, I am guessing).

A great read in a tongue-in-cheek british style, a welcome change of pace for mind and eyes!

> Best of British Porn? Not Quite

That's not a very fair assessment. The same way as it's difficult to find British dishes better than, say, minced beef and onion pie, it's challenging to find authentically British porn that's better than what this government office provides its people. We should commend the Tory government for its dedication.

"authentically British porn"

That's a concept I have not pondered before.

There are things we regret not doing and things we regret doing.

I’m sorry.

Both my own site (on a Pi behind the couch) and the gov site were subjected to the hug of death. I've moved thecrow.uk onto a VPS for now and it's back up. Hurray!

The title should be changed to reflect that the article is actually about .gov.uk domain being used for non-governmental websites.

...without permission, that is - probably a subdomain takeover, not a disgruntled employee.

Right, my point was more that I clicked the link thinking that the UK was launching a government-owned porn website.

British Government Porn? That the one where we all get screwed by Rishi in the budget?

I've seen that one, it's a bit too sadomasochistic for my taste.

Looks like someone forgot to delete a DNS entry after decommissioning a server. Bad on behalf of gov.uk, however you'd think AWS would at least auto-delete the CNAME (charts.dft.gov.uk.s3-website-eu-west-1.amazonaws.com) after the server was released, so that it points to nothing...

I don't know if this is laziness and ineptitude on the govt's part or not. You see the design team for UK gov websites have been getting a lot of attention and praise for their efforts, the most recent being here just ten days ago on the subject of check boxes: https://news.ycombinator.com/item?id=29238968 .

Now anyone with a rudimentary handle of the English language would probably have noticed the misspelling of carcasses on the blogpost https://designnotes.blog.gov.uk/2021/11/15/letting-users-tic... and Yorwba highlighted this on 17 November 2021 as seen in the comments. The team duly acknowledge this as seen with the updated image here https://designnotes.blog.gov.uk/wp-content/uploads/sites/53/... and the original misspelling can still be seen here https://designnotes.blog.gov.uk/wp-content/uploads/sites/53/...

Anyway, it would seem their commenting system will not allow links to be posted to them or they choose to ignore links or didn't understand the comment posted when comments like "https://www.bing.com/search?q=plural+of+carcass" come through to them which is metadata for the type of filtering being employed on their comments section.

I think it's worth looking at their design principles which can be seen here https://www.gov.uk/guidance/government-design-principles "#1 Start with user needs Service design starts with identifying user needs. If you don’t know what the user needs are, you won’t build the right thing. Do research, analyse data, talk to users. Don’t make assumptions. Have empathy for users, and remember that what they ask for isn’t always what they need."

It would seem Grant Shapps, Secretary of State for Transport, is perhaps actually meeting the public's needs, or maybe it's what he thinks of the public. Are we solitary handy manipulators of parts of the body?

Hope Hacker News didn't set fire to the couch!

I am now taking bets on how long it will last.

'This site is hosted on a Raspberry Pi 4B in the author's living room (behind the couch)'

Longer than the porn site it's talking about at least

The OP site is down, the porn site is up right now...

spoke too soon!

Btw, it would have been hilarious if the site owner had set up a Let's Encrypt SSL certificate for the charts.dft.gov.uk domain :)

I’m only disappointed that the squatting site didn’t conform to GDS’s GOV.UK Design System.

For reference, it's 5 hours later now and it's still online.

american of course, russian always, japanese, chinese and thai - sure why not, heck, even danish or swedish ... but british or english - no way - not even once

Hacker News crashed the website.

How was this discovered and do we know how long it was in this state?
