Help HN: Google has blocked our entire domain for harmful programs
101 points by nstart 10 days ago | 78 comments
Hey HN. Posting this here in the hope that someone who can help sees this. I work on security and compliance at Buffer. A couple of hours ago, Google blocked our entire domain start.page and now shows the "The site ahead may contain harmful programs" warning when trying to visit any subdomain.

start.page is the primary domain for hosting Buffer's link page product. Eg: https://buffer.start.page . About 24 hours ago, a spammer created a start page which linked to a .rar malware file hosted on Google drive. We did not host the file. Just carried a link to it.

That page was detected during our routine content moderation this evening but it had also been reported to Google. We have removed the content at this time and submitted the start.page domain to Google's review process.

In the meantime however, instead of blocking the individual subdomain that had linked to the malware, Google has blocked our entire domain start.page which means that all valid customers are also affected by this. Any customer start page visited on desktop/android now shows the scary red screen warning.

Reaching out on HN right now to see if there's anyone at all on Google who can help expedite the review process so that our customers aren't further affected by this.

Also, if anyone from Google sees this I can further help by sharing information about the linked Google Drive file. It's password protected, so I'm guessing that helps it bypass detection.

Thanks. Fingers crossed for this since I've never done/had to do this before.

> a .rar malware file hosted on Google drive

By the same standard of guilty until proven innocent, should they block the Google Drive domain, and warn all users that Google Drive is unsafe/malicious, during the same review period?

Glad somebody sees the humor in this. "Google ban-hammered your domain for hosting a link to Google."

Seems like it would be by design. Scan GDrive for malicious software and see who is linking to it.

I understand this is an ironic retort, but the actual standard is even worse than "guilty until proven innocent", it is "arbitrary rule of whatever is convenient to Google's interests". This is the danger of relying on a centralized corporate entity to gatekeep a public resource like the public internet. It is extrajudicial. And it is a gigantic power for one entity to hold without any oversight.

People take your comment as a joke but you are 100% correct. This is pure unfiltered hypocrisy on Google's part.

Does seem particularly ironic that the actual malware was/is hosted by Google themselves

I don’t know if it’s exactly the same, but I had an old DNS record pointing to a deleted DO droplet. Then someone started hosting a phishing site on a droplet with the same IP, which led to my domain getting flagged as a phishing site. I was able to go to the Google Search Console and submit a ticket, which was resolved in a matter of hours.

See the instructions here: https://support.google.com/webmasters/answer/6347750?hl=en

Thanks so much for sharing thoughts. In our case all our start pages are delivered via cloudflare's web workers so we are safe from dangling DNS records being poisoned in that way.

In our case, Google's search console shows clearly what subdomain was guilty of the issue and that subdomain has been cleared now. Just really want to expedite the review process since Google's safe browsing has decided to block the entire domain instead of the offending subdomain :-/

I've submitted a review but they do say that reviews related to malware take a few days to process. This is a little hard to be honest given that it's not even our site that is hosting the malware. It was a page linking to google drive which is where the malware actually is hosted.

Hoping we get a response soon. Appreciate the supportive chime in.

> since Google's safe browsing has decided to block the entire domain instead of the offending subdomain

This does, unfortunately, seem to be the right call. There's no way to differentiate between the subdomain user being malicious, the domain owner being malicious, or the domain owner getting hacked. The only granularity of data available is that something under the start.page domain was distributing malware, so it makes sense to quarantine the whole domain.

I hope this gets resolved quickly! I think the response is likely the correct one though.

But why isn't google.com blocked? I see no reason to accept any rationales that don't apply to themselves.

For the same reason why Dropbox and One Drive and GitHub and Facebook and all the other big services don't get blocked: because they're big enough and well-known enough as content hosters that the default assumption is that they have a bad user rather than that Microsoft went rogue.

It's not fair to smaller and newer players, but it's perfectly rational and isn't a uniquely special feature of Google.

Correct, but the whole point of a rhetorical question is that everyone already knows the answer, and anyone who would try to defend the invalid argument would not be able to answer the question in any way that supports the invalid argument, such as whatever excuse google or a google apologist might try to give for the double standard exposed in this case.

Because google drive is in the PublicSuffixList and listed as a hosting provider is my guess.

I checked on this and that's not true. The public suffix list applies to anyone who lets out subdomains anyways and google drive is not a service to which this applies. Blogspot on the other hand is in the list.

Why _would_ they block themselves? How would that be to their advantage?

Irrelevant. Google does something and gives a rationale for it.

If that rationale were valid, it would apply to everyone, including google.

If the rationale does not apply to google, then it does not apply to anyone else.

Please google (if they let you) the concept "double standard".

The rationale does apply to everyone: "if you're a big enough content hoster that we've heard of you before, we'll reach out to you rather than blocking your domain immediately". Google doesn't just make an exception for themselves, it also makes an exception for Microsoft, Dropbox, Box, Facebook, and probably hundreds of others that are too small for me to know about.

We can argue about whether this is fair to small players, but it's hardly self-dealing for Google to include themselves in their list of high-profile content hosters, and there are very rational reasons for maintaining such a list.

My point is not whether the rationale is valid or invalid: the point is, they're not going to do it. Is that wrong, in the ethical sense? Quite possibly. Do they care? No.

Try googling the "Golden Rule" and you'll find a version that says "they who have the gold make the rules".

You had no point. Obviously the reason anyone applies a double standard is to advantage themselves. Congratulations on that insight. Can you next explain why anyone would steal?

Oh, sorry, I thought you wanted to have a discussion, I wasn't clear on the fact that you just want to be a condescending dick. Have a nice life.

This does make for a really tricky future though tbh. It’s trivial for folks to password protect a zip file containing malware and for it to be uploaded to google drive or Dropbox and then be linked to from a start page.

If there’s a risk that each time that happens the entire domain could be blocked, that’s a lot of risk to try and mitigate. Especially seeing that many of the bigger providers also struggle to mitigate this kind of content despite having technical teams that are larger by an order of magnitude (or more).

> There's no way to differentiate between the subdomain user being malicious, the domain owner being malicious, or the domain owner getting hacked. The only granularity of data available is that something under the start.page domain was distributing malware, so it makes sense to quarantine the whole domain.

Or, you know, take into account the number of subdomains serving malware relative to the total number of subdomains.

1 out of 1000s seems like an unfair reason. If it was 10s or 100s of subdomains out of 1000, then it makes more sense to punish the whole domain. But when it’s just a few, blocking the individual subdomains would be the better way.

I empathise with the viewpoint, but I don't think you're thinking like a hacker about this.

If safe browsing only blocked the subdomain when you have a certain threshold of "safe" subdomains, then attackers would just have a sufficient number of "safe" subdomains.

Also how do you set the threshold? It's dependent on the market that the subdomain hosting provider targets, it's dependent on how good their moderation is, it's dependent on how quickly they get indexed, all sorts.

Any solution needs to work for the case of malicious users, and needs to work at a scale of billions of pages, i.e. you can't use any human review or non-machine-identifiable information.

> If safe browsing only blocked the subdomain when you have a certain threshold of "safe" subdomains, then attackers would just have a sufficient number of "safe" subdomains.

If a website has a.example.com, b.example.com, foo.example.com, baz.example.com and they serve malware on baz, I’m saying put that subdomain on the bad list. If they serve malware from many subdomains, block the whole domain.

The issue is that Google blocked a whole domain for just one bad subdomain. That seems too strict, and is very sad for all of the users of that domain.
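A minimal sketch of the policy proposed here, assuming the blocklist operator already knows which hosts are flagged and how many subdomains each domain has. The function name, the input shapes, and the 10% threshold are all hypothetical and are not anything Safe Browsing is known to do.

```python
from collections import defaultdict


def blocklist_entries(flagged_hosts, total_subdomains, domain_threshold=0.1):
    """Choose between blocking individual hosts and blocking a whole domain.

    flagged_hosts: iterable of (domain, host) pairs where malware was found.
    total_subdomains: dict mapping domain -> number of known subdomains.
    domain_threshold: hypothetical fraction of bad subdomains at which the
        whole domain is blocked instead of the individual hosts.
    """
    flagged = defaultdict(set)
    for domain, host in flagged_hosts:
        flagged[domain].add(host)

    entries = set()
    for domain, hosts in flagged.items():
        # Unknown domains are treated as having one subdomain, so they
        # fall through to a whole-domain block.
        total = max(total_subdomains.get(domain, 1), 1)
        if len(hosts) / total >= domain_threshold:
            entries.add(domain)      # many bad subdomains: block the domain
        else:
            entries.update(hosts)    # few bad subdomains: block only those
    return entries
```

As other replies in the thread point out, whoever controls the domain also controls the denominator, so a ratio like this can be padded with clean subdomains.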

So to distribute malware, I can buy a domain, set up a thousand subdomains, but only put malware on one of them. Then when that gets found I can move the malware to another, and so on, always being able to trivially work around blocking?

At least when done at the domain level there's a cost involved for getting a new domain, which disincentivises the creation of many malware hosting domains.

Yes there is. PublicSuffixList does that.

Yep, and start.page is not on it.

> The site ahead may contain harmful programs

We had that. The site got hacked and was hosting a trojan distribution point. Very discreetly. Once removed, we requested re-evaluation via the Google Webmaster's console and the flag was removed.

I don't know what you can do for immediate action, but usually Google does reverse their domain safety status quickly after the offending content is removed. 72 hours might be enough for this to go away.

Thanks. I do feel confident that this will be cleared within 72 hours but in the case of the business that does feel like an eternity since we host customer generated content.

Also, this is bound to happen in the future where people are going to link to malware that’s hosted elsewhere.

I’m trying to wrap my head around how to mitigate the risk of the entire site being blocked vs a single subdomain.

Will update here though when it is resolved in case there are any lessons to be shared with folks

We had the same thing happen a few times and ended up creating an antimalware service that used Googles list of malware domains to check all user added outgoing links
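For anyone wondering what such a check could look like, here is a rough sketch that pulls the outgoing links from a user-submitted page and compares their hostnames against a blocklist. It is not the service described above. `BLOCKED_HOSTS` is a placeholder set, and in practice the list would have to be refreshed from a real threat feed.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Placeholder blocklist for illustration only.
BLOCKED_HOSTS = {"malware.example", "phish.example"}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def is_blocked(host, blocked=BLOCKED_HOSTS):
    """True if the host or any parent domain of it is on the blocklist."""
    labels = host.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))


def flagged_links(html, blocked=BLOCKED_HOSTS):
    """Return the links in html whose hostname matches the blocklist."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for link in collector.links:
        host = urlsplit(link).hostname  # None for relative links
        if host and is_blocked(host, blocked):
            flagged.append(link)
    return flagged
```

This only matches hostnames, so it cannot single out one bad file on a shared host, and short links or redirect pages would need to be resolved before checking.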

That’s awesome. It also sounds relatively trivial to implement I think. Do you have any reference though? Source code, write ups etc? Would love to learn from others.

In this case it would be tricky since the link is to google drive and we can’t block those. Also we’ve seen people work around url blocks by either using short links or by using html pages hosted for free to redirect using JS instead of an HTTP redirect. Always another mole to whack :,)

> I’m trying to wrap my head around how to mitigate the risk of the entire site being blocked vs a single subdomain.

Maybe allow your customer to use their own domain? Sites that host user generated contents such as wiki, blog or personal page often offer this option.

Tangentially... I try to not use any third parties that say what I can/cannot see. That includes SafeBrowsing, which I disable via...


Odd stance. SafeBrowsing does not _ban_ you from viewing content if you really want to. It only warns you before serving you the content.

I understand maybe wanting "total control" over your own devices, but to just completely reject security warnings seems like a net negative overall.

Why send every url you visit (or its hash, whatever) to google in return for little to no tangible benefit? I've never used safebrowsing at any point since its introduction and have yet to encounter a single instance where I would have benefitted had I been using it.

Firefox downloads a list regularly, it doesn't send URLs of every visit to Google https://feeding.cloud.geek.nz/posts/how-safe-browsing-works-...

"It would be too slow (and privacy-invasive) to contact a trusted server every time the browser wants to establish a connection with a web server. Instead, Firefox downloads a list of bad URLs every 30 minutes from the server (browser.safebrowsing.provider.google.updateURL) and does a lookup against its local database before displaying a page to the user."

See https://blog.cryptographyengineering.com/2019/10/13/dear-app... for a critique of the privacy of Safe Browsing.

Chrome uses a local bloom filter of bad URLs for precisely that reason.
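For readers unfamiliar with the data structure, here is a minimal Bloom filter sketch showing how a browser could test a URL against a local list without sending it anywhere. It is an illustration, not Chrome's or Firefox's actual implementation. The sizes and the seeded SHA-256 hashing scheme are arbitrary choices made for the example.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, size_bits=8192, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # The item is possibly present only if every one of its bits is set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )
```

A hit only means "possibly on the list", so a real client would still need a second step to confirm it before showing a warning. A miss is definitive and requires no network request at all.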

Though it's enabled by default, I consider the burden of proof to be to advocate for its utility. Would I opt into this if it was disabled by default?

It has some utility, and some costs: traffic noise, some privacy leak, system complexity, centralization.

On balance it's not for me so I disabled it.

A case for the public suffix list?

Note that adding your domain to the PSL has potentially unwanted side effects.

For example, it will become somewhat inconvenient to share data across your subdomains. Instead of just setting a session cookie with domain=.start.page, you will need to implement a proper single sign-on mechanism. Email might be affected, too, especially when it comes to DKIM and DMARC.

You also need to make sure that your domain is listed in the private section of the PSL. There was a thread a while ago when someone got mistakenly listed in the ICANN TLD section and couldn't get a wildcard certificate for their domain. Let's Encrypt and most other CAs have a policy of rejecting domains like *.co.uk, and may rely on the PSL to tell which is which.
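To make the cookie side effect concrete, here is a simplified sketch of how a "site" boundary is derived from a suffix list. It handles exact-match rules only, with no wildcard or exception rules, and the two tiny suffix sets are made up for the example. Adding `start.page` to one of them is hypothetical.

```python
def registrable_domain(host, public_suffixes):
    """Return the public suffix plus one label, or None if host is a suffix.

    Simplified: exact-match rules only, no wildcard (*.) or exception (!) rules.
    """
    labels = host.lower().rstrip(".").split(".")
    # Scan from the longest candidate suffix to the shortest.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in public_suffixes:
            suffix_start = i
            break
    else:
        # No rule matched: treat the last label as the suffix.
        suffix_start = len(labels) - 1
    if suffix_start == 0:
        return None  # the host itself is a public suffix
    return ".".join(labels[suffix_start - 1:])


def same_site(a, b, public_suffixes):
    """True if both hosts share the same registrable domain."""
    ra = registrable_domain(a, public_suffixes)
    rb = registrable_domain(b, public_suffixes)
    return ra is not None and ra == rb


without_entry = {"page", "com"}
with_entry = {"page", "com", "start.page"}  # hypothetical PSL entry

print(same_site("alice.start.page", "bob.start.page", without_entry))
print(same_site("alice.start.page", "bob.start.page", with_entry))
```

With the entry in place, each customer subdomain becomes its own site, which is the isolation you want for blocklists. The same boundary is why a cookie scoped to `.start.page` would stop being accepted, since browsers refuse cookies set on a public suffix.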

Thanks for sharing those gotchas. I’ll be keeping them in mind when adding our domain. Thankfully we’ve utilized start.page as a separate domain to only host the pages. This does mean that we should be able to add ourselves to the PSL without too much fuss once we’ve got the basics in place.

> A case for the public suffix list?

For people curious about what the public suffix list is: https://publicsuffix.org/

Why did someone invent a centralized store for this instead of using a DNS TXT record like every other publicly useful domain property?

My understanding is that the PSL is good-enough and avoids somewhat costly/unreliable TXT lookups for every domain when only a very tiny fraction of domains would actually want this treatment.

There is also a bit of security risk since browsers use this list to set cookie restrictions. If it were in DNS, which the vast majority of people use unencrypted, an adversary could manipulate responses to either (a) drop the TXT record altogether so the domain is not restricted or (b) craft a response in which the domain disables the behavior.

The public suffix list serves only to separate (sub)domains that are reasonably expected to be controlled by different owners.

Many systems - ex: rate limits, malware domain lists - would be very easily and cheaply gamed if domain owners could "disown" subdomains at will, just with a change in DNS. There's a fairly long review process to get onto the public suffix list for exactly this reason.

There's also the historical aspect, that DNS is a much older technology than the need for the public suffix list. Mozilla at the time couldn't expect that all registries would adopt a new standard quickly or at all. Since there was a need for this information for browser security improvements, the list was born, and gradually became the de facto standard source of such information.

Definitely. Something new we learn everyday. Thanks for sharing

Yep, if you host user-generated content on subdomains and don't add yourself to the PSL, you deserve what you get.

> you deserve what you get

Seems unnecessary to add that as not everyone would've been aware of the PSL.

The parent post bringing up PSL was a helpful addition already.

Thanks, I managed to not be aware of this list until now, despite lots of professional experience building for the web, including two years working at a company that hosted 150k subdomains containing user-generated content.

OP phrased it poorly, but I'm likewise perplexed at how someone can be running a business that revolves around subletting a domain name and not know about the Public Suffix List. It seems like at some point they would have thought through some of the security problems inherent in sharing a domain, researched solutions, and learned about the list.

I really do wish that was the case but it's just not something that I've come across. I don't want to make excuses here and do take responsibility for it but sometimes I feel like we learn important lessons like this in the fire. Still, this one is on me for not knowing about it till today.

On that note though, I'm perplexed as to how people would manage this kind of thing if using paths instead of subdomains. So instead of <user>.start.page if we used start.page/user. In the latter case, I'm not sure how one would prevent their entire domain from being taken down if malicious users kept linking to malware hosted on file hosting/sharing sites. Is there something similar to the PSL for this?

At its core the issue in my head is user generated content linking out to malicious software being a point of trigger for entire sites being blocked. Does this mean that an entire publication site could be blocked if someone used a comment widget to link out to malware and the site got reported? That seems like an effective DoS mechanism at some point.

I guess what I'm struggling with is why the domain gets blocked instead of the actual url that contains malware (or even the single path that links out to it). Fwiw the google drive link hosting the malware is still active.

Yeah, these poor startups just can't help but run the modern equivalent of open relays! Won't somebody think of the valuations!

malware spreaders just need to add their domains to the psl, at which point they can serve from unlimited subdomains?

seems like a great system

No, this is exactly why the PSL is managed by browser maintainers and not an automated system.

I'm shocked no one has pointed this out yet, but it's a really really bad move to host user-submitted content on your primary business domain. There's no such thing as subdomain culpability in the way the Internet is operated.

To clarify here, we did think through this and start.page is not our primary business domain. It's the URL we used to host user generated content. You are 100% correct that it's a bad move to host user-submitted content on our primary business domain and we did intentionally avoid that.

Where we, and in this case, I, missed a very important step by the looks of it is in adding ourselves to the public suffix list. I have done research on subdomain content management but for whatever reason, today is the first day I've come across the PSL. Something I definitely take responsibility for, and it makes me wonder what other stuff I might be missing that is obvious to other folks who've done this at scale.

OK but still, if your primary product is hosting user generated websites, it's a problem even if your "corpsite" marketing page stays up while your entire product is down.

How does github pages or neocities, for example, handle this kind of thing? Surely I can't bring down every github page by linking to a malicious Google drive file from my own page?

They handle this by serving user content from githubusercontent.com for this exact reason.

I think you may be missing my point. How does one user serving malicious content from githubusercontent.com not result in the takedown/blacklisting of all users' content on githubusercontent.com?

Sure, I understand how github.com is protected here, but I'm also unaware of github pages being totally blacklisted by Google even though I'm certain there have been malicious pages hosted there.

You can only limit the blast radius so much if the product itself is "hosting user content."

Correct. This is a major security fuck up and the consequence is as intended.

Indeed...it seems to me that Google is doing the right thing here. They’re protecting me from a site that anyone can use to link to malware. Working as intended.

The next step for Google is to start blocking more of these pesky malware linking sites. Some really annoying candidates to start with: google.com (especially drive.google.com and calendar.google.com), gmail.com, blogger.com, youtube.com. I also hear there's an entire app for opening malware links called Chrome. Any Google employees lurking around who could maybe take a look?

This sort of thing has not been at all uncommon for Google over the past few years. I’m looking forward to the day when they no longer have this level of power to make or break tiny companies. With any luck, they will either stumble with LLMs and become irrelevant or emerge as only one of several companies and be forced out of their monopoly.

If the malware was hosted on drive.google.com they should also block google.com to be consistent.

Heh, isn't Google great...

"Here is a test suite that can show if your AV/whatever detection tool works"

Google: "Kill it with fire, I get to choose what people see on the internet"

Quick update here: The block has been lifted and our domain has been marked as safe. This was a way better timeline than I could have hoped for given that it's still Sunday night in the American and European markets and it's still early morning in Asia. Australia, New Zealand and anything further than +7 GMT would have been minimally affected.

I really appreciate the community here sharing thoughts, similar experiences, and ideas on what to do. First time I've heard of the public suffix list for this.

A quick question to anyone who happens upon this: How does one prevent this issue affecting an entire site in general? Is there a grace period that Google gives a verified (via search console) site with a security issue? If not, then I'm curious how to protect a site which is targeted by malicious groups via comment widgets or if they host content using paths instead of subdomains. Eg: medium.com uses paths to go to user generated content. How would they defend from having their entire domain blocked if someone created a publication that linked out to malware?

Cheers all!

The project I work on, which also hosts user-generated content, ran into related problems:

- Outlook365 blocking any emails containing our domain

- ISPs blocking our domain via DNS filtering

In each case the blocklisting process was far from transparent and mitigation was difficult and stressful.

If you're in the same boat, reach out to me (email in profile). I believe we can make this topic a little less scary by connecting and sharing learnings.

My experience with this was front page material a while ago - the linked article contains info on how I dealt with it and preventive measures (that you are probably too late to implement now)


I've seen this happen at $EMPLOYER and it actually went beyond the website. Any email you send that has the url/domain in the text (e.g. in the signature) gets flagged by gmail or any G workspace email server with a big red warning. So, all customers who use Google's email servers (directly or indirectly via G Workspace) will get the red warning banners on all emails sent from anyone in your organization. Now THAT gets annoying real quick.

I had this issue a few years ago. The problem went away after I fixed it, submitted what I fixed... Wait a few days and suddenly back in business.

I have never seen any websites host more malware than Office365 and Google Drive.

Blocking THOSE two domains would likely resolve half the malware issues on the web, create a temporary flurry of confusion, and then accidentally solve the other half as people are forced to understand what saving files to a cloud actually entails.

This is a useful warning that you can't share a SLD with other organizations, at least not for any service of value.

Good title prefix for future searches.

"Just carried a link to it."

-- Game over.

first of all startpage.com is a search engine and your site ( start.page ) looks a little spammy

Why did you allow the page to be created then wait 24 hours to remove it? What is to stop that person or others from continuously doing this?
