Google Safe Browsing can kill a startup (gomox.medium.com)
1714 points by gomox 50 days ago | 543 comments

This is actually funny, because I was involved with the creation of this list, way back in 2004. The whole thing started as a way to stop phishing.

I was working at eBay/PayPal at the time, and we were finding a bunch of new phishing sites every day. We would keep a list and try to track down the owners of the (almost always hacked) sites and ask them to take it down. But sometimes it would take weeks or months for the site to get removed, so we looked for a better solution. We got together with the other big companies that were being phished (mostly banks) and formed a working group.

One of the things we did was approach the browser vendors and ask them: if we provided them a blacklist of phishing sites, which we already had, would they block those sites at the browser level?

For years, they said no, because they were worried about the liability of accidentally blocking something that wasn't a phishing site. So we all agreed to promise that no site would ever be put on the list without human verification and the lawyers did some lawyer magic to shift liability to the company that put a site on the list.

And thus, the built-in blacklist was born. And it worked well for a while. We would find a site, put it on the list, and then all the browsers would block it.

But since then it seems that they have forgotten their fear of liability, as well as their promise that all sites on the list will be reviewed by a human. Now that the feature exists, they have found other uses for it.

And that is your slippery slope lesson for today! :)

This is an amazing story. It really demonstrates the way we pave our road to hell with good intentions...

We should really do something about this issue, where so few companies (arguably, a single one) hold so much power over the most fundamental technology of the era.

Here-here! I really wish there was more human involvement in a lot of these seemingly arbitrary AI-taken actions. Everything from app review to websites and more. This heavy reliance on automated systems has led us down this road. Shoot, keep it, just give us the option to guarantee human review - with of course transparency. We don't need any more "some human looked at this and agreed, the decision is final, goodbye."

I know it's easier said than done, especially when taking the scale of the requests into account, but the alternative has, does, and will continue to do serious harm to the many people and businesses caught in this wide, automated net.

It's interesting how closely the unfolding of this awful scenario has followed an entirely predictable path based on the shifting incentives: now hundreds of thousands of businesses face the same massive hazard of being blocklisted without adequate human review, and with mediocre options to respond to it if it occurs.

Without a shift in incentives, it's unlikely the outlook will improve. Unless the organisations affected (and those vulnerable) can organise and exert enough pressure for Google to notice and adjust course, we're probably going to be stuck like this -or worse- for a long time.

Blacklisting a site incorrectly seems like a perfectly adequate reason for a defamation lawsuit. So, I think the real issue is with the legal system.

> this awful scenario has followed an entirely predictable path

The interesting thing about predictable paths is that at the start there are a LOT of them, then over time there becomes just one of them. I don't see that this path was any more predictable at the start than any other.

It feels like the need for automated systems is a result of the ever-increasing size of the world (there are now nearly 5 billion internet users[0]). For Apple, app review can take days, mainly because doing human review [consistently] well and constantly for 8 hours a day isn't easy[1], leading to staffing issues when bad reviewers get weeded out and only a small percentage of hires stick around. Outside of hiring 10,000 employees just to endlessly review phishing links for 40 hours a week, you need automation to triage these phishing sites and deal with the outcome later such as via on-demand review by a human (which worked in this case, but won't always work - humans still make mistakes). I'm not sure if there is a solution for this problem outside of just not having the safe browsing product if 'makes no errors' is a requirement.

0: https://en.wikipedia.org/wiki/Global_Internet_usage

1: https://www.businessinsider.com/heres-why-it-really-sucks-to...

There's no reason the number of humans dealing with these problems can't scale alongside the number of humans creating them.

But it's a lot cheaper to pay for a few really expensive programmers to make a just-good-enough AI than to pay for thousands of human moderators. So we end up with a stupid computer creating tonnes of human misery all for the sake of FAANG's already fat profit margins.

"So we end up with a stupid computer creating tonnes of human misery all for the sake of FAANG's already fat profit margins."

I don't want to blame this entirely on the big companies, though. Also the people want and expect "free" things on the internet. This is how we ended up like this.

> There's no reason the number of humans dealing with these problems can't scale alongside the number of humans creating them.

I would think the attackers are using automation also, to spam attacks as in other areas of fraud. It can only be a battle of AI ultimately.

Depends on which problem you're tackling. With app reviews, for example, it is very easy to rate limit the 100 USD developer licenses. And in cases like the one in the Medium article, businesses would gladly pay a hundred bucks to get real humans to produce competent answers/reviews/decisions. And if you dislike this solution because it creates a Google tax (pay us or we'll block your site), make it not a service payment, but a security deposit which they'll only keep if you are fraudulent in some way.

Is it just me, or, the way things are currently stacked, is human insight still the best line of defence? The OP and other anecdotes in the comments are examples of why we’re not quite at “AI vs AI” yet.

:s/Here-here/Hear hear/

Hear hear!

Where where!?

Here here!

> I really wish there was more human involvement in a lot of these seemingly arbitrary AI-taken actions.

Narrator: but it was only ever to get worse

Couldn’t agree more, the transparency is key. It enables faith in the system and outcome.

The counter argument to transparency will be that it provides too much information to those who aim to build phishing sites not blocked by the filter.

That said, we’ve experienced systems in which obfuscation wins out over transparency and it would be nice to tackle the challenges of transparency.

Are you implying that the list no longer has a good intention? I wouldn't be surprised if there are multiple orders of magnitude more phishing and hacked websites in 2021 than there were in 2004. Even with human checking, I doubt you'll ever have a 0% failure rate. Is the solution to just give up on blocking phishing sites?

The failure rate doesn't need to be 0%. If the solution is good, it'll at least be close to 0%, which means the vendor can afford to provide better support for the small number of mistakes, so that they can be clearly explained to the affected party and rectified more quickly. If the failure rate is so high that better support is infeasible, then the current solution is not really a good one and we need to consider a revision.

> Are you implying that the list no longer has a good intention?

Most of the time I run into blocked sites they seem to be blocked because of copyright infringement, not phishing. The only phishing sites I've seen in the last year or so are custom tailored. For example, I had to deal with a compromised MS365 account last year where the bad actor spun up a custom phishing site using the logo, signature, etc. of the victim.

So IMHO the intentions are no longer pure plus the effect is diminished and being worked around.

The solution is for the legitimate sites that are driven out of business by Google AI to sue Google for tortious interference and libel.

This helps one group and hurts another. If Google is liable for blocking potential malware and phishing pages, they'll either stop blocking it, or adjust their algorithm to strongly err on the side of allowing phishing sites.

Businesses become safer, but more regular people will get phished.

>or adjust their algorithm to strongly err on the side of allowing phishing sites.

It's not the role of Google to disallow phishing sites (as a browser) just like it's not the role of the ISP.

Make it hookable so people can choose their own phishing protection service.

People wouldn't know or care which to pick. They would see the pop-up asking to select a phishing protection provider, would get confused and angry and think "where do I click to get past this pop-up, I want to go on Facebook and this stupid computer is nagging me with stuff again!"

Phishing protection is mostly needed for people who have no clear concept of phishing or technicalities. They just want to do things on the internet, like social media, they don't care about things behind the scenes, that's boring uncool nerd stuff.

And then they will choose the same block list and sites will have the same problem.

All? I doubt it. Not to mention they could offer control to override whatever you like.

> they could offer control to override

Chrome lets you override and proceed to the site. The problem for the small business is that a large fraction of their customers see the scary red warning page.

Well enough that it will still be a blocker.

Well, that goes without saying. If you want a blocker, you want a blocker. So all the Nigerian princes and the like should still be blocked.

You just don't want to give control over the blocking blacklist/whitelist to a single entity, even less so to a huge powerful one, possibly in a country other than your own (which e.g. forces their foreign policy dictums to your blacklist), and even less so the one that already makes your browser, that should be a totally neutral conduit.

I don't think this solves the problem from the article, since small businesses will still have to deal with getting mistakenly blocked by whatever the popular blockers are. With 40,000 new phishing sites per week, it's not an easy task. If the blockers are free (I imagine they'd have to be to get widespread adoption), who's going to review the false positives? Volunteers?

But also, it would leave the people most vulnerable to phishing unprotected, namely those not tech-savvy enough to install a phishing protection service. Most internet users don't even have ad-blockers.

The problem isn't the company that blocked it. The problem is the company that reported that there was a problem when there wasn't. In this case it sounds like Google is both companies.

>Is the solution to just give up on blocking phishing sites?

IMHO yes. It's too much power for one company to wield. And especially a company with such questionable morals as Google. This cure is worse than the disease.

I thought you said, the curse is worse than the disease... which also would've made sense.

" Is the solution to just give up on blocking phishing sites?"

But maybe not do it by default on browser-level.

But if you do, then there really needs to be ways to combat wrong decisions in a timely manner.

The solution is simple: Liability. As soon as it becomes legally infeasible to let algorithms block people, it will stop happening.

Make it easy and affordable to submit legal complaints for tech misbehavior and make the penalties hurt.

Ah, so you suggest liability for the vendors of the software blocking websites, with, in practice [1], no liability for the operators of a compromised website, if it is phishing/malware?

This is a great approach, if your goal is to optimize for increasing the amount of dangerous crap on the web. But, eh, that's surely worth it, because the profitability of startups is more important than little things like the security of the average netizen...

[1] Even if you make the operators liable [2], in practice, you'll never be able to collect from most of them. Whereas the blacklist curators are a singular, convenient target...

[2] If you can demonstrate how the operators of compromised websites can be held liable for all the harm they cause, I will happily agree that we should do away with blacklists. Unfortunately, the technical and legislative solutions for this are much worse than the disease you are trying to treat.

Since phishing is not going to go anywhere with or without blacklists - for obvious reasons, e.g. lists can't cover everything, and you can't add sites to the list instantly - I am willing to tolerate a slight increase in phishing, which is going to exist anyway, in exchange for not having Google (or any other megacorp, or any other organization for that matter) as a gatekeeper of everybody's access to the internet. The potential for abuse of such power is much greater and much more dangerous than the danger from a tiny increase in phishing.

> I am willing to tolerate a slight increase in phishing

According to Google's most recent transparency report[1], as of December 20th of last year they were blocking around 27,000 malware distribution sites and a little over 2,000,000 phishing sites.

In your view, would turning off those blacklists and allowing those >2,000,000 sites to become functional again count as a "slight" increase?

(edit: That's a real question, incidentally, not a disagreement or an attempt at a 'zing'; I have no knowledge in this area but went to look up the numbers, and am curious whether 2,000,000 is truly a vanishingly small amount, relative to everything else that's out there that's not already on the list)

[1]: https://transparencyreport.google.com/safe-browsing/overview

I'm not sure what is counted as "sites" - i.e. if Google closes foo.bar/baz123 and the same server gets assigned bar.foo/zab345 and continues to serve malware, is it 2 separate sites? Did Google really achieve this much by forcing the changing of the URL? Sure, a bunch of people who got the phishing link in mail sent before the switch won't be phished once the old URL is shut down, but I have no idea how much that changes the picture - I'm sure phishers are well aware that their domains are short-lived and have already adapted to that, otherwise they'd be extinct. However, I'd be glad to read some field-validated data about how much closing those 2M sites, whatever is meant by "sites", actually helps against phishing.

I mean, if we could trust Google (or anybody else of that kind) to keep the blacklist strictly limited to a reasonable definition of malware and phishing, and knew that usage of such a list is strictly voluntary and under the control of the user, it would be an acceptable, if decidedly imperfect, remedy. But we know we can't trust any of this. Even if whoever works on this at Google right now is sincerely, ironclad committed to never letting any mission creep or abuse happen, once the means exist, these people can always be replaced with others who would use it to fight "misinformation", or "incitement", or "blasphemy", or whatever it is in fashion to fight this week. There's no mechanism that ensures it won't be abused, and abuse is very easy once the system is deployed.

Moreover, we (as people not in control of Google's decisions) have absolutely no means to prevent any abuse of this, since Google owns the whole setup and we have no voice in their decision making process. Given that, it seems prudent to make every effort to reject it while we still can. Otherwise, next time you want to make a site questioning Google's decisions about the malware list, nobody will be able to read it because it'll be marked as a malware site.

You can also be certain that these numbers include all the false-positives. One of the Open Source pages I maintain got blocked as well, because too many AV reported one library package as malware.

There's no "report as false-positive" button at Google, so these reports likely have a lot of false positives in them...

This was the case with railroads too, only a few controlled the biggest and most transforming and business-integral tech of the 1800s.

Prior to that it was those that controlled the printing presses.


History continues to repeat itself.

2 million phishing sites and counting... with 40,000 websites added each week.


I guess the automation started in 2007 or so.

Like some kind of perverse blockchain, no site is ever removed, even though most phishing sites don't live long.

I think you mean 2017? 2007 is when the feature launched.

January/February 2007 looks like the time the list jumped from a few hundred to tens of thousands of sites.

That was the normal volume of manually identified sites at the time. Before 2007 there weren’t a lot of participants because it was in beta.

Malware may be easier to catch, but for phishing, it was fairly small (under 150k) until around 2016 where it starts growing linearly.

I've just read something similar in Zero to One (by Blake Masters and Peter Thiel). Peter argues that computers can't replace humans - it'd be foolish to expect that, at least for the coming decades – strong AI replacing humans is the problem of the 22nd century. He proposes Complementarity and describes a successful implementation of this idea in PayPal's fraud detection system way back in 2002, when purely automated detection algorithms were quickly overcome by determined fraudsters. He went on to found Palantir based on the same idea.

>>> In mid-2000, we had survived the dot-com crash and we were growing fast, but we faced one huge problem: we were losing upwards of $10 million to credit card fraud every month. Since we were processing hundreds or even thousands of transactions per minute, we couldn’t possibly review each one—no human quality control team could work that fast. So we did what any group of engineers would do: we tried to automate a solution. First, Max Levchin assembled an elite team of mathematicians to study the fraudulent transfers in detail. Then we took what we learned and wrote software to automatically identify and cancel bogus transactions in real time. But it quickly became clear that this approach wouldn’t work either: after an hour or two, the thieves would catch on and change their tactics. We were dealing with an adaptive enemy, and our software couldn’t adapt in response. The fraudsters’ adaptive evasions fooled our automatic detection algorithms, but we found that they didn’t fool our human analysts as easily. So Max and his engineers rewrote the software to take a hybrid approach: the computer would flag the most suspicious transactions on a well-designed user interface, and human operators would make the final judgment as to their legitimacy. Thanks to this hybrid system—we named it “Igor,” after the Russian fraudster who bragged that we’d never be able to stop him—we turned our first quarterly profit in the first quarter of 2002 (as opposed to a quarterly loss of $29.3 million one year before). The FBI asked us if we’d let them use Igor to help detect financial crime. And Max was able to boast, grandiosely but truthfully, that he was “the Sherlock Holmes of the Internet Underground.” This kind of man-machine symbiosis enabled PayPal to stay in business, which in turn enabled hundreds of thousands of small businesses to accept the payments they needed to thrive on the internet. 
None of it would have been possible without the man-machine solution—even though most people would never see it or even hear about it.
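A rough sketch of the hybrid idea in that quote, in Python. The passage says nothing about how Igor worked internally, so everything here is an assumption for illustration: the `risk_score` field, the `review_threshold` value, and the `analyst` callback are hypothetical names, not PayPal's design. The only point is the division of labour: the machine ranks, the human decides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Transaction:
    tx_id: str
    risk_score: float  # hypothetical model output in [0, 1]; higher = more suspicious


def triage(
    transactions: List[Transaction], review_threshold: float = 0.8
) -> Tuple[List[Transaction], List[Transaction]]:
    """Split transactions into auto-approved ones and a human review queue.

    The software never cancels anything on its own. It only decides what a
    human should look at, most suspicious first.
    """
    approved = [t for t in transactions if t.risk_score < review_threshold]
    queue = sorted(
        (t for t in transactions if t.risk_score >= review_threshold),
        key=lambda t: t.risk_score,
        reverse=True,
    )
    return approved, queue


def resolve(
    queue: List[Transaction], analyst: Callable[[Transaction], bool]
) -> Dict[str, bool]:
    """Ask the analyst for the final verdict on each flagged transaction.

    `analyst` stands in for a person; it returns True if the transaction
    is judged legitimate.
    """
    return {t.tx_id: analyst(t) for t in queue}


txs = [Transaction("a", 0.10), Transaction("b", 0.95), Transaction("c", 0.85)]
approved, queue = triage(txs)

# Stand-in for a human judgment call, used only to make the sketch runnable.
verdicts = resolve(queue, analyst=lambda t: t.risk_score < 0.9)
```

With these made-up numbers, "a" is approved automatically, while "b" and "c" go to the review queue in that order and get their final answer from the analyst rather than from the score.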

Liability was my first thought. How is an assertion that a site contains malware not libel? A site would easily be able to demonstrate lost revenue.

Can someone dig out that old agreement to see if Google can be sued big time for this?

I doubt it but I must say it would make me happy and that would be weird because Schadenfreude normally isn't my thing.

> since then it seems that they have forgotten their fear of liability

They most likely have offloaded the liability to a “machine learning algorithm”. It’s easy for companies to point the finger at an algorithm instead of them taking responsibility.

Which then leads them to the awkward place of having to be transparent about how their algorithm works

Either take responsibility, or be transparent.

But we all want to have our cake and eat it

I take offense to this. Sure, I like to eat cake.

But if I liked to eat cake as much as Google does, I'd have died of obesity (= have my life ruined by legal issues) a long time ago.

Simple solution = let Google use their imperfect (false-positive-prone) filter, allow them to collect $12 / year for not being blacklisted, and have Google send all revenue to the Electronic Frontier Foundation or similar internet-defending foundations.

Another road to hell paved with good intentions. Once everyone’s paying, who’s to stop them from pocketing the money instead?

“After careful review, we’ve concluded that the Electronic Frontier Foundation no longer aligns with the goals of Google or its parent company Alphabet Inc. to the extent we require from recipients of our Freedom Fund. We will place these funds in a separate account and use them in ways we believe will be in the best interest of digital freedom, both now and in the future.”

Worse, with such a money flow, the EFF will get corrupt very soon.

Absolutely. There’s nothing like guaranteed money or power to corrupt people.

"For years, they said no, because they were worried about the liability of accidentally blocking something that wasn't a phishing site."

Can anyone explain how a web browser author could be liable for using a blacklist? Once past the disclaimer in uppercase that precedes every software install, past the Public Suffix (White)List that browsers include, how do you successfully sue the author of a software program, a web browser, for having a domain name blacklist? Spamhaus was once ordered to pay $11 million for blacklisting some spammers, but that did not involve a contractual relationship, e.g., a software license, between the spammers and Spamhaus.

I think the situation is actually exactly like the Spamhaus case you describe: it wouldn't be the browser user that sues, but the blocked website's owner. The website's owner need not have accepted any kind of agreement from the browser maker in order to be harmed by the blocklist.

Perhaps the website would sue the author of the list.

That does not explain why this comment suggests a browser author was afraid to use the list.

The browser author could easily require the list author to agree that the browser author has no obligations to the list author if the list author gets sued by a website, and the list author must indemnify the browser author if the browser author is named in any suit over the list. The list author must assume all the risk.

That's very interesting. Would you not think for a moment that such mechanism could be abused?

The internet was a much kinder, more trusting place back then. When the browser makers agreed to not use it for bad things, we believed them.

I think as always great ideas do not account for human nature...

> there is no chance in hell that the government will try to break them up.

Government is not the only option. Railroads were fixed by Congress. If you want to fix or split Google, writing your representative about your concerns might help.

After years of seeing developments like this, getting worse and worse, it fills me with rage to think about how clearly nobody in power at Google cares.

I naively used to think, "they probably don't realize what's happening and will fix it." I always try to give benefit of the doubt, especially having been on the other side so many times and seeing how 9 times out of 10 it's not malice, just incompetence, apathy, or hard priority choices based on economic constraints (the latter not likely a problem Google has though).

At this point however, I still don't think it's outright malice, but the doubling down on these horrific practices (algorithmically and opaquely destroying people) is so egregious that it doesn't really matter. As far as I'm concerned, Google is to be considered a hostile actor. It's not possible to do business on the internet in any way without running into them, so "de-Googling" isn't an option. Instead, I am personally going to (and will advise my clients to):

Consider Google as a malicious actor/threat in the InfoSec threat modeling that you do. Actively have a mitigation strategy in place to minimize damage to your company should you become the target of their attack.

As with most security planning/analyzing/mitigation, you have to balance the concerns of the CIA Triad. You can't just refuse Google altogether these days, but do NOT treat them as a friend or ally of your business, because they are most assuredly NOT.

I'm also considering AWS and Digital Ocean more in the same vein, although that's off topic on this thread. (I use Linode now as their support is great and they don't just drop ban hammers and leave you scrambling to figure out what happened).

Edit: Just to clarify (based on confusion in comments below), I am not saying Google is acting with malice (I don't believe they are personally). I am just suggesting you treat it as such for purposes of threat modeling your business/application.

Walter Jon Williams, circa 1987, wrote a story of a far-flung humanity's future in "Dinosaurs," in which humans had been engineered into a variety of specialized forms to better serve humanity. After nine million years of tweaking, most of them are not too bright but they are perfect at what they do. Ambassador Drill is trying to prevent a newly discovered species, the Shar, from treading on the toes of humanity, because if the Shar do have even a slight accidental conflict as the result of human terraforming ships wiping out Shar colonies because they just didn't notice them, the rather terrifyingly adapted military subspecies branches of humanity will utterly wipe out the Shar, as they have efficiently done with so many others, just as a reflex. Ambassador Drill fears that negotiations, despite his desire for peace, may not go well, because the terraforming ships will take a long time to receive information that the Shar are in fact sentient and billions of them ought not to be wiped out ...

Google, somehow, strikes me as this vision of humanity, but without an Ambassador Drill. It simply lumbers forward, doing its thing. It is to be modeled as a threat not because it is malign, but because it doesn't notice you exist as it takes another step forward. Threat modeling Lovecraft-style: entities that are alien and unlikely to single you out in particular, it's just that what they do is a problem.

Google's desire for scale, scale, scale, meant that interactions must be handled through The Algorithms. I can imagine it still muttering "The algorithms said ..." as anti-trust measures reverse-Frankenstein it into hopefully more manageable pieces.

> Google's desire for scale, scale, scale, meant that interactions must be handled through The Algorithms

That's fine when you're a plucky growth startup. Less fine when you run half the internet.

If Google doesn't want to admit it's a mature business and pivot into margin-eating, but risk-reducing support staffing, then okay: break it back up into enough startup-sized chunks that the response failure of one isn't an existential threat to everyone.

This lack of staffing is something that really annoys me. It's all over the big tech companies, and is often cited as the reason why (for example) YouTube, Twitter, Facebook, etc cannot possibly proactively police (before publishing) all their user content due to the huge volume.

Of course they can; Google and the rest earn enough to throw people at the problems they cause/enable. If they can't, then they should stop. If you cannot scale responsibly, then you should not scale at all as your business has simply externalised your costs onto everyone else you impact.

There is a limit to which problems you can throw people at, though. Facebook’s and Youtube’s human moderators suffer from the trauma of watching millions of awful videos every day. Policing provocative posts that are dogwhistling while still allowing satire and legitimate free expression is incredibly challenging and requires lots of context in very different fields. It’s not as simple as setting up a side office in the Philippines and hiring a thousand locals for moderation.

Yes: skilled labor. This is not a novel problem. Other companies create internal training pipelines and pay higher wages to attract those sorts of employees, when they're critical to business success.

I agree. Google is such a large behemoth that actively tries to avoid customer support if it can. Splitting it into smaller businesses with a bit of autonomy, and without ad money fueling everything else, means those smaller businesses have to give a shit about customers and compete on even ground.

Same applies to Facebook and other tech companies. The root issue is taking huge profits from one area of business into other avenues which compete with the market on unfair ground (or outright buying out the competition)

However, anti-trust in the US has eroded significantly.

> However, anti-trust in the US has eroded significantly.

Perhaps compared to the 40s-70s, but certainly not compared to the Reagan era. Starting with the Obama administration, there's been a strong rebirth of the anti-trust movement and it's only gaining momentum (see many recent examples of blocked mergers)[1].

[1] https://hbr.org/2017/12/the-rise-fall-and-rebirth-of-the-u-s...

The Obama admin used it only to attack enemies.

Renata Hesse was part of that effort, and has since worked for Google and Amazon, and is now expected to be in charge of anti-trust at Biden's DOJ.

And as long as the internet giants are on the correct side of the culture war there will be scant appetite for breaking them up or reining them in. As long as you need 5 phone calls to silence someone and erase them from the online, there is no chance in hell that the government will try to break them up.

> That's fine when you're a plucky growth startup. Less fine when you run half the internet.

It's never fine.

The abdication of responsibility and, more importantly, liability to algorithms is everything that's wrong with the internet and the economy. The reason these tech conglomerates are able to get so big when companies before them couldn't is because it's impossible to scale the way they have without employing thousands of humans to do the jobs that are being poorly done by their algorithms. Nothing they're doing is really a new idea, they just cut costs and made the business more profitable. The promise is that the algorithms/AI can do just as good of a job as humans but that was always a lie and, by the time everyone caught on, they were "too big to fail".

> It's never fine.

It kind of is, though.

The idea is that the full algorithm is "automation plus some guy". Automation takes care of 99.99% of it, and some guy handles the 0.01% that's exceptional, falls through the cracks, and so on.

The problem is when you scale from 100,000 events per day to half a trillion, and your fallback is still basically "some guy". At ten failures a day, contacting The Guy means sending an email, and maybe sometimes it takes two. At a million failures a day, your only prayer of reaching The Guy is to get to the top of HN, or write a viral Twitter thread.
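To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 0.01% exception rate and the two daily event counts are the round figures from the comment above, not measured data, and the 50-cases-per-reviewer capacity is purely a hypothetical placeholder.

```python
# Assumption: 0.01% of events are exceptions, i.e. 1 in every 10,000.
# Integer arithmetic keeps the results exact.
EXCEPTIONS_PER_10K_EVENTS = 1


def exceptions_per_day(events_per_day: int) -> int:
    """Expected number of events per day that fall through to a human."""
    return events_per_day * EXCEPTIONS_PER_10K_EVENTS // 10_000


def reviewers_needed(exceptions: int, cases_per_reviewer_per_day: int) -> int:
    """Ceiling division: how many people it takes to clear the daily backlog."""
    return -(-exceptions // cases_per_reviewer_per_day)


small = exceptions_per_day(100_000)            # 10 per day
large = exceptions_per_day(500_000_000_000)    # 50,000,000 per day

# Hypothetical capacity of 50 cases per reviewer per day.
small_team = reviewers_needed(small, 50)       # 1 person: "The Guy"
large_team = reviewers_needed(large, 50)       # 1,000,000 people
```

At the stated rate, half a trillion events a day works out to fifty million exceptions, so the "million failures a day" in the comment is, if anything, on the generous side. Either way the fallback stops being one inbox and becomes a staffing problem.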

There are some things which are important enough that they can't be left up to this formula, and maybe you're thinking of those. I'm not, and I doubt the person you're replying to is either.

This is probably a big part of why Google is invested in (limited) AI, because a good enough "artificial support person" means having their cake and eating it too.

The issue with (limited) AI is that it's seductive. It allows executives to avoid spending actual money on problems, while chalking failures up to technical issues.

The responsible thing would be to (1) staff up a support org to ensure reasonable SLAs & (2) cut that support org when (and if) AI has proven itself capable of the task.

> It simply lumbers forward, doing its thing. It is to be modeled as a threat not because it is malign, but because it doesn't notice you exist as it takes another step forward.

This is a concept that I think deserves more popular currency. Every so often, you step on a snail. People actually hate doing this, because it's gross, and they will actively seek to avoid it. But that doesn't always work, and the fact that the human (1) would have preferred not to step on it; and (2) could, hypothetically, easily have avoided doing so, doesn't make things any better for the snail.

This is also what bothers me about people who swim with whales. Whales are very big. They are so big that just being near them can easily kill you, even though the whales generally harbor no ill intent.

I'm curious if whales are more dangerous on an hour-by-hour basis than driving?

That's generally my rubric for whether a safety concern is possibly worth avoiding an activity over.

> I'm curious if whales are more dangerous on an hour-by-hour basis than driving?

It depends on how many passengers you pack in a whale.

My understanding is that a Chrysler as big as a whale can seat about 20. (Love Shack, 1989)

> “You will have killed us,” Gram said, “destroyed the culture that we have built for thousands of years, and you won’t even give it any thought. Your species doesn’t think about what it does any more. It just acts, like a single-celled animal, engulfing everything it can reach. You say that you are a conscious species, but that isn’t true. Your every action is... instinct. Or reflex.

Good story. I can imagine what the specialized humans did to the generalist humans eons ago.

Except in our case, Google's terraforming ships couldn't care less. It's just not part of their programming that there might be some intelligent life out there worth caring about that might be hurt by their actions, so there's no way for them to receive this information. It's not that it's hard to explain, there's nobody to explain it to.

Modern large corporations are just a more inefficient, less effective paperclip maximizer, with humans gumming up the works.

Google is striving hard to remove the "human" part of the problem.

After I finished reading the parent comment,

> Google's desire for scale, scale, scale, meant that interactions must be handled through The Algorithms. I can imagine it still muttering "The algorithms said ..." as anti-trust measures reverse-Frankenstein it into hopefully more manageable pieces.

I immediately pressed C-f to search the string "paperclip maximizer", and was not disappointed. Thanks for mentioning it.

You're making another perfect case for why Google should be broken up. It’s important that we can choose again.

Sounds like a non-aligned AI.

It essentially is a non-aligned AI. AIs don't need to be implemented in silico. Bureaucracy is by itself a computing medium too.

That makes me wonder if someone has ever written a scientific paper proving that the bureaucratic processes in place at their company are Turing Complete. You can imagine some sort of Rule 110 cellular automaton being implemented in TPS reports.
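For reference, Rule 110 itself is tiny. The following is a minimal sketch of one update step on a row of cells, assuming wrap-around edges (a choice made here for simplicity; the comment does not specify boundary handling, and the TPS-report encoding is left to the reader). Each new cell is looked up from the bits of the number 110 using its left neighbour, itself, and its right neighbour.

```python
RULE = 110  # binary 01101110: one output bit per 3-cell neighbourhood


def step(cells):
    """Apply one Rule 110 update to a list of 0/1 cells with wrap-around edges."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right  # neighbourhood as 0..7
        new_cells.append((RULE >> index) & 1)
    return new_cells


def run(cells, generations):
    """Return the list of successive rows, starting with the initial one."""
    rows = [cells]
    for _ in range(generations):
        rows.append(step(rows[-1]))
    return rows


if __name__ == "__main__":
    for row in run([0] * 15 + [1], 8):
        print("".join("#" if c else "." for c in row))
```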

A cellular automaton over office documents would be a nice thing to try! That said, a proof of Turing-completeness of bureaucracy is relatively trivial:


  1. Requester data
     [bunch of boxes]
  1a. (*) Details on stuffs
     [bunch of boxes]
  1b. (*) Details on different stuffs
     [bunch of boxes]
  4. Additional documents
    - [Frobnication Registration #432]
    - [Frobnication Query #1111]

  (*) - Fill section a) if $something. Fill section b) if $somethingelse.

With sections 1a/1b implementing conditional branching, and section 4 implementing storage.

> proving that the bureaucratic processes... are Turing Complete

It's called COBOL

Thanks for mentioning this story; I just finished it and it's a great read.

Thanks for the story recommendation!

"never attribute to malice that which is adequately explained by stupidity" and all that, but after the events and the almost perfectly orchestrated behavior we've seen in the past and last couple of weeks it's becoming increasingly difficult, at least to me, to not attribute this to malice. Probably deliberate negligence is a better term. They know their systems can make mistakes, of course they do, and yet they build many of their ban-hammers and enforce them as if that wasn't the case.

This approach to systems engineering is the technological equivalent of the personality trait I most abhor: the tendency to jump quickly to conclusions and not be skeptical of one's own world-view.

[1] https://en.m.wikipedia.org/wiki/Hanlon%27s_razor#cite_note-m...

"Consciously malicious" is not a good rule of thumb standard to measure threats to yourself or your business; it only accounts for a tiny bit of all possible threats. GP isn't claiming that Google is consciously malicious, they are claiming that you should prepare as if they were. These are not the same thing.

A lion may not be malicious when it's hunting you, it's just hungry; look out for it anyway. A drunk driver is unlikely targeting you specifically; drive carefully anyways. Nobody at Google is specifically thinking "hehehe now this will ruin jdsalareo's business!" but their decisions are arbitrary, generally impossible to appeal, and may ruin you regardless; prepare accordingly.

"The decisions are arbitrary, impossible to appeal, and may ruin you."

This is a monopoly.

Google may be a monopoly, but this quote has nothing to do with monopoly status. It has to do with power.

As a local businessman I can ruin someone’s life by applying the right legal pressure. Likewise, if one of my customers is reliant on my product to run their own business, and I drop them suddenly (akin to what google sometimes does), that could ruin them. But it’s not because I’m a monopoly, only because people rely on me. Monopoly implies there’s no choice, and while that IS true with google and search, it is not implied by “arbitrary, impossible to appeal, and may ruin you”. The two are distinct (though often related) problems that are both exemplified in Google.

Yes, exactly what I meant, thank you.

And very well said I might add. I don't mean to leave a vapid "I agree with you" comment, but your analogies are fantastic. They are accurate, vivid, and easily understandable.

I think mistakes just happen and are possibly just as helpful as they are harmful to Google. If they find something they particularly hate or find damaging, they can just "oops" their way to the problem being gone. Take Firefox[1]: each time a service went "oops" on Firefox, they gained market share for Chrome.

I have no doubt they'd use similar "oops" for crushing a new competitor in the ad space. Or perhaps quashing a nascent unionizing effort. It's all tinfoil of course because we don't have any public oversight bodies with enough power to look into it.

[1] https://www.techspot.com/news/79672-google-accused-sabotagin...

That's the nature of a dominant position. It gives you the power to engineer "heads I win, tails you lose" dynamics.

Well, I think the stupidity and laziness are exacerbated by their ill will towards customers and users. This is also what prevents them from reforming. The general good will and sense of common purpose was necessary in Google's early days when they portrayed themselves as shepherds of the growth of the web. Now they are more like feudal tax collectors and census takers. Sure they are mostly interested in extracting their click-tolls, but sometimes they just do sadistic stuff because it feels good to hurt people and to be powerful. Any pseudo-religious sense of moral obligation to encourage 'virtuous' web practices has ossified, decayed, been forgotten, or been discarded.

I was thinking about this this week in the context of online shopping with in store delivery. My wife recently waited nearly half an hour for a “drive up” delivery where she had to check in with an app. Apparently the message didn’t make it to the store, and when she called half way into her wait she wasn’t greeted with consolation, but derision for not understanding the failure points in this workflow.

It seems that the inflexible workflows of data processing have crept into meatspace, eliminating autonomy from workers' job functions. This has come at the huge expense of perceived customer service. As an engineer who has long worked with IT teams creating workflows for creators and business people, I see the same non-empathetic, user-hostile interactions well known in internal tools become the standard way to interact with businesses of all sizes. Broken interactions that previously would be worked around now leave customer service reps stumped and with no recourse except the most blunt choices.

This may be best for the bottom line, but we’ve lost some humanity in the process. I fear that the margins to return to some previously organic interaction would be so high that it would be impossible to scale and compete. Boutique shops still offer this service, but often charge accordingly, and without the ability to maintain in-person interactions at the moment, I worry there won’t be many left when the pandemic subsides.

Very poignant observation. I have run into this as well in situations in meat-space everywhere from the DMV queue to grocery pickup.

Empathy and understanding for fellow humans is at an all time low, no doubt exacerbated by technologies dehumanizing us into data points and JSON objects in a queue waiting for the algorithm to service.

As wonderful as tech has made our lives, it is not fully in the category of "better" by any stretch. You're totally right about margins being too high, but I do hope it opens up possibilities that someone is clever enough to hack.

One of the things I hate the most is people I'm transacting with telling me something has to be done in a certain way because that's how "their system" works.

A recent example, I forgot to pay my phone bill on time and network access got turned off. I came to pay it on Friday, and they tell me the notice will appear in their systems only on Monday and then it takes 2 days for the system to automatically reactivate my access. No, they can't make a simple phone call to someone in the company, yes I will be charged full monthly price for the next month even though I didn't have access for a few days, nothing we can do - ciao

Systems (normally) model organizational processes, so companies with garbage processes usually have garbage systems in place too. This highly specific case reeks of fraud, and you should be able to report them to some kind of ombudsman so you could get your couple days' worth of fees back.

I would bet they have some terms & conditions the person agreed to that leaves them legally SOL.

I would also bet that the right kind of escalation leads directly to the desk of someone who will give them a refund.

Yes, the terms probably were written when it took two days for a check to clear.

No, the ombudsman probably can’t get legal to update the T&Cs

Contracts by definition cannot bind people into illegal conditions, and there are degrees of neglect that can be considered illegal. The entire point of an ombudsman is to keep actors within "this is not illegal" lines; I'm guessing you could do this in small claims court too, but with the plague and everything it can take a lot longer.

I have not been noticing that.

I am finding the poorly paid workers who provide service to me polite and helpful.

Perhaps this is geography? Different in different places?

> "never attribute to malice that which is adequately explained by stupidity"

I keep reading this on the internet as if it’s some sort of truism, but every situation in life is not a court where a prosecutor is trying to prove intent.

There is insufficient time and resources to evaluate each and every circumstance to determine each and every causative factor, so we have to use heuristics to get by and make the best guesses. And sometimes, even many times, people do act with malice to get what they want. But they’re obviously not going to leave a paper trail for you to be able to prove it.

> I keep reading this on the internet as if it’s some sort of truism

I don’t believe this statement was initially intended to be axiomatic; rather, it was meant to serve as a reminder that the injury one is currently suffering is, more likely than not, the result of human frailty.

I'm not sure it's even attributable to stupidity (necessarily) so much as to automation or, more long-windedly, to the fact that automation at scale will sometimes scale in wacky ways and said scale also makes it nearly impossible--or at least unprofitable--to insert meaningful human intervention into the loop.

Not Google, but a few months back I suddenly couldn't post on Twitter. Why? Who knows. I don't really do politics on Twitter and certainly don't post borderline content in general. I opened a support ticket and a follow-up one and it got cleared about a week later. Never found out a reason. I could probably have pulled strings if I had to but fortunately didn't need to. But, yeah, you can just randomly lose access to things because some algorithm woke up on the wrong side of the bed.

>said scale also makes it nearly impossible--or at least unprofitable--to insert meaningful human intervention into the loop.

Retail and hotels and restaurants can insert meaningful human intervention with less than 5% profit margins, but a company with consistent $400k+ profit per employee per quarter cannot?


This is what I'm talking about in my original comment about the malice and stupidity aphorism.

Someone or some team of people is making the conscious decision that the extra profit from not having human intervention is worth more than avoiding the harm caused to innocent parties.

This is not a retail establishment barely surviving due to intense competition that may have false positives every now and then because it's not feasible to catch 100% of the errors.

This is an organization that has consistently shown they value higher profits due to higher efficiencies from automation more than giving up even an ounce of that to prevent destroying some people's livelihoods. And they're not going to state that on their "About Us" page on their website. But we can reasonably deduce it from their consistent actions over 10+ years.

Fair enough. Scale does make things harder but my $FINANCIAL_INSTITUTION has a lot of scale too and, if I have an issue with my account, I'll have someone on the phone sooner rather than later.

You're saying that as if it contradicts (“but”) what lotsofpulp said, but that was exactly their point: If your bank can do it, then so could Google. That they choose not to is a conscious choice, and not a beneficent one.

Conrad's corollary to Hanlon's razor: Said razor having been over-spread and under-understood on the Internet for a long while now, it's time to stop routinely attributing lots of things only to stupidity, when allowing that stupidity to continue unchecked and unabated actually is a form of malice.

(Hm, yeah, might need a bit of polishing, but I hope the gist is clear.)

I'd go with: "Sufficient stupidity[1] is indistinguishable from malice"

[1]: Where stupidity is further defined as "willful ignorance"

Meta: I’ve vouched for this comment. You have been shadowbanned.

I thought I was agreeing. "Fair enough."

I was just paying a bill online.

I had loading images turned off in my browser.

So I get the checkbox captcha thing, and checking it is not enough, so I have to click on taxis, etc. Which didn't initially show because of images being off.

I eventually did turn on images for the site and reload it. But at first, I was like "wait a minute, why should I have to have images on to pay a bill?" and I clicked a bunch of things I'd never tried before to see if there was an alternative. It appears that you have to be able to do either the image captcha or some sort of auditory thing. I guess accessibility doesn't include Helen Keller, or someone who has both images and speaker turned off (which I have done at some times).

Maybe this is hard for someone younger to understand, but when I was first using computers, many had neither high quality graphics nor audio - that was a special advanced thing called "multimedia". It feels like something is severely wrong with the world if that is now a requirement to interact and do basic stuff online.

Genuinely-handicapped users should certainly have accommodations that allow them to pay bills using the necessary accessibility tools. It's always tricky to keep those tools from being leveraged by spammers and phishers, though, as witnessed by how TDD services for the deaf were misused in the past. Hard problem to solve in general, either through legislation or technology.

But if you're an ordinary user without special challenges, why would you expect anything to work after turning images off in your browser? If you're that much of a Luddite, maybe computers and technology aren't appropriate areas of interest for you to pursue.

Once upon a time, it was not only easy to find the option to disable image loading, but you could easily load them a la carte, by right clicking on any placeholder.

With the browser I use now, it seems to only let you reenable images per-site and then you have to dig in settings to delete the exception.

There IS a Load Image menu item when I right click...but it does nothing! Neither does "Open image in new tab".

I think it's unfortunate if there is a "long tail" of features in a typical application these days that are not expected to work.

What frustrates me personally is that there used to be a Firefox extension to suppress displaying a particular image, which is no longer available. I can't see the utility of disabling all images, but that extension was nice because you could use it to remove things you were tired of seeing like obnoxious backgrounds, avatar photos, and even some ads. Once you right-clicked on an image and told it to remove it, you'd never see it again, even in subsequent browsing sessions.

This extension died during one of Firefox's periodic Purges of Useful Functionality(tm), and I've been looking for another one ever since. So to some extent I see where you're coming from, but a general jihad against sound and images in the browser seems pretty radical.

I would agree. It's not useful in the context of remediation or defense, but on a human emotional level it's extremely helpful.

When Google kills your business it doesn't help your business to assume no malice, but it may help you not feel as personally insulted, which ultimately is worth a lot to the human experience.

Humans can be totally happy living in poverty if they feel loved and validated, or totally miserable living as Kings if they feel they are surrounded by backstabbers and plotters. Intent doesn't matter to outcome, but it sure does to the way we feel about it.

The saying is for your own sanity. If you go around assuming every mistake is malicious, it’s going to fuck up your interactions with the world.

Everyone I know who approaches the world with a me vs. them mentality appears to be constantly fraught with the latest pile of actors “trying to fuck them”.

It’s an angry, depressing life when you think that the teller at the grocery store is literally trying to steal from you when they accidentally double scan something.

One does not have to choose between assuming everything is malice or everything is stupid. Situations in the real world are more nuanced, and hence the saying is inane.

It’s not though. Assuming malice is incorrect 99.9% of the time and correctly identifying that other fraction offers so little upside. What good does it do to realize earlier that the person is malicious and not incompetent?

I think you have a point, and it's important to not be naive as people out there will steamroll those around them if given the opportunity. Personally I try to not immediately assume malice because I've found it leads to conspiracy-minded thinking, where everything bad is due to some evil "them" pulling the strings. While I'm sure there are some real "Mr. Burns" types out there, I can't help but feel most people (including groups of them as corporations) are just acting in self-interest, often stumbling while they do it.

It's a truism not because people are never malicious, but because we tend to see agency where there is none. Accidents are seen as intentional. This tendency leads to conspiracy theories, superstitions, magical thinking, etc. We're strongly biased towards interpreting hurtful actions as malice.

I'd add to this that willfully refusing to remedy stupid can be an act of malice.

That's a very good point. Actually, I just thought about something in the context of this conversation: one's absolute top priority, both in life and tech, should be to stop the bleeding[1] that emerges from problematic circumstances.

Whether those problematic circumstances, harm, arise due to happenstance, ignorance, negligence, malice, mischievousness, ill intentions or any other possible reason is ancillary to the initial objective and top priority of stopping the bleeding. Intent should be of no interest to first responders, or rather customers or decision makers in our case, when harm has materialized.

Establishing intent might be useful or even crucial for the purposes of attribution, negotiation, legislation, punishment, etc. All those, however, are only of interest, in this context, when the company in question hasn't completely damaged their brand and the public, us, hasn't become unable to trust them.

All this to say, yes, this is a terrible situation to be in, how are we going to solve it?

Do I care if Google is doing harm to the web due to being wilfully ignorant, negligent, ill-intentioned, etc? No, not an iota; I care about solving the problem. Whether they do harm deliberately or for other reasons should be of no interest to me in the interest of stopping the bleeding.

[1] https://isc.sans.edu/diary/Making+Intelligence+Actionable/41...

I agree with your sentiment. Modeling intent is useful in two cases: (1) predicting the future, and (2) in court. When modeling intent has no predictive power, it’s generally irrelevant, as you said.

Employees and managers at Google get promoted by launching features and products. They're constitutionally incapable of fixing problems caused by over-active features for the same reason they've launched seven different chat apps.

We are all living at the whim of Google’s technical debt.

I personally find Hanlon's Razor to be gratuitously misapplied. Corporate strategy is often better described as weaponized willful ignorance. You set up a list of problems that shall not be solved or worked on, and that sets the tone of interaction with the world.

Plus financial incentive creates oh so many opportunities for things to go wrong or be outright miscommunicated it is not even funny.

Thanks, I totally agree. Just to be clear I'm not saying it's malice as I don't believe that. I'm just saying the end result is the same so one should consider them a hostile actor for purposes of threat modeling.

Given you're the second person who I think took away that I was accusing them of malice, I probably need to reword my post a bit to reduce confusion.

Accusing them of malice is irresponsible without evidence, and if I were doing that it would undermine my credibility (which is why I'm pointing this out).

> Thanks, I totally agree. Just to be clear I'm not saying it's malice as I don't believe that. I'm just saying the end result is the same so one should consider them a hostile actor for purposes of threat modeling.

No worries at all! I interpreted your post the way you intended; and I agree fully being also in InfoSec.

Going by how you phrased your original post, you're probably more patient and/or well-intentioned than me as I'm farther along the path of attributing mistakes by big, powerful corporations to malice right away.

They probably aren't malicious, but they are definitely antagonists.

Your comment made me think that they have the same attitude with support as they do with hiring, they are ok with a non fine-tuned model as long as the false positives / negatives impact individuals rather than Google’s corporate goals.

I would argue that consistent behavior defeats the benefit of the doubt, or the presumption of involuntary stupidity. Also, I believe most good-sounding quotes may be easy to remember but are not backed by much truth.

Author here. I don't think it's malice on their part, but their hammer is too big to be wielded so carelessly.

Yes I agree with you (and thank you for your medium post by the way. Our only chance of ever improving the situation is to call attention to it. I fully believe Google leadership has to be aware of it at this point, but it clearly won't be a priority to them to fix until the public backlash/pressure is great enough that they have to).

Just to avoid any misreading, I didn't say I thought it was malice on Google's part. My opinion (as mentioned above) is:

> I still don't think it's outright malice, but the doubling down on these horrific practices (algorithmically and opaquely destroying people) is so egregious that it doesn't really matter.

So they are not (at least in my opinion without seeing evidence to the contrary) outright malicious. But from the perspective of a site owner, I think they should be considered as such and therefore mitigations and defense should be a part of your planning (disaster recovery, etc).
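One possible mitigation of this kind is to poll your own domains against the list so that you hear about a flag before your customers do. The following is a rough sketch only, assuming the Safe Browsing v4 Lookup API (`threatMatches:find`) with the request and response shape as the editor understands it; the client ID, the URL being monitored, and the function names are all hypothetical, and the details should be checked against the current official documentation before relying on it.

```python
import json
import urllib.request

# Assumed endpoint of the v4 Lookup API; verify against the official docs.
LOOKUP_URL = "https://safebrowsing.googleapis.com/v4/threatMatches:find"


def build_lookup_body(urls, client_id="example-monitor"):
    """Build the JSON body for a lookup request (client_id is a placeholder)."""
    return {
        "client": {"clientId": client_id, "clientVersion": "0.1"},
        "threatInfo": {
            "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE"],
            "platformTypes": ["ANY_PLATFORM"],
            "threatEntryTypes": ["URL"],
            "threatEntries": [{"url": u} for u in urls],
        },
    }


def flagged_urls(response):
    """Return the sorted, de-duplicated URLs listed under "matches" in a parsed response."""
    return sorted({m["threat"]["url"] for m in response.get("matches", [])})


def check(urls, api_key):
    """Send the lookup over the network; requires a real API key."""
    request = urllib.request.Request(
        f"{LOOKUP_URL}?key={api_key}",
        data=json.dumps(build_lookup_body(urls)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        return flagged_urls(json.load(resp))
```

Run from a scheduled job, something like `check(["https://app.example.com/"], api_key)` returning a non-empty list would be the signal to page someone. A clean lookup result is not a guarantee that browsers are not showing a warning, so this is a complement to, not a substitute for, the disaster-recovery planning mentioned above.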

I do not trust management folks, whose paychecks and promotions are dependent on how successful such hostile actions are, to take the right decisions. I also do not think that they are deliberately ignorant/indifferent or that calling attention to it will do any good. These types of individuals got to where they are largely by knowing fully well that their actions are malicious and legal. I used to work under such people, and currently interact with and work with such people on a very regular basis (you could even consider me as part of them tbh). It is very much possible that the management level folks at Google don't have an ounce of goodness in them, and will always see such decisions from a zero-sum perspective.

To make it relatable, do you care so much for a mosquito if it's buzzing around you, disrupting your work and taking a toll on your patience? Because your SaaS is a mosquito to Google. After a certain point, you will want to kill the mosquito, and that's exactly what Google execs think so as to get to their next paycheck.

They have the option of not wielding the hammer. I for one never appointed them the guardian of the walled internet.

So browsers should just let users go to obvious phishing sites?

It's easy to take this position when you're very tech savvy. Imagine how many billions of less tech savvy people these kinds of blocklists are protecting.

It's very easy to imagine a different kind of article being written: "How Google and Mozilla let their users get scammed".

I mean, it was barely a decade ago when my parents' computers regularly got filled with malware and popups and scams. They regularly fell for bullshit online. Maybe they have gotten more savvy, but I feel like this has overall greatly decreased, in a world where there's actually increasingly more bad actors.

Even if you're tech savvy. I've been phished. I was only saved by 2 factor and luck.


> I for one never appointed them the guardian of the walled internet.

On the other hand, lots of chrome users most likely do trust google to protect them from phishing sites. For those ~3 billion users a false positive on some SaaS they've never heard of is a small price to pay.

It's a tricky moral question as to what level of harm to businesses is an acceptable trade off for the security of those users.

The trade-off isn't between increased phishing vs. increased false positives. It's being able to get a human on the phone vs Google's profit margins. Break them up already.

I actually don't think this is that hard to fix though.

I'm a fan of google doing their best to protect people from scammers. The real issue here is that there is no way to submit an escalated help request when they accidentally mess up. E.g. they could build a service where -- and I doubt scammers would play -- $100 (or even $1k) would escalate a help request with a 15 minute SLA. I run a business; we would have no problem paying an escalation fee.

I can already see the headlines on HN:

"How Google Runs a Pay-to-Play Protection Racket"

I mean, that's their whole business anyway, so...

Format your site to suit google, or they don't index it.

Add headers to your emails or google reduces deliverability.

Pay for clicks on your own company's name or google sells ads against the name of your company! They monetize navigation queries.

Run your site through amp and let google steal your traffic or google pushes your search rank down the page.

Let google steal answers to questions contained on your site and display them as answers w/o sending people to your site, or they deindex you (see tons of examples, but also genius).

Let google steal your carefully curated and expensive photographs for google shopping and use them for the item from other vendors or you can't list items in google shopping.

etc etc etc... it's nothing new. So we may as well encourage them to do a more helpful job of what they were going to do anyway.

This was the old Microsoft support model: opening a case cost $99(IIRC), but if the case was actually a MS bug/issue they’d waive the fee.

It might have started at $99 but it's much higher now. I think the last time I used it it was $299 but that was at least 2 decades ago. Fortunately it was their bug.

This. Why is there an implicit agreement that okay Google is the gatekeeper? It shouldn't be. The internet did not appoint Google as the gatekeeper.

>The internet did not appoint Google as the gatekeeper.

Uh, it kind of did, when internet-savvy early adopters (and developers) convinced all their friends, then family, then acquaintances, to switch to Chrome a decade ago.

I know there's probably a very large number of FOSS-only types on this site who would disagree with that assessment, and claim that they've always been in the Firefox camp, but the sheer market share of chrome clearly shows that they are the minority.

Everyone switched to chrome because they were tired of IE having too much power and not conforming to standards. Nowadays web devs often build chrome-first, using chromium-only features, and the shoe has almost migrated to the other foot.

> Why is there an implicit agreement that okay Google is the gatekeeper?

Because they run a popular browser and don't want their users getting scammed?

For each tech savvy person mad about this, there's 10 non-tech-savvy people completely oblivious that could get scammed by phishing sites we'd consider obvious.

Sure, they should do a better job, but that blacklist is probably millions of websites big at this point. It's the kind of thing where a perfect job is essentially impossible, and the scale means that even doing a decent job is going to be extremely difficult.

Have you considered not using a 3rd party for hosting your JavaScript? There is always going to be some risk if the code isn’t under your control.

Is this list only maintained by Google? Do Firefox and Bing use the same list, is their process better/different? Is there any sharing happening?

SmartScreen is a different list. (And has a "This website isn't malicious!" button.)

Agree, we can only vote with our clicks.

Sadly gmail and google docs are top notch products :(

No, we can't vote with our clicks. That's what it means when a handful of companies dominate most of the web and the web plays a dominant role in the global economy.

We have very little real choice.

Occasionally people will pretend this is not so. In particular those who can't escape the iron grasp these companies have on the industry. Whose success depends on being in good standing with these companies. Or those whose financial interests strongly align with the fortunes of these dominant players.

I own stock in several of these companies. You could call it hypocrisy, or you could even view it as cynicism. I choose to see it as realism. I have zero influence over what the giants do, and I do have to manage my modest investments in the way that makes the most financial sense. These companies have happened to be very good investments over the last decade.

And I guess I am not alone in this.

I guess what most of us are waiting for is the regulatory bodies to take action. So we don't have to make hard choices. Governments can make a real difference. That they so far haven't made any material difference with their insubstantial mosquito bites doesn't mean we don't hold out some hope they might. One day. Even though the chances are indeed very nearly zero.

What's the worst that can happen to these companies? Losing an antitrust lawsuit? Oh please. There are a million ways to circumvent this even if the law were to come down hard on them. They can appeal, delay, confuse and wear down entire governments. If they are patient enough they can even wait until the next election - either hoping, or greasing the skids, for a more "friendly" government.

They do have the power to curate the reality perceived by the masses. Let's not forget that.

Eventually, like any powerful industry they will have lobbyists write the laws they want, and their bought and paid for politicians drip them into legislation as innocent little riders.

We can't vote with our clicks. We really can't in any way that matters.

That being said, I also would like regulatory bodies to step in and do something about it. To level the playing field. If nothing else, to create more investment opportunities.

Do you think the 1982 breakup of AT&T would have been possible in today's political reality?


Stock picking is not realism.

If by that you mean that valuations are not the result of a rational process, you are correct.

But investment strategy isn't so much about any underlying reality as it is about the psychology of market participants. You don't invest based on what you hope will happen, but what you believe will happen.

Great article. It’s not malice, it’s indifference.

Google's execs and veeps don’t care about small businesses, because most are career ladder climbers who went straight from elite colleges to big companies. Conformists who won’t ever know what it’s like to be a startup. As a group, empathy isn’t a thing for them.

Don't a lot of startup founders go to elite colleges and come from big companies?

The funded ones with the 2 year timelines generally are. But most startups are more bootstrap/angel investor with a bright owner who has a fatal flaw.

Is this an excerpt from your woefully unpublished startup culture fanfic novella? You can't just leave us hanging.

That is malice.

Accidentally unleashing a process that harms people is negligence. Not caring that you are being negligent is malice.

IMHO, it sounds like it worked. The things you changed sound like they've made your site more secure. In the future, Google's hammer can be a bit more precise since you've segregated data.

And you don't know what triggered it. It's possible that one of your clients was compromised or one of their customers was trying to use the system to distribute malware.

It's only more secure from Google's blacklist hammer.

No significant security is introduced by splitting our company's properties into a myriad of separate domains.

This type of incident can be a deadly blow to a B2B SaaS company since you are essentially taking out an uptime sensitive service that a lot of times has downtime penalties written down in a contract. Whether this is downtime will depend on how exactly the availability definition is written.

To add to this - by splitting and moving domains you've hurt your search rank, eliminated the chance to share cookies (auth, eg) between these domains, and are now subject to new cross-domain security dings in other tooling. Lose-lose.

We're talking about user uploads into a ticket system. They should not be publicly available at all. It won't hurt search rank.

If you split up your user-uploaded material into per-client subdomains you will know which one is uploading the malicious files. And your clients can block other subdomains, limiting their exposure as well. Is it a huge improvement? No, but at least it's something.
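A rough sketch of what per-client upload hostnames could look like, assuming a dedicated upload domain. The names `upload_host`, `uploads.example` and the secret are hypothetical and not from the article; the point is only that each client gets a stable, hard-to-guess hostname that can be traced back to that client if it gets flagged.

```python
import hashlib
import hmac


def upload_host(client_id: str, secret: bytes, base_domain: str = "uploads.example") -> str:
    """Derive a stable, non-guessable hostname for one client's uploads.

    `base_domain` is a hypothetical domain kept separate from the main app,
    so a block on one client's hostname does not touch the application itself.
    """
    digest = hmac.new(secret, client_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # DNS labels are limited to 63 characters; 1 + 26 stays well below that.
    return f"u{digest[:26]}.{base_domain}"


# Usage: the same client always maps to the same hostname.
host = upload_host("client-42", b"server-side-secret")
```

Whether a blocklist would then flag only that hostname rather than the whole domain is not something this guarantees.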

It's not clear from other commenters that had similar issues that GSB would not outright ban the entire domain instead of specific subdomains.

In this case, the subdomain they banned was xxx.cloudfront.net, and we know they would not block that whole domain.

We might consider that approach in the future, but I foresee complications in the setup.

It's probably "scale thinking" that makes google seem like they don't care: Everything is huge when you're "at scale"; the impact of a small blunder can take down companies or black out nation states. It's part of the game of being "at scale". They probably believe that it's untenable to build the necessary infrastructure to where everything (website, startup, person, etc.) matters.

This will sound crass, but it reminds me of Soviets cutting off the food supply to millions of people over the winter, due to industrial restructuring, and they brushed it off as "collateral damage".

Your comment reminds me of the first 30 seconds of this scene from The Third Man https://youtu.be/vSc-91F5Wiw

Of course they care. They've taken over everything they've been able to take over and they're still going strong. This is not by mistake. They just care about different things than you do. This is why Google needs to be broken up.

> I am not saying Google is acting with malice (I don't believe they are personally)

I'd agree. The problem is there is no financial or regulatory incentive to do the right thing here.

It has zero immediate impact on their bottom line to have things work in the current fashion, and the longer term damage to their reputation etc. is much harder to quantify.

There's no incentive for them to fix this, so why would they?

They're never gonna care. They aren't incentivized to care. The only thing that can change the situation is the power of the American federal government, which needs to break Alphabet into 20-50 different companies.

> nobody in power at Google cares

My assessment might be “nobody in power has time to prevent the myriad of problems happening all of the time, even though they handle the majority, with help from businesses, government agencies, etc., and given the huge impact of some problems to society as a whole, they may feel as though they’re riding in the front seat of a roller coaster, unaware of your single voice among billions from the ground down below.”

> they probably don't realize what's happening and will fix it

“If only the czar knew!”

I'm with you on the rest, but what has DO done to not have the benefit of doubt?

Also, to your point, an organization becomes something else than the sum of its parts, especially the bigger it gets.

Google can be a malicious actor without necessarily having individuals act maliciously.

Yeah that's a fair question. I had a bad personal experience with them, but I've also seen plenty of issues too. There was a big one a little while ago about how Digital Ocean destroyed somebody's entire company by banning them with AI: https://news.ycombinator.com/item?id=20064169 Original Twitter thread: https://twitter.com/w3Nicolas/status/1134529316904153089

In their defense they acknowledged it and made some changes. I can't find the blog post now so going from memory. But that only happened because he got lucky and it blew up on HN/twitter and got the attention of leadership at DO. How many people have been destroyed in silence?

In my case, Digital Ocean only allows one payment card at a time and my customer (for whom the services were running) provided me with a card that was charged directly.

A couple months later my customer forgot that he had provided the card. He didn't recognize "Digital Ocean" and thought he had been hacked (which has happened to him before) and called the bank and placed a chargeback.

When DO got the charge back they emailed me and also completely locked my account so I was totally unable to access the UI or API. I didn't find out about the locked account until the next day. I responded to the email immediately, and called my customer, who apologized and called the bank to reverse the chargeback. I was as responsive as they could have asked for.

The next day I needed to open a port in the firewall for a developer to do some work. I was greeted with the dreaded "account locked" screen. I emailed them begging and pleading with them to unblock my account. They responded that they would not unlock the account until the chargeback reversal had cleared. Research showed that it can take weeks for that to happen.

I emailed again explaining that this was totally unacceptable. It is not ok to have to tell your client "yeah sorry I can't open that firewall port for your developer because my account is locked. Might be a couple of weeks." After a day or so, they finally responded and unlocked my account. Fortunately they didn't terminate my droplets, but I wonder what would have happened if I had already started using object storage as I had been planning. This was all over about $30 by the way.

After that terrifying experience, I decided staying on DO was just too risky. Linode's pricing is nearly identical and they have mostly the same features. Prior to launching my new infrastructure I emailed their support asking about their policy. They do not lock accounts unless the person is long-term unresponsive or has a history of abuse.

I've talked with Linode support several times and they've always been great. They're my go to now.

I see where you're coming from. I've also had a bad experience with DO (CC arbitrarily blocked them which ended up with my droplets getting terminated and all data and backups wiped). That was at least as much an error on my part, though.

It does seem that they're unfortunately borrowing the playbook from AWS/Azure/GCP wrt over-automation as they scale. More old-school support could have been their differentiator, but it seems they're going for growth. They're getting close to the razor's edge.

I had a similar experience as well https://news.ycombinator.com/item?id=18145781

I no longer recommend them for any production usage.

I'd go a step further and claim that most tech companies are ultimately a threat to people's freedom and happiness. Not the tech itself, but the people that wield and profit from it.

Massive bureaucratic nightmares never act with malice, but the people get crushed all the same.

Worms on the sidewalk.

They care, but the dominant policy in Google's calculus about what features should be released is "Don't let the exceptional case drown the average case." A legitimate SaaS providing business to customers might get caught by this. But the average case is it's catching intentional bad actors (or even unintentional bad actors that could harm the Chrome user), and Google isn't going to refrain from releasing the entire product because some businesses could get hit by false positives. They'd much rather release the service and then tune to minimize the false positives.

To my mind, one of the big questions about mega corporations in the internet service space is whether this criterion for determining what can be launched is sufficient. It's certainly not the only criterion possible---contrast the standard for a US criminal trial, which attempts to evaluate "beyond a reasonable doubt" (i.e. tuned to be tolerant of false negatives in the hope of minimizing false positives). But Google's criterion is unlikely to change without outside influence, because on average, companies that use this criterion will get product to market faster than companies that play more conservatively.

Nah-- I think you've got it all wrong. The problem isn't the false positive/false negative ratio chosen.

The problem is that there's false positives with substantial harm caused to others and with little path left open to them by Google to fix them / add exceptions-- in the name of minimizing overhead.

Google gets all of the benefit of the feature in their product, and the cost of the negatives is an externality borne by someone else that they shrug off and do nothing to mitigate.

One solution, perhaps, could be to have some kind of turnaround requirement---a "habeas corpus" for customer service.

By itself, it won't solve the problem... The immediate reaction could be to address the requirement by resolving issues rapidly to "issue closed: no change." But it could be a piece of a bigger solution.

Google Safe Search is only half the story. Another huge problem is Google's opaque and rash decisions about what sites throw up warnings in Chrome.

I once created a location-based file-transfer service called quack.space [0] very similar to Snapdrop [1], except several years before they existed. Unfortunately the idiot algorithms at Chrome blocked it, throwing up a big message that the site might contain malware. That was the end of it.

I had several thousand users at one point, thought that one day I might be able to monetize it with e.g. location based ads or some other such, but Google wiped that out in a heartbeat with a goddamn Chrome update.

People worry about AI getting smart enough to take over humans. I worry about the opposite. AI is too stupid today and is being put in charge of things that humans should be in charge of.

[0] https://www.producthunt.com/posts/quack-space

[1] https://snapdrop.net/

Isn’t it possible your service was hosting malware and you just didn’t know? This same problem killed Firefox Send: https://support.mozilla.org/en-US/kb/what-happened-firefox-s...

Google has a lot of control of the Web.

Much less control of the Internet.

One lesson is use IP and not the Web.

> I use Linode now as their support is great and they don't just drop ban hammers and leave you scrambling to figure out what happened.

Linode once gave me 48 hours to respond (with threats to take down the site) because a URL was falsely flagged by netcraft based on what looked like an automated security scan of software I was hosting. Granted, they did not take any action and dropped the report once I pointed out that it was bullshit, but I do not consider this great service. If there is no real evidence of wrongdoing I should not be receiving ultimatums.


You are only focusing on the negatives while completely ignoring the positives here.

Here are a few questions to consider that may give you better perspective:

1) Do you know the magnitude of financial and psychological damage caused by malware, phishing, etc on the web?

2) Do you believe that it is possible to have a human review every piece of automation generated malware on the internet?

3) Do you believe it is possible to build an automated system that provides value with zero false positives?

4) Do you think an open standards body or government bureau would perform any better at implementing protections from the threats described here?

Author here - I don't underestimate the complexity of the task that Google Safe Browsing tries to accomplish.

But: Do you believe there is no room for improvement in an automated, opaque system with clear evidence of malfunction, that quite succinctly decides if hundreds of people go unemployed when their company tanks for nothing other than an incorrectly set threshold on some algorithm?

That is the real question to ask. Google is nowhere near its limits in terms of capability, as is made abundantly clear by its extremely comfortable financial position.

I do agree that there's room for improvement. There's always room for improvement, but there are also limits to the transparency one should provide for an anti-abuse system. It's difficult for anybody except for an expert in this area to say what would be a safe and satisfactory way to expose appeal and remediation for false positives. In the example from the story it looks like the turnaround time was just an hour for your case, which seems rather good. The fact that not all consumers of this data were as responsive looks out of Google's control, and should be taken up with those companies.

I don't agree with the premise of your last question. It's not Google's responsibility to protect the internet and provide a free anti-abuse database for other browser vendors, and yet Google does do this at significant cost. The fact that they don't do it perfectly is not a rationale for killing it or providing it with infinite resources.

> It's not Google's responsibility to protect the internet and provide a free anti-abuse database for other browser vendors, and yet Google does do this at significant cost. The fact that they don't do it perfectly is not a rationale for killing it or providing it with infinite resources.

I think that's a naive perspective. Google did not create the database to be nice to other vendors, and it also did not make it available to them for that purpose.

An Internet-wide blacklist represents strategic leverage over competitors (or maybe even dissonant voices, should the need arise) and a massive source of data collection probe points. These facts were certainly brought up internally and deemed worth the risk when the massive legal liability of this product was assessed.

Therefore, because of the pervasiveness of this system, it needs to be handled responsibly. They are not doing anyone a favor by making sure it functions correctly. Google is well aware of this, because they don't need regulators and lawmakers gaining yet another excuse to try and dismantle them.

2*) Do you believe that it is possible to have a human review every FALSE POSITIVE result from automated malware detection on the internet, when reported by those adversely affected by the false positive result?

Yes, yes I do. Banks do it for their customers today at scale.

So what happens when the fraudsters automate clicking the "request review" button? They can spin up as many phishing sites as they want, and request as many human hours in review as they want.

With banks, they only have to do that for their customers, whom they've at least had a chance of getting money from. But Google would need to provide it to every site which gets blocked (as malware sites pretend to be legitimate).

There are plenty of mechanisms to tackle this problem. But you have to want to care.

Your clients will hate you for this as you are creating false positives. Sure, Google is sometimes unethical, but calling them a malicious actor? Really?

Following "Consider Google as a malicious actor/threat" with "I am not saying Google is acting with malice" is probably a strong indicator that you should have thought it through before posting it.

"Consider as" does not mean "is". Your lack of reading comprehension is not the fault of the poster.

It's a relatively long article - but it does not answer one simple question, which is quite important when discussing this: were there any malicious files hosted on that semi-random Cloudfront URL? I realise that Google did not provide help identifying it - but that does not mean one should simply recommission the server under a new domain and continue as if nothing has happened!

From TFA:

> We quickly realized an Amazon Cloudfront CDN URL that we used to serve static assets (CSS, Javascript and other media) had been flagged and this was causing our entire application to fail for the customer instances that were using that particular CDN

> Around an hour later, and before we had finished moving customers out of that CDN, our site was cleared from the GSB database. I received an automated email confirming that the review had been successful around 2 hours after that fact. No clarification was given about what caused the problem in the first place.

Yes, yes, Google Safe Browsing can use its power to wipe you off the internet, and when it encounters a positive hit (false or true!) it does so quite broadly, but that is also exactly what is expected for a solution like that to work - and it will do it again if the same files are hosted under a new URL as soon as it detects the problem again.

Author here. Nothing was fixed, and the blacklist entry was cleared upon requesting a review, with no explanation.

They seem to be unable to answer this question since Google provided no URL. Without knowing what is considered malicious, how could they check if there was anything? What if it is a false positive?

I am just guessing here, but in case the author had their service compromised, maybe he can't disclose the information. Feels like they know what they are doing, and at least to me, reading between the lines, it looks like they fixed their problem and they advise people to fix it too:

> If your site has actually been hacked, fix the issue (i.e. delete offending content or hacked pages) and then request a security review.

Author here. We didn't do anything other than request the flag to be reviewed.

The recommended steps for dealing with the issue listed in the article were not what we used, just a suggested process that I came up with when putting the article together. Clearly, if the report you receive from Google Search Console is correct and actually contains malware URLs, the correct way to deal with the situation is to fix the issue before submitting it for review.

Yes, I guess if you're allowing users to upload arbitrary files that may contain viruses or malware, and you're not scanning the files, that makes you a potential malware host. That's how Google may see it. They're trying to protect their users, and you've created a vector for infection.

Too bad they don't ban googleusercontent.com.

Whether or not this author's site was or was not hosting malicious content is irrelevant to the thrust of the article, which is that due to browser marketshare, Google has a vast censorship capability at the ready that nobody really talks about or thinks about.

Think about the jurisdiction Google operates in deciding to force Google to shut down certain websites that correspond to apps it has already made Google and Apple ban from their app stores, for "national security" or whatever.

This is one mechanism for achieving that.

If there was malicious content, the search console would have provided a sample URL. It didn't.

Our company [0] was also hit by this.

We receive email for our customers and a portion of that is spam (given the nature of email). Google decided out of the blue to mark our attachment S3 bucket as dangerous, because of one malicious file.

What's most interesting is that the bucket is private, so the only way they could identify that there is something malicious at a URL is if someone downloads it using Chrome. I'm assuming they make this decision based on some database of checksums.

To mitigate, we now operate a number of proxies in front of the bucket, so we can quickly replace any that get marked as dangerous. We also now programmatically monitor presence of our domains in Google's "dangerous site" database (they have APIs for this).
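The comment above doesn't say which API is used for the monitoring. One option is the public Safe Browsing Lookup API (v4, `threatMatches:find`); below is a minimal sketch under that assumption. The client ID and the choice of threat types are placeholders, and you need your own API key.

```python
import json
import urllib.request

LOOKUP_ENDPOINT = "https://safebrowsing.googleapis.com/v4/threatMatches:find"


def build_lookup_request(urls, client_id="example-monitor", client_version="0.1"):
    """Build the JSON body for a Lookup API request (client_id is a placeholder)."""
    return {
        "client": {"clientId": client_id, "clientVersion": client_version},
        "threatInfo": {
            "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE"],
            "platformTypes": ["ANY_PLATFORM"],
            "threatEntryTypes": ["URL"],
            "threatEntries": [{"url": u} for u in urls],
        },
    }


def flagged_urls(response_body):
    """Extract flagged URLs; a response with no matches is an empty JSON object."""
    return {m["threat"]["url"] for m in response_body.get("matches", [])}


def check_urls(urls, api_key):
    """POST the lookup request and return the flagged URLs (needs network access)."""
    req = urllib.request.Request(
        f"{LOOKUP_ENDPOINT}?key={api_key}",
        data=json.dumps(build_lookup_request(urls)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return flagged_urls(json.load(resp))
```

Running `check_urls` on a schedule against your own asset and upload hostnames, and alerting on a non-empty result, would give early warning before customers report the red screen.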

0: https://www.enchant.com - software for better customer service

Author here. I'm not sure exactly how they actually decide to flag. Alternatively, Amazon might somehow be reporting files in S3 onto the Google blacklist.

It would seem surprising, but it's the other possibility.

> What's most interesting is that the bucket is private, so the only way they could identify that there is something malicious at a URL is if someone downloads it using Chrome. I'm assuming they make this decision based on some database of checksums.

Doesn't Chrome upload everything downloaded to VirusTotal (a Google product)?

> Doesn't Chrome upload everything downloaded to VirusTotal (a Google product)?

It doesn't, unless you opt for Safe Browsing "Enhanced Protection" or enable "Help improve security on the web for everyone" in "Standard Protection". Both are off by default, IIRC. Without it, it periodically downloads what amounts to a bloom filter of "potentially unsafe" URLs/domains.

On the other hand, GMail and GDrive do run the checks via VirusTotal, as far as we know - which means that OP's case may have been caused by some of the recipients having their incoming mail automatically scanned. It's similar for Microsoft's version (FOPE users provide input for Defender SmartScreen), at least last time I checked.

What happens if it is a hit against the bloom filter / checksum? Would it transmit the URL so that it can be blocklisted?


TL;DR is you download a chunk of SHA-256 hashes and check if the hash for your URL is there. There is of course the chance of collision but that is minuscule.
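A simplified sketch of that local check, as I understand the Safe Browsing Update API: the client stores truncated SHA-256 hashes (prefixes), and a prefix match only triggers a follow-up request for the full hashes, so a short-prefix collision alone does not block anything. The real protocol canonicalizes URLs first and caps the number of host/path combinations; both are skipped here, and the function names are made up for illustration.

```python
import hashlib

PREFIX_LEN = 4  # bytes; assumption: the shortest prefix length used by the lists


def hash_prefix(expression: str) -> bytes:
    """Truncated SHA-256 of one host/path expression."""
    return hashlib.sha256(expression.encode("utf-8")).digest()[:PREFIX_LEN]


def host_path_expressions(host: str, path: str) -> list[str]:
    """Host suffixes combined with path prefixes (simplified, no canonicalization)."""
    path = path or "/"
    labels = host.split(".")
    # Exact host, then progressively shorter suffixes, never a bare TLD.
    hosts = [".".join(labels[i:]) for i in range(0, max(len(labels) - 1, 1))]
    segments = [s for s in path.split("/") if s]
    paths = ["/"]
    for i in range(1, len(segments)):
        paths.append("/" + "/".join(segments[:i]) + "/")
    if path not in paths:
        paths.append(path)
    return [h + p for h in hosts for p in paths]


def needs_full_hash_check(host: str, path: str, local_prefixes: set[bytes]) -> bool:
    """True if any expression's prefix is in the locally stored prefix set."""
    return any(hash_prefix(e) in local_prefixes for e in host_path_expressions(host, path))
```

Note that in this scheme the client sends hash prefixes rather than the URL itself when it asks for full hashes.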

Oh I know that's how that works, I meant, does Google transmit back the URLs once it does get a hit, to protect others from downloading that file?

Why would it need to do that? To protect others from the same url, the same hash checking method should work.

The blacklisted URL in this case is found in a downloaded file from a S3 bucket.

Other people downloading the same file would get the same "protection", but in this case this goes a step further:

The S3 bucket itself gets then blacklisted. As it was a private bucket, one of the ways this could happen is that once chrome found the blacklisted URL, it sent back to Google the url (s3 bucket) where the file with the blacklisted URL was found.

The hashes of all things that match a "probably evil" bloom filter, yes.

Hosting a virus on a domain and then downloading it a few times with different chrome installations sounds like a good way to get the whole domain blacklisted...

That's why user uploads are worth some thought and consideration. File uploads normally get treated as a nuisance by developers because it can become kind of fiddly even when it works and you are getting file upload bugs from support.

It normally isn't that much of a challenge to mitigate the issues, but other things get priorities. Companies end up leaving pivots to XSS attacks and similar bugs too.

Google has a great service for this called Checksum. You upload a file checksum and it validates it against the database of all known bad checksums that might flag your website as unsafe. The pricing is pretty reasonable too and you can proxy file uploads through their service directly.

I'm actually not telling the truth but at what point did you realize that? And what would be the implications if Google actually did release a service like this? It feels a bit like racketeering.

Real shame if this domain got blocked because of a contraband file, eh? Just pay us and we'll make sure you don't have any problems.

Ha! You got me. I was like, wow, that sounds really useful. I'd love to sign up for that, and build my app to use it, if that were the case.

But then, I realized: 1). I'd be integrating further into Google because of a problem they created (racketeering), and 2). They seem to really dislike having paying customers (even if they made it, they'd kill it before long).

And 3), they would later update their evil-bloom-filter and all of a sudden the file you paid to get verified is now an Evil File, and they blacklist you anyway.

They actually blacklist you even faster, because of course they have in their database that you have the now-evil-file.

I wonder if that could be triggered even when the certificate chain is not validated... you could MITM yourself (for example, using Fiddler) and make Chrome think it's downloading files from the real origin. In that case, an attacker could do that from multiple IPs and force Google to flag your whole domain.

Why isn't Dropbox blacklisted? Too big?

Dropbox actually provides a unique domain for each and every user - and separates the UGC from the web front code and Dropbox's own assets that way - that's where the files you preview/download are actually coming from. I have no doubt a fair number of those are blacklisted.

unique TLD? that should be very costly?

or does GSB not ban the entire TLD when a subdomain has malicious content?

Would be great if our overlords at least publish the overzealous rules we need to abide by.

Dropbox DL and Preview urls take the form of https://uc[26 character hex string].dl.dropboxusercontent.com/... and https://uc[26 character hex string].preview.dropboxusercontent.com/... - it does not have to be a separate TLD to avoid being blocked, but it has to be differentiated.

This is the same reason why the block of the TFA company did not cause an outage of everyone using CloudFront - GSB does not block full TLDs if it can be shown content is distinct. Same for anyone using S3, Azure equivalents and so on.

I wonder if there's a threshold here.. When I researched this issue (while we were figuring out how to mitigate it), I did encounter some people who had their entire domains blocked because 1 subdomain had bad content. In fact, this thread itself has mention of that happening to neocities

Author here. It's really not clear what criteria GSB uses to decide at which level the ban should apply.

Probably when the ratio of bad sites to good sites at a particular subdomain level passes a threshold.

Or when a significant litigation risk is perceived, if the domain level block review is human

My Google-fu is failing me right now, but there is a list of domains like dropboxusercontent.com that are treated as pseudo second-level domains for purposes like this.

e.g. u1234.dropboxusercontent.com is treated as a unique domain just like u1234-dropboxusercontent.com would be.

Edit: here we go, from another comment - the Public Suffix List: https://publicsuffix.org/
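To make the "pseudo second-level domain" idea concrete, here is a minimal sketch of how a registrable domain is derived from a suffix list. The `SUFFIXES` set is a tiny illustrative subset, not the real list, and wildcard/exception rules are ignored. Whether GSB actually consults the Public Suffix List when deciding how broadly to block is the commenter's claim, not something this shows.

```python
# Illustrative subset only; the real list is published at publicsuffix.org.
SUFFIXES = {"com", "net", "co.uk", "cloudfront.net"}


def registrable_domain(host, suffixes=SUFFIXES):
    """Return the longest matching public suffix plus one label, or None."""
    labels = host.lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            if i == 0:
                return None  # the host is itself a public suffix
            return ".".join(labels[i - 1:])
    return None


# Usage: with "cloudfront.net" treated as a suffix, each distribution
# hostname is its own registrable domain instead of sharing one.
registrable_domain("d1234.cloudfront.net")  # "d1234.cloudfront.net"
registrable_domain("www.example.com")       # "example.com"
```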

Sounds rather too resource-intensive? I've just tried with current Chrome on Windows and a 32MB zip on my personal domain, Wireshark says the file has not been sent anywhere.

I believe there are limits on the virus checking size. You can see this when trying to download really large files from Google Drive (> 100 MB).


Seems I might have just hit the limit? ... Nope, 8.1MB zip file also wasn't sent anywhere.

Wouldn't it be more efficient to grab it in parallel to your download?

That only shifts the bandwidth cost between the original server and the user, Google's resources are unaffected. And it's not what GP claimed.

I just checked the nginx access logs - both the 32MB and the 8MB zip files have been accessed only once (both were created only for this experiment).

Or you could screen your attachments for malware?

We do, but it's not good enough for Google.

Yes, the power of something like Google Safe Browsing is scary, especially if you consider the many many downstream consumers who might have an even worse update / response time. Responsiveness by Google is not great, as expected, we recently contacted Google to get access to the paid WebRisk API and haven't heard anything in a few months...

However, phishing detection and blocking is not a fun game to be in. You can't work with warning periods or anything like that, phishing websites are stood up and immediately active, so you have to act within minutes to block them for your users. Legitimate websites are often compromised to serve phishing / malicious content in subdirectories, including very high-level domains like governments. Reliable phishing detection is hard, automatically detecting when something has been cleaned up is even harder.

Having said all that, a company like Google with all of its user telemetry should have a better chance at semi-automatically preventing high-profile false positives by creating an internal review feed of things that were recently blocked but warrant a second look (like in this case). It should be possible while still allowing the automated blocking verdicts to be propagated immediately. Google Safe Browsing is an opaque product / team, and its importance to Google was perhaps represented by the fact that Safe Browsing was inactive on Android for more than a year and nobody at Google noticed: https://www.zdnet.com/article/mobile-chrome-safari-and-firef...

Lastly, as a business owner, it comes down to this: Always have a plan B and C. Register as many domains of your brandname as you can (for web, email, whatever other purpose), split things up to limit blast radius (e.g. employee emails not on your corporate domain maybe, API on subdomain, user-generated content on a completely separate domain) and don't use external services (CDN) so you can stay in control.
