Google has now ‘forgotten’ more than a quarter-million URLs (washingtonpost.com)
94 points by lelf on Dec 23, 2014 | 44 comments



It's interesting that the right to be forgotten only applies to search engines (as far as I can tell). So in theory someone could scrape the first few pages of results for most names in the world (or those who are somewhat famous at least), keep track of what results have been completely removed and create a site listing all of them.

I could even see this being a business model: selling that data to private investigators, credit agencies, the media, etc. It's a little unethical, though it could be useful if, say, politicians or unscrupulous people try to rewrite history to make themselves look better.
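
A minimal sketch of the "keep track of what has been removed" step, assuming you already hold two snapshots of results per query. The function name `removed_results` and the example URLs are hypothetical, and no scraping is shown. Note that a URL can also vanish because of ordinary ranking changes, so a diff like this only produces candidates, not proof of a delisting.

```python
def removed_results(earlier, later):
    """Return, per query, the URLs in the earlier snapshot that are missing from the later one."""
    removed = {}
    for query, urls in earlier.items():
        missing = set(urls) - set(later.get(query, []))
        if missing:
            removed[query] = sorted(missing)
    return removed


# Hypothetical snapshots: query -> list of result URLs seen on that date.
earlier = {
    "jane doe": ["https://example.com/a", "https://example.org/b"],
    "john roe": ["https://example.net/c"],
}
later = {
    "jane doe": ["https://example.com/a"],
    "john roe": ["https://example.net/c"],
}

print(removed_results(earlier, later))
```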


It would be hilarious if Google then scraped your list of removed search listings.


It applies to every person responsible for the processing of personal data. See Article 12(b) of the EU's Data Protection Directive. [0] Most parties would, however, be able to rely on one of the grounds for processing listed in Article 7, such as consent, or, as is the case for news media, the Article 9 exception for freedom of expression.

In the Google Spain case the CJEU held that Google needs its own ground/exception to publish personal data. (Rather than saying "NYT is allowed to process these personal data, and therefore so am I")

[0] http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:319...


It would make a lot more sense if the right to be forgotten applied to regular sites first and foremost. A news article that unfairly damages someone's reputation would eventually drop off Google if the original article were removed.


This ruling is unlinking of factual data. It is industrialized bowdlerization of historical research tools. And your proposal would be industrialized bowdlerization of the actual texts.


The rest of the world still sees the actual search results as does anyone who goes to google.com. Your censorship idea would have to disconnect Europe from the rest of the internet to work.


The point is that the search results would quickly disappear if the original pages were removed (since the search engine would remove them from its index too).

It isn't done that way because it's easier to target the search engines than the N websites which have published a story.


For some reason they were nicknamed memory holes. When one knew that any document was due for destruction, or even when one saw a scrap of waste paper lying about, it was an automatic action to lift the flap of the nearest memory hole and drop it in, whereupon it would be whirled away on a current of warm air to the enormous furnaces which were hidden somewhere in the recesses of the building.


But Google drily noted that in some cases Web users are overestimating just how much of the online space the company can control: "Sometimes we even receive requests to remove content 'from the Internet,'" the company reports. Google might have a great deal of power over what happens online. But not that much.

Removing something from Google search results effectively does remove it from the Internet. If you can't find something, then it may as well not exist. It's a bit scary how much control Google has over the Internet.


If you can't find something [through Google], then it may as well not exist.

If, for example, a director wants their movie removed from Google search results, it will still show up in searches on imdb, wikipedia, rottentomatoes &c., and is thus accessible to quite a large number of internet users.


Or you could always search Google via proxy to some jurisdiction that doesn't have these silly laws.


how often do you use wikipedia's or imdb's search function?


Pretty often. Site-specific searches are still more efficient than Google. StackExchange comes to mind.


Quite often. I frequently use wikipedia search, facebook search (to find certain people), bing search, github search, npm search, stack overflow search, reddit search.

But yes I agree removing someone from google will make them less visible.


At least once per day, each. Why would I use google to search wikipedia or imdb?

Both of those sites have now got search engines that are very effective; I have a two character shortcut in my address bar thanks to DuckDuckGo.


Direct browser addressbar searches do make it much easier to use a site's own search. Certainly for Wikipedia my keyword search of 'w <query>' is used daily to find things.


> Removing something from Google search results effectively does remove it from the Internet. If you can't find something, then it may as well not exist. ...

Well, if you believe the "social" hype[*], people increasingly find info via shares from friends (e.g. Facebook posts). If I chose to make a Facebook post about Max Mosley (the "media figure in the United Kingdom" referred to in the article and strangely not mentioned by name), how could the "right to be forgotten" be applied? I think that even in Europe, that level of individual censorship would be unacceptable.

[*] BTW I don't believe "social discovery" is nuking general search. But it's an interesting "what if".


this is an overgeneralised meme. Just because you can't find something on google doesn't mean that other people (who actually know of these site locations, or are the hosters of these sites, or are searching via different search engines) aren't accessing it.

Google is also not the be-all and end-all of search. According to comscore (take with giant salt crystals), google has around 67% of the search market in the US. Taking only the US, that leaves around 33% of the web-going population who couldn't care less what google does with its index (because they simply never see it in the first place).

Is irrelevance the biggest slap in the face? For some people even if google deletes their entire index, it wouldn't even make a difference.


Google's share of search in the EU is above 90%.


I am not sure everyone uses google :^). While a vast majority might, there are other options.


Yep, the celebrities are only a tiny tiny fraction of the people who don't want their online profiles/comments to be found. Removal from google makes it far harder to trace someone's possible online presences.


I don't understand who would choose to use the neutered euro-search when they can just go to the US site. Maybe they think people won't know.


There are other search engines than google. I've been using duckduckgo.com for years for searches. It occasionally misses some stuff, but so does google. And there are also Bing, Yandex, etc.


Is there a way to search through only the "forgotten" content? (from outside the EU). That could be interesting.


Some details from Google: https://www.google.co.uk/policies/faq/


How many have been removed for copyright reasons?


37 million requests...in the last month alone. I don't see a number for how much they actually removed, though.

https://www.google.com/transparencyreport/removals/copyright...

edit: interesting, you can download the entire set of copyright removal URLs, who requested the removal, the number of URLs in the request, etc[1]. I wonder if anyone has done any research on the kind and distribution of those URLs yet.

[1] https://www.google.com/transparencyreport/removals/copyright...


> Sometimes we even receive requests to remove content 'from the Internet'

Today Google resembles Microsoft's once-dominant Internet Explorer. Except for one thing: it's a great search tool. Which means it's going to be very hard to find a new "Firefox" [1] to provide a viable alternative.

[1] In case someone doesn't know, it was Firefox that spearheaded the break out of the IE monopoly.


It would be interesting to see if the law leaves loopholes that businesses can use as a competitive advantage. For example, what if I can request that a page be removed from search results because my name appears in a comment on a page that promotes my competitor's business / products / services?


It's just keyword linking, not complete URL removal from search.

You can ask that your name keyword in search doesn't go to the page with your competitor's company, but keywords for your competitor and/or for your product/service will still route there. It's still whack-a-mole. Users looking for reviews will start searching within review sites as opposed to google searching.

The worst impact here is that politicians could remove links between their names and articles that are rightfully criticizing them.


Thanks for the clarification. I'm still not totally convinced some clever folks out there won't find ways to reduce search results for websites promoting competitors. Google will have every reason to suspect they're doing "the right thing" on some of these requests, but it's hard to imagine there won't be some ulterior motive, or even just unintended consequences, behind some of these requests that would not be obvious to the naked eye.

It's almost as if: not only do people who want keywords removed need to keep an eye out, but people who don't want their pages removed from search results will need to monitor this as well.

It will be interesting to see what kinds of disputes occur as a result of this. An interesting social experiment if nothing else.


Some people have already used this to hide negative reviews of themselves.


I suppose it's only a matter of time until a company finds a person with an obscure real name (say, "Kentucky"), asks them to comment on a negative article about the company (say, a CNN article critical of the chicken industry), and then has them ask to be forgotten, forever hiding the bad result from searches for "kentucky fried chicken problems".


Are other search engines complying with this EU court ruling? Or do other search engines just not matter?


Microsoft and Yahoo have reported they are complying as well.


Forgive my ignorance... but why would anyone want their URL 'forgotten'? If they want to hide from the search engine, isn't that what robots.txt is for?


If CNN has an article about my drunk driving record 10 years ago, I can force Google to "forget" said CNN article, even though I don't control CNN.com's robots.txt.
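
To make the robots.txt point concrete, here is a small sketch using Python's standard `urllib.robotparser`. The rules and the example.com URLs are made up for illustration. The file is served by the site itself, so only the publisher can add a `Disallow` line; the person named in an article has no way to change it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, as the site owner might publish it.
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# A well-behaved crawler checks the publisher's rules before fetching a page.
blocked = rules.can_fetch("Googlebot", "https://example.com/private/page.html")
allowed = rules.can_fetch("Googlebot", "https://example.com/news/story.html")

print(blocked)  # the owner has excluded /private/
print(allowed)  # nothing excludes /news/, so the story stays crawlable
```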


No, they don't forget the article. They forget the link between your name and the article - searching for other words in that article will still bring it up.
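
A rough sketch of that distinction, assuming a delisting is stored as a (name, URL) pair and applied only when the query contains the name. How a real search engine matches names is not described in this thread; the substring match, the function name `visible_results`, and the URLs are all assumptions made for illustration.

```python
def visible_results(query, results, delisted):
    """Hide a URL only when the query contains a name that was delisted for that URL."""
    q = query.lower()
    hidden = {url for name, url in delisted if name.lower() in q}
    return [url for url in results if url not in hidden]


# Hypothetical delisting: one name paired with one article URL.
delisted = {("jane doe", "https://news.example.com/2004/dui-story")}
results = [
    "https://news.example.com/2004/dui-story",
    "https://example.org/other",
]

# A search on the name drops the article...
print(visible_results("jane doe", results, delisted))
# ...but a search on other words from the article still returns it.
print(visible_results("2004 drunk driving arrest", results, delisted))
```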


Google has not forgotten anything; the results are censored for Europeans.


Hence the quotes


Is there a way to prove the data is actually deleted? It's not really gone forever, is it, just removed from the search index. Other copies will have been made by various agencies. I suppose for most purposes that's OK, but not all.


well, as covered in the article, it isn't deleted. And right to be forgotten only applies in the EU anyways...the searches should work the same outside the EU.


For now. The EU is pushing for more:

“De-listing decisions must be implemented in such a way that they guarantee the effective and complete protection of data subjects’ rights and that EU law cannot be circumvented”...

Context is extending RTBF to Google.com, but that seems like a slippery slope.

http://www.theguardian.com/technology/2014/nov/27/eu-to-goog...


And only for search engines; the information remains on newspaper sites.



