DuckDuckGo: Escape your search engine filter bubble (dontbubble.us)
If I had to tackle the notion of over-personalization in ~5 minutes, I'd say:

- If someone prefers to search Google without personalization, add "&pws=0" (the "pws" stands for "personalized web search") to the end of the Google search url to turn it off, or use the incognito version of Chrome. Personalization tends to be a nice relevance improvement overall, but it doesn't trigger that much--when it launched, the impact was on the order of one search result above the fold for one in five search results.

- Personalization has much less impact than localization, which takes things like your IP address into account when determining the best search results. You can change localization by going to country-specific versions of Google (e.g. search for [bank] on google.co.uk vs. google.co.nz), or on google.com you can click "change location" on the left sidebar to enter a different city or zip code in the U.S.

- We do have algorithms in place designed specifically to promote variety in the results page. For example, you can imagine limiting the number of results returned from one single site to allow other results to show up instead. That helps with the diversity of the search results. When trying to find the best search results, we look at relevance, diversity, personalization, localization, as well as serendipity and try to find the best balance we can.
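To make the per-site limit in the last bullet concrete, here is a minimal Python sketch. It is not Google's algorithm; the function name `cap_per_site`, the `max_per_site` default, and the example URLs are all made up for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def cap_per_site(ranked_urls, max_per_site=2):
    """Keep ranked order, but drop results once a host has max_per_site entries.

    Illustrative only: the cap value and grouping by hostname are assumptions.
    """
    seen = Counter()
    kept = []
    for url in ranked_urls:
        host = urlsplit(url).hostname or ""
        if seen[host] < max_per_site:
            seen[host] += 1
            kept.append(url)
    return kept


# Hypothetical ranked list where one site dominates the top positions.
ranked = [
    "http://a.example/1",
    "http://a.example/2",
    "http://a.example/3",
    "http://b.example/1",
]
print(cap_per_site(ranked))
```

With the default cap of two, the third result from the same host is dropped and the result from the other host moves up.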

I saw Eli Pariser's talk at TED and was skeptical, although I did enjoy his example of Facebook starting to return only his liberal friends because he only ever clicked on the links his liberal friends shared. I had a number of concerns browsing through Pariser's book, but I would encourage anyone interested in these issues to pick up a copy; it's a thoughtful read.

I agree, I don't believe that over-personalization is an issue. (As I already said in another comment.)

But here's why some people probably don't like personalization: It's invisible. There is nothing on the results page that tells you whether your results are personalized or not. Sure, you can look at whether the browser is in incognito mode or you can look for some parameter in the URL, but these things require that you already know about personalization.

In contrast to personalization there are various indicators that a page is localized. The most prominent is obviously the language of the text. As soon as all the search results are in my local language it is very obvious to me that I got localized search results. I can also detect localization by looking at the Google logo on the homepage (localized versions have the country name in grey text below the Google logo), by looking at the language of the Google interface, by remembering the domain name I used to access Google and by looking at the sidebar on the left that even displays a guess of my location on the city-level.

There are no such indicators at all for filtered/personalized results. Every user around me starts with the same version of Google results. That's how everyone got acquainted with Google in the beginning. Same results for everyone. There is no reason for a user to question that until you see the differences by comparing search results, which most users won't. Someone who doesn't happen to work at Google or didn't hear about the "filter bubble" can't know that search results will start to diverge from vanilla results over time.

So personalized Google results violate the principle of least astonishment.

While I can't speak for other people, I think that the concerns that some people have about this are rooted in the fact that you can get trapped in some sort of feedback loop without ever knowing. If you could see that your results are personalized you could compare them to unpersonalized results and decide for yourself which you like better.

"But here's why some people probably don't like personalization: It's invisible. There is nothing on the results page that tells you whether your results are personalized or not."

Sorry, this is just not the case: we do provide indication on Google's search results page for personalized results. Here's a couple links that talk about how we surface whether results have been personalized: http://www.google.com/support/websearch/bin/answer.py?answer... is our support page and http://searchengineland.com/google-now-notifies-of-search-cu... is an article on Search Engine Land from 2008 when we started surfacing information on how results were personalized.

Here's a simple demo. Do a search in Chrome incognito mode and go to the bottom of the page. You won't see a link that says "View customizations." Now do a search in regular Chrome and check for that link. When I did a search for [matt cutts] in regular Chrome, I saw the "View customizations" link; clicking it gives this message:

"Search customization details: matt cutts

When possible, Google will customize your search results based on location and/or recent search activity. Additionally, when you're signed in to your Google Account, you may see even more relevant, useful results based on your web history. The following information was used to improve your search results for matt cutts:

Web History: One or more items in your Web History were used to improve search results. (Manage Web History / Remove Web History from my Google Account) If you're curious, you can see what a search for matt cutts looks like without these improvements. The 'More details' link on your search results page can be used to display this page for approximately 30 minutes, after which it will no longer show this page."

In other words, not only can you tell whether a search results page was personalized, you can click a link right on the search results to see exactly what criteria were used to personalize the results. And that page has a clear link to run the search again without personalization.

As I mentioned before, personalization is typically a minor effect in Google's search results and it's almost always an improvement. But for people who are worried about potential "over-personalization," we do provide easy ways to see when a search was personalized, why it was personalized, and do the search again without personalization.

Thank you for the clarification. This comes as a surprise to me. I did not know that.

In my defense, I couldn't know about the "View customizations" link because I do have web history turned off, so apparently I never saw any personalized search results. After reading the DuckDuckGo page I expected that everyone's search results get personalized, especially if I am logged in with a Gmail account.

It's obviously not your fault that I didn't know about that, but, on the other hand, you can never expect a user to know the contents of any help page. Clicking on "help" links is not what most users do. (imagine smiley face here, I don't dare to do that on Hacker News)

Additionally, I think that the "View customizations" link is a bit misleading, because usually customizations (in terms of software) are not automatic. At least I would expect that customizations are something that I do.

Also, the link seems to be placed at the bottom of the page, which means that 99% of the users are probably blind to it. (I can't verify where it is actually placed, because I don't see it.)

After all, I am thankful for the great search results that Google offers. Thank you for your hard work.

Happy to discuss this, jannes. Your points are well-taken: when we first launched the ability to see why/how results were customized, we added a link at the top-right of the search results (the Search Engine Land article has a snapshot from those days).

But there's another guiding principle that things on the search results page need to "earn" their pixels. Since personalization is a second-order effect and very very few people ever cared enough to click the link and get more info, eventually that link made its way to the bottom of the search results.

I'm sorry, but a "View Customizations" link is nowhere near as clear as a simple statement like "These search results have been personalized." right on top of the search results page, where you can't miss it.

But the real solution is to use a search engine that does not track you. Even better is to use it in such a way that it can't track you (ie. through a Tor proxy, while taking other reasonable precautions).

In case you missed where I said this below, we did launch a message on the top-right of the search results page. The Search Engine Land article had a snapshot, but here's a direct link to what it looked like: http://www.flickr.com/photos/searchengineland/2717951328/

Over time, we saw that people didn't seem to notice/care about the message and corresponding link much, so it eventually migrated down to the bottom of the search results.

It's interesting that the response to people possibly not noticing the link was to make the link less noticeable.

Instead, you could have tried to make it more prominent by (for instance) moving it to the upper left rather than the upper right of the search results, right above/below the ads.

Another issue that might be interesting to explore is to what extent users really understand what search customization is, and whether they'd care more or less about it being done automatically once they understood it better.

I have a feeling the vast majority of them probably wouldn't care, and take the attitude of "do whatever it takes to make the results you return more relevant, and I don't really care how."

If the flickr image is the actual size, then no wonder it was not noticed. No matter how much I customize all my interfaces -- and with the increasing pixel count of displays -- interfaces are constantly populated with immutable 8 point fonts. Any font less than 14 points is fine for 1985 and VGA displays; but not anymore.

Even knowing that there was a "View Customizations" link, it took me more than a minute to find it. It is in the most unintuitive place, one I don't even scroll to 99% of the time. Sorry, this is in no way advertised.

Google also filters special terms like bittorrent in instant search. This is part of that bubble and people don't even realize it. That's the point: most people won't notice, not that the views are not there.

It's like experts-exchange.com putting content below the long footer of the page. Yeah, they can claim it's there, but many won't notice.

There's also this http://i.imgur.com/PMD5U.png which I hadn't noticed before. I just clicked through DDG's links and it tied guns and Obama for me, pretty cool.

Here is a slightly different question, how do you get "no country redirect" to stick reliably?

I can't begin to describe how annoying it is that I am presented with a different language when travelling to a different country. All I ever want is Google in English, but it keeps going back to a localized search regardless of how many times I choose "google in english".

Good question. I just saw an expert in the hallway and asked him. The basic answer is that your preference is stored in a cookie, so the preference would be forgotten if you're clearing cookies. If you still have the same cookie and yet the "no country redirect" isn't sticking, that's a bug we could dig into.

By the way, I asked why the "no country redirect" isn't stored with your Google account rather than with a cookie. The main reason he gave was that whether to do a country redirect is one of the first things we decide, and it's faster to use a cookie for that than to go looking up the user's account setting. Or at least, cookies have been faster up until this point. Hope that helps explain things.

I regularly see this without clearing cookies, so yes, I would consider this a bug.

There seems to be no discernible pattern as to why it works in one session, then I suspend the laptop, go somewhere else, and it stops working. Or it'll work twice in a row, and when I return to location A (both in the same foreign country), it stops working.

Interesting. HN isn't ideal for debugging, but if you wanted to send cookies or IP addresses (e.g. via Twitter), I could see if someone could look into it.

Use: http://www.google.com/ncr to disable. NCR=no country redirect. I've set this as my homepage and not had any issues

FANTASTIC TIP! This drives me (and 3 co-workers who will want to buy you a beer) insane - I travel a lot, and use proxies - so google is constantly swapping languages on me.

Though it makes me less worried about them collating all my personal information if they can't figure out I didn't suddenly learn to speak German ;)

There does not seem to be a way to use Google SSL + NCR. E.g.


does not seem to be a valid option. Is there a way to pass NCR in a param like ncr=1? Or the fact that Google SSL is used implies use of NCR?

EDIT: It looks like &ncr=1 does work but I'm not sure if this is equivalent to no country redirect.

Right now, this is what I have set up (which also disables personalization)


Can't easily do this as I am currently in country. Can you shoot me a contact email (mine is HN username @ gmail) and I can send you something when I'm on the road again and can send you cookies/IP.

It's a shame how Google completely disregards your HTTP Accept-Language header. If they didn't, this would be much less of a problem.

This annoys me more than anything - not just google, but for a large and growing number of sites. They completely ignore your browser settings and select language based on IP address. I installed the google international search plugin from mycroft which has solved my problem with google - but still suffer the myriad of other sites that ignore my browser's config.

I think in the past we saw a lot of people with their Accept-Language header set wrong, which is why we haven't used it. But we've been having a good discussion internally about the "my language won't stick" issues raised on this thread.

How practical would it be to initially trust Accept-Language but also prominently display a link to change to the language detected through IP geolocation (with an easy way to hide/decline the link)?
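A rough Python sketch of that idea: pick the interface language from Accept-Language first, and only offer (never force) the IP-derived language as an alternative. The function names, the `supported` set, and the `geo_language` argument are hypothetical, and the header parsing is simplified compared with the full HTTP rules.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        tag = tag.strip().lower()
        params = params.strip()
        quality = 1.0  # a tag without an explicit q-value counts as q=1
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0 and tag != "*":
            # Sort by quality (descending), then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_language(header, geo_language, supported):
    """Trust the browser first; return (language, suggested_alternative)."""
    chosen = None
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]
        if tag in supported:
            chosen = tag
            break
        if primary in supported:
            chosen = primary
            break
    if chosen is None:
        # No usable browser preference: fall back to the IP-based guess.
        return geo_language, None
    # Offer the geolocated language as a dismissible link, not a redirect.
    suggestion = geo_language if geo_language != chosen else None
    return chosen, suggestion


# An English-speaking browser on a German IP address (made-up inputs).
print(choose_language("en-US,en;q=0.9,de;q=0.8", "de", {"en", "de"}))
```

In this sketch the user stays in English and the page could show a small "switch to German" link that is easy to dismiss.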

I suppose you know about http://www.google.com/ncr ? It works some of the time.

Some of the time it doesn't, e.g. now when Google has custom logos I'll get a search term in Dutch (I'm in The Netherlands) when I click on it, even though I'm using google.com in English when doing so.

Yes, I know about /ncr, but it's hard/impossible to set that on a mobile device.

Also, it only works sometimes.

As yet another aside, has anyone else noticed how it is pretty much impossible to set SafeSearch to off without being logged in?

Trying to set the cookie just does not seem to stick

If you want, you can set "&safe=on" or "&safe=off" at the end of the url.

I reported this localization problem years ago. The best way is to take language from browser headers, like a lot of other sites do.

For some reason responsible googlers were ignoring my proposal.

I don't want to jump through so many hoops just to do one damn search. Personally, I hate the country-specific personalization. Perhaps it's more useful to most people in my country than showing more US results, but I'm really not interested in those types of results myself. I wish I could just check a box in my Preferences and then be able to see the universal search results.

I know I can use /ncr at the end, but I'd rather not have to do that all the time, and I think you can't even make that the default search for the Omnibox in Chrome, which means I have to give up Omnibox, which I love using, in order to get away from personalization. And even then, I think it just means I won't see my country specific results, but it probably still personalizes my search results through other types of signals.

So fine, don't make universal search the default way to search, but just give me a checkbox so I can turn it on when I want to. I want to see the best results, period - not the best results for me (or whatever Google thinks are the best results for me).

"I want to see the best results, period - not the best results for me"

But aren't you a part of the relevance equation? The ideal results for a search like [bitcoin crash] should be different for a Japanese-speaking searcher in Tokyo vs. a German-speaking searcher in Munich vs. a bitcoin expert vs. a programmer trying to diagnose why compiling bitcoin is crashing vs. my Mom who has never heard of bitcoin before, right?

But why on earth do you persistently try to present German search results to an English speaker in Munich? This has been peeve #1 for years, across all Google services.

Good question; one person is annoyed by this endlessly a few offices down from me. It's hard to make sure that things are handled consistently sometimes, but if you use google.com/ncr or the "Google.com in English" link at the bottom of the home page, that should help. The "Language tools" link to the right of the search box should also let you set a cookie with your language preference.

And yet for some reason, the search link from Google Toolbar seemed to ignore that cookie and certainly ignored my account preferences, sending me to the localized search page regardless.

Short version of long story: I used to bounce my net traffic through an ssh tunnel to a hosted VM. The VM was moved to a new machine in south-east asia. Having Google Toolbar constantly send my searches to the localized engine, despite the cookie selection, my being logged in and my preferences being clearly set, was more of a day-to-day annoyance than having my traffic piped across the Pacific twice.

I appreciate that you went and asked someone for the clarification that this issue is about the preference being set in a cookie and not in user settings, but this doesn't solve the problem for many of us.

For YEARS, on a weekly and sometimes daily basis, always logged-in, always with preferences set to english, it is infuriating to routinely end up receiving results that conflict with your explicit language settings.

What, concretely, do we have to do to get someone at google to push a change from using cookies to a real user-setting to fix this absurdity?

You said they did it for speed. Giving me completely incorrect results in a language I can't even read 1 millisecond faster than giving me results that I actually care about is a win? This is over-optimization.

Exactly, and it's cross-device, cross-service. What about browser built-in search? Or mobile? Or not wanting cookies or having to log in? I see my browser having a preferred language setting...

Just add a new search engine with the special url and make it the default as opposed to the pre-baked search engines.

You can't easily do this on a mobile device.

> Personalization tends to be a nice relevance improvement overall

I agree completely. Reading that page, my first thought was "if (not saying I don't) Egypt is a place I probably don't plan on going to, why should I waste time looking at those links in my search results?"

Overall, I think personalization (wow, this isn't a word?) reaches its own form of market efficiency. If the personalization algorithms are bad, then people would shy away from the search engines that provide them. However, if they make sense, and return the most relevant results most of the time, then that's saving us a lot of time.

Do you have a pointer to a publicly visible reference to the definition of some of the other query string parameters Google uses? Just thought I'd ask, given the context. I've noted various references people have cobbled together, but perhaps there's something a bit more... "canonical". Thanks for the above.

http://code.google.com/apis/searchappliance/documentation/61... is a reference for our search appliance, but a lot of parameters are in common. I'm not aware of any other official Google doc available externally, but if you search for [google search url parameters] the top 3-4 results are all quite good.

> If someone prefers to search Google without personalization, add "&pws=0" (the "pws" stands for "personalized web search")

"Append a cryptic query param" is a terrible user interface, and it's a little silly to suggest this as a solution. Make it an option real people can discover and use.

> when it launched

Just curious, why add this qualification? Is it significantly different now?

> the impact was on the order of one search result above the fold for one in five search results.

That statistic needs a little clarifying. Query term frequency follows a power law distribution, doesn't it? So if you're counting each individual term, the fact that 1 in 5 have altered results could very easily still mean a majority of actual searches are altered. And depending on how you calculated the 'one search result above the fold' number, it could very easily still mean that when a page is altered, it's altered significantly.

More interesting would be knowing these stats for just the fat head of the query term distribution.
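To illustrate the difference between counting distinct query terms and counting actual searches, here is a toy calculation in Python. It assumes a pure Zipf distribution (frequency proportional to 1/rank) over a made-up vocabulary size, and it assumes the affected terms are exactly the most popular ones. Neither assumption comes from Google's data.

```python
def share_of_searches(num_terms, affected_fraction):
    """Fraction of total search volume covered by the top-ranked terms.

    Toy model: the term at rank r is searched in proportion to 1/r (Zipf).
    """
    weights = [1.0 / rank for rank in range(1, num_terms + 1)]
    head = int(num_terms * affected_fraction)
    return sum(weights[:head]) / sum(weights)


# Hypothetical vocabulary of 100,000 distinct query terms.
for fraction in (0.05, 0.20):
    share = share_of_searches(100_000, fraction)
    print(f"top {fraction:.0%} of terms -> {share:.0%} of searches")
```

Under these assumptions, a small fraction of distinct terms accounts for well over half of all searches, which is why the per-term and per-search numbers can differ so much.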

Just wanted to say thanks to Matt_Cutts for addressing many relevant issues/questions here on HN. Your responses are appreciated and not always taken for granted.

A much better alternative is https://ssl.scroogle.org , which doesn't have personalization since google can't tell scroogle users apart. It also has benefits like having a bit of privacy while you search.

I think Scroogle hits Google from a relatively small set of IP addresses. Be aware that Google is probably trying to localize for those IP addresses, so your results could be less relevant (in the same way that if you searched through a proxy in Germany, you'd be more likely to get results with a German emphasis).

Personally, if I have a search that feels a bit sensitive, I just hop into incognito mode in Chrome. Control-shift-N is an easy shortcut to open an incognito window in Chrome. And don't forget that you can use https://encrypted.google.com/ to do a search via SSL as well, which provides an encrypted tunnel between your browser and Google.

Your recommendation is hilarious because it still does not give the user any privacy. Neither private browsing mode nor SSL will give the user any privacy from Google, which is the real concern. You still log searches and IP addresses for an extremely long time.

One solution that actually does solve this is to use Scroogle or to access google through Tor.

EDIT: I feel that I should clarify that of course SSL is a very important feature to have for a whole slew of reasons and that I'm glad Google supports it, just that it wasn't relevant to my point.

"Your recommendation is hilarious because it still does not give the user any privacy."

I'm a big believer in prioritizing actual issues over perceived issues. Using SSL search prevents bosses, ISPs, and governments from sniffing your queries, and I consider those to be the largest threats to your privacy. Some ISPs sell their customers' query data and surfing patterns, for example. In contrast, when the Department of Justice tried to subpoena two months worth of user queries, Google resisted that challenge and won in court. Having worked at Google for 11+ years, I know that my colleagues care a great deal about our users' trust and privacy and work to protect it with features like SSL search, two-factor authentication, warnings when sites might be hacked or hosting malware, etc.

If you're that worried about Google, don't use it, but if you still want Google results but with as much anonymization as possible, I would choose Tor+incognito-Chromium instead of Scroogle for your searches.

> ...the real concern.

Ironically, "the real concern" is personalized. The married guy searching for "hot local hookups" doesn't care what google knows, he just doesn't want it to show up in his browser's history. The junior high student reading The Big Book of Mischief in the computer lab doesn't want his school's network monitors to find out. The dissident in Tunisia doesn't want his government monitoring his Internet usage.

There's only a small set of privacy-conscious Internet users who should be concerned about Google, whether or not they're as impregnable a bastion of privacy as their employees might claim.

I had to stop using encrypted.google.com because it does not provide a link to switch to image search.

Which in turn, incidentally leads back to DDG : ).

Crazy good stuff has been going on with DDG the last couple of months. I'm just glad I've finally found a search provider who takes these issues seriously!

This entire argument is simply without merit. Google is, and always has been, a filter.

If I type in "Barack Obama", it will not show me links about The Green Bay Packers NFL team. This is not because Google is conspiring to keep me from reading about the Green Bay Packers. It is because it is most likely that I am not looking for Packers links and will not click. The general idea is that Google starts with every single piece of content on the internet and filters to get the content it thinks I am looking for.

Now, this article complains that Google is flawed because it will more likely show some MSNBC over Fox News, or vice versa. The implication here is that you never click on Fox News when it is presented. Because if you did click on Fox News and its ilk from time to time, Google wouldn't start filtering it in the first place. The problem isn't that the search engine creates the "bubble", it is that the user does!

So if the scenario presented in this article offends you, then start changing your behavior and browsing more diverse sites. Otherwise, don't blame Google for noting that you hang out in a very narrow corner of the internet, and presenting you links from that corner. It's just doing its job correctly in that case.

Yeah, this is an important issue. Google apparently ranks results (for you personally) based on some 57 inputs, even when not logged into Google services. In short, results suffer from a self-reinforcing feedback loop, forever constraining what you see.

I wonder how easy it is to get "clean" or "default" results from Google?

And I know it's been discussed many times, but just how easy would it be to maintain real anonymity across the web?

You can get "clean" or "default" results from google by using http://www.scroogle.org/

How? I've seen other people link to it, it's just a page with text salad. What is there to use?

I don't understand that site.

Yeah, Scroogle's front page is really bad. Here's a link to where you can put in the actual search terms: https://ssl.scroogle.org/

That worked, thank you.

Use http://www.scroogle.org/cgi-bin/scraper.htm

If you are using firefox, right click into the input field and select add keyword search, and add a keyword. Now you can easily search via scroogle entering the keyword and then the searchterms in the address bar.

Thank you, I don't find this site that useful but I was really perplexed at its home page, it looks like it was designed by someone suffering from schizophrenia.

Incognito/private browsing + going through DuckDuckGo (even for Google queries, with !g prefix) works for me. (Edit to add since I can't reply: going through DDG avoids Google redirecting to country-specific variation).

I hate that country specific thing.

Somewhere, I think even on HN, I was taught about google.com/ncr - which should solve that problem (being in Country A and arriving on the localized, unreadable landing page) once and for all.

That said: I have DDG as my homepage and I'm a happy user.

Wait, what? Doesn't going through DDG with !g just redirect to Google, where your normal filter will be applied (private browsing notwithstanding)?

No, it's a non-issue, and by the way Google probably uses much more than 57 'inputs' (signals) to determine relevance. Personalization is just a different term for relevant results and this is just FUD, the source of which is people who don't understand the technical aspects.

Personalization can, however, fail. I have a German IP address which causes Google to show me predominantly German search results. That’s not what I want at all most of the time.

There are ways to sort of get around that but they are cumbersome and they don’t always work right.

Google is pretty good at figuring out what to show you depending on the language of the search terms you are using. When there are German words in my search query Google will show me predominantly German results. That’s to be expected, that’s what I want. The problem is that Google seems to use my location (IP address, maybe also the language of the interface and whether I’m using google.de or google.com) and override that behavior so that even if I’m using english words in my query it will nevertheless show me predominantly German results.

I'm gonna humbly disagree. There are many potential ranking factors in a search result: how well the query matches the document text, how important that text is in the document, how important the document is (PR) and so on. What I don't necessarily want is further ordering by what a machine thinks is my political inclination or world view.

But probably more than 80% of the users do want that.

edit: speaking of bubbles :D

"Personalization is just a different term for relevant results"

Fine. It's not a "filter bubble", it's "excessive personalization", and it's still bad.

Arguing about what it's called doesn't do a thing to change whether or not it exists, or whether or not it's a real problem.

Is it actually excessive, though? You're just assuming that it's a real problem without any evidence to support that it is.

Read my last paragraph carefully; I'm just pointing out the name doesn't matter, it's a real phenomenon that doesn't go away by changing the name.

My personal stance is actually a great deal more nuanced, which is that you can't not be in a bubble. It is mathematically impossible. Any way of slicing the torrent of information coming at you constitutes a bias. The entire idea of "piercing the bubble" is an instance of English misleading you, it's a concept without a referent. The question is not how to "escape" the bubble, the question is how do we choose our bubble?

So filtering/personalization is always present; we entirely agree on that point. Is it excessive, though? If it's not excessive, then, almost by definition[1], it's not a problem - or at least is not a major one. There is actually a huge difference between "personalization" and "excessive personalization", which was what I was trying to get at.

(Also, paragraph != sentence)

[1] Excessive meaning "more than is necessary, normal, or desirable".

A paragraph may legally consist of one sentence. It may legally consist of one word in some cases.

You seem to be trying to draw me into defending a point I'm not making. I'm making a much more subtle one, which is that you can't escape being in a bubble (not the bubble, which I initially typed, because there isn't the bubble, there's all kinds of them), so in a way arguing about whether it's "excessive" isn't even the right dimension to argue on; the filter bubbles simply are. (Not "simply are excessive", simply are; they simply exist regardless of whether they are excessive or desirable or anything else.) The question is, what should be done about that fact, rather than how do we prevent that fact from being true, and to be honest I'm rather ambivalent about the answer to that question, because the answer is dominated more by your preconceptions and pre-existing goals than anything interesting.

I really should just blog this up.

I would agree with you if there was a button I could click to turn it off. Their algorithms aren't infallible by any means. Plus, ultimately they are a business and trying to sell, so they are going to eventually skew results towards things they think I may be willing to buy, rather than things I want to know. I think G is more susceptible than MS, but only because MS does make a living selling other stuff.

I'm not saying it's evil or wrong, just that if my first results don't appeal to me, a simple measure that might work for me would be to turn the filter off. Anyway, they could learn more about me if they let me do that.

Turn off personalization by adding "&pws=0" to the end of your search url on Google. You can also use incognito mode in Chrome.
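
For anyone who wants to script this rather than edit the URL by hand, here is a minimal Python sketch. Only the `pws=0` parameter comes from the comment above; the helper name `without_personalization` and the example query are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def without_personalization(search_url: str) -> str:
    """Return search_url with pws=0 set (hypothetical helper name)."""
    parts = urlsplit(search_url)
    # Keep the existing parameters, dropping any earlier pws value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "pws"
    ]
    params.append(("pws", "0"))
    return urlunsplit(parts._replace(query=urlencode(params)))


print(without_personalization("https://www.google.com/search?q=climate+change"))
```

Parsing the query string instead of blindly appending text avoids ending up with two conflicting `pws` values if the URL already carries one.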

Great! Put a button on the search page. Will "pws=0" allow the engine to "learn" from my corrections? I'm pretty sure incognito won't. But allowing me to "correct" the personalization might be useful to you.

I heard about this book on NPR a couple of weeks ago [1]. I really don't like this biased view of ML.

In summary, what he said on the show was something like: at least the news shows (and newspapers, radio) give the same information to everyone, so instead of showing Kardashians news they show you Bin Laden news, even though they know the Kardashians are more profitable... this is not the case with Google Search or Netflix, Yahoo, Bing, etc. (all attacked by this author).

I think that ML helps more than it hurts, and viewing it in a non-technical way is wrong. The author gives the impression that Google (the company and their execs) manually (via algorithms, but very manageable in his opinion) changes the search results to hide things they don't like, which then raises the question "do we really trust one company?".

It's arguable that the search results in Google or Netflix are optimized for profits, but how do you make profits in the customer industry? IMHO you do that by making customers happier, showing results that are useful for them, not for everyone.

I'm waiting for the time when I google: "what channel and time is Conan on" [2] and I get "channel 43 11pm" as the result. Of course that is very personalized, and the result will be just for me... but again I'm the one searching and I'm the one needing the results.

[1] http://thedianerehmshow.org/shows/2011-05-17/eli-pariser-fil...

[2] At the moment I get this 1st result: http://articles.boston.com/2010-11-10/ae/29330376_1_conan-an... which is not what I searched for nor wanted...

I think there's reason to complain that the reality is far from the ideal. Take Netflix, for example. I rated hundreds of movies, filled out the taste preferences to narrow down the genres I was most interested in, but the "Suggestions for You" were underwhelming, to say the least. Even worse, they simply never changed. So I went back into my taste preferences and checked that I "Often" watched every single mood and genre listed. Now I get a vastly improved range of suggestions, exposing me to some great movies, simply because Netflix is no longer hiding them from my view. I may have a unique individual taste for movies, but Netflix sure hasn't fathomed my criteria with their algorithm.

Exactly! It's just nonsense peddled by those who don't understand the technical aspects of search, people who don't realize that a search engine's prime function is to filter the millions of results for each query down to the most relevant results for the user, and that the same 10 results are not relevant to each and every user.

And users, regardless of any personalization, can dig through any initial results.

It's just annoying and ignorant bullshit being disguised as a real issue.

I spent a few hours using Duck Duck Go before commenting. It's been my Chrome default search for 4 hours.

• The results are less relevant. Bubble or no, it was harder to find things I look for all the time. For example I had to constantly add the word Seattle to my search terms.

• It felt like an old search engine. The results, the display, the choice of Mapquest maps all made me feel like I was using Yahoo or even early Google.

• At first the feeling of not being watched was liberating but I forgot about it very quickly.

I'm going to continue using it as my default search for the lack of tracking, but I've already had to go to Google a few times to find what I was looking for. Convenience over privacy, right?

Thx -- in my experience it really takes a week to get into it. If you can stick it out I'd love to get your additional feedback after that point.

Point of fact: it isn't MapQuest, it is OpenStreetMap served via OpenMapquest, which uses their resources to forward that project. You can read more about it at http://openstreetmap.com/ (left column) and http://open.mapquestapi.com/ - in any case, maps are relatively new and in process.

Thanks for clarifying. Open street map is a cool project.

Interesting. I've been using ddg for several months now, and haven't quite experienced my search results as being "less relevant" or "harder to find things" in comparison to Google or Yahoo!.

Ironically, I do use the ddg bang (!) operators quite often. ...like !gm or !gi if I know I want a map or image (for example)

Blekko has been my default search engine for the last months, and I really really like it.

The purpose of search engines is filtering content to give you what you want, which might be information, or it might be discussion, or polemics, or gossip. If you want to learn about some topic that has some element of subjectivity, familiarize yourself with the different points of view and read the writing of whoever you are interested in. There are lots of tools on the Internet to facilitate this -- Wikipedia, for example, is built around an ideal of giving people an objective survey of different things. If you type "climate change" or "Barack Obama" into Google and form an opinion based on the top results then fuck you.

The purpose of search engines is filtering content to give you what you want

The purpose of search engines is filtering content to give us what we asked for. Unless they developed mind-reading technology, Google doesn't really know what I want and attempts at guessing it will lead to substandard results.

-- Wikipedia, for example, is built around an ideal of giving people an objective survey of different things.

A website where any amount of divergent opinions get edited down to a single article on the subject (target of edit wars and well-known editorial biases) is hardly "an objective survey of different things."

The purpose of search engines is filtering content to give us what we asked for. Unless they developed mind-reading technology, Google doesn't really know what I want and attempts at guessing it will lead to substandard results.

I think you really hit the crux of the issue here. My opinion is, these search engines are built around the UI paradigm of "type something into the box and go find that thing." The user doesn't have one box for finding discussion forums, one box for finding blogs that have people they agree with, one box for finding material that appeared in print publications, etc, and yet search engines are for finding all these things, if you want them.

As long as there's people typing "climate change" into Google, Google has to guess at what they are asking for because there ain't enough bits in the query to tell it. There's no a priori reason to expect that they are asking for the most informative and accurate links covering a wide variety of perspectives on climate change; many people probably aren't.

Regarding Wikipedia, well, that's why I said it was the ideal. You're never going to crowdsource perfect objectivity and truth from a million biased writers with ulterior motives, but they try, and they do an OK job on many topics.

The purpose of search engines is filtering content to give you what you want. I think this is fairly noncontroversial.

What is controversial is whether "what I asked for" is a better approximation of that, or whether "what Google's model of me indicates I really want" is a better approximation.

I can't say which one wins for you, right now; but it's clear that this hinges on the accuracy of the model--which is one reason privacy and usability are at odds.

I agree that you shouldn't form an opinion based on Google's top results, but I believe that's _exactly_ what most people do.

How do I know what filters form the top results if they aren't transparent? What would lead you to believe that familiarizing yourself with a different point of view would take you down a meaningfully different path through the search graph? Nothing that I can see. Every search has a non-objective filter. The original page rank is one such. What would be useful is to make the tree obvious and manipulable as a separate object itself.

I agree that it would be cool to expose the different criteria that Google (for instance) is using to help reorder your results, to the degree that those criteria can be discretely identified, but I'm not surprised that they don't; that's a pretty significant portion of their secret sauce they would be publicizing.

I also don't think that mere transparency would really help solve any search engine "filter bubble" problem, if such a thing is real. Nothing would make Joe Google User take five minutes off whatever he came to search for to fiddle with some tree full of sliders on his result page.

Agreed. I wouldn't either 80% of the time. But often I'm making a very directed search that I would really like the "best" results for where best is usually defined as different than the results I'm getting.

Hey, check this out, Google already sort of does it. I didn't know that (probably because I have "web history" turned off, so I don't get customizations.)


When the filters themselves aren't transparent, it is a major issue.

I would gladly prefer a "pre-filtered" list, as long as I could tweak it. For example, if I'm searching for viewpoints that don't correspond with my political views, it'd be nice to be able to find those by disabling the "political bent" filter based on personalization.

To not do so is to constantly wear rose-tinted glasses... pleasant, but ultimately, dangerous.

Confirmation bias is bad enough already; it's really a shame that powerful companies like Google reinforce it just for the sake of giving you more pleasing search results. (Pleasing and Good overlap, but they're not equal.)



Edit: I didn't intend to bash Google specifically. But they are faced with a choice in which their own interests conflict with those of their users. And as Capitalist Bastards are more and more accepted in our society, we don't blame them for the selfish choice. Maybe not a shame then, but at the very least a pity.

This sort of personalization is no different than using a user's click history to determine whether he means cycling or motorcycling when he searches for "biking". It's no different than using history to determine whether someone is looking for a television schedule or coding help when he searches for "programming".

One person's "confirmation bias" is another person's "relevance". Increasing relevance to the user will naturally result in confirming their biases, because there's (probably) no programmatic way to discern contentious subjects in which confirmation bias is applicable from non-contentious subjects where it's not.

The only "shame" I see here is people who ascribe some sort of devious intention to what's clearly the natural result of trying to solve the most important problem in search.

Yes, after all the primary job of a search engine is to fight the user's own biases, and who wants a product that's pleasing to use anyhow?

My phrasing was a bit harsh. I understand that Google (and others) are mainly out to make money. To do that they have to please users most. Anyway, they probably can't tell the difference between "relevant" and "pleasant", because users' behaviour only grants them access to "pleasant".

As long as this is done in a neutral way (by delivering the same result to everyone), any confirmation bias will be averaged across entire populations, so this should be okay.

Personalized results however make the results noticeably more pleasant, and significantly more biased (this is probably unavoidable). Of course Google, Bing, and Co would shun that bias thing. Who can blame them?

I don't want to blame Google specifically. I want to point out this old, common moral dilemma: make money, or don't hurt anyone? Google took the money. Many do. I'm not sure to what extent we should blame them, but clearly, the System™ has room for improvement.

I cannot see a problem here. Who exactly is being hurt by the "filter bubble"?

The end user is fine - they are more likely to see results they are actually interested in. If a user doesn't trust a source and won't click on their links, they'll soon not have to bother scrolling past them.

The sites themselves actually benefit as well. Sure, they may be bumped from the first page of results for users that are unlikely to visit their site, anyway, but the tradeoff is that they get a higher position for the users who may actually visit their site. It's an ideal trade for those being filtered.

I suppose that leaves the idea that the end result is a "biased" internet. I don't buy it. Google is not removing sites that disagree with them, they are re-ordering them for different users. If your profile wasn't factored in, then what options do they have?

They could order on popularity, but biasing towards popular opinion isn't any better than biasing towards my opinion.

They could randomize the order, this would be without bias, but absolutely useless to anyone.

They could judge the objective truth of sites, but that's far more biased than any of the other options.

The end user is not fine. He is more likely to see results that he actually agrees with. See, the original confirmation bias will cause you to seek opinions you agree with more often than others. The search engine will then conclude that you are more interested in the kind of sources those opinions come from. That would be true, by the way, but then comes a point when a quick glance at your search engine results will show you more of what you agree with, and less of what you disagree with.

Now go use that as an estimation of popularity and veracity. I bet many people do, without knowing the result is strongly biased by their own prior behaviour.

Search engines, as the sole entry point to the web, do bias it. PageRank, for instance, could trigger a feedback loop: if a site is more prominent in searches, it will get more links. That will get it more search prominence, and feedback and foom.
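
To make the feedback-loop worry concrete, here is a toy Python sketch. It is not PageRank and not anything a real engine does: the site names, starting link counts and the "links gained per rank position" numbers are all invented. It only shows that when prominence decides who gets new links, a small initial lead can compound.

```python
def simulate(links, rounds, new_links_by_rank=(60, 30, 10)):
    """Toy model: each round, sites are ranked by inbound links and gain
    new links according to their rank (all numbers are made up)."""
    links = dict(links)  # work on a copy
    for _ in range(rounds):
        # Most-linked site first: this stands in for "search prominence".
        ranked = sorted(links, key=links.get, reverse=True)
        for site, gained in zip(ranked, new_links_by_rank):
            links[site] += gained
    return links


start = {"a.example": 110, "b.example": 100, "c.example": 90}
end = simulate(start, rounds=20)
for site in start:
    share_before = start[site] / sum(start.values())
    share_after = end[site] / sum(end.values())
    print(f"{site}: {share_before:.0%} -> {share_after:.0%}")
```

The three sites start almost level, yet the one that happens to be ranked first keeps collecting the largest slice of new links, so its share of all links keeps growing.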

Now is the popular bias better than the personal bias? I think it is. One would at least get to be exposed to others' opinions, instead of just one's own.

If you just care about the economy of the web, in the sense of selling, advertising, promoting, buying… then of course the personal bias is currently best. That's the most efficient way to milk the tear$ out of eyeballs. The easiest way to reward the brains behind those eyeballs. When it's all about money, there is absolutely no problem with the method. But I have other values besides money. A very important one is respecting curiosity and search for truth. The personal bias doesn't.

Look, if it wasn't for this, my 100 "latex"-related searches today wouldn't have led me to a typesetting language.

Then you would have caught the habit of typing disambiguating words, such as "Knuth", or "typesetting", or "language".

We often forget that, although it's obvious search engines filter results, the information we see on social sites is also filtered.

Consider users of Reddit. Now most of them would consider themselves very open minded and enlightened, yet there is active discouragement of radical ideas without due consideration of their merits. It's just easier to downvote and look at Mario cake.

Overall, I think in a way we NEED filters to remove the faff, but be careful to keep a social circle which encourages radical ideas to be brought out into the light of logic and due consideration.

In all seriousness, this is actually my main use case for the private browsing mode in chrome: to search google without the filter bubble(1)

It's quite shocking to see just how much those results differ from the ones I'm usually served, actually.

I know it's actually supposed to be 'awesome' to have every search tailored to _you_, but it just makes me feel uncomfortable that I'm not seeing the internet "the way it's supposed to be seen" - if that makes any sense.

(1): Or at least a smaller bubble, considering it still knows my location (even though I use google.com), my OS, my browser, etc...

I think this is a real issue, and I am glad that DDG is addressing it. This is a more-compelling take on the tracking issue IMO.

What I'd really like to see, is a search engine to allow me to do both. I'd like to have a profile (that didn't use my name), and when I wanted to, I could click a result as 'useful'. This would go into my personal algorithm. I could then toggle between filtered search, or unfiltered search, whenever I like.

It might even be useful to build search filter categories. But I'd keep that a bit buried for the power users.

Use Incognito mode in Chrome when you would like to search the un-"filtered" Google.

Irony: Use of a Google product to circumvent Google's lack of privacy or anonymity in one of their other products.

Unless I'm mistaken, Incognito mode simply disables local history recording, and has not much (if at all) to do with search results or the ability for Google to track your movements.

I suggest Chromium for the paranoid!

Incognito mode isn't just disabled history recording. It does not use any of your cookies from main session, and it deletes the cookies when it is closed. It also isolates your extensions from running incognito unless you elect for them to be.

Of course, I might want my search results to be filtered. To take but one example: Neither FOX nor CNN are providing any valuable insights at all, so filtering them out is a good idea for me.

Or, to get all pithy about it, the problem is not the filter bubble, it's the crap bubble.

If there's a problem with relevance sorting at all, it's the issue that it is based on history, not intent.

I heard you can add "&pws=0" to a search query to turn off personalization:


It's funny how "Paul Graham" is Hacker News' foo bar.

Also, works well for me.

Checking through a few different search terms, that works for me (provides different results)

I noticed this a long time ago. Suddenly I could ask general programming questions and it popped up results in the language I use primarily as the top results (Without me specifying it).

While some may think that this is hiding, do we not often create algorithms to assess and aid? It seems like disabling this would be against the advancement of a learning system.

I for one think it's a great feature and have no desire to disable it.

I just had to explain this to my co-founder last night after he freaked out because our old site with a different name was showing up first in his search results with our new name as the query. With web history off the old site wasn't even on the first page. Now I just have to explain that no one searches for early-stage startups, let alone cares if they change their names :)

What about the filter bubble of me only searching for what I am interested in? Will DDG throw in some random results in every search to save me from myself?

DDG won't build a profile about you; it only returns the most relevant links to the keywords entered. The original Google search, only on today's web.

That's not what he asked. He asked (facetiously) about the higher-order search bubble: ie, if I only ever search for articles about Erlang innards, I'm never going to read about The Kardashians, which is relevant to a sizeable portion of contemporary American culture. To truly pop my bubble, DDG ought to throw in completely random results every now and then.

It is well known that people seek out information that reinforces their own beliefs. And while it may not be ideal for an enlightened population, why should Google/Bing fight this natural inclination? Their job is to ultimately give their users what they are searching for, not challenge their users' opinions. Especially since there are tons of benefits to personalized search, and it may be hard to not throw the baby out with the bathwater.

There are other options, i.e. opt-in vs opt-out, separate sections (only show in sidebar), instant on/off, and overall greater transparency.

Surfing in anonymous mode with Chrome should work as an opt-out, or shouldn't it?

only as long as you don't log in to google or anything else before you search.

There are obvious opt-out options for the minuscule number of people who care.

Exactly the same as the "search term leakage" rubbish.

This mentality ("give people what they want") results in the empty, sensationalist journalism that most people dislike. When I read the news, I don't want to be entertained, I want to be informed. The same goes for Internet searches.

It almost seems like DDG is trying to get me to think personalization == censorship. Sorry, but that ain't gonna fly.

You've just undermined your own search engine's potential to use such a feature in the future -- people will be spitting quotes back at DDG about how opposed to this feature they were. Know your audience.

Sorry to be pedantic, "beg the question" is being used incorrectly. See http://begthequestion.info/ ("This begs the question: what are you missing?"). Doesn't detract from the overall post.

The first example is totally borked. They first search for "climate change" (notice the ") and then search for climate change (without quotes). Of course the search engine shows different results for different queries.

I didn't notice that, but will fix ASAP. The base source of that image was from: http://www.thefilterbubble.com/what-is-the-internet-hiding-l...

Update: the quotes vs no quotes doesn't change the top results on Bing (at least for this search).

Except duckduckgo can't find my project and Google can:



Thx -- we do have our own github crawl, but it needs to be updated more frequently. Duly noted and will fix.

Doesn't that say more about your project than DDG?

Right because you'd only ever want to find things that were notable.

"Try: !github sphela (Searches GitHub.com using our !bang syntax"

!gh works too.

People that don't want "filtering", also want their search results in German, Chinese or Japanese? Or is "filtering" on language ok?

I think it's fair to assume that people want results in the same language as the search terms they entered.

Like Computer, oh well, no Computer in German is Computer. Try Biergarten, oh well.

Any filter at all will result in a "Filter Bubble" as defined here (except, I suppose, returning a randomly sorted list of all sites on the internet). Whether the filter is personalized or not doesn't change that fact. What is the actual benefit of everyone getting the same search results when looking for a particular term? Search engines are not news sites or research papers: biasing towards relevant results is not a bad thing.

DDG is quite good in regular search. I like their commitment to privacy and so switched my default search engine to DDG. But I noticed more and more that I use Google Maps almost as much as search. I search for places and directions on maps and end up going to Google a lot. That is the brilliance of Google. They built search verticals, so it's tough not to use them. If only DDG could do something about that.

We are working on integrating OpenStreetMap, and just yesterday launched some upgrades, e.g. https://duckduckgo.com/?q=Valley+Forge%2C+PA or https://duckduckgo.com/?q=black+lab+bistro (map in right column expands to view). This goes with address detection (https://duckduckgo.com/?q=3+ames+st%2C+boston%2C+ma) and you can always force an OSM query by adding map to the end. And of course adding !gm will take you right to Google Maps.
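
If you want to build these query URLs from a script, a minimal Python sketch follows. The `https://duckduckgo.com/?q=` form and the `!gm` bang are taken from the comment above; the helper name `ddg_url` and its `bang` argument are just illustrative.

```python
from urllib.parse import urlencode


def ddg_url(query: str, bang: str = "") -> str:
    """Build a DuckDuckGo search URL, optionally with a !bang appended
    (hypothetical helper, not part of any DDG library)."""
    if bang:
        query = f"{query} !{bang}"
    # urlencode turns spaces into "+" and escapes "," and "!".
    return "https://duckduckgo.com/?" + urlencode({"q": query})


print(ddg_url("Valley Forge, PA"))
print(ddg_url("Valley Forge, PA", bang="gm"))
```

The first call reproduces the Valley Forge link quoted above; the second appends `!gm` to the same query, which per the comment sends the search to Google Maps.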

Problem with OSM is that it's nowhere near as complete as Google Maps, and I don't see it growing the way Wikipedia did (contributing to Wikipedia is much easier than contributing to OSM)

Just add !gm to your query. If you do it a few times, DDG will remember it and make it easy to add !gm just by pushing ! to open the dropdown.

I've been using DDG exclusively for a little while. Half the time the results are identical to Google. The other half is usually a collection of links from sites I would have searched on to research the topic.

I have Google Web History turned off and I'm seeing a pretty good mix of things in my results. E.g., I'm heavily left-wing, somewhat pro-gun control, with no personal interest in shooting, but my results for 'gun' are pretty much all gun shops and fansites. For Barack Obama I get mostly official sites, although some of the top news is critical (I'm Canadian, if it matters). Egypt gives me a mix of travel, protests, general info and ancient history. I don't recall ever turning it off, so is it off by default? And if so, what's the problem here?

I noticed a while ago that a colleague and I got different results when we googled for our company name, but I didn't know that it's now happening to such an extent.

I've added DuckDuck to my Firefox search engines.

I've set it as default on Chrome. Maybe google is better - I'll get more python libraries and fewer news stories about sheep (or Darwin laureates) getting swallowed by pythons, but I can live with that. Besides, I'm pretty good at typing "google" if I need it.

You can just add !g to the end (or beginning) of your search and it will take you right there.

Actually, this is super handy! When I know I want the wikipedia article I just add !w at the end.

Ditto for youtube (!yt) etc etc

fyi, you can also use !v to search youtube via ddg ('v' for video)

It's not happening to any extent; it's just a new FUD concept that is meant to sell books and confuse people. If you seek more info you can dig deeper into any set of initial results. Personalization == relevance.

That's funny, when I search for "filter bubble" I can't find anyone refuting it.

It is the ultimate quest for serendipity.

If you filter by what you want you don't get what you didn't know you wanted.

I will give DDG the benefit of the doubt and try it out for a couple of months.

Starting now.

Long time DDG user here. I highly encourage people to try and make the switch. I often find DDG gives me the needed information without actually having to visit the site. I still find myself going to Google sometimes if DDG doesn't give me what I want, but it's smart enough to get me the most relevant information faster most of the time. Often enough that I don't feel the need to switch back.

In making the switch, I highly advise you to spend a little bit of time learning the keyboard shortcuts, especially the !Bang feature (https://duckduckgo.com/bang.html). You'll love the HN search options. =) There are so many, you won't be able to learn them all.

You can also still use Google if you want, or even Bing (or other engines). Basically, the !Bang syntax would mean even if you aren't using DDG, it enables you to quickly use whatever engine you really want.

I'm just a fan of DDG. Hope you find it as useful as I do. =)

I'm glad I have been using DDG for over 6 months now. For the new user, here is some advice.

DDG is my default search engine. Basically, all of my searches are completed in DDG; only very rarely do I have to go to Google to find what I am looking for. My suggestion is that you spend 5 minutes learning the bang! syntax, as it speeds up your search by a lot, then set DDG as your default search engine for a couple of weeks. You will see that the bang! syntax and the 0-click-info really make the difference in search speed, and their results are pretty damn good.

Also a long time DDG user. I have it set as my default search provider in chrome, and it has sped up my browsing considerably. Shortcuts like "!w some article" or "!a some product" (takes you directly to Wikipedia or Amazon, respectively) are very handy. The only times I've had to revert to Google are for very specific forum trawling.

From the search engine revenue perspective, how much can be the loss in case they provide a preference option to show "generic results" (like "safe search option off") - with an easy to switch ui between the two?

Normally we don't like to crowd up our preferences page with a bunch of options that people don't use much, but you can turn off personalization by adding "&pws=0" to the end of the Google search url.

There's a personalization / privacy trade-off that needs to be considered. It is annoying that personalization cannot be achieved locally on the user's machine or browser. Filtering / re-ranking results at home offers a lot of privacy. My little personal project is to have the ML for personalization done at home: see the Seeks Project, http://www.seeks-project.info/ Personal control over the personal bubble matters...
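
As a rough illustration of what "re-ranking at home" could look like, here is a minimal Python sketch. It is not how Seeks works; the function name `rerank`, the example URLs and the click-count scoring are all assumptions made up for this example. The idea is only that the engine returns the same generic list to everyone and the click history never leaves the user's machine.

```python
from collections import Counter
from urllib.parse import urlsplit


def rerank(results, click_history, personalize=True):
    """Re-order generic results on the client using locally stored clicks.

    results: URLs in the order the engine returned them.
    click_history: URLs the user clicked before (kept on the local machine).
    """
    if not personalize:
        # The "off switch": show exactly what the engine returned.
        return list(results)
    clicks = Counter(urlsplit(url).netloc for url in click_history)
    # sorted() is stable, so domains with equal counts keep the engine's order.
    return sorted(results, key=lambda url: -clicks[urlsplit(url).netloc])


results = [
    "https://news.example/egypt-protests",
    "https://travel.example/egypt-guide",
    "https://history.example/ancient-egypt",
]
history = ["https://travel.example/rome", "https://travel.example/peru"]
print(rerank(results, history))
print(rerank(results, history, personalize=False))
```

Because the unpersonalized list is always available through `personalize=False`, the user can compare the two orderings directly, which is the kind of personal control over the bubble the comment is asking for.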

I would like to use it as default, but I tried it and it's very very slow... it takes me 5 secs to get a search back. In Google it takes 0.035s.

So I will stay information sided for a while

You're missing a trick by that near-final slide of the DDG input just being an image - would be neat if it were a fully functional DDG search form.

Does hitting the logout button pop the filter bubble?

Personally, not convinced at all. Looks like saying "We are better because we cannot do it yet"

in our research study, we found the impact/presence of personalization very strong: after about 3'000 search queries, in some cases more than every search query received personalised search results. out of the 10 blue links in some cases we found 6.4 personalised: see for yourself (search for Hypothesis 1 to get to the data) http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/ar...

I don't like the direction Google is taking with Search. My view of "improvement" is very different from Google's view. I have the feeling that the more features they add to Search the worse the service gets. I miss the old simple Google Search. I think there is a big opportunity here for another search engine focused on: simplicity, speed, relevance (algorithm) and unfiltered. Back to basics.

In the end (1) technical people will find a way around it when they need to, (2) the conspiracy-theory loving will use the "filter bubble" as an argument for their own ends, and (3) the rest, most people, will just not care... I think the more interesting question is whether this third group will veer off to become more biased to their own world-view as the article suggests.

Google should invite Pariser for an Authors@Google event with Cutts conducting the interview.

How does duckduckgo.com compare with startpage.com?

"But aren't you a part of the relevance equation? The ideal results for a search like [bitcoin crash] should be different for a Japanese-speaking searcher in Tokyo vs. a German-speaking searcher in Munich vs. a bitcoin expert vs. a programmer trying to diagnose why compiling bitcoin is crashing vs. my Mom who has never heard of bitcoin before, right?"

Maybe, but that is not the point.

Besides, you are mixing localization (Japanese vs German) with personalization (expert vs naive), and what is worse is that you are assuming that Google knows each user so well as to be right (which is either impossible or extremely creepy) at every single instant of his life (a person's interests can change).

Furthermore, to grab your example, how can Google know that an expert in bitcoin and expert in bitcoin compilation and crash solver, is not just interested in hearing about the "market" crash of bitcoin?

Google CANNOT read the users' mind. And even if it did, there would not be a need for "personalized search", as the "mind reading" would give enough search criteria to nail the results more easily (albeit, most people do not know exactly what they want, so it will still be an iterative process, which is good, as randomness is the seed for evolution).

So, back to the point, it is that in the quest for "adequate results" for each person, Google is turning web search into a non-deterministic event (*).

Imagine the web being a library, and the search being a search of the library's book database. Why would the search for a given book return different results to different people? It should always return the same results. If the person doing the search is not satisfied with the results, then she/he will add more criteria. In other words, let the person do the filtering!

Google should accept that in its quest for "better results" (where 'better' is a concept decided solely by Google and whose ranking parameters and algorithm are unknown) there is a potential (probably demonstrable already) for a "filter bubble" with a positive feedback loop on user behaviour, which, as with any positive feedback loop, can go out of control, exacerbating certain ideologies and fueling extremism.

And there is a fundamental difference between a "self guided" (as in self controlled) filtering, where users would knowingly filter out results in order to find those that they like, and a "google guided" (as in externally controlled), filtering.

(*) Strictly speaking, search will not be deterministic as the web is a dynamic system and it grows, so search results can vary with time, but they should not vary from person to person at a given time.

Of course Google can't read minds, but that does not mean they should ignore information that they have when deciding which results to show or what order to show them in. Sufficient data to provide a better filter for a given user is not the same as mind reading.

You are absolutely right that more criteria should be used if a user is interested in better results, but I fail to see why a deterministic base case is superior. If I never click on news links and always click on travel links, it makes perfect sense for Google to assume that my search for "Egypt" is looking for information related to travel to Egypt. If I am not following my usual search patterns, I can look on the second page, or I can disable personalized search, or I can search for "Egypt News" all of which would give me better results.

Why is it preferable to always make everyone clarify their searches when there is sufficient information to narrow the search down somewhat without requiring additional intervention by the user? This is usability 101 right here.

http://www.google.com/ncr to disable. NCR=no country redirect.

Works fine, but how do I return to country redirect? Am in Germany, used it for some global searches, now want my localized (German) searches back. Must be missing something obvious here, please help, thanks!

Shame on DDG for endorsing this nonsense.

I understand the quest for a larger userbase but please don't use this sort of FUD that is being peddled by those who don't understand the technical aspects of search and are trying to sell their books.

A search engine's prime function is to filter the millions of results for each query down to the most relevant results for each individual user, and the same 10 results are never relevant to each and every user.

There is little difference between personalization and the relevance of search results.

How would you go about ranking then? Alphabetically?! It's a matter of tuning the relevance 'dials' and it's all in early stages, so a solution to this imaginary problem is more research and not to hide behind bullshit terminology.

And as a bonus a user (regardless of any personalization) can dig through any initial set of results if she seeks more information.

So please don’t buy into this misleading and ignorant bullshit that is being disguised as a real issue.

Search queries have different purposes. Personalization based on my preferences and past search works wonders for known-item seeking and for re-finding.

But for exploratory search and exhaustive search, personalization works as an echo chamber (or filter bubble if you will).

Then there is the problem of change and inertia. People change, their preferences vary in time. Personalization has an inertia that causes the recommendation engine to always be behind the actual preferences. It's more visible the better the recommendation engine.

I'm not saying personalization is problematic. There are problems with using it all the time and stealthily, without a clear possibility to turn it off.

It's not FUD, but a challenge to overcome.

All of this can be remedied by tweaking the algorithms, and that happens all the time. You can also use a certain vertical directly for certain types of queries (scholar search etc).

The same way that what a user might enjoy can be inferred, the inverse of that can be used as a signal in the algorithm as well. Personalization is just relevance and all is just a matter of tuning.

I agree, it can be remedied by better algorithms - such that can recognize when I'm doing a search for a known item and when I'm performing exploratory search. In the first case, personalize away, in the second case, please show me everything.

The challenge is that people switch very quickly between these search modes and don't even think about it.

No, no, no, no, no, no, no.

"There is little difference between personalization and the relevance of search results."

Even if you're a Sailor Moon fan, you might actually be looking for information about the rock rotating around this planet when searching for "moon". In the same vein I might be a liberal but be looking for a multitude of opposing views and ideas when googling "core values of NRA members".

Oh btw, we engineers, our algorithms, aren't smart enough to foretell people's inclinations based on their past habits, we also can't read minds. The increasing use of personalization as a factor for relevancy will, imho, lead to user dissatisfaction, precisely because they make shoddy assumptions.

Believe it or not, but some people also want to be surprised and learn new things. YES. We exist!

So, your argument is that it's bullshit because it's true?

I see it as being a straw man argument.

The article makes the assumption that personalized search results are bad for you, then goes on exemplifying it, but does not say or demonstrate in any way why personalized search results are bad, especially since it doesn't give any context about the person making those queries.

Search ranking is a matter of context. "Egypt" does not mean anything, other than the name of a country, and different people mean different things. When you're searching for "Egypt" and want to get travel tips, if you don't get them you can always expand your query to "Egypt travelling tips". But even that is a pretty shallow request and you can further expand it like "Egypt travel guide to the pyramids".

The search engine's job is to find what I'm looking for. And IMHO the current state of the art is a little behind my expectations - I would have expected these personalized results to be far more effective than they are by now.

It's true that sometimes I'm looking for specific things, but other times I'm searching to get an idea of "what's out there", to fill gaps in my knowledge and make sure I'm apprised of what's going on in the world. It'd be nice to at least have some control over what kinds of searches I'm doing. For example, if I search for my own name, what I want to know is something like: what are the most common search results other people get when they search for my name. The same is usually true if I'm doing "related work" type searching for research; I explicitly want to find stuff outside my immediate area of specialization, in order to make sure I'm not missing anything that other people would've expected me to cover.

I understand that, but if the search results are not personalized for you, then they must be personalized for the common denominator, as in the hive, the majority's opinion, the status quo.

For general search queries, the long tail goes out the window anyway, and I haven't done or seen any quality metrics for DDG, but I doubt their search results are better for exploration or getting opinions different from your own.

I agree. I think people will surround themselves with a filter bubble with or without personalized search results. Ever tried to convince an avid Fox or CNN reader to read the opposing view's site?

From my personal experience, people rarely search for something as simple as "Egypt," and instead, search for "Egypt travelling tips" like your example.

If your Google-fu is strong, you'll write a query that targets your desired results much better than a generic shot-in-the-dark query.

It's bullshit because search is filtering and the filtering done by search is being exaggerated by those who peddle this bullshit.

I expect my search to be faithful to the terms I enter. If the results aren't satisfactory, I'll narrow the search with additional terms. Withholding information based on some guess at my political leanings is dishonest. I'd be appalled if a librarian did it; why should I expect less from a search engine?

If you went to a librarian and said "show me Egypt", she would have nowhere to start. A conversation would then ensue about what you are interested in.

Keep going to the library, keep having those conversations, and eventually, the librarian will take you straight to the books you want when you ask for Norway. If you want something different, you'll have to explicitly ask.

What Google is doing is no different. It just happens to have a lot more conversations with you.

What Google is doing is no different only if you can, in your ensuing "conversations," correct a misinterpretation that may have happened somewhere along the way. If your ability to do that is hampered because the only way to do it is by clicking on something, and you don't see the thing to click on which will do this, then the filtering is not helping (and actually works against Google's and my mutual interests).

In theory, there are two types of corrections, one is blatant, the other is subtle:

- You blatantly correct the conversation by adding additional text to the search: "norway government" versus "norway travel".

- You subtly correct the conversation over time with more government queries/clicks/likes or more travel queries/clicks/likes.

Listening to Matt Cutts's response, it sounds like all of these signals are pretty minor anyway.

Not sure where you are going with your dichotomy- if you are unable to correct in either case (because you don't get the option of clicking on a correct answer) then it fails. I suppose with enough perseverance you could trick the search engine into showing you what you want- if you know it exists in the first place- but then, (1) just enter the very specific term to start with, and (2) this defeats the attempts to become more relevant.

I agree with you that these seem to be small changes. So far.

That's a good point, but I think it kind of misses the forest for the trees. It's true that personalization might sometimes lead Google to mistakenly show you less relevant results, but it seems pretty certain that the opposite would also be true — a generic page will sometimes show less relevant results than a personalized one. One search result is going to be shown ahead of another either way. If showing a less relevant result ahead of a more relevant one is considered "hampering," I don't see any reason this "hampering" is unique to personalized search.

I think it would come down to a matter of degree. A slight change in the order of results is pretty meaningless either way. But major alterations, where some pages are relatively inaccessible, could obviously be an impediment to getting the results we want.

Let me try this example: Suppose your searches are primarily academic. Let's say that whenever you search the term "momentum" you are looking for something scientific- ballistics, elementary particles, whatever. But one day you are writing a blog in which you want to search for background, but you need to use a non-scientific meaning of the same word. Perhaps a psychic used the term and you are debunking. The particular case is irrelevant. The point is that if personalization is too aggressive you may not find the info most relevant to your interests.

This isn't privacy FUD or anything, just a pragmatic warning. The FUD comes when you start thinking about how their algorithms might actually decide which results you are "interested" in. Thinking of Google, what business are they in? What about your behavior on the web most interests them? Would they decide that the most interesting things to you are the ones on which you clicked the most ads?

But whatever change personalized search makes relative to generic search, the opposite change will occur going from personalized to generic. To penalize personalized search when the change is of equal magnitude going either direction doesn't seem fair. As long as the personalizations are just a transformation and not a subtraction, the two options are just mirror images of each other. They have the exact same kind of failure condition. The question is just which one's failure conditions are more likely to occur.

As a counter-example: Suppose your searches are primarily academic. Let's say that whenever you search the term "momentum" you are looking for something scientific- ballistics, elementary particles, whatever. But most people aren't looking for scientific info, so you constantly have to dig and dig to find anything relevant on Google. The point is, if the search technology is too impersonal, you may not find the info most relevant to your interests.

As long as the personalizations are just a transformation and not a subtraction

This is the crucial point. As long as it's accessible, I can get what I want by search refinement either way (but I would imagine that it would be easier for me to refine in the space in which I am familiar than to refine in the unfamiliar space). Giving me the (easy) option to turn it off is a good idea for search.

What does faithful mean? You want German or Chinese results? You want global or local results (in your language of choice). There is always personalization in search.

The 'political leaning results' are just a tool the author used to try to explain his (ill-informed) point. There is no evidence of any such degree of personalization, especially not to any 'bubble' inducing extent. It's all very sensationalistic.

And for the record the political example was about facebook wall.

I think you missed the point. Yes DDG ranks, but it may rank differently. Of course, there is an implicit slam on them here: instead of showing you what you want the vast majority of the time, they show you other stuff. This is good on occasion (so you don't live in an echo chamber), but if this was all the time it would mean lower quality search results for simple things.

This is exactly why I stopped using DDG as my main search engine. My workflow was often like this:

- Search for term on DDG

- Look at results, find nothing

- Try to find better term

- Give up and use original term in Google

- Find result

I tried to use DDG exclusively two times (once 8 months ago, once ~4 months ago), but the result was the same. I don't know how much personalization affects this, but Google just gives me the best results.

I have had almost the exact same experience. I want to use DDG, but I too regularly had to use Google after I couldn't find the results I was after.

That's quite a point.

My own position is that most of the time you need what you have to have to complete the task at hand. Then you could go to YouTube and refresh "Recommendations" to your heart's content.

The other way around is to wreak havoc to important parts of your life.

They show you other stuff

Do you have any proof that this isn't what all search engines actually do?

Sorry but this is a real issue, whether DDG solves it or not.

At least to me.

Why do you think it is an issue? I think everyone already filters intuitively. It's called having a "bullshit filter". I believe it makes not much difference whether it's automated or everyone does it manually.

I think the chance that I would click on a bullshit source like Fox News is very low because I don't trust them. Even if Google brought it up as the first result.

My point is that people already only click on search results that they agree with. Nobody wants to see stuff that they disagree with. That is human nature.

So living in a filter bubble already happens without automatic filters. It happens in our brains and we call it personality.

It's not an issue just because you want to believe it is.

Media outlets always filtered stuff for you but now the algorithms are 'filtering' results according to your interaction to find what's relevant to you, so you have a 'vote' here.

You can also dig through any initial results (click the news vertical to find more news, for example). You couldn't do that with the media of yore.

It's not not an issue just because it doesn't apply to you.

If you want to understand where I am coming from then this is why I think it's an issue


Maybe we don't want a filter that sucks, plain and simple? Whether the filter sucks or not is completely subjective, but it sucks for me, so there.

Changing the language is not that intuitive in Google, it's a drag. I have to do it all the time when I clean my cookies, and sometimes I still get crappy "pt.wikipedia.org" results on top instead of "en.wikipedia.org" just because my browser isn't in English and I don't want to change it just because of Google.

I might want to type www.google.it and search for Berlusconi news in Italia, but it won't let me, because it just shows me Brazilian news. So yeah, I can't even get to another bubble.

Also, apparently Google filters by IP, since searching via Tor or a VPS renders very different results. I dislike this for personal reasons because it breaks the internet for me.

Also, I might not like the fact that Google collects stuff about my search habits, which is also a valid concern.

So, there, it's an issue for me. Might be an edge case, might be an exception, but I think that I deserve to know why this happens and the alternatives.

"There is little difference between personalization and the relevance of search results. How would you go about ranking then? alphabetically?!"

Oh, you know, maybe using Google's much vaunted "PageRank" algorithm, which is supposed to take into account things such as links on other pages back to the page being considered for inclusion in the search results. The more such links there are, the better the rank is supposed to be.

The above description is obviously an oversimplification, and I don't have access to Google's PageRank algorithm anyway, so couldn't tell you what it actually was if I wanted to, but it's something along those lines, and need not take into account your previous search history in any way or "personalize" the search for you.
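To make the "more links back, better rank" idea concrete, here is a minimal sketch of the textbook PageRank iteration on a toy link graph. It is not Google's production ranking; the function name `pagerank`, the damping factor of 0.85, the iteration count, and the four-page `toy_web` graph are all assumptions chosen for illustration. Note that nothing in it looks at who is searching.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to.
    Illustrative only: real search ranking uses many more signals.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with rank spread evenly over all pages.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to "c", and "c" links to "a".
toy_web = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph "c" ends up on top because it has the most inbound links, and "a" comes second because its single inbound link is from the highly ranked "c". The result depends only on the link graph, so every searcher would see the same ordering.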

Spreading FUD about tracking search terms etc didn't seem to help DDG last time. I doubt this will have any effect either.

DDG: If you want to grow, don't keep attacking the big search engines with FUD that only a minuscule % of people care about.

The February DDG newsletter reported growth in January from 2.5m searches to 5m. I don't think DDG needs FUD to grow.


Searches per day in Feb average = 197,687

Searches per day in May average = 205,759

FUD certainly won't grow DDG out of the extreme niche it's in.

That agrees with what I said. Unless you don't think that much growth in searches over such a short period is significant.

It's even more significant when you look at where it was a year ago. 1.2 million in April 2010, 5.9 million in April 2011, 6.3 last month, and already 3.8 a little over halfway through this month. This is significant when you consider the lack of marketing and that DDG is run by one person.

It's not significant. In the world of search engines, you need to be the #1 or #2 player to earn major revenue. People aren't gonna wanna waste any time/energy advertising on your platform if you're getting just a handful of searches per month for say... mp3 players or cellphones. DDG is just betting on an acquisition, but it's just a wrapper around the Yahoo BOSS API.

Nothing starts at #1. Expecting DDG to displace an oligopoly built over a decade and a half in a year or two of existence is asking too much of anyone.

DDG has been around for more than a year or two...

This assumes that 'filter bubble' is something more than a nonsense term.

There is little difference between personalization and the relevance of search results.

How would you go about ranking then? Alphabetically?! It's a matter of tuning the relevance 'dials' and it's all in early stages, so a solution to this imaginary problem is more research and not to create bullshit terminology in order to sell some books.

Most people don't realize that Google and other companies are doing this. That's my main problem. It's not about selling books in my mind as much as it is about communicating why something is in a search list for person A vs. person B. I don't want my Internet censored.

Do you see search result #1,000,000 on your first page? Congratulations, your internet is censored.

People don't realize many things about search engines, from indices to ranking algorithms. They do realize, however, that a search engine is supposed to return the most relevant results for them, and that is where personalization/relevance fits in.

There is a major difference between personalization and "pure" relevance, at least as used here. However, it is more of a question of inputs than actual terms.

"Pure" relevance is solely a function of the query and the document - R(q,d).

Personalization involves the user and possibly the context - R(q,d,u,x). Now, if you consider the user and their context to be part of the query, then yes, it's the same thing as relevance.

So the real question: should the user's identity/history/profile and/or current context be considered a part of the relevance function or not? DDG says no, Google says yes.
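The difference between the two signatures can be sketched in a few lines. This is only a toy under stated assumptions: the term-overlap score, the `topic_affinity` profile, the `language` context field, and the weights are all hypothetical and stand in for whatever a real engine does. The query and documents borrow the "Egypt" example from earlier in the thread.

```python
def relevance(query, doc):
    """R(q, d): fraction of query terms that appear in the document text."""
    q_terms = query.lower().split()
    d_terms = set(doc["text"].lower().split())
    if not q_terms:
        return 0.0
    return sum(term in d_terms for term in q_terms) / len(q_terms)


def personalized_relevance(query, doc, user, context, weight=0.5):
    """R(q, d, u, x): the same base score, adjusted by user and context.

    `user["topic_affinity"]` and `context["language"]` are hypothetical
    inputs standing in for click history and localization.
    """
    base = relevance(query, doc)
    affinity = user["topic_affinity"].get(doc["topic"], 0.0)
    locale_bonus = 0.1 if doc["language"] == context["language"] else 0.0
    return base * (1.0 + weight * affinity + locale_bonus)


def rank(docs, score):
    """Order document ids by descending score; ties keep their input order."""
    return [d["id"] for d in sorted(docs, key=lambda d: -score(d))]


docs = [
    {"id": "news", "text": "egypt protests news", "topic": "news", "language": "en"},
    {"id": "travel", "text": "egypt travel guide", "topic": "travel", "language": "en"},
]
traveller = {"topic_affinity": {"travel": 1.0}}
context = {"language": "en"}

generic = rank(docs, lambda d: relevance("egypt", d))
personal = rank(docs, lambda d: personalized_relevance("egypt", d, traveller, context))
print(generic)   # same for every user
print(personal)  # depends on the user profile
```

In this sketch the personalized ranking reorders the same set of documents rather than dropping any of them, which is the "transformation, not subtraction" case discussed above. Whether a real engine behaves that way is exactly what is in dispute.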

Thinking of it in this way makes the real question clearer. Unfortunately, it will probably make little sense to most Real Users.

I suspect DDG is having trouble making money hence the need to make a commotion about filter bubbles and generating some linkbait. No surprise... < 1% of a search engine market is not gonna even make you much at the end of the day.

