We use Google because the results are useful, not because they are "unbiased". Ranking implies some sort of "bias" and is what makes search results generally useful. We don't want a search engine that does nothing clever and just spits back unranked results. Otherwise, we would be inundated with results containing credit card scams, porn, Bitcoin scams, Viagra ads, etc, when we search for... pretty much anything.
In privacy (incognito and not logged in) mode, all of the above still applies. What would NOT apply is something like: You are a vegetarian and suddenly all of your restaurant searches rank vegetarian restaurants higher in results while in privacy mode. Unless, of course, for some reason people in your general location happen to mostly eat vegetarian.
In any case, if people don't like it, stop using Google and go use some other search engine; there is absolutely nothing holding you back. More often than not, I think people will switch back to Google because they find the results more useful, even in privacy mode.
I now use DuckDuckGo as my default search engine and my experience is mixed.
The problem with Google is that sometimes you search for something new and then you see the bubble very clearly, which applies not only to search but also to YouTube (maybe even more).
The problem with DuckDuckGo shows up when you are searching for something specific, or something you saw months ago and don't remember well; then Google's index and tracking can be useful.
At this point I don't treat search engines as some sort of dichotomy (Google or DDG or Edge, etc). Rather, I try to use them as a nice blend: Google for when I'm throwing darts at the dartboard and have no idea what I'm looking for, DDG for when I know exactly what I'm looking for (to the point where I can type in the URL), and so on and so forth.
There's absolutely nothing wrong with using multiple search platforms. Obviously Google is great for when you don't really quite know what you're looking for, but if I want to read Deadspin, typing "deadspin.com" into Google will be the exact same experience on DDG.
Seeing as how most people visit the same websites over and over again, it doesn't make sense to have just one search engine (e.g., Google).
Startpage is basically Google results, but the filter bubble is "all Startpage users". It also brings back some of the search operators that Google disabled.
Qwant is a European search engine that brags about privacy, but the results are hit-and-miss for me so far...haven't used it much.
I remember when Google blew us away with PageRank (goodbye AltaVista!), but in the last few years Google has gotten so good at providing entry-level information that it's useless for finding specifics, so I expect the next Big Thing in search to come along, though I have no idea how far out it is.
The other annoyance is the lack of Wikipedia results. For a general topic, I like to have a few pages to choose from about the topic in addition to Wikipedia. Rarely are Wikipedia results in my organic listing unless I specifically add wiki or Wikipedia.
By the way, this is how to search your query.
Fair enough, so click on the Search Instead and it says "Did you mean: ways to make naan" and then shows a bunch of links about breadmaking anyways.
Disclaimer: didn't test it, but I often use this trick to force G to use some word, and to use it exactly as written.
Also, the DDG devs are certainly lurking on this thread, and can fix the class of queries in question.
> In fact, DuckDuckGo gets its results from over four hundred sources. These include hundreds of vertical sources delivering niche Instant Answers, DuckDuckBot (our crawler) and crowd-sourced sites (like Wikipedia, stored in our answer indexes). We also of course have more traditional links in the search results, which we also source from a variety of partners, including Oath (formerly Yahoo) and Bing.
What this means is that they use 400 sources for things like Instant Answers and other widgets but Yahoo and Bing for all their organic search results.
In the name of optimizing for 'engagement', my youtube recs are full of politically polarized clickbait. They're not merely reinforcing my existing beliefs, they're actively trying to push me into a bubble.
At least facebook has the excuse of actual people pushing this stuff.
My other favorite example is Netflix and WWII docs/movies. Watch just one and forever onward they will be half your recommendations.
In your vegetarian example, what if 51% of people in an area were vegetarian, and the general population was making decisions based on these "localized" search results? We would likely expect this to push the minority toward the tastes of the majority.
This might be fine for something like vegetarianism, but what about other topics? Should your search results be more racist because you live around a lot of racists? This is best case.
I have tangentially worked with groups that specifically utilize this to provide public opinion sway and consumer capture for their clients.
Also, search engines can play with inconsistent ranking of results to see how click-throughs might be affected. For example, if moving a link from first to third in the result list has no effect (people continue clicking on the same link even though it's now third instead of first), then it's a pretty strong signal that the link should continue to be ranked first in future results. This experimentation with search results matters even more the more uncommon a search is, because there is less confidence in the current ranking until there is more activity to base the ranking on.
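To make the rank-shuffling idea concrete, here is a minimal sketch of the comparison described above. Everything in it is hypothetical: the `PositionStats` type, the `demotion_had_no_effect` helper, the 10% tolerance, and the numbers are made up for illustration, and I'm not claiming any real search engine does it this way.

```python
from dataclasses import dataclass


@dataclass
class PositionStats:
    """Impressions and clicks observed for one link at one rank position."""
    impressions: int
    clicks: int

    @property
    def ctr(self) -> float:
        # Click-through rate; 0.0 when the link was never shown.
        return self.clicks / self.impressions if self.impressions else 0.0


def demotion_had_no_effect(at_first: PositionStats,
                           at_third: PositionStats,
                           tolerance: float = 0.1) -> bool:
    """True if the CTR at rank 3 is within `tolerance` (relative) of the CTR at rank 1."""
    if at_first.ctr == 0.0:
        return False  # nothing to compare against
    relative_drop = (at_first.ctr - at_third.ctr) / at_first.ctr
    return relative_drop <= tolerance


# Hypothetical numbers: the same link shown first (control) and third (variant).
control = PositionStats(impressions=1000, clicks=400)
variant = PositionStats(impressions=1000, clicks=390)
print(demotion_had_no_effect(control, variant))
```

A real experiment would also need a significance test, which matters most for the uncommon queries where there are few impressions to go on.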
Just as stores shift around product placement (front of the store, back of the store, etc), a search engine is free to shift around search results. Keep in mind that product producers might pay for better in-store product placement too, just as customers pay search engines for ad placement in search results.
It is "useful" for me to be on the phone with someone in Cleveland and describe how to find something on the Web, expecting that they can follow a similar set of steps at a similar time and get a similar result.
A (sort-of-)deterministic Web can be good and useful. It is a very strong statement of preference and exercise of power to declare that "useful" results must be meaningfully different based on the characteristics of the individual searching.
For whom is that exercise of power most beneficial? I would argue that a rapidly shifting, slippery, personally-dependent presentation of the world's information is extremely useful as a tool of control, but gives only occasional and relatively marginal benefit to individual searchers.
The 2016 US election is a big case in point. Personalizing information delivery, when coupled with asymmetric processing power and data availability, lets you have situations where an atomized polity winds up seeing what suits each individual, but with a radically degraded ability to form collective truths or consensus.
The definition of "useful" is an exercise of power.
More and more, I feel that these ideas of optimizing for 95% of the use cases give good results on paper but shitty lives for the 5% left.
I understand the good intentions behind that calculation, because making life easier for a huge majority of people should be a good thing.
But, for instance, boosting local results is one of the ways you’ll make people who often search for foreign information miserable. Searching for remote places will most of the time be met by random local businesses first. Web-based international content will be outranked by local content, and your local newspaper bitching about heat waves when it’s just summer will outrank by far rock bands and manga titles.
Sometimes that’s the wanted behavior, but Google currently works with a strong preference for localised search, and that’s one of the things that pushed me to DDG.
In a way if Google wasn’t so massively successful I’d root for them to better serve mainstream searches. But in the position they are now I think it’s harder to say they should just care about the vast majority of people. Even 1% of their userbase is an incredibly huge number.
But what I would also like is a way to search without using my context, as sometimes I want results that aren't related to my location etc.
The article runs through the analysis you propose, and (within the limitations of the study) shows that Google does apply very similar filter bubbling logged out in incognito mode vs logged in.
In fact in “anonymous” mode, the results are much more similar to the same person’s “logged in” mode than to other randomly chosen people’s logged in or logged out results.
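For anyone who wants to try this kind of comparison themselves, here is a rough sketch of one way to score how similar two result pages are. It uses plain Jaccard overlap of the result URLs, which is my assumption and not necessarily the metric the study used; the domains below are placeholders.

```python
def jaccard(results_a, results_b):
    """Share of distinct URLs that two result lists have in common (ignores order)."""
    set_a, set_b = set(results_a), set(results_b)
    if not set_a and not set_b:
        return 1.0  # two empty pages are trivially identical
    return len(set_a & set_b) / len(set_a | set_b)


# Placeholder result pages for the same query.
me_logged_in = ["a.example", "b.example", "c.example", "d.example"]
me_incognito = ["a.example", "c.example", "b.example", "e.example"]
stranger = ["a.example", "x.example", "y.example", "z.example"]

print(jaccard(me_logged_in, me_incognito))  # same person, different modes
print(jaccard(me_logged_in, stranger))      # different people
```

Jaccard ignores ordering, so a page with the same links in a different order scores as identical; a rank-aware measure would be needed to catch reshuffling.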
For the sake of education and not being evil (or "doing the right thing"), it would be nice to be able to view results from other typical profiles' points of view.
Should there not be an opt-out option though?
I don't mean this in a snippy way, but truly. If it's that bothersome, why not try something else? It seems that most people instead think that they have the privilege to change the product to their desires.
Because the "something else" lacks other features. Are you suggesting that users shouldn't have "the privilege" to suggest new features or discuss already existing features?
And I've been using DDG for a bit now and have found it perfectly useable.
That's great that you use DDG and find it useful! If Google was a true monopoly, as the current media blitz would have you believe, then you would not have been able to so easily switch to DDG (or Bing, or...).
I can't say with confidence how far along Google is toward the threshold of "Monopoly" and have yet to hear an analysis that I would consider definitive in any way.
This is shady, at best. It probably contributes to the US’ current political instability (different propaganda / news in red states), and is also probably an unauthorized use of personal information in places like Europe that have laws about such things.
The only reason for the biased localized results is due to corporate pressure from media and news industries.
Also, you are conflating localized and unbiased results with spam and scam. Nobody is saying google shouldn't remove scams and spam. People are just saying they want unbiased results.
That's contrary to everything I know about Google's history, including the origins of Backrub and PageRank. Please provide citations in your comment.
> Please don't impute astroturfing or shillage. That degrades discussion and is usually mistaken. If you're worried about it, email us and we'll look at the data.
1) repeated queries from the same user. Do the results stay constant over time or do they change?
2) comparisons to the same experiment run against e.g. Bing or DuckDuckGo.
It seems to me that some variation in results is to be expected because of users hitting different backends which might be at different stages of index rollouts. Similarly, response times of different backends matter. If for example the video results don't come back in time you'll end up not having them in the result set.
Lastly, the insinuation of the article is that "unbiased" search results are clearly preferable. I'm not convinced. I for one like that STD for me is associated with the C++ standard namespace (which I search for all the time) rather than sexually transmitted diseases (which I luckily don't have to care about as much).
The insinuation is that you should know if they are biased, or that you should be able to get unbiased results if you so wish.
It also raises suspicions on how much google tracks each user.
From this point of view, what would be interesting is a local study, to see if 100 people all in the same neighbourhood with different browsing habits get different results. This would eliminate the "non-tracking" part of the personalization.
Let's say you have three search orderings: ABC, BCA, and CAB. Which one is the unbiased one?
Isn't DDG mostly Bing results these days anyways? (unless you're searching in Russian)
On the other hand, authors could find better names for their libraries ...
Further, there are different solutions, where the user has full control over the context of their search. For instance by maintaining a fully user-controlled list of keywords that is remembered by a cookie (which can be deleted as well).
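A minimal sketch of what such a user-controlled context could look like, assuming the keyword list is stored as URL-encoded JSON in a cookie value. The function names and the naive substring scoring are made up for illustration; no existing search engine's API is implied.

```python
import json
from urllib.parse import quote, unquote


def encode_context(keywords):
    """Serialize the user's keyword list into a cookie-safe string."""
    return quote(json.dumps(sorted(set(keywords))))


def decode_context(cookie_value):
    """Read the keyword list back; a deleted or empty cookie means no context."""
    return json.loads(unquote(cookie_value)) if cookie_value else []


def rerank(results, keywords):
    """Move results that mention more of the user's keywords toward the top."""
    lowered = [k.lower() for k in keywords]

    def score(title):
        text = title.lower()
        return sum(1 for k in lowered if k in text)

    # sorted() is stable, so results with equal scores keep their original order.
    return sorted(results, key=lambda title: -score(title))


results = [
    "Django Reinhardt, jazz guitarist",
    "Django web framework documentation",
    "Django Unchained (film)",
]
cookie = encode_context(["python", "framework"])
print(rerank(results, decode_context(cookie)))  # framework result first
print(rerank(results, decode_context("")))      # cookie deleted: original order
```

The point of the design is that the context lives entirely on the user's side: they can read it, edit it, or delete it, and with it gone the results fall back to the unpersonalized order.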
It makes total sense for them to personalize search results. If I am searching for Django, it's the framework, not the musician. When I search for a restaurant name, it's the one in Boulder, CO, not a restaurant by the same name on a different continent.
People always adjust their messaging according to who they are talking to. It's kinda weird how it's creeping people out when computers do this.
I also would likely mean the web framework, but if I suddenly become interested in Django music one day, I don't want Google to make assumptions.
I'm fine with others having the option of personalized search, I'm not fine with me not having it.
This will turn off all localization except language, location, and device type (mobile vs desktop).
Google doesn't have a way to turn off localization - obviously, it'll display different results for "football" in the US than in the rest of the world, and neither result is the "official" one.
If you need to change language, that's easy, it's just in search settings.
If you need to change location or device type, that's harder. You can either use Chrome's dev tools, or a tool like http://www.isearchfrom.com/
Karma-free HN could be a browser extension. We could call it nirvana or something.
Almost every time I search, I don’t get a single result I want on the first page. The first 3 results are sponsored ads, then there is the Danish Wikipedia article (useless), then 3-6 advertisements pretending to be content, and then if I’m lucky something that was relevant 5 years ago.
DDG isn’t much better, but it’s better.
I’m not sure if search engines are really to blame though. With everyone being on Facebook, Medium, Quora, reddit, 4chan and so on, it’s like the web just stopped having content worth visiting.
If it wasn’t because HN gave me interesting content, I’m honestly not sure why I’d ever browse the internet anymore. But maybe I’m just getting grumpy.
Quora is another good example: it’s a place I often visit after search results, but on google.dk it’s almost never a result, possibly because it’s not in Danish.
DDG is much better, but once in a while when I’m searching for something very specific that I know Google will find first, I’ll do the !g.
If I’m looking up anything technical or committing an act of Google programming, I’ll always go straight to Google.
The other day I was looking for some pipe extenders for our shower though, and none of Bing, Google, or DDG was able to help. I ended up finding them by searching on Amazon. Google was 100% commercials for plumbers and completely useless otherwise. DDG and Bing had no clue what I was looking for. A few years ago, Google would have been able to help; I know, because Google helped me find our current ones.
What do you see when you google pipeextender?
On the other hand, when I expect my search engine to make a best effort guess about my intentions, I just use the !g bang, and almost always Google finds stuff that DDG misses. This is essential for research, but downright annoying otherwise (especially considering all the linkfarms like WikiHow that have spammed their way to the top of PageRank).
I'm quite a privacy-conscious individual but I'm not going to significantly hamper my engineering abilities to prevent google from knowing what technical problems I'm having.
In a very different context, I ran an analysis of terrorism coverage in the NY Times to measure what a geographic filter bubble looks like:
How Media Fuels Our Fear of Western Terrorism
I also ran the same analysis for all the articles over a decade by geography (and compared to population, GDP, etc):
Visualizing 10 years of International Coverage in the NY Times
While filter bubbles are more pervasive in digital media (where we can segment each user, including with personal information), they’ve also always existed.
I remember 2000s era search and looking at Page 2. Now I don't scroll below result five 99% of the time. Thank you, Google.
I have to say, though. US Google is better than any other Google I've used.
The problem is the term "filter bubble" conflates personalization, relevance, and recommendations.
I can do without the recommendation engines.
Source: Worked on recommenders for a mid-sized e-commerce site.
SearX is a metasearch engine that proxies out search requests and randomizes all browser fingerprints to make it difficult for any individual to be tracked via algorithm. I don't know how effective it is of course, but I find I prefer the search results I get out of it vs google, even if the image search interface isn't as flashy.
I put my instance behind https and simple auth to allow me a bit of security while using it outside of my private network.
If you want the privacy shield vs Google/Bing/etc and don't mind a middleman having your search history, there are public SearX instances as well.
Being open source, you're free to fiddle with it any way you want, and I consider it a sort of condom for your privacy.
This is the assumption underlying their research, and it is fundamentally not true.
(additionally, I am highly skeptical of the filter bubble's existence/effects and the book was terrible - full of "mights" and "coulds" and few solid facts.)
Filter bubble examples:
Search services: Google, Bing
Music: Spotify, Apple Music recommendations
Social media: Facebook feeds
Customization is troubling, but less so than bubbling. (Hey now...)
Firefox offers integrated protection against browser fingerprinting, but you have to turn it on because it's off by default: https://support.mozilla.org/en-US/kb/firefox-protection-agai...
Fingerprinting protection is also available on Safari on Mac OS X Mojave and iOS 12: https://www.cnet.com/news/new-safari-privacy-features-on-mac...
That may not influence normal, acculturated adults too much, but it may influence young and unacculturated people; think, for example, of modern urban legends like "white sugar is poison" or "chemical trails" and their "tam-tam effect".
Another point is the "censor effect": we know well that search-based information access is less detailed than taxonomy-based access; we experience that often when we organize our mail, documents, and files, alternating between taxonomy-based and search-based UIs. When our entire world relies on search-based UIs instead of taxonomies, whoever controls search may control knowledge. So it will become easy to "hide" something, "push" something else, etc.
Normally this is not a problem; it starts to become a problem when a very few search systems become so ubiquitous and dominant.
"Convergence": tied to the first point; think about feeds vs. aggregators. With feeds you search for specific stuff and stay up to date, while you tend to ignore things that don't interest you. With aggregators this "soft polarization" effect gets somewhat lost, replaced by another (potentially driven) "hard polarization" effect. As a result, general information becomes less diverse (every publisher tries to be at the top of every aggregator's results instead of following its own style) and people become more "extreme" in their information interests.
That has far more implications than mere privacy. And if you add to the mix the current state of communication systems like WhatsApp, Gmail, etc...
That model was likely built with sources such as the English Wikipedia, and their archive of a few million books. So the space at the bottom of the page may be getting a little tight by now.
When was the last time you tried to permanently remove a domain/website from search results? And not temporarily via the "negative" operator, as that gets tedious.
So instead of having an open, curated or crowd-sourced list of bad domains that can cater to any specific crowd/subset, we are forced to accept or promote serious, government and large-scale censorship in order to hide bad content.
But more to the point. It's more about centralization than it is about "filter bubbles".
If search results were perfectly consistent, some smaller websites might not get any search traffic at all and most big corporation websites would get all the traffic. It would greatly exacerbate winner-takes-all effects and inequality.
Personalization allows for some small websites to start with a niche and slowly grow to become more mainstream.
The cost however is discovery, which is to say things you might be interested in but didn't know exist. To enhance discovery you often need a wide band curator that can surface "likely" interesting things without destroying the experience of always finding what you want.
In the world of real goods these sorts of discovery curators are enthusiast publications which might talk about the new things coming down the road, or a restaurant critic that is trying the new restaurants.
Real human search and discovery is a pretty personal thing. And when it all goes on inside your head/environment, it's pretty acceptable too: people putting their favorite cookbooks in a more prominent place, or wearing specific fashions that they like while only really shopping at clothing stores that support that look.
When that information is at a third party, and dissectable by tools, then it gets creepy.
Someone who doesn't "know you" but typically wants to sell you something can find you and market to you, to help you "discover" something new on their schedule instead of on yours. It gets annoying when that knowledge about what you like and don't like, pay attention to and ignore, is weaponized into a tool against you (ostensibly to help you see "great deals" that you might have otherwise missed), whether it concerns a new job opportunity, fashion choices, the vehicle you drive, or even where you eat lunch. And when the version of you that you present to the world is quite a bit different from the version of you that only you or your closest confidant sees, and someone outside that circle gets a peek because of your search history and what you have shown interest in? That is an existential threat of 'outing' the real you.
That information is power: the power to influence you, the power to sell to you, the power to expose you, the power to control how you see the world and ultimately control your actions in that world.
Imagine a machine that, as people used it, condensed bricks of pure platinum out of the air as a side effect of its operation. Now you tell the owner of the machine, "You can't sell that platinum; you need to just grind it up and throw it away." Well, that isn't going to happen, even if there is a big 'for show' grinding operation taking place up in the lobby of the machine's owner. The owner might say, "I charge you nothing to use my useful machine; I am going to keep some of the platinum it produces to cover expenses."
I'm using the phrase in the colloquial sense, where a "known" person is someone who is both familiar and has been granted a certain level of access to your inner thought processes.
I understand why the result page number matters but the exact rank having such a huge impact is surprising.
Thank god someone is discussing this. I think it's a real shame that the "media" is focusing only on the 2016 election in discussing how Trump manipulated voters. Sure, they ran advertisements (and Russia did), but the reality is this is no different than any of the recent elections.
Obama was right at the forefront of this tactic:
Yet, when Trump does it (perhaps better executed, or the platforms are better) it's "manipulating an election!"
Please, continue this research and keep it unbiased.
All of the Obama supporters who traded their personal information for a ticket to a rally or an e-mail alert about the vice presidential choice, or opted in on Facebook or MyBarackObama can now be mass e-mailed at a cost of close to zero.
That's a ...newsletter? Right?
How is that comparable, except in "It's on the internet" terms, to the Russian government secretly funding ad campaigns using illicitly gained psychometric data?
Not sure how you’re equating the two, but it’s disingenuous.
It's disingenuous to claim a feature like targeted advertising is also "weaponized". There's always been targeted propaganda; that doesn't make it a weapon. Everyone is capable of self-deception and of making their own decisions. If you argue against that, then we probably should start debating whether or not democracy is a good idea.
The problem here is that we've gotten to a point where anyone can target any person or subset of people. They don't even need to be a state actor. If we view that as bad, then we should probably research it (not just the 2016 election, but all elections).
But you complain people are calling it "manipulating an election". They're calling it that because of who is doing it (foreign agents) and why they are doing it (to gain control over a powerful enemy state). That is what makes it election interference and it seems very purposefully blind not to acknowledge it.
I get that an entity could be more interested in uncovering facts about one party than another but in this particular case, the fact was that the Democratic Party was acting "unethically" in some people's eyes. If they hadn't been then it wouldn't have "influenced" people's opinions.
Another way to look at it: that entity didn't annotate the emails, the content of the emails was enough to anger people.
I'm not trying to fork this into a flame war; I'm genuinely interested in what other users of this site think, particularly those who think differently than I do.
How's that even a legitimate question? I can list a few reasons why it matters:
In fact I would even buy a filter bubble if it had a "Not Trump" switch.
How about when a foreign state, an enemy state at that, does it?