For some searches it might have helped, but for extremely targeted or highly technical searches it was a disaster. Almost as horrible as the bad old days when search engines tended to use "or" for search queries instead of "and". Still, it's very, very difficult to tell Google that you actually, for really really realsies, want to search for all the terms you typed in.
But there's no way they're dumb enough to think that's helpful, right? They do understand that people want fewer - but more relevant - results, instead of just having more useless ones?
Like just the other day I was searching for "ruby net::ldap <some error message>" and it decided to put, front and center, results from CPAN. Just about as irrelevant as a search result can get. The little grey highlights below the results had "ruby" listed with a strikethrough, to imply that if I had only searched without that word, I could have gotten these helpful results on my own.
I can only hope that this behavior was not a conscious decision by anyone at Google. I hope it's because they added some machine learning or something to their algorithms and this was just some emergent behavior they hadn't accounted for. Because if somebody working for a search engine company decides that irrelevant search results are a good thing, or that the number of results is by any means a useful metric, I'm scared for the future of Google.
Try searching for "ruby net::ldap search", without the quotes. For me, a CPAN page is the second result. (This is even logged out, in a Chrome incognito window, so no cookies are sent to Google.) That page doesn't have "ruby" or anything related to "ruby" on it, at all.
But even if I were searching for something esoteric like an error message, it's much more helpful to give me no results than to give me irrelevant ones. Irrelevant results just waste my time clicking and skimming through them, whereas an empty result set signals to me very clearly that what I'm searching for doesn't make sense.
Net::LDAP is a Perl module too, and one of the things it does is search, so 3/4 of the words aren't just matched literally - they're the right kind of net (not fishing or stockings), the right kind of LDAP, the right kind of search.
You claim there is nothing related to ruby on that page, yet how many times must people have linked to both those languages from a webpage, perhaps alongside other related terms like python?
It's not perfect, and it's not magic, but is that really the level we expect from Google - so much so that we're in a huff if we have to scroll past one easily discounted search result?
They had to put a lot of effort into a search system that can ignore words that you're searching for. I'm wondering why that's necessary when I can clearly get better search results when all of my search terms are included.
I wonder if it would help if they added some heuristic where, especially if it looks like a technical topic, it would try to find the word that was most domain-defining (in your example: "ruby") and never drop that one. I wonder how difficult that would be to solve. Even with a lot of false positives, the worst that you'd get is the results quality of not dropping any of the terms.
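One cheap version of that heuristic would be to rank the query terms by inverse document frequency and refuse to drop the rarest ones. A minimal sketch, where the corpus frequencies and the threshold are made-up illustrative numbers standing in for real index statistics:

```python
import math

# Assumed document frequencies (fraction of indexed pages containing the term).
# These values are invented for illustration; a real engine would pull them
# from its index.
doc_freq = {
    "ruby": 0.004,        # rare, likely domain-defining
    "net::ldap": 0.00001, # very rare
    "search": 0.30,       # common, safe to relax
}

def must_keep_terms(query_terms, doc_freq, idf_threshold=3.0):
    """Return the terms rare enough (high IDF) that they should never be dropped."""
    keep = []
    for term in query_terms:
        df = doc_freq.get(term.lower(), 0.5)  # unknown terms treated as common
        idf = -math.log(df)                   # rarer term => higher IDF
        if idf >= idf_threshold:
            keep.append(term)
    return keep

print(must_keep_terms(["ruby", "net::ldap", "search"], doc_freq))
# → ['ruby', 'net::ldap']
```

With a setup like this, "search" is the only term the engine would be allowed to drop from the example query, which matches the intuition that "ruby" is the domain-defining word.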
Yes, if I search for "Windows 7 NAT", dropping the NAT keyword returns more results. But none of them will give me the information I'm looking for.
This is especially true when what I'm searching for yields almost no results. That's how I typically know that what I'm searching for was probably bogus to begin with. But when the first page is full of a bunch of fuzzed-out results that are only tangentially related to what I searched for, I now have to skim through a page full of results to realize that Google just gave up and decided to omit several of my terms.
And they weren't at the bottom, they were actually the first result on the list.
> They do understand that people want less - but more relevant - results, instead of just having more useless ones
I would argue that since they have access to their data, and they don't make changes without data supporting them, they would know better what people want. You and I may very well just not be in their main demographic.
Really? You would think it would be the perfect opportunity for competitors to advertise. Here's the user frustrated with their existing vendor because their computer is emitting some kind of gibberish instead of doing what they paid for it to do, and here you are offering a different product that presumably makes the error go away.
If the majority of people don't know how to use a search engine correctly, maybe we should focus on teaching them how to use one, instead of dumbing down the search engine so it gives only "OK" results for stupid queries and is nearly useless for people who know what they really want. I know at least the "introduction to the Internet" courses used to teach how to use search engines effectively, i.e. use precise search queries, +/-/boolean operators, etc.; I don't know if they do anymore.
I hope this isn't the case, but to me it almost feels like they're deliberately making it more difficult to find exact information on what you're searching for. That makes it harder for people to get the information they need to make informed decisions, so they can be more easily persuaded to think what companies like Google want them to think. For a lot of people, Google is how they view the Internet - in some ways, Google is the Internet for them. Google has immense control over what and how people can find things on the Internet, and that is what I find truly worrying.
The plus operator was just one problem. Soon after, they started fuzzing words in double quotes, and at some point they even fuzzed verbatim searches. (Luckily it now works to some degree again.)
I have no idea about the (development of the) quality of the search query + results. I have no idea how to measure that (over time).
I suppose I could write a personalized search engine that runs queries I care about regularly on different engines, aggregate the results, write some code to analyze, monitor, and visualize all that, and start running my own valuing algorithms to determine the quality and depth of Internet searches. Space and computation power are cheap nowadays, so for certain well-specified domains, this might be feasible?
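A minimal sketch of what that monitor could look like. The fetch function here is stubbed with canned results so the example is self-contained; any real version would need each engine's API (or scraping, terms of service permitting), and the engine names and URLs below are just illustrative:

```python
from collections import Counter

def fetch_results(engine, query):
    """Stub: return an ordered list of result URLs for (engine, query).

    A real implementation would query the engine; these canned results
    are invented for illustration.
    """
    canned = {
        ("google", "ruby net::ldap"): ["rubygems.org/gems/net-ldap",
                                       "metacpan.org/pod/Net::LDAP"],
        ("ddg", "ruby net::ldap"): ["rubygems.org/gems/net-ldap",
                                    "github.com/ruby-ldap/ruby-net-ldap"],
    }
    return canned.get((engine, query), [])

def overlap_score(query, engines):
    """Fraction of returned URLs that more than one engine agrees on -
    a crude cross-engine agreement metric."""
    counts = Counter(url for e in engines for url in fetch_results(e, query))
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c > 1) / len(counts)

print(round(overlap_score("ruby net::ldap", ["google", "ddg"]), 3))  # → 0.333
```

Run something like this on a fixed set of queries every week, store the scores, and you have at least a rough time series of how much the engines agree - one possible proxy for "quality" drifting over time.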
Broadly stated, double quoting permits more fuzziness than the former plus operator did. And there are times -- particularly with all the cruft in search results -- when you -- I, at least -- really don't want that fuzziness.
> > In the past, we provided users with the + operator to help you search for specific terms. However, we found that users typed the + operator in less than half a percent of all searches, and two thirds of the time, it was used incorrectly.
Check my math, but I think that means that + was used correctly in only 1 out of 600 searches.
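The arithmetic checks out, taking "less than half a percent" as 0.5% and "two thirds incorrectly" at face value:

```python
# + appears in under 0.5% of searches, and is used incorrectly
# two thirds of the time, so correct use is at most:
usage_rate = 0.005        # "less than half a percent of all searches"
correct_fraction = 1 / 3  # "two thirds of the time, it was used incorrectly"
correct_use_rate = usage_rate * correct_fraction
print(round(1 / correct_use_rate))  # → 600, i.e. roughly 1 in 600 searches
```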
People mention "power users". Google does not want power users. Google wants a mass market of people who see, and click, ads.
EDIT: I guess I need to say that I hated it when Google changed the plus operator, and I find using Google now to be a frustrating and annoying experience. I'm shown results that often are not relevant to my queries.
And Barrkel makes a good point about my confusing, potentially misleading description of how often + is used correctly.
People don't often need a power search. But when you do, you really want it to work.
Google gets ad clicks by building a brand; they want us to associate them with "competent to handle all my search needs." This is why Ford and Dodge make stock cars: not because they want to sell them on a mass market, but because they want to convince people the brand is capable of excelling beyond their needs.
UPDATE: I think you're right about Google's rationale, I just think execs are missing some counterarguments.
That was part of the draw of switching to Google way back when. If you had a more challenging search to pull off, you used Google, and eventually it became a habit to just visit Google in the first place.
Now I'm starting to try DDG when Google frustrates me. I also don't need 'Google Power' for a lot of my searches.
It's probably a good thing they're implementing these changes. It makes it easier for an underdog to come in and disrupt their core business, introducing some healthy competition.
Matt Cutts obviously isn't stupid, but that's a very stupid thing to say, or think, or act upon.
Many things that are done rarely are extremely useful or important.
Hey, most search strings probably represent "less than half a percent" of all possible searches; so by that metric they could drop most of their index, and only answer the most frequent searches...
Let's not forget how large an audience any Google feature has, though. + being used correctly, even by 0.33% of users, is still probably a million people.
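As a rough sanity check of that scale claim - the daily-searcher figure below is a made-up round number for illustration, not anything Google has published:

```python
# Assumed number of people searching Google daily (illustrative only).
daily_searchers = 300_000_000
# ~0.33% of them using + correctly (0.5% usage x 2/3 correct... inverted:
# here we just take the 0.33% figure from the comment above at face value).
correct_plus_users = daily_searchers * 0.0033
print(round(correct_plus_users))  # → 990000, i.e. about a million people
```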
Still, I am surprised that Google has made search so poor for so many people in some situations. Although I understand they've done some work on making programming code easier to search for?
Anyway, losing mindshare among power users is very dangerous in the IT business. Even when you really just care about the average Joe.
Absolutely. And Google have been stumbling hard the past 2-3 years.
But in this case, it probably just means that people were using it like "cookies+cream". They aren't using + as an operator; they're using it as a synonym for "and", and they are better served by Google parsing their query as natural language instead of parsing it as an expression in a rarely used query language.
Double quotes should translate to 'match this or nothing' - what's the point of quoting otherwise? And if the + command is now no longer used, then maybe bring back the old usage, which worked just fine.
Change for the sake of change is ridiculous, changing a well known user-interface in order to push a non-core product is slightly mad.
It also shows how bad it is to have all these services belong to one single company. Imagine Google+ being launched as Facebook+ - do you think Google would have dropped their '+' operator for that?
>Searching for mars surrounded by quotes — “mars” — generates exactly the same number, even though that number should drop.
As far as I know, that number is just an estimate, and is wildly inaccurate as a count of the actual results. It's the same reason you can have a search showing 10 pages of results at first, but after you get to page 3, you only see 4 pages of results. It just estimates until it needs a more accurate count.
I can't find the original source for this (though I didn't spend much time looking), but I found this on Stack Overflow:
>From a Google developer (Matt Cutts, head of the web spam team):
>"We try to be very clear that our results estimates are just that--estimates. In theory we could spend cycles on that aspect of our system, but in practice we have a lot of other things to work on, and more accurate results estimates is lower on the list than lots of other things"
"mars" - 228m results
+mars - 19k results
+"mars" - 197m results
Assuming the estimated result count is at all meaningful, it looks like + still does have an effect: it turns ["x"] in to meaning the same thing as [x].
So I tried a verbatim search for mars, and found that the estimated results count disappears. However, skipping along the pages got me the counterintuitive result that a verbatim search for mars finds only 188 results - I'm pretty sure that Google has indexed more mars-mentioning pages than that!
Oh, not counting the "very similar" results:
> In order to show you the most relevant results, we have omitted some entries very similar to the 188 already displayed.
Clicking this "include similar results" link gets me 45 results pages or about 450 results that mention mars found with a verbatim search.
I feel like I've gone back to 1994!
"In order to show you the most relevant results, we have omitted some entries very similar to the 260 already displayed.
If you like, you can repeat the search with the omitted results included."
even though obviously there are more results that aren't "very similar" to the 260 that have been displayed.
The assumption has always been that people don't keep clicking through tens of pages of results...they refine their search terms.
I think mars is probably not the ideal term for testing this sort of thing - just too many hits. I tried mars "philip k dick" and found the same thing - 30 or so pages, and then it refused to give me more results. Increasing the obscurity level somewhat, a vanilla search for devil's vindata vs. a verbatim search did indeed show a reduced set of results for verbatim (the vanilla search results included hits on devil's vendetta).
Overall though, if you're looking for more fringe results for common search terms, millionshort.com might be worth a try - it says you can remove hits from the top 10^n sites for your term.
Search for: +mars -mars - that is, ALWAYS mars, and NOT mars.
You might expect to always get 0 results, but that isn't the case. You get pages that never mention mars, and pages that contain the literal string +mars.
The problem is that Google clearly cares about search less and less each day. If the trend continues, we'll be able to switch some day, but not because the other sites got better.
Now I'm contributing to OpenStreetMap, because I think at least that stands a chance against google maps and might even help some other search engines. DuckDuckGo is using it and maybe Bing will at one point, at least that would explain why they provide aerial images.
>> Conducted by California-based Answers Research, the study queried an online sample of nearly 1,000 people at least 18 years old, all living in the US. None of the participants in the survey knew Microsoft was involved. Participants performed 10 searches of their own choosing, and were shown the results from both Bing and Google, side by side, with all the branding removed. Additionally, notes Microsoft, “The test did not include ads or content in other parts of the page such as Bing’s Snapshot and Social Search panes and Google’s Knowledge Graph.” For each search, participants chose which side showed better results, or could call it a tie. Of the participants surveyed, 57.4 percent chose Bing more often, 30.2 percent chose Google more often, and 12.4 percent didn’t prefer one over the other.
It was this study that inspired Microsoft's Bing it On campaign in 2012. Unfortunately, the ad campaign mostly consisted of Microsoft recreating the independent study, but cheating to make sure Bing would win even bigger.
For instance, if I search ping pong in Google right now, my first result is a currently airing Japanese TV series about ping pong (and the show's name isn't even in the same alphabet I used to search). DuckDuckGo returns the English Wikipedia article.
Also, the title of this article is so messed up...
I get Wikipedia's Table Tennis entry first (I do use Wikipedia a lot). The first page includes 5 local/state related links, also has an "In-depth articles" section at the bottom with three items. The 2nd page is good as well, I personally can't complain, although I still rue the day they zapped the + operator.
I still occasionally run a !sp or !g bang search (StartPage is a proxied Google; the other hits Google directly). And in almost all cases there's little if any discernible difference.
The main exceptions are:
1. Date-bounded searches. DDG doesn't support this.
2. Special collections. Books and Scholar in particular.
Until recently, I'd have included images, but DDG's added that. Maps can be searched through OSM.
My primary concerns are privacy and search bubbling, but quality is up there as well.
"I hear you. How to indicate to the user that we don't think there are any good matches for their query is something we debate and experiment with in search quality at Google."
Without a doubt, the '+' operator is the most important one, followed only by being able to search for a phrase surrounded by quotations.
That and the results changing as I type drive me nuts. It's quite common for me to see something I want, only to lose it on the next keystroke, which was already queued.