Ask HN: do you feel Google search result quality has gone down?
112 points by coffeemug on Oct 26, 2009 | 109 comments
Yesterday I was surprised to find myself trying Yahoo search because I couldn't get satisfactory search results in Google. It was the first time in years. I started thinking about this, and I realized that in the past few months I haven't been getting particularly good results from Google. I don't get spam or anything, but a lot of times I don't get useful results.

The thing is, I'm not sure if it's because I do a lot of very specialized stuff these days, or because the search quality really has gone down. Consider these two examples:

Search for "Linux asynchronous IO". You'll get a lot of articles, but most are four years old (which is an eternity in the Linux world). These results aren't very good - POSIX AIO is implemented in userspace threads, and io_submit and friends don't work in many cases. Which cases? Hard to tell - I couldn't find any information in the results no matter how long I searched. I couldn't find any benchmarks either.

Perhaps it's because there is no good info on this on the web (hard to believe). So let's try something else - search for "concurrent hashmap in C". After hours of searching and playing with keywords, I got almost no useful results (other than Intel's libs, but not too much info on that either). It's difficult to believe that there are no good implementations out there.

So, is it the specialized nature of my searches, or is it Google? What do you think?




I've used google since it was google.stanford.edu, and it's clear to me the results have suffered. My feeling is that two of the problems are SEO and feedback effects of google's own popularity.

SEO: When you cut through all the BS, the entire goal here is to make a less good match come first. And it works (sorta). Just consider crap sites like Experts Exchange that we've only learned about because they pollute many searches.

Feedback effect: Thanks to google, fewer people do less collecting of good links. Why bother when you can google for it? So there's less good information for google to use in ranking links. Bear in mind that when google started, nearly every home page had a long list of links to all the pages that particular user liked and frequently used. I used to have one; I've long since deleted it; my blog has some outgoing links that I like, but relatively few. If I twittered, I'd probably post a lot of outgoing links, but of dubious value; there's no gardening of just the perfect page of 100 links going on anymore.

(I think this also partially explains why some (generally more specialized, so less affected by other things) results feel dated -- legacy links that are still hanging around from days when links were still used that way.)

Feedback effect: Thanks to google, ten sites tend to be more important than any other sites on any given topic. This results in certain sites becoming increasingly important. Wikipedia is the chief example here. Why is there only one Wikipedia and not a dozen? Chiefly because it's gotten all the google juice. If you want your wiki article on foo to show up in google, you naturally write it on Wikipedia, not Fooipedia. The result here is that all google searches feel increasingly the same -- of course Wikipedia is always in the top ten, or maybe something like Stack Overflow for a technical search.

----

So, these days, if I don't see something interesting in the top ten, I often click on the link to page 10 (or 20, or 100) of the results. Often more interesting. For example, google for "mashed potatos".

Top 10 results: "Perfect mashed potatoes" (SEO), allrecipes.com (always in top 10 for any recipe search), foodnetwork.com, Wikipedia, about.com, nytimes, etc. Pictures of mashed potatoes. All generic and useless.

Page ten results: Dairy-free mashed potatoes. _Potato_ free mashed potatoes! Caramelized Onion Horseradish Red Mashed Potatoes! A poem about eating them. At least marginally more interesting and quirky. What I would have expected out of google circa 1997.


I'm not quite sure what exactly it is you would want from them. http://www.google.com/search?q=mashed+potatos has a list of recipes for the query. Seems like exactly what you would respond to the question "What do you know about mashed potatos?" If I change it to "mashed potatoes" like suggested, I get the rest of the results you mentioned. Again, this is exactly the kind of stuff you wanted.

Now, if you want something "quirky", why are you searching for a generic term? What kind of "useful" result do you want from a search on mashed potatoes? If you give them a crappy search query, they should be giving you as generic of results as possible.

One thing I've found is that if you are looking for something specific, don't search for something generic. If you wanted something "quirky", why didn't you do "mashed potatoes quirky"? Then you get a restaurant that features mashed potatoes heavily in their recipes, a caramelized onion mashed potato recipe, a mashed potatoes festival, several more "interesting" recipes, and a book called "Grinning in His Mashed Potatoes".

It sounds to me like the results have improved, not gotten worse, if you aren't getting a poem about mashed potatoes on the first page of search results for just "mashed potatoes".


The results from Experts Exchange are typically useful, but you have to scroll all the way down past the ads and other crap to see the actual answers.


Last time I checked, EE was hiding all actual answers to visitors. It doesn't really qualify as a useful link, then.


If you come to the page with a google search result referral, the answers are all at the bottom of the page. This is called first click free: http://www.google.com/support/news_pub/bin/answer.py?answer=...


They hide the answers to visitors who weren't referred by Google. As long as referrer=google.com, they'll give you the answer.


Never.


I really wish Google had the option to blacklist certain sites from the results, such as EE. Maybe they could even use the data of what people are blacklisting.


Agreed.

Until they implement it, you can use the CustomizeGoogle add-on for Firefox that lets you filter results out on your side.


A lot of non-technical people lately seem to type whole questions and sentences in the search box. Syntax analysis is hard for them, it seems. But Google now encourages this and levels everything down.


Somewhat perversely, I've specifically tested that syntax on a few occasions, and have had surprisingly good results as compared with a classic search.


So, do you think it's time to start adding a top links page and help them out?

I have started tailoring my searches in odd ways to help them out. Ex: Adding the year when I want current results. But, without useful links it's all GIGO.


I think Google knows about this exact problem. They know that the links people are sharing on Facebook and Twitter have supplanted the site-site links, and are therefore much more important to search quality. To this end, the agreement to include 'real-time' search data from Twitter is partially a misdirection, since the importance of having the data about shared links far exceeds the value of someone's 140-character blurb.

A related point: whoever first owns the data from all link aggregators (digg, reddit, mixx, etc) and all URL shorteners (bit.ly, tinyurl, ad nauseam), and weighs those results more heavily is going to have an awesome search engine... albeit better for entertainment than productivity.


whoever first owns the data from all link aggregators (digg, reddit, mixx, etc) and all URL shorteners (bit.ly, tinyurl, ad nauseam), and weighs those results more heavily is going to have an awesome search engine... albeit better for entertainment than productivity.

i will humbly disagree. i think folks who browse the web are different from those who search the web. search is what gets you the most relevant results, therefore more opportunity for ad money.


A related point: whoever first owns the data from all link aggregators (digg, reddit, mixx, etc) and all URL shorteners (bit.ly, tinyurl, ad nauseam), and weighs those results more heavily is going to have an awesome search engine... albeit better for entertainment than productivity.

Except I'd never find the obscure stuff that Google helps me find every day. I don't think a lot of links to manuals, mailing lists, etc show up on any of the sources you mention.


No, you'd keep the long-tail search results intact, but then tend to surface the good content people are actually sharing amongst themselves.

Google already does this, by surfacing youtube videos in many search queries.


SEO: When you cut through all the BS, the entire goal here is to make a less good match come first.

That is not the entire purpose of SEO. There are good sites out there that don't provide their content in a way that can be indexed by spiders. SEO often solves that. There certainly are plenty of people making bad websites and trying to make them rank; it's Google's job to weed out the useless information.

People still collect links (shameless self promotion: http://internetmindmap.com ), they still have huge blogrolls, there are human powered search engines, a vast amount of directories for every imaginable niche...

Google doesn't need the perfect page of 100 links and I doubt it ever did.

Your mashed potatoes example does not make sense. Google gave you generic info for your generic search query. How is that bad?

Now the fact that certain sites dominate a very wide range of search queries, is an interesting point. Personally, I would just add a sidebar or something similar, to be occupied by the "staple" sites, such as wikipedia, about.com etc.


Yes.

One thing I noticed is that searches no longer require that all words in the query be present in the search results. Adding a + before a word is now required to ensure that it's present in results. That frequently results in me having to do 2-3 searches to find something that could previously be found with one.


Another annoying snag is that if you search for "A B", google will also search for "AB" (eliminating the space). This affects a lot of searches with acronyms and technical terms. For example, if you're looking for info on MIT's RAs, the top search results for "ra mit" or "mit ra" are related to "ramit" or "mitra".

This seems to be an optimization for their average user, but is really inconvenient for people searching for system errors, mathematical/cs theory terms, or other queries where acronyms are common.
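
A rough sketch of the kind of space-eliminating expansion described above. This is only a guess at the observable behaviour, not Google's actual algorithm, and the function name `joined_variants` is made up for illustration.

```python
def joined_variants(query):
    """Return the query plus versions with each pair of adjacent words joined.

    Illustrative only: it mimics the "A B" -> "AB" behaviour described in
    the comment above, under the assumption that only neighbours are merged.
    """
    words = query.split()
    variants = [query]
    for i in range(len(words) - 1):
        # Merge word i with word i + 1 and keep everything else as typed.
        merged = words[:i] + [words[i] + words[i + 1]] + words[i + 2:]
        variants.append(" ".join(merged))
    return variants


print(joined_variants("ra mit"))  # ['ra mit', 'ramit']
print(joined_variants("mit ra"))  # ['mit ra', 'mitra']
```

If the joined form is a far more common token than the acronym pair, it is easy to see how it could crowd out the intended results.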


I have noticed this too and find it very irritating. I expect ALL words to be present. Why did they change this?


I have no idea why they changed this, I noticed it recently and I can't figure out why.

User testing must have somehow played a role, I hate to blame non-technical users but... still I can't believe anyone gets better results when the keywords are optional.

The odd thing is I remember switching from AltaVista to google, before google you always had to discount the first bunch of results, but google was just so amazingly accurate. And yet now I find myself skipping the top results in a google search.

I remember when google bombing first started, it didn't bother me much, but then google tried to counter it and I could swear searches got a bit worse. And recently they've gotten even worse.

It's a shame this seems to be destiny of all truly great things.


Even with the +, google makes some interesting "interpretations". I notice when I search on one of the BSDs (e.g. OpenBSD) it seems to pick pages that just have BSD on it.

Plus, google's handling of punctuation (e.g. f-script) is a pain since (even with the +) it will do weird substitutions and consider blanks good enough.


Since I started using more advanced search features regularly I have gotten significantly better results. Things like +"my search term", -free, -download, -cracked tend to heavily limit the spamming of results, and if I want something specific using tricks like inurl:keyword and site:siteToSearch tend to make what I want just jump out on the first page.
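
For anyone who wants to script this, here is a minimal sketch that assembles a query from the operators named above (`+`, `-`, `inurl:`, `site:`) and URL-encodes it. The helper names `build_query` and `search_url` are hypothetical; the operators are used as this thread describes them, and the URL base is the one quoted earlier in the discussion.

```python
from urllib.parse import urlencode


def build_query(required=(), excluded=(), phrase=None, site=None, inurl=None):
    """Assemble a query string from the operators mentioned in the thread."""
    parts = []
    if phrase:
        # +"..." asks for the exact phrase to be present.
        parts.append('+"%s"' % phrase)
    parts.extend("+" + word for word in required)
    parts.extend("-" + word for word in excluded)
    if inurl:
        parts.append("inurl:" + inurl)
    if site:
        parts.append("site:" + site)
    return " ".join(parts)


def search_url(query):
    """Percent-encode the query into a search URL."""
    return "http://www.google.com/search?" + urlencode({"q": query})


q = build_query(phrase="my search term", excluded=["free", "download", "cracked"])
print(q)              # +"my search term" -free -download -cracked
print(search_url(q))
```

Note that the leading `+` has to be percent-encoded (`%2B`) in the URL, since a bare `+` in a query string means a space.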


Also the - minus comes in handy to exclude similar terms or phrases. Yeah, this feels just like Altavista.


Thanks for the tip of the + !!


Yep. I never used to use search operators unless I was looking for something really specific. Nowadays I get completely irrelevant results and I am forced to quote strings and explicitly specify term precedence, conditionals and other regexy stuff.

Last night I was searching for the syntax of DEFTYPE when used with various types (i.e. MEMBER, SATISFIES, OR, etc.) and the #1 hit for "deftype member" was the personal MySpace page of some guy.

I think they're optimizing for "social" results now.


They'd probably do well to have a separate 'technical search' for searching for things related to technical matters, eg programming languages, physics, chemistry, medicine, engineering, etc. And remove the casual stuff (facebook, myspace, pages that are clearly not about a technical subject, etc) from that index.

It would probably be a highly praised feature to separate off a second index like that, as specifically searching for programming language concepts and documentation can be difficult (the C# and .NET problem).


Having separate indexes and separate searches for sub-domains of knowledge is an interesting idea. The original idea of PageRank was that each link constituted a 'vote' for a page. Perhaps different epistemic communities on the internet use links to mean different things. So, the meaning of links on technical webpages is slightly different from the meaning of links on social websites. Interesting idea to play around with.
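
To make the "each link is a vote" idea concrete, here is a minimal power-iteration sketch of PageRank on a toy link graph. It is a textbook simplification, not Google's production ranking; the damping factor of 0.85 and the page names are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: every outgoing link passes on a share of rank.

    `links` maps a page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the same small baseline.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


toy_graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Page `c` collects two votes and ends up on top, while `b`, which nobody links to, stays at the baseline. Running the same function on separate link graphs (say, one built only from technical sites) would be one way to play with the per-community idea.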


It's also aggravating that quotes don't allow matching of specific symbols within the quotes.

And, yes, the results are different. I'd agree with you they seem enhanced (eg., classified results) and fresher for popular culture, but somewhat worse for domain specific queries.


Which is dangerous, since it seems like the biggest threat to Google has always been the 'vertical' search market. If they don't have a better way to narrow search results to a specific domain, they're going to run into issues from competitors.


Wait until they integrate twitter. It will be 99% real-time crap and 1% historical results... Google should just stay away from real time frenzy, or maybe fork the search engine to avoid disrupting the good results it has.


Can't agree more. Let's hope in the future we can choose to search for tech-specific results.


There's evidently a market for a tech-specific search engine..


Just out of interest, do you find lispdoc.com useful generally for this kind of thing?


Yes, and this is why I started Duck Duck Go: http://duckduckgo.com/. Thanks for posting specific cases--they help me immensely. Anyone else have more?


I really have to applaud your efforts on that front. I've been using duckduckgo on and off for some time now and I must say I'm mighty impressed. The best part is the keyboard style navigation and the big clean look. Always retain those two. :)


I've used Duck Duck Go a bit and the results are okay. Maybe just a shorter URL would be awesome (such as ddg.com, ddgo.com, etc.).

Here is a search that I have had problems with :

octave "--eval"

Your site does pretty well with this (fourth link is somewhat relevant).


I was searching for a way to enable/find the chat logs for Microsoft Communicator which we've recently switched to at work.

Google was basically filled with dead end forum postings and SEO spam.

DuckDuckGo was more helpful and brought me to the MS TechNet article with full documentation on MS Communicator Policy configuration.

Bing was surprisingly the most helpful and brought me to the Communicator Team posting from 2008 which shows me where I should have been able to find the logs.

It looks like my work has blocked/disabled this feature on a global setting even though I have it enabled locally.


There is one change that gets me often. Using a hyphenated-word used to require that the two words occur in order, although it would also match the two words joined together. It also used to turn off stemming.

Previously it was equivalent to ("hyphenated word" OR +hyphenatedword). But now it seems to behave almost the same as the unquoted (hyphenated word).

Just to make matters worse, when I tried out my example just now I found that the first result (a wikipedia page) for "hyphenated word" doesn't even include the phrase!
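
The old hyphen behaviour described above can be sketched as a query rewrite. This only illustrates the equivalence stated in this comment; it is not how Google implemented it, and `expand_hyphens` is a made-up name.

```python
def expand_hyphens(query):
    """Rewrite each hyphenated term as ("a b" OR +ab), per the description above."""
    terms = []
    for term in query.split():
        # Ignore leading/trailing hyphens so an exclusion like -free is left alone.
        if "-" in term.strip("-"):
            words = [word for word in term.split("-") if word]
            phrase = " ".join(words)
            joined = "".join(words)
            terms.append('("%s" OR +%s)' % (phrase, joined))
        else:
            terms.append(term)
    return " ".join(terms)


print(expand_hyphens("hyphenated-word"))  # ("hyphenated word" OR +hyphenatedword)
```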


I hate searching google for technical jargon (and it's worse the more technical the jargon is). Usually google just gives you a bunch of academic papers (in case you are wondering academic papers do not actually explain anything, they just use a bunch of jargon in a plausible way). Gee thanks. If you want I will think up an example.


One feature I use a lot from google is the cache, just to avoid websense at work.


Wow, that's amazing. I think I'll be using this a little more often.

Notably, you return a result for a specific osCommerce error message that I wrote about a while back (!); Google doesn't even know I exist.


I've been using duckduckgo; very nice. One UI comment: when I click on "More Links" I'd like some kind of cue -- say, a horizontal line -- so I know where to start looking when the results come back. I frequently click for "More" then while that's loading I go look at another tab. When I return to Ducky it's sometimes difficult to separate the new results from ones I've looked at (but not clicked on) before.

You might try to track page age; some of the results I get are from 5 years ago and as noted by others, aren't always useful today. But that's a harder problem for another time.


It has certainly gone waaaay south since day 1, that much is undeniable. Of course, a large part of that is that the internet has gotten a lot more useless filler in those years, and this has of course made 'relevant search' an astronomically harder problem than it was in 1998.

On the more important metric of 'has quality gotten worse in the last couple of years', I would say 'sort of'. The direct quality of results HAS suffered, but on the other hand, google have implemented the user-wiki thing that allows you to modify, to a degree, which sites are less relevant.

I will add that I think google needs to rethink its keyword fuzziness. In the past it used to be acceptable if the results didn't exactly match what you were searching for, but these days that is becoming more of a problem. If I search for a bunch of words, I typically know I want those words. By all means suggest 'did you mean ... ?' but the fuzziness in the results needs to be pulled back.


Back to the Altavista days of +search +must +include +plus +signs +"and quotation marks"

Seriously, the problem is advanced users get unexpected results with the query expansion and refinement layers they've added on. While dropping obscure words is helpful when grandpa has malformed queries, it's maddening for a technical user looking for a very specific, infrequent keyword. However for grandpa, it's probably a better experience for a generalized result.

Using +'s works well enough, but it's disappointing we have to use a less efficient method of querying now.


More generally, I think that we don't have a good mental model of how Google is searching. It used to be straightforward, almost like Git, which won't make any "clever" merges. Google, as a tool, is on the decline because it's not comprehensible (even to us advanced users!).


Quote from Google employee:

It does work as described! - which may very well not be as desired. If you want a more mathematical-logical use of operators then you need to go find another search engine.

http://www.google.com/support/forum/p/Web+Search/thread?tid=...

I am afraid that Google is now more of a 'social search engine' than a 'hacker search engine'.


I think the results quality on Google has been the same or better over the last few months; one thing we have been looking at is helping less savvy users who might mistype a word or type extra words that they don't really need in their query. That can be a little more annoying for power users, but on the other hand the power users pick up tricks like "Use a '+' in front of a word to require Google to match that word."

Regarding the query [Linux asynchronous IO] returning older results, here's a tip. Above the search results click the "Show options" link to open up what we call "toolbelt" mode. From there, you can click to show only results from (say) the last year, or in a certain date range.

Toolbelt mode is really handy, e.g. if you search for a product, you can click "Show options" and then click the "Fewer shopping sites" link to get more reviews and manufacturer pages instead of comparison shopping sites.


Did you try http://www.google.com/linux

?

Although I alternate search engines regularly, I do think the way Google indexes for specialized searches is pretty smart.


Is there a list somewhere of all the specialized searches?



Oh, that's a shame! I was looking forward to seeing a big page of specialized searches. Oh well.

Thanks for the link!


There was a comment on LessWrong where a poster suggested using Yahoo over Google for non-quoted search. I tried it myself (I needed to find an article based on several words mentioned in it) -- and was pleasantly surprised. Yahoo seems to be doing a much better job than Google for non-quoted search requests.


We had the exact same discussion on reddit:

http://www.reddit.com/r/programming/comments/9s6p2/google_is...

It is starting to get incredibly difficult to do specific searches (such as for documentation). It also seems that SEOs have become incredibly successful in gaming Google.

It would actually be good if there were some competition.


I tried Bing the other day out of curiosity, and the results were incredibly disappointing compared to Google's.


Did you try their image search or product search? I've almost completely dropped those two parts of Google due to Bing.


I definitely agree with the image search. It's lightyears ahead of Google.


I thought so too, Bing seemed very poor for the technical related searches I made.


I've actually noticed the inverse problem with some non-tech-related searches. Politics is a fine example: try and find an article about a 2002 House vote on Social Security and there's no chance you'll find it; the results are all present day.

Strange that with regard to current events the web seems to have little historical memory, and yet with regard to current technology it has too much.


Google News has an archive search that is useful for that type of query.

http://news.google.com/archivesearch?q=2002+house+vote+socia...


The autocorrect is maddening. I've been researching Riak recently. It's new so there isn't a lot available. Paired with another search term I frequently get results only for "risk." Let me know if I might have made a typo, but don't assume I'm an idiot and do something different than what I told you to do.

Dropping keywords also annoys me. If the keywords don't exist then tell me that so I can adjust my search. Don't give me a long list of results that I have to click through before realizing you screwed up the search.

The only reason the google search bar is still my default is because I use it as a quick and easy calculator.
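
The "let me know, but don't assume" behaviour asked for above could look roughly like this: the query is searched exactly as typed, and a suggestion is returned separately. This is a sketch using Python's standard `difflib`; the vocabulary list, the 0.7 cutoff, and the function name are assumptions for illustration.

```python
import difflib


def suggest_without_rewriting(query, vocabulary, cutoff=0.7):
    """Return the untouched query plus optional 'did you mean' suggestions."""
    suggestions = {}
    for term in query.split():
        lowered = term.lower()
        if lowered in vocabulary:
            continue  # Known word, nothing to suggest.
        close = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
        if close:
            suggestions[term] = close[0]
    # The query itself is never modified; suggestions ride alongside it.
    return query, suggestions


known_words = ["risk", "database", "erlang"]
query, hints = suggest_without_rewriting("Riak erlang", known_words)
print(query)  # Riak erlang
print(hints)  # {'Riak': 'risk'}
```

The design point is simply that the caller decides what to do with `hints`, for example showing a "did you mean" line, rather than the engine silently swapping the term.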


If you're using a Mac you might find it interesting to know that the spotlight search bar can be used as a calculator almost exactly the same way as Google can... Unfortunately it can't do conversions (at least not on Leopard). Also, the shortcut for spotlight is CMD + Space.


Absolutely-- especially when you drift out of the world of technology. Google is based on the "linkerati" (hat tip to SEOmoz) - geeks and bloggers who link to stuff aggressively. That works great in the worlds where people link (social media) but poorly if you're searching for non-geeky stuff.

I've been doing home remodeling a bit lately, and it's clear to me that there are NO home remodeling linkerati. But there are plenty of SEO guys out there and it only takes a few low quality links to top a lot of searches. So search for home remodeling stuff and you see plenty of adsense spam.


It's more that the internet has gotten worse, I think.

They should drop yahoo answers, any site with an affiliate link, and any of the internet marketer sites ( like ezinearticles.com ) from their index


that's funny, I was about to suggest dropping Yahoo answers too.

My friend had an idea to make search rankings go by the number of ads on a page: the fewer ads, the higher the ranking. I guess Google would never do that because of their adsense program.


See also the discussion here: http://news.ycombinator.com/item?id=803201.


Not overall, but I have found Google frustrating for things that are very recent. For instance I wanted to watch Obama's speech to the schoolkids on YouTube (the day after he'd given it), but the only videos Google would bring back were from his race speech in Philadelphia. I tried all the operators and keywords I could think of, and couldn't get what I wanted.

FWIW, Bing nailed it on the first query.


It's eye-opening to try "Blind Search", which submits your query to Google, Yahoo!, and Bing simultaneously, and displays to you the three sets of results without (at first) identifying which is which:

http://blindsearch.fejus.com/

I was amazed to discover that I was consistently choosing the blind results from Yahoo! as best for my own searches.
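
The blind comparison itself is easy to sketch: shuffle the engines, show the result columns without names, and only reveal the labels after voting. This is not the code behind the site linked above; the function name and the sample results are made up.

```python
import random


def blind_columns(results_by_engine, rng=random):
    """Return result columns in random order, plus the hidden answer key."""
    labels = list(results_by_engine)
    rng.shuffle(labels)
    columns = [results_by_engine[name] for name in labels]
    # `labels` is the answer key: labels[i] produced columns[i].
    return columns, labels


sample = {
    "Google": ["g-result-1", "g-result-2"],
    "Yahoo!": ["y-result-1", "y-result-2"],
    "Bing": ["b-result-1", "b-result-2"],
}
columns, answer_key = blind_columns(sample, rng=random.Random(0))
for number, column in enumerate(columns, start=1):
    print("Column", number, column)
# Reveal `answer_key` only after the user has picked a column.
```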


First two attempts I picked Yahoo and Bing, heh.


I think you picked two bad terms. "Linux asynchronous IO" has meaning more than just "IO under Linux that's not blocking". That's what most pages using those terms are referring to, so it makes sense that that's what Google would give you.

"Hashmap", the term you searched for, is a term popularized by the Java world; I think most C programmers still call them "hashtables". (Hash map is a better term, as many excellent map implementations aren't actually based on hashes; see "Judy" for example.) A search for "C concurrent hashtable" gave me a lot of useful results. (You are also suffering from C's lack of a coherent community here. Lots of people write this sort of thing, but few think to share it. Hence, not many search results.)


Hi, My name is Steve Baker and I'm a Software Engineer in our search quality team here at Google. We take these types of complaints seriously and use them to debug the problem/brainstorm solutions.

For those of you who have been forced to use + or quotes to get google to return what you want, we would love if you would post the specific example queries on this thread. (Some people did, but a few of the complaints were vague.) Also, in general you can click the 'Dissatisfied? Help us improve' link in the footer of google search results pages to report problems. We check those out, but if you wanted to include the word "hackernews" in the info you submit, that would let us see all your reports easily.


I think the overall signal-to-noise ratio of the Internet has simply taken a big dive since the advent of blogging. I'm sick of seeing the mindless duplication of content in search results from a re-blogger who copy & pastes a summary and link. It's frustrating because there's no easy way to spot it until you've actually clicked the link. Once you see it enough times you can spot the duplication on the Google search summary and skip it. The risk is you might miss the 1% of people who can be bothered to spend a few minutes to add some useful information.

One thing I have noticed is YouTube's search is dreadful. It's embarrassing for a company like Google to not offer an adequate search engine on their own site. The grace period of the YouTube purchase is long over. I shouldn't have to retype my search 5 different ways to find what I'm looking for. These days I've just given up and search for simple terms and look through 10 pages of results or try related videos until I find what I'm looking for. Not super impressed with Gmail's search functions either. My mailbox is mirrored on Gmail and my local mail client (Apple Mail). Mail will almost always find what I want when Gmail gets confused. Apple isn't even a search company. Google can do a lot better.


It's google's success undermining the assumptions behind pagerank (!) leading it to perform less well.

Before google got so effective people put more work into maintaining collections of links to other sites; hand-curated omnibus directories like yahoo and dmoz or narrower fields.

As search became better this was increasingly a poor use of time; you could, after all, just go and google stuff to find it again, so why bother maintaining a hand-curated set of favorite links.

A consequence of this is that link-actions-taken-for-reasons-consistent-with-pagerank's-assumptions dropped, reducing the signal-to-noise ratio insofar as pagerank is concerned; similarly link-actions-taken-for-reasons-contrary-to-pagerank's-assumptions increased (SEO), further dropping the signal-to-noise ratio.

Even extremely robust SEO-detection techniques can only cut away the increase in noise; now that people are less incentivized to publicly post high-quality hand-curated link collections there's less for the algorithm to go on.

This is why there's talk of twitter and fb vis-a-vis search; twitter and fb are where the info the search engines need currently lives.

(!) I know they use additional techniques these days but whatever they're throwing into the mix doesn't seem to be counteracting the overall utility decay.


I wish there were a search engine that was not optimized for the mass-market, but weighs technical oriented stuff a bit higher. So if I search iPhone view, I do not get results like "View all iphone features" but something related to the iphone uiview.

I think in future search engines should lean a particular way depending on your profession - "Google for Medical Personnel", "Google for Architects", "Google for Programmers".


In my opinion, information can never be retrieved from a single source. It is always pieced together from different sources by the person with a task at hand. Google is a tool which can let you jump to different pages quickly and let you develop insights into the data they offer and put together your own judgment. Most search engines used to rely on token matching semantics to retrieve results. This has widely changed in the last few years. The golden hammer today is user-data. Google receives millions of clicks a second. With this, it tunes its indexes and cached results to provide "better results". So if many people think Wikipedia is their best answer for "mashed potatoes", you probably also do.

I guess a single keyword search is not going to get anything sensible out. The answer lies in being able to interact with such systems and get what you want. Solving the search/retrieval with a single query is a very hard problem as it depends on the context of the user. Try searching for the word "Execute". Does it show pictures of a person being hanged or a how-to for the first hello world program in a terminal?


tech has always been hard for me, even with plusses and quotes. searches like "uboot arm cross-compile error xyz" always turned up miles of mailing lists that never amounted to anything.

on a similar note - http://guillaume-nargeot.blogspot.com/2009/08/think-twice-be... (probably saw this on HN)


There seems to be a consensus that SEO is distorting Google search. I think I can refine that a bit. It appears that Google is using a "don't feed the trolls" strategy in response to link spam. Here's my evidence:

"The Modbookish" is an inconsequential six member Ning social network, but doing a Google search on "Modbookish" is interesting. There are the annoying 'Google thinks you are stupid' hits like the site in which there is a sentence ending with "mod" followed by a sentence beginning with "Bookish". The remaining results are sites that link to the Modbookish. However, the Modbookish itself is NOT in the results. As Google ads are on the site, there is no doubt that the Google bot visited it shortly after it went online. Nonetheless, Google has refused to put the site into its web index for a couple months now. Clearly, (and not without cause), it is assuming that the pages that link to the site are link spam. It is probably waiting for some semi-authoritative link to appear before it indexes it.

So, why is this bad? It undermines the purpose of Google search: to provide the most relevant results. When Google got started, I recall that when Sergey or Larry were asked why Google search was better they often offered the Harvard example. If one enters the search term "Harvard" with no other terms into a web search box, there is really only one reasonable search result: the Harvard University website. However, in the search engines of the time, the university often ended up on page five or six. On Google, it was search result number one. By the same token, if you enter "Modbookish" (no spaces) there is only one reasonable result, and Google doesn't offer it. In contrast, both Duck-Duck-Go and Yahoo list it in the first spot.

Why is this really bad? Many (Most?) websites do not do their own site indexing; they let Google do it for them. Clearly if Google refuses to index a certain important but only occasionally referenced page, that is a problem.


Can't your problem be solved by clicking the 'Web options' link on google's search results page and selecting 'past year'?

http://www.google.com/search?q=linux%20asynchronous%20io&...


I think it's hard to get good results with a generic search page. As other people commented, if it appears in Wikipedia, MySpace, or Facebook, those results will be first almost every time. I built a page in javascript (http://www.bygsearch.com/) to aggregate & combine the results from Yahoo, Bing and Google, which is sometimes better because of the quirks in each search engine. I was going to add some specific search verticals like /programming or /travel and combine results from different sources. For programming, maybe use github and google code and combine them with certain sites like stackoverflow, while specifically blocking any result from experts exchange or a list of other crappy forums.
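
A rough sketch of the combine-and-block idea, in Python rather than the javascript the page itself uses. This is not the bygsearch code; the round-robin merge, the blocklist entry and every example URL are assumptions for illustration, and the real result lists would come from each engine's API.

```python
from itertools import zip_longest
from urllib.parse import urlparse

BLOCKED_DOMAINS = {"experts-exchange.com"}  # assumption: illustrative blocklist


def is_blocked(url: str) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)


def merge_results(*ranked_lists: list[str]) -> list[str]:
    """Round-robin interleave, dropping duplicates and blocked domains."""
    merged, seen = [], set()
    for tier in zip_longest(*ranked_lists):  # tier = the n-th hit of each engine
        for url in tier:
            if url is None or url in seen or is_blocked(url):
                continue
            seen.add(url)
            merged.append(url)
    return merged


# Hypothetical per-engine results for one query, best hit first.
google = ["http://stackoverflow.com/q/1", "http://www.experts-exchange.com/a"]
yahoo = ["http://github.com/x/y", "http://stackoverflow.com/q/1"]
bing = ["http://code.google.com/p/z"]
combined = merge_results(google, yahoo, bing)
```

Interleaving by rank keeps each engine's top hit near the top; a vertical like /programming could simply pass in a different set of sources and a longer blocklist.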


No opinion overall, but here are a couple tips:

'Linux async io' -> select 'show options' and then 1-year view on the LHS.

Sadly for the concurrent hashmap I got to 'concurrent hashmap "c" -java', then saw this very discussion page at #3 and I'm in an infinite loop until further notice.
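
For anyone who wants to script this kind of query tweaking, a small Python sketch. The helper names `build_query` and `search_url` are made up; only the quote and minus operators and the `search?q=` URL shape come from this thread.

```python
from urllib.parse import quote


def build_query(terms, exact=(), exclude=()):
    """Plain terms, then "quoted" exact phrases, then -excluded words."""
    parts = list(terms)
    parts += ['"%s"' % phrase for phrase in exact]
    parts += ["-" + word for word in exclude]
    return " ".join(parts)


def search_url(query, base="http://www.google.com/search"):
    """Percent-encode the query and append it as the q parameter."""
    return base + "?q=" + quote(query)


query = build_query(["concurrent", "hashmap"], exact=["c"], exclude=["java"])
url = search_url(query)
```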


Also, go to stackoverflow and ask about the concurrent hashmap. Someone will be very happy to help you out. After that, you'll find it on Google :)


Yes. It's changed a lot recently, to the point that I made a whingey blog post about it:

http://www.cederman.com/2009/09/google-continues-to-disappoi...


Hey Tim, your blog post mentioned the [recency microsoft word misspelled] query. The way I normally check if a word is valid is just do a query for the word, e.g. [recency]. In the top right we show the phrase "Results 1 - 10 of about 2,910,000 for recency" and "recency" will be a hyperlink to a dictionary definition if we recognize that as a valid dictionary word. Looking at the number of results is a good signal too.

Your "Red Room" query is hard in a couple ways. First, it looks like that root page used to have the words on the page: "The Red Room Doors open 6pm $18 Pre-Booked" And it's also tough because it looks like the name changed to the "2nd Degree Bar & Grill" at some point. The fact that you can type [red room] and get a suggestion for [red room st. lucia] is actually pretty helpful in my book because it leads you to the answer that the name changed.


Hey Matt -- thanks for investigating and replying, much appreciated.

I actually did a search on "recency" first, and Google didn't correct me and the first hits were a couple of dictionary entries, which is why I searched for the longer query. I don't know why, but I guess I figured there might be a blog post perhaps complaining about Word lacking common dictionary words and marking them as misspelled. In hindsight, including "recency" in the query was silly, but I'd prefer to get nothing back instead of Google assuming I want something else, which is the issue I was trying to highlight.

I agree that the suggestion of "red room st lucia" is a good one, and it helped me find what I was looking for! However the problem I was trying to show was Google's new approach at suggesting a first hit without search terms you've entered. As another example, I was trying to find my old (and embarrassing!) Geocities page recently, so I searched for [tim cederman geocities] and the first hit was my own blog.


Simple question - simple answer: quality of results is degraded for me (both technical and educational - I am a work in progress home handyman).

In my experience Google has not kept up with the assault of companies trying to beat its algorithms!


Yes. I blame SEO :-(.


http://www.google.com/support/forum/p/Webmasters/thread?tid=...

Let me guess:

1. Google's Pagerank patent is about to expire, and they have to make something new?

2. A new ranking algorithm adjustment, meant to compete with Bing, that failed?

And similar discussions on reddit:

http://www.reddit.com/r/programming/comments/9qro1/

http://www.reddit.com/r/programming/comments/9s6p2/


The pagerank patent isn't really relevant anymore as the search algorithm has so many more components and variants now. Pagerank as it is defined is a much smaller signal than people think.


I agree with the points made by others here, about the quality of links being degraded in recent times, lots of SEO in lots of websites that are meant to just play with pagerank, and a general crowding of results with old and irrelevant stuff.

I find myself alternating between google and bing these days; for some technical searches bing sometimes works better (especially - and unsurprisingly - for Microsoft-related technologies).

Some competition from bing may lead us to a better google (just like competition from apple made microsoft's new OS better).


I have always had the habit of quickly editing the query if the top 5 results aren't what I was looking for, up to the point where I don't even really think about it, just adding some more keywords or sprinkling a couple of pluses, minuses or quotes. So I can't really tell if it has deteriorated in the past years.

However, I do think it is possible that the pagerank algorithm does not scale well enough for the growth in Web pages, especially with the "SEOs" out there always trying to come up with new ways to game the system.


In my experience, Yahoo really excels in image search and whenever there are structural changes made to websites (for some reason, Google doesn't really handle moved websites / pages well - there was even an article about that a few months back). But overall, there is around 90% overlap between the major search engines as demonstrated by multiple studies.

That being said, I do feel often like Google's search result quality is decreasing. Bias due to using it exclusively perhaps?


This may be true when you are searching for blog posts or articles, but when searching for code examples, I find that Google usually gives the best results. I have also noticed that these results have been getting more accurate over time. This may be due either to the fact that the number of "code blogs" has increased, or to Google's search tech getting better. Personally I'm inclined to believe the former.


Originally, a Google search for some Linux-related terminology would lead to an LKML post or white paper where that technology was explained. Now, most of my searches lead to dozens of user-oriented self-help forums where the questions are never answered. Finding information on actual development now requires knowing which sites/lists to search (like LKML, alsa-devel, kernel subsystem maintainer sites, etc.).


Occurred to me several times.

And it's duckduckgo.com that helped me.


Hate to disagree with everyone here but I haven't noticed much of a change. For me, Google is still the best search out there by a long shot.

My behavior has changed slightly though. Increasingly I'm using twitter, reddit, delicious, & hacker news to search and browse for content. Google site search for Hacker news has also come in handy.


I agree with your second thought. delicious' general search and multi-tag search (http://delicious.com/tag/term1+term2+term3) is so much more useful because it's essentially edited by humans.


So, was Yahoo - or any other search engine - any better? You didn't say how good Yahoo was for your two examples.

It's possible that Google doesn't give you perfect results, merely the best of a bad lot. If so, it suggests the problem is the web, not the search of it.

And the web certainly has changed.


I feel that your Google search results are only as good as your query. Sure my results are polluted with some sites that have invested heavily in SEO, but I usually can filter those out by tweaking the query a bit more.


Unfortunately yes. Most result sites are built solely to drive search traffic.


vldtr is now a couple of months old. Google has about 22 pages in their index, Bing 8.

http://www.google.com/#hl=en&source=hp&q=site%3Avldt... http://www.bing.com/search?q=site%3Avldtr.com

As long as Google has the most up to date index there is no good alternative for me.

You mention trying Yahoo. Did it give you better results?


Where did you get that weird google.com/#hl... link from? It broke in my browser (the redirect from google.com didn't take it into account when it bounced me to google.co.uk), but I've not seen URLs like that for google.com before.


It's a normal Google url since Google started using ajax. I also get it when I go to google.co.uk and search for something.

See also here: http://news.ycombinator.com/item?id=464393


Don't use this as a metric. The site:www.example.com syntax does NOT return all the pages in a site indexed by the search engine.


Google returns 20-26 pages (no idea where the variation comes from), but no matter what I do, I never find more than 9 pages in Bing's index with any query. Sure, it's not an exact science, but the difference is large enough to indicate a quality difference.

Another example: when I search for "vldtr is now a couple of months old" (from my post up there) Google returns this thread, Bing returns nothing.

Different metric, same result.


Yes. And the evidence is that I find myself going to page 2 or 3 to find what I wanted, and sometimes to Bing!


There is actually a perverse incentive for Google to do this since they get another chance to show you more ads. I'm not saying they optimize for this case, but the incentives are there that might encourage this kind of behavior.

Edit: redundancy


I think it's also a function of people trying to game how Google's engine operates.


In lieu of having a poll, I'll say in a comment: Yes, most definitely.


Yes, I feel so. And the so-called black SEO is the cause. All those sites with keywords in hostnames spoil the results.



