That said, it's an interesting experiment, and at blekko.com we've been running one for over a year.
What we've found is that there is a huge brand bias, which is to say that if you use some search engine as your primary search engine, you tend to think that is the best one regardless of whether it's Blekko, Bing, or Google (or even DDG, although DDG is more a search utility than a search engine as it doesn't have its own web index).
But 'quality' is also a very subjective thing. If you search in highly SEO'd categories you will find that Blekko and Bing do better than Google, mostly because there is a curated input. In Blekko's case it was there from day one with its slashtags; in Bing's case they started doing outsourced curation in some topics with their 'editors' program (putatively after seeing how effective it is in Blekko :-). At some point Google will realize what Bing and Blekko have, which is that the 'indexing the web' problem became the 'filtering the web' problem when the signal to noise ratio started decreasing in about 2005, and that the only viable weapon at the moment against human on human spamming (this is where real humans are working for 5 cents a page to write web pages that draw hits) is human judgement.
If the Bing guys are reading (and I know you are), why not open up your challenge to the new kid on the block? We don't just do Blekko vs Google in our monte results :-)
Just took a look at Blekko. It looks good, but you can't honestly believe the layman cares as much about it as he does about search engines coming out of MS or Google. Does the layman (MS' target audience here) even know about Blekko?
So do I think the 'layman' cares? Yes and no. I think they care that they get good search results, I don't think they care where they come from. As pg points out building a replacement search engine is a Hard Problem(tm). You can make a competitive search engine with just 15 - 20 million dollars, but displacing the existing player is more nuanced.
This is something Bing knows all too well: they effectively created a clone of Google's search capability and they only pull in about 30% of the market. Blekko has been steadily adding share primarily by not being the same as Google; rather, we've added specific features that are valued by different segments of the market (we're still the only search engine that lets you completely turn off ads as a user preference, for example), but mostly folks just use whatever the search box provides.
 Note that both Yahoo and Bing are effectively the same search engine as Yahoo is 'powered by' Bing results. http://www.comscore.com/Press_Events/Press_Releases/2012/6/c...
You can do that with DuckDuckGo: http://duckduckgo.com/settings.html > Layout tab > Advertisements.
[I work at DDG.]
DDG is a great product, we are one of their partners and we are providing some of their search results. They are also quite popular on HN, I think in part because of the cool things they have been doing about creating interesting one-box kinds of things. But that said, they aren't a search "engine," they are a search "provider." That isn't a bad thing, in fact it's a good thing because you can get Google results through DDG without having to get Google's "value added" stuff like G+ friends feeds or being "bubbled," as Gabe is great at pointing out. But the difference is that if the search engines stopped providing results, DDG would stop being able to return them until they built their own index. As would a number of other places like search.com or dogpile.com etc. The weird thing about search is that until Blekko came along and built a third, there were really only two web scale indexes being used, Bing's and Google's. Amazing, is it not?
So I think DDG is pretty cool, love the ideas that Gabe comes up with, and am happy to partner with him to provide access to our index.
To rebut your statement that I was being disingenuous, however, I suppose it depends on how you measure or define 'well known.' From the perspective of various traffic reporting companies blekko.com is a bit more well known than duckduckgo.com, which is not a perfect measure either, but it does provide a different perspective.
Actually we crawl the Web a lot and have a bunch of our own indexes. A more apt description would be a hybrid search engine. More details at http://help.duckduckgo.com/customer/portal/articles/216399
Just to be clear, are you saying you no longer backfill results with Bing?
Also, we don't use any Google results right now.
"we do not expect to be wholly independent from third-parties."
That is the technical difference: search engines do expect to be, if not wholly, then at least materially independent from third parties. They are the third parties. Search providers are all about the experience.
But let's be really clear: that terminology distinction really only matters when you get behind the search bar. As far as the world is concerned we're all search engines, even if, like search.com, they don't index anything, or, like DDG, they provide tailored indexes which support key aspects of your site's experience. The differentiation in that case is the experience, not where you get your data. And I fully recognize that this distinction is not important for a large chunk of the Internet.
For a general audience, yes, we're all search engines.
For a technical audience, no, we're not. And the distinction is whether or not you have a generalized web index or not.
For organic search they present to the end user in a nearly identical way, caveat the 'experience' benefits of one over the other.
But for other things, like "tell me all the sites on the internet that copied this article verbatim" or "give me a rundown on link authority to all inlinks to this page," you need both a web index and the rights to use it like that. That's a pretty objective difference in capability.
So in technical company (and I consider HN to be technical) I try to be crisp about the terminology; in non-technical company I refer to all of these offerings as search engines because it is less confusing and frankly they don't care what it's called, they type in words and get results.
 My father in law (part of the 99%) thinks he logs into "Google" to get to the internet because Chrome defaults to Google's home page when it starts up.
These were all front page. Over all I think Blekko has done a great job of being relevant and covered on HN.
Actually, here's a simple example. 44 pages when you search for Blekko using HNSearch - http://www.hnsearch.com/search#request/all&q=blekko
You are aware that Google hires thousands of search engine quality raters all over the world, right? But rather than directly curating, I suspect they just use these humans to train their algorithms. There is likely some inductive step going on in selecting which algorithm (or 'sort' of the web) is used, which is then filtered by keywords.
Secondly, most search engines actually do also use end-user curation. I have funky enough search queries that I often get Google redirects as search results: they want to know (and record) which link I pick. Also, because of all the google-adds-cookies, google-analytics-cookies, etc., they likely have 'session-bags' full of destination links and search queries that happen to correlate.
For me personally, neither search engine did very well in this test. But that's likely because neither result set is personalized in this test. And I don't know if Bing does that now, but even if it does, for me to switch to another search engine means training another search engine in my contextual world. Hell, I even notice differences in result sets in Google based on my operating system. I get different results on Linux than I do on my iPad.
People that get a lot of spam in search results, really ought to login, and stop clicking those links.
So one of the things that hinders the adoption/growth of search engines is muscle memory. Sometimes people type into their browser's address bar the address of a search engine, and then search, but more often than not they just click into the search box and start typing. If all the results are 'good enough' people don't bother to check who their search provider is, much less try others. You can manually change your search on Internet Explorer away from Bing to something else, you could manually change your search in Chrome from Google to something else, and you could change the default in the drop down box in Firefox to something other than Google. But if you don't know what you are missing, well you don't do that do you?
Now various sites which have bills to pay and no direct revenue stream, like download sites, offer a service to folks like Blekko which says "We can run an offer to our customers, which if they take us up on it you'll pay us for that." Just like Google gets paid by an advertiser if you click through their advertisement to the site and buy something, these services make your offer for you and then charge you if someone takes you up on it.
So that is pretty standard stuff; the down side is that this space has some pretty scammy operators. We have debated quite a bit about this internally. Our goal has always been to introduce people to Blekko who may not have heard about it (see above :-). However, we've also discovered that sometimes policing these guys is a pain. They get paid if you install our toolbar (which gives you quick access to our search engine). We insist that they make it an option and clearly spell out how to continue on with your download or what not without choosing to install it, and we insist that the vendor of this software make it straightforward and easy to revert the install if folks aren't happy with it. So that leads to situations where lots of people download software and say 'no thanks' on the offer. Sometimes that annoys the site because they aren't getting paid and they "change things" so that it always installs whether you want it or not. When we find out about that we cut those people off. And if the toolbar doesn't uninstall we cut those folks off too. We had one vendor who gave us a 'final' release to QA where it worked easily and the actual release didn't. That really irritated us. Sadly there isn't the equivalent of the better business bureau for these folks; we discover them, add them to our list of bad actors and move on. We've put a number of changes in place to be better at managing the process, and it may be that it ends up there are just too many scumbags in this space to make it a credible channel. Time will tell on that. In the meantime, if you ever spot a site telling you that you must install a Blekko powered toolbar in order to proceed (which is to say it's not completely obvious how to opt out) please report it to firstname.lastname@example.org and we'll fix it.
To Everyone: Try it out. Blekko's results are a breath of fresh air.
"we've focused on building a better search engine by concentrating on what we think are long-term value-adds -- having way more instant answers, way less spam, real privacy and a better overall search experience."
Now I'd probably word it 'building a better search experience' and remove the duplicate at the end, but that's just word-smithing. One of the reasons we love you guys is that we share the same values in this regard. The more 'engine-like' you get, and the further away from 'experience-like', the more dubious Google and Bing get about sharing their index with you :-)
And, like you guys, we like to partner with folks who can provide the 'long tail' as it were.
Looks like you might be wrong about that.
DuckDuckGo gets its results from over 50 sources, including DuckDuckBot (our own crawler), crowd-sourced sites (in our own index), Yahoo! (through BOSS), embed.ly, WolframAlpha, EntireWeb, Bing, and Blekko.
However I don't think Microsoft's study is any good. They say, "In the test, participants were shown the main web search results pane of both Bing and Google for 10 search queries of their choice."
So it seems that the study participant had to think of 10 search queries, one after the other. But this is not how search works. In the real world users need to perform one query every so often, and their choice of search terms is based on an immediate need rather than what comes to mind.
A good Bing vs Google study would ask 1000 people to use a split-test search engine for their browser search box, then monitor the results over time. I've asked HN about this before and got nowhere:
So is there business opportunity here? I imagine that there must be thousands of Google shareholders willing to pay £100 per year for the results of an unbiased, ongoing study into who gives the best search results. In fact, why doesn't Google set up the study and provide such information to shareholders?
Now, all of my searches were technical; I plan to try again later with more general, "normal people" searches. Bing may do better there.
Also, good results from even google searches have become harder to come by in the last year or so. I have to work harder at my search terms. I don't know if that's Sturgeon's Law at work, or if there's room for a search engine to improve enough to become better than google.
Perhaps targeted search engines (i.e. engines that have both their search and crawler algorithms tuned for a specific target audience) are the answer?
Of course it's going to be a draw for most searches (especially the suggested ones: "math games", "coupons", "time zones"...), they're generic and easy to find.
You can use any search engine you want to find those things. Shoot, even AltaVista was good at those kinds of searches, and that was 10 years ago.
The ones I care about—even if I'm not a programmer or technical user—are the difficult searches. What if I need to find a specific car part that only ever existed in the 1972 Ford Pinto? What if I want to find a plumber who specializes in pre-20th-century homes? What if, what if, what if.
I used searches like that and more in this comparison, and it was 100% Google. I could even tell easily why it was Google, and it was pretty clear which one it was. I still chose Google (despite being basically unblinded) because it was fundamentally better at these specific searches.
That's what bing needs to compare. But will they show it? Of course not.
Edit: I did 5 more searches all non-technical but specific and picked only based on result quality. Again 5/5 google. (for those curious, the searches were: "lego mindstorms", "Mars curiosity hi-res", "home developments in myrtle beach", "used bookstores in portland, or", "top baby names in 1982"—the last one was very cool because Google also provided links to the 5 surrounding years right below the 1982 result. It's the little things you know...)
All searches were related to pages I know and I searched for some keywords or parts of the domain.
The further my search is from this, the worse the results I get. I'd say this is a common theme throughout Google's software. Google Wave, for example, seemed to be aimed squarely at the "I'm a Google engineer" use case. (I assume not many Googlers ride the bus, because that's a weak point of their maps!) The Chrome commercials they're running right now are a great example: they look like something straight out of the old demoscene, that Google engineers might do in their spare time, not something that I could imagine a use for.
Here's what I searched for -
-my name (Google, I chose Google because it gave more of my results, which may not be fair.)
-extract articles from websites (Google, in this search google's result were much better.)
-python ide for mac (Draw)
-new york time (Draw)
I did my best to still pick without bias to my current search engine (Google) but ended up picking them 5/5 and Bing 0 times.
Seriously, guys; this is a problem. I'm searching for a coffee shop or restaurant or something, and I want to link my friends to the map.
I --DO NOT-- want to go to google+. In fact, I don't really want to google+ ever. For anything. Ever.
Seriously I feel like I'm navigating a maze of accidental google+ links every time I use google anymore. It's really frustrating. Bing at least seems to be just...a search engine.
In fact, google hiccuped all over my 8-year-old gmail account, deleting my inbox (yay! Thanks for that guys, oh, and it's just lost, too bad for me!) two days ago.
The upside to this? It somehow also unsubbed me from google+!
Edit: holy shit it works! You just made my internet faster. How did I, or rather everyone, not know about this?
It reminds me a lot of myspace, actually.
Okay, instead of whining about this, I'm going to look into a maps API client :)
This actually made me pretty sad.
Also, were the messages merely archived (i.e., moved from Inbox to All Mail), or were they completely deleted (i.e., you can't even find them via the search box)?
No, no strange activity in that window. I see myself coming out of Black Rock City, and the sparse signal I was getting on Monday as a result of it, but other than that it's just my office.
Can you check for unusual forwarding rules or delegation in settings? As a follow-up, do you use two-factor authentication on this Gmail account?
Also, check your
I tried the link you're suggesting, as well as looking through the trash etc. What seemed very odd to me was that the messages which started in my inbox were missing from the trash, but things that were "skip the inbox" weren't.
Meaning anything I had "trashed" before yesterday was intact.
Again this seems to suggest that the account wasn't compromised, but that I fell victim to some sort of bug. It would have to be a very strange/determined hacker to go through and move my inbox to trash, then go to trash and only remove messages that started in the inbox.
I searched what I /thought/ was EVERYWHERE yesterday. Thank you thank you thank you!
Fear of doing that sort of thing has led me to disable GMail keyboard shortcuts altogether on my account. (To do that, press ? in a GMail window and click the link on the second line of the overlay window.)
Yesterday was some very odd behavior, so I must have missed them in the morning.
1) I logged into gmail, and saw only one email that I had gotten that morning, everything else was gone. I looked in trash, and the things in it were from over a week ago [I've been out of town at Burning Man for the last week], so things hadn't just been moved to trash.
2) Did the "recover deleted messages" request via google, and a few hours later got a response that some things should be restored to "All Mail", but still didn't see much.
3) Now they're back!
This is great! :)
Why? Because I tested it with fairly specific queries. A while ago I tried to switch to Bing because after testing it with a few general searches, it seemed just as good. However, when I switched and started searching for compiler errors, etc - I soon switched back because the results were just nowhere near as good.
Here's what I searched for:
bash: !": event not found error
gtk calendar tutorial
python import gtk error
seed funding in uk
Bing ended up winning, which shocked the hell out of me, but after trying a couple more times, I noticed that pattern.
Whether to get the job done, to get some certain information or something third.
* Google https://www.google.com/search?q=insertObject+atIndex+zero+bu...
A search for "Hunger Games" on Google returns the IMDB page as the first result, which is exactly what I want. But neither pane in Bing's test has the IMDB page as the first result.
So I wonder if this test is really "Google vs. Bing + Google".
EDIT: Source - http://news.ycombinator.com/item?id=2165469
"When Suggested Sites is turned on, the addresses of websites you visit are sent to Microsoft, together with standard computer information. ... Information associated with the web address, such as search terms or data you entered in forms might be included. For example, if you visited the Microsoft.com search website at http://search.microsoft.com and entered "Seattle" as the search term, the full address http://search.microsoft.com/results.aspx?q=Seattle&qsc0=... will be sent."
Most people have little idea that allowing a feature called "Suggested Sites" will result in their Google searches and clicks being sent to Microsoft, or that Microsoft will use clicks on Google search results in Bing's ranking.
MSFT also uses something called the Microsoft CEIP (Customer Experience Improvement Program), and I think that's either opt-out already or they're making it opt-out in Windows 8--it's built into the "Use Express Settings," I believe.
Again, I haven't looked at this very recently, but if you're using a recent version of Windows and IE, you're probably sending your searches and clicks to Microsoft unless you've been very careful about how you configured your computer.
In any case, I think you've convinced me that my decision to use a non-windows OS and a non-IE browser was the correct one.
Thanks for mentioning this...
Then I right clicked one of the results one day and did "copy link" since I wanted to send it to a friend, and got something like this (with a bunch of other stuff in it too that I've yanked out cause I have no idea if any of it would be identifying info, but the end result is that this link no longer works):
I just hadn't thought about it before. It makes sense that they want and use that info, but it wasn't obvious at all that they were collecting it in that way, since the address shown in the status bar is the destination site, not the actual link target. I had just always assumed that the results were based on links and such around the web, and didn't think clicks would be factored in.
Seems it is opt-in for users.
The most famous example would be The Pepsi Challenge, but it happens all the time; for example, AMD have done it at a few events comparing an AMD machine to a similar spec/price Intel build.
In my case Bing won 1 out of 5 of the rounds. I used searches I had done recently that hadn't given satisfactory results on Google.
In any event Microsoft now at least has searches it can work with and improve. I can't imagine any other way they could get Google users to provide search data.
See the following fictive example:
1000 people, 600 have the following score: 4 Bing, 3 Google, 3 Draw; 400 have the following score: 10 Google.
So, in my example, 60% of the people chose Bing more often and 40% chose Google more often. But if you look at query numbers, Google was chosen for 3*600 + 10*400 = 5800 queries, and Bing for 4*600 = 2400 queries. That's more than a 2:1 ratio, but in favor of Google.
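The arithmetic in that fictive example can be checked with a few lines of Python. This is only a sketch: the `scorecards` list and the `per_person_winner` helper are names made up here, and the data is the invented 600/400 split from the comment, not anything measured.

```python
from collections import Counter

# Hypothetical scorecards from the fictive example: (bing, google, draw) per person.
scorecards = [(4, 3, 3)] * 600 + [(0, 10, 0)] * 400


def per_person_winner(card):
    """Return which engine this person picked more often."""
    bing, google, _draw = card
    if bing > google:
        return "bing"
    if google > bing:
        return "google"
    return "tie"


# Headline number: how many people preferred each engine.
people = Counter(per_person_winner(card) for card in scorecards)

# Alternative number: how many individual queries each engine won.
queries = Counter()
for bing, google, draw in scorecards:
    queries["bing"] += bing
    queries["google"] += google
    queries["draw"] += draw

print("people :", dict(people))
print("queries:", dict(queries))
```

Counting people gives Bing 600 to 400, while counting queries gives Google 5800 to 2400, so the same data supports opposite headlines depending on the unit you count.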
After 5 searches the screen greyed out and was unresponsive.
Guess Google wins by disqualification?
Google results also seem to show share counts from Google+, as in: 73,352 people +1'd this
You would rather they blanket the city in ads on bus shelters? Or give away $1,000 a day to a random Bing searcher?
Tip: try to search for plot lines in movies such as:
'kid that is able to see people already dead'
But I could tell which side had which search results in 2 or 3 of the tests, so I can't claim complete blindness in the test. I tried to be impartial, though.
MS has definitely caught up quite a bit, though.
Google's strength is not in searching for head or torso terms like "facebook" or "2013 calendar". It is in searching out tail terms; those obscure terms which might include a typo or two, and the phrasing may not be correct.
Unfortunately (and I don't have much time to explain this) one simply _can not_ judge a search engine's comparative performance using random users. And the Bing guys are falling into that trap.
EDIT: To expand more on what I wanted to write.
// DISCLAIMER: This is my personal view, and has nothing to do with my employer.
There are 3 large (fuzzy) classes of queries on search engines: head, torso and tail. The head queries are small in number but large in volume: queries like "Facebook", "Google", "Yahoo", etc. (yes, people do 'search' for these on search engines. Just a handful of such queries may make up 10% of the entire search engine traffic, which would be 100M queries per day (or so) on Google. It is depressing to see the numbers). In general, any search engine worth its salt will get the head queries right.
The torso is basically deciles 2-6 or so (there's no hard and fast definition). Here you'd have most musicians, actors, popular restaurants, etc. Bing and Google both would do a mostly adequate job here, for the first result. But: if you want to explore the concept a little, Bing goes astray and Google does not. For example, search for "idle hand tattoo" on http://bingiton.com/ . The first result is the same, to my local tattoo parlor. But the second result on Bing is from Mesa, AZ; while Google sticks with the right result and offers up other, related sites (Yelp, etc.). Google knows that "Hand" in the title is not the same as "Hands", and gives me not only the store, but also their Yelp, FB and Twitter pages; in other words: Google _understands_ your query better.
And finally, the tail. Here we have obscure error codes, weird technical terms, etc.; most of which are seen maybe once or twice a day. These constitute the bottom ~30% of the query stream. This is where Google really shines. It is very rare indeed to look for an obscure term and find that Bing does a better job than Google (I have yet to find an example).
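To make the head/torso/tail split concrete, here is a rough sketch that buckets a query log by cumulative share of traffic. The thresholds (top 10% of traffic as head, bottom 30% as tail) loosely follow the numbers above and are assumptions, as are the function name `classify_queries` and the toy counts; as noted, there is no hard and fast definition.

```python
def classify_queries(counts, head_share=0.10, tail_share=0.30):
    """Label each query head/torso/tail by where it starts in cumulative traffic.

    counts maps query string -> number of times it was searched.
    The share thresholds are illustrative assumptions, not a standard.
    """
    total = sum(counts.values())
    labels = {}
    running = 0
    # Walk queries from most to least frequent.
    for query, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        start = running / total  # share of traffic ranked above this query
        if start < head_share:
            labels[query] = "head"
        elif start < 1.0 - tail_share:
            labels[query] = "torso"
        else:
            labels[query] = "tail"
        running += n
    return labels


# Toy log with made-up counts, purely for illustration.
toy_log = {
    "facebook": 40,
    "hunger games": 20,
    "lego mindstorms": 15,
    "gtk calendar tutorial": 13,
    "python import gtk error": 12,
}
labels = classify_queries(toy_log)
for query, label in labels.items():
    print(f"{label:5}  {query}")
```

In a real query stream the tail would consist of a huge number of queries seen once or twice each, which is exactly why it is hard to sample in a sit-down study.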
So, as a user, I have a choice of 2 search engines; and one of them I know will do a better job on obscure terms. Guess which one will I pick?
When Bing claims that "most people prefer Bing in our user studies" (or something similar), their conclusion is flawed because the study itself is flawed. You can't sit a person down in front of a computer and ask him to evaluate your performance on tail queries! Where are those tail queries coming from? If they are provided by Bing ("here's an obscure term, search for it and tell us what you think"), then the user has no way to evaluate how relevant the results are, because, by definition, it's an obscure term! On the other hand, if you ask the user to come up with an obscure term, it will most likely be a term they are familiar with and they'll already know the answer, so they won't really hunt for information. So in my "idle hand tattoo" example, they'll see the top result and claim satisfaction; when, in real life, I would like to see the Yelp reviews, maps, etc. for the place.
If they (Bing) really want to compare how well they're doing, here's a suggestion: set up a search engine (like bingiton, but with only 1 pane of results, and neutral branding), and make it the default for a large pool of users (with their permission). Randomly switch the backend between Bing and Google. Then, monitor the heck out of what the users do (all with their permission): how often do they click on the first result, how often do they re-formulate queries, the time to the first click, how often do they quickly come back to the results page from a bad click, etc. etc. etc.
When I tried to come up with obscure queries, Google did better at the technical queries, and Bing did better at the non-technical queries. But the technical queries usually weren't that far off.
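A minimal sketch of that kind of split test is below. Everything in it is hypothetical: `SearchEvent`, `pick_backend` and `summarize` are names invented here, the events are hand-written rather than logged, and a real study would need consent handling, far more data and proper statistics.

```python
import random
from collections import defaultdict
from dataclasses import dataclass

BACKENDS = ("bing", "google")  # hidden from the user behind neutral branding


@dataclass
class SearchEvent:
    backend: str             # which engine silently served this search
    clicked_first: bool      # did the user click the first result?
    reformulated: bool       # did the user reword and re-run the query?
    seconds_to_click: float  # time to the first click
    bounced: bool            # quick return to the results page after a click


def pick_backend(rng):
    """Randomly choose which engine answers the next search."""
    return rng.choice(BACKENDS)


def summarize(events):
    """Aggregate the behavioural signals per backend."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event.backend].append(event)
    report = {}
    for backend, evs in grouped.items():
        n = len(evs)
        report[backend] = {
            "searches": n,
            "first_result_click_rate": sum(e.clicked_first for e in evs) / n,
            "reformulation_rate": sum(e.reformulated for e in evs) / n,
            "mean_seconds_to_click": sum(e.seconds_to_click for e in evs) / n,
            "bounce_rate": sum(e.bounced for e in evs) / n,
        }
    return report


# Hand-written events standing in for real logs.
events = [
    SearchEvent("bing", True, False, 2.0, False),
    SearchEvent("bing", False, True, 4.0, True),
    SearchEvent("google", True, False, 3.0, False),
]
report = summarize(events)
for backend, stats in sorted(report.items()):
    print(backend, stats)
print("next search goes to:", pick_backend(random.Random()))
```

The point of measuring behaviour rather than asking for a verdict is that users never have to decide which pane "looks better"; reformulations and quick bounces reveal dissatisfaction on their own.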
They're just better in search. I generally dislike Google (because I don't like their business model, which is advertising [I don't care for privacy much, just hate ads]), but nothing is as good as them in search, nor could be (for the foreseeable future).
I used Bing for several months. I actually found it indistinguishable from Google except for linux howto questions (Google seems to understand forum websites better).
The dealbreaker for me was lack of SSL support. Switched back to Google.
Nobody who uses Google regularly will switch to Bing: not enough of an advantage. But people who start out with Bing may never switch. That's the long-term game Microsoft has to play.
Bing is as good as Google on average. The problem is, when you're up against an entrenched competitor, good enough just isn't good enough.
I've yet to see a website where the search referrals are even within 30 points of Google.
How about we build a community test?
We could make a little page showing the results for various search engines (e.g. google, bing, blekko, duckduckgo) and put them to a similar test. We could open up the data and show it as it is collected in real time, so the community gains from the experience.
Immediately there is a problem: how to get the results.
Naively we could use iframes to show the pages for each result. But that is not possible.
We could use their APIs to get the results and show them. However, these are limited to N calls on their free package. That would make the survey limited.
Another way is to screenshot the page (fetched using a simple browser call) and present it. This, however, would also have to be paid for beyond N calls.
Anyway, after resolving this, we should be good to go.
We could gather data in an unbiased way and present it to the world.
This is simple to make, I think.
Upvotes if you think it is a good idea (for this to be a joint effort probably).
I suspect a lot of this may be down to how challenging the queries are that people put in. Bing's historically been great at generic searches, but I made a point of looking for more difficult stuff, where google has (for me) always been better.
So, the comparison is completely useless, as they did not anonymize the style of the results, and thus either conscious or unconscious bias can creep in. I expect that for users who are not aware that the results look different, there will be a significant preference for the results that they see every day (i.e. the engine they normally use).
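One way to anonymize the style, sketched below, is to reduce every result to the same plain fields and shuffle which engine lands in which pane. The `blind_panes` function, the field names and the sample results are all made up for illustration; how you would actually obtain each engine's results is left open.

```python
import random


def blind_panes(results_by_engine, rng):
    """Return (panes, key).

    panes: list of (label, results) with engine-specific styling stripped.
    key:   label -> engine, kept server-side until the user has voted.
    """
    engines = list(results_by_engine)
    rng.shuffle(engines)  # pane position must not reveal the engine
    panes = []
    key = {}
    for i, engine in enumerate(engines):
        label = f"Pane {chr(ord('A') + i)}"
        # Keep only neutral fields; drop engine-specific markup and extras.
        plain = [
            {"title": r["title"], "url": r["url"], "snippet": r["snippet"]}
            for r in results_by_engine[engine]
        ]
        panes.append((label, plain))
        key[label] = engine
    return panes, key


# Made-up results; the "html" field stands in for engine-specific styling.
sample = {
    "engine_one": [
        {"title": "Example A", "url": "http://example.com/a",
         "snippet": "First snippet", "html": "<div class='one'>...</div>"},
    ],
    "engine_two": [
        {"title": "Example B", "url": "http://example.com/b",
         "snippet": "Second snippet", "html": "<li class='two'>...</li>"},
    ],
}
panes, key = blind_panes(sample, random.Random())
for label, results in panes:
    print(label, results)
```

Rendering both panes from identical plain fields with one shared stylesheet removes the visual tells, although distinctive result ordering or extras can still give an engine away to a regular user.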
I am not going to switch yet, but this is good news, the gap is closing and maybe, like we had an IE monopoly, we are going to move slowly in direction of more search engines.
My queries: one software-technical python virtual machine, one philosophical argument from evil, one highbrow musical/artistic (this one was the draw), one engineering-technical, one geeky-popular cultural.
This was made by Microsoft??
What bothers me most about this is that the programmers didn't bother to remove the S from "rounds" when the result was singular... sloppy.
When your system can be gamed, it is broken. Period.
They also suppressed the Knowledge Graph box.