But even though this makes Google look good, PR-wise, Bing should still use this trick, if it makes their search results better. It seems like a short term solution, but a good one to get their results more competitive, while they work on the core problems Google has already solved. Google should call them on it and expose their hackery, so people know where the good search science still comes from, but Bing should still do it. They are both playing the game very rationally.
As an aside, I don't buy the arguments of "they shouldn't be mentioning Bing." This isn't like the POTUS running against some no-name congressman - this battle is already well-publicized, via hundreds of millions of dollars of ad buys by Microsoft, so the general public already knows there is a competition between Bing and Google.
Microsoft, and any other would-be competitor, would essentially be committing suicide not to try to make up this data gap. If their toolbar is opt-in on the part of users, and you agree with me that my click history is mine to share with Microsoft if I so choose, this is helping consumers. Without some of this data, building a viable competitor to Google is impossible, and consumers do benefit from competition in web search.
Disclaimer: I work in Facebook search. Not the same thing as web search, and I don't really care whether Bing or Google "wins", though I'm temporarily rooting for Bing because as a user I want better, more competitive web search.
How about Yahoo? They have a longer history, I suppose. IMO it's all about quality and engineering.
More importantly, why did Google survive and flourish for 12 years?
But now Google doesn't really need it (possibly they don't even use it), because pairs of queries, where the user first misspells and then corrects, give them a StringMap that they can use to map misspelled queries to correct queries.
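A rough sketch of that kind of map, in Python. Everything here is made up for illustration: the session format, the `build_spelling_map` name, and the similarity and count thresholds are assumptions, and nothing suggests this is how Google actually does it.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def build_spelling_map(sessions, min_similarity=0.7, min_count=2):
    """Map a likely-misspelled query to its most common follow-up query.

    `sessions` is a list of query lists, each in the order the user typed them.
    Both thresholds are arbitrary values chosen for this example.
    """
    pair_counts = defaultdict(Counter)
    for queries in sessions:
        for first, second in zip(queries, queries[1:]):
            if first == second:
                continue
            # Only near-identical strings count as misspelling/correction pairs.
            if SequenceMatcher(None, first, second).ratio() >= min_similarity:
                pair_counts[first][second] += 1

    spelling_map = {}
    for first, followups in pair_counts.items():
        best, count = followups.most_common(1)[0]
        # Require the same correction to show up more than once.
        if count >= min_count:
            spelling_map[first] = best
    return spelling_map


sessions = [
    ["mazad", "mazda"],
    ["mazad", "mazda", "mazda dealers"],
    ["mazda", "toyota"],
]
print(build_spelling_map(sessions))  # {'mazad': 'mazda'}
```

The similarity check is what keeps ordinary follow-up searches ("mazda" then "toyota") out of the map; a real system would presumably need far more evidence per pair than two sessions.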
Google only gets the query volume it does because it is the quality leader. The query volume itself helps Google to retain its quality lead. Google likes to portray search quality as being algorithm-driven, and it is to some extent, but in the modern era quality is also about collaborative filtering with clicks. If you don't have the users, you don't see the clicks, and you can't have the quality. Web search is a natural winner-take-all monopoly, unless someone gets creative, which is what Microsoft seems to have done.
Don't Google, Facebook, et al run a lot of experiments for new projects on a subset of users/queries that's far smaller than 1% of traffic, and still yields very useful results?
1. User mistakenly types a query [mazad], meaning [mazda]. (Probably less than 1% of total queries for Mazda, which is an infinitesimally tiny fraction of the total queries in your system.)
2. The user gets garbage results, and the user realizes their mistake and fixes it, rather than giving up in frustration. This is probably rather rare too, though.
3. The user clicks through something that ranked highly for Mazda, and stays there long enough that your system thinks it is a "long click" that probably satisfied the user.
The golden datum here is literally a one in very-many-thousands-of-sessions event, and you need to catch a statistically meaningful number of them for every misspelling (or synonym, or whatever you're trying to learn from this data) you'd like to have your system learn. To have good coverage of the English language, we're talking about many billions of search sessions.
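To make the three steps concrete, here is a minimal Python sketch of spotting that "golden datum" in a single session. The `Event` record, the `golden_data` function and the 30-second "long click" cutoff are all hypothetical; no search engine's real log format or threshold is implied.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

LONG_CLICK_SECONDS = 30.0  # assumed cutoff for a "long click"


@dataclass
class Event:
    query: str
    clicked_url: Optional[str] = None  # None if the user clicked nothing
    dwell_seconds: float = 0.0         # time spent on the clicked page


def golden_data(session: List[Event]) -> List[Tuple[str, str, str]]:
    """Return (bad_query, fixed_query, url) triples found in one session."""
    found = []
    for before, after in zip(session, session[1:]):
        # Steps 1 and 2: a query with no click, followed by a different query.
        abandoned = before.clicked_url is None
        # Step 3: the second query ends in a click the user stayed on.
        satisfied = (
            after.clicked_url is not None
            and after.dwell_seconds >= LONG_CLICK_SECONDS
        )
        if abandoned and satisfied and before.query != after.query:
            found.append((before.query, after.query, after.clicked_url))
    return found


session = [Event("mazad"), Event("mazda", "http://www.mazda.com/", 95.0)]
print(golden_data(session))
```

Each session yields at most a handful of triples, and most yield none, which is the point: the filter throws away nearly everything it sees.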
A previous commenter pointed out that Yahoo! probably has enough data; I bet they're right. I don't know if Yahoo! and Bing's technology partnership included access to such data.
collaborative filtering with clicks ... you don't see the clicks,
and you can't have the quality
This is so typical of Microsoft btw.
They win by entering the dialogue. They copied UI from Apple back in the day, and then were able to cut prices, etc., and Apple lost (at least for a decade).
I would caution Google to ignore the controversy completely and move on to more interesting problems in search. Microsoft is very good at playing this game increasingly on their terms, and then they win.
ETA. As a quick caveat. I have nothing but respect for many Microsoft employees. And I don't mean to generalize. I just can't help but remember the similarities to:
Being 'right' is sometimes a premature optimization....
ETA2. Esp. with someone who is very good at making the second dance move. But think back on all the major Microsoft products (IE v. Netscape, Office v. Lotus, though this is a reach maybe, Xbox v. PS, .NET v. Java, etc.). It's just a different approach. But it's a mistake to not analyze it and appreciate it for what it is.
How would you explain Xerox suing Apple over this?
I can't find a mention of Apple licensing anything in the wikipedia article either:
Also this claims to debunk the story:
"Xerox did at one time owne stock in Apple, they were purchased as an investment"
(All this followed from the Wikipedia page, I'll admit)
What is Google made up of? The then MSFTs are now at Google.
There are no non-Google pages of links. The only association between these terms and the page exists on Google.
>To be clear, the synthetic query had no relationship with the inserted result we chose -- the query didn’t appear on the webpage, and there were no links to the webpage with that query phrase. In other words, there was absolutely no reason for any search engine to return that webpage for that synthetic query.
Note also that the example search links in the blog post are all from the non-SSL Google search. User tracking doesn't necessarily rely on the HTTP REFERER in this case, since the browser already has access to all of the necessary information. But it would still be interesting to see the experiment repeated using the SSL-encrypted Google search, which disables referrer information, along with non-Google sources.
If it's the first case, then that's a pretty cowardly way to act. If it's the latter, then you should just be ashamed of yourself in general.
Calling names is not OK. I'm pretty shocked you got any up-votes for your comment at all, and I'm ashamed of everyone on the site who gave you the bump.
EDIT: At the time of posting this comment, he had +8. My faith has been restored in the HN community.
Eh? Then why did it happen only in 6% or 7% of the cases tested and not close to 100%, a fact that the blog conveniently glosses over?
In discussions of this, I'm always surprised that people forget that Microsoft has had the axe of anti-trust litigation looming over its head for years. They haven't been able to compete aggressively on a number of fronts, because they could easily be seen as anti-competitive, and end up back in the courts.
But, now, search is a market where they are clearly the underdog. And, Microsoft finally has an arena they can step into and fight bare bloody knuckled without arousing anti-trust eyebrows in the US or EU governments.
And, Bing is competing rather well. Bing launched 18 months ago, and they now have 30% of the total search market (yes, they bought a good bit of that through Yahoo). And, they are finally presenting a viable competitor to Google in general web search, which I believe is good for all.
Not only that, but in a number of search verticals, IMHO Bing's technology is quite a bit better than Google's (image search, video search and travel search).
I think Google has a reason to be scared and start throwing punches. Microsoft has a much more diverse revenue stream, and they can afford to lose money on search for years without breaking a sweat. Google primarily has a single source of revenue: search advertising. It'll be interesting to see how Google reacts when its back is against the wall fighting for its only revenue stream.
I agree with the OP. This is going to be a fun fight to watch.
Also, as stated on the earlier HN thread on this topic, it isn't at all clear that Google will come out on top. Many users suggested that the Google toolbar collected data on the sites people visited and if the Bing toolbar does also -- well is that really a game Google wants to play? Bing could come out a net winner given the increased attention on their search engine.
Why does this last aspect matter? It is an acknowledgment that Google is taking their competition seriously, which indicates a bit of uncertainty with respect to their own product. The more that goes into this (esp. if Bing hits back w/ info on how Google historically has mined data with their own toolbar) the more that Google has to lose and Microsoft has to win.
I do buy that argument but only for mass media PR type stuff. The market leader can only give attention and raise awareness to the followers.
The thing is in this case, it's a technical blog posting and really anyone following that already knows about Bing. It was in searchengineland. If they had a link to it on their home page it would be a different story.
The result of all this is that I now perceive Bing to be considered more of a contender by Google than I previously did.
I doubt that is what Google intended. But then, this PR move was probably meant to change the opinion of people who had a higher opinion of Microsoft and Bing than myself, or were just casual users with no real position.
Either way, it does put Microsoft on the defense for something that I wouldn't construe as malicious.
1) They do the same thing. When I have the Google Toolbar installed and use Bing, my clickstream data goes to Google.
2) On a micro level, there is nothing wrong with this at all.
Of course, on a macro level, both sites trample all over user privacy, so you should be using DDG.
Using any Google product, you implicitly agree to them, in exchange, using your data. That's how Google works, and has always worked. That's your payment for using the service.
With Windows - the operating system that you are using - it's an entirely different proposition. For one, you've already paid for it. And secondly, you don't expect the software that you bought to spy on you and give away links you were clicking on in a Google search results page.
Links in a Bing search - sure! That's how search engines work. But tracking my clicks on any other web page, by my OS, that's spyware, plain and simple.
Unless I'm missing something just running Windows isn't enough for MS to do the kind of data collection Google is claiming.
I think this distinction has legal as well as technical implications.
* all with some specific exceptions for perhaps SSL and locally-resolved zones, or something like that.
The big search engines are large enough to show up clearly in any of their competitors' statistics (and there are always human teams monitoring the automated process) and should specifically exclude data from their competitors.
For Google to use Bing clickstream or vice versa also perpetuates a vicious circle where bad results from one search engine will spread among the others.
Bing either hasn't acted ethically, or hasn't thought through the consequences of absorbing clickstreams of competitors' sites.
Some would say, "Well, Android has innovated on top of iPhone's precedents." So has Bing, right? In fact, I'd claim Android owes far more to Apple than Bing does to Google.
Some would say of Android copying iPhone, "Well, it's fair because we want competition in the mobile space, not for one company to dominate." Sort of like how Google dominates search? How much would I love for a true competitor to Google, so we can test, e.g., their policy of having terrible customer support.
Which is not to say that Apple hasn't contributed ideas to Android, but just an indication that some people were thinking about good mobile devices before Jobs came out with the iPhone (which, by the way, is a bright, colorful, phone-calling PDA, not all that different in principle from late Palm Pilots/Treos).
What Apple did was revolutionary, but they stood on a lot of shoulders to do it.
There's a huge difference between doing work that is influenced by someone else and just stealing someone else's work.
(Legit sidenote: Google has, via the use of Analytics data, mass coverage of the clickstream for the whole web, which is default opt-in, follows you everywhere, and can identify you uniquely. The Bing Toolbar at least asks first.)
If this is the case, Google isn't being picked upon; rather, they are merely the first, who figured this out externally. Cookie for the scientific rigor, but no cigar for the way they PRd the story. Correlation, after all, does not equal causation.
Google does not use Google Analytics data in any way in our rankings. I've said that plenty of times before, but it's worth mentioning.
As a webmaster I would opt-in for this sort of thing in a heartbeat if I thought it would help your algorithms understand my site. I'm sure Joel Spolsky and most other legitimate online publishers would do so too.
Search engines to me are an obvious case of a means to an end. If a search engine better than Google were to come out tomorrow I would switch to it (from Google) instantly with no regrets. Google's sense of propriety about their results (or, more accurately, what users clicked on after searching via Google), especially given the fact that they are well-known for their penchant for sucking in user data like a black hole (not that I care-- I want them to use it if it means better searches), to me seems 9 parts hypocritical and 1 part prima donna.
Need people be reminded that this is the same company that "accidentally" logged users' WiFi browsing habits while driving StreetView cars around Europe? Give me a break. Everyone is guilty, and no one is going to do anything differently now than they did before.
Why does the user's click from the results page suddenly belong to Google (apart from the fact that in this specific case they actually artificially created a fake long-tail result)? If I Google Bing, and then Bing's ranking of Bing goes up as a result (not that it's not already #1, but whatever), can you actually say that it's Google's result and ranking? What if it's nytimes, or any number of extraordinarily common searches where you're really just doing a domain lookup for a name you already know?
What if I didn't click on anything until the 30th page of results because that was the only useful result, and it causes the Bing rank to go higher? Does Google have any ownership over the rank then, even if the useful page was ranked lower than much more useful results? Couldn't Google then just return a list of every page on the internet in response to every query and then claim that their results are being stolen?
To be honest, I'm not really convinced that either side is in the right here. I just think that it should be made clear that there is a large distinction between stealing results and tracking clickthrough behavior. One would be laughably shortsighted and of dubious ethics; the other is basically common practice, and is being made out to be a bit more than it is because of its superficial appearance.
They use results for terms users entered into Google to crawl pages that are not in their index (the torsorophy example, which is not an artificial one), thereby enriching their index based on Google's results and increasing their depth.
As for ranking, it is blurrier, but when you record users' clicks, which directly correlate with ranking, it starts to stink.
Even considering the top 100 most visited websites on Alexa: all of them have a search form, and only 20 or so belong to Google; it's very easy to see how the aggregated usage of the other 80 could be much, much higher than the aggregated usage of Google properties.
Therefore, while Google might be the single most impacted organization in the world, most of the data comes from non-Google properties. And none of this has anything to do with my original argument of the algorithm itself being benign.
I'd also guess that the data from domain specific sites are more valuable than generic search sites. (User selects appropriate site, does search, selects appropriate result.)
>Suffice to say, Google’s pretty unhappy with the whole situation, which does raise a number of issues. For one, is what Bing seems to be doing illegal? Singhal was “hesitant” to say that since Google technically hasn’t lost anything. It still has its own results, even if it feels Bing is mimicking them
This is actually just IE's "spying" working properly. If an MSIE user that has allowed Microsoft to see their browsing habits follows a link after a search then MS are associating that link. This is sensible as it's measuring actual visits following a given search.
If someone searches for a googlewhack and Bing have no results for that term then it's natural that MS would then use this data to associate the googlewhack with the visited page.
Initially I thought this sounded like MS being underhand, but really they're tracking their users and associating their users' search terms with the pages that they visit - _not_ using this data for search (given they have permission) would be silly, no?
The flag this waves for me is how easy is it to manipulate Bing results using false MSIE reports back to MS, anyone know of botnets sending fake data to boost page rankings??
You sir, have won the thread.
It's that Microsoft has no confidence in Bing. They aren't willing to trust their algorithms to produce the best search results. They've decided that, some portion of the time, the single best search result they can return is whatever Google is returning.
They've given up on trying to be better than Google, and are settling for being a cheap, off-brand knockoff that rebrands stale Google search results.
That's rather shocking, and I frankly thought the Bing team were better than that.
I don't see how this indicates that they've given up on Bing. They're spending a lot of money on the online services team, even making a loss for the past several quarters to improve Bing. They're playing the catchup game, so this is a quick and easy way to stay competitive while they get their algorithms up to scratch. It's better than the alternative of losing all their customers/market share and then having no data to help them improve.
I see this more as a compliment to Google, even though Google certainly doesn't see it that way, and I can definitely understand their frustration.
Since you're interning in the Bing team, I'll ask: did a lot of people at Bing know about Google's rankings as a data source in Bing? I'll understand if you can't answer, but I'm genuinely curious.
Sounds like a very rational position. Anything less could be termed 'delusion' on Microsoft's part. And for the longest time, they were delusional about the search market in general. No longer.
Google being an additional source of data for them does not equal Bing "giving up on bettering Google".
I don't see how making this a public issue is a win for Google. Seems like something they should have kept in their back pocket. "Keep your enemies closer", as they say.
Whether or not people perceive the claim as accurate or complaining or whatever, first impressions are still powerful. If the first time you hear of Bing is the accusation that they're copying someone else's results (true or not), that's probably not a first impression you would want people to be left with if you were Microsoft. Chances are, people are going to read this as "My trusted search provider accuses unknown search provider created by the same people as Windows Vista of stealing its results".
Google can and does track clicks from Bing via Google Analytics. Every time you click a result on Bing and land on a page using Google Analytics, Google knows about it, and they record your Bing search terms from the referrer. The same is probably true for pages with Google ads.
Bing have done wrong (granted probably not legally), and their response to a very detailed Search Engine Land article was a quick, nonchalant 'Huh? Oh that. Yeah, we don't copy Google's results. I know that doesn't really answer the claims but we don't really care enough to give a proper response.'
Bing's actions here (and their response) have seemed very poor, and I definitely praise Google for going public with this.
I'd certainly like to think that if I was in a position where I caught a competitor piggybacking off my work, I'd go public with the information too (in a non-confrontational manner of course, as Google are doing).
So yeah: good for Google. Bad for Bing.
I don't think that it is a secret that Bing uses click data from browser/toolbar as a signal; it's just not a well-known fact. For example, in the paper "Learning Phrase-Based Spelling Error Models from Clickthrough Data" (http://aclweb.org/anthology/P/P10/P10-1028.pdf) by Microsoft Research, they explain how to improve the spelling corrections by using click data from "other search engines".
I just pulled down the paper and noticed this: "The clickthrough data of the second type consists of a set of query reformulation sessions extracted from 3 months of log files from a commercial Web browser .... In our experiments, we "reverse-engineer" the parameters from the URLs of these sessions, and deduce how each search engine encodes both a query and the fact that a user arrived at a URL by clicking on the spelling suggestion of the query – an important indication that the spelling suggestion is desired"
Some of the recent discussion has been about whether Microsoft looks at lots of different sites vs. doing something special or different for Google. This paper very much sounds like Microsoft reverse engineered which specific url parameters on Google corresponded to a spelling correction? Figure 1 of that paper looks like Microsoft is using specific Google url parameters such as "&spell=1" to extract spell corrections from Google.
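For anyone wondering what "reverse-engineering" a URL parameter amounts to, here is a minimal Python sketch. Only the `spell=1` parameter comes from the paper's figure as described above; the function name is hypothetical, reading the query from a `q` parameter is an assumption for illustration, and this is not a claim about what Microsoft's code looks like.

```python
from urllib.parse import urlsplit, parse_qs


def spelling_click_query(url):
    """Return the query if the URL looks like a click on a spelling
    suggestion (spell=1 present), otherwise None."""
    params = parse_qs(urlsplit(url).query)
    # parse_qs returns lists of values, so compare against ["1"].
    if params.get("spell") == ["1"] and "q" in params:
        return params["q"][0]
    return None


print(spelling_click_query("http://www.google.com/search?q=mazda&spell=1"))  # mazda
print(spelling_click_query("http://www.google.com/search?q=mazda"))          # None
```

A rule this small has to be written per search engine, since each one encodes its parameters differently.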
Targeting Google specifically is quite different than using lots of clicks from different places. It looks like you work at Microsoft--can you say any more about this?
Well, no, that's a research paper that says that they have made experiments in that direction, but this doesn't imply that this is currently done in Bing. But it gives a hint about what kind of data is available from the "log files from a commercial Web browser".
> Targeting Google specifically is quite different than using lots of clicks from different places.
From the article, they have handcrafted rules for both Google and Yahoo, which together with Bing have (I think) 95% of the market. I'd say they are not targeting Google; they are targeting the majority of search engine users. There just happen to be only 3 major search engines, so a few handcrafted regexes are sufficient.
I wouldn't be surprised if Google Maps has handcrafted (or manually tuned) scraping code to extract reviews from Yelp and other major review sites, and same for Google News for the extraction of the news body from the major online news sources. How is this different?
> It looks like you work at Microsoft--can you say any more about this?
Yeah, I should have been more clear about this. I am interning at MSR and have some involvement with Bing (and actually worked there last year), but my comments are personal and about facts that are public.
BTW, IMHO using the click logs can't be considered "copying", more like "a way to discover new sites to crawl and the keywords that lead to them". This is not copying the SERP results.
Since it "looks like" you work at Google :) can you answer this question (it was also asked here: http://news.ycombinator.com/item?id=2165963)? Doesn't Google use Chrome to get traffic statistics, through the opt-in "send usage statistics" and the malicious site protection?
Sorry, but Google drives traffic to their sites. That's what a search engine is supposed to do. Msft just scrapes Google's results and presents the data as its own.
Then why are newspapers not so happy about it?
And, BTW, just to be clear, Msft can't "scrape". That would violate robots.txt.
Rupert Murdoch and his kin are shortsighted, blustering fools when it comes to the 'net. Relying on their attitude to make your point is counterproductive at best.
I saw that Peter Kasting from the Chrome team commented on this question at http://www.mattcutts.com/blog/google-bing/#comment-712619 . Here's what he said "I work on Chrome and we absolutely do NOT collect clickstream data through Chrome. Not even when you turn on the off-by-default “anonymous usage statistics”."
Google Analytics knows that the search term 'autodesk revit devlopers guide' on Bing led someone to my blog. I take it this information is in the HTTP header on the request to my site which the Google Analytics code reads.
If Google were to use Google analytics information in their search results, how would that be any different to what Bing is doing? Or is the distinction that Google claims not to do this?
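On the mechanics: the search term is simply sitting in the referring URL, so any script on the landing page can pull it out. A minimal Python sketch follows, assuming the referrer is a Bing results URL carrying the query in a `q` parameter; the function name and the example URL are hypothetical, and this is not Google Analytics' actual code.

```python
from urllib.parse import urlsplit, parse_qs


def search_term_from_referrer(referrer):
    """Return the search term if the referrer is a Bing results page, else None."""
    parts = urlsplit(referrer)
    host = parts.hostname or ""
    if not (host == "bing.com" or host.endswith(".bing.com")):
        return None
    # parse_qs decodes '+' and %-escapes, giving back the typed query.
    terms = parse_qs(parts.query).get("q")
    return terms[0] if terms else None


referrer = "http://www.bing.com/search?q=autodesk+revit+devlopers+guide"
print(search_term_from_referrer(referrer))  # autodesk revit devlopers guide
```

Extracting the term is trivial either way; the question in this thread is what each company does with it afterwards.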
"[Google] Search Quality in general does not use Google Analytics in ranking ... You can use Google Analytics, you can not use Google Analytics, it won't affect your ranking within Google Search results." It's dated middle of last year, I guess it's possible that something has changed, but nothing I'm aware of.
As Paul said, customers don't care. All they are doing is giving Bing some front and center advertising on its blog (which has several non-tech readers), and the tech people who actually care probably don't care enough to actually switch search engines.
Google didn't fire right off the bat with the hard-hitting blog entry, but instead basically gave a more detailed version of the same thing to Danny Sullivan. They wanted to see how Microsoft would react before going official with it, because even though Microsoft's response was predictable, there's always a chance that Microsoft would have surprised everyone with their response. (They didn't, in my opinion.)
What's struck me most about this story as it has developed throughout the day is that Google's actions are very deliberate and planned.
I wouldn't consider Google as continually mentioning Bing, either; in fact, I don't think they've paid much attention at all to them. Put Bing in the search box on their official blog, and you'll see that this is the only post specifically about Bing -- a perusal of older posts indicates that the rest are hitting on comments or TrackBacks (i.e., the background image misfeature).
In this case, I don't know how this discussion (verging on disagreement) could occur without mentioning the competition.
Google should be more careful here: either it's OK to repurpose other sites' content or it's not, and Google has built their entire business around repurposing content. They shouldn't be surprised when their competitors start doing the same.
The way that they describe the approach, it seems like the Bing Toolbar would also be scraping results from Bing itself, Yahoo, AltaVista, Ask.com and many others.
Isn't Yahoo search powered by Bing anyway?
"Opt-in programs like the [Bing] toolbar help us with clickstream data, one of many input signals we and other search engines use to help rank sites."
“We do not copy Google’s results.”
I see MS denying _copying_, not denying _using_ Google search results. That makes the title of the Google blog post incorrect.
Microsoft inadvertently benefits from Google's research by simply watching and recording how people use Google. The end result is a Microsoft product that isn't as good as its competition, but it's good enough for some people. Sound familiar?
It's a classic case of true innovation vs. "Microsoft" innovation.
Similarly, it's plagiarism if you take a Harry Potter book and publish your own version with the names changed, but James Patterson's "Witch & Wizard" has a copyright of its own despite being rather similar in concept.
(Edited to remove question about phrasing thanks to atularora's clarification.)
TripAdvisor says "Google, don't copy our reviews for Google places."
Google says "The only way we won't copy your content is if you opt out completely."
TA says "We can't do that, you're the only search engine there is."
Google just laughs maniacally.
I know of some other media companies that are hyper-paranoid about their mass produced, widely disseminated, public content being "stolen" by others, maybe Google should set up a lunch date with the RIAA.
I'm not saying that they should or shouldn't do something like this, but it seems very effective. They have a perfect narrative for people to wrap their heads around, and even if nothing illegal was done, it still feels like Microsoft is doing something "wrong".
For example, if I don't chain my bike it doesn't mean anyone is allowed to steal it. Taking my bike is still a crime. How I protected it has nothing to do with the criminal act.
This is merely Google whining that Bing is delivering sort of good results because that threatens Google.
Whether this is a crime or not would probably be a copyright issue. If Google had a valid legal claim I bet they'd make it in court.
However, it's not that what Bing did is not criminal, it's that it's a non-issue; there is nothing wrong with keeping an eye on competitors directly or indirectly and implementing good ideas or products that you don't have. Everyone does this, it's just normal business.
I don't know why Google cares or is making a big deal out of it, it just seems like whining to me, like they are mad that Bing can just manually compile a list of good results, even if those results come partially from Google. Sorry Google, that's just the nature of the format you're in and the game you're playing. A list of automatically-generated links with no custom or special content, much less a single link, would be a hard case to claim copyright protection, especially since for most actual queries it would be difficult to prove that Bing couldn't have come up with it independently.
I see nothing wrong with it and nothing unfair about it. Do you think the Google guys developed and tweaked their algorithms in a vacuum when they were starting out? You don't think they ever brought up AltaVista or Yahoo! results for comparison and tweaked until they got the same (better) results? I've heard several times that Google constantly has people manually tweaking queries and results, totally extra-algorithmically, to make sure they provide good results for everything possible. So what if Bing does this too? Bing is even being accused of something much less direct. I just see no problem at all, it's just the way this is played.
I'm curious how you'd implement that idea.
In the 1990s, it probably took a lot of iterations, user studies, and market research to decide that copy/paste, undo, etc. were the "right" set of features to include in a word processor. Do you think Google Docs re-did all that research? No, of course not. They probably just looked at Word and said, "we need to support these features". And there's nothing wrong with that. This is the exact same thing.
If you have a product out in the market, it's fair game for your competitors to look at it, analyze its strengths and weaknesses, and use those to improve their own products.
It's understandable why Google's concerned, because it's likely that Microsoft has access to a lot more of this data due to their OS and software's ubiquity.
"Why are we getting so much traffic from people searching for 'delhipublicschool40 chdjob'?"
"Is hiybbprqag some new band the kids like? Why don't we have tickets for them?"
OK, over the course of several weeks 20 Google engineers were able to inject 7/100 false searches into Bing's database. That is structured more like a brute force attack than a scientific experiment. Is Google really surprised that SEO works? The blog contains nothing significant about methodology - no control groups, no restrictions on automation, no limits on methods used. In other words, what this shows is that 20 Google engineers were able to hack Bing and that they did so for PR purposes.
That's not my understanding of what Google did at all. Google fed back search results for keywords that didn't exist on the Internet -- period, and they started eventually showing up in Bing.
Because what you describe would be underhand and probably a violation of Google's ToS that could cost them very dearly in PR and money (in a court case).
They have access to user data that users say they can access (click-through or whatever) but they don't have access to Google search results directly unless Google allow this (which I've not checked but can't imagine they do allow).
The grey area is that Google's ToS relates to their relationship with their clients (people who search using Google), do they disallow their clients if they're using MSIE with tracking? Doubtful, if they did (by a technicality say) then they could sue their clients but they couldn't (a priori) sue MS as MS are acting in good faith in their relationship with the same clients (people using MSIE allowing data tracking). The onus would appear to be on the Google users not to have tracking enabled (if indeed Google's ToS disallow such things).
Why not just grep Bing logs ... well, clearly they can tell a lot about a page's relevance to a particular term by seeing how long a user spends on the page after searching for that term and following a link. If the user bounces, then the page is not likely to be high quality. This sort of info won't be easily gathered from Bing logs, if indeed it is possible to get at it at all.
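To make the dwell-time idea concrete, here is a minimal sketch. It assumes a hypothetical log of (query, url, seconds_on_page) records and an arbitrary 10-second bounce threshold; none of these names or numbers come from Bing or Google, and a real system would presumably weigh many more signals.

```python
from collections import defaultdict

BOUNCE_SECONDS = 10  # assumed threshold, purely illustrative


def satisfaction_rate(visits):
    """Return {(query, url): fraction of visits that were not bounces}."""
    totals = defaultdict(int)
    stayed = defaultdict(int)
    for query, url, seconds in visits:
        key = (query, url)
        totals[key] += 1
        # A visit shorter than the threshold is treated as a bounce.
        if seconds >= BOUNCE_SECONDS:
            stayed[key] += 1
    return {key: stayed[key] / totals[key] for key in totals}


# Hypothetical toolbar-style records: (query, clicked url, seconds on page).
visits = [
    ("python tutorial", "a.example/tutorial", 240),
    ("python tutorial", "a.example/tutorial", 95),
    ("python tutorial", "b.example/spam", 3),
    ("python tutorial", "b.example/spam", 12),
]
rates = satisfaction_rate(visits)
```

The point is only that this needs the time spent on the destination page, which a search engine's own server logs would not normally contain but a browser-side toolbar could observe.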
From what I've read, the general consensus seems to be that Microsoft is using IE in conjunction with the Bing toolbar to analyze users' search data. And this is something that worked only on 6 or 7 of the 100 terms that they tried it with? That was enough to incriminate Bing?
Google could've at least tested to see if this behavior is limited to just Google or if Bing was also analyzing other search engines (or even other pages). I would've expected MS to have released something like this, not Google.
I would try it now, but the test has been polluted by all of the news articles.
Also, why do I now want to buy things with "hiybbprqag" printed on them?
Also, Google has used the links provided by hub and search pages to find relevant sites within a niche. They have happily indexed links they discovered on those pages, and then removed or penalized the pages that pointed them to those links. It's OK, of course, because any SERP not provided by Yahoo, Google, or Microsoft is termed "spam".
It's actually more like trap streets on maps.
Google has assisted the user with that action. Bing is only correlating these two individual actions (search and click) by the user to get some additional signals.
This seems like the latest in a series of indications that Google has moved past the innovation stage into the "protecting its turf" stage. That would be a shame.
Does Google use the data I give them? Are they copying me?
If anyone from Google is reading this: this is not a smart move, and it should be ended ASAP.
Combining features 130, 131, 135 and 136, I think it is understandable that what the Google engineers did could give those fake links a boost in the search results. In a way, they cheated the algorithm.
If Bing was outright stealing Google results, all you have to do is:
1. set up the synthetic queries on Google
2. search for them using Bing
Clearly, it took several weeks of Bing toolbar being installed and people going to site X after searching for Y. The Bing toolbar has the right to assume there's a relationship between X and Y. It's a legitimate "ranking" strategy.
... and even then it only worked for 7-9% of "synthetic" nonsense queries.
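A rough sketch of the "people searched for Y, then went to X" signal described above, assuming a hypothetical clickstream of (query, clicked_url) pairs and an arbitrary minimum-support threshold. The URLs and the threshold are illustrative assumptions, not anything known about how the Bing Toolbar actually works.

```python
from collections import Counter


def learn_associations(clickstream, min_support=3):
    """Keep (query, url) pairs observed at least min_support times."""
    counts = Counter(clickstream)
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical opt-in browsing data: (query typed, page visited afterwards).
clickstream = (
    [("hiybbprqag", "tickets.example/seating")] * 4
    + [("hiybbprqag", "other.example/")] * 1
)
associations = learn_associations(clickstream)
```

For a nonsense query there is no other evidence to outweigh the clicks, so a handful of repeated observations is enough to create an association. A support threshold like the one assumed here would be one possible reason why only a small share of the synthetic queries took hold.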
Automated spidering or not, the way this is set up borders on stealing. I can see why they feel the need to complain about the issue.
We gave 20 of our engineers laptops with a fresh install of Microsoft Windows running Internet Explorer 8 with Bing Toolbar installed. As part of the install process, we opted in to the “Suggested Sites” feature of IE8, and we accepted the default options for the Bing Toolbar.
Essentially, the engineers enabled the user tracking features of IE and the Bing Toolbar, ultimately seeding Bing with the desired results. How is that stealing?
On a related note, can this technique be exploited to improve site ranking on Bing?
That was my thought when I heard of all this! I don't know what kind of authentication the bing toolbar does, but this seems ripe for reverse engineering, then pumping fraudulent data to Microsoft through a botnet...
if current_page == "www.google.com":
    """ Steal results from competition """
I think Google are framing the debate well, and perhaps exposing something that isn't exactly privacy-friendly, but to claim that Bing is stealing from Google is similar to me claiming that Google is stealing from me because they indexed the collection of favourite links on my homepage.
If you've denied google access to your site in your robots.txt, then it would be a reasonable claim. (Google has denied bing access in its robots.txt)
I guess it's an interesting point: clearly, denying access to an area in robots.txt suggests that others do not have permission to use that information. However, only the TOS will be definitive on what you can and cannot do. For example, not including a robots.txt file clearly does not waive all rights, but interpreting the TOS is clearly beyond the capability of an automated system.
In this case I would suggest the only safe course of action for Bing would be to have an exclude list of domains and allow anyone to have their own site excluded from this information gathering.
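As a sketch of what such an opt-out could look like, the snippet below combines a hypothetical exclude list with a robots.txt check using Python's standard `urllib.robotparser`. The domain names, the `examplebot` agent string, and the robots.txt rules are all made up for illustration; whether clickstream collection should honor robots.txt at all is exactly the open question in this thread.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

EXCLUDED_DOMAINS = {"search.example"}  # hypothetical opt-out list


def may_use_click(url, robots_lines, agent="examplebot"):
    """Return True only if the URL passes both the exclude list and robots.txt."""
    host = urlparse(url).hostname or ""
    if host in EXCLUDED_DOMAINS:
        return False
    parser = RobotFileParser()
    parser.parse(robots_lines)  # rules supplied as lines, no network access
    return parser.can_fetch(agent, url)


# Made-up robots.txt content for the site being visited.
robots = ["User-agent: *", "Disallow: /private"]

excluded = may_use_click("https://search.example/results?q=x", robots)
blocked = may_use_click("https://blog.example/private/page", robots)
allowed = may_use_click("https://blog.example/post", robots)
```

Note that this only automates the two machine-readable signals. As the comment above says, it cannot interpret a site's TOS.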
When there's only one result, it doesn't matter how you rank it, it will always be the only result.
I believe the word you are looking for is "interweb".
Having the evidence in code would have made the accusation irrefutable.