
Microsoft’s Bing uses Google search results—and denies it - atularora
http://googleblog.blogspot.com/2011/02/microsofts-bing-uses-google-search.html
======
andrewljohnson
Setting aside the ethical questions, because I don't really care, when I look
at the probable outcomes of this, I think it's wise for Google to point out
what's going on here. This string of stories positions Google as the smart,
sciency search engine, and Bing as a collection of hacks. This is how I'd want
the public to perceive the battle if I were Google.

But even though this makes Google look good, PR-wise, Bing should still use
this trick, if it makes their search results better. It seems like a short
term solution, but a good one to get their results more competitive, while
they work on the core problems Google has already solved. Google should call
them on it and expose their hackery, so people know where the good search
science still comes from, but Bing should still do it. They are both playing
the game very rationally.

As an aside, I don't buy the arguments of "they shouldn't be mentioning Bing."
This isn't like the POTUS running against some no-name congressman - this
battle is already well-publicized, via hundreds of millions of dollars of ad
buys by Microsoft, so the general public already knows there is a competition
between Bing and Google.

~~~
kmavm
Especially in the case of spelling correction, it is not so much that Google
has "solved hard problems" to get the long-tail right, as that they have a
monopoly on the relevant data. Unlike all their competitors, Google has 12
years of the entire history of queries and clicks to mine for signals about
how to rewrite queries. Even if you have all of Google's _algorithms_, it is
technologically impossible to build a better query rewriter, because you don't
have their _data_. You can't buy this data short of buying Google, and if you
believe Peter Norvig, it's an irreplaceable component of Google's quality
advantage.

Microsoft, and any other would-be competitor, would essentially be committing
suicide not to try to make up this data gap. If their toolbar is opt-in on the
part of users, and you agree with me that my click history is mine to share
with Microsoft if I so choose, this is _helping consumers_. Without some of
this data, building a viable competitor to Google is impossible, and consumers
do benefit from competition in web search.

_Disclaimer_: I work in Facebook search. Not the same thing as web search,
and I don't really care whether Bing or Google "wins", though I'm temporarily
rooting for Bing because as a user I want better, more competitive web search.

~~~
andrewljohnson
Exactly... to beat Google, you need time and data. Or you can do what MS did
to fill the gap for a while.

~~~
kmavm
Actually, time won't do it either. Unless you somehow compete with Google
today, the data will never come.

Google only gets the query volume it does because it is the quality leader.
The query volume itself helps Google to retain its quality lead. Google likes
to portray search quality as being algorithm-driven, and it is to some extent,
but in the modern era quality is also about _collaborative filtering with
clicks_. If you don't have the users, you don't see the clicks, and you can't
have the quality. Web search is a natural winner-take-all monopoly, unless
someone gets creative, which is what Microsoft seems to have done.

~~~
metamatt
What volume do you really need, to get enough data to learn from? I'd think
that 1% of Google traffic would still be a pretty big firehose to feed
whatever learning algorithm you need to feed.

Don't Google, Facebook, et al run a lot of experiments for new projects on a
subset of users/queries that's far smaller than 1% of traffic, and still
yields very useful results?

~~~
kmavm
In the case of spelling correction and query expansion, every little bit
helps. Suppose you want to learn that people typing [mazad] mean [mazda].
(This is kind of a silly example, as dictionary- and edit-distance-based
techniques can do corrections like this. So bear with me.) The event you need
to catch is:

1\. User mistakenly types a query [mazad], meaning [mazda]. (Probably less
than 1% of total queries for Mazda, which is an infinitesimally tiny fraction
of the total queries in your system.)

2\. The user gets garbage results, _and the user realizes their mistake and
fixes it_, rather than giving up in frustration. This is probably rather rare
too, though.

3\. The user clicks through something that ranked highly for Mazda, and stays
there long enough that your system thinks it is a "long click" that probably
satisfied the user.

The golden datum here is literally a one in very-many-thousands-of-sessions
event, and you need to catch a statistically meaningful number of them for
every misspelling (or synonym, or whatever you're trying to learn from this
data) you'd like to have your system learn. To have good coverage of the
English language, we're talking about many billions of search sessions.
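The three-step event above can be sketched as a small log-mining pass. This is only an illustration under stated assumptions: the session format (a list of events with `query`, `clicked_url`, and `dwell_seconds` fields), the 30-second "long click" threshold, and the function names are all hypothetical and not taken from any real search engine.

```python
from collections import Counter

# Hypothetical threshold: dwell time (seconds) treated as a satisfying "long click".
LONG_CLICK_SECONDS = 30


def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def mine_rewrites(sessions, max_distance=2):
    """Count (typed query, corrected query) pairs that ended in a long click."""
    counts = Counter()
    for events in sessions:
        # Compare each query with the one that immediately follows it.
        for first, second in zip(events, events[1:]):
            reformulated = (
                first["query"] != second["query"]
                and edit_distance(first["query"], second["query"]) <= max_distance
            )
            abandoned_first = first.get("clicked_url") is None
            long_click = (
                second.get("clicked_url") is not None
                and second.get("dwell_seconds", 0) >= LONG_CLICK_SECONDS
            )
            if reformulated and abandoned_first and long_click:
                counts[(first["query"], second["query"])] += 1
    return counts


# Made-up sessions: one contains the "golden datum", the other does not.
sessions = [
    [
        {"query": "mazad", "clicked_url": None, "dwell_seconds": 0},
        {"query": "mazda", "clicked_url": "http://example.com/mazda", "dwell_seconds": 95},
    ],
    [
        {"query": "mazda", "clicked_url": "http://example.com/mazda", "dwell_seconds": 40},
    ],
]
rewrites = mine_rewrites(sessions)
```

Only one of the two sessions contributes a pair, which is the point: most sessions carry no signal, so the counts only become meaningful at very large volume.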

A previous commenter pointed out that Yahoo! probably has enough data; I bet
they're right. I don't know if Yahoo! and Bing's technology partnership
included access to such data.

------
seanalltogether
Google should proceed with caution, do they really want to get dragged into a
debate about tracking user actions to influence search results?

~~~
nikster
I think there's a very clear ethical divide.

Using any Google product, you implicitly agree to them, in exchange, using
your data. That's how Google works, and has always worked. That's your payment
for using the service.

With Windows - the operating system that you are using - it's an entirely
different proposition. For one, you've already paid for it. And secondly, you
don't expect the software that you bought to spy on you and give away links
you were clicking on in a Google search results page.

Links in a Bing search - sure! That's how search engines work. But tracking my
clicks on any other web page, by my OS, that's spyware, plain and simple.

~~~
barake
Google clearly stated in their blog post they used IE8 and the Bing toolbar
with settings that provide user experience data to MS (for the Suggested Sites
feature and such). I'm sure Google's toolbar and Chrome phone home if you let
them, too.

Unless I'm missing something just running Windows isn't enough for MS to do
the kind of data collection Google is claiming.

------
ellyagg
The holier-than-thou attitude by Google here dumbfounds me. Android phones and
tablets would not exist in remotely the shape they do today but for the
innovations of Apple. They organized with Apple's competitors to provide an
offering extremely similar in spirit and often in form to what the iDevices
do. There is zero chance Android adopts all the "conventions" it has without
copying Apple. The world was not on a fast track to full phone, bright screen,
touch capacitive displays and gestures and app markets before Apple pioneered
them. iPhone was not the logical inevitable implication of the technology that
had gone before. If Google thinks Bing is not playing fair, Google has 10x as
much to answer for in real damages to Apple for thieving their innovation.

Some would say, "Well, Android has innovated on top of iPhone's precedents."
So has Bing, right? In fact, I'd claim Android owes far more to Apple than
Bing does to Google.

Some would say of Android copying iPhone, "Well, it's fair because we want
competition in the mobile space, not for one company to dominate." Sort of
like how Google dominates search? How much would I love for a true competitor
to Google, so we can test, e.g., their policy of having terrible customer
support.

~~~
cookiecaper
Android existed back in 2003, well before iPhone was available or even
announced:
[http://www.businessweek.com/technology/content/aug2005/tc200...](http://www.businessweek.com/technology/content/aug2005/tc20050817_0949_tc024.htm)

Which is not to say that Apple hasn't contributed ideas to Android, but just
an indication that _some_ people were thinking about good mobile devices
before Jobs came out with the iPhone (which, by the way, is a bright,
colorful, phone-calling PDA, not all that different in principle from late
Palm Pilots/Treos).

~~~
enjo
That last part is key here. Almost everything that the iPhone does existed
before the iPhone. You can find most of the UI concepts throughout the old
Palm app eco-system. Apple did a great job of bringing it all together and
(most importantly) bringing a consistent UI metaphor to it all.

What Apple did was revolutionary, but they stood on a lot of shoulders to do
it.

------
sdrinf
Here's an alternative hypothesis: the bing toolbar might look for explicit
search queries (either strings entered into a textbox, or q=, query=
parameters), and navigation from such pages to external domains. This would
match _all_ "search engines" in the most relaxed meaning of the term: product
search, thesaurus, lexicons, dictionaries, everything; and I'd argue to be a
legit signal for a "general search engine" to match.
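A rough sketch of what that hypothesis would look like in code, assuming the toolbar sees an ordered list of visited URLs. The function names, the example hosts, and the plain hostname comparison used to decide what counts as an "external domain" are assumptions for illustration, not anything known about the Bing toolbar.

```python
from urllib.parse import urlparse, parse_qs

# Parameter names taken from the hypothesis above; a real list would be longer.
QUERY_PARAMS = ("q", "query")


def extract_query(url):
    """Return the search string carried in the URL, or None if there is none."""
    params = parse_qs(urlparse(url).query)
    for name in QUERY_PARAMS:
        values = params.get(name)
        if values and values[0].strip():
            return values[0].strip()
    return None


def query_click_pairs(navigation):
    """Pair each search-like page with the next page visited on another host."""
    pairs = []
    for current, following in zip(navigation, navigation[1:]):
        query = extract_query(current)
        if query is None:
            continue
        # Simplistic "external domain" test: the hostnames differ.
        if urlparse(current).hostname != urlparse(following).hostname:
            pairs.append((query, following))
    return pairs


# Made-up browsing history on hypothetical hosts.
navigation = [
    "http://search.example/results?q=trap+street",
    "http://maps.example.org/trap-street",
    "http://dictionary.example/lookup?query=lexicon",
    "http://dictionary.example/about",
]
pairs = query_click_pairs(navigation)
```

Nothing here names a particular search engine: any page with a `q=` or `query=` parameter is treated the same way, which is what the hypothesis claims.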

(Legit sidenote: Google has, via the use of Analytics data, a mass coverage of
clickstream for the whole web, which are default opt-in, follows you
everywhere, and can identify you uniquely. The Bing Toolbar at least asks
first.)

If this is the case, Google isn't being picked upon; rather, they are merely
the first, who figured this out externally. Cookie for the scientific rigor,
but no cigar for the way they PRd the story. Correlation, after all, does not
equal causation.

~~~
Matt_Cutts
"Legit sidenote: Google has, via the use of Analytics data, a mass coverage of
clickstream for the whole web, which are default opt-in, follows you
everywhere, and can identify you uniquely. The Bing Toolbar at least asks
first."

Google does not use Google Analytics data in any way in our rankings. I've
said that plenty of times before, but it's worth mentioning.

~~~
trevelyan
Honest question: why not? Surely identifying sites that have disproportionate
organic traffic relative to search engine referrals can only be good in
identifying places people actually want to visit online?

As a webmaster I would opt-in for this sort of thing in a heartbeat if I
thought it would help your algorithms understand my site. I'm sure Joel
Spolsky and most other legitimate online publishers would do so too.

~~~
budgi3
Google can already calculate this ratio of organic traffic to search engine
traffic using the Google Toolbar stats - no need for Google Analytics.

------
pbhjpbhj
From [http://searchengineland.com/google-bing-is-cheating-
copying-...](http://searchengineland.com/google-bing-is-cheating-copying-our-
search-results-62914)

> _Suffice to say, Google’s pretty unhappy with the whole situation, which
> does raise a number of issues. For one, is what Bing seems to be doing
> illegal? Singhal was “hesitant” to say that since Google technically hasn’t
> lost anything. It still has its own results, even if it feels Bing is
> mimicking them_

This is actually just IE's "spying" working properly. If an MSIE user that has
allowed Microsoft to see their browsing habits follows a link after a search
then MS are associating that link. This is sensible as it's measuring actual
visits following a given search.

If someone searches for a googlewhack and Bing have no results for that term
then it's natural that MS would then use this data to associate the
googlewhack with the visited page.

Initially I thought this sounded like MS being underhand but really they're
tracking their users and associating their users' search terms with the pages
that they visit - _not_ using this data for search (given they have
permission) would be silly, no?

The flag this waves for me is how easy is it to manipulate Bing results using
false MSIE reports back to MS, anyone know of botnets sending fake data to
boost page rankings??

~~~
narrator
>The flag this waves for me is how easy is it to manipulate Bing results using
false MSIE reports back to MS, anyone know of botnets sending fake data to
boost page rankings??

You sir, have won the thread.

------
neild
The number-one point I take away from this isn't about ethics or what is
"right".

It's that Microsoft has no confidence in Bing. They aren't willing to trust
their algorithms to produce the best search results. They've decided that,
some portion of the time, the single best search result they can return is
whatever Google is returning.

They've given up on trying to be better than Google, and are settling for
being a cheap, off-brand knockoff that rebrands stale Google search results.

That's rather shocking, and I frankly thought the Bing team were better than
that.

~~~
chucknthem
Disclosure: I'm interning in the Bing team at MS but my statements here are
mine only and do not reflect that of Microsoft.

I don't see how this indicates that they've given up on Bing. They're spending
a lot of money on the online services team, even making a loss for the past
several quarters to improve Bing. They're playing the catchup game, so this is
a quick and easy way to stay competitive while they get their algorithms up to
scratch. It's better than the alternative of losing all their
customers/market share and then have no data to help them improve.

I see this more as a compliment to Google, even though Google certainly
doesn't see it that way, and I can definitely understand their frustration.

~~~
Matt_Cutts
"[Bing is] playing the catchup game, so this is a quick and easy way to stay
competitive"

Since you're interning in the Bing team, I'll ask: did a lot of people at Bing
know about Google's rankings as a data source in Bing? I'll understand if you
can't answer, but I'm genuinely curious.

------
chaosmachine
Will any Bing users suddenly switch to Google because of this? Probably not.
Will people who've never heard of Bing be reading about them in the paper
tomorrow? Yes. Will complaining about competition from underdogs make Google
look bad to some? Yes.

I don't see how making this a public issue is a win for Google. Seems like
something they should have kept in their back pocket. "Keep your enemies
closer", as they say.

~~~
alecco
If the privacy bit is properly reported in MSM, it should definitely raise
some concerns. Think corporations with sensitive data and all that.

~~~
chaosmachine
The thing is, Google monitors which search results you click, too. Bing's
toolbar just does it via browser integration.

Google can and does track clicks from Bing via Google Analytics. Every time
you click a result on Bing and land on a page using Google Analytics, Google
knows about it, and they record your Bing search terms from the referrer. The
same is probably true for pages with Google ads.

~~~
alecco
But as Matt Cutts pointed out, there's a very clear red warning on Google
Chrome while IE just silently gives you a very unsuspicious disclaimer.

~~~
teaspoon
What does Chrome have to do with this mode of tracking? Google Analytics will
capture a referrer from any browser that includes it, and no browser that I
know of displays a "red warning" when including it.

~~~
kissickas
NoScript provides a yellow warning, and is by default opt-out.

------
tristanperry
Good for Google.

Bing have done wrong (granted probably not legally), and their response to a
very detailed Search Engine Land article was a quick, nonchalant _'Huh? Oh
that. Yeah, we don't copy Google's results. I know that doesn't really answer
the claims but we don't really care enough to give a proper response.'_

Bing's actions here (and their response) have seemed very poor and I definitely
praise Google in going public with this.

I'd certainly like to think that if I was in a position where I caught a
competitor piggybacking off my work, I'd go public with the information too
(in a non-confrontational manner of course, as Google are doing).

So yeah: good for Google. Bad for Bing.

------
ot
As many others have said, I don't think that using click data from the
browser/toolbar as one of thousands of signals can be considered "copying". When
doing a query with a nonexistent word, all the other signals are zero because
there is no knowledge about the query, so the only remaining one is the
history of clicks "sniffed" from Google/etc... SERP. OTOH, on real world
queries the signal has probably a relatively low weight.
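To make the argument concrete, here is a toy weighted-sum ranker. It is a sketch only: the three signal names, the weights, and the scores are invented for illustration, and real rankers are not simple linear sums.

```python
# Hypothetical weights: click history is deliberately the weakest signal.
WEIGHTS = {"text_match": 0.5, "link_authority": 0.4, "click_history": 0.1}


def score(signals, weights):
    """Weighted sum of whatever signals are present (missing ones count as 0)."""
    return sum(weight * signals.get(name, 0.0) for name, weight in weights.items())


def rank(candidates, weights):
    """Return candidate names ordered from highest to lowest score."""
    return sorted(candidates, key=lambda name: score(candidates[name], weights), reverse=True)


# Real-world query: the usual signals dominate and clicks barely matter.
real_query = {
    "page_a": {"text_match": 0.9, "link_authority": 0.8, "click_history": 0.1},
    "page_b": {"text_match": 0.2, "link_authority": 0.1, "click_history": 1.0},
}

# Nonsense query: nothing is known except one observed click.
nonsense_query = {
    "page_a": {},
    "page_b": {"click_history": 1.0},
}

real_order = rank(real_query, WEIGHTS)
nonsense_order = rank(nonsense_query, WEIGHTS)
```

With these made-up numbers, `page_a` wins the real query despite `page_b` having far more clicks, while `page_b` wins the nonsense query because the click signal is the only nonzero term.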

I don't think that it is a secret that Bing uses click data from
browser/toolbar as a signal, it's just a not well known fact. For example in
the paper "Learning Phrase-Based Spelling Error Models from Clickthrough Data"
(<http://aclweb.org/anthology/P/P10/P10-1028.pdf>) by Microsoft Research, they
explain how to improve the spelling corrections by using click data from
_"other search engines"_.

~~~
Matt_Cutts
The paper you mentioned appears to be saying that Microsoft is extracting
spell corrections via clicks on Google. That's pretty surprising news.

I just pulled down the paper and noticed this: "The clickthrough data of the
second type consists of a set of query reformulation sessions extracted from 3
months of log files from a commercial Web browser .... In our experiments, we
"reverse-engineer" the parameters from the URLs of these sessions, and deduce
how each search engine encodes both a query and the fact that a user arrived
at a URL by clicking on the spelling suggestion of the query – an important
indication that the spelling suggestion is desired"

Some of the recent discussion has been about whether Microsoft looks at lots
of different sites vs. doing something special or different for Google. This
paper very much sounds like Microsoft reverse engineered which specific url
parameters on Google corresponded to a spelling correction? Figure 1 of that
paper looks like Microsoft is using specific Google url parameters such as
"&spell=1" to extract spell corrections from Google.
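For readers wondering what "using a specific URL parameter" means in practice, a minimal sketch follows. The `spell=1` flag is the one mentioned above; the `q` parameter name, the example host, and the function name are assumptions, and this is not the paper's actual code.

```python
from urllib.parse import urlparse, parse_qs


def spelling_suggestion_click(url):
    """Return the suggested query if the URL marks a spelling-suggestion click."""
    params = parse_qs(urlparse(url).query)
    # Assumed encoding: "spell=1" flags the click, "q" carries the corrected query.
    if params.get("spell") == ["1"] and params.get("q"):
        return params["q"][0]
    return None


# Hypothetical URLs on a made-up host.
accepted = spelling_suggestion_click("http://search.example/search?q=mazda&spell=1")
ordinary = spelling_suggestion_click("http://search.example/search?q=mazad")
```

A rule like this has to be written per search engine, since each one encodes the event differently.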

Targeting Google specifically is quite different than using lots of clicks
from different places. It looks like you work at Microsoft--can you say any
more about this?

~~~
ot
> The paper you mentioned appears to be saying that Microsoft is extracting
> spell corrections via clicks on Google.

Well, no, that's a research paper that says that they have made experiments in
that direction, but this doesn't imply that this is currently done in Bing.
But it gives a hint about what kind of data is available from the _"log files
from a commercial Web browser"_.

> Targeting Google specifically is quite different than using lots of clicks
> from different places.

From the article, they have handcrafted rules for both Google and Yahoo, that
together with Bing have (I think) 95% of the market. I'd say they are not
targeting Google, they are targeting the majority search engine users. There
just happen to be only 3 major search engines, so a few handcrafted regexes
are sufficient.

I wouldn't be surprised if Google Maps has handcrafted (or manually tuned)
scraping code to extract reviews from Yelp and other major review sites, and
same for Google News for the extraction of the news body from the major online
news sources. How is this different?

> It looks like you work at Microsoft--can you say any more about this?

Yeah, I should have been more clear about this. I am interning at MSR and have
some involvement with Bing (and actually worked there last year), but my
comments are personal and about facts that are public.

BTW, IMHO using the click logs can't be considered "copying", more like "a way
to discover new sites to crawl and the keywords that lead to them". This is
not copying the SERP results.

Since it "looks like" you work at Google :) can you answer this question (it
was also asked here: <http://news.ycombinator.com/item?id=2165963>)? Doesn't
Google use Chrome to get traffic statistics, through the opt-in "send usage
statistics" and the malicious site protection?

~~~
tejaswiy
>I wouldn't be surprised if Google Maps has handcrafted (or manually tuned)
scraping code to extract reviews from Yelp and other major review sites, and
same for Google News for the extraction of the news body from the major online
news sources. How is this different?

Sorry, but Google drives traffic to their sites. That's what a search engine
is supposed to do. Msft just scrapes Google's results and presents the data as
its own.

~~~
ot
> Sorry, but Google drives traffic to their sites. That's what a search engine
> is supposed to do.

Then why are newspapers not so happy about it?
<http://www.guardian.co.uk/media/2009/nov/09/murdoch-google>

And, BTW, just to be clear, Msft can't "scrape". That would violate
robots.txt.

~~~
kelledin
> Then why are newspapers not so happy about it?

Rupert Murdoch and his kin are shortsighted, blustering fools when it comes to
the 'net. Relying on their attitude to make your point is counterproductive at
best.

------
rodh257
I'm a bit confused - see this image, from Google Analytics on my blog.
<http://i.imgur.com/oWK8q.png>

Google Analytics knows that the search term 'autodesk revit devlopers guide'
on Bing led someone to my blog. I take it this information is in the HTTP
header on the request to my site which the Google analytics code reads.

If Google were to use Google analytics information in their search results,
how would that be any different to what Bing is doing? Or is the distinction
that Google claims not to do this?

~~~
gregable
From [http://analytics.blogspot.com/2010/07/will-using-google-
anal...](http://analytics.blogspot.com/2010/07/will-using-google-analytics-
have.html)

"[Google] Search Quality in general does not use Google Analytics in ranking
... You can use Google Analytics, you can not use Google Analytics, it won't
affect your ranking within Google Search results." It's dated middle of last
year, I guess it's possible that something has changed, but nothing I'm aware
of.

------
c2
Isn't it bad form from a marketing perspective to continually mention their
top competitor? Does Apple mention android so extensively in their press?

As Paul said, customers don't care. All they are doing is giving Bing some
front and center advertising on its blog (which has several non-tech readers)
and the tech people who actually care probably don't care enough to actually switch
search engines.

~~~
jedsmith
Well, if you notice, they toed the water first.

Google didn't fire right off the bat with the hard-hitting blog entry, but
instead basically gave a more detailed version of the same thing to Danny
Sullivan. They wanted to see how Microsoft would react before going official
with it, because even though Microsoft's response was predictable, there's
always a chance that Microsoft would have surprised everyone with their
response. (They didn't, in my opinion.)

What's struck me most about this story as it has developed throughout the day
is that Google's actions are very deliberate and planned.

I wouldn't consider Google as _continually_ mentioning Bing, either; in fact,
I don't think they've paid much attention at all to them. Put _Bing_ in the
search box on their official blog, and you'll see that this is the only post
specifically about Bing -- a perusal of older posts indicates that the rest
are hitting on comments or TrackBacks (i.e., the background image misfeature).

------
ecopoesis
It's funny that Google has a problem with Microsoft using their content (the
search rankings), yet has no problem taking content from places like TripAdvisor
([http://www.tnooz.com/2010/12/14/news/tripadvisor-shrugs-
off-...](http://www.tnooz.com/2010/12/14/news/tripadvisor-shrugs-off-reports-
google-blockade-still-in-place/)) and newspapers
([http://www.google.com/url?sa=t&source=web&cd=5&s...](http://www.google.com/url?sa=t&source=web&cd=5&sqi=2&ved=0CDkQFjAE&url=http%3A%2F%2Farstechnica.com%2Fold%2Fcontent%2F2006%2F02%2F6095.ars&ei=jsRITYfAHoiq8Aag6NSsBg&usg=AFQjCNFtRwakdQRICDmGh4YSIRm5xXN1TA&sig2=DPLaZ5dASlNDaZKiEc4O5A)),
even when those companies specifically ask Google to not.

Google should be more careful here: either it's OK to repurpose other sites'
content or it's not, and Google has built their entire business around
repurposing content. They shouldn't be surprised when their competitors start
doing the same.

------
droz
I think they (google) can really only cry foul if there is specific code in
the Bing toolbar that targets google's search results.

The way that they describe the approach, it seems like the Bing Toolbar would
also be scraping results from bing itself, yahoo, altavista, ask.com and many
others.

~~~
mishmash
>Bing Toolbar would also be scraping results from bing itself, yahoo

Isn't Yahoo search powered by Bing anyway?

------
jeroen
These seem to be the relevant parts of MSs responses:

"Opt-in programs like the [Bing] toolbar help us with clickstream data, one of
many input signals we and other search engines use to help rank sites."

“We do not copy Google’s results.”

I see MS denying _copying_, not denying _using_ Google search results. That
makes the title of the Google blog post incorrect.

------
thought_alarm
Google developed an impressive spell correction and error detection algorithm
to improve their search results.

Microsoft inadvertently benefits from Google's research by simply watching and
recording how people use Google. The end result is a Microsoft product that
isn't as good as its competition, but it's good enough for some people. Sound
familiar?

It's a classic case of true innovation vs. "Microsoft" innovation.

------
atularora
Some perceive Google's stand as hypocrisy, e.g.
[http://twitter.com/#!/counternotions/status/3256864602692403...](http://twitter.com/#!/counternotions/status/32568646026924032)

~~~
chc
That tweet is drawing a very specious connection. Android did not, as far as I
can tell, copy any of Apple's algorithms or piggyback on top of them. It is a
novel implementation of some of the same _ideas_ in the iPhone.

Similarly, it's plagiarism if you take a Harry Potter book and publish your
own version with the names changed, but James Patterson's "Witch & Wizard" has
a copyright of its own despite being rather similar in concept.

(Edited to remove question about phrasing thanks to atularora's
clarification.)

~~~
ecopoesis
While Android is obviously inspired by iOS, it's not a direct copy. A better
example of Google's hypocrisy is their outright copying news articles into
Google News despite the source companies asking them not to.

~~~
cma
Headlines and snippets, and they respect robots.txt. The full articles they
have are licensed from the AP.

~~~
ecopoesis
What about the case of TripAdvisor?

TripAdvisor says "Google, don't copy our reviews for Google places."

Google says "The only way we won't copy your content is if you opt out
completely."

TA says "We can't do that, you're the only search engine there is."

Google just laughs maniacally.

[http://www.tnooz.com/2010/12/08/news/google-places-
blocked-f...](http://www.tnooz.com/2010/12/08/news/google-places-blocked-from-
using-tripadvisor-reviews/) [http://www.tnooz.com/2011/01/11/news/tripadvisor-
content-on-...](http://www.tnooz.com/2011/01/11/news/tripadvisor-content-on-
google-still-restricted-but-talks-continue/)

~~~
chc
I don't see how that's relevant at all. Google gave them the option of not
being indexed. They decided they would rather be indexed by Google than not.
The complaint that TripAdvisor doesn't get precise, fine-grained control over
what Google does with its index seems like a fairly different issue.

------
cookiecaper
I think this is silly. Unless Google can come up with a copyright claim on its
search results, and I seriously doubt they can, they have nothing to complain
about. I use Google's search results too -- information you pump out publicly
can be used to the advantage of your customers as well as your competitors. If
Google is scared that Bing is "stealing" their search results, they should
quit making those results public in a way where people can "steal" them.
Accept that freely available information is freely available or clamp down and
stop publishing information that might help your competitors. In Google's
case, unfortunately, the info that helps Bing is also the info that is
essential to Google's customers.

I know of some other media companies that are hyper-paranoid about their mass
produced, widely disseminated, public content being "stolen" by others, maybe
Google should set up a lunch date with the RIAA.

~~~
jedsmith
> If Google is scared that Bing is "stealing" their search results, they
> should quit making those results public in a way where people can "steal"
> them.

I'm curious how you'd implement that idea.

~~~
cookiecaper
It's practically impossible for Google.

------
aneesh
Why do people think there's anything wrong here? Here's a (hypothetical)
similar example:

In the 1990s, it probably took a lot of iterations, user studies, and market
research to decide that copy/paste, undo, etc were the "right" set of features
to include in a word processor. Do you think Google Docs re-did all that
research? No, of course not. They probably just looked at Word and said, "we
need to support these features". And there's nothing wrong with that. This is
the exact same thing.

If you have a product out in the market, it's fair game for your competitors
to look at and analyze its strengths and weaknesses, and use those to improve
their own product.

~~~
j_baker
There's a difference between looking at word's copy/paste and deciding you
want to make your own and making your copy/paste a hack around Word's
copy/paste.

------
radicaldreamer
This is an excellent short term PR tactic, but Microsoft can just say that
they're not copying Google's results, they're just using user click data to
improve their search results and that sometimes that click data happens to
come from users' google searches.

It's understandable why Google's concerned, because it's likely that Microsoft
has access to a lot more of this data due to their OS and software's ubiquity.

------
callmevlad
I can only imagine the confusion the Clyde-Findlay Area Credit Union SEO team
is going through right now:

"Why are we getting so much traffic from people searching for
'delhipublicschool40 chdjob'?"

~~~
juddlyon
I thought the same thing about Team One Tickets:

"Is hiybbprqag some new band the kids like? Why don't we have tickets for
them?"

------
dfranke
More interesting than anything Microsoft is doing here is Google's answer to
it. If Microsoft caught Yahoo doing this, they'd bury them in lawyers. Google
is confident enough to just go public and take the PR win.

~~~
recoiledsnake
I guess they gave up the idea of suing because they were unlikely to win. Even
if Bing was directly scraping Google's results from Bing's servers, it's iffy
because robots.txt is not something that's enforceable, and listings of things
are not copyrightable. See <http://en.wikipedia.org/wiki/Feist_v._Rural>

------
d0mine
It is unclear whether the Google results appear in Bing only on the same computer
the initial queries were made from (i.e., it is a personalization of search
results) or whether Bing uses those results for other users too.

------
brudgers
> _"We gave 20 of our engineers laptops"_

OK, over the course of several weeks 20 google engineers were able to inject
7/100 false searches into Bing's database. That is more structured like a
brute force attack than a scientific experiment. Is Google really surprised
that SEO works? [edit]The blog contains nothing significant about methodology
- no control groups, no restrictions on automation, no limits on methods used.
In other words, what this shows is that 20 Google Engineers were able to hack
Bing and that they did so for PR purposes.[/edit]

------
enthalpyx
"so Google set up a honeypot – some made up words like [hiybbprqag] linking to
random unrelated sites."

That's not my understanding of what Google did at all. Google fed back search
results for keywords that didn't exist on the Internet -- period, and they
started eventually showing up in Bing.
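
A rough sketch of how a check like this could be scripted, for anyone who wants to see the shape of it. This is not Google's actual method; the function names, the example URLs, and the `observed_results` mapping are all made up for illustration, and nothing here talks to a real search engine.

```python
import random
import string


def make_honeypots(planted_urls, seed=0, length=10):
    """Pair each planted URL with a unique random nonsense query term."""
    rng = random.Random(seed)
    honeypots = {}
    for url in planted_urls:
        term = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        # Regenerate on the (very unlikely) chance of a duplicate term.
        while term in honeypots:
            term = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        honeypots[term] = url
    return honeypots


def leaked_terms(honeypots, observed_results):
    """Return honeypot terms whose planted URL appears in another engine's results.

    observed_results maps a query term to the list of URLs the other engine
    returned for it (hypothetical input, collected however you like).
    """
    return sorted(
        term
        for term, url in honeypots.items()
        if url in observed_results.get(term, [])
    )
```

Because the terms are random strings that exist nowhere else, any match reported by `leaked_terms` is hard to explain by ordinary crawling.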

------
eps
What strikes me as odd is why Microsoft would bother with making a sneaky toolbar
that calls home instead of just grepping through their Bing logs for queries
with no results and then running these queries against Google...

~~~
akavlie
Well, Windows doesn't come with grep, so...

~~~
akavlie
Well I guess THAT joke fell flat. Oh well...

------
Athtar
Seriously? Anybody else surprised at how little data there is given how
serious the accusations are?

From what I've read, the general consensus seems to be that Microsoft is using
IE in conjunction with the Bing toolbar to analyze users' search data. And
this is something that worked only on 6 or 7 of the 100 terms that they tried
it with? That was enough to incriminate Bing?

Google could've at least tested to see if this behavior is limited to just
Google or if Bing was also analyzing other search engines (or even other
pages). I would've expected MS to have released something like this, not Google.

------
kcdenman
Another example like this that has just come out - Qwiki by Eduardo Saverin,
co-founder of Facebook. The new search engine pulls data for its results
directly from Wikipedia and adds its own flair. The open source text comes
directly from Wikipedia.

Qwiki (see contents): <http://www.qwiki.com/q/#!/George_VI_of_the_United_Kingdom>

Wikipedia: <http://en.wikipedia.org/wiki/George_vi_of_the_united_kingdom>

------
dholowiski
Did they do the same test with blekko or duckduckgo too? It would be
interesting to see if Microsoft is the only one doing this.

I would try it now, but the test has been polluted by all of the news
articles.

------
wardrox
What a very clever way to test their theory. I'm very curious to see how
Microsoft explain this one.

Also, why do I now want to buy things with "hiybbprqag" printed on them?

~~~
Gibbon
It's not really that clever.. it's the search equivalent of a trap street in
cartography.

<http://en.wikipedia.org/wiki/Trap_street>

------
mrnothere
As many have pointed out, Google has used clickstream data from their toolbar
for a number of years.

Also, Google has used the links provided by hub and search pages to find
relevant sites within a niche. They have happily indexed links they discovered
on those pages, and then removed or penalized the pages that pointed them to
it. It's OK, of course, because any SERP not provided by Yahoo, Google, or
Microsoft is termed "spam".

------
caf
"You can think of the synthetic queries with inserted results as the search
engine equivalent of marked bills in a bank."

It's actually more like trap streets on maps.

------
akshat
While this is not completely Black and White, one thing which favours Bing is
that a user has clicked on a search result which is determined using his own
intelligence.

Google has assisted the user with that action. Bing is only correlating these
two individual actions (search and click) by the user, to get some additional
signals.

------
kaze1
Never mind the ethics part. The fact that Bing's 'signal gathering' mechanism
can be fooled into accepting some bogus links, without even an iota of content
verification, illustrates a fatal flaw. I am sure developers at MS are capable
people, but this throws poor light on them (well, at least in my eyes).

------
mmaunder
Well, at least Bing will get a pagerank boost from all the new backlinks in
the press coverage.

------
littlestove
How about this analogy? Suppose Company A publishes a book and Company B
somehow manages to get the content of the book from its readers (with their
consent). Then B publishes the same content as if it were original content.
How does that sound to you?

------
bayjinger
Think I'll just quote Mike Masnick's post over at techdirt, since he sums it
up so well: "For Google to attack a competitor for using open information on
the web -- the same way it does -- seems like the height of hypocrisy. It's
fine for Google to crawl and index whatever sites it wants in order to set up
its ranking algorithms, but the second someone looks at Google's own rankings
as part of their own determination, suddenly it's "cheating"?

This seems like the latest in a series of indications that Google has moved
past the innovation stage into the "protecting its turf" stage. That would be
a shame."
[http://www.techdirt.com/articles/20110201/11022312911/google...](http://www.techdirt.com/articles/20110201/11022312911/googles-childish-response-to-microsoft-using-google-to-increase-bing-relevance.shtml)

------
Natsu
They should have made one of the nonsense queries something like
"westealourresultsfromothersearchengines" and linked it to something like
yes.com just to make the copying easier for non-tech people to understand.

------
sliverstorm
I may be misunderstanding what is happening, but is it possible Bing is just
doing what it was programmed to do: examining the user's browsing habits? Is
it necessarily directly and purposely targeting Google?

------
rome
Do I understand this right? Google (over a period of time) knowingly submitted
info to Microsoft. When Microsoft used that data, Google accuses them of
copying?

Does Google use the data I give them? Are they copying me?

~~~
txxxxd
Close, but you didn't understand it quite right. Google knowingly submitted
info to Microsoft to confirm a suspicion that Microsoft was using google.com
search results from _other users_ to populate bing results.

------
jtchang
In a way I see this click tracking that Microsoft is doing as innovation. What
better way to get relevancy than directly from the user and what that user
clicks on?

------
BarkMore
Google could have made this story all the more interesting by using
misspellings of the terms nihilartikel and mountweazel in the honeypot
queries.

------
axod
Google should make their own toolbar that continually sends random bad data
back to bing to screw up their results. I'd install it.

------
jetaries
I think it is a very bad move for Google to get into these types of pointless
arguments. With Google in the market leader position, there is nothing Google
can gain from doing this. Google won't convert Bing users over with these
accusations; all Google is doing is raising awareness for Bing. If anything,
Google would lose users to Bing for doing this.

If anyone from Google is reading this: this is not a smart move, and it should
be ended ASAP.

------
slackgentoo
Take a look at feature list here:
[http://research.microsoft.com/en-us/projects/mslr/feature.as...](http://research.microsoft.com/en-us/projects/mslr/feature.aspx)

Combining features 130, 131, 135 and 136, I think it is understandable that
what the Google engineers did could give those fake links a boost in the search
results. In a way, they cheated the algorithm.

------
MichaelApproved
Is it possible Bing is pulling the results from a third party provider who is
doing the cheating? Maybe a middle man is partially what's to blame for the
delay in the results.

------
kowsik
this is brilliant. all those years of "innovation" ruined by a simple MiTM
tap.

~~~
kowsik
[http://labs.mudynamics.com/2011/02/01/bing-google-and-omg-yo...](http://labs.mudynamics.com/2011/02/01/bing-google-and-omg-you-copied-me/)

------
random42
Google, you are embarrassing yourself.

------
krisrak
use hashtag #BingGate

------
hasenj
I don't see anything wrong Bing is doing. There's clearly an indirect link
between the synthetic query and Google result.

If Bing were outright stealing Google results, all you would have to do is:

1. set up the synthetic queries on Google
2. search for them using Bing

Clearly, it took several weeks of Bing toolbar being installed and people
going to site X after searching for Y. The Bing toolbar has the right to
assume there's a relationship between X and Y. It's a legitimate "ranking"
strategy.
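
To make the X-after-Y idea concrete, here is a minimal sketch of that kind of signal. It assumes the toolbar reports plain (query, clicked URL) pairs, which is my guess and not a documented Bing format; the function names and the `min_clicks` threshold are hypothetical too.

```python
from collections import Counter, defaultdict


def build_click_signals(events):
    """Count how often each URL was clicked after each query.

    events is an iterable of (query, clicked_url) pairs, a stand-in for
    whatever opt-in clickstream a toolbar might report (an assumption).
    """
    signals = defaultdict(Counter)
    for query, url in events:
        # Normalize the query so trivial variations share one bucket.
        signals[query.strip().lower()][url] += 1
    return signals


def rank_for_query(signals, query, min_clicks=1):
    """Return URLs for a query, most-clicked first, dropping weak signals."""
    counts = signals.get(query.strip().lower(), Counter())
    # Sort by descending click count, then by URL for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [url for url, n in ranked if n >= min_clicks]
```

For a query nobody else has ever typed, a handful of clicks is the only signal there is, so it ends up deciding the ranking. For a popular query the same few clicks would be drowned out by everything else.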

~~~
zyb09
Well look, this thing comes preinstalled with Windows, so all the Google
engineers did is go to www.google.com on IE, search for something and - voila
a short time later the results are on bing.com.

Automated spidering or not, the way this is set up borders on stealing. I can
see why they feel the need to complain about the issue.

~~~
qjz
That's not all they did:

 _We gave 20 of our engineers laptops with a fresh install of Microsoft
Windows running Internet Explorer 8 with Bing Toolbar installed. As part of
the install process, we opted in to the “Suggested Sites” feature of IE8, and
we accepted the default options for the Bing Toolbar._

Essentially, the engineers enabled the user tracking features of IE and the
Bing Toolbar, ultimately seeding Bing with the desired results. How is that
stealing?

On a related note, can this technique be exploited to improve site ranking on
Bing?

~~~
wingo
> On a related note, can this technique be exploited to improve site ranking
> on Bing?

That was my thought when I heard of all this! I don't know what kind of
authentication the bing toolbar does, but this seems ripe for reverse
engineering, then pumping fraudulent data to Microsoft through a botnet...

------
maeon3
Google should have kept quiet, and figured out how Microsoft is pulling data
from Google. They could then create a script that would cause Bing to link new
borrowed content to goatse.

~~~
AmazingMe
Bing only sees what Google's customers are seeing, so Google can't do it
without screwing up its own search results. Even worse, it won't affect the top
1 million query strings, since for those Bing is getting much stronger signals
from other sources.

------
pedanticfreak
Why didn't Google's investigation go further? Why didn't they decompile the
IE8 toolbar to figure out what it was really doing? Maybe that's against the
DMCA and Google can't admit to it?

Having the evidence in code would have made the accusation irrefutable.

~~~
jkterhune
Agree. I have to imagine that someone at Google captured the toolbar's HTTP
traffic. Maybe they're holding it back, or maybe it's the same for Google
results as it is for, say, Lycos.

