

Bing versus Google, some observations - ttol
http://jacquesmattheij.com/Microsoft%27s+Bing+versus+Google%2C+some+observations

======
gjm11
Jacques suggests that this is all about those 9 results that Google was able
to "force" into Bing's index. But -- unless Google are simply lying about this
-- the point isn't really those 9 results; they are just the clearest evidence
Google has of shady behaviour by Microsoft.

Google claim that they saw lots of (less obvious) evidence of Bing mining
search results from Google before they began their sneaky test, and the point
of the test was simply to confirm that Microsoft is doing what they thought.

This is not about whether Bing is easy to "game" -- whether Google can get
nonsense into Bing's index by sneaky means. It's about whether _real_ Bing
searches commonly derive their results from Google.

Imagine that I think you're reading my email and using the information in it
to play the stock market (maybe I have secret insider information about some
companies, or something). So I do a test: I arrange to be sent a bunch of
email that, if you acted on it, would make you buy particular companies'
shares that you'd otherwise have no reason even to have heard of. And, lo, you
do that for 10% of the companies involved. Would anyone, looking at that, say
that the real news is that I was unable to "game" your stock market
transactions effectively, and that your spam filters caught 90% of the junk I
tried to inject into your information?

~~~
contextfree
If Google think they have other evidence, let's see it. The burden of proof is
on the accuser, I think.

BTW, that's not necessarily about whether they are "lying". The other evidence
they think they have could be convincing to them (because, e.g., they already
have a strong predisposition to believe that they are inherently far better
than anyone else, and therefore anyone building a competitive search engine
could only possibly be copying them - this is a caricature, but you get the
idea) but not necessarily convincing to others.

~~~
lysium
Why should Google show other evidence? The evidence they have provided is
enough for me to be suspicious: How could Bing possibly come up with the same
odd spell correction without copying from Google?

Yes, it could be a side effect of some machine learning algorithm. But Bing
never explained that, which would have been very easy if it was the case.

~~~
dsmithn
Because they did not prove Bing specifically targets results from Google.

~~~
contextfree
This is also true, but is sort of tangential to the point we were discussing.
(Personally, I think they probably were special-casing Google, at least
insofar as it was one of a list of set URL patterns it knew how to parse.)

------
lysium
While I find the thought interesting, that Microsoft may have google-specific
code to filter out redirects, I don't like the notion that Google should not
complain because it allegedly ignores copyrights.

I see a difference between aggregating content and presenting it _and
mentioning the source_ and just plain copying (such as spell corrections) with
no mention of any source.

~~~
contextfree
I also think there's a big difference ("ethically" - I'm thinking more in
terms of an artistic/creative idea of originality than an academic or legal
definition) between using content from a blog (or whatever) to make a search
engine, and using content from another search engine to make a search engine.

------
BoppreH
The Google redirect is a point I haven't seen raised before, and it's an
extremely valid one. "Anonymous click data", as Bing said, can't account for
what was seen.

But I think the author focused too much on the "9%" part. Who knows what
Google did with the other 91%? Maybe they were trying different approaches,
which actually would be the most sensible thing to do.

~~~
brudgers
> _"Maybe they were trying different approaches"_

That is sensible, but it can easily slide into the ends justifying the means. In
other words, we know that the engineers were tasked with proving that
Microsoft was copying Google. That's a different task than figuring out what
the Bing toolbar does with data from the Google search page.

Given the low success rate, it is not implausible that the engineers pushed
whatever ground rules there were (if any) in the pursuit of evidence. It seems
to me that is the most plausible explanation for the uncertainty about whether
it was 7, 8, or 9 cases.

To go beyond what is easy for the media to report, it is reasonable to expect
that a company in Google's position has twenty or more full time engineers
analyzing their competitor's products. The story about "torsoraphy" really
only makes sense if Google has such a program. Seriously, is anyone surprised
that Google and Bing analyze each other's engineering?

[WildSpeculation] One or more Google engineers tasked with analyzing the
competition discovered the "torsoraphy" connection and identified its correlation
with the Bing toolbar - however, keep in mind that Google has not claimed that
the "torsoraphy" naturally occurred in the wild. [/WildSpeculation]

[GoogleClaim] Based on the "torsoraphy" discovery, one or more Google
engineers hard-coded web pages, presumably with permission from senior
managers, since a leak that it was done casually has such serious blowback
potential. Twenty Google engineers armed with laptops were tasked with
creating top keyword rankings. The project was active for at least two weeks
over the traditional Christmas holiday. [/GoogleClaim]

[WildSpeculation] The Google engineers, surprised at their initial lack of
success, tried increasingly diverse and aggressive methods as time passed,
despite the diversity of IPs they used as they traveled over the Christmas
holidays. Being good hackers, some even tried exotic techniques, perhaps even
creating or manipulating existing web pages to influence the search rankings.
On advice of lawyers, Google is not comfortable accusing Microsoft of copying
in these cases.

A month later, Google plans to announce the Android Marketplace while Apple is
going to proclaim that Rupert Murdoch is the future of journalism. Google is
going head to head with the reigning PR champion of the world. They turn the
experiment into a torrid story and release it on February 1.

Larry Page knows the difference between a Founder from Stanford and a former
IBM'er from a cow college in Alabama. It's no match. Binggate drowns out the
traditional eve-of-event Apple adulation. On the day of the events, the
mudslinging is far sexier than "and it has a hundred pages" for the tech
press.[/WildSpeculation]

How bad was it yesterday for Apple? Stories about Apple's triumph with _The
Daily_ didn't make the front page of HN. The tech press even found Microsoft
more interesting than Apple yesterday. Binggate was Google's attempt to kill two
birds with one stone.

------
brudgers
Jacques brings up an excellent point about how Google regards copyright.
Here's IP of Microsoft directly from Google:

[http://books.google.com/books?id=-ANomF3oARkC&printsec=f...](http://books.google.com/books?id=-ANomF3oARkC&printsec=frontcover&dq=microsoft+press+books&hl=en&ei=hb9KTbyLEoPGlQezx9Ep&sa=X&oi=book_result&ct=book-preview-link&resnum=5&ved=0CFcQuwUwBA#v=onepage&q=microsoft%20press%20books&f=false)

~~~
lysium
Google shows only 50 pages out of about 300 and provides several links to buy
that book, including to Microsoft Press.

So that 'excellent point' only shows that Google even promotes products of its
competitor. I don't think it says anything about Google "ignoring" copyright.
Heck, you could even look into that book at Amazon for free:
<http://www.amazon.com/gp/reader/0735622841>

By your reasoning, every library or friend that shows you a book is
ignoring copyright.

~~~
brudgers
Google's allegation is that Microsoft shows one of several search results.

~~~
tejaswiy
Okay, Google links to pages to buy that book. Revenue goes to Microsoft. You
can see clearly all over the place that it's a Microsoft book. You can browse
where it came from originally.

Microsoft "copies" the results. Does not say where they came from and presents
them as their own. In this context, that would be copying all the text from
that book, removing all the branding and creating a new book saying that
Google wrote it.

------
powertower
I seem to be the only one that keeps scratching his head thinking ... what is
the problem here?

I have no issue whatsoever with Bing analyzing the data sent by _opt-in users_
that is further processed to help improve the relevance of its search results.

I would not care if Bing even sent Google synthetic and organic search terms
directly, to analyze and make use of the results.

Remember, Google uses you (the searcher) as a _product_ that is sold to the
real customers: the adwords and adsense clients.

Google scans the net for content _that they do not own_, and makes an enormous
amount of profit in return.

Google owns nothing. You owe Google nothing.

~~~
lysium
It's clear that search engines use user 'feedback' to improve their search
results.

But if "_Bing even sent Google synthetic and organic search terms, to analyze
and make use of the results_", I would suggest they add 'powered by Google'
to their search result page.

Further, I don't think it makes a difference if Bing is querying Google
directly or using opt-in users to do so. In either case, they're copying
search results from Google.

~~~
powertower
"copying" and "making use of" are not the same thing.

Again, I don't see why it's an issue for Bing to analyze and make use of the
behavior of Bing toolbar users that opted in.

So what if their Google search behavior holds good weight as a metric?

------
seanos
A long, long time ago I decided to build a machine to pass the Turing test.
Part of this involved giving thousands of people microphones to record the
questions they asked and responses they received in normal life. This data was
streamed directly to my machine. I then programmed my machine with a complex
algorithm that used this data, along with data from many other sources, to
attempt to produce human-like responses to questions it received.
Unfortunately, a group of friends who applied for the microphones decided to
ruin my attempt to pass the Turing test by grouping together and repeatedly
asking a friend, Jim, 100 different nonsensical questions to which Jim, who
was in on the trick, gave pre-determined responses. When I finally finished I
put my machine in a box and asked the public to ask it questions. It was doing
pretty well for a while since no one could distinguish it from a human. Then
Jim and his buddies turned up. They asked it the 100 nonsensical questions, to
which, for 7 of them, it gave the same response as Jim had. Jim got very
angry. “This machine is copying me!” said Jim.

------
lysium
Interesting point that Microsoft may have special code to filter out google
redirects.

On the other hand, do we know the percentage of searches that actually get
redirects as results? The 'honeypot' was rather small, so the redirects might
simply have been too few to appear as a signal in Bing.

------
shubber
I'm just glad to see I'm not the only one who's been led to think: if Google
can demonstrate that Microsoft is picking up search correlations from the
Google site, and Microsoft just explained that, no, it's just that we pull in
clickstream data, then couldn't one feed synthetic clickstream data to Bing as
a blackhat SEO technique? That seems like much less work than setting up
content farms or botnet clickfraud.

------
rbarooah
If I were building a toolbar that tracked people's clicks, I would
take some measures to have it record the 'final' URL loaded by the browser and
not the naive link from the DOM. There are all sorts of redirectors in use and
not working around them generically would give distorted results.

The lack of google redirects in bing's results doesn't look like proof, or
even a smoking gun to me.
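
To illustrate what working around redirectors generically might look like,
here is a minimal sketch. It is only an assumption about one possible approach,
not anything known about how the Bing toolbar works: the function name
`unwrap_redirect` and the query parameter names `url` and `q` are hypothetical
choices for the example. A real toolbar would more likely record the URL the
browser finally loaded; this static version just shows that no Google-specific
code is needed.

```python
from urllib.parse import urlparse, parse_qs


def unwrap_redirect(url, param_names=("url", "q")):
    """Return the target embedded in a redirector-style URL.

    Assumption for illustration: a redirector carries its destination as an
    absolute http(s) URL in one of the query parameters in param_names.
    If no such parameter is found, the URL is returned unchanged.
    """
    query = parse_qs(urlparse(url).query)  # parse_qs also percent-decodes values
    for name in param_names:
        for candidate in query.get(name, []):
            if urlparse(candidate).scheme in ("http", "https"):
                return candidate
    return url


# A redirector link is reduced to its destination...
print(unwrap_redirect("http://www.google.com/url?q=http%3A%2F%2Fexample.com%2Fpage&sa=X"))
# ...while an ordinary link with a plain-text "q" parameter is left alone.
print(unwrap_redirect("http://example.com/search?q=cats"))
```

The same rule would unwrap any redirector that follows this pattern, which is
why the absence of redirect URLs in the results does not by itself show
special-casing.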

------
shasta
For your second point, what evidence do you have that all click results that
point to google redirects aren't simply discarded?

~~~
rst
In fact, he suggested that that might be what's going on: "It is also possible
that the 91% that didn't 'make it' was actually because they were pointing to
google rather than to the target. Of course Bing does not like to link to its
competitor and filtering out www.google.com/url can't be that hard."
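
For what it's worth, the kind of filter he describes really is only a few
lines. This is a rough sketch under stated assumptions, not a description of
anything Bing actually does: the function names `is_google_redirect` and
`keep_usable_clicks` are made up for the example, and the only pattern checked
is the www.google.com/url one from the quote.

```python
from urllib.parse import urlparse


def is_google_redirect(url):
    """True if the URL points at Google's /url redirector (assumed pattern)."""
    parsed = urlparse(url)
    return parsed.hostname in ("google.com", "www.google.com") and parsed.path == "/url"


def keep_usable_clicks(clicked_urls):
    """Drop clicks that point at the redirector instead of the real target."""
    return [u for u in clicked_urls if not is_google_redirect(u)]


clicks = [
    "http://www.google.com/url?q=http://example.com/",
    "http://example.com/",
]
print(keep_usable_clicks(clicks))
```

Under this scheme every click recorded as a redirect link is silently
discarded, which would be consistent with most of the test queries never
showing up.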

(Answering for Jacques, since he said he won't be here anymore to answer for
himself:
[http://jacquesmattheij.com/Tell+HN%3A+So+Long+and+Thanks+for...](http://jacquesmattheij.com/Tell+HN%3A+So+Long+and+Thanks+for+all+the+Bits)
)

