Hacker News

Read Google's "Bing Sting" experiment on Google's blog. Their test had no control variable (i.e., running the same test on another website that wasn't Google). If they had done their experiment on another site and Bing's results didn't change, then Google would have a very plausible case against Bing.

Google fell into the classic trap of confirmation bias through bad scientific method. The test was thus 'rigged': they never had a control variable.




Their assertion was not "bing is using clicks on google search and only google search." The assertion was just "bing is using clicks on google search." They've demonstrated that pretty conclusively.


Bing has never denied using clicks on google search. They have, in fact, admitted to using clicks in general before Google even began their experiments. Given that information, why should it be surprising that Google Search, as one of the most-clicked sites on the Internet, has a big impact on that?


Surprising or no, it's clearly pretty controversial. If you are using click data like this, you have to know that you'll end up essentially copying Google. People only click on the results that are there, and Google puts them there.


Of course it's controversial. Someone made a deliberate decision to stir up controversy over this. It's easy to make some controversy if you just use the right words. Words like "Cheating" and "Copying" are great for that.

If you really want examples, just watch Fox News or MSNBC for fifteen minutes. You'll probably see at least one or two examples in there somewhere.


Their assertion is that they're imitating Google.

"Bing results increasingly look like an incomplete, stale version of Google results—a cheap imitation"

Which they have NOT demonstrated. Their results can just as easily be interpreted as Bing imitating user clicks.


You can't decide to reason through induction on one set of variables (all searches, not just "hzzxsqqdga", are copied by Bing) and leave it out on another (click data on all websites, not just Google, are copied by Bing).


I'm not doing that. How did you get that impression?


The phrasing of "bing is using clicks on google search" implies that google search is a single case (I could likewise claim "bing is using clicks on duckduckgo") that does not extend to others.

Wouldn't it be more accurate to say that "bing is using click data"? The fact that google.com is in a lot of that click data is a questionable decision and the root cause of all this drama.


Yes. Why can't people grasp this very simple idea before opening their mouths and spreading this FUD? I see even pg fell for this.

The only way the words "Bing copies google" would be justified is if MS were directly querying Google on certain keywords and ripping off search results. Google have provided no evidence to suggest this. I expected the commenters of Techcrunch to be unable to grasp this, but it seems that HN is often like this too.


To be fair, I think that claim would be justified if they were "merely" grabbing the Google SERPs their users happen to receive, turning them into ranked lists of URLs, and using that data some way.

It would even be justified if they were harvesting click data only from Google (or explicitly treating that click data differently), because then they're just doing the last one, but obfuscating it: it would be like refusing to bribe a politician directly, but instead making a large "investment" in a corporation they own.

It's not a justified claim if they built a mechanism that genuinely gathers interesting data, and would continue to do so in Google's absence. I think Microsoft is claiming this, but their responses have been so murky that it's not 100% clear. Google certainly hasn't produced evidence that renders this version implausible.


Your "control variable" only matters if the hypothesis they are trying to prove is "Bing special-cases Google."

That is not the hypothesis. The hypothesis is "Bing has results in its index that it could not have gotten in any other way than from Google search results." Their experiment does indeed confirm that hypothesis.


The problem is, the intentionally ambiguous and misleading wording Google has been using implies the test was "Bing special-cases Google". This is why I'm not buying any of this. Google is smart enough to use precise language when they want to, and apparently to avoid it when that suits them.


Incorrect. Once again, to come to that conclusion you would need to test on a site that WASN'T Google as well.


If your analysis is correct, you should be able to explain a scenario under which, given Google's experiment, Bing's result for "hiybbprqag" came from somewhere other than Google.

What is that scenario?


It came from the Bing toolbar tracking the user browsing. Yes, obviously "hiybbprqag" came from Google. But that's because they only tested it on Google.

They never tested the fact that it could have come from any other website as well. Thus, they can't conclude whether Bing is copying Google or copying the user's browsing behavior.


> Yes, obviously "hiybbprqag" came from Google.

That's all Google is saying.

> They never tested the fact that it could have come from any other website as well.

It doesn't matter. Google is only complaining about what Bing has copied from Google. What Bing copies from other sites is between them and the other site.


Exactly. They never did any testing designed to specifically not have the result show in Bing.


Which means they do not know the boundaries of their problem. What part of Google's argument does that counter, though?


It means that they have falsely arrived at a conclusion due to positive bias. They might be right, but they haven't proven it sufficiently.

The converse of their experiment's conclusion is also incorrectly assumed: "Google results, when using the IE8/Bing toolbar, feed into Bing results" is not the same as "Bing results, when using the IE8/Bing toolbar, come from Google."


My understanding is that running the experiment on some random other sites would work as well, as long as Bing users were actually clicking the honeypot links. This is probably the intended functionality of the click stream data. That doesn't change the fact that using this type of data from competitor search engines results in "stealing" (for lack of a better word) search results, particularly for rare terms such as "tarsorrhaphy".


What would be a suitable control variable? How would one know if Bing is using results from that website too or not?


So what exactly would a control that satisfies you be? How about I submit search terms on a computer terminal that connects to a server that isn't even on the internet? There, those searches do not end up affecting Bing.

The fact is that they used unique search terms that link to unique subjects. They ONLY submitted searches using what they described. That means the only places those searches passed through were the OS, toolbar, IE and Google. They know Google received those searches; did they propagate to Bing?

That's it. They didn't need to offer a placebo to anyone. It's like throwing a ball and hearing an echo. You don't need to NOT throw a ball just to make sure it doesn't echo.



