Hacker News

Hmmm. Let's say that Bing sets up a script that sends queries to Google and then records the results. That's clearly copying. But what Bing does is when you use its toolbar, it watches what you do and uses that information to rank results. Is that really copying? It showed Google's Honeypot page because Google's engineers were clicking on the Honeypot page with the toolbar installed. That isn't copying Google's results, that's copying the actions of Bing toolbar users.

This can easily be demonstrated. Google can set up a second honeypot but instruct its engineers not to click on the link, ever. If it shows up in Bing's results, then Bing is watching what Google returns and scraping its results.

But if the second Honeypot doesn't show up in Bing's results, then clearly Bing isn't copying Google's results, it's copying its toolbar's preference for links.
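
The two-honeypot test above can be sketched as a small decision procedure. This is only a toy model: the function names, query strings, and URLs are made up for illustration, and nothing here reflects how Bing or Google actually work.

```python
# Toy model of the two-honeypot test. All names and data are hypothetical.

def scraper_index(google_results):
    """A ranker that copies Google's results directly, clicks or no clicks."""
    return dict(google_results)

def clickstream_index(google_results, clicked_queries):
    """A ranker that learns a query -> page pair only when a toolbar user clicks it."""
    return {q: url for q, url in google_results.items() if q in clicked_queries}

def diagnose(index, clicked_query, unclicked_query):
    """Interpret which honeypots surfaced in the other engine's index."""
    if unclicked_query in index:
        return "scraping results"
    if clicked_query in index:
        return "learning from toolbar clicks"
    return "neither signal observed"

# Honeypot A is clicked by engineers with the toolbar installed; honeypot B never is.
honeypots = {"honeypot-a": "a.example", "honeypot-b": "b.example"}
clicked = {"honeypot-a"}

print(diagnose(scraper_index(honeypots), "honeypot-a", "honeypot-b"))
print(diagnose(clickstream_index(honeypots, clicked), "honeypot-a", "honeypot-b"))
```

The point of the second honeypot is that it separates the two explanations: only a direct scraper would ever learn a pair that no toolbar user clicked.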

The entire thing is moot to me. The takeaway isn't whether Bing copies Google. The takeaway is that Bing's toolbar is spyware :-)

> Let's say that Bing sets up a script that sends queries to Google and then records the results. That's clearly copying.

I'd question even that hyperbolic interpretation. Let's say that Google sets up a script that sends queries to websites, records the results, and incorporates the links shown on each site into its search rankings. Is that clearly copying? No, that's just pagerank.

If you have a web directory, a link page, a blogroll--isn't Google "copying" your work by using it to improve its search results? How is that any different from what Bing's doing?

> No, that's just pagerank.

This is my first thought as well. Google's pagerank analyzes the link structure of the web as one of the inputs to its search ranking. Apparently, Bing's toolbar analyzes page content coupled with user click behavior as one of the inputs to its search ranking.

These two things don't seem very different to me. Both of them are relying heavily on the value provided to them by tracking and analyzing the behavior of users on the web to drive search results.
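
For anyone who hasn't seen it, the link-analysis idea can be sketched with the textbook power-iteration form of PageRank. This is a minimal sketch, not Google's production ranking: the damping factor of 0.85 is the commonly quoted value, and the page names and link graph are made up.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank. links maps each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share, then receives rank from its inlinks.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page (no outlinks): spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Hypothetical graph: a blogroll and a directory both point at other people's sites.
links = {
    "blogroll": ["site_a", "site_b"],
    "directory": ["site_a"],
    "site_a": ["site_b"],
    "site_b": ["site_a"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

The ranking of site_a and site_b is derived entirely from links that other people chose to publish, which is the sense in which the blogroll's work is being "used".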

I have the same thought. It is more a matter of framing. A while ago some people accused Google of profiting unethically because it farms the link structure of the Internet, which is the labor of many people (was it Nicholas Carr?). I don't really buy this framing. But Google's accusation seems to fall along a similar line of argument. You can also set up a "Google sting" to prove they are copying from the Internet. It is called "Google Bombing".

Bing will be at fault if they specifically target Google. But if you consider that entering a keyword and then clicking a link is essentially targeting Google search, then it only exposes another problem: Google's monopoly on the search market.

We agree that it is copying, the subject open for interpretation is whether it's "wrong" in some way. Information is all about copying, that's the whole point :-)

but the link between the obscure query and the click on the page wouldn't have been made without bing knowing the user first searched for the query on google, no? if it were simply boosting page clicks that would be one thing, but how else could rim.com rank 1st for "mbzrxpgjys"?

What the experiment shows is that where no other data is available Bing will use what it has and that Google can successfully seed Bing on the long tail. What it doesn't show is that in typical circumstances Bing is relying on data gathered from Google searches.

Microsoft is collecting the same sort of information on Google queries that it collects on Bing queries and that Google collects on Google queries. All this is happening at the long tail where both companies are most likely using something other than webcrawling to tailor search results - after all, the whole experiment is only possible because Google can seed page rankings at will to link arbitrary terms to specific search results.

Well yes, there is a hard link between a Google search result and a link being clicked. However, Google's argument isn't as strong if it turns out Bing is doing this for all search engines. It might be that they aren't targeting Google specifically, but are instead targeting all search sites generically.

But then still, the Bing toolbar is watching what you're searching for and recording that information. That's a pretty big privacy issue.

Google doesn't really want to get into a heated discussion about the evils of a search engine knowing everything you've ever searched for. Stones, glass houses, etc.

(Given Google's near-monopoly of the market, Microsoft and DDG have some amusing competitive synergy going on, don't they. DDG can criticize Google all they please for retaining user data because DDG doesn't and isn't in a position to benefit from it. Microsoft, which certainly is in a position to benefit from it, doesn't need to worry about Google calling them on it because Google is the only search engine that can actually lose market share over the issue.)

For all Google's sins, there is a Dashboard that lets you erase whatever you'd rather they not know; Google could promote that heavily if stones start flying at glass houses. I always assumed this would be a great way to learn more about queries: spotting which words people are ashamed to have searched for.

It could be something a registered user could set from a browser toggle, and DuckDuckGo is a very good project, of course. My point was: data portability and user control are within Google's long-term interest, not being evasive about their data cache.

While it's great that Google lets you delete information, harvest-your-data-by-default is not a choice made with your best interests at heart.

This is what all toolbars do, and is largely the point of why big companies offer them and pay little software companies to make them optional installs (see Corel's WinZip, which installs the Google toolbar)

And that is explicitly stated in their TOS. There's no hiding here.

Let’s agree “not hiding” is what is in the demo video. TOS… we all know they could add that you sell the soul and the virginity of your mother in there, no one would read it.

It only records that when the user has explicitly agreed to send anonymous data. Chrome and the Google toolbar do that too, and so do most of the toolbars out there.

I think this would still leave Google with a fairly strong argument - if Bing does it for all search engines, then they're effectively copying whoever is most popular. Since it's done through Internet Explorer, which is still bundled with Windows in most places, they could try to make the argument that Microsoft is using their position in the OS market to crush competition in other markets.

Interesting angle, going through tied markets and competition policy: that kind of authority is far more intelligent, and it has just prosecuted precisely this, the bundling of IE with Windows. However, you'd need either a US court to acknowledge that the Europeans were right to disagree with it in the first place, or a European court to admit that its previous decision wasn't enough. It's feasible, but hard.

Where you’ll be more limited with it, is that it’s apparently not IE, but the Bing Bar that is at stake—the connection is getting thinner.

If a page contains a unique word, and people who were on that page universally go to a different page, that could be enough evidence for bing to assume there's a link between the unique word and the target page.
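
As a rough sketch of that inference: count, for each term seen on a page, where users went next, and keep only terms whose visitors almost all went to the same place. The function name, thresholds, and session data below are assumptions made up for illustration (the "mbzrxpgjys" and rim.com pair is the one mentioned upthread); this is not a description of Bing's actual pipeline.

```python
from collections import Counter, defaultdict

def associate_terms(sessions, min_support=2, min_share=0.9):
    """sessions: iterable of (terms_on_page, next_url) pairs from a hypothetical clickstream.

    Returns {term: url} for terms whose visitors almost always went to the same url.
    """
    next_counts = defaultdict(Counter)
    for terms, next_url in sessions:
        for term in set(terms):
            next_counts[term][next_url] += 1

    associations = {}
    for term, counts in next_counts.items():
        total = sum(counts.values())
        url, hits = counts.most_common(1)[0]
        # Keep the pair only if enough sessions agree on one destination.
        if total >= min_support and hits / total >= min_share:
            associations[term] = url
    return associations

sessions = [
    (["search", "results", "mbzrxpgjys"], "rim.com"),
    (["search", "results", "mbzrxpgjys"], "rim.com"),
    (["search", "results", "weather"], "weather.example"),
    (["search", "results", "weather"], "news.example"),
]
print(associate_terms(sessions))
```

Common words like "search" drop out because their visitors scatter across many destinations; a unique string survives because everyone who saw it went to the same page. No knowledge that the source page was a Google results page is needed for this to work.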

There are browser toolbars that aren't spyware? I'd think anyone likely to actually hear about this would have already been assuming that all browser toolbars are spyware.

It might be a good exercise to chase down old posts of people who wrote the first browser toolbars, as well as the browser infrastructure that made them possible. We can contrast the speculation on why they might have been a good idea with the actual result. Not as a way to trash them, but as an exercise in how smart people can miss the mark.

More popular if done as a way to trash them, I expect. You could make "trash old usenet posts".com and have a ranking system for "worst decisions ever".

(C.A.R. Hoare's billion dollar mistake, for example).

As it happens, there's an exact historical precedent for this. Post code <-> long/lat data is copyrighted in the UK, but users were using Google Maps (and others) to do the conversion and supply the results to open source databases. The end result is that Google had to change its licensing/API to restrict this sort of behaviour.

Just because you're copying the data indirectly through a third party doesn't mean you're not breaching the copyright.

Interesting case to raise. Reminds me of the NFL terms voiced after every game about the broadcast being for private use only. I imagine that if their terms don't already include a clause like this, they can try to suggest that a toolbar tracking user clicks is violating the terms.

Very murky waters. If Google starts complaining that other people are tracking their users, they might end up educating users about how much they and their advertisers track.

...I hope DuckDuckGo figures out a way to capitalize on this brouhaha...

UK law seems to be unusually stringent on this though (e.g. Football DataCo claims copyright on facts such as dates of football fixtures).

"The takeaway isn't whether Bing copies Google. The takeaway is that Bing's toolbar is spyware" And also that you can control search results in Bing. Nice feature for advertisers.

You might need to hold in your hands the ethics of the webspam team at a major search engine, though.

Only for terms that are worthless.

For all the money, resources and engineering talent that Microsoft has, you'd think they wouldn't need to do this, though. That's what's baffling to me.

Call Google the market leader all you want, but let's not forget that Microsoft's market cap is around 40 billion dollars greater than Google's.

That's more than Research in Motion's total value!

how would you expect a search engine to be able to surface a web-page for a specific query when there is no data to create the relationship except the signal from google engineers spamming bing with click data?

these are cases of outliers. they don't exist on the real internet, or at least where pages exist without any other data (anchor text, inlinks, outlinks, words from the query in the document) they never get surfaced from a search engine.

absent fake click-data there is no way google could surface these documents for the specific queries. in fact google states this openly in their "attack piece". before they manually changed the rank of these documents they didn't surface these either.

the only evidence of "cheating" is that bing surfaces documents for which there is no known relationship between the query and the document, except for spam created by google engineers. this is evidence only of a bug in bing's ranking algorithm. clearly it is using signals from google. just like google uses signals from CNN (keywords, inlinks, outlinks, anchor text, etc).

i'm sure bing is thankful to google for helping find this defect in their system and are hard at work to fix it.

people talk about bing copying search results like google invented search results and put a lot of hard work into them. in this case the only hard work they put in was designed to spam bing.

i can only conclude that google is getting worried about bing quality and has run out of ideas on how to fix their own problems.

all search engines make use of a variety of signals. Bing decided to use what users click on as a signal. Google spotted it and thought it was 'zomg bing are stealing our results'. I don't understand why you think taking advantage of a new signal to improve search is not a smart move by Bing?

The googlers are just angry that they couldn't come up with that genius idea themselves.

I don't see this as being any different from what Microsoft has been doing for 20+ years. They let a competitor put the work into figuring something out, then make a reasonably accurate facsimile thereof. I think it's lazy, but not particularly unethical. If Google were Benz, would they be complaining that Ford was making 4-wheeled vehicles with an engine? More appropriately, and given that I've been on a Top Gear bender lately, if Google were Cadillac would they be complaining that everyone else was copying their method for operating vehicles, with three pedals, a gear shift, a steering wheel and a handbrake?

I get why Google is upset, but this doesn't strike me as unethical behaviour in a free market.

Does anyone think that the Google toolbar doesn't do the exact same thing? Just sayin

Here's an easy test. Test with the Bing toolbar installed but with Bing and some other search engine (blekko, whatever).

This should help establish if it's the toolbar that is sniffing.

If so, while it may be questionable behavior, Bing would not be copying Google's results.
