
Google-An update to our search algorithms - Copyright Related - ZanderEarth32
http://insidesearch.blogspot.com/2012/08/an-update-to-our-search-algorithms.html
======
onli
This is wrong. Google should aim to provide the best possible result for the
search. If that is a copyrighted video stream on a non-authorized site (e.g. a
stream of all of the episodes of the series I'm googling for), this should be
exactly the first result. And not a lame but authorized preview of 30 seconds.

Already found myself using duckduckgo in such cases, imagining the results
were better.

~~~
barista
Does this mean Google will never surface YouTube in its search results? That
is probably the largest and best-known source of copyright-infringing material.

~~~
fascinated
Heh yeah, that is amusingly unaddressed

~~~
maxerickson
YouTube follows the DMCA (and does ContentID or whatever). Google search
probably doesn't get very many requests to remove links to YouTube.

This policy shouldn't have much impact at all on any large site that
discourages copyright violations and responds promptly to DMCA takedowns,
because it isn't worth bothering Google Search about that content.

~~~
capsule_toy
I agree this is probably what's happening but I'm worried this might change in
the future. Right now I think all that happens is that the link is removed
from the search results but if the site itself is also penalized, I'm afraid
content owners can now see the value of sending both a DMCA and copyright
takedown notice simultaneously.

------
ChuckMcM
There are a lot of interesting signals for web spam and poor web content. I
don't know if this one is very good. In particular I see this scenario:

1) Google algorithmically generates take down notices on youtube, they move
this over to search results.

2) Algorithms 'err on the side of caution' with regard to fair use.

3) Every page that quotes or excerpts a copyrighted work gets flagged.

4) So Google 'manually' makes exceptions for sites they strongly believe won't
engage in poor behavior (NY Times, Techcrunch, whatever)

5) No new web site that reviews products or provides critical analysis ever
makes it to Google's front page.

That might be a stretch, but the auto-flag stuff on YouTube is out of control,
and trying to use that as a signal in results is just inviting abuse. Black
hat SEOs are malicious enough without dealing with 'Copyright joejobs'.

[ Full disclosure: I work at a search engine, blekko.com ]

~~~
revelation
This has nothing to do with search quality. This is just Google bowing down to
big copyright. They have done it before with the autocomplete feature,
censoring many suggested terms.

They have been asking for this for a long time now. "De-ranking" is number one
on this proposal: [http://www.scribd.com/doc/79607883/Proposals-to-Search-
Engin...](http://www.scribd.com/doc/79607883/Proposals-to-Search-Engines)

It seems Google has cushioned this in sweet words, so as not to make it
blatantly obvious.

These sites were very high on the rankings because people click them (and
rarely return to search further). Google simply aligned its search results
with people's needs.

Google is enabling media companies in their delusions, when they could be
destroying them. They might end up as a casualty.

~~~
ChuckMcM
I think you and I have different working definitions of 'search quality'. My
definition is that the sites you find by searching for something give you the
information you want.

I think 'bowing to big copyright' is no different than 'bowing to Chinese
censors' or 'bowing to Administration staffers' or 'bowing to conservative
thinkers'. All of them affect what results you get vs what you asked for. That
those web pages have information that some other folks don't want you to see,
is a discussion between page provider and complainer, not a fight the search
engine should get into.

Using blekko.com as an example. You could create a slashtag with all the best
torrent sites you can find on the web, and then you and anyone else who uses
your slashtag could be searching all those sites for what is useful
information to them. Copyright holders could search those sites too
(and presumably they do) and attempt to take action against the sites if they
choose to. The search engine is a window between two spaces, and while, as
a choke point, it provides an easy target for those who would want to control
the greater masses, it doesn't improve the 'quality' of the service when that
occurs. It is of demonstrably lower quality to some folks.

~~~
revelation
I think we have the same definitions. I was just being unclear. My intent was
to express that their change was not motivated by improving search quality.

------
fpgeek
I wonder what the secondary incentive effects of this will be. I see three
possibilities:

1\. Encourage copyright owners to send more takedowns to Google Search, since
those takedowns will be "more powerful".

2\. Encourage targeted sites to send more counter-notifications since a Google
Search takedown has an effect on an entire site's ranking, not just a
particular link.

3\. Encourage shared sites (hosting providers, etc.) to shut down particular
users or areas that get a lot of Google Search takedowns, to stop the
spillover effect on the rest of the site.

I don't know which of these effects will matter the most, but I suspect at
least one of them will be behind a significant unexpected consequence of this
change.

~~~
jakeonthemove
Number 1 and 2 are most likely to happen...

------
guelo
Speaking of deteriorating Google search results, my latest peeve is when a
search shows dozens of results from the same domain. There used to be the much
more sensible "show more results from this domain" link. I have a hard time
imagining why Google thinks the new way is better; my guess is some designers
wanted to remove the link because they thought it was "cleaner".

More signs that designers, lawyers and MBAs are moving to the fore forcing
engineers out of the way at Google Search.

~~~
nkurz
Yes, this is a sad regression. In case anyone from inside Google is reading
this and looking for a good example, try this search:
[https://www.google.com/search?q=scansnap+%22s1300i%22+%22s15...](https://www.google.com/search?q=scansnap+%22s1300i%22+%22s1500%22+difference)

The first 70 results are all from the same site!

It's a decent site, but most of the pages are barely relevant. I was searching
for multiple viewpoints that might explain why I would prefer one model of a
scanner over another, and getting page after page from the same single source
is much worse than the previous behaviour.

~~~
magicalist
weird, I see 5 different domains (in incognito mode in chrome and private
browsing in Firefox). The first three are all from one site, but after that
there's fujitsu, amazon, lawyerist, and some kind of "snapsnapcommunity".
There are more than a few decent looking reviews on the second page, too,
which I'm not used to seeing in review searches...

There are quite a few from that one site, though.

~~~
nkurz
Thanks for the detailed results. I get many more than 3, so your info inspired
me to try to figure out what's happening. I'm now more convinced that it's a
bug rather than desired behaviour.

I first tried signing out, in case personalization was breaking things: no
change. Then I tried "verbatim": no change. Then I tried changing the number
of results from 100 to 10, and I see what you do: 3 from Documentsnap, 5 from
various Fujitsu sites, then Amazon and Snapscancommunity.

Switch the number of results back to 100, and I'm back to getting more
overwhelmed by Documentsnap. I counted again, and this time got 65
Documentsnap followed by 10 Fujitsu, then finally some variety.
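
For anyone who wants to repeat this kind of tally without counting by hand, here is a minimal sketch. It assumes you have already copied the result URLs into a list; the URLs and the `count_by_domain` helper below are made up for illustration and are not the actual results or anything Google provides.

```python
from collections import Counter
from urllib.parse import urlparse


def count_by_domain(urls):
    """Return (host, count) pairs for a list of result URLs, most frequent first."""
    hosts = (urlparse(u).netloc.lower() for u in urls)
    return Counter(hosts).most_common()


# Hypothetical result list, standing in for a page of search results.
results = [
    "http://docs.example/scanner-a-vs-b",
    "http://docs.example/scanner-a-review",
    "http://docs.example/scanner-b-review",
    "http://vendor.example/products/scanner-a",
    "http://vendor.example/products/scanner-b",
    "http://shop.example/item/123",
]

for host, count in count_by_domain(results):
    print(f"{count:3d}  {host}")
```

Running it once with 10 results per page and once with 100 would show whether the per-domain clustering really changes with the page size.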

------
mtgx
I wonder if this has anything to do with Google wanting to serve more and
more paid content through Google Play and YouTube. Will Google be on our side
or their side when the next SOPA bill appears?

~~~
forrestthewoods
I have no idea what "our side" is supposed to stand for in this context. Is
"our side" the side that wants unlimited copyright infringement? I support
that side as little as I support SOPA-like bills.

~~~
muuh-gnu
"our side" is the pro-freedom, pro-copying, information-wants-to-be-free,
consumer side. "their side" is the pro-restriction, pro-enforcement, pro-1984
content producer side.

> I support that side as little as I support SOPA-like bills.

There is no "undecided" side. Either you support copyright enforcement
against people who want to freely share information without a
North-Korea-style censorship body over their shoulders, or you don't. There is
no middle ground. You can't support "a little bit" of censorship.

~~~
forrestthewoods
Haha. Well in that case no I _strongly_ do not support what you so generously
call "our side". But thanks for throwing my hat into your ring.

I support environments that enable and encourage the production of new
creative works. I believe that copyright in some form is necessary to support
such a system.

"A distinguishing characteristic of intellectual property is its "public good"
aspect. While the cost of creating a work subject to copyright protection—for
example, a book, movie, song, ballet, lithograph, map, business directory, or
computer software program—is often high, the cost of reproducing the work,
whether by the creator or by those to whom he has made it available, is often
low. And once copies are available to others, it is often inexpensive for
these users to make additional copies. If the copies made by the creator of
the work are priced at or close to marginal cost, others may be discouraged
from making copies, but the creator’s total revenues may not be sufficient to
cover the cost of creating the work. Copyright protection—the right of the
copyright’s owner to prevent others from making copies—trades off the costs of
limiting access to a work against the benefits of providing incentives to
create the work in the first place. Striking the correct balance between
access and incentives is the central problem in copyright law. For copyright
law to promote economic efficiency, its principal legal doctrines must, at
least approximately, maximize the benefits from creating additional works
minus both the losses from limiting access and the costs of administering
copyright protection." - Posner, 1989, An Economic Analysis of Copyright Law

~~~
hxa7241
While that is certainly a nice and well regarded model, it does not say much
about what actually happens. It really only amounts to a proposition, and
cannot be read as indicating a necessity. As Landes and Posner themselves
later say:

"Economic analysis has come up short of providing either theoretical or
empirical grounds for assessing the overall effect of intellectual property
law on economic welfare."

\-- 'The economic structure of intellectual property law'; Landes, Posner;
2003. Conclusion, p422, s3.

That is, to spell it out, we do not know if intellectual monopoly law is doing
any good at all. This is easy to see for oneself: there must be, and obviously
are, costs -- legal, enforcement, search engine 'fixing' -- yet we have no
idea how much the gain is. If there is a negative, but the positive is
unknown, it is quite possible the sum is itself negative.

------
HotKFreshSwag
How does this work for countries with different copyright laws?

------
WiseWeasel
So if you want to decrease a competitor's search engine placement, would you
just have to file C&D notices on them?

I don't see how the number of C&D notices correlates with value to users.
Google does not appear to be benefiting its users in doing this.

~~~
wmf
You'd probably have to file thousands of takedowns and hope the site doesn't
challenge them.

~~~
lukesandberg
Also, Google outright rejects some DMCA requests, and Google also releases data
about the actions of copyright owners and DMCA activity on
<http://www.google.com/transparencyreport/removals/copyright/>. So even if not
all abuses were caught, they would at least be published for public scrutiny.

------
lelandbatey
My first reaction on hearing this is that by _demoting_ unauthorized content
they will end up indirectly _promoting_ authorized content.

And I don't know how I feel about this. From what we've seen with YouTube,
where unwarranted claims have taken down videos very much under fair use, I
don't want Google automatically demoting search results because of a claim
made by another party.

I foresee this causing no small amount of contention.

------
joeybaker
Google has not shown that it is able to competently handle copyright
complaints. YouTube copyright take down notices happen all the time for
legitimate content; chances are there will be a similar ratio of false
positives for site content.

------
armored_mammal
Sounds like the path to breaking Google's "monopoly" has just been revealed...

Seriously, unless it's just the faintest hint of influence I can see this
blowing up in their face. Takedown notices are too ubiquitous and too often
bogus.

------
some1else
<http://dmcarank.staticloud.com>

DMCARank is a Google Custom search that only displays results from the top 50
domains with a high number of DMCA takedown requests
([http://www.google.com/transparencyreport/removals/copyright/...](http://www.google.com/transparencyreport/removals/copyright/domains/?r=all-
time)).

------
mistercow
In addition to this being a dubious signal for measuring result quality, this
could have a chilling effect for sites concerned about SEO. We've seen the
bizarre rain-dance that businesses are already willing to do to try to improve
their position in Google results, and now we can add "avoid fair use" to that
dance.

------
latchkey
So I guess YouTube won't show up in search results anymore. #doublestandard

~~~
somesaba
right?? But obviously, this won't happen, and I don't think that's fair.

------
elsewhen
Won't this impact YouTube dramatically?

~~~
B-Con
FTA:

> Starting next week, we will begin taking into account a new signal in our
> rankings: the number of valid copyright removal notices we receive for any
> given site.

Following the link, you get to this page of stats (
[http://www.google.com/transparencyreport/removals/copyright/...](http://www.google.com/transparencyreport/removals/copyright/domains/?r=last-
month) ) on take-down requests. If that page is what they base the signal on,
then YouTube doesn't crack the top pages, so I would guess that it won't be
hurt too badly.

Although, I have no idea if their count for YouTube is accurate or not. It's
not like they wouldn't have an interest in under-reporting the numbers.

~~~
lukesandberg
read the FAQ for the site (or the "Whats included" box on overview page).

"The data below consists of the copyright removal requests we've received
through our web form for Google Search. It is a partial historical record that
includes more than 95% of the copyright removal requests that we have received
for Google Search since July 2011."

In other words, YouTube data is not included. This is only for web search.

~~~
magicalist
That's the point. If they are going to rank links to youtube the same way as
they rank any other site (as gp is using as the basis of the question), they
would only use search takedown requests as the signal, not requests that
youtube receives directly.

------
drucken
Copyright wars begin in earnest...

~~~
brudgers
And a power vacuum in search created.

~~~
zabraxias
Exactly what I was thinking. Not sure how Bing and DuckDuckGo stack up but it
might create an opening for a new player also - granted search is probably
still the toughest game to get into.

------
yuhong
EFF's take: [https://www.eff.org/deeplinks/2012/08/googles-opaque-new-
pol...](https://www.eff.org/deeplinks/2012/08/googles-opaque-new-policy-lets-
rightsholders-dictate-search-results)

------
RealGeek
This can be disastrous for user generated content websites.

------
jakeonthemove
Oh man, they should've never started with copyright removals in the first
place (although I get that it's near impossible to do). This new filter has
the same abuse potential as "negative SEO", where competitors posted so many
spam links to good sites that they were dropped from the first page (with the
competitor's site ranking higher as a result).

------
iyulaev
What happens if you have two pages, each listing the takedown notices applied
to the other? Seed the pages periodically and wouldn't you get a pretty good
measure of the popularity of your taken-down links by ranking them based on
how often they are taken down?

------
some1else
This will mean more traffic for <http://www.filecrop.com> and
<http://www.filestube.com>.

------
soulclap
Sounds like this might make it harder for companies to actually find websites
posting 'stolen' content. Just saying.

------
hnriot
With Google increasing its content footprint over the years, it's interesting
to see how the search group treats other Google groups. It's entirely possible
to spend a day on the internet without leaving Google web properties (except
hn, of course).

