
Coding Horror: YouTube vs. Fair Use - fogus
http://www.codinghorror.com/blog/2010/09/youtube-vs-fair-use.html
======
bl4k
The more interesting next step is that the matching is being used to send
advertising revenue to the copyright owners. So instead of taking down your
video, the copyright owner can opt to leave it up and collect on the ad
dollars.

The recent example of this was the clip from Mad Men about the carousel (if
you haven't seen it, look it up). Uploaded by an ordinary user but AMC left it
up and instead collected on the ad revenue.

So the tech isn't just being used to prevent any clips from reaching the web -
it is actually enabling copyrighted content to be shared and quoted on the web
with a sustainable business model - a whole point that Jeff missed.

The Economist wrote about this a while ago.

~~~
brown9-2
And everyone wins in this scenario, don't they? The content owner collects
revenue, and the general public gets to view the video they were (probably)
searching for.

~~~
borism
And what if you added just a little bit of your own creativity to the
video/sound clip? Shouldn't you be paid too? Will a machine also be determining
what percentage of the revenue you should get?

~~~
chopsueyar
If I buy a can of tomato soup and pour it in the ocean, do I not own the sea?

~~~
msg
According to Andy Warhol

------
w1ntermute
Jeff has it wrong. This isn't about fair use, this is about YouTube's content
policies. And if YouTube doesn't want any copyrighted content being hosted on
their site (fair use or not), they have the right to remove it. I don't
understand why it's so hard to realize that YouTube _is letting you host video
on their site for free_. If you want fair use to apply, host it on your own
server and spend your own money on the bandwidth costs.

~~~
mcmc
IANAL, but this doesn't sound entirely correct.

Youtube is operating as an OSP under the DMCA safe-harbor laws, and this has
been upheld in court. Youtube must therefore act neutrally in each case --
that is, they have obligations both to the uploader and the alleged owner. If
they don't fulfill these obligations, such as reinstating a video that doesn't
actually violate copyright law (given the existence of fair use), then they can
lose their protection under the DMCA.

Granted, Youtube could opt to not seek protection under the DMCA, but it seems
like this isn't their best strategy.

------
dkarl
_Unfortunately, my fair use claim was denied without explanation by the
copyright holder._

Denied by the copyright holder? Now we see that the system does not exist to
uphold the law, but rather to serve copyright holders whether they have the
law on their side or not.

~~~
ergo98
I really don't think that Jeff's assumption of fair use would stand any legal
test: You can't use excerpts of movies to add humor to a piece. That is not
fair use.

Fair use entails criticism, comment, news reporting, teaching, scholarship,
and research. Fair use != bedazzling a blog post with humorous excerpts.

~~~
stonemetal
1. the purpose and character of the use, including whether such use is of a
commercial nature or is for nonprofit educational purposes

3. the amount and substantiality of the portion used in relation to the
copyrighted work as a whole; and

4. the effect of the use upon the potential market for or value of the
copyrighted work.

It was for personal use, not much was taken and there was no market effect so
it very well could be fair use.

~~~
ergo98
It was not for nonprofit educational purposes. It isn't even reasonable to say
it was for _personal_ use given that Jeff's last blog entry was packed full of
affiliate links.

His blog is a commercial enterprise.

~~~
oasisbob
Commercial use does not invalidate a defense of fair use, it's only one part
of a multi-part test.

Don't equate fair use with classroom use. There are enough uptight librarians
who do that already.

------
sharpn
I had a similar experience, but the matching success was even more surprising:

I uploaded a home made video of a world cup goal as seen from a plaza
screening the game. Loads of crowd noise, AND a big chunk of the video did not
have the screen visible (though I guess the commentator could be heard). About
20 minutes later I got a similar email saying FIFA had claimed the rights &
the clip is now restricted in some countries. I guess they could have data
mined from the title etc., but that's still damn quick for a live broadcast
match.

~~~
dmoney
Can a ballgame be copyrighted? I would assume that only a particular recording
of it could be, and since you recorded this it would make you the owner.
Apparently what makes FIFA the "owner" of the video is that they are a
"Partner" ( <http://www.youtube.com/t/partnerships_faq> ). You have to meet
some criteria, such as popularity and exclusively "owning" content you upload,
to become a partner. The partner system appears to be stacked in favor of
large entities, regardless of whether they truly "own" the content.

Let's say you had a popular channel though, and you submitted the homemade
clip. Depending how YouTube's system works, it could have been FIFA's video
that was flagged instead of, or in addition to, yours.

Edit: The logic's a bit circular isn't it? You have to own the content to be a
partner, and you have to be a partner to own the content.

~~~
brazzy
_I would assume that only a particular recording of it could be, and since
you recorded this it would make you the owner._

No - he recorded a broadcast as seen on a screen, i.e. made a vastly degraded
copy of a specific live video stream.

~~~
dmoney
Oh, I thought he had made a live recording of the game. My mistake.

~~~
sharpn
No, brazzy is correct - sorry if that wasn't clear. The game was in South
Africa, but I filmed a live screening of it in Mexico - a shot of the screen &
stage as a goal was scored, then pan to the (Mexican) crowd reaction. I wasn't
complaining it was blocked, I was (hopefully) giving colour to the original
assertion & conveying how quickly the obscured 'owner' was identified. I
understood this was as a result of a 'safe harbour' ruling, but I guess
dmoney's point about YouTube's 'partners' also played a part. [edit] corrected
attribution.

------
saucetenuto
> Unfortunately, my fair use claim was denied without explanation by the
> copyright holder.

Wait, what? Why is it the copyright holder that gets to decide what
constitutes fair use? The present system seems subject to abuse.

~~~
lhorie
Conveniently, nobody said anything when it used to be the other way around...

For whatever it's worth, look at the first comment:

_In the context of youtube -- which is the context you were uploading the
clip to, after all -- it's just a 90 second clip from their movie. No
editorial, it's the entire thing. Someone browsing youtube who finds it will
have no idea that there was a blog connected to it._

------
willvarfar
The amount of CS that Jeff doesn't know never ceases to amaze me; he should
read more Hacker News ;)

~~~
jshen
Yet he's built one of the most useful websites I've ever used.

I think the disconnect between these two statements says something.

~~~
KaeseEs
Jeff has one of the classic problems of an autodidact - gaps in his knowledge.
If you followed along with the saga of SO's creation on codinghorror, you saw
him run into these headlong on many occasions. He also tends to pontificate on
issues he's learned recently and doesn't understand completely a bit too often
for my liking. One thing he does have going for him though is an unbelievable
amount of pluck.

~~~
evo_9
Sounds like Jeff has a non-CS background; aka, he's one of these guys that has
the capability to really understand this stuff at a deep, meaningful level,
but sometimes (often?) runs into his own limits (note: this is not to say CS
'people' are incapable of this - but rather - this is a trait that is not
automatic whether you are 'self-taught' or a CS grad).

The 'pluck' you are describing is his ability to just 'get it done' - a trait
I think most tech founders greatly underestimate. It's an ability or mental
strength that makes a person able to will the project over the goal-line. I'm
amazed how hard it is to complete anything - whether it's a startup app I'm
working on or a piece of music - finishing it 100% and getting it 'in the can'
is no small feat. It's a special gift, and one that I would highly recommend
anyone searching for a founder look for in their partner(s).

I have to admit I like Jeff, from his articles and such he seems to have a
pretty clear way of thinking through big tech issues, and while he may
sometimes surprise me with a 'lack of knowledge', I just remind myself that we
all have gaps in our knowledge and at the end of the day, he's done some
incredible work, and this 'lack' hasn't hurt him much (maybe it caused him to
take longer to get to where he is, but he still managed to get it done).

~~~
sbov
I don't quite understand your first sentence. How can you learn anything new
if you don't run into your own limits?

~~~
evo_9
Ha! Good point... I was trying to get at this idea that some people are
naturals and Jeff seems to be that type of person; for what he might lack in
formal education - and hence may run into these 'limits' more than others
might - he has other worthwhile traits that compensate or even complement this
perceived lack on his part.

For all we know it's his lack of knowledge that gives him an edge - he runs
into something he doesn't get or hasn't been exposed to and dives in, eats it
up because he needs to really know it. Compare that to a guy that takes a
class on some esoteric aspect of coding, when he/she doesn't have any need to
apply it, when they just care about getting a good grade - which person really
has learned the material? Does the person that already learned it but didn't
need it at the time have a really big advantage? They might work through the
problem faster, so maybe that's the advantage - you save some time.

But I like the root of your point - one doesn't grow/learn without trying to
hit their own limits and push past them. Good stuff man.

~~~
sbov
That makes sense.

There's one big difference between real life and (undergrad?) CS classes: the
discovery phase. In most classes, the solution is laid out for you or implied
in some way. Algorithms classes generally just consist of following a spec
outlined by the book/teacher.

But when you run across a problem in real life it doesn't tell you to use a
specific algorithm to solve it. Because of this, finding the solution usually
requires a deeper level of knowledge.

When I graduated, that was the scariest part for me. Sure, I knew a bunch of
algorithms, but I didn't have much practice in discovering when to use them
and applying them to problems.

~~~
evo_9
Exactly - how to find answers to new problems is huge. That's probably the
question that makes or breaks interviews I give to new programmers. If they
can't tell me how they'll figure something out on their own, yeah not good. In
fact, even just being honest and saying you'd not considered this and weren't
sure would be fine too. But you'd be surprised how many people have no real
answer or scramble something out that doesn't quite cover it.

The funny thing is - esp. depending on how old you are - it is incredible how
much easier it is to learn/find answers now. Pre-google it was tough - I
frequented Barnes & Noble often to see what new books had arrived, or just to
go find an answer in a book I knew they had.

But I digress... google/stackoverflow/etc have all made our lives so much
easier.

------
terryjsmith
Can anyone on here speculate on how you might even start with this? Even using
their heat map system, you'd have to run that same test on every frame (every
few frames maybe) of every piece of copyrighted content uploaded. I don't
consider myself a brilliant programmer, but I wouldn't even know where to
begin on doing something like this at scale.

Maybe you could index key parts of the frame and use that to narrow it down
bit by bit (pun probably intended)? So confused...

~~~
j-g-faustus
Here's a starting point, how to do it with music:
<http://www.redcode.nl/blog/2010/06/creating-shazam-in-java/>

You chunk a song into small segments and create a sort of "fingerprint" for
each piece using Fourier analysis or similar. (So a song segment is
represented by a frequency histogram, similar to this:
<http://www.flickr.com/photos/svartling/4229109164/> ) The fingerprints are
stored in a DB.

To match a song fragment, chunk and analyze it in the same way. For each
segment, find the closest match in the database. If most of the closest
matches belong to the same song, and appear in the same order as in the song
fragment you are trying to find, you have a match.
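
As a rough illustration of the steps above (not Shazam's or YouTube's actual
algorithm), here is a minimal Python sketch. The chunk size, the "strongest
frequency bin" fingerprint, and every function name are assumptions made up
for this example; a real system would use far richer fingerprints and a proper
FFT library rather than the naive DFT used here.

```python
import cmath
import math
from collections import Counter, defaultdict

CHUNK = 64  # samples per segment (hypothetical value chosen for the demo)


def dominant_bin(chunk):
    """Return the strongest non-DC frequency bin of one chunk (naive DFT)."""
    n = len(chunk)
    best_bin, best_mag = 0, -1.0
    for k in range(1, n // 2):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(chunk))
        if abs(s) > best_mag:
            best_bin, best_mag = k, abs(s)
    return best_bin


def fingerprints(samples):
    """Split a signal into full chunks and fingerprint each one."""
    return [dominant_bin(samples[i:i + CHUNK])
            for i in range(0, len(samples) - CHUNK + 1, CHUNK)]


def build_index(songs):
    """The "DB": fingerprint -> list of (song name, chunk position)."""
    index = defaultdict(list)
    for name, samples in songs.items():
        for pos, fp in enumerate(fingerprints(samples)):
            index[fp].append((name, pos))
    return index


def best_match(index, fragment):
    """Vote per (song, offset); a shared offset means the chunks line up in order."""
    votes = Counter()
    for pos, fp in enumerate(fingerprints(fragment)):
        for name, song_pos in index.get(fp, []):
            votes[(name, song_pos - pos)] += 1
    if not votes:
        return None
    (name, offset), count = votes.most_common(1)[0]
    return name, offset, count


def tone_sequence(bins):
    """Demo helper: one pure sine tone per chunk, at the given DFT bin."""
    return [math.sin(2 * math.pi * b * t / CHUNK)
            for b in bins for t in range(CHUNK)]
```

Voting on the offset between fragment position and song position is what
enforces the "same order" condition: a fragment cut from the middle of a song
produces many votes for one (song, offset) pair, while accidental matches
scatter across different offsets.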

Creating fingerprints from a series of video segments is roughly similar,
Fourier can be used for 2D images as well.

Doing it like this, you can reduce the volume of data you need to compare
against by several orders of magnitude. Scaling to YouTube volumes is still
hard, but that's the sort of scaling a company like Google already has plenty
of experience in.

~~~
robryan
What about slowed down or sped up video? The chunks would be of different
size, I suppose cropping would be less of a problem as it would still match up
best to the same movie. The shazam paper may have covered this but I read it a
long time ago, can't remember.

~~~
j-g-faustus
I don't know how YouTube does it in practice, but you could either

- make the chunks small enough that they are essentially static images (one
or a few frames)

- use some form of dynamic chunking where a chunk lasts until the image is
sufficiently different according to some metric.

In both cases the chunks are fairly resistant to changes in speed, and the
worst you need to deal with is that multiple chunks in clip A may map to the
same chunk in clip B.
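
The dynamic chunking option can be sketched in a few lines of Python. This is
only a toy under stated assumptions: frames are flat lists of grayscale
values, the metric is mean absolute pixel difference, and the threshold of 10
is an arbitrary number picked for the demo. None of this is known to be what
YouTube does.

```python
def mean_abs_diff(a, b):
    """Average absolute per-pixel difference between two equal-sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def dynamic_chunks(frames, threshold=10.0):
    """Return the key frame that opens each chunk.

    A chunk lasts until a frame differs from the chunk's key frame by more
    than `threshold`; that frame then becomes the key frame of a new chunk.
    """
    keys = []
    for frame in frames:
        if not keys or mean_abs_diff(keys[-1], frame) > threshold:
            keys.append(frame)
    return keys
```

Because repeated or dropped near-identical frames never cross the threshold, a
slowed-down clip (every frame doubled) and a sped-up clip (every other frame
dropped) can yield the same key frames as the original, which is the
speed-resistance described above.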

------
edanm
Interesting side point: in building their ability to "fight" copyrighted
material, YouTube is amassing one of the largest libraries of digital videos
ever created. I'd be interested in hearing about this, and whether it really
is bigger than any other video library.

Edit: Also would be interesting to hear how many content-owners choose to
block video uploads vs. add advertising.

~~~
VladRussian
looks like a foundation for a B2B (web)service from Youtube: request - digital
signature of content (Shazam or other) response - results of search against
the library.

------
acqq
It seems that Jeff didn't try to read:

<http://en.wikipedia.org/wiki/Fair_use>

Otherwise he would see how his 90-second segment of the film can't easily be
treated as fair use:

"In 1985, the U.S. Supreme Court held that a news article's quotation of
approximately 300 words from former President Gerald Ford's 200,000 word
memoir was sufficient to constitute an infringement of the exclusive
publication right in the work"

~~~
kenjackson
We need to be very careful here. You may want to read the Wikipedia entry that
talks about this specific case
(<http://en.wikipedia.org/wiki/Harper_%26_Row_v._Nation_Enterprises>).

There are some VERY important points about this ruling. The judges ruled that
this 300 words violated right of first publication. These quotes were
effectively taken and printed PRIOR to publication of the book, and materially
damaged the authors.

From Wikipedia: "The purpose or character of the use was commercial (to scoop
a competitor), meaning that The Nation's use was not a good faith use of Fair
Use in simply reporting news."

This is equivalent to PopStar X about to release a groundbreaking song that
they are going to premiere on MTV and then VH1 gets a copy from an intern and
plays 30s of it before MTV does. The court is saying that you can't claim Fair
Use here... I think this is fair and reasonable.

~~~
acqq
I agree about that case.

However back to the wikipedia Fair Use article: "(...) the quantity or
percentage of the original copyrighted work that has been imported into the
new work. In general, the less that is used in relation to the whole, e.g., a
few sentences of a text for a book review, the more likely that the sample
will be considered fair use." and regarding the use of samples: "Samples now
had to be licensed, as long as they rose "to a level of legally cognizable
appropriation.""

His 90-second clip of the movie is 100% of his video "creation" and is
also a clearly "recognizable sample."

Finally "To justify the use as fair, one must demonstrate how it either
advances knowledge or the progress of the arts through the addition of
something new."

It's really hard for him to claim fair use, knowing how fair use is judged in
court practice. There's a huge difference between our idea of fair and what's
considered "fair use" in current legal practice, which sets precedent.

------
pbw
I don't know how their system really works, but surely they extract features
from the video and index the hell out of it. So they are not actually
comparing frames of video here. With clever features and a good index you can
rule-out millions of videos with very small amounts of computation.

It's impressive but it is the same idea as the normal text index. You could
say it's mind boggling that Google searches through every document in 0.2
seconds, but they don't really search through them, they do lookups in an
index.
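
To make the index analogy concrete, here is a minimal inverted-index sketch in
Python. The feature labels, video IDs, and the `min_shared` cutoff are all
hypothetical; the only point is that a lookup touches the videos sharing a
feature with the query and never scans the rest of the library.

```python
from collections import Counter, defaultdict


def build_inverted_index(library):
    """Map each feature to the set of video IDs that contain it."""
    index = defaultdict(set)
    for video_id, features in library.items():
        for feature in features:
            index[feature].add(video_id)
    return index


def candidates(index, query_features, min_shared=2):
    """Return video IDs sharing at least `min_shared` features with the query."""
    hits = Counter()
    for feature in set(query_features):
        for video_id in index.get(feature, ()):
            hits[video_id] += 1
    return sorted(v for v, n in hits.items() if n >= min_shared)
```

Only the short candidate list would then need a more expensive comparison,
which is how millions of videos can be ruled out with very little computation.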

~~~
loumf
Right, but they match words against words more or less. This matches even if
the video is degraded, cropped, etc. They have figured out how to apply
stemming concepts to video and audio.
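
One simple way to get a stem-like effect for images, where degraded copies
collapse to the same key, is a perceptual "average hash". The Python sketch
below is an assumption-laden toy (a frame is a flat list of grayscale values,
and the function names are made up); it is not a description of Google's
system.

```python
def average_hash(pixels):
    """One bit per pixel: 1 if the pixel is brighter than the frame's mean."""
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


original = [10, 200, 30, 220, 15, 210, 25, 205]
noise = [3, -4, 5, -2, 1, -3, 4, -1]
# Simulate a degraded copy: lower contrast, brightness shift, slight noise.
degraded = [round(0.8 * p + 12) + n for p, n in zip(original, noise)]
# An unrelated frame with the bright and dark pixels swapped.
different = [200, 10, 220, 30, 210, 15, 205, 25]
```

Since each bit only records whether a pixel is above the frame's own mean,
mild changes in brightness, contrast, or noise tend to leave the hash nearly
unchanged, so a small Hamming distance can stand in for "same content".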

~~~
pbw
Approximate matching of images/video/audio is not new to Google's system.
Google's system is certainly an impressive application of this stuff, though.

<http://en.wikipedia.org/wiki/Digital_video_fingerprinting>

<http://en.wikipedia.org/wiki/Acoustic_fingerprint>

<http://en.wikipedia.org/wiki/Scale-invariant_feature_transform>

------
barrkel
Now imagine your Google TV examining what you're watching, and checking that
you're correctly licensed to view it.

~~~
houseabsolute
Since that's not among the features of Google TV, why not just say FutureTV
instead? Or indeed TiVo or your DirecTV set-top box? I'm sure this was not
your intention but you might be unfairly giving the impression that GoogleTV
is anti-user, when in fact there is no evidence that this is the case.

~~~
Splines
My set-top box already does that.

I'd be worried to have such functionality baked into a mainstream OS. That'd
definitely be a deal breaker for me.

~~~
houseabsolute
It doesn't compute to me that a feature you call a deal-breaker can be found
in a device you already own, but ok.

~~~
Splines
Sorry, I guess I should have been more specific. I don't have much of a choice
when it comes to my set-top box. I was thinking more along the lines of
desktop operating systems. At this point, OSX/Linux/Windows are roughly
replacements for each other (Linux/Windows moreso than OSX/Windows), and such
a "feature" in one would give me great reason to migrate to another.

Then again, if there was an application/feature that was a must have for me in
said OS, I might stay anyway. It's not an absolute.

------
guelo
Why couldn't Jeff host the video himself?

~~~
pixelbath
Isn't the point of a video sharing site so you don't have to worry about the
overhead of transcoding, building a player, and distributing/debugging your
videos?

~~~
toolate
Why should video be any different from images? I wouldn't consider putting my
site's images on imgur, why should I put videos on Youtube?

~~~
ludwigvan
Because simply "the point of a video sharing site is so you don't have to
worry about the overhead of transcoding, building a player, and
distributing/debugging your videos". Video formats on the web aren't as
standardized as images, at least till html5 catches on, and then there is
again a controversy between webm and h264.

------
gigafemtonano
I'm torn on this issue. If you watch the TED video referenced in the blog
post, the YT representative speaks about the JK wedding video [1] as a vehicle
for getting the Chris Brown song back onto the charts. She also points out
that The Office "parodied" the video for their season finale. I'm seeing a bit
of a disconnect here - it's as though she's making an argument to the studios
and record labels: let users make use of your content and you'll get free
ideas for TV shows and better sales of old songs. Aside from getting on talk
shows and raising a small sum for charity, what did the wedding party receive
for their creative efforts copied verbatim by The Office?

Perhaps an alternate strategy would be the following: a fixed dollar amount of
revenue generated would go to the copyright owner to cover a licensing cost
for the content in question, with any additional amount going to the creative
individual(s) who in all likelihood have a fair use of the content in the
first place. The YT rep speaks of culture and joy, but watching others benefit
financially from one's creative efforts is the surest way to smother such
expression.

[1] - <http://www.jkweddingdance.com/>

------
narrator
This is why I don't watch movies anymore or read anything but the classics.
Why fill your brain with images and thoughts you're not allowed to make
reference to without paying someone. Why not just watch all your movies on
YouTube, even if they are crappy homemade ones, at least you'll be able to
make reference to them later.

Basically, I never let my thoughts get owned and polluted by restrictive
copyright owners, especially when it comes to entertainment.

~~~
michael_dorfman
Let's not be silly. You're allowed to reference whatever you want. You're
allowed to _quote_ whatever you want.

The open question is: how long of a quote is permissible before you exceed
"fair use"?

~~~
acqq
See: <http://news.ycombinator.com/item?id=1702783>

It was already ruled by the U.S. Supreme Court that 300 words taken from
200,000-word material IS an infringement. That's only 0.15 %. Here
if the movie was 2 hours and he took 1.5 minutes, that's 1.25 %.

~~~
michael_dorfman
You're over-reading the Supreme Court case. They said that _in that particular
case_ , 300 words from a 200,000 word book was an infringement. That doesn't
mean that all 300 word quotes are necessarily infringing.

There are criteria for judging "fair use"; Jeff refers to them in the article.
All require judgment, and in the end, if both parties want to push it far
enough, it is a judge who will decide how a particular case meets the
criteria.

~~~
acqq
I fully agree that based on that case we can't establish a single threshold
percentage of the material. But also note how little material can be
considered infringement: the article I linked contains many more criteria than
Jeff presented and has an even more dramatic quote: in music "samples now had to
be licensed, as long as they rose "to a level of legally cognizable
appropriation."" I can imagine that for usage of video material the court
can be as restrictive as it is for music.

