

Ask HN: Is this copyright infringement? - zaroth

Let's say I build a node-webkit browser-based app which pulls content down from various public sites to display in a new dashboard.

Maybe it's financial data (stock quotes), maybe it's real estate, maybe news, or RSS feeds, whatever.

There are public APIs, designed to be pulled by a browser, where anyone running the software would be sending requests to whatever services.

To me, my guess is it shouldn't be infringement because the user is operating the software to access public interfaces. The user should be able to decide how to view content which you make available to the user.
======
patio11
_To me, my guess is it shouldn't be infringement because the user is
operating the software to access public interfaces._

You will find this distinction nowhere in the laws of the United States.

 _The user should be able to decide how to view content which you make
available to the user._

The laws of the United States, broadly speaking, make copyright holders able
to make decisions of how to exploit their copyrights. That is what puts the
"right" in copyright. "I feel like this would be more convenient for me" does
not establish a copyright exception. There are some gray areas here for e.g.
format shifting, but the model you're outlining is simply wholesale copyright
violation. You would be unwise to pursue it. If you do pursue it, you will
eventually be told to stop. After you fail to stop, should one or more
copyright holders attempt to use the legal process to enforce their
copyrights, you will lose.

~~~
zaroth
Let me give an example -- I wrote some code to scrape real estate listings,
populate my own DB, and a front-end to rank properties and track changes. It
polls the listings and lets me know when things change.

At this point it's just code I wrote for my own use. I can't see how that's
copyright infringement, but even if it were, it's just me organizing data on
my own machine. There's no distribution, there's no commercial element. I
could achieve a similar result pinning printouts to a bulletin board.

Now, let's say tomorrow I open up a Github repo and post the code under an MIT
license... Someone wants to download and run it, they can. The program doesn't
serve any content to anyone else, there's no P2P going on here. There's no
sharing, or inducement to share.
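
For concreteness, the "populate my own DB and track changes" part might look roughly like the sketch below. This is only an illustration, not the actual code: the `listings` table, the `record_snapshot` helper, and the listing IDs are all made up, it tracks a single price field, and it does no network access at all.

```python
import sqlite3


def record_snapshot(conn, listing_id, price):
    """Store a listing's price and report whether it changed since the last poll."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings (id TEXT PRIMARY KEY, price INTEGER)"
    )
    row = conn.execute(
        "SELECT price FROM listings WHERE id = ?", (listing_id,)
    ).fetchone()
    if row is None:
        # First time this listing is seen.
        conn.execute(
            "INSERT INTO listings (id, price) VALUES (?, ?)", (listing_id, price)
        )
        return "new"
    if row[0] != price:
        # Price differs from the stored value: update and flag it.
        conn.execute(
            "UPDATE listings SET price = ? WHERE id = ?", (price, listing_id)
        )
        return "changed"
    return "unchanged"


# Hypothetical data: three successive polls of the same listing.
conn = sqlite3.connect(":memory:")
print(record_snapshot(conn, "listing-1", 500000))
print(record_snapshot(conn, "listing-1", 500000))
print(record_snapshot(conn, "listing-1", 485000))
```

Everything stays in a local SQLite database, which matches the "just me organizing data on my own machine" scenario. Whether that scenario is infringement is a separate question.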

~~~
dangrossman
Merely populating your DB was infringement of their copyright. So is "just you
organizing data on your own machine". Making copies, and making derivative
works (transforming the work in some way), are both exclusive rights held by
the author of a work. Distribution is _another_ of those rights, and
infringing _any_ of the rights is illegal. You do not need to distribute
anything, or induce distribution, to be infringing someone's copyright.

The Copyright Act isn't terribly long or complex, and there are many good
summaries of it online if anything's confusing. You might want to read it.
Ignorance of the law is not a defense should you be sued.

~~~
zaroth
I think it's quite a bit more complicated and nuanced.

For example, derivative works and transformative works are totally different
beasts, so using the words as synonyms, and then saying this stuff isn't
complex, I think is itself a good counterpoint.

If you're interested, take a look at the article I linked on Personal Use and
Copyright Exhaustion.

------
dctoedt
IP lawyer here. Don't rely on the following as legal advice about your
specific situation, but in _U.S. law_ :

1\. The odds are that _contract law_ is at least as much of a problem as
copyright law. Check the terms of service of the various sites that you
access; they probably prohibit doing what you're doing.

(It's another question whether any given site's terms of service would apply
to your access of the site if you didn't click on something like "I agree" at
some point [1].)

2\. There's no copyright in facts or data per se, but there _can_ be a
copyright in "original" selection and arrangement of facts and/or data. [2]

3\. Copying and/or distributing copies of someone else's original content --
such as news write-ups or images -- is very definitely a problem under
copyright law.

4\. A content provider doesn't need to include a copyright notice to be able
to get copyright protection.

5\. Some content providers have tried to claim IP-like rights under the "hot
news" doctrine; it's unclear to what extent such claims are viable. [3]

6\. Conceivably one or more of the sites you access could claim that you
infringe their trademark rights if you included their trademarks in your
dashboard or in your marketing.

[1]
[http://en.wikipedia.org/wiki/Browse_wrap](http://en.wikipedia.org/wiki/Browse_wrap)

[2]
[http://en.wikipedia.org/wiki/Feist_v._Rural](http://en.wikipedia.org/wiki/Feist_v._Rural)

[3]
[http://cyber.law.harvard.edu/people/tfisher/IP/2011_Barclays...](http://cyber.law.harvard.edu/people/tfisher/IP/2011_Barclays_Abridged.pdf)

~~~
zaroth
Thanks very much for your comment.

I agree that contract law is definitely involved. I think Terms of Service are
an agreement between the end-user and the site/service, not the software
developer and the site. Google isn't agreeing to the ToS of every site someone
visits using Chrome. Similarly, an end-user can probably use 'curl' to
violate many sites' ToS, but that's not curl's problem.

I did find this article "Copyright Exhaustion and the Personal Use Dilemma"
[1] somewhat informative, although it doesn't really address my particular use
case, because the content the users of my software would be accessing is not
purchased, but being provided for free under some Terms of Use.

E.g. "To the extent a copy owner reproduces or adapts her copy in order to
enable a personal use, exhaustion insulates her from liability." But it seems
like this is a very gray area of copyright law.

Digging back into the specific example, Redfin Terms of Use state:

    
    
       You will not copy, redistribute, or retransmit any of the information provided
       except in connection with your consideration of the purchase or sale of an
       individual property.
    

So providing software which enables end users to collect and organize
_specific_ listing information from Redfin (not a site crawler), for that
user's own personal use, stored on that user's own machine... I think would
actually fall within the Redfin ToS.

Another way to think about it, it's analogous to extending the 'Bookmark'
functionality of a web browser to better organize and monitor your bookmarks
for changes. Does 'Bookmarking' a webpage become copyright infringement if the
bookmarking software supports...

    
    
      Mashups / co-mingling the content of related bookmarks
      Identifying diffs since the last time you visited
      Background polling of the bookmark with a notification when it changes
      Integrated "Bookmarklet" functionality which changes *how the content is displayed*
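
As a rough sketch of the "diffs since the last time you visited" item, assuming the tool has already saved plain-text snapshots of a bookmarked page: the `diff_since_last_visit` helper and the sample listing text are hypothetical, and only Python's standard `difflib` module is used.

```python
import difflib


def diff_since_last_visit(previous, current):
    """Return unified-diff lines between two saved text snapshots of a page."""
    return list(
        difflib.unified_diff(
            previous.splitlines(),
            current.splitlines(),
            fromfile="last_visit",
            tofile="this_visit",
            lineterm="",
        )
    )


# Hypothetical snapshots of the same bookmarked listing.
old = "3 bed, 2 bath\nPrice: $500,000\nStatus: Active"
new = "3 bed, 2 bath\nPrice: $485,000\nStatus: Active"
for line in diff_since_last_visit(old, new):
    print(line)
```

An empty result means nothing changed, so a background poller could use that as its "no notification needed" signal.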
    

[1] - [http://www.minnesotalawreview.org/wp-content/uploads/2012/11...](http://www.minnesotalawreview.org/wp-content/uploads/2012/11/Perzanowski_Schultz_MLR.pdf)

~~~
dctoedt
> _I think Terms of Service are an agreement between the end-user and the
> site/service, not the software developer and the site._

Don't be too sure about that -- the software developer usually _is_ an
end user to at least some extent, if for no other reason than to ascertain how
to scrape the site, and might very well be bound by the ToS. (Check out the link
I provided about browse-wrap agreements.)

> _Google isn't agreeing to the ToS of every site someone visits using
> Chrome._

No, but you might be inducing someone else's infringement, which could make
you just as liable.

You really need to talk to a good copyright lawyer about this. (Not me --
unfortunately I'm committed to other projects.)

------
zaroth
However, the alternate argument is, how much control should the publisher have
over the full manner in which that content is displayed to the end user? I
guess, this is what robots.txt is all about.

But where's the line to call something a "robot"? A script run by a user, for
their own use, I don't think it's bound by robots.txt. Does it matter who
wrote the script?

~~~
fenomas
robots.txt is nothing more than a way of politely asking the world not to
recursively crawl through links on a site. It has no connection to whether
someone is allowed to access the data or what they're allowed to do with it.
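
To make the "politely asking" point concrete, here is a minimal sketch of how a well-behaved script could check a site's robots.txt rules before fetching a URL, using Python's standard `urllib.robotparser`. The rules text, the `example-bot` user agent, and the example.com URLs are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real script would download the site's file.
rules = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The parser only reports what the site asks for; nothing enforces it.
print(parser.can_fetch("example-bot", "https://example.com/listings/123"))
print(parser.can_fetch("example-bot", "https://example.com/private/data"))
```

The check is purely advisory: a script can ignore the answer, and the answer says nothing about what may be done with the content once fetched.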

------
notatoad
You can't void copyright - the owner of the data still owns it, no matter how
they release it. Even for a work that you put in the public domain, you still
own the copyright to it - you're simply giving everybody an unrestricted
license to use it. By putting data on a publicly available website (or
webservice) you are not giving up any copyright protections.

It sounds like in this case the copyright belongs to the owner of the API you
are querying. They are licensing their data to the consumers of their API,
under the terms of service on that API. If you violate their TOS, you're
violating their copyright.

------
zhte415
Check the terms of use of each of the APIs.

------
dbpokorny
Well, why don't you infringe the copyrights and then tell us how it goes?

