Hacker News

Why wouldn't this be the case? In what sane legal system would one not own the work they published just because it was generated in part or full by an AI? If company B lifts company A's AI and starts generating articles with it, the suit would then be about the code itself, just like with websites, video games, etc.



https://fairuse.stanford.edu/overview/faqs/copyright-basics/

"Finally, to receive copyright protection, a work must be the result of at least some creative effort on the part of its author."

So the question is, who is the author and if it was machine generated, did it take creativity? The algorithm took a lot of creativity, but did the output that is being copyrighted? I mean I would come down on the side of yes, but it makes for an interesting case.


> if it was machine generated, did it take creativity?... but did the output that is being copyrighted?

In the US, this question has been settled since at least the 1990s (e.g., in the context of videogames). The output of algorithms is, in general, copyrightable, although there are some rather common-sense exceptions.

The question isn't whether you can copyright the output of an algorithm. The more salient question, in my mind, is whether the output of ML algorithms belongs to the algorithm's owner or to the owner of the training set.


> It's just that the owner of the training set -- not the owner of the algorithm -- is the one with the valid claim to copyright.

Not all that different from a pop song made by editing together licensed samples, no?

In that case, the song is certainly a derivative work of the samples, and so the producer of the song needs to get derivative-works-allowed licensing from the samples’ authors (which is what you must necessarily get when buying samples from a sample library, for them to be of any use at all.) The produced song is then its own work with its own copyright. Sometimes, larger samples (like reused vocal performances) require payment in, essentially, “equity”—a percentage of the song’s royalties are transferred as royalties to the sample. But in most cases, the sample is purchased for a flat fee, and there is no ongoing relationship between the revenue of the song and the revenue of the sample.

Is anything different if you replace “song” with “news article” and “samples” with “training set”?


Copyright isn't a natural right, and giving rights to computational algorithms isn't a normal legal act - how does it benefit society to do that? Is the deal good for the populace as a whole?

We can have the output for the cost of the energy, or we can perpetually (AIs never die!) pay tax to a wealthy capitalist and have the same output; why is the latter better?


Hmm, yeah, I can see the argument there. I guess it boils down to whether one believes in a transitive property of creativity. I think it applies until the B in "A -creates-> B -creates-> C" is deemed to have certain rights, which is going to be the really interesting question with all this. AI will be like a child prodigy with exploitative parents.


AIs don't need to "eat", so they don't need copyright protection. If someone duplicates the work an AI produces, it doesn't jeopardise that AI's livelihood, because it doesn't have one. Copyright is a bargain intended to enlarge the public domain and reward creative people for the creative works they make.

Yes, we reward AI makers by giving them copyright protection over their work; we don't - and shouldn't, in my personal opinion - reward machines. Why would we? What's the benefit in human terms? There's no moral hazard in turning a machine on when we need the creative works it is programmed to make, and off when we don't need more of them.

Copyright protections that serve the wealthy owners of AIs whilst they simultaneously undercut creative people producing simulated culture (cheaper than actual culture) would not serve the demos.


The creator of the AI still needs to eat. You're suggesting that AI developers should effectively have none of the existing legal protections for software and other creative works. Also, the "bargain" clearly applies to AI applications. Why/how would anyone start a business like https://brandmark.io/ if the generated logos have no legal protection?


> Why wouldn't this be the case?

One possible legal theory: because the algorithm was trained on a text corpus upon which the algorithm's owner has no legal claim.

In this particular case, I don't think that theory would hold much water.

However, consider, e.g., a model that produces encyclopedia entries and is trained on a half dozen existing encyclopedias. IMO, if that model is using techniques similar to SoTA and isn't producing utter garbage, then the owner of that model should have a very difficult time claiming that the output of their model is anything more than a sophisticated, roundabout way of copy/pasting from existing encyclopedias.

But still, in that case, the output is still covered by copyright. It's just that the owner of the training set -- not the owner of the algorithm -- is the one with the valid claim to copyright.


>> One possible legal theory: because the algorithm was trained on a text corpus upon which the algorithm's owner has no legal claim.

The same can be said about human writers: they learn to write based on thousands of "training examples" - the articles and books they read throughout their lives.


Not at all.

Or rather, who knows? Maybe. But at least today, a SoTA model generating a quality encyclopedia is certainly not doing what human writers do, and is effectively copy/pasting.

Maybe in 50 years -- or 10 years with a major breakthrough on the level of general relativity -- that statement will be true. But it's certainly not true of today's deep NLP systems.


A better example is the "copy and paste" news articles that saturate feeds every day.

The exact same set of facts, obviously reported originally by a single individual, is then rearranged, reworded, and republished by hundreds of "reporters"/"bloggers", (sometimes) with attribution of the origin.


That would be a problem, but it would be a data licensing issue, which is distinct. It's more analogous to "Blurred Lines" infringing on "Got To Give It Up" or whatever.



