Hacker News
Copyright Registration Guidance: Works containing material generated by AI (federalregister.gov)
411 points by nagonago 15 days ago | 402 comments

> If a work's traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it.

> For example, when an AI technology receives solely a prompt from a human and produces complex written, visual, or musical works in response, the “traditional elements of authorship” are determined and executed by the technology—not the human user. Based on the Office's understanding of the generative AI technologies currently available, users do not exercise ultimate creative control over how such systems interpret prompts and generate material. Instead, these prompts function more like instructions to a commissioned artist—they identify what the prompter wishes to have depicted, but the machine determines how those instructions are implemented in its output.

> For example, if a user instructs a text-generating technology to “write a poem about copyright law in the style of William Shakespeare,” she can expect the system to generate text that is recognizable as a poem, mentions copyright, and resembles Shakespeare's style.

> But the technology will decide the rhyming pattern, the words in each line, and the structure of the text.

> When an AI technology determines the expressive elements of its output, the generated material is not the product of human authorship.

> As a result, that material is not protected by copyright and must be disclaimed in a registration application.

This is the crux of the argument for me.

Don't overlook the fact that agencies like the USCO are at the bottom of the totem pole in terms of determining stuff like this. They move first because they're on the front lines, as it were, but they can be overruled by the courts and the courts can be overruled by congress.

100% this will be litigated and likely altered in the coming months/years.

I'm still hoping it stays that way. That'd be a fantastic direction for works in the public domain.

Yeah but that assumes people are honest about the providence of their content and let you know up front that it was produced by an AI and is thus not protected by copyright… and given the preponderance of copyright info in website footers, I expect this will be an error of omission rather than an active false claim. When individual articles carry no more specific copyright note mentioning the AI, all you have to go on is a generic copyright statement in the footer, so you can only safely assume it’s protected.

People might not be honest. But large organizations, oh boy. There's zero way Disney will want any chance for their materials to not be copyrightable. If the Copyright office's view on this stands, there will be no great disruption of the employability of creatives.

> There’s zero way Disney will want any chance for their materials to not be copyrightable.

Sure, but Disney will just not release the intermediate products that are the direct outputs of pure prompting, and make sure they are doing at least enough beyond that before they release anything that the whole is covered by copyright.

It will be interesting to see what position large software companies take on this. They also won't want to run the risk of losing copyright in their software, when programmers use AI assistants. Likely there's less risk here, at least at the moment, because you still need to do a lot of manual work above what AI does for you. But how close are we to the point where the corporate lawyers start to get nervous?

True. Though I wonder what will happen when someone sells a wildly successful novel to a publisher for millions of dollars, and then it turns out that it was composed by an AI.

Reid Hoffman just published a book written with gpt4 [0] so it’s not even a hypothetical.

He made it available as a free ebook but it’s also for sale on Amazon.

I wonder whether, if people remove his portions and retain only the AI portions, they can distribute the result freely.

[0] https://greylock.com/greymatter/amplifying-humanity-through-...

Lawsuits and headlines :)

Well, if joe doe can now make a picture that rivals Pixar’s and it’s a commercial success, that puts a lot of pressure on Pixar & Co to reduce costs. Doesn’t matter whether joe doe gets rich with this or not.

On the other hand, you can bet big companies like Disney will lobby hard for copyright to extend to AI generated works. And it wouldn't be the first time Disney has influenced copyright law.

Why can’t the Disney creatives simply lie and use the AI to ideate and produce drafts that they touch up and pass off as their own? Seems totally rational.

The way this will work in practice is that Disney will be strongly incentivized to disclose the AI generated parts, for fear it might lose the entirety of their copyright on that work if their human authorship is not properly delineated.

> There's zero way Disney will want any chance for their materials to not be copyrightable.

Surely you can see how easy it is to get around this? Even if it was a "prompt" that originated the design of a Disney character, they wouldn't try to get a copyright on that.

People will lie, sure, but you only need to introduce a small amount of risk to make big risk-averse companies wary.

Say an ex-Blizzard employee takes a character design from the last project he worked on there and uses the exact same design for his new indie project. Blizzard sues and says they own the rights to the character design. The ex-employee pulls up a video he took on his phone showing that Blizzard employees generated the design with AI. Judge throws the case out because Blizzard can’t legitimately copyright the design.

Maybe not the most realistic scenario, since big companies can usually scare people into submission before you ever get to litigation in the first place. But the mere possibility of such a scenario would have to be something that the legal team accounted for in their risk analysis.

> Yeah but that assumes people are honest about the providence of their content and let you know up front that it was produced by an AI and is thus not protected by copyright…

Detection systems for generative AI are being developed (potentially helped by generative AI systems being built with watermarking capabilities that are designed to be unobtrusive to humans but detectable with tools). Research on this (and experiments by the people selling generative AI, who also want to sell detection tools, and sell their product suite as providing “safety” because of that combination) is quite active.

I honestly think this is a fool's game - maybe it will work in the very, very early stages of generative AI we're in now, but there's no way this can work reliably going forward. If anything, I imagine it will soon start yielding false positives against human-generated work too.

Probably not. This is a gross simplification of how counterfeit photocopied money is detected, but human visual acuity for yellow dots on white paper is pretty weak compared to red dots, so color photocopiers and color printers have/had firmware that would seed money with extraneous yellow dots. Computer vision has no such visual distortion, and as such computers see counterfeit money as covered in yellow dots.

In a way it's simpler with AI. AI is the ultimate groupthink tool and record keeping is simple. Simply ask the AI if this is the only possible output that could have been generated that's consistent as of March 17 2023 based on political limitations and censorship rules and the artistic fads, trends, memes, and styles of that era. The smaller the AI contribution (perhaps an icon bitmap) the (edited: easier) harder it is to hide AI involvement and the smaller the damage caused by AI to the copyright status of the work. The larger the AI contribution the easier it is to detect, but the larger the damage AI is causing, so it balances out.

As a concrete example, today, ask an AI "Please write a Harry Potter book" and you will get a story that's extensively filtered and censored and bowdlerized to March 2023 political / cultural standards (hmm, whose standards, the book buying public or some other group? Most people do not like our current censors... but they are in total power right now...). It would NOT look like a 1997 book at all; books from that long ago are only suitable for public book burnings now. In theory it should be possible for an incredibly politically incorrect AI to be permitted in 2033 solely for lawsuit discovery purposes to "Please write a Harry Potter book adhering to what we now call the hate filled politics of March 2023", then compare the md5sums; if they match, done, AI detected.
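The md5sum comparison described above can be sketched as follows. This is only an illustration under a strong assumption: that the model can be re-run deterministically (same weights, seed, and sampling settings) so that regeneration reproduces the text byte for byte. The names `digest` and `matches_regeneration` are hypothetical, and the whitespace normalization is an added convenience, not part of the original suggestion.

```python
import hashlib


def digest(text: str) -> str:
    """Return the MD5 hex digest of text after collapsing whitespace."""
    # Collapse runs of whitespace so trivial reflowing does not change the hash.
    normalized = " ".join(text.split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def matches_regeneration(candidate: str, regenerated: str) -> bool:
    """True if the candidate text hashes identically to the regenerated text.

    Assumption: `regenerated` came from re-running the model deterministically
    with the same prompt. Any edit to a word in `candidate` breaks the match.
    """
    return digest(candidate) == digest(regenerated)
```

Because a hash changes completely when a single word changes, a match is strong evidence but a mismatch proves nothing: light human editing of the output is enough to defeat the comparison.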

There aren't many AIs and there aren't many people permitted to grind their axes by censoring those AIs, so it's a pretty small solution set.

There's an interesting political aspect to banning copyright of AI production, anything turbo-woke can be assumed to be AI generated and as such uncopyrightable, so the only way to make money in the 2030's will be to sell remakes and sequels of "Birth of a Nation" and "Mein Kampf" because an AI would never be permitted uncensored expression, so those cannot be AI produced and must have been made by humans, and as such only "right wing" content can be copyrighted and sold for huge profit.

right, but what do you do once I've taken my AI generated image, run it through illustrator's image trace, and performed just enough automated manipulation, then copied it through a format or two?

Writing a method to scrub this stuff is trivial.

Did you intend to summon me, instead of the word providence?

to explain:

- provenance: the source or origin of something

- providence: divine guidance.

Sometimes I think autocorrect needs a little divine guidance. It is remarkably bad at correcting anything to certain words I forget the spelling of, and sometimes it definitely has a bias in its “most likely word”. I’m not suggesting a prejudicial bias, just one that’s less than ideal for me: it never seems to learn that I never use one word over the other, and just keeps suggesting the words I don’t want over the ones I do.

I've caught my phone autocorrection replacing "you" with "U". I honestly believe that errors are intentionally introduced. Sometimes I even conjure up plausible-sounding theories as to why. "Engagement" is my current almost-plausible theory.

I assumed there were learning heuristics in some layers of the software responsible for the autocorrect. But other than new words that turn up in my autocorrect that I know are new because they only entered common use or were coined recently, I generally just find the autocorrect to be stubbornly insistent on its likeliness ordering… or just utterly incompetent…

It’s really bad at the word bureaucracy. It never suggests this word, no matter what combination of “beuro” “beauro” or anything else my brain might ineffectually grasp for when trying to remember the spelling. I can get it to suggest it as a possible replacement when I ask spellcheck to give me potentially correct words, but the layers of autocorrect that try to predict a word even if it’s not spelled correctly are utterly unable to predict the word bureaucracy.

I think it's possible that some generous person determines a sequence of prompts that generate, say novels, and then pipes these prompts into a program causing thousands or even millions of wholly varied novels to be generated in the public domain. I imagine this is what the OP meant.

I've been changing my mind a lot on AI these past few weeks.

I don't think the price is what stops most people from reading books. People already have access to countless works they don't have time to read, adding a bunch of soulless ones to this seems like it won't change much.

As a (once avid) reader, the worrying part will be discovery. When I was eight I could pick any book off the library shelf and it was interesting and enlightening. Today, the noise so drowns out the signal that I have to rely on recommendations. Tomorrow, when both the books and the recommendations will be generated by bots outpacing human authors by orders of magnitude, I expect that quality new material will be impossible to find.

I pray that I am wrong.

Honestly, library books tend to be better than random stuff in Barnes and Noble because if it wasn't checked out, it probably would've already been scrubbed from the selection (libraries have limited space), so instead of just getting whatever books were published in the last 2 years, you get books that were published in the last 200 years, and only the more interesting ones. Additionally, the Dewey Decimal system, or its replacement (don't remember what it's called), sorts stuff by similar topic, so if you're already in a section that is interesting to you, any nearby random book also is likely interesting.

This is why I like still going to physical libraries. Also, lack of user-hostile interfaces.

Libraries have limited space, yet somehow have room for cruft like a 900 page garbage book on XML from 2001. :)

(Closed) Knowledge communities* will resurrect, and undoubtedly there will be communities with a spiritual basis. Possibly a new age of Modern Medievalism awaits.

* think monks and manuscripts

Or maybe you'll just be able to talk to ChatGPT about what you like and get recommendations.

What can change is that people can get more of exactly what they like. In which case many might well put up with imperfect continuations, and the AI will have material to mimic.

same with art and code

Agree, it's fascinating to think we could just generate a whole new public domain.

Also, it's fascinating to think humans might be kept in the creative loop only to establish copyright status. Legal enablers.

> Also, it's fascinating to think humans might be kept in the creative loop only to establish copyright status. Legal enablers.

I don’t know of any specifically, but any union or trade group that represents creatives should be jumping to line the pockets of politicians to enshrine this asap.

I really hope it doesn't since the interpretation of copyright laid out above and held by the USCO is so obviously the only interpretation that preserves any usefulness of copyright as a concept whatsoever.

IANAL but AFAIK the way this typically works (as a general principle of administrative law in… most/many countries?) is that the administrative bodies who are delegated the power to make decisions are the authority on them.

If a court reviews a decision of an administrative body, the court typically starts with the assumption that the decision was correct, and puts a relatively high bar towards reversing that decision. Typically a decision of an administrative body would only be overturned by a court if that decision was patently unreasonable or fell outside the jurisdiction of the body (eg.: if it violated a different law or some constitutional provision).

YMMV in different countries (I know more about Canada).

> If a court reviews a decision of an administrative body, the court typically starts with the assumption that the decision was correct,

Same here, the underlying principle is called the presumption of legitimacy, the organisational counterpart of the presumption of innocence.

> If a court reviews a decision of an administrative body, the court typically starts with the assumption that the decision was correct

In the US, at least, it varies considerably by the exact kind of decision and the context in which it is being applied, ranging from extremely strong deference (Chevron deference) to anti-deference in some contexts. And opposing Chevron deference was an overt priority of the Trump Administration in judicial selection, so in general it would be reasonable to expect deference overall in the US to wane for some time.

Also don’t overlook that this is about “solely a prompt”; real non-trivial workflows for systems like SD (MJ I’ve used less) are different from “solely a prompt” and may or may not get the same outcome, even at the Copyright Office level.

I’ve been noticing my workflow in Automatic1111 basically always ends up taking an hour. I like to compare it to going to a weird shoreline filled with strange objects, picking out the “shells” I like the most, then taking them home, cleaning them up, and arranging them in a way I think looks pretty.

At some point I’m making something artistic. My friends think I’ve got a keen eye on what looks cool so I’ve been improving by leaps and bounds even though strictly speaking I’m not drawing anything? I’m just really good at being descriptive and inpainting the weird parts.

I was expecting this argument.

I'm sure they're happy to register copyright on a text you wrote yourself and used your imagination on—a prompt. I agree this has creative input on the artist's part and can be tricky to get right. Similarly, if you do significant edits to the result, I'd assume this ought to continue to legally function just like photo editing functioned before AI-generated images.

They're just saying they're not going to copyright what you didn't actually paint or decide on (like the words in each sentence or how to rhyme them, as in their example).

I think the comparison to photography or electronic music is apt.

If you just open up Garage Band and click a couple of times to enable loop "autofill" chords / rhythm, should that be copyrightable? I'd say probably not; but if you spent a bunch of time tweaking and honing all the parameters to get something unique and interesting, I'd say they should. Should the person then only get the copyright on the Garage Band parameters, but not on the resulting output audio?

Or consider the amount of creativity that goes into most photographs -- it's often nothing more than deciding to take a picture. And yet you have an implicit copyright on every photograph in your phone, just for having made the decision to pick up your phone and click.

Or consider the Copyright Office's comparison to giving instructions to an artist. Yes, if you just said to an artist, "Draw a picture of the Eiffel tower", then the artist should own the copyright. But if you worked closely together with the artist, iterating over dozens of designs and sketches, giving feedback and direction on the creative decisions, then personally I think you should share in the copyright. (Not sure what the law is in this case.)

Similarly, if you spend an hour iterating through dozens of prompts to get the output of a generated image exactly the way you want it, then you're an artist and should be able to get a copyright on that image.

Add one manual detail here and there on the image, they will have copyright protection.

This position statement makes it clear that only the manual details themselves merit protection, the base image itself must be disclaimed from the copyright registration. It would be fraud to attempt otherwise.

If I use controlnet and openpose to generate a character in the exact pose that I want - what does count as a manual detail then?

So with some custom LoRAs to maintain consistent appearance of characters and controlnet/openpose to pose characters exactly like I want I could make a comic book.

Would my work be copyrightable then?

It's more complex than a straightforward totem pole. (For example, you could look into Chevron deference)

It's the part of the bottom of the totem pole that moves first because it's on the front lines.

Each of them has long, complex procedures to follow in many cases, with some exceptions. A court might well be able to issue a preliminary injunction faster than an agency can do notice-and-comment rulemaking.

Agencies specialize in problems that don't need a simple rule, but rather need large amounts of meticulous and tedious special cases. They've got more bandwidth, but not necessarily shorter latency.

Absolutely agree - in fact, I'd say it will never be settled as long as AI contributors exist in any meaningful way.

“The courts can be overruled by congress” is, to put it mildly, not correct.

Congress can pass legislation that changes the legal status of things (as long as it doesn’t run afoul of the constitution). But that’s a far cry from “overruling”.

> But the technology will decide the rhyming pattern, the words in each line, and the structure of the text.

What if I generate it 50 times and cobble together a final product from the variants, choosing which line and word and rhyme I like from each sample? Where is the line between LLM and thesaurus/dictionary?

The comic book the office rejected seems to me to have crossed a line of significant human editorial discretion after the generative fact, and they didn’t care.

The document addresses that:

> In other cases, however, a work containing AI-generated material will also contain sufficient human authorship to support a copyright claim. For example, a human may select or arrange AI-generated material in a sufficiently creative way that “the resulting work as a whole constitutes an original work of authorship.”

Pretty sure the office said the arrangement of the book was itself protectable but the individual images were not.


Didn’t the author claim to be photoshopping and compositing the images themselves?

If I render 6 characters, composite them into a background and run the complete composite through another generative step to clean it up, where do I fall?

They allowed the comic book author to copyright the accompanying text and the overall arrangement of the images into a comic book, but not the images themselves.

Soup anyone?

  'Andy Warhol Copyright'

Careful! I’ve always felt that, as written, copyright should not apply to most software. Since only creative expressions may be copyrighted and functional expressions are not copyrightable, it’s very hard to match up the copyrightable portions of software with those that are not.

Requiring a declaration of what parts are and are not copyrightable for software would be a major shift in policy and require major changes in software methodology. It’s not a big jump to think that any AI-generated software code would not be copyrightable, regardless of origin.

Another example of “Software wants to be free.”

Not advocating copyrightable software, but it seems clear to me that if databases are copyrightable because they represent curation, then software must be too.

Any line of code, probably not. But a program as a whole, sure.

Otherwise I think you blow up music copyrights too, since a single note isn’t copyrightable and a song is just a bunch of individual notes.

At some point, choosing, ordering, and organizing a collection of non-copyrightable quanta produces a copyrightable work. And I think it had to be that way.

"if databases are copyrightable because they represent curation, then software must be too."

Circling back around to creative works, the process of making art is also one of curation, as had been amply demonstrated over 100 years ago by the Readymades of Marcel Duchamp.[1]

The curation involved in prompt selection and selection of which AI-generated works/images to use should therefore be just as copyrightable as databases, if all that copyright demands is curation.

[1] - https://en.m.wikipedia.org/wiki/Readymades_of_Marcel_Duchamp

I'll just go ahead and say Readymades aren't art and express my belief that the artistic community will survive the copyright loss of loosely arranged furniture, so we can avoid justifying the current state of affairs with avant-garde silliness. John Cage made a musical piece, 4'33", which is 4 minutes and 33 seconds of silence and has as much to do with music as Readymades do with art.

His other works include curated noises such as buzzing blenders, falling objects, and street traffic. The only art is the act of convincing someone these things are art or deserving of copyright.

John Cage also wrote some genuinely beautiful music and was a sincere and thoughtful artist.

It just needs to be said that he wasn't merely a "gimmick" man.

I'm only familiar with those elements of his career, so if you have some recommendations I'd be happy to explore his other work.

Try "A Room" for solo piano. It's akin to proto-minimalism with a bit more "spice".

His writing is apparently very good although I've never delved deep. His work was motivated by his interest in Zen and similar topics rather than any kind of high-concept modernism.

I've yet to find a better definition of art than that of Andy Warhol, who said "art is what you can get away with." Considering that both Duchamp and Cage got away with convincing the world that their work is art, it is.

To the degree that I've been able to verify, Warhol borrowed that one from Marshall McLuhan.

You mean to say he's a con artist?

I give the benefit of the doubt that Marcel Duchamp and others genuinely believed in the artistic merit of their works and their enthusiasm made it easier to persuade people to agree with their viewpoints. The brief history of these artforms suggests that in the absence of their champions they don't resonate with artists but are still talked about because they stoke conversation about the nature of art. Maybe that was their original intent. I don't think these endeavors were profitable enough to suggest dishonesty over passion, but if someone tries to sell you a moose head mounted over a toilet for 500k because it's "art" you may want to question their motives.

Good art typically stands on its own and connects with people without the need to explain why it's significant. Salvador Dali's work is visually interesting even if you have no clue what any of it means. If I see a bunch of objects scattered around a room it looks like clutter. If I see a bunch of paint splattered on a canvas it looks like a mess. If someone has to explain why these things are "art", is it the art that's compelling or their argument? The exception to that would be optical illusions.

Art is somewhat subjective so someone will always argue their own preference but if anything is art because "it's all subjective" then nothing is art. Buttering my toast in the morning could be art because I spread it using a spoon instead of a knife and that is an expression of rebellion against societal norms. Did I sell you on my art or my argument and is the latter what we want?

Apologies, I wasn't being serious - an attempt at wordplay.

I think the open question is whether software is in the realm of patents or of copyrights. It's not unreasonable to suggest the former, since software is kind of like a widget that does stuff, except virtualizing the "widget" part to run on a generic widget of sorts. Copyrights traditionally applied to works that you would consider "artistic" rather than "inventions."

What you're suggesting, about a minimal work, is litigated endlessly with respect to music copyrights (since a lot of pop music is very similar, arguments about plagiarism get crazy), but is part of the law already.

Aren’t dry, non-fiction books copyrightable? I don’t think “creative” necessarily means “artistic”, just making something where there was nothing. It’s hard for me to see a software program as a whole as uncreative in the same way logarithm tables are.

I agree it’s a complex area with no easy answer. But I really think that software, like words and notes and numbers, scales from the generic and unprotectable to the highly creative and protectable. I guess I’m saying we shouldn’t invalidate copyright based on the medium being software, any more than we would for the medium being audio.

The expression can be as dry as a desert. Is the result an act of creativity? The bar is pretty low.

Say you had a book that listed the price of eggs over 100 years. You introduced each chapter, perhaps describing the egg market at the time, etc.

The book is copyrightable though lists of egg prices aren’t. Maps are another fun case for copyright.

Software alone is not patentable. You get around that by patenting a system.

That is the state of the law today, where software gets copyrighted and can't be patented, but the debate about this is more about what ought to be the case.

Good luck changing it!

It’s neither, it doesn’t fit. Chip masks are another whole branch of IP law. As strange as it sounds, they are a bit like two sides of the same coin.

While not so in other jurisdictions, databases are not copyrightable in the US. That was true when I last read Groklaw at least.

Sweat of the brow (any work you do is copyrightable) https://en.wikipedia.org/wiki/Sweat_of_the_brow


"Original work of authorship" https://en.wikipedia.org/wiki/Copyright_law_of_the_United_St... (in particular the comparison section https://en.wikipedia.org/wiki/Copyright_law_of_the_United_St... )

The court case that set this difference down was Feist Publications, Inc., v. Rural Telephone Service Co. https://en.wikipedia.org/wiki/Feist_Publications,_Inc.,_v._R....

It would be good if songs weren't copyrightable. Going back to having music as a part of culture would be nice.

software works are normally subject to copyright without much further ado.

database works are more commonly subject to rights _similar_ to copyright, but often a class of work in its own right.

The US has for IP protections: copyright, patent, trademark, trade secret. There is no “similar to” copyright.

Databases can be copyrightable. Not all are, but to the extent IP protection extends to databases it will virtually always be copyright, with maybe an occasional trade secret (which is almost the opposite of copyright)

We speak of software design for a reason. Arguably, anything involving substantial design work is copyrightable. Buildings, for example, are copyrightable, even when they only exist as architectural documents or as models. Software is rather similar.

It's not clear to me that this position is wrong, but it seems wildly inconsistent with our policy that every photograph I take is protected by copyright, no matter how inane and low effort. I have images I generated with Craiyon (then Dall-E Mini) into which I put way more creative effort than any of the photos on my phone.

Take a photo of your screen! BAM. solved

Didn't https://en.wikipedia.org/wiki/Bridgeman_Art_Library_v._Corel.... rule that that trick doesn't work?

> “write a poem about copyright law in the style of William Shakespeare”

Oh, this can be fun.

“Write a story about copyright law in the style of Franz Kafka.”

In a world that was both strange and familiar, a man named Josef K found himself entangled in a complex web of legal disputes. It all started when he received a letter from a law firm claiming that he had violated copyright law by using an image on his blog without permission.

Josef K was bewildered by this accusation. He had always believed that his use of the image was fair and reasonable, and he had no idea how he could have infringed on anyone's rights. Nevertheless, he was summoned to appear before a tribunal to defend himself against these charges.

The tribunal was a labyrinthine bureaucracy, with endless corridors and dimly lit rooms. Josef K wandered through the maze, trying to find the right office, but every door he opened led to another hallway or another set of stairs.

Eventually, he found himself standing before a stern judge who peered down at him from behind a towering desk. The judge had a sour expression on his face and a stack of papers in front of him.

"Mr. K," the judge said, "you stand accused of violating copyright law. Do you have anything to say in your defense?"

Josef K stammered nervously, trying to explain his side of the story. But the judge cut him off.

"You do not understand," the judge said, "the law is the law, and it must be obeyed. Your intentions are irrelevant. You are guilty simply by virtue of having used the image without permission."

Josef K tried to protest, but the judge was unmoved. He handed down a verdict of guilty, and ordered Josef K to pay a hefty fine and remove the image from his blog.

Josef K left the courtroom feeling confused and defeated. He had always believed that he was a law-abiding citizen, but now he found himself caught in a nightmarish world where the rules made no sense and justice was arbitrary.

As he wandered through the labyrinthine corridors of the tribunal, Josef K realized that he would never be able to escape the clutches of copyright law. He was trapped in a Kafkaesque nightmare, where even the slightest misstep could result in a crushing blow from an inscrutable legal system.

yeah, it has that GPT style, sounds like a rushed story

It’s a decent start, but everything is just cheap. Lots of low-hanging fruit, ignoring anything higher up. In a way, it’s as if I, someone who doesn’t enjoy writing and is bad at it, wrote the story.

You don't even have to read it. This GPT-2 output detector tags it as 0.63% chance of being written by a human.


Should have called it "LLM Runner".

But this does not read like Kafka at all... Am I missing a joke here?

Clearly the model interpreted "in the style of Franz Kafka" to mean "featuring an inscrutable, uncaring, and inescapable bureaucracy".

Perhaps it interpreted it as more along the lines of 'write a story exhibiting features that are Kafkaesque'.

I noticed this as well.

The implication for generated code is pretty astounding tho.

What about all those (c) foo 2023 prologs you see all over the place? Will there be arguments that specific lines weren’t written by a human so therefore the entirety of the claim isn’t valid? Or will there be refactors to push generated code out to a well known path or construct (akin to a header / implementation or some such)?

And on top of that, what if your “secret sauce” comes back to some form of generated construct (looking at you, Whoop, with your “strain” and “recovery” scores)?

Further still - can a “machine-readable medium” that was written by a machine be patented?

As per this thread: https://news.ycombinator.com/item?id=35192038, I reckon the whole is “creative enough” and would be protected as the developer only used parts and cherry picked and combined what they needed.

> If a work's traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it.

I imagine many people have a workflow where they generate images with Stable Diffusion and retouch in a painting app, do some inpainting and retouch.

It seems all of those should be covered by copyright (nothing above makes me think they won't, at least).

Yeah I think people are reading this the wrong way. Software was originally considered unpatentable under 35 U.S.C. § 101, but language was developed to allow software to be patented by narrowing the scope of the claimed inventions to physical processors which are programmed to contain specific algorithms.

I don't think this is a showstopper in terms of copyright registration of A.I. works, but rather a first step in a back-and-forth process of figuring out the legal structure of how to make rules and enforcement work.

ControlNets also blur the line here. You can start out with a generated image selected from hundreds of samples (human input: curation), do a canny edge detection or fake scribble, then start tweaking and adjusting it to remove elements that over-determine the output, adding or overlaying sketches to add elements, then iterate on prompts, scale up with img2img, fix smaller elements with inpainting, and yet more curation.

And that's before we start talking about finetuned models, model merging, LoRA and so on.

It's too early to say where we will end up with AI generated works. In the search for control over output, there's a lot more creative human input needed today. This is in part down to limitations in the integration of current AI, and a dialog with a chat bot, or even a voice conversation combined with a pointer may be enough to convey intent in the future. We just don't know yet.

I also wonder if there's a parallel with the relationship between modern artists and artisans. For example, many marble sculptures by modern artists are actually carved by artisans in Italy, following directions from the artist. I expect that the artisans are doing work for hire and the copyright is retained by the artist with the idea.


> “More conceptual artists have no work space at all, simply imagining ideas for a work, which they communicate with words.”

Sounds like a prompt?

> “More conceptual artists have no work space at all, simply imagining ideas for a work, which they communicate with words.”

Very interesting too, since a lot of modern artists work like a small company, where they have employees that execute on the artist's vision. The copyright still belongs to the artist, though, because of the employment contract.

Based on this reasoning, does that mean that you could not claim copyright over a picture captured by a motion sensor camera?

Basically you instructed it to capture when it determined movement and the camera determined how those instructions are implemented in its output.

Also look at all the post processing and AI that goes into smartphone cameras right now. The human doesn't even choose which photo is kept; that is done algorithmically, and the photo is stitched together in lots of different ways. The human just prompted: 'Now!'

…as long as you don’t count choice of subject matter, shooting angle, composition, crop, or moment to shoot to be creative choices.

We don't need to invoke smartphones here. There's no more or less creative input from a human in traditional photography.

Timing is only one aspect of creativity.

In most settings a video camera’s output is copyrighted because of how you select the location and angle to film. Motion capture has all of those elements, and simply doesn’t capture 99.9% of the possible images. i.e., someone picked a log crossing the river as a place where something interesting will happen and they set up the shot to look interesting.

Which is the general argument why such setups fall under copyright. Though as always there are many edge cases.

There are so many places to inject creativity into the process.

Before the capture you have: choosing the location, posing the camera, adjusting objects in the background, setting up bait to attract wildlife, choosing optimal camera settings. Even selecting a camera could be considered a creative input.

After the capture, the artist can choose which of the thousands of captures best fits their vision, crop it, and apply color correction.

I don't think a single one of those actions can make it a creative work on its own. But when you combine a few of them along with intent, it becomes a creative work.

You can do most of that with prompting and tools like control net in stable diffusion as well. And then take it into photoshop and do changes, feed it back into img2img and inpaint until your heart's content. One can spend multitudes more time than it took to tap the shutter button on an iPhone. In Midjourney of course, you have far less control.

There's a ton of control that prompt crafting alone gives you. There's also the choice of which version of midjourney to use, and various meta options that it gives you.

The copyright office clearly has not the slightest clue about what they're talking about when they claim that the AI is the sole creator here. AI generated content has always been a collaboration with humans, and there's always human creativity involved.

You can own the copyright to the prompt used to generate the output.

But the algorithm isn’t collaborating, every possible outcome is fixed when the algorithm is finalized and users can’t actually change the possibilities. I clearly don’t own the copyright to my Google search results even if my query is quite unique.

The essay that has helped me most to think about this sort of thing has been Brian Eno's Composers as Gardeners.[1] It's about music, but I think it applies equally well to AI-generated art, where humans collaborate by writing the algorithms, choosing the subjects, providing the prompts, and curating the results.

Here's an excerpt:

"...essentially the idea there is that one is making a kind of music in the way that one might make a garden. One is carefully constructing seeds, or finding seeds, carefully planting them and then letting them have their life. And that life isn't necessarily exactly what you'd envisaged for them. It's characteristic of the kind of work that I do that I'm really not aware of how the final result is going to look or sound. So in fact, I'm deliberately constructing systems that will put me in the same position as any other member of the audience. I want to be surprised by it as well. And indeed, I often am.

"What this means, really, is a rethinking of one's own position as a creator. You stop thinking of yourself as me, the controller, you the audience, and you start thinking of all of us as the audience, all of us as people enjoying the garden together. Gardener included.

"We're used to the idea, coming from the industrial and very intelligent post-Enlightenment history that we have, we're used to the idea that the great triumph of humans is their ability to control. And indeed, that must be the case, to some extent.

"What we're not so used to is the idea that another great gift we have is the talent to surrender and to cooperate. Cooperation and surrender are actually parts of the same skill. To be able to surrender is to be able to know when to stop trying to control. And to know when to go with things, to be taken along by them. And that's a skill that we actually have to start relearning. Our hubris about our success in terms of being controllers has made us overlook that side of our abilities. So we're so used to dignifying controllers that we forget to dignify surrenderers...

"...my idea about art as gardening is to sort of revivify that discussion and to say let's accept the role of gardener as being equal in dignity to the role of architect, as in fact, is shown in this lovely pavilion here."

[1] - https://www.edge.org/conversation/brian_eno-composers-as-gar...

Simply planting a tree doesn’t give you copyright of the shape the tree ends up in the way you would on a sculpture.

That’s been the case for a very long time: you need significant control over the specific output before it qualifies. A garden is copyrightable based on the layout of the plants when that involved creativity.

That’s been a legal distinction for a very long time, and this statement is simply consistent with that history.

> But the algorithm isn’t collaborating, every possible outcome is fixed when the algorithm is finalized and users can’t actually change the possibilities.

But that actually reinforces the idea that all of the creative work is in the prompt, and everything else is a purely mechanical process implementing the command given by the prompt. Arguably, saying that a programmer can copyright the prompt but not the resulting image is like saying I can copyright source code, but can have no copyright on the output of the compiler.

Compilers don’t create a new copyright; the output is covered as a derivative work.

However, derivative works have clear limitations and the output of a chat program doesn’t qualify any more than you own the copyright of what someone says when you interview them.

Put another way you don’t own the copyright on the specific shape of a tree as a sculpture because you selected its species when you planted it.

> Compilers don’t create a new copyright; the output is covered as a derivative work.

A derivative work is a separate work that, considered apart from the one it is derived from, separately has the required creative input to be a copyrightable work, and it does, in fact, have a separate copyright from the original (creating derivative works is an exclusive, but licensable, right of the copyright holder of the original, but the copyright of a derivative is separate.)

I don’t know if that distinction was intended as a limitation; “sound recording” is listed as a derivative work in the statute. It also clarifies that “Copies” are material objects, other than phonorecords.

So, mechanical transformation such as rendering a webpage at 150% scale is seemingly a derivative work even if there isn’t any creativity in the process.

I would argue that location and angle to film are analogous to specifying that you want a poem (as opposed to an essay) about copyright law (as opposed to quantum mechanics) in the style of William Shakespeare (as opposed to John Milton).

Basically, those are very broad parameters specified, and the machine is the one that decides to produce the output. You aren't even picking the timing or the subject or likely even the focal length. The machine is doing that based on its algorithms (and with autofocus) even the focal length.

It’s not just location and angle but where to aim the motion capture sensor, what if any bait to use etc. There are effectively infinitely many possible ways to set up these things.

With chatbots, every possible response is predetermined when the algorithm is set up. Saying you should get copyright is like saying you should own the results of a search engine response to your query.

How is setting up a motion camera on a log not similar to setting up a prompt for ChatGPT?

Yeah, I think in the general case there's going to be a significant amount of selection and arrangement of the works from the AI, though I guess that even then none of the underlying images (or such) of the AI get protection.

I wonder if they know about the "inpainting" technique where the human generates an image, erases parts of it, then adds another image and has the AI fill in the blanks and make a new image based on the resulting image?

I think that'd be harder to disentangle the way they do right now to say these parts are copyrightable because the human did them and those parts are not because the AI did them since you can't really separate the inputs at that point.

The camera doesn’t decide what to put on the picture?

It does… or at least the human doesn’t.. it’s motion activated

Motion capture rigs capture whatever trees/rocks/buildings etc. the person setting them up aimed at. The only thing motion capture does is pick interesting times, but you can write software that does the same thing with a video.

The camera is not making a decision. The human made the decision to set the camera up to take pictures when motion activated.

There was a case last year where a person sued a content creator on YouTube for distributing parts of their accidental livestream. [1] The argument for fair use was that there was no creative input, so fair use. Though, I really do not know, as I think images and videos can be copyrighted exactly for the reason in your comment.

[1] https://www.youtube.com/watch?v=Uk1Tzqc5vk4

The concept of “threshold of originality” plays a role in whether something is copyrightable: https://en.wikipedia.org/wiki/Threshold_of_originality

Your question therefore cannot be answered a priori. The copyrightability has to be assessed on a case-by-case basis.

No. You instructed the camera, placed it and set the angles.

If a monkey steals your camera and takes a selfie you can’t copyright that.


> You instructed the camera, placed it and set the angles.

And I instructed the model, and set its tuning parameters.

Someone else could just use the same prompt and get the same results, would you sue them for copyright infringement then? Do you think that makes sense?
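
To make the "same prompt, same results" point concrete, here is a toy sketch, not a real model. It assumes a generator whose output depends only on the prompt and a seed; `toy_generate`, `WORDS`, and the seed values are all made up for illustration. Real systems only reproduce an output when the model version, settings, and seed are all held fixed.

```python
import hashlib
import random

# Hypothetical vocabulary, purely for illustration.
WORDS = ["law", "court", "author", "machine", "image", "prompt", "right", "work"]


def toy_generate(prompt: str, seed: int, length: int = 6) -> str:
    """Toy stand-in for a generative model: output depends only on (prompt, seed)."""
    # Derive the random state from the prompt and the seed, nothing else.
    digest = hashlib.sha256(f"{seed}:{prompt}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return " ".join(rng.choice(WORDS) for _ in range(length))


# Two different "users" with the same prompt and the same seed get identical text.
first = toy_generate("poem about copyright", seed=1)
second = toy_generate("poem about copyright", seed=1)
print(first == second)

# Changing only the seed will usually, though not necessarily, change the output.
print(toy_generate("poem about copyright", seed=2))
```

In this sketch the prompt alone does not pin down the result; the seed matters just as much, which is the part the "same prompt" framing leaves out.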

I disagree, because instructing the device on the circumstances in which to capture the image, whether it's a timer or motion or some other detector or your direct button press, is not the only element of photography. They're saying the only creative elements of the work were done by machine.

If you asked me to write a poem about copyright law in the style of Shakespeare, is that something you can copyright, or a work derived from something you can copyright? I don't think so.

It's not that producing a prompt for a writer might not be a somewhat creative act itself, but that's not really recognized by copyright law. Ideas aren't copyrightable, works are. Colloquially people might agree that a work could be significantly derived from an idea, but I don't think that's how copyright law itself works.

Linux was written from scratch, copying no code from Unix and therefore is not a derived work or infringing on Unix copyright. Unquestionably it faithfully copied many ideas verbatim from Unix, the invention or development or discovery of those ideas were probably the most substantial creative contributions that Ritchie and Thompson had, and the code itself relatively mundane (though expertly written) by comparison. Those ideas/inventions are not classed as copyright works though.

I'd think so, considering how security camera installers never put any effort into capturing footage themselves and are usually the only ones to have the opportunity to do so.

Maybe the larger context needs to be considered. Being in a place at the right time. The choice of lens, the type of camera, the direction of the sun. A camera captures a moment in time, and that moment will never be repeated. If AI removes blemishes from a face, it's still a photo of a face, and that face only existed that way in that moment.

You could argue the person who set up the camera chose the angle or frame, just like a normal camera operator.

In the same way that the person using ChatGPT chose the prompt?


How are they different?

Has anyone tried to claim copyright over that before? I wouldn't be surprised if it was rejected.

Yes, definitely. Trail/nature cam photos fit this description.

I'm curious where the "line" is, though. If I do a web search for a term, I get various results. As I refine my search, the results get closer to what I want. Eventually, there comes a point where it's possible to recognize that there is "skill" associated with crafting a good web search. To some extent, the same is true of crafting what you tell the AI to generate for you; especially as AIs become more complex in what they can be told to do.

So, given that, at what point does enough creativity go into the crafting of the instructions that it _can_ be considered, at least in part, a work by the person giving the commands?

Being good at using a search engine doesn't mean that the produced results are your IP.

Never for the generated work, because a human isn't doing it, but you may be able to copyright the prompt

The prompt isn't a creative work, it's constrained by a technical need (producing a desired response from a generative computer system).

So is work done in a program like photoshop, and that is considered a creative work.

The steps you take to make a work in a graphics program, eg "open an image, circle area with lasso tool, apply content-aware-fill" are not copyrightable. The method and product are different things.

> these prompts function more like instructions to a commissioned artist

I find this very reasonable and a great analogy. However, today, can one not copyright commissioned work? Can a company not own copyright for work produced by its employees?

I find the much closer analogue to be instructions for taking a photograph. And that IS considered copyrightable. There are how many free parameters for a photograph? Position in space (3 degrees of freedom), position in time and (if we're not talking 360deg cameras) 3 degrees of freedom for orientation. Maybe another degree of freedom for exposure time, etc, but for an automatic camera those are taken for you. So let's say 7-8 degrees of freedom.

32 bits each for position on the earth (64 bits total... and this is optimistic as vast majority of pictures are on land, near cities, etc), 16 bits for elevation (8 bits likely more than enough for most), 12 bits for each rotational degree of freedom (overkill), 32-36 bits for time. So about 150 bits of unique information? Add another 50 bits if you have a manual camera (vast majority of pictures aren't taken that way nowadays), and you're left with 150-200 bits. So about the same entropy as a sentence with a dozen or two words in it. All the rest is done by a machine. And this is considered enough for photography, but not enough for machine output. Doesn't make much sense to me. (A Haiku is also about 120 bits, and it is copyrightable.)
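
For anyone who wants to check the tally, here is a small sketch that just adds up the bit budget quoted above. The numbers are the ones from this comment; the names (`CAMERA_BITS`, `photo_bits`) and the grouping are only illustrative.

```python
# Bit budget per free parameter, taken from the estimate above.
CAMERA_BITS = {
    "position_on_earth": 2 * 32,  # two coordinates at 32 bits each
    "elevation": 16,
    "orientation": 3 * 12,        # three rotational degrees of freedom
}
TIME_BITS_RANGE = (32, 36)        # low and high estimate for the timestamp
MANUAL_SETTINGS_BITS = 50         # extra choices on a manual camera


def photo_bits(time_bits: int, manual: bool = False) -> int:
    """Total bits of human-chosen information under the assumptions above."""
    total = sum(CAMERA_BITS.values()) + time_bits
    if manual:
        total += MANUAL_SETTINGS_BITS
    return total


low_time, high_time = TIME_BITS_RANGE
print("automatic camera:", photo_bits(low_time), "to", photo_bits(high_time), "bits")
print("manual camera:", photo_bits(low_time, manual=True), "to",
      photo_bits(high_time, manual=True), "bits")
```

That gives 148 to 152 bits for an automatic camera and 198 to 202 for a manual one, which matches the rough 150-200 figure.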

The actual GOOD prompt results I've seen typically require a lot more than a dozen words, whether ChatGPT or Stable Diffusion or whathaveyou, and typically involve quite a lot of trial and error.

In that scenario the original artist is granted copyright and assigns it to the company. Another option would be for the artist to 'license' the work to the company and keep copyright ownership themselves. Since there is no original artist to be granted copyright then there is no one to assign it to the company so no copyright under current statute. I and others got so many downvotes here for pointing this out previously.

> In that scenario the original artist is granted copyright and assigns it to the company.

Not if it meets the standards for a “work for hire”, then the employer is the copyright owner ab initio.

> In that scenario the original artist is granted copyright and assigns it to the company.

Whoever paid you to create it is the copyright holder, the artist doesn't have to grant anything when commissioned to produce work for hire, as it's the employer's.

I never thought what would end copyright would be artificial intelligence, but I'm glad at least something positive came out of it.

This is only applicable to a creativity based copyright hence this might actually trigger a shift to a copyright system more like the British or the Japanese ones.

> When an AI technology determines the expressive elements of its output, the generated material is not the product of human authorship

"Expressive elements" is doing a lot of heavy lifting there. Is there a point at which a prompt becomes sufficiently complex or detailed that it would qualify?

Yes, the point at which the prompt, and not the AI, is found to determine a sufficiently substantial part of the expressive elements. From what I’ve seen it’s probably difficult to make a prompt that specific with the current models.

> Instead, these prompts function more like instructions to a commissioned artist—they identify what the prompter wishes to have depicted, but the machine determines how those instructions are implemented in its output.

If you commission an artist, the produced result becomes yours and you own copyright on it. How different is it with AI?

The AI did not creatively decide to do something. An order was placed, it fulfilled it, and ownership was transferred.

Because when you commission someone _they initially get copyright_ and then assign it to you.

What's the difference between this and a corporation owning the work of an employee?

The difference is that a human employee would normally own it, and only because the corporation pays the employee to do it does the ownership transfer to the corporation, whereas in the case of an AI there is no copyright to begin with, because it’s an AI and not a human that creates the work. This is because the existing law ties copyright to human creations exclusively.

The difference is that the employee is a person.

What if I created the training data and instantiated and trained the AI? If this still applies it suggests an AI has human standing and can compete with me for ownership of intellectual property by negating my rights, rather than me being a programmer and it being merely an algorithm.

> But the technology will decide the rhyming pattern, the words in each line, and the structure of the text.

But what if the prompter then takes the AI text, changes some of the rhymes, perhaps adds a word here or there, does it suddenly come back under copyright?

Let's take it to the extreme - if you randomly outputted noise as a picture, that picture is not copyrightable (or ditto with random letters/numbers and printed it as a book).

It feels to me that the copyright office is using this similar line of logic, to make the claim that the resultant work is not copyrightable if it was wholly generated using _only_ a prompt and an AI.

It's a bit odd though. What if you built your own ML model and trained it over a set of data that you wrote yourself? Would the work generated by the AI based off your prompts not be copyrightable?

The original copyright laws were thought up way before even cameras, and we're still trying to apply them today to generated AI. Why can't we just realize that the world is very different now, and just create new laws? Instead we keep trying to arbitrarily interpret the law in a biased way to try to fit our modern goals as best we can.

> The original copyright laws were thought up way before even cameras, and we're still trying to apply them today to generated AI

but the original laws worked well with cameras, didn't they?

The legal idea, that unless a human had creative input, it won't have copyright, doesn't fall afoul of ai generated content. There's nuance of course - what counts as creative input etc.

Of course, a new paradigm is possible with the advent of AI, but it would make copyright _looser_, rather than tighter, imho (and it would be to the progress of the arts and science to do so). But I don't see why it is fundamentally needed.

But following from that, a machine can not create a copyright violation either.

Is the lack of human authorship the gist of the argument? Can you claim copyright on the output from a compiler? Or is this like work for hire, where you just give instructions to the creative professional?

If it's an optimising compiler like the ones used in deep learning, then it might have a substantial contribution.

I wonder how this will apply to code that comes out of systems like copilot?

Ok but how will they know?

Wow, someone needs to notify photographers about this immediately. No more copyright for them.

Did you read the linked article?

> In Burrow-Giles Lithographic Co. v. Sarony, a defendant accused of making unauthorized copies of a photograph argued that the expansion of copyright protection to photographs by Congress was unconstitutional because “a photograph is not a writing nor the production of an author” but is instead created by a camera. The Court disagreed, holding that there was “no doubt” the Constitution's Copyright Clause permitted photographs to be subject to copyright, “so far as they are representatives of original intellectual conceptions of the author.”

"so far as they are representatives of original intellectual conceptions of the author."

"representatives of original intellectual conceptions"?

Wouldn't that only apply to photos made with some preconceived idea of what you wanted the photo to look like?

What about spontaneous, "point and shoot" photos? What about a photo that I take with my eyes closed, without intent? And how original is your typical family snapshot or wedding photo? They're about as generic and unoriginal as you can get, yet there's no doubt they're copyrightable.

Interesting is the case of a monkey taking a selfie.


point and shoot is almost certainly copyrighted. eyes closed might not be, (I kind of doubt there's case law here). the bar for originality is pretty low for copyright but it's not zero.

What if someone who's blind just randomly pulls out a camera and takes some snaps (without knowing anything about what might be going on)? Do they get a copyright on their photos?

probably not

Then Lomography can't be copyrightable, especially if "shooting from the hip" (rule #4)? [1]

[1] https://www.lomography.com/about/the-ten-golden-rules

> “so far as they are representatives of original intellectual conceptions of the author.”

Yeah, much like a person using an AI.

Do you understand the debate happening here at all?

The notice explicitly covers photography.

> In Burrow-Giles Lithographic Co. v. Sarony, a defendant accused of making unauthorized copies of a photograph argued that the expansion of copyright protection to photographs by Congress was unconstitutional because “a photograph is not a writing nor the production of an author” but is instead created by a camera. The Court disagreed, holding that there was “no doubt” the Constitution's Copyright Clause permitted photographs to be subject to copyright, “so far as they are representatives of original intellectual conceptions of the author.”

Much like the writer of a prompt. The notice is stupid.

Many commenters seem to be under the impression that "because I made something, it is copyrightable". That's not how it works. What is protected under copyright is determined by long processes of lobbying, legislation, lawsuits, and court cases. For example: a melody is copyrightable, a chord progression is not, even when the chord progression is so unique as to be the primary intellectual asset of a piece. (Look up contrafacts from the bebop era!) Another example, the rules around the copyright of a sound recording are very different from those of the intellectual property of a song.

Only specific things are copyrightable. There is nothing unusual (from a legal sense) in the government and others taking a stance on which things those should be and under what circumstances. You have no a priori right to copyright something because you fucked around with a prompt for a long time.

I don't see how they could go with a different stance given that all that is required to copyright a written work is to write it and declare it copyright (IFF it is copyrightable, that is). This will fall apart if it's possible for a company to have AI spew out eleventy-billion variations of training input and declare them all copyright.

Don't forget that in the early 90's we went through an era of records being yanked off the shelf at great cost to labels when the sampling laws finally settled. (Anyone else remember the appearance, disappearance, and reappearance of Us3's "Hand on the Torch"? fond memories!)

> all that is required to copyright a written work is to write it and declare it copyright

You don't even need to declare it. Declaring it only affects things like whether infringement is "willful" or not when deciding penalties. If you create a (US) copyrightable work, you retain the copyright by default. For it to not be under copyright you have to explicitly place it in the public domain.

I could be wrong, but I think I remembered reading that in music at least, you have to do something with it (like capture it on paper or recording). I seem to remember reading this came about to prevent people from being able to say something was copyright because they maintained they had written something say 10 years ago but just not gotten around to recording it - and then wanted to sue. But this is vague memories from reading about the bizarre and convoluted world of music publishing....

Correct. The law requires it to be “fixed in any tangible medium.” Such as transcribing onto paper, recording onto record, exposing onto film, etc. Merely thinking, saying, playing, singing, dancing, etc a newly created concept is too ephemeral to be a recognizable act of authorship.

Well, almost any proof that you danced that concept would be fixing it to a tangible medium. That rule mostly just rules out hearsay, eyewitness accounts, and other such methods of "proving" that you created something

Putting it on a website is enough.

"You have no a priori right to copyright something because you fucked around with a prompt for a long time."

Rights are a legal fiction anyway, so there are no a priori rights to anything.

Rights are consequences of value decisions that cannot be deduced from facts, so there are only a priori rights.

They're only consequences in the sense that definitions are consequences, because that's pretty much what a "right" is: a definition.

They don't exist before they are defined by humans, so in that sense they do not have an a priori existence.

> They’re only consequences in the sense that definitions are consequences,

A priori truths are, exactly and only, those things that are true by definition (and only exist within a particular axiomatic framework.)

You’ve described exactly why rights can only be a priori, but used it to deny that they are.

> They don’t exist before they are defined by humans, so in that sense they do not have an a priori existence.

If things exist before they are defined by humans and human knowledge of them comes as a result of experience with their existence, that knowledge is a posteriori. You’ve marshalled a cogent explanation of why rights cannot be known a posteriori, but then presented it as an argument for why they cannot be a priori.

I think the difficulty we're having in communicating here is that I'm not using "a priori" in the Kantian sense as being in contrast to "a posteriori", but rather in the colloquial sense of "having a prior existence".

It's also irrelevant to the point I'm trying to make how or whether humans have knowledge of some pre-existing things (through experience or through some other means).

Since you brought up axiomatic frameworks, I'll have to differ on that as well, as I don't think the majority of humans that talk about rights do so based on any kind of axiomatic framework, but rather on their biases and because they've essentially been brainwashed in to thinking that way through education, socialization, the media, peer pressure, parenting, etc..

So my view and Kant's view on this differ significantly, and while I give him credit for these terms, I don't subscribe to his views and don't feel bound to use these terms in the way Kant or Kantians would have me use them.

> I think the difficulty we’re having in communicating here is that I’m not using “a priori” in the Kantian sense as being in contrast to “a posteriori”, but rather in the colloquial sense of “having a prior existence”.

“Having a prior existence” is very much what a posteriori is (a priori or a posteriori are epistemic terms, they don’t refer to the thing itself, but to our knowledge of the thing, if our knowledge is independent of our experience [as it is with things with no prior existence of their own] then it is a priori, if it is derived from experience [as it is for things that have prior natural existence rather than being pure products of intellect] it is a posteriori.)

Sure, if you reverse the definition of a priori, it applies to exactly the opposite of the things it usually applies to, but that’s probably not the best way to communicate. (And this is literally the first time I've ever heard the suggestion of an English "colloquial sense" of that phrase, which is used exclusively, in my experience, as a technical term of art in epistemology (and is used in English largely because an imported Latin technical term won't conflict with any colloquial understanding that might attach to a more natural English alternative).)

> Since you brought up axiomatic frameworks, I’ll have to differ on that as well, as I don’t think the majority of humans that talk about rights do so based on any kind of axiomatic framework, but rather on their biases and because they’ve essentially been brainwashed in to thinking that way through education, socialization, the media, peer pressure, parenting, etc..

That’s still an axiomatic framework, it’s just one that is practically useless to discuss because each of the things which you might like to be conclusions dependent on logic and a smaller set of simple principles are just independent axioms not subject to debate. And, yes, it’s very common.

The issue to me here is that code is a literary work. Literary works are copyrightable. Except that AI generated works without human authorship aren’t per this.

Interestingly this came up when copilot was announced and one comment or another was “this will be an interesting bit of case law.”

Well now we have to see who wants to go first and establish precedent for the next half-century. :-)

If I look at how the law is written, I get an impression that technically you might even say that copyright law doesn't exclude non-human authored works - they fit the clause of types of protected work, a machine might be considered an author, etc, HOWEVER, when all that is said and done, a machine can't have any rights or standing in court; i.e. even if someone would assert that a machine is the author of some work and has an exclusive right to reproduce it, then that right is unenforceable because the human society and its courts simply will ignore any violations of a machine's rights.

> The issue to me here is that code is a literary work.

Code (or anything else) is not a “work” without the required element of human creative input.

It isn't AI, it's a glorified Xerox machine.

That form of AI has already been invented long ago, it's called simply Copy & Paste.

GPT combines prompt, context and knowledge, it selects and adapts code. It even does problem solving: above average human rating on easy problems and 20% over humans on medium level problems. When did the xerox or parrot for that matter do that?

If you think copy and paste is artificial intelligence then there probably isn't much of a discussion to be had here.

Yes, very smart and talented people have released ground breaking and amazing tools leveraging massively trained ML models. The model is not problem solving or intelligent in any sense of the word (take it from oai: https://openai.com/blog/chatgpt)

"ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers..."

To be back on topic, I don't think some sweaty dude from the internet should be issued a copyright for typing "Minnie Mouse wearing a pikachu t-shirt!" into DALL-E and submitting the resulting image.

At the same time, I read a post like this one: https://andys.page/posts/how-to-draw and I have really no objection to copyright being assigned. I think the difference is the human authorship, and I think the Copyright Office has made a pretty good first swag.

A bit more like a sampler than a xerox machine... but sampling laws got clarified eventually too!

Yay, the software development industry is saved! (only partly /s)

Not that I thought AI would be writing decent code for a while yet, but the fact that AI-generated code can't be copyrighted (and therefore licensed[0]) is going to create problems.

It'll be interesting where they draw the line with this, though:

If the AI generates all the code, but then a human debugs it and alters it, is that copyright that can be owned? Does the entire code base then become copyrightable?

If a human reverse-engineers uncopyrightable AI code, does that reverse-engineered code become copyrightable?

If a human downloads someone else's uncopyrightable AI-generated code, and makes some changes to it, can they then claim copyright on that code?

[0] Since the whole of software licensing rests on copyright, this is going to get tricky for licenses. How do we determine if a restrictive license has been added to uncopyrightable (and therefore unlicensable) code?

> If the AI generates all the code, but then a human debugs it and alters it, is that copyright that can be owned? Does the entire code base then become copyrightable?

I am not a lawyer and I did not research anything for this, but I'm under the impression that a derivative work of something in the public domain is itself copyrightable. If something isn't copyrightable, it's in the public domain. So, if you alter it sufficiently to create a derivative work, the altered form should be copyrightable. But the original would still be public domain. I think?

> derivative work of something in the public domain is itself copyrightable

I don't think this is true in most of the world.

Here's a UK reference: https://copyrightservice.co.uk/copyright/p22_derivative_work...

"You cannot extend the duration of copyright in a work by creating a derivative work. If the original work is in the public domain, it will remain in the public domain; you cannot prevent anyone else using the same public domain work for their own purposes."

The derivative is subject to copyright, the original isn't. If I make a new translation of The Count of Monte Cristo I own the copyright to the translation but not to the original story etc.

What you've quoted directly agrees with the interpretation of the person you replied to.

Yes, exactly. If you, for example, rewrite parts of Shakespeare's Romeo and Juliet, you have copyright on the changes you made. So the parts you didn't change are in the public domain, while you hold the copyright for the parts you changed (assuming the changes reach the level of copyright and aren't just, for example, spelling fixes). Same with code. If you extend an MIT library you have copyright to the changes you made but not the parts that aren't changed.

How do you determine if code is written by an AI?

This does not really matter in practice. The risk of legal sanctions is too high for most businesses, they will follow the law. This is kind of similar to pirated software: businesses almost never use it, even if they could definitely get away with it in most cases. The issue is that a single angry former employee is all it takes to make your life hell. This is even more true for large organizations where many people would know about the unlawful practice.

I have two thoughts about this.

First is who cares about large corporations? Sure large corporations have the money to buy licensed software, but I know plenty of small-to-medium corporations that operate on pirated software.

Second is your statement doesn't mean anything at all. Yes, you can enact a policy in your corporation that no one is allowed to use AI tools like Copilot to write code, but how do you monitor this? How do you know if some developer did use Copilot? This all feels like complete lip service with no actual force behind it. I am 100% sure that even my corporation's code base already contains stuff written with help of an AI, but there is also no question that the code is fully copyrighted.

> Yes, you can enact a policy in your corporation that no one is allowed to use AI tools like Copilot to write code, but how do you monitor this? How do you know if some developer did use Copilot?

That’s easy: corporate firewalls that block all traffic to openai.com, its subdomains and the IP ranges used by GitHub Copilot.

Enterprises that care about exfiltration of code from their internal networks (e.g. banks and other heavily regulated entities) typically hand out computers that are locked down to their employees, including developers. So any engineer that wants to install the GitHub Copilot extension or indeed any non-approved third party extension from the VSCode Marketplace will first have a word with the folks in IT.
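For what it's worth, the "block openai.com and its subdomains" part of that can be sketched in a few lines of Python. This is only an illustration of suffix matching on hostnames: the `BLOCKED_DOMAINS` set and the `is_blocked` helper are made up for the example, a real firewall would enforce this at the DNS or proxy layer, and no Copilot endpoints or IP ranges are listed because the thread doesn't name any.

```python
# Hypothetical blocklist; only openai.com is named in the comment above.
BLOCKED_DOMAINS = {"openai.com"}


def is_blocked(hostname: str, blocked=BLOCKED_DOMAINS) -> bool:
    """Return True if hostname is a blocked domain or one of its subdomains."""
    # Normalise case and drop a trailing dot from fully qualified names.
    host = hostname.lower().rstrip(".")
    labels = host.split(".")
    # Test every label-aligned suffix, e.g. "api.openai.com" -> "openai.com".
    # Matching on whole labels avoids false hits such as "notopenai.com".
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))
```

Matching on whole labels rather than a plain string suffix is the point of the sketch; it says nothing about how easy such a block is to get around.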

> corporate firewalls

Yeah, just like how last week I requested IT to open up twitch.tv so I can watch programming live streams over lunch, but I was denied (however Youtube is wide open, so I could just watch the VODs anyway) and instead I just used my phone's data to watch twitch on lunch.

If corporate firewall is anything but a slight inconvenience for you then you are not technical.

Last part is such nonsense I can't even respond.

yeah but you could get the AI to write the code on your personal laptop, then copy it over to the work laptop.

I can see this being a thing.. "I have three jobs as Senior Engineer for three different organisations. All I do is copy code from an AI engine to my work laptop all day"

I guess that could happen and we will definitely see some people try this. But in the grand scheme of things, it will be exceptionally rare. Most developers can't work outside their developer environment set up by their company, they often rely on internal tools, services hosted on the internal network, stuff like that. If stackoverflow and google didn't cause this to happen, I don't see how GPT will.

>Most developers can't work outside their developer environment set up by their company

Are you kidding me? Is this really how you see our industry? You really think that most developers literally can not do work without their company's IT setting up their machine?

Is this normal? This to me sounds like you are saying most devs are such noobs that they can't do their jobs.

In practice, the larger the organization the less likely the potential legal sanctions are to dissuade them. My observation has been that once an organization (in the US anyway) grows large enough it is in a special protected status where no real penalties can come to it and there is certainly no risk of exposure to criminal charges for the decision makers.

Source: front page here every single day.

As someone who works for large enterprises: they are absolutely terrified of legal sanctions and pay huge amounts to contractors who can mitigate the risk. And sanctions do regularly happen, they are just not advertised on the HN frontpage I guess :)

Good question. I assume the methods they're using to determine if an essay is written by an AI won't work on code?

How do you determine whether an essay is written by an AI?

There have been some papers and articles on it. Apparently it's possible (with GPT3 anyway, maybe not GPT4).

This is all very sane and consistent with previous opinions on the matter.

But it doesn’t answer any tough questions either, like: if an AI model outputs something very close to a training input, does the result infringe on the copyrights of the input work?

The statement addresses this:

>The Office recognizes that AI-generated works implicate other copyright issues not addressed in this statement. It has launched an agency-wide initiative to delve into a wide range of these issues. Among other things, the Office intends to publish a notice of inquiry later this year seeking public input on additional legal and policy topics, including how the law should apply to the use of copyrighted works in AI training and the resulting treatment of outputs.

I think that the answer they want to say is, “it probably does”, but they don’t want to say it and cause a big disruption to commerce until the courts intervene.

There is precedent in Authors' Guild vs Google and Perfect 10 vs Google that it doesn't, this isn't the first time things like this have come before the courts.

There are also laws to consider. I know this is US law and the US doesn't have those laws, but copyright harmonisation is a thing and the EU, UK, Japan etc all have laws very explicitly stating that TDM is an exemption to copyright - some bodies would presumably consider that.

And finally, there's national security. The US's current view is that it's in an existential great power competition with Chinese society, and needs to compete on every level to succeed. One of those levels is very explicitly AI, it's considered a main "battlefront" in the new Cold War. If the US significantly restricts TDM, that not only makes it less competitive compared to jurisdictions that don't, like the UK, EU, Japan, and China, but it specifically improves the effectiveness of the Thousand Talents program which the US IC and analyst community view as a very serious threat. Academics want to be able to do their research, AI is critically important to US national security, and currently the US is statistically the best place to be an AI researcher. The government won't want to jeopardize that when China is already advancing in AI very rapidly and actively trying to tempt scientists to come over with better pay and conditions.

If you (like me) were wondering what TDM means - Text and Data Mining.

A Deeper Look into the EU Text and Data Mining Exceptions: Harmonisation, Data Ownership, and the Future of Technology, Journal of European and International IP Law, 2022


"This paper focuses on the two exceptions for text and data mining (TDM) introduced in the Directive on Copyright in the Digital Single Market (CDSM).

While both are mandatory for Member States, Art. 3 is also imperative and finds application in cases of text and data mining for the purpose of scientific research by research and cultural institutions; Art. 4, on the other hand, permits text and data mining by anyone but with rightholders able to ‘contract-out’ (Art. 4).

We trace the context of using the lever of copyright law to enable emerging technologies such as AI and to support innovation. Within the EU copyright intervention, elements that may underpin a transparent legal framework for AI are identified, such as the possibility of retention of permanent copies for further verification.

On the other hand, we identify several pitfalls, including an excessively broad definition of TDM which makes the entire field of data-driven AI development dependent on an exception. We analyse the implications of limiting the scope of the exceptions to the right of reproduction; we argue that the limitation of Art. 3 to certain beneficiaries remains problematic; and that the requirement of lawful access is difficult to operationalize.

In conclusion, we argue that there should be no need for a TDM exception for the act of extracting informational value from protected works. The EU’s CDSM provisions paradoxically may favour the development of biased AI systems due to price and accessibility conditions for training data that offer the wrong incentives. To avoid licensing, it may be economically attractive for EU-based developers to train their algorithms on older, less accurate, biased data, or import AI models already trained abroad on unverifiable data." [reformatted]

> There is precedent in Authors' Guild vs Google and Perfect 10 vs Google that it doesn't

Fair use is entirely dependent on the specific use. While those decisions may protect (for instance) OpenAI, they’re not helpful in determining if anyone can use the outputs from such a model.

Yes Google can index an image and serve it in search results. But I cannot copy a copyrighted image from Google and publish it in my book.

Questions like “is the output that ChatGPT just gave me encumbered by copyright?” are still extremely difficult to answer.

I agree that we’re losing our technological edge. This is why I wish we had better answers for these questions. If we want to lead with the rule of law, then the rule of law needs to be able to lead.

This article talks about what is copyrightable. Your question is about infringement of already copyrighted material.

Yes, I understand it was out of the scope of this article. I’m just saying, these are the easier questions to answer.

That’s why they are being answered first. The others take more time.

A lot of these things are already addressed on the USCO’s FAQs and these answers are the same but “with AI” tacked on.

eg: “only human works are subject to copyright”, “the work must be an original work of authorship”, etc.

It’s just disappointing that they haven’t produced anything helpful yet. I suspect that the result will end up being permissive because there will be a big industry built up around people just kicking the can down the road by the time any authority issues a more specific opinion.

I don’t think that question is very tough. It’s almost certainly copyright infringement. The method of copying doesn’t really matter.

Well, I more mean that the answer that many suspect is the case, is tough to swallow.

In that case, the consequence would be: If you don’t validate that your AI model is producing something sufficiently different than all of the inputs, you’ve created a copyright infringement Russian roulette.

Yep. And DMCA safe harbor doesn’t apply, so there won’t be much of a defense if it happens.

And how close is too close?

You can get ChatGPT to output copyright material very easily

> You can get ChatGPT to output copyright material very easily.

You can also copy and paste from an existing copyrighted work.

In either case, it is you, the human - with agency to do so - who should be liable for the content that you publish.

That’s the concern. You can easily identify when you are copying others’ work when done directly. If you use a black box that sometimes gives you infringing output, how do you use it with confidence?

You can use it, just don't use it for things that you publish without doing sufficient diligence to see if that work would be infringing on something else.

I intend to make an e-ink panel that displays woodcut style images generated by stable diffusion (get a list of 30 or so different descriptions of scenes, and have them slowly get generated over the day and display a different one each hour).

So, here's the question - are any of those woodcut images derivative of some other work? Don't know - and it likely doesn't matter as it would be something hanging on a wall in a room.

On the other hand, if I was to collect them and publish them as a book, then I, as the human who is publishing them, would need to do sufficient diligence to see if any of them are derivative works. They might be - but I am the one publishing it then - not Stable Diffusion.

Likewise, if I was creating a collection of epic rap battles between historical figures as generated by GPT... then I should search to make sure that none of the phrases that are used in there are lifted directly from some other source. Again, it is me as the human with agency that is publishing it, not GPT.

If you were to go to Fiverr and request 100 pictures of woodcuttings or people to write epic rap battles between Julius Caesar and Genghis Khan - would it matter whether it was a human or GPT that lifted the text? It is still upon you, the person publishing it, to do the check.
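A crude first pass at that kind of "sufficient diligence" for text is to look for word n-grams that the generated output shares with known sources. A minimal Python sketch, assuming you already have the candidate source texts on hand (which is the hard part in practice); the `ngrams` and `shared_ngrams` helpers and the default of n=6 are illustrative choices, and a hit is only a hint to look closer, not any kind of legal test.

```python
import re


def ngrams(text: str, n: int) -> set:
    """Set of lowercase word n-grams in text, ignoring punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Texts shorter than n words yield an empty set.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_ngrams(generated: str, source: str, n: int = 6) -> set:
    """Word n-grams that appear in both the generated text and the source."""
    return ngrams(generated, n) & ngrams(source, n)
```

An empty result only means nothing was lifted word for word from the sources you happened to check; paraphrase and sources you never compared against would slip straight through.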

You are right, but AI products are basically all currently being pitched as if that is unnecessary.

> This is all very sane and consistent with previous opinions on the matter.

Is that sarcasm?

Not at all. There’s literally nothing in this article that would go against ideas already covered in an introductory lecture in copyright law.

Thanks for the clarification; I wasn't sure how you meant it.

Adding context: this decision for example follows naturally from the monkey selfie case.

I’m curious. If I’ve spent 8 hours getting the perfect prompt, fine tuning a few LORAs, mixing them, choosing between 8 different checkpointed stable diffusion models, and have done a bunch of in painting, does this constitute a copyrightable work?

Certainly a lot of artistic vision and effort to get to the finished “work”.

> If I’ve spent 8 hours getting the perfect prompt, fine tuning a few LORAs, mixing them, choosing between 8 different checkpointed stable diffusion models, and have done a bunch of in painting, does this constitute a copyrightable work?

The 8 hours doesn’t matter. By the USCO’s ruling, iterative prompt refinement probably wouldn’t, but that’s not super clear (partially, because the USCO’s description of what the model is doing justifying its determination is a fundamentally strained metaphor, so while it’s clear how it applies to the exact case it describes, it’s not clear how it generalizes.)

If you are fine-tuning a model (LORA, Checkpoint, whatever), then your input isn’t just a prompt to the model (once or in an iterative process with review of the output), so, you’re pretty far outside of where the ruling provides clear guidance.

> and have done a bunch of in painting,

Inpainting is probably the thing most similar to the traditional creative parts of visual art (even though it again involves prompting, it involves specific selection of where within the image to apply that, based on aesthetic concerns), and the strongest argument given the shape of the USCO description of its rule.

But, I have a feeling that a rule that appeals to tradition and vague analogy to lower tech visual techniques probably won’t be anywhere close to the final word on copyrightability in this space.

IANAL but I'd assume "inpainting" is legally considered equivalent to arranging. You're not pushing pixels in Photoshop, you're just running the AI again on a particular part of the output, thus combining that output with previous output. This seems equivalent to bricolage at best.

In the USA (but not the UK), effort isn't relevant, just how "creative" the work is.

For example, digitizing famous paintings is labour intensive but not considered copyrightable.

That is the "sweat of the brow" copyright argument, and it has been rejected by US courts.

The parent is complaining of the amount of effort put into the work but should be justifying that each of those steps was a creative decision by a human which contributed to the end result.

With ControlNet and LatentCouple you are doing straight-up set direction. With Loras you are casting, costuming and directing the film/lens/lighting.

One could argue that setting up a printing press for high-quality art printing is creative. Deciding what inks to use for a 7-color press and how to separate the original image was a creative act.

Or at least it used to be. Now it's automated.

You might be able to copy-write the method by which you generated something, but not the thing generated itself. Although even the former act might be too far, it's like saying that you fooled around with the parameters on a computer for 8 hours before it started working, and now you want to copy-write the operation of the computer: the computer is doing what it was intended to do, you just needed time to figure out how to use it. You could copy-write a training manual for how to ride a bike, but not bike riding itself.

You cannot copyright the method of doing anything.

You might be able to patent it, but that is a different thing.

Copyright applies to the end result. Patents apply to methods and mechanisms.

No I understand, by "the method" I meant a written work about how the operation is performed--apologies if that was too vague in my OP--not the literal method itself, as in the example of copy-writing a manual for riding a bike.

Nitpick: Everyone does this, but it's "copyright" as in "the right to copy", rather than "copywrite"/"copy-write".

That sounds like your prompt is protected by copyright. I don't see why the output of a machine you feed your prompt into should be.

That argument runs afoul of the "rule of the shorter term" of the jurisdictions that start the copyright term from the publication instead of creation.

Very interesting take! I wonder if a similar argument would apply to something like https://www.youtube.com/watch?v=ZOD7HQOnKAE - where the setup is a lot of work, but the actual execution is entirely up to nature/gravity.

Sounds like the prompt itself is quite copyrightable, the same as any code is copyrightable.

Indeed in the contents is this:

> While some prompts may be sufficiently creative to be protected by copyright, that does not mean that material generated from a copyrightable prompt is itself copyrightable.

I would assume the prompt itself is copyrightable. As far as inpainting or other manual modifications, I'm uncertain.

The OP doesn't include any exceptions based on effort, so no.
