Hacker News

This is just an end to copyright. There is no definition of "AI"; it has always been a marketing term for monetising computer-science research. There is no difference between a lossy jpg, taking its pixels as weights, and the weights of a NN.

So if I just zip up copyrighted images using a NN, then, what? They're public domain?

Regulators here are miles away from understanding the implications -- this is what happens when you let companies whose profit motive is selling "AI" be the "Experts" on the topic.

If you don't care about Disney, fine -- so what about your health records? This is also the prelude to an end to privacy.




Completely crazy take. Your health records are not currently protected by copyright law. If someone magically snapped their fingers and eliminated copyright law, the protections on your health data, scant though they may be, would be more or less the same (IANAL).


So to be clear -- you're well aware that, in the case of your private data, you have an interest in preventing it from being used to train AI.

Great. So d'you think you could outline a reason why you wouldn't have the same interest in preventing your creative works from being used as well?

Either the training data is, as big-ad-tech says, essentially equivalent to generic human experiences -- i.e., weakly reproducible; OR it is extremely reproducible, and equivalent more to standard contemporary data compression.

If you're kool-aid'ing the former on copyright, why not the latter on privacy?


Because my private data isn't protected by copyright; it's protected by things like HIPAA, which doesn't care one iota about human experiences and applies equally to humans and machines. It's about data sovereignty and who may access my data and for what purpose. A human is not allowed to share, retain, or reproduce my medical data.

So arguments like "I can get the AI to output my chart verbatim" start carrying weight, because the AI is granted access to data that the humans who created it are not permitted to share in any form whatsoever, whereas copyright concerns what I may do with the data after it's produced. Copyright is full of exceptions for things that don't count as a reproduction or performance of the work, and this is just one more; it doesn't change the nature of copyright.


Equating health records to art style is a laughable proposition to build an argument on. I mean, c'mon.


> If you don't care about Disney, fine -- so what about your health records?

Can you help me understand the relationship here?

Something can be protected by privacy laws without being under copyright, and vice versa.


see my reply to throwawaymaths


It'd be really interesting to open up a movie theater in Japan that just ingests blockbusters through a "do nothing" NN and then screens them royalty-free. This decision feels incredibly half-baked.


This is explicitly for training, not for the distribution of copyrighted material. Courts aren't stupid.


I'd clarify that in my example the screening would be of the potentially random output of a model... just one that was trained only by watching a specific blockbuster movie and is thus extremely likely to just reproduce the source material. My example is obviously extreme, but it gets at the core of the NYT case here in the States... I think it's a bad thing if we allow models to output data nearly indistinguishable from the copyrighted data they were trained on.

W.r.t. the NYT case -- it's my opinion that it's completely reasonable to use a corpus of vetted English literature like the NYT as a way to train your model to comprehend language, but if the model also begins to echo the contents of those articles then that may be a serious breach of the NYT's rights to monetize their work.


That will probably go about as well as distributing an encrypted copyrighted work along with its encryption key and then claiming that none of the bits are the same so you did nothing wrong. Courts historically have not had any problem sorting out nerdy fantasy workarounds of the type often posted on HN.


How is screening a movie training an AI?


It seems pretty clear what AI means in this context.

> There is no difference between a lossy jpg, taking its pixels as weights, and the weights of a NN.

Somehow I don't think this is going to hold up in court.


At that direct level, yeah, probably, and I do think copyright is dumb and should be, if not abolished, then limited to 20 years or whatever. That said, imagine training a network from scratch entirely on Disney's catalog. Even if that model is then prompted to generate new characters, it seems weird to say that Disney's copyright wasn't infringed.


This is little more than baseless fearmongering.

"zip up copyrighted images using a nn" is trek level technobabble.

that's not how NNs work.

how the hell does it even connect to privacy?

copyright isn't what makes it illegal to expose or obtain your medical records

it's privacy violations, which this doesn't even touch


> "zip up copyrighted images using a nn" is trek level technobabble.

Look up 'overfitting', neural-network-based compression, etc., or that paper that basically used zip compression as a neural network. Farthest thing possible from being 'technobabble' once you understand how inextricably linked compression and 'understanding' are.


Regardless of whether it's "technobabble," it's a misunderstanding of how courts operate. The law is not a formally specified algorithm. If you overfit a NN to produce someone else's work, that's not going to get you off the hook in front of a court.





