Hacker News

This is a problem people have with macro, large-scale data, and it is the problem you are exhibiting with regard to "value".

I'll use a simple example to illustrate.

If you (yourself) were to write a song that sounded just like a Beatles song, in effect a lift or "copy" of it, and it went to #1 on the charts, you would expect a Beatles lawyer to contact you pretty quickly.

That's called "one to one" copyright infringement. Lawyers and court systems are set up to process these claims and do so from time to time.

However, if you were to copy ALL THE SONGS EVER WRITTEN AND RECORDED and make them available with no royalty, the legal system and lawyers would have almost no way to get their heads around it, and the judicial system could barely assign a plaintiff to the defendant.

This is more analogous to the situation Google created when it scanned "every book written in the English language" and made it available via Google search. At the time, the courts and the law had no concept of such a thing and couldn't process the mass copyright violation.

Google didn't scan all the books because they had "value". Google scanned all the books to "devalue" them to zero, to serve its purposes. As far as I know, Google has never paid a cent to this day.

So ChatGPT didn't scan the entire internet (Reddit was a rounding error) because it HAD value. It scanned it to form English grammar out of tokens and to get away with mass theft. Just like Google.

As an AI assistant, I hope I answered your question to your satisfaction.



