I'm glad to see we're getting better and better at regulation. Here we have the most important development in computing in years, and we're immediately nipping it in the bud. There used to be some delay.
The analogy you're looking for is nuclear power, and yes, same thing, except we allowed it to happen for a few decades before switching back to burning coal instead. Thanks for the reminder.
The fact that we botched something should not become a reason to keep botching things, whether in politics or in science. I see it rather as an argument to do better this time and the next, you know, lessons learned...
That thing with record labels going after people using generative AI on their IP caught my attention a little while ago. This seems like a big Achilles' heel for these products, but I guess the hope is reaching critical mass before anyone complains and then getting the rules changed in their favor, as a number of other tech products managed.
That would not be enough. You need to implement the right to be forgotten of European citizens even if you are not offering services in the EU.
Rather, I wonder what the effect of this will be on open-source weights, since they have the same issue: they can reproduce personal data of European citizens.
That's called an "extraterritorial" law... and it's a US concept recently brought into EU law (to counteract US extraterritorial laws).
It's the same as the US being able to fine two foreign entities doing business together... because they use dollars for the transaction: the US is not a party to the transaction EXCEPT for the use of its currency.
Usually, the "guilty" companies are fined and then either ignore it (risking trouble directly in the US, or later when they try to do business with a US entity... or with an entity that has part of its activity in the US...) or just pay. The same applies here: either pay the fine, or you won't be able to do business in the EU or with any EU company... and possibly not even (indirectly) with a company that does business with an EU company.
Thanks for the explanation, but it basically converges on what the grandparent comment stated, right? I.e., unless your business deals with EU companies or operates in the EU, you are free to ignore it. And if you later decide you want to deal with EU companies or operate in the EU, you have the option of paying the fine and moving on.
> either pay the fine or you won't be able to do business [...] possibly even not with a company in business with a EU company
That one seems a bit sus, because it would imply that you won't be able to use GCP/Azure/AWS for cloud services, and that just doesn't sound right. AFAIK they wouldn't blanket-refuse cloud services to an American business that doesn't follow EU law (in case the business simply doesn't care to operate in the EU or make money there).
Not a legal professional at all, so if someone could provide a better explanation of the situation, it would certainly be welcome.
OpenAI will have no trouble claiming legitimate interest. It's a very broad basis, broad to the point where HR companies are openly selling scraped LinkedIn data to recruiters.
Based on the injunction from the Italian privacy authority (the second one, not the ban), they did indeed back off on the scraping part, because it is not mentioned among the tasks OpenAI must complete before April 30 for the ban to be lifted.
That said, this article devotes only a sentence to far more important requests: the implementation of the right to be forgotten and, especially, the problem of inventing false personal data. That is a huge issue, and it's what distinguishes artificial intelligence from autocomplete.
By the way, blocking ChatGPT in Italy is not enough to avoid the GDPR violation. OpenAI is handling(*) personal data of Italian citizens, and therefore must allow them to exercise their rights, even if it is not providing a service in Italy. Blocking ChatGPT was just a fig leaf to show they were doing something.
(*) Because people say the personal data is not part of the model: "handling" is defined as performing any of a list of actions, which includes disseminating, and "personal data" is defined as "data that allows identification of a person". So if ChatGPT is disseminating data that allows identification of a person, then OpenAI is handling personal data; the former is simply a special case of the latter.
Imagine we're a primitive people in a forest village. If I steal your ax, that's theft. However, if I see your ax, and make my own ax based on your idea, did I steal anything from you?
In the second scenario above, you may have to work harder in your woodchopping business, since I'm now competition, but doesn't that mean the villagers benefit from better access to (cheaper) wood-chopping services?
We currently define ideas as property. But is that ethically defensible? Now, if I claim to have originated the ax idea, I'm arguably stealing your brand or reputation. I should have to give you credit for the idea of an ax, but I don't see how making a copy is theft. Ideas are not scarce; they're inherently shareable.
My comment is regarding whether ideas fall in the same category as physical goods. Ideas are fundamentally different, in that they're non-scarce unless we make them scarce via intellectual property laws. And I question whether those laws really promote human flourishing, or whether they simply protect whoever is close to the government.
Regarding China, copying without attribution should be prosecuted as a form of reputational or brand fraud, since it fails to give credit to others for the idea. And sure, under existing intellectual property laws many Chinese firms ought to be prosecuted and held accountable; I don't advocate breaking existing laws.
I simply think the laws ought to be brought into better alignment with reality: ideas simply aren't scarce (at least not in the sense economics means by scarcity).
> as it fails to give credit to others for the idea
That is precisely the issue at hand.
> Ideas are fundamentally different
But that's essentially what products are. It's not like Chinese companies take a product and clone it; they take the idea of how a product should look and work, and implement it. The same applies to code, art, and books. OpenAI takes those products, the results of ideas, modifies them, and resells them without attribution and without having paid a license. That's simply theft.
Also, assuming you're writing from a United States perspective, part of the reason so much of our manufacturing sector was outsourced to China (and other nations) is that since 1971 the US dollar has been our primary export, as the global reserve currency among all the fiat currencies. This is starting to change, so I think we'll see more and more manufacturing returning to the USA as the dollar loses international reserve status. And I wonder to what extent Chinese copying of our technologies is connected to these macro monetary realities.
That will be generally good for the world, for American manufacturing, workers, businesses, and families; for pretty much everyone except Wall Street banks and those close to the government money spigots.
I am writing from a European perspective, as we share common problems caused by countries copying our products and ideas. Now it would appear that the US wants to do the same, but for intellectual work. That's not nice, to put it mildly. I think licensing data, expensive as it may be, would eventually lead to higher-quality AI and healthier growth. At the moment, this whole campaign of promoting OpenAI's products, and those of similar companies, appears to be focused on: 1) I'll take your products (books, art, software, music), 2) resell them, 3) put you out of a job.
That's precisely what malicious actors such as China have done. Not only are we swamped with lower quality products, but as you wrote, we are also facing significant social and economic issues. This time at an unprecedented scale.
I think it's true in a technical sense. You can try to find it in the actual GDPR, but that's hard if you aren't a lawyer; the UK ICO has a neater article on this:
Basically, you just need to declare them in your privacy policy and keep a record for compliance.
In terms of LI (legitimate interest), though, it's really complicated, and I don't think anyone on this site is in a position to say for sure whether LI applies to what OpenAI is doing. There are arguments on both sides that make sense.
Even my own content has probably been used by them, against my wishes. I see that as unethical data usage. Additionally, the ethics of the company's history, becoming entirely "closed" while keeping the name "OpenAI", makes me... angry, I suppose, is the right word. Intuitively, it seems to me that they will have a very bad influence on the world if they maintain dominance and continue to gain power.
I cannot verify this, but having delved deep into electrical engineering topics, I'm guessing most of that training set comes from expensive college textbooks and papers behind paywalls. My guess is it scraped all of Sci-Hub. Which is a great resource, albeit an illegal one.