EU AI Act, the first ML law, is expected to come into force by early 2024 (infoq.com)
45 points by elras 10 months ago | 66 comments



I lol'ed at a tweet the other day:

'“AI Alignment” is just a cancel culture for GPUs.'

https://twitter.com/tunguz/status/1681733141508046852

---

On a more quasi-serious note, surely no technically modest person has a sincere idea of how to regulate LLMs, GANs, or anything of that sort.

If so, does that imply the potential for a black market?


I'm not an economist or particularly knowledgeable about communism/socialism, but the recent writers' strike and the copyright arguments over generative AI stuck out to me as a new example of the importance of "controlling the means of production". It's a new way in which a segment of labor risks losing control over that production.

I don't have the answer, but it actually made me think a bit about the other kinds of automation we've seen in e.g. manufacturing, and how capital is in a unique position to reap the gains of that kind of technology.


Except that the same technologies make it easier for new competitors to come in. It's a trend we've already seen over the last ten years. Compare the 90s to today: how much of the total time spent watching video media goes to content produced by someone with a budget under $1,000?


That's a perfect example though, since monetizing content is almost exclusively feasible due to platforms owned by capital.


I got one:

If you've used internet scrapes or items under copyright, then you get protection so long as you publicly publish the weights and sufficient code to run it.


Then you get the same problem those Google papers have: nobody can actually run any of that.

Google and other cloud companies have produced research that amounts to "we bought a million GPUs and then ran this algorithm", sometimes even with source available, but nobody is going to spend the millions or more to validate those results. Google may as well claim proof of the existence of God; we'll never know if there's any truth to their claims.

If Microsoft or Google publishes weights and code that require a million in hardware costs alone, that solves nothing.


A quick Google search suggests there are about 65 million millionaires on earth, so it's not nothing. Granted, 1,000,000 may be too little, I don't know.


I got one:

If you've used internet scrapes or items under copyright, then you get to choose which hand they chop off.


The publisher's. They're scummy rent seekers who produce nothing worthwhile.


Hmm, they've tried to be clever by defining artificial intelligence by how it has been created and the fact that it is used to make decisions.

However, since the techniques available are so varied, they've basically just written a catch-all that explicitly includes all kinds of statistics and optimization.

So really the rules apply equally to everything from GPT to basic linear regression to calculating an average. Frankly, it's hard to come up with a computer-aided decision that isn't covered by their definition of 'artificial intelligence system'. Which I guess is fine, but then why mention artificial intelligence at all? Just ignore the technique entirely and focus on the end results you want to avoid.


Good. There's nothing special about any of these methods.

The linear regression can still discriminate against you for your race, age, medical history, gender, etc.
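A minimal sketch of the point, with synthetic data and plain least squares (all numbers invented):

    # Ordinary linear regression picks up a protected attribute just as
    # readily as any deep model, if that attribute predicts the label.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000
    age = rng.uniform(20, 70, n)
    group = rng.integers(0, 2, n)        # a protected attribute
    # Historical outcomes are biased against group 1:
    y = 0.5 * age - 10.0 * group + rng.normal(0, 5, n)

    X = np.column_stack([age, group, np.ones(n)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(coef[1])  # ~ -10: the fit reproduces the historical bias,
                    # scoring group 1 lower at identical ages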


Not to worry, all the ways governments discriminate are exempted from this law, as with most EU legislation (e.g. the GDPR). By exempted I mean that governments, i.e. the executive, NOT the courts, are the arbiters of this law. This may be a law, but like the GDPR it has nothing to do with judges (although the justice system is exempt).

This means in practice: all of health care, police, immigration, housing, unemployment, all judicial cases (will we see divorce cases decided by AI?), child custody (in fact everything with any connection to minors, e.g. schools could let AI come up with punishments), ... can and will be decided by AI, on the authority of the executive of the country alone (i.e. a minister can decide by himself), and that only violates this law if the (executive) government decides to convict itself.

It'll be the same as under the GDPR: ever more government data sharing (e.g. if you don't pay your insurance bill, it's now added to your tax bill; the big difference being that you can never go to jail for not paying an insurance bill, but you can be imprisoned for not paying a tax bill). Any mental health care treatments, even from decades ago, are next-to-guaranteed to be brought up in divorce or criminal cases.

This is the problem I have with the GDPR and other "data protection" legislation in the EU. Everything I'd consider critical information ... needs protecting, not from Google/FB/Amazon/Apple, but from the governments of the EU! And these laws achieve the exact opposite!

And no worries: this makes a team of "up to 25 FTE" responsible for policing an entire country ... obviously any complaints will be dealt with promptly!

In other words: even in the private sector, they intend to use this law as a weapon against Google, Facebook and TikTok, and against whatever company the current minister hates, i.e. huge international companies, but it is utterly impossible for these national AI authorities to go after any normal-sized company.

"This will make the EU more competitive worldwide". The authors of this law clearly have a bright future in other forms of comedy!

In short: glad to see all "high risk" AI usage is now dealt with.


You're committing a logical fallacy.

"The laws are aimed at protecting X but governments find ways to circumvent these protections therefore these laws make matters worse."

No. The "therefore" doesn't follow.


I'm inclined to agree, but then the whole 'artificial intelligence' part is just a distraction, or they simply don't know what they're doing. Both options are disturbing.


Strange language in the press release (emphasis added):

> Unacceptable risk AI systems are systems considered a threat to people and will be banned. They include:

> - Cognitive behavioural manipulation of people or specific vulnerable groups [...]

I'm fairly certain that "specific vulnerable groups" are made up of people. But people have already been mentioned. So why this redundancy? Is this supposed to hint at an understanding that regular people (i.e., those who are not members of "specific vulnerable groups") will receive a lesser degree of protection either by law or in practice?


The full text reads:

> The prohibitions covers practices that have a significant potential to manipulate persons through subliminal techniques beyond their consciousness or exploit vulnerabilities of specific vulnerable groups such as children or persons with disabilities in order to materially distort their behaviour in a manner that is likely to cause them or another person psychological or physical harm.

You can read the full proposal if you have any questions. These documents are often written quite well, with legalese strictly limited to where it makes sense.


So in other words, "cognitive behavioral manipulation" won't actually be banned. It will only be banned if conditions A, B, C, D, E, and F are also met. Otherwise, it's full steam ahead, for example for the purposes of advertising.


How would you measure harm? Any normal software system will occasionally hurt people who don’t fit the world model written into the system, so zero hurt isn’t reasonable.

How then? Probably by using some statistical measure, like “less than X% of human users are negatively affected by the system”. OK, sounds good. But if X is 1 and your system meets that number overall while negatively affecting 20% of a specific group it is ostensibly not targeting, that's still a problem. That's why specific protected groups are called out.
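A toy example with invented numbers:

    # An overall "< 1% of users harmed" test can hide a 20% harm rate
    # inside a small group the system was never meant to target.
    total_users = 10_000
    subgroup = 500                 # a specific vulnerable group
    harmed_in_subgroup = 100       # 20% of the subgroup
    harmed_elsewhere = 0

    overall = (harmed_in_subgroup + harmed_elsewhere) / total_users
    print(f"{overall:.1%}")                        # 1.0% -- passes the global test
    print(f"{harmed_in_subgroup / subgroup:.1%}")  # 20.0% -- clearly a problem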


People come in different shapes and forms. Some are more vulnerable than others; in a democracy, those people have the right to be protected.

Pretty easy, when you think about it.


Your parent is pointing out that the phrasing is similar to a law that requires specific treatment for "citrus fruits and oranges", which looks initially like it should be equivalent to requiring specific treatment for "citrus fruits". The question is what additional legal requirements come from the inclusion of "specific vulnerable groups".


> People come in different shapes and forms.

And there's a common term for all of them: People.

According to the above quote, "people" in general have the right to receive protection.

So why mention those groups separately, unless the protection they are going to receive somehow differs from that received by the rest of the population?


For the same reason we have very specific anti-discrimination laws? Apparently, some people have problems applying basic human rights to other people without being explicitly told to do so.


Kids and the mentally disabled are given as an example. Those are treated differently by society, for good reason.


With this law, you can forget about base models such as LLaMA being released, since models will be required not to say "mean things".


I can't find any mention of "mean things" in the proposal, what section are you referring to?


>AI systems shall be developed and used in a way that includes diverse actors and promotes equal access, gender equality and cultural diversity, while avoiding discriminatory impacts and unfair biases that are prohibited by Union or national law


LLaMA 2 is extremely curated.


Not the base model


I have deployed it in the chat mode and I can't count the number of times it was giving me lectures on what is appropriate to say.


"Platforms have to provide related summaries of the copyrighted data used during the training stage."

I'd expect most if not all models need to be re-trained to comply with that.


Many models are based on public and private datasets that are kept around for future reference and retraining. Maybe some data sets have been deleted, which would be a problem, but I don't think it applies to most models.


> I'd expect most if not all models need to be re-trained to comply with that.

Why do you expect that? Do you think people have deleted their training data?


I think it's possible that services like Google, which occasionally rotate their caches but use the collected data for training networks, no longer have the necessary information.

Companies that haven't thrown out their datasets yet should either save them now or collect the necessary information before the law goes into effect.

Of course companies won't, and they'll pretend to be completely surprised by the law, but when this goes into effect in a few years they'll have no credible defence.
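A hypothetical sketch of the record-keeping that would make such summaries cheap to produce later (the manifest format and field names are my own invention, not anything the law prescribes):

    # Log source and licence metadata per training item at ingestion
    # time, so a summary can be produced later without the raw data.
    import hashlib, json
    from collections import Counter

    manifest = []

    def record(item_bytes: bytes, source_url: str, licence: str) -> None:
        manifest.append({
            "sha256": hashlib.sha256(item_bytes).hexdigest(),
            "source": source_url,
            "licence": licence,
        })

    record(b"some scraped page", "https://example.com/a", "unknown")
    record(b"a CC-BY article", "https://example.com/b", "CC-BY-4.0")

    # The kind of per-licence summary a regulator might ask for:
    print(json.dumps(Counter(e["licence"] for e in manifest), indent=2))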


Given the effort that has gone into collecting and classifying the data I highly doubt anyone is deleting their training sets.


While the 'open box by default' ideal is admirable, I wonder how much this would practically aid understanding of the underlying system among users/regulators today, or when faced with even larger models down the line.

Don't get me wrong, I'm pro sharing training data, but I'm questioning our current ability to wade through and understand large-model parameter complexities when there are so many bespoke features and use-case-specific designs.

I'm hopeful we see more innovation in this space once we're able to access training data more easily, which this regulation might enable more broadly. But then, how do we prevent nefarious actors from doing the same? Who is to say a 'low risk model' today is not a high risk one tomorrow?


Is this applicable just to the recent generative ML models for text, images, and audio where you give them a prompt, or does it also apply to other ML models?

That is, does it apply to text-to-speech models that take phonetic/text and prosody information and convert that to audio -- like those researched at the University of Edinburgh?

Does it apply to the various NLP models that predict parts of speech, morphology, lemmas, word senses, coreferences and other NLP topics? E.g. will it apply to things like word2vec?


> is this also applicable to other ML models

Yes. The law is not concerned with technological details, but what the technology is used for.

You ask a lot of “does it apply” questions, and the answer to each of them is “yes”. But that answer alone might be misleading. If the tech is used in a low risk application then there are different requirements than if it is used in a high risk application.


Ahhh 2024, what a shitshow of a year that's going to be, especially at the end.


NIH, ... ekhmm ..., NIS 2 in October 2024?


Before the usual rush of "EU is anti-innovation" that is typical on HN, I recommend this article, too: "The truth about the EU AI Act and foundation models, or why you should not rely on ChatGPT summaries for important texts" https://softwarecrisis.dev/letters/the-truth-about-the-eu-ac... (though the InfoQ article does give a very short neutral summary, too)


Thanks for the link, it just served to reinforce my view that EU is anti-innovation. These points can’t be read any other way. They will put EU at a disadvantage in AI development.


It means that you read neither the original article nor the one I linked.


> Developers (not deployers) of foundation models need to register their models, with documentation, prior to making it available on the market or as a service.

What would "on the market" mean here? Would you need to register a model that you upload on huggingface?


How many of the models on HuggingFace are foundational models trained from scratch?


My question was not rhetorical.


Neither was my reply, I honestly don’t know.

The sibling comment suggests that there are only a few foundational models on HF. They are presumably squarely within the target of this new EU law. HF as the hosting company will be in a similar position as YouTube is with local video licensing and content restrictions.


From the article I linked: "The developers of a foundation model are responsible for compliance, not the deployers."

At worst HuggingFace will be asked to remove those models until they are made compliant.

If anything, if you believe HuggingFace's marketing blurb from the front page, they should be thrilled.

HuggingFace: "The AI community building the future"

EU: foundational models should be documented, data sources should be explained and summarized.

What better way of ensuring community AI can be built and reasoned about?


Only a few, but they are very important.


Facebook notoriously allowed advertisers to check boxes to not show ads to certain demographics. This would result in all sorts of discriminatory practices, like not showing real estate advertisements to Black people. No AI involved. Just a simple rule based system that breaks old and well-understood laws against discrimination.

And you can find many other examples where computers cause "damage to public rights, health, safety, or may trigger negative environmental effects", and there is nothing in AI that looks like it deserves exceptional treatment here.

Is a human powered election disinformation farm worse than the AI equivalent? I don't see how. It's actions that cause harm that should be made illegal, not the algorithm that might produce such an action.

The proposal as it stands forbids "exploit[ing] physical biometric data, emotion, gender, race, ethnicity, citizenship status, religion, and political orientation during inference that may cause discriminative operations", but in the name of security the EU as well as the UK and the US already do all of these things. Are our governments going to cease and desist with their unlawful monitoring after this EU AI act comes into effect? We all know the answer to that.

How does this proposed ML law improve society? Why do we need it? How does it protect the public?


After the singularity is over...


For the record, especially after the GDPR and other innovation-crippling data-related EU laws, this seems to be the first time the EU has done something sensible with tech legislation.

I was expecting something much stricter, this one makes sense and looks completely fair.


Harvesting and selling people's personal information - the true face of innovation in Silicon Valley.


GDPR is not innovation-crippling, it is data collection and privacy invasion crippling.


It's not always the law itself that is impactful, but all the additional compliance work that takes resources away from doing innovative things.


That's like saying that paying taxes and filing reports takes resources away from innovation.


Hence, crypto.


Crypto was innovative for what, a year, before degenerating into a bunch of scams?

I can somewhat respect the original idea, but what it turned into has no value whatsoever.


If Douglas Adams wrote H2G2 today, what would be in place of digital watches in the quote:

> Far out in the uncharted backwaters of the unfashionable end of the western spiral arm of the Galaxy lies a small unregarded yellow sun. Orbiting this at a distance of roughly ninety-two million miles is an utterly insignificant little blue green planet whose ape-descended life forms are so amazingly primitive that they still think digital watches are a pretty neat idea.

Smart phones and social media would clearly be high on the list, but I think cryptocurrency and AI would be, too.


I'm not sure what you're getting at.

Smartphones are pretty darn useful. I use mine to watch videos, listen to music, browse the web, do network diagnostics, track my gym workouts, read my mail. I rarely actually use it as a phone, it's more of a very portable computer.

As for social media, we're on a sort of social media right now. We're talking. That's social.

AI is pretty neat. I like the picture generation, it's fun to play with, and I don't see it going away.


When it was written, people were genuinely impressed by digital watches, too.


Digital watches are still cool, they just lost relevance because we have clocks everywhere, so having one is mostly redundant.

But otherwise, what's not to like? A simple, cheap, precise, extremely reliable tool that does one job, and does it excellently well. Even a cheap one has a much superior precision to the best mechanical watches ever made.

The modern digital watch is pretty much perfect. If you go for something that's not the cheapest model, you can get one that will run on solar power and sync by radio, making it essentially work perfectly forever without maintenance or adjustment.


It sounds like you're taking the wrong direction with this; did you ever experience any of the H2G2 media? It — the whole thing not just that quote — takes the pedestal out from underneath all the cultural icons and institutions which we put on pedestals in the first place as reasons to think ourselves amazing.

Watches were mocked precisely because they were a cultural symbol; I think the things I listed would also be.

Likewise a character naming themselves after a car due to minimal research when trying to blend in with the natives: https://en.wikipedia.org/wiki/Ford_Prefect


> It sounds like you're taking the wrong direction with this; did you ever experience any of the H2G2 media?

No. But just because a well regarded book said something doesn't mean I have to agree with it.

> Watches were mocked precisely because they were a cultural symbol; I think the things I listed would also be.

I genuinely find nothing to mock in the digital watch. I think it's a great example of human achievement, and basically the pinnacle of the function it performs. Timekeeping has been a problem for humanity through the ages, and we finally solved it for good decades ago and then made it dirt cheap.

If you showed one to the smartest people of the ancient past, their mind would be blown at the amount of polished tech from multiple branches we can put on a wrist and have it work perfectly almost forever.


And they are.


It's very very easy to not collect user data. I've been failing to collect it for years before GDPR happened.


Yeah, not innovation-crippling. That's why we now need to be paranoid about every row of data we might be collecting, which might be illegal under the "visionary" EU regulations, and focus on the legality of everything instead of, well, coding a productive, useful service like in the good old days.

If this isn't the literal very definition of innovation-crippling, I don't know what is.



