> What is the roadblock preventing these models from being able to make the comm...

throwaway0123_5 · 2025-11-14T14:06:03 1763129163

> I'd really hate to see the world go down the path of gatekeeping tools behind something like ID or career verification.

This is already done for medicine, law enforcement, aviation, nuclear energy, mining, and I think some biological/chemical research stuff too.

> It's a tradeoff we need to be willing to make.

Why? I don't want random people being able to buy TNT or whatever they need to be able to make dangerous viruses*, nerve agents, whatever. If everyone in the world has access to a "tool" that requires little/no expertise to conduct cyberattacks (if we go by Anthropic's word, Claude is close to or at that point), that would be pretty crazy.

* On a side note, AI potentially enabling novices to make bioweapons is far scarier than it enabling novices to conduct cyberattacks.

thewebguyd · 2025-11-14T17:08:55 1763140135

> If everyone in the world has access to a "tool" that requires little/no expertise to conduct cyberattacks (if we go by Anthropic's word, Claude is close to or at that point), that would be pretty crazy.

That's already the case today without LLMs. Any random person can go to github and grab several free, open source professional security research and penetration testing tools and watch a few youtube videos on how to use them.

The people using Claude to conduct this attack weren't random amateurs, it was a nation state, which would have conducted its attack whether LLMs existed and helped or not.

Having tools be free/open-source, or at least freely available to anyone with a curiosity is important. We can't gatekeep tech work behind expensive tuition, degrees, and licenses out of fear that "some script kiddy might be able to fuzz at scale now."

Yeah, I'll concede, some physical tools like TNT or whatever should probably not be available to Joe Public. But digital tools? They absolutely should. I, for example, would have never gotten into tech were it not for the freely available learning resources and software graciously provided by the open source community. If I had to wait until I was 18 and graduated university to even begin to touch, say, something like burpsuite, I'd probably be in a different field entirely.

What's next? We are going to try to tell people they can't install Linux on their computers without government licensing and approval because the OS is too open and lets you do whatever you want? Because it provides "hacking tools"? Nah, that's not a society I want to live in. That's a society driven by fear, not freedom.

Imnimo · 2025-11-13T21:32:10 1763069530

I think one could certainly make the case that model capabilities should be open. My observation is just about how little it took to flip the model from refusal to cooperation. Like at least a human in this situation who is actually fooled into believing they're doing legitimate security work has a lot of concrete evidence that they're working for a real company (or a lot of moral persuasion that their work is actually justified). Not just a line of text in an email or whatever saying "actually we're legit don't worry about it".

pixl97 · 2025-11-13T22:34:09 1763073249

Stop thinking of models as a 'normal' human with a single identity. Think of it instead as thousands, maybe tens of thousands of human identities mashed up in a machine monster. Depending on how you talk to it you generally get the good models as they try to train the bad modes out, problem is there are a nearly uncountable means to talking to the model to find modes we consider negative. It's one of the biggest problems in AI safety.

ACCount37 · 2025-11-14T10:51:30 1763117490

To a model, the context is the world, and what's written in the system prompt is word of god.

LLMs are trained a lot to follow what the system prompt tells them exactly, and get very little training in questioning it. If a system prompt tells them something, they wouldn't try to double check.

Even if they don't believe the premise, and they may, they would usually opt to follow it rather than push against it. And an attacker has a lot of leeway in crafting a premise that wouldn't make a given model question it.