Hacker News | staticassertion's comments

I really don't see how any clone is going to manage to do what localstack couldn't - maintain compatibility with tons of AWS services while not getting paid for it. If this were viable, why would it not have worked before?

The only things I can think of are that perhaps LocalStack was just a mess of a codebase that couldn't maintain velocity or attract contributors, or it just failed to steward new contributors or some such thing.


Personally, I would get value out of really solid compatibility for the base features of a few core services (SQS, S3, KMS, and maybe DynamoDB are the main ones that come to mind) with a lightweight GUI and persistence.

If I’m getting into esoteric features or some “big” features that don’t make sense locally, then I just spin up a real dev account of aws, so I know I’m getting the real experience.


Hello! We won't have the broad coverage that LocalStack has. We're not aiming to be the "next LocalStack"; we just want to keep the core services that were available for free in the LS community up to date. If you’re looking for larger services like MWAA, sorry, but we won't be supporting them. Most core AWS services don't receive many updates anyway (their APIs don’t change drastically or frequently).

"LocalStack was just a mess of a codebase" - very true.

I do think there's potential to semi-automatically create a compatible suite of services, but it'll require some very talented use of LLMs and some novel testing approaches. Not something I want to sign up for.

I evaluated Floci, but that has the typical issues you'd expect with freshly minted vibe code.


I suppose (among many other things) LLMs are changing this. We no longer need that many contributors when we can use the AWS docs, intercept AWS API calls, and give them to an AI agent to mimic. Of course, contributors are still needed for maintaining tests and validations.

and then the maintainer goes on a rant about accurate but agentically coded pull requests and doesn’t merge it

No one was convinced to spend money to do the things you're saying. That's just disingenuous. People rent models because (a) it moves compute elsewhere (b) they provide higher quality models.

c) It's turnkey instead of requiring months/years of custom dev and on-going maintenance.

Yeah, npm should be enforcing 2FA, and likely phishing-resistant 2FA for some packages; this should be a real control. It should also be issuing public audit events for email address changes, and publish events should include information about how the package was published (trusted publishing, manual publish, etc).

Instead they took away TOTP as a factor.

Scaling security with the popularity of a repo does seem like a good idea.


Are there downsides to doing this? This was my first thought - though I also recognize that first thoughts are often naive.

You don't want "project had X users so it's less safe" to suddenly transition into "now this software has X*10 users so it has to change things", it's disruptive.

TOTP, although venerable, was better than no second factor at all.

TOTP isn't phishing resistant

No it's not but it's better than nothing. Don't let the perfect be the enemy of the good.

TOTP seems effectively useless for npm so that seems fine to me

They're not a failed experiment. No one has ever "experimented" by making a safe package manager for their new language. And it is not that insane to do so. Very basic things will get you very far:

1. Packages should carry a manifest that declares what they do at build time, just like Chrome extensions do. This manifest would then be used to configure its build environment.

2. Publishers to official registries should be forced to use 2FA. I proposed this a decade ago for crates.io and people lost their minds, like I was suggesting we drag developers to a shed to be shot.

3. Every package registry should produce a detailed audit log that contains a "who, what, when". Every build/ command should be producing audit logs that can be collected by endpoint agents too.

4. Every package registry should support TUF.

5. Typosquatting defenses should be standard.

etc etc etc. Some of this is hard, some of this is not hard. All of this is possible. No one has done it, so it's way too early to say "package managers can't be made safe" when no one has tried.
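To make point 1 a bit more concrete, here is a minimal sketch of how a build-time manifest could drive a deny-by-default sandbox policy. Everything here is hypothetical: the `build_capabilities` field, the capability names, and `sandbox_policy` are made up for illustration and do not correspond to any existing package manager.

```python
# Hypothetical capability names; a real registry would define its own set.
KNOWN_CAPABILITIES = {
    "network",
    "run-build-script",
    "read-env",
    "write-outside-build-dir",
}


def sandbox_policy(manifest: dict) -> dict:
    """Turn a package manifest into a deny-by-default build policy.

    Anything the manifest does not declare stays disabled, and an
    unrecognized capability is rejected rather than silently ignored.
    """
    declared = set(manifest.get("build_capabilities", []))
    unknown = declared - KNOWN_CAPABILITIES
    if unknown:
        raise ValueError(f"unknown capabilities: {sorted(unknown)}")
    return {cap: cap in declared for cap in sorted(KNOWN_CAPABILITIES)}


if __name__ == "__main__":
    # A package that only needs to run its own build script.
    print(sandbox_policy({"name": "example", "build_capabilities": ["run-build-script"]}))
```

The design choice that matters is the default: a package that declares nothing gets no network, no scripts, and no filesystem access outside its build directory.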


I don't understand the commercial aspect of large OSS projects like package managers, but I wondered for years why this was missing from npm. I think typosquatting was handled by npm last year, but only after some popular mistyped packages started stealing developer creds.

The people building package managers are unaware of these problems going into it and it becomes extremely disruptive to start adding these things later on since your entire ecosystem is built on the assumption that they can do these things.

It's also shockingly controversial to suggest typosquatting defenses. I made this suggestion ages ago for cargo, demonstrated that basic distance checks would have impacted <1% of crates over all time, and people still didn't want it.
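For anyone wondering what a "basic distance check" might look like, here is a minimal sketch. It assumes plain Levenshtein distance and a threshold of 1, both of which are my assumptions for illustration; it is not what cargo or crates.io does, and `too_similar` is a made-up helper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def too_similar(new_name: str, existing: list, max_distance: int = 1) -> list:
    """Return existing names within max_distance edits of new_name."""
    return sorted(
        name for name in existing
        if name != new_name and edit_distance(new_name, name) <= max_distance
    )


if __name__ == "__main__":
    # "request" is one substitution away from "reqwest", so it gets flagged.
    print(too_similar("request", ["reqwest", "serde", "tokio"]))
```

A registry could run something like this at publish time and require manual review for flagged names rather than rejecting them outright.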


For those who didn't know what TUF means (like me), I think they're referring to The Update Framework (https://theupdateframework.io).

Sorry, I should have clarified that - you're correct. `cosign` is an example of a tool that makes this quite straightforward and proves that this sort of system can work today.

Love these ideas!

> Publishers to official registries should be forced to use 2FA. I proposed this a decade ago for crates.io and people lost their minds, like I was suggesting we drag developers to a shed to be shot.

How is this enforced when it's pushed via a pipeline?


Your account is separate from your publishing. That is, in order to go to my account to change configuration values, 2FA must be required.

Publishing should be handled via something like Trusted Publishing, which would leverage short lived tokens and can integrate with cryptographic logs for publish information (ie: "Published from the main branch of this repo at this time").
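As a rough sketch of the registry side of this: once a short-lived token's signature has been verified, the registry only has to compare its claims against what the package owner configured. The claim names (`repository`, `ref`, `exp`) and the `TRUSTED_PUBLISHER` config below are simplified and hypothetical, and signature verification is assumed to have already happened elsewhere.

```python
import time
from typing import Optional

# Hypothetical per-package config, set by the owner behind 2FA.
TRUSTED_PUBLISHER = {
    "repository": "example-org/example-crate",
    "ref": "refs/heads/main",
}


def may_publish(claims: dict, trusted: dict = TRUSTED_PUBLISHER,
                now: Optional[float] = None) -> bool:
    """Check already-verified token claims against the trusted publisher config."""
    now = time.time() if now is None else now
    return (
        claims.get("repository") == trusted["repository"]
        and claims.get("ref") == trusted["ref"]
        and claims.get("exp", 0) > now  # short-lived: reject expired tokens
    )


if __name__ == "__main__":
    token_claims = {
        "repository": "example-org/example-crate",
        "ref": "refs/heads/main",
        "exp": time.time() + 300,  # valid for five more minutes
    }
    print(may_publish(token_claims))
```

No long-lived secret sits in the pipeline, and the same claims can be written to the publish log.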


> What do you base that on?

The entire history of malware lol


Can you elaborate? Why do you believe that motivated threat hunters won’t continue to analyze and find threats in new versions of open source software in the first week after release?

Attackers going "low and slow" when they know they're being monitored is just standard practice.

> Why do you believe that motivated threat hunters won’t continue to analyze and find threats in new versions of open source software in the first week after release?

I'm sure they will, but attackers will adapt. And I'm really unconvinced that these delays are really going to help in the real world. Imagine you rely on `popular-dependency` and it gets compromised. You have a cooldown, but I, the attacker, issue "CVE-1234" for `popular-dependency`. If you're at a company you now likely have a compliance obligation to patch that CVE within a strict timeline. I can very, very easily pressure you into this sort of thing.

I'm just unconvinced by the whole idea. It's fine, more time is nice, but it's not a good solution imo.
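To illustrate the pressure I mean, here is a minimal sketch of a cooldown policy with the kind of security override most companies would end up adding. The seven-day window, the `fixes_cve` flag, and `may_upgrade` are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def may_upgrade(published_at: datetime, now: datetime,
                cooldown: timedelta = timedelta(days=7),
                fixes_cve: bool = False) -> bool:
    """Allow an upgrade once the release is old enough, or if it claims a CVE fix."""
    if fixes_cve:
        # A compliance deadline overrides the cooldown; this is the hole.
        return True
    return now - published_at >= cooldown


if __name__ == "__main__":
    published = datetime(2025, 1, 1, tzinfo=timezone.utc)
    one_day_later = published + timedelta(days=1)
    print(may_upgrade(published, one_day_later))                  # still cooling down
    print(may_upgrade(published, one_day_later, fixes_cve=True))  # override kicks in
```

Whoever controls the "fixes a CVE" signal controls whether the cooldown applies at all.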


What, in your view, is a better solution?

There are many options. Here's a post just briefly listing a few of the ones that would be handled by package managers and registries, but there are also many things that would be best done in CI pipelines as well.

https://news.ycombinator.com/item?id=47586241


I suspect we'll see combinations of symbolic execution + fuzzing as contextual inputs to LLMs, with LLMs delegating highly directed tasks to these external tools that are radically faster at exploring a space with the LLM guiding based on its own semantic understanding of the code.

I'm with you, I expected this to be happening already. Funny enough, I guess even a hardened codebase isn't at that level of "we need to optimize this" currently so you can just throw tokens at the problem.


Right, so that's exactly how I was thinking about it before I talked to Carlini. Then I talked to Carlini for the SCW podcast. Then I wrote this piece.

I don't know that I'm ready to say that the frontier of vulnerability research with agents is modeling, fuzzing, and analysis (orchestrated by an agent). It may very well be that the models themselves stay ahead of this for quite some time.

That would be a super interesting result, and it's the result I'm writing about here.


> Everything is up in the air. The industry is sold on memory-safe software, but the shift is slow going. We’ve bought time with sandboxing and attack surface restriction. How well will these countermeasures hold up? A 4 layer system of sandboxes, kernels, hypervisors, and IPC schemes are, to an agent, an iterated version of the same problem. Agents will generate full-chain exploits, and they will do so soon.

I think this is the interesting bit. We have some insanely powerful isolation technology and mitigations. I can put a webassembly program into a seccomp'd wrapper in an unprivileged user into a stripped down Linux environment inside of Firecracker. An attacker breaking out of that feels like science fiction to me. An LLM could do it but I think "one shots" for this sort of attack are extremely unlikely today. The LLM will need to find a wasm escape, then a Linux LPE that's reachable from an unprivileged user with a seccomp filter, then once they have kernel control they'll need to manipulate the VM state or attack KVM directly.

A human being doing those things is hard to imagine. Exploitation of Firecracker is, from my view, extremely difficult. The bug density is very low - code quality is high and mitigation adoption is a serious hurdle.

Obviously people aren't just going to deploy software the way I'm suggesting, but even just "I use AWS Fargate" is a crazy barrier that I'm skeptical an LLM will cross.

> Meanwhile, no defense looks flimsier now than closed source code.

Interesting, I've had sort of the opposite view. Giving an LLM direct access to the semantic information of your program, the comments, etc, feels like it's just handing massive amounts of context over. With decompilation I think there's a higher risk of it missing the intention of the code.

edit: I want to also note that with LLMs I have been able to do sort of insane things. A little side project I have uses iframe sandboxing insanely aggressively. Most of my 3rd party dependencies are injected into an iframe, and the content is rendered in that iframe. It can communicate to the parent over a restricted MessageChannel. For cases like "render markdown" I can even leverage a total-blocking CSP within the sandbox. Writing this by hand would be silly, I can't do it - it's like building an RPC for every library I use. "Resize the window" or "User clicked this link" etc all have to be written individually. But with an LLM I'm getting sort of silly levels of safety here - Chrome is free to move each iframe into its own process, I get isolated origins, I'm immune from supply chain vulnerabilities, I'm mostly immune to XSS (within the frame, where most of the opportunity is) and CSRF is radically harder, etc. LLMs have made adoption of Trusted Types and other mitigations insanely easy for me and, IMO, these sorts of mitigations are more effective at preventing attacks than LLMs will be at finding bypasses (contentious and platform dependent though!). I suppose this doesn't have any bearing on the direct position of the blog post, which is scoped to the new role for vulnerability research, but I guess my interest is obviously going to be more defense oriented as that's where I live :)


> With decompilation I think there's a higher risk of it missing the intention of the code.

I'm not sure but suspect the lack of comments and documentation might be an advantage to LLMs for this use case. For security/reverse engineering work, the code's actual behavior matters a lot more than the developer's intention.


I think the other side of that is that mismatches between intention and implementation are exactly where you're going to find vulnerabilities. The LLM that looks at closed source code has to guess the intention to a greater degree.

This is true for a lot of things but for low-level code you can always fall back to "the intention is to not violate memory safety".

That's true, but certainly that's limiting. Still, even then, `# SAFETY:` comments seem extremely helpful. "For every `unsafe`, determine its implied or stated safety contract, then build a suite of adversarial tests to verify or break those contracts" feels like a great way to get going.

It's limiting from the PoV of a developer who wants to ensure that their own code is free of all security issues. It is not limiting from the point of view of an attacker who just needs one good memory safety vuln to win.

I wonder if your background just has you fooled. I worked on a data science team and code was always a commodity. Most data scientists know how to code in a fairly trivial way, just enough to get their models built and served. Even data engineers largely know how to just take that and deploy to Spark. They don't really do much software engineering beyond that.

I'm not being precious here or protective of my "art" or whatever. But I do find it sort of hilarious and obvious that someone on a data science team might not understand the aesthetic value of code, and I suspect anyone else who has worked on such a team/ with such a team can probably laugh about the same thing - we've uh... we've seen your code. We know you don't value aesthetic code lol. Single variable names, `df1`, `df2`, `df3`.

I'm not particularly uncomfortable at the moment because understanding computers, understanding how to solve problems, understanding how to map between problems and solutions, what will or won't meet a customer's expectations, etc, is still core to the job as it always has been. Code quality is still critical as well - anyone who's vibe-coded >15KLOC projects will know that models simply cannot handle that scale unless you're diligent about how it should be structured.

My job has barely changed semantically, despite rapid adoption of AI.


I'm a software engineer and _I_ don't understand the aesthetic value of code. I'm interested in architecture and maintainability but I couldn't give a rat's ass about how some section of code looks, so long as it conforms to a style guide and is maintainable.

> so long as it conforms to a style guide and is maintainable.

Most people consider aesthetic values to align with these things.


> We know you don't value aesthetic code lol. Single variable names, `df1`, `df2`, `df3`.

https://degoes.net/articles/insufficiently-polymorphic

> My job has barely changed semantically, despite rapid adoption of AI.

it's coming... some places move slower than others but it's coming


> https://degoes.net/articles/insufficiently-polymorphic

lol this is not why people do "df1", "df2", etc, nor are those polymorphic names but okay.

> it's coming... some places move slower than other but it's coming

What is coming, exactly? Again, as said, I work at a company that has rapidly adopted AI, and I have been a long time user. My job was never about rapidly producing code so the ability to rapidly produce code is strictly just a boon.


I understand that you’re trying to apply your experience to what we do as a team and that makes sense; but, we’re many many stddev beyond the 15K LOC target you identified and have no issues because we do indeed take care to ensure we’re building these things the right way.

So you understand and you agree and confirm my experience?

I have worked at many places and have seen the work of DEs and DSs that is borderline psychotic; but it got the job done, sorta. I have suffered through QA of 10000 lines that I ended up rewriting in less than 100.

So, yes; I understand where you’re coming from. But; that’s not what we do.


Yes, but then you said that you do the thing I'm suggesting is still critical, which is maintain the codebase even if you heavily leverage models: "we do indeed take care to ensure we’re building these things the right way."

Well, (a) why would they? (b) "uptime" has shifted from a binary "site up/down" to "degraded performance", which itself indicates improvements to uptime since we're both pickier and more precise.

Are we really questioning why cloud providers would offer better uptime guarantees?

Yes, I'm asking why they'd lock themselves into a contract around 5 9s of uptime since the parent poster mentioned that they won't do so. Of course, AWS actually does do this in some cases and they guarantee 99.99% for most things, so it feels a bit arbitrary - 5 minutes vs an hour, roughly.
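For reference, the arithmetic behind "5 minutes vs an hour, roughly", as a small sketch. It assumes a 365-day year and measures downtime per year; actual SLAs are often calculated per month and with their own exclusions, so treat these as ballpark numbers.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes per year a service may be down while still meeting the target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.999, 99.99):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% -> about {minutes:.1f} minutes per year")
```

Five nines works out to a bit over 5 minutes a year and four nines to a bit under 53 minutes, so the gap is a factor of ten.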

So then it's clearly not as trivial to achieve as you made it sound.

Are you replying to the right person?

I could easily see this as a case where the team had a legacy area of code in a language that no one was familiar with anymore so no one felt great about actually contributing to it, so it languished, and now AI let them go "fuck it, let's just rewrite it".
