Ask HN: Pragmatic way to avoid supply chain attacks as a developer
27 points by RoboTeddy 3 months ago | 21 comments
In the usual course of writing software, it's common to install huge dependency chains (npm, pypi), and any vulnerable package could spell doom. There's some nasty stuff out there, like https://pytorch.org/blog/compromised-nightly-dependency/ which uploaded people's SSH keys to the attacker.

It's easy to say just "use containers" or "use VMs" — but are there pragmatic workflows for doing these things that don't suffer from too many performance problems or general pain/inconvenience?

Are containers the way to go, or VMs? Which virtualization software? Is it best to use one isolated environment per project no matter how small, or for convenience's sake have a grab-bag VM that contains many projects all of which are low value?

Theorycrafting is welcome, but I'm particularly interested in hearing from anyone who has made this work well in practice.




A large portion of my role at $DayJob is around improving supply chain security.

Some examples of how we do it:

- Devs can only use hardened (by us) Docker images hosted inside our infrastructure. Policies enforce this during CI and runtime on clusters.

- All Maven/PIP/NodeJS/etc. dependencies are pulled through a proxy and scanned before first use. All future CI jobs pull from this internal cache.

- Only a handful of CI runners have outbound connectivity to the public internet (via firewalls). These runners have specific tags for jobs needing connectivity. All other runners pull dependencies / push artefacts from within our network.

- The CI Runners with Internet connectivity have domains whitelisted at the firewall level, and so far very few requests have been made to add new domains.

- External assets, e.g. an OpenJDK artefact, have their checksums validated during the build stage of our base images. This checksum is included in the Docker image metadata should we wish to download the asset again and compare it against the public one.
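
A rough sketch of that last check in Python (the artefact URL and pinned digest below are placeholders, not our real values):

  import hashlib
  import urllib.request

  # Hypothetical artefact and pinned digest -- substitute your own.
  ARTEFACT_URL = "https://example.com/openjdk-21_linux-x64_bin.tar.gz"
  PINNED_SHA256 = "0" * 64

  def sha256_of(url: str) -> str:
      """Stream the artefact and return its hex-encoded SHA-256 digest."""
      digest = hashlib.sha256()
      with urllib.request.urlopen(url) as resp:
          for chunk in iter(lambda: resp.read(1 << 20), b""):
              digest.update(chunk)
      return digest.hexdigest()

  actual = sha256_of(ARTEFACT_URL)
  if actual != PINNED_SHA256:
      raise SystemExit(f"checksum mismatch: expected {PINNED_SHA256}, got {actual}")
  print("checksum OK; record it in the image metadata (e.g. a LABEL)")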


> All Maven/PIP/NodeJS/etc. dependencies are pulled through via proxy and scanned before first use.

What does the scanning process check for / which tools are used?


Sounds like you’re running a tight ship – congrats! Have you received feedback from your dev teams on the ergonomics of the setup?


Thanks! I was given a blank canvas and asked to build a platform which aligns with the company’s “cybersecurity” vision. They want more teams to align on how they build/deploy/manage products in a PCI regulated environment.

It’s quite challenging given I literally have 50 different “internal customers” (teams) who do things in their own silos - and have done so for the last 20 years.

Definitely a marathon, not a sprint, and it will take years to complete.


That's great advice, thanks! What are you using to scan the packages/images if I may ask?


A defense-in-depth approach with a special eye to compartmentalization/separation/sandboxing, coupled with the principle of least privilege, is a good stance to take, I think. Also keep in mind that "security is a process, not a product". There is no silver bullet; no tool will save you from yourself...

With this in mind:

- https://qubes-os.org - Use separate VMs for separate domains. Use disposable VMs for temporary sessions.

- https://github.com/legobeat/l7-devenv - My project. Separate containers for IDE and (ephemeral) code-under-test. Transparent access to just the directories needed and nothing else, without compromising on performance or productivity. Authentication tokens are kept separate while remaining transparent to your scripts and dev tools. Editor add-ons are pinned via submodules and baked into the image at build time (and are easy to update on a rebuild). Feedback very welcome!

- In general, immutable distros like Fedora Silverblue and MicroOS (whatever happened to SUSE ALP?) are also worth considering, to limit persistence. They couple well with a setup like the one I linked above.

- Since you seem to be in a Node.js context, I should also mention @lavamoat/allow-scripts (also affiliated via $work) as something you can consider to rein in your devDeps: https://github.com/LavaMoat/LavaMoat/tree/main/packages/allo...


The root cause of many of our woes is ambient authority. This is the metaphorical equivalent of building an electrical grid without fuses or circuit breakers.

You have to trust everything, and any breach of trust breaks it all. This approach is insane, and yet it's widely accepted as the way things have always been done, and always will be done.

If you ever get the chance to use capability-based security, which embodies the principle of least privilege, or multilevel security, do so.

Know that permission flags in Android, or the UAC crap in Windows, or AppArmor are NOT capability based security.


There are really only a handful of approaches to preventing this kind of supply chain attack, and all come with tradeoffs (ranging from the infeasible to the merely impractical):

- Don't take any 3rd party dependencies. Build everything in house instead. Likely only possible in niche areas of government/defence where sky-high budgets intersect with intense scrutiny.

- Manually validate each new version of every dependency in your tree (a lockfile-diff sketch follows below). Also very expensive, and complex vulnerabilities will likely still slip through (e.g. things like Spectre aren't going to be caught in code review).

- Use firewalls/network security groups/VPC-equivalents to prevent any network traffic that isn't specifically related to the correct operation of your software. Increasingly hard to enforce, as our tech stacks rely on more and more SaaS offerings. Needs properly staffed network admins to enforce it and reduce the pain points for developers.

- Network isolated VMs/containers that can only talk to a dedicated container that handles all network traffic. Imposes odd constraints on software architecture, doesn't play well with SaaS dependencies.

In practice you run with whatever combination of the above you can afford, and hope for the best.
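
For the manual-validation route, even a crude diff of what changed between lockfile updates makes the review tractable. A minimal sketch, assuming an npm package-lock.json with the lockfileVersion 2/3 "packages" layout (file names are hypothetical):

  import json

  def pinned_versions(lockfile_path: str) -> dict:
      """Map package name -> version from a package-lock.json (v2/v3 layout)."""
      with open(lockfile_path) as f:
          lock = json.load(f)
      out = {}
      for path, meta in lock.get("packages", {}).items():
          if path:  # the empty key "" is the root project itself
              out[path.rpartition("node_modules/")[2]] = meta.get("version", "?")
      return out

  old = pinned_versions("package-lock.old.json")
  new = pinned_versions("package-lock.json")

  for name in sorted(set(old) | set(new)):
      before, after = old.get(name), new.get(name)
      if before != after:
          print(f"{name}: {before or 'ADDED'} -> {after or 'REMOVED'}")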


Be conservative in your software choices. If your software stack relies on a few off-the-shelf Debian packages and nothing more, you are pretty safe.


I see people posting here suggesting isolation for development. While this is a good model, doesn't it suffer from the risk that you are not using the same isolation setup in production, for whatever reason?

In that sense, isolation for development to solve supply chain security seems like a symptom-treater, not a cause-treater.

A more extreme approach is to:

minimize dependencies, build a lot in-house, and don't update pre-vetted dependencies before another audit

In general, I think a big dependency chain is useful for getting to a PoC quickly (and in some cases it's indeed unavoidable, e.g. numpy), but in building many simplish web apps and client-server applications it's feasible to have a very narrow dependency chain, especially on the back-end. You can even do this on the front-end if you eschew framework stuff.


What I started to do is remove external packages and bring just the parts I need into the codebase, usually using ChatGPT to write a smaller version of the lib. No dependencies, no supply chain attack. I also stopped using npm.
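
As a (hypothetical, trivial) example of what I mean: rather than installing a slug/string-utility package, a few lines of your own code often cover the one case you actually need:

  import re
  import unicodedata

  def slugify(text: str) -> str:
      """Tiny in-house stand-in for a slug library: ASCII-fold, hyphenate, lowercase."""
      text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
      return re.sub(r"[^a-zA-Z0-9]+", "-", text).strip("-").lower()

  print(slugify("Héllo, World!"))  # -> "hello-world"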


This may be of interest: https://programming-journal.org/2023/7/1/ "Building a Secure Software Supply Chain with GNU Guix"


I've found an effective band-aid is to 'become the supply chain'. Running your own cache, even poorly, has advantages: by forgetting to update sometimes, you don't get the latest silliness.


This could perhaps be done in a structured way - e.g. a pip cache/proxy with a configurable update availability delay.
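
A minimal sketch of the "only serve releases older than N days" idea, assuming the public PyPI JSON API and its upload_time_iso_8601 field (a real caching proxy would enforce this at the index level rather than per lookup):

  import json
  import urllib.request
  from datetime import datetime, timedelta, timezone

  DELAY = timedelta(days=14)  # don't surface anything newer than this

  def latest_delayed_version(package: str):
      """Newest release whose files were all uploaded at least DELAY ago."""
      with urllib.request.urlopen(f"https://pypi.org/pypi/{package}/json") as resp:
          releases = json.load(resp)["releases"]
      cutoff = datetime.now(timezone.utc) - DELAY
      ok = []
      for version, files in releases.items():
          times = [datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
                   for f in files]
          if times and max(times) <= cutoff:
              ok.append(version)
      # naive "newest"; real code would sort with packaging.version.Version
      return max(ok, default=None)

  print(latest_delayed_version("requests"))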


Definitely! I suggest that where possible. Some implementations require creativity; with others it's a mundane configurable option.

It's not necessarily as easy as timely snapshots. For example, you may not know when the upstream is syncing, and there are signatures, release interdependence, etc. to deal with.

There is intent in my framing, however. Once the cache is made... do or don't mind the cadence - the benefit is there for the moment. Performance/availability :)

This is really just a central actor for other procedures: scanning, eviction, and so on.


Implement SLSA at the developer/org/infra level

https://slsa.dev/spec/v1.0/



The CycloneDX project offers SBOM tooling for just about every programming language. [1]

The Dependency-Track project aggregates dependency vulnerabilities in a dashboard. [2]

Container SBOMs can be generated with syft and scanned for known vulnerabilities with grype. [3] [4]

[1] https://github.com/CycloneDX

[2] https://github.com/DependencyTrack

[3] https://github.com/anchore/syft

[4] https://github.com/anchore/grype
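
As a toy illustration of consuming such an SBOM (assuming a CycloneDX JSON file, e.g. one produced by syft):

  import json

  # Assumes something like `syft <image> -o cyclonedx-json > sbom.json` was run first.
  with open("sbom.json") as f:
      sbom = json.load(f)

  # CycloneDX JSON keeps discovered packages in a top-level "components" array.
  for component in sbom.get("components", []):
      print(component.get("name"), component.get("version", "?"), component.get("purl", ""))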


SBOMs can't flag vulnerable dependencies until after the vulnerabilities are publicly known. Traceability is useful when mitigating a crisis, but it won't prevent one.


> Traceability is useful when mitigating a crisis, but it won't prevent one.

So how do you prevent a crisis then without knowing what your software stack has as dependencies?


For protecting your developer machine, devcontainers will provide at least some isolation.



