perlgeek's comments | Hacker News

The real lesson they should learn is not to rely on running containers and then using "docker commit" to turn them into images, but to use proper image building tools instead.

If you absolutely have to do it that way, be very deliberate about what you actually need. Don't run an SSH daemon, don't run cron, don't run an SMTP daemon, don't run the suite of daemons that run on a typical Linux server. Only run precisely what you need to create the files that you need for the "docker commit".

Each service that you run can potentially generate log files, lock files, temp files, named pipes, unix sockets and other things you don't want in your image.

Taking a snapshot of a working, regular VM and using that as a docker image is one of the worst ways to build one.
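For illustration, a minimal sketch of the Dockerfile route (the app name and paths are made up):

    # hypothetical app -- copy only the artifacts the process needs,
    # no sshd, no cron, no MTA
    FROM debian:bookworm-slim
    COPY myapp /usr/local/bin/myapp
    ENTRYPOINT ["/usr/local/bin/myapp"]

Built with "docker build -t myapp:1.0 .", every file in the image is accounted for by a line in that file, instead of by whatever state a long-running VM happened to be in.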


My first reaction: 800 GB?! Who committed that? The size alone screams that something is wrong. To be fair, even with basic Dockerfiles it's easy to build up a lot of junk, but any workflow should have a general size limit that alerts when something grows out of proportion. We had this in our shop just a few weeks ago: a docker image for some AI training grew too big, nobody got alerted about its final size, and it got committed and pushed to JFrog. From there the image synced to a lot of machines. JFrog had to inform us that something was off with the amount of data we were shuffling around. So this shouldn't happen, but it seems to end up in production all too easily without any warning.
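(If anyone wants such a guard, a rough sketch of a pre-push size check -- the image name and the 50 GB threshold are placeholders:)

    # refuse to push suspiciously large images; threshold is arbitrary
    size=$(docker image inspect --format '{{.Size}}' myimage:latest)
    if [ "$size" -gt $((50 * 1024 * 1024 * 1024)) ]; then
        echo "image is $size bytes, refusing to push" >&2
        exit 1
    fi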

Given that JFrog bills on egress for these container images, I'm sure you guys saw an eye-watering bill for the privilege of distributing your bloated container.

Yes. Though to be fair, we did get a warning the very next day.

What if I need cron in my docker container? And ssh? And a text editor? And a monitoring agent? :P

Thankfully LXD is here to serve this need: lightweight system containers, where your app runs in a complete OS environment while staying very light on RAM usage.


>What if I need cron in my docker container? And ssh? And a text editor? And a monitoring agent? :P

How are you going to orchestrate all those daemons without systemd? :P

As you mentioned, a container running systemd and a suite of background services is the typical use case of LXD, not docker. But the difference seems to be cultural -- there's nothing preventing one from using systemd as the entry point of a docker container.
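(For the curious, a rough, untested sketch of what that would look like -- Debian base and package set are just examples:)

    # not the typical docker pattern: systemd as PID 1
    FROM debian:bookworm
    RUN apt-get update && apt-get install -y systemd systemd-sysv \
        && apt-get clean && rm -rf /var/lib/apt/lists/*
    # on Debian, /sbin/init points at systemd
    CMD ["/sbin/init"]

At runtime you typically still need something like --privileged plus tmpfs on /run, or the right cgroup mounts, depending on the host setup.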


fwiw I recently bootstrapped a small Debian image for myself, originally intended to sandbox coding agents I was evaluating. Shortly after, I got annoyed by baseline vim and added my tmux & nvim dotfiles; now I find myself working inside the container regularly. It definitely works and is actually not the worst experience if your workflow is CLI-focused.

Even putting GUI apps in a container isn't too bad once one develops the right incantation for X11/Wayland forwarding.

My experience is that if the tooling is set up right, it's not painful; it's the fiddling around with volume mounts, folder permissions, debug points, and "what's inside the container and what isn't" that is always the big pain point.

Very accurate -- that was one of the steps that caused me to fiddle quite a bit. I had to add an entrypoint to chown the mounts, and also some BuildKit cache volumes for all the package managers.

You can skip the uid/chown stuff if you work with userns mappings, but this was my work machine so I didn't want to globally touch the docker daemon.
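In case it helps anyone, roughly what that looks like -- user name, paths and the gosu helper are just what I'd reach for, adjust to taste:

    # Dockerfile fragment: a BuildKit cache mount per package manager
    RUN --mount=type=cache,target=/root/.cache/pip \
        pip install -r requirements.txt

    #!/bin/sh
    # entrypoint.sh: fix ownership of the bind mounts, then drop privileges
    chown -R dev:dev /home/dev/work
    exec gosu dev "$@"   # assumes gosu is installed in the image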


Ideally, you have a separate docker container for each process (e.g. a separate container for the ssh service, one for cron, etc.). The text editor can be installed if it's needed - that's not an issue apart from slightly increasing the container size. Most of the time, the monitoring agent would be running on the host machine and set up to monitor aspects of the container - containers should be thought of as running a single process, not as running a VM along with all its services.

Initially I didn't understand how they were getting the log files into the image. I had no idea that people abuse "docker commit" - do they know nothing about containers? If you want persistent logs, have a separate volume for them so they can't pollute any image (plus they remain readable when the container restarts, etc.).
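Something like the following keeps the logs out of every image and layer entirely (names are placeholders):

    # named volume for logs -- survives container restarts, never ends up in an image
    docker volume create app-logs
    docker run -d -v app-logs:/var/log/myapp myapp:1.0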

When I saw the HN title, I thought this was going to be something subtle like deleting package files (e.g. apt) in a separate layer, so you end up with a layer containing the files and then a subsequent layer that hides them.
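i.e. the classic pitfall along these lines:

    # subtle version: the second RUN only masks the files,
    # the first layer still ships them
    RUN apt-get update && apt-get install -y build-essential
    RUN rm -rf /var/lib/apt/lists/*

    # fix: clean up in the same layer, so the files never land in any layer
    RUN apt-get update && apt-get install -y build-essential \
        && rm -rf /var/lib/apt/lists/*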


Probably not much larger at all, because the image doesn't need to contain the Rust toolchain.

I don't know whether the Rust compiler produces bigger binaries, but for a single program it won't make a big difference.


> Shell (which probably means specifically bash)

Debian has ongoing efforts to make many shell scripts (like postinst scripts in packages, etc.) non-bash-specific.

A minimal Debian installation doesn't contain bash, but rather dash, which doesn't support bash extensions.


> A minimal Debian installation doesn't contain bash, but rather dash, which doesn't support bash extensions.

Please don't make up wrong facts that would be trivial to check first.

All minimal Debian installations include bash as it is an essential package. Where essential is used in the sense of https://www.debian.org/doc/debian-policy/ch-binary.html#esse...


Whether with a base install via the installer, or debootstrap, I've never seen bash missing.

For clarity: /bin/sh is what's symlinked to dash, not bash.
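Easy to check on any Debian box:

    readlink -f /bin/sh                 # points at dash on a default install
    dpkg -s bash | grep -i essential    # bash is marked Essential: yes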


Probably familiarity bias; the author has more contact with EVs and/or expects the reader to have more contact with them.

I guess they want to do it first where the opportunity for monetization is best.

Is this how they're going to steward AGI as well? Take it to places where monetization is the best first? This is a nonprofit "for good of humanity" company btw

> Is this how they're going to steward AGI as well? Take it to places where monetization is the best first?

Yes! You bet it is. Sam said so himself. He promised investors they'd make AGI and then ask it how to make money.

https://www.startupbell.net/post/sam-altman-told-investors-b...

They also define AGI as “generates $100 billion in profits”.

https://techcrunch.com/2024/12/26/microsoft-and-openai-have-...

> This is a nonprofit "for good of humanity" company btw

But it’s not. They explicitly tried to become a for-profit.

https://www.promarket.org/2025/05/06/openai-abandons-move-to...

Don’t be fooled, Sam and company give zero shits about “the good of humanity”. All they want is money and power.


I always believed so but it still feels bad to be proven right time and time again. There's a sliver of hope that's always there that maybe, just maybe, people actually care.

Is this an app that puts random stuff from the Internet through an LLM, making it vulnerable to command injection?

In theory, we know how to do wealth redistribution, AI or no AI: tax value creation and wealth transfer, such as inheritance. Then use the money to support the poor, or even everyone.

The problem really is political systems. In most developed countries, wealth inequality has been steadily increasing, even though if you ask people if they want larger or smaller inequality, most prefer smaller. So the political systems aren't achieving what the majority wants.

It also seems to me that most elections are won on current political topics (the latest war, the latest scandal, the current state of the economy), not on long-term values such as decreasing wealth inequality.


CI/CD actions for pull/merge requests are a nightmare. When a developer writes test/verification steps, they are mostly in the mindset "this is my code running in the context of my github/gitlab account", which is true for commits made by themselves and their team members.

But then in a pull request, the CI/CD pipeline actually runs untrusted code.

Getting this distinction correct 100% of the time in your mental model is pretty hard.

For the base case, where you maybe run a test suite and a linter, it's not too bad. But then you run into edge cases where you have to integrate with your own infrastructure (either for end-to-end tests, or for checking whether contributors have submitted CLAs, or anything else that requires a bit more privs), and then it's very easy for it to bite you.


I don't think the problem is CI/CD runs on pull requests, per se: it's that GitHub has two extremely similar triggers (`pull_request` and `pull_request_target`). One of these is almost entirely safe (you have to go out of your way to misuse it), while the other is almost entirely unsafe (it's almost impossible to use safely).

To make things worse, GitHub has made certain operations on PRs (like auto-labeling and leaving automatic comments) completely impossible unless the extremely dangerous version (`pull_request_target`) is used. So this is a case of incentive-driven insecurity: people want to perform reasonable operations on third-party PRs, but the only mechanism GitHub Actions offers is a foot-cannon.
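To make that concrete, the pattern that keeps burning people looks roughly like this (repo layout and step names invented):

    # DANGEROUS sketch: pull_request_target runs in the base repo's context,
    # with secrets and a read/write token -- but this checks out and runs
    # the attacker's branch.
    on: pull_request_target
    jobs:
      label:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
            with:
              ref: ${{ github.event.pull_request.head.sha }}  # attacker-controlled
          - run: ./scripts/label.sh   # or npm install, pytest, ... -- all from the PR

With the plain `pull_request` trigger, the same workflow would run with a read-only token and no secrets, which is why that one is the safe one.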


> while the other is almost entirely unsafe (it's almost impossible to use safely).

I don't believe this is fair. "Don't run untrusted code" is what it comes down to. Don't trust test suites or scripts in the incoming branch, etc.

That pull_request_target workflows are (still) privileged by default is nuts and indeed a footgun, but there's no need for "almost impossible" hysteria.


> I don't believe this is fair. "Don't run untrusted code" is what it comes down to. Don't trust test suites or scripts in the incoming branch, etc.

TFA is a great example of how this breaks down. The two examples in the post obtain code execution/credential exfiltration without running an attacker-controlled test suite or script.


I never understood what it is about labeling/commenting that prevents it from working with the regular event. They could just add a permission that specifically allows those actions.


Companies should create value, and capture a fraction of that value.

What Synology did was try to significantly increase the fraction of value they captured, at the cost of their customers, who would have to pay for it, without providing them any extra value in return.

This is not only a bad deal for customers, it also triggers our sense of injustice.

The best companies create value, capture only a part of it, and leave the rest for their customers and partners/suppliers.


It kinda feels like you turn from a software engineer to an offshoring manager.

Offshoring software development means letting lower-paid software developers from somewhere far away do the actual programming, but they have a very different culture than you, and they typically don't share your work context and don't really have a feeling for how the software is used -- unless you provide that.

Now we're offshoring to non-sentient, mostly stateless instances of coding agents. You still have to learn how to deal with them, but you're not learning about a real human culture and mindset; you're learning about something that could totally change with the next release of the underlying model.

