Using ARG in a Dockerfile – beware the gotcha (qmacro.org)
129 points by todsacerdoti 15 days ago | 84 comments



This is a perfect example of the problem in the devops world. People want to do something that has slight variations from time to time. There is a well established concept for this, called a function. Programmers use functions every day and understand their semantics. Devops tool developers do not implement functions but some other abstraction with different semantics. Programmers use this abstraction and confusion ensues.

Why the people who implement these tools, who presumably are very accomplished programmers, cannot 1) see they are implementing something that is already well known and/or 2) implement it with standard semantics, is baffling to me.

Kids, just say NORWAY to devops.


That's why I plan on migrating all my shell scripts to Golang programs f.ex. using https://github.com/bitfield/script -- it already has a number of simulations of shell commands and I'd contribute others if I had the time.

sh / bash / zsh scripts are just fragile and that's the inconvenient truth. People who devised the shell interpreters had good intentions but ultimately their creations grew to a scope 1000x bigger than they intended, hence all the "do X but if Y flag is set then do Z... unless flag A is also set in which case do Y and part of Z".

It's horrendous and I seriously don't get what's so difficult in just coding these scripts in a programming language that produces single statically linked binaries (like Golang) and distributing those with your images -- or running them in CI/CD with init containers and never including them in the images in the first place.

Inertia, of course. But I'll be actively working against it until I retire.


> It's horrendous and I seriously don't get what's so difficult in just coding these scripts in a programming language that produces single statically linked binaries (like Golang) and distributing those with your images -- or running them in CI/CD with init containers and never including them in the images in the first place.

It takes longer to develop (for sufficiently small scripts), it's harder to verify, it's harder to debug, it requires a build step, and it's a lot more effort to modify in-place. In shell, calling other programs is also extremely streamlined, making it perfect for integration tasks like CI/CD pipelines.

Some of these issues don't exist when using an interpreted language like Python instead, but this comes with its own problems. Shell is pretty universal, well-known and is, for sufficiently small problems, the quickest solution.


My team is also in the process of refactoring a number of shell script CI pipelines into Go. With `go run` it might as well be interpreted, from a UX standpoint it’s as easy to call as a shell script but with the benefit of static types and a robust ecosystem of supporting libraries and not having to sed/awk your way through command output.


> It takes longer to develop (for sufficiently small scripts)

Obviously, but at one point I sat down and tried to estimate how much time I wasted on using `set -euo pipefail` in my scripts and still having to chase after silent failures. I might be biased at this point but it still seemed quite a lot.

For a lot of one-off tasks shell scripting is 90% superior (unless you are super comfy with Golang) because it takes literal minutes to write and then iterate on it. Sure. But my threshold for when to reach for a proper program has been getting lower and lower lately because I very quickly arrive at a point where I need typing, better error-handling, retries and a few others.

Not everyone's case, surely. It seems my journey was more along the lines of "I was using shell scripting much more than I had to and I am making a partial comeback to proper programs".


Python's facilities for calling subprocesses are pretty inconvenient compared to bash IMO. It defaults to binary output instead of UTF-8 so I almost always have to set an option for that. I wind up having to define threads in order to run programs in the background and do anything with their output in real time, which has an awkward syntax. The APIs for checking the exit code vs raising an error are pretty non-obvious and I have to look them up every time. And I always wind up having to write some boilerplate code to strip whitespace from the end of each line and filter out empty lines, like p.stdout.rstrip().split('\n') which can be subtly incorrect depending on what program I'm invoking.


It looks like you used some old tutorials?

"subprocess.run" appeared in python 3.5, and it's pretty nice - for example you so "check=True" to raise on error exit code, and omit it if you want to check exit code yourself. And to get text output you put "text=True" (or encoding="utf-8" if you are unsure what the system encoding is)

As for your boilerplate, it seems "p.stdout.splitlines()" is what you want? It's what you normally want to use to parse process output line-by-line.

The background process is the hardest part, but for the most common case, you don't need any thread:

     proc = subprocess.Popen(["slow-app", "arg"], stdout=subprocess.PIPE, text=True)
     for line in proc.stdout:
          print("slow-app said:", line.rstrip())
     print("slow-app finished, exit code", proc.wait())
Sadly, if you need to parse multiple streams, threads are often the easiest.


Eh, looking at your Go library vs a shell script, I'd choose the shell script anytime. Here are the major reasons:

- Debugging. I can prefix any shell script with "bash -x" and I'll get a step-by-step output of every command executed, including the arguments. I can copy-paste any part of the script into an interactive shell so I can run it over and over until I get things right.

- Error reporting: as long as you use shell's best practices, a shell script automatically stops at the first error, and in most cases there will be a clear error message printed. In Go, this is all manual, and it is pretty easy to ignore errors. (Python really helps here)

(note your library seems even worse than that: not only does it start with Go's error-reporting handicap, you also designed it so "we just won't see any output from this program if the server returns an error response.", _and_ you actively mix stdout and stderr... this basically means any program using your library will be a nightmare to debug)

- Performance: "sort" is an external process, but it is designed to sort even data that does not fit into memory (it has spill-to-disk code). "grep" actually uses extreme optimization and things like the Boyer-Moore algorithm - does your code do that?

(that said, shell scripts rapidly become unreadable after some complexity threshold... my personal rule is "if you ever feel tempted to create a common library of shell functions, it means your script has become too complicated and it's time to rewrite it in a real language ASAP". But until you hit that complexity limit, shell scripts are very nice)


Valid criticisms but I should note I didn't write the library. Apologies if I left another impression.

I agree shell scripts are still more convenient for iteration but as you pointed out in your last paragraph, most of mine crossed the complexity threshold and it's now more worth it to me to invest in a proper program than to keep stumbling on poorly documented edge cases in bash/zsh.

And I again agree with your criticisms of the Golang library (worthy of filing a few issues in their GitHub repo btw, if you have the time), it's just that my stance there is: sure it will hurt to replace the shell scripts, but we should start getting to it. Being stuck in this local maximum is limiting.


Let's all write configuration in imperative turing-complete languages, what could go wrong...


False dichotomy.

You can also employ DSLs (domain-specific languages). Plain ones, or my favorite: a subset/SDK of a general-purpose language.


Imperative vs declarative is a false dichotomy now? Do you know what it means?


It's unclear to me if your original post was claiming that Docker is a Turing-complete imperative language, or that a Turing-complete imperative language is the only alternative to Docker.

It's a false dichotomy that the only points in the design space are 1) Docker and 2) a Turing-complete imperative language.


There are not one but two true dichotomies: one between imperative and declarative, another between Turing-complete and not.

If you were recommending everyone use some general-purpose language with functions that is not Turing-complete, then sure. Which one is it?

Otherwise I will repeat: declarative, purposefully limited configuration languages that you are trying to mock in your comment by picking on YAML (which relates to Docker how?) have their place and were created to solve real problems. For every Norway there are dozens of bigger issues (including security) in any regular programming language.


What does NORWAY mean?


Norway's 2-character country code abbreviation is "NO". Which is one of the many ways to spell the boolean `false` in YAML prior to the 1.2 spec. Which is yet-another-way-to-shoot-yourself-in-the-foot with DSLs written in YAML.



Likely a joke about YAML parsers that parse `NO` as the boolean value `false` when the writer of the YAML file intended to put Norway's 2-character country code as a key in a map.


> Perhaps I should have read the entire reference document for all the Dockerfile instructions first.

I really, really hate to be "that one guy" (c) Rachel Kroll, but... if you're about to use a large, old, complicated piece of software, the chances are very good that its developers have mental models (about what is good software, what is good documentation, and what is self-evident and what is not) that are very, very different from yours, especially if you're a programmer and the piece of software is aimed at sysadmins/ops/devops. So yes, read the whole docs. Things are not what you would hope they are.


And then go and read all issues logged to see how your mental model is different.

A recent example I hit was a GitLab pipeline: the "when: always" setting for after_script does not mean the command will always be executed - only if the main section finished with either success or failure. If the job is terminated by timeout or cancelled manually, then no, the script is not executed.


>so yes, read the whole docs

Thanks to reading the whole docs an hour a day, I avoided a bug that happens once a year and takes a whole 30 minutes of debugging.

"Correct practices" in a nutshell.


Exactly. Documentation is useful when a user knows that there are multiple reasonable options for an implementation to take. "Which byte order is used for this protocol?" is a question that can occur to a user and can be looked up in the documentation. Questions like "Does this language require explicit imports to access variables in the parent scope?" are not questions that occur to a user, because any violation of that would be so unexpected as to be a bug.

Docker documents its bugs, calls them the intended behavior, and then blames users for not reading about the bug in the documentation.


So many gotchas in today's world.

Recently it turned out that docker-compose.yaml was helpfully exposing our DB to our local LAN (which is an entire company that has nothing to do with our dev work), despite restricting all in and out traffic to our tailnet with UFW.

I know it's a known issue, it comes up here every now and then, we kept it in mind and even talked about it. But as soon as someone uses an "expose" in a docker-compose.yaml, you're f-ed.

Sure, you should keep the containers talking within the docker network, but we are kind of distributed in our org (across a series of NUCs).

I find it annoying that it has come to this. Now considering Podman, or VMs (to use a host firewall around the virtual stuff).


The real issue here is Linux overloading netfilter to implement both overlay networking (so you can address virtual interfaces in network namespaces) and traffic-blocking rules, so host-based firewalls like ufw and containerized networking implementations like Docker and Kubernetes more or less have to assume they have full control of your filter chains, and it's up to you as a server admin to deconflict. The best answer is usually not to use any kind of host-based firewall if you're doing container networking, and to use a network firewall instead.


Are you binding to 127.0.0.1 or 0.0.0.0?

> Sure, you should keep the containers talking within the docker network, but we are kind of distributed in our org (across a series of NUCs).

I don’t think I understand this, are you saying that you want the ports exposed to some devices on your network but not others?


In docker-compose it's just "- ports"; I guess it defaults to 0.0.0.0 unless bound to something else (like 127.0.0.1). You can restrict it to the Tailnet IP, but then you have to hardcode the Tailscale 100.x.x.x IP address (not flexible).


So, each stage/each FROM basically has its own scope. If you define an ARG in the global scope, you have to import it into the local scope before using it.

Not terribly unusual, but I wasn't aware of it - so yeah, wouldn't hurt if the docs for ARG mentioned it.
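
For anyone else hitting this, the pattern and the fix look roughly like this (a minimal sketch; the image and variable name are just placeholders):

  # global scope: usable in FROM lines, but not inside any stage
  ARG DEBVER="12"

  FROM debian:${DEBVER}

  # re-declare it (no value needed) to pull the global default into this stage
  ARG DEBVER
  RUN echo "Debian version: ${DEBVER}"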

Maybe submitting a PR for an addition to the docs would help more people than writing a blog post about it?


I don't see the gotcha, that's how it is supposed to work. It's just their purpose


The gotcha is that users only read the smallest possible amount of docs (which is usually zero), at the most "focused" place (e.g. the docs for exactly one command they suspect is misbehaving, not for all of the commands used in the script, and definitely not the intro to the docs where the concepts are explained), and the doc writers don't bother to duplicate the information in all the relevant places.


There are two types of useful documentation: Documentation that builds a mental model for a user, and documentation that answers questions that cannot be answered from the mental model. The first is done through tutorials and quickstarts, and the second through API references.

There's a third type of documentation: Documentation that describes behavior that is contrary to the mental model. This documentation is useless to a user, as there is no reason to seek it out. The benefit to a developer is that they can document bugs instead of fixing them, and blame the users for it.


This is true, and then they will complain about lack of or poor documentation.


To be fair, if the documentation isn't meeting the users' needs it is poor documentation.

One of my more tongue-in-cheek maxims is that too much documentation is worse than too little. With too much documentation the information might be there, but you aren't able to find it. The outcome is the same, but with too much documentation you've just wasted an hour failing.

It's slightly tongue-in-cheek because you can push the amount of documentation pretty far, but you have to think about how to organise it and how users will get the required information when and if they need it.


> if the documentation isn't meeting the users' needs it is poor documentation

Unless they are not reading it, or not properly paying attention. Or following an unofficial document and blaming the core project for failings in that. Sometimes it is on the user.

> With too much documentation the information might be there, but you aren't able to find it.

It can also become a huge burden to maintain, meaning it is in danger of becoming out of date or inconsistent.


> Unless they are not reading it, or properly paying attention. [...] Sometimes it is on the user.

If they make no effort, sure. But looking for an answer to a question and finding something that seems to work is a perfectly reasonable way to use documentation. If there's a dangerous gotcha and it isn't documented right there, then the documentation is structured badly.

> Or following an unofficial document

That could be a very clear symptom of bad documentation.

Sometimes it's on the user, but if lots of users are failing to use your documentation, maybe consider that the documentation is bad.


I do agree with this in the end. I also think people often underestimate how challenging it is to create good docs.

I've seen this pattern often:

- Person doesn't know how to do something.

- They fumble around until they get it working.

- Once it's finally working, they write a doc about how they did it (because our "poor documentation" is a common pain point and therefore a popular problem to attack).

- If you actually search our internal docs, you find at least one other doc describing the same process.

I've seen this happen in many different contexts, even onboarding. I've seen multiple people join the company, and each one wrote down their own "onboarding painpoints" doc to hopefully help the next person, without even noticing the existence of the previous person's equivalent doc.

So again, I still agree with you. Even in these scenarios I described, I'm willing to believe that there is some way we could've structured our docs that could've prevented these issues. But I have no idea what it is, and seeing all the futile attempts at improving the situation makes me bristle a little at armchair "poor docs"-style comments (not that that's even related. I've gone a little off topic here!)


LLMs augmented with RAG have great potential for docs as well.

Have a problem, ask the LLM and it will reference the docs. So you don't have to read through 40 pages just to find an answer.

Some products already make use of it for their docs. More will in the future.

The advantage then is that you can have up-to-date docs that the LLM can pull from, and it can hopefully pinpoint the relevant docs accurately and summarize an answer for the user.

I also think some startups will come that focus on providing this kind of service. Probably several such startups exist already even. Similar to how there are some companies from before LLMs existed that focused purely on better access to docs of open source products.


Documentation has to meet the user, not the other way around. Otherwise it's poor documentation. Docker should update their docs to show what 90%+ of people are interested in and leave the deep dives for later.

Compare these two similar functions:

https://hexdocs.pm/elixir/1.12/Enum.html#map/2

https://modules.vlang.io/maps.html#to_array

One has beautiful clear examples of what 99% of people are going to want to do. The other is some kind of secret language people must decipher, especially if they are new to the language. "What is 'm'? What is 'I'?"

Docker can improve!


The gotcha is that "global" args don't propagate automatically to all stages (thin includes 1 stage builds).

"I want this one arg in multiple stages, so I'll declare it above everything" is the chain of thought.
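
Which in practice means repeating the declaration in every stage that needs it, something like this (a sketch; the images and arg name are illustrative):

  ARG APP_VERSION="1.2.3"      # the "global" arg, only usable in the FROM lines below

  FROM golang:1.22 AS build
  ARG APP_VERSION              # has to be repeated here...
  RUN echo "building ${APP_VERSION}"

  FROM debian:12
  ARG APP_VERSION              # ...and again here, otherwise it expands to an empty string
  RUN echo "packaging ${APP_VERSION}"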


I interpret each directive in a Dockerfile as creating a new layer of an image. So this ARG-before-FROM gotcha doesn't feel like a gotcha to me, but rather, the consequence of literally interpreting "ARG" and not knowing the side-effects of a directive in a Dockerfile. (Yes, even WORKDIR, ENTRYPOINT, and related instructions create a layer, albeit a 0-byte one)


It persists across every other new layer. It just doesn't persist across FROM.


It's explained on the official documentation: https://docs.docker.com/reference/dockerfile/#scope


True, though gotchas exist when user intuition doesn't match with actual behavior regardless of whether they are mentioned in docs.


Principle of least surprise.

If you need to write in the docs about a surprise that a user otherwise wouldn't have expected, maybe it's a sign that the surprise should be fixed so that it's not surprising behaviour.


> I don't see the gotcha, that's how it is supposed to work. It's just their purpose

The issue here is that docker evolved rather rapidly and in a “let a thousand flowers bloom” sort of manner. And because of that you have these subtle but confusing differences between behaviors that aren’t really all that consistent.

A good example of this is how the shell is handled from layer to layer (sorta this), or even how CMD and ENTRYPOINT behave (or don't).

If the spec has allowances for behaviors like this, generating warnings would be the best possible outcome (e.g. referencing a variable that theoretically isn't set). Maybe certain runtime / runc / build envs complain, but the author didn't see the complaint.


A worse idea would be to run bash scripts ad hoc without at least `set -u`.

There is an unofficial bash “strict mode”, overlooked and undervalued:

set -euo pipefail

Just use it everywhere and use shellcheck for scripts.


`set -e` is evil: https://web.archive.org/web/20240511180116/https://mywiki.wo... (wooledge wiki doesn't seem to load for me for some reason so archive.org it is)


This perfectly illustrates the problem with bash.

Somebody did something the straightforward way, which worked. A better way is suggested, but that also has a problem that somebody else didn't know about.

Bash lets you do anything and have it kind of work, but you never know how many footguns you're setting up for yourself.


I hate bash


That's an interesting link and an argument I hadn't seen before.

In my "real world" usage, I've found "bash strict mode" incredibly useful, especially when scripting things that others will be modifying. It just avoids all of the "somebody forgot an error check", "there's a typo in a variable name", etc., type errors. And pouring through logs to figure out what actually went wrong in CI/CD is brutal.

Looking at the examples:

- I don't generally do arithmetic in bash, that's a sign of "time to change language"

- Also generally use `if ...; then ...; fi` instead of `test ... && ...`

If you do the advanced type of scripting that somebody who writes a Bash FAQ does, I'm sure `set -e` can be annoying. But once you learn a couple of simple idioms (covered on the original "bash strict mode" page) you don't have to know much more to have fairly robust scripts. And future maintainers will thank you for it!


I think the author is expecting ARG at runtime, but it is supposed to be a build-time instruction. The bigger gotcha I have seen is using ENV instead of ARG, especially for proxies: you may be using a proxy at build time but don't need it (or it's not reachable from your network) at runtime.
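
A rough sketch of the difference (the proxy URL is made up):

  # build-time only: visible to RUN in this stage, not set in containers run from the image
  ARG http_proxy
  RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*

  # by contrast, an ENV would persist into every container started from the image:
  # ENV http_proxy="http://proxy.example.internal:3128"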


POSIX sh has “set -u” to handle this category of errors — scripts that attempt to expand an unset parameter will exit with an error message.

Perl has “use warnings ‘uninitialized’” for the same reason.

It sounds like this feature would be worth considering for the Dockerfile spec.


I wish we would rather get rid of Dockerfile in favor of what something like buildah does:

https://github.com/containers/buildah/blob/main/examples/lig...

Since Dockerfile is a rather limited and (IMHO) poorly executed re-implementation of a shell script, why not use shell directly? Not even bash with coreutils is necessary: even posix sh with busybox can do much more than Dockerfile, and you can use something else (like Python) and take it very far indeed.


That's like saying "why do we bother with makefiles when we can just make a shell script that invokes the toolchain as needed based on positional arguments?". Well, we certainly could do that but it's over complicated compared to the existing solution and would represent a shift away from what most Docker devs have grown to use efficiently. What's so bad about Dockerfile anyway?


> What's so bad about Dockerfile anyway?

Things I've run into:

* Cannot compose together. Suppose I have three packages, A/B/C. I would like to build each package in an image, and also build an image with all three packages installed. I cannot extract functionality into a subroutine. Instead, I need to make a separate build script, add it to the image, and run it in the build.

* Easy to have unintentional image bloat. The obvious way to install a package in a debian-based container is with `RUN apt-get update` followed by `RUN apt-get install FOO`. However, this causes the `/var/lib/apt/lists` directory to be included in the downloaded images, even if `RUN rm -rf /var/lib/apt/lists/` is included in the Dockerfile. In order to avoid bloating the image, all three steps of update/install/rm must be in a single RUN command (see the sketch after this list).

* Cannot mark commands as order-independent. If I am installing N different packages

* Cannot do a dry run. There is no command that will tell you if an image is up-to-date with the current Dockerfile, and what stages must be rebuilt to bring it up to date.

* Must be sequestered away in a subdirectory. Anything that is in the directory of the dockerfile is treated as part of the build context, and is copied to the docker server. Having a Dockerfile in a top-level source directory will cause all docker commands to grind to a halt. (Gee, if only there were an explicit ADD command indicating which files are actually needed.)

* Must NOT be sequestered away in a subdirectory. The dockerfile may only add files to the image if they are contained in the dockerfile's directory.

* No support for symlinks. Symlinks are the obvious way to avoid the contradiction in the previous two bullet points, but are not allowed. Instead, you must re-structure your entire project based on whether docker requires a file. (The documented reason for this is that the target of a symlink can change. If applied consistently, I might accept this reasoning, but the ADD command can download from a URL. Pretending that symlinks are somehow less consistent than a remote resource is ridiculous.)

* Requires periodic cleanup. A failed build command results in a container left in an exited state. This occurs even if the build occurred in a command that explicitly tries to avoid leaving containers running. (e.g. "docker run --rm", where the image must be built before running.)
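
For the apt bloat point above, the usual workaround is collapsing the three steps into one RUN, roughly like this (package names are just examples):

  RUN apt-get update \
   && apt-get install -y --no-install-recommends curl ca-certificates \
   && rm -rf /var/lib/apt/lists/*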


> Must be sequestered away in a subdirectory ... Must NOT be sequestered away in a subdirectory

In case you were curious/interested, docker has the ability to load context from a tar stream, which I find infinitely helpful for "Dockerfile-only" builds since there's no reason to copy the current directory when there is no ADD or COPY that the build is going to use. Or, if it's just a simple file or two, it can still be faster:

  tar -cf - Dockerfile requirements.txt | docker build -t mything -
  # or, to address your other point
  tar -cf - docker/Dockerfile go.mod go.sum | docker build -t mything -f docker/Dockerfile -


Thank you, and that's probably a cleaner solution than what I've been doing. I've been making a temp directory, hard-linking each file to the appropriate location within that temp directory, then running docker from within that location.

Though, either approach does have the tremendous downside of needing to maintain two entirely separate lists of the same set of files. It needs to be maintained both in the Dockerfile, and in the arguments provided to tar.


> The documented reason for this is that the target of a symlink can change

The actual reason is that build contexts are implemented as tarballing the current directory, and tarballs don't support symlinks.


> and tarballs don't support symlinks.

Err, don't they? If I make a tarball of a directory that contains a symlink, then the tarball can be unpacked to reproduce the same symlink. If I want the archive to contain the pointed-to file, I can use the -h (or --dereference) flag.

There are valid arguments that symlinks allow recursive structures, or that symlinks may point to a different location when resolved by the docker server, and that would make it difficult to reproduce the build context after transferring it. I tend to see that as a sign that the docker client really, really should be parsing the dockerfile in order to only provide the files that are actually used. (Or better yet, shouldn't be tarballing up the build context just to send it to a server process on the same machine.)


That looks much the same as running a container in Docker and then committing it into an image. But this does not seem to allow setting the entrypoint or other image configuration values.



For this use-case you don't need anything in the Dockerfile spec. ARG is not magically expanded inside RUN blocks, it's just an env var available during the build stage.

In other words you can use this code to catch this error:

  ARG DEBVER="10"
  ARG CAPVER="7.8"
  FROM debian:${DEBVER}
  
  RUN <<EOF
  set -eu
  printf "DEB=${DEBVER}\nCAP=${CAPVER}\n"
  EOF
This code fails with the expected error.


That seems like a perfectly good idiom and it’s fantastic to hear that Dockerfiles support heredocs — I had no idea, thank you!

https://www.docker.com/blog/introduction-to-heredocs-in-dock...

It would still be possible to accidentally reference uninitialized environment variables in contexts other than RUN though. Having those be treated as errors would be useful.


This is unnecessarily complicated. Just use `SHELL`

    SHELL ["/bin/sh", "-o", "nounset", "-c"]
    RUN printf "DEB=${DEBVER}\nCAP=${CAPVER}\n"


Follow-up question: Does the SHELL need to be set within each stage of a multistage build? Ideally, it could be set once to remove this footgun, but I’m guessing the footgun gets reloaded with each stage.


Yes.

In general, I think of multi-stage builds as if multiple docker files were concatenated; if ARG or SHELL doesn't exist in the theoretical split Dockerfile, then it wouldn't exist in the multistage build.
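
Roughly, and assuming the stages start from independent base images rather than FROM-ing each other (a sketch, not tested beyond the simple case):

  FROM debian:12 AS build
  SHELL ["/bin/sh", "-o", "nounset", "-c"]
  RUN echo "strict shell in this stage"

  FROM debian:12
  # back to the default shell here unless the SHELL line is repeated
  SHELL ["/bin/sh", "-o", "nounset", "-c"]
  RUN echo "strict shell in this stage too"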

Docker(file) is a thing which brought in some quite neat new ideas, and a lot of these "footguns". I did read the docs, and I am comfortable now building even complicated "Dockerfile-architectures" with proper linting, watching out for good caching, etc. I wish there were better alternatives; I saw images being built with Bazel, and thanks, but that is even more convoluted, with presumably a different set of footguns.

I am not saying that Dockerfiles are good. But I also think that knowing the tool you use differentiates you from being a rookie to an experienced professional.


> I am not saying that Dockerfiles are good. But I also think that knowing the tool you use differentiates you from being a rookie to an experienced professional.

I agree in general, though I'd also add that a tool requiring somebody to know and avoid a large number of footguns is what differentiates a tool from being a prototype and being production-ready. While Docker has a large number of features, the number of gotchas makes it feel like something that should only be used in hobby projects, never in production.

In the end, I ended up making several scripts to automate the generation and execution of dockerfiles for my common use cases. I trust myself to proceed cautiously around the beartraps once. I don't trust myself to sprint through a field of beartraps if there's a short deadline.


So now you need to wrap all your RUN lines in here docs?

Better defaults and/or an easier way to change the defaults would be good!

The previous commenter is right -- it's all very well to say "just read all the docs" or "it's consistent with the way other tools work", but sometimes both the docs and the tools can be made less error-prone for users.


Empty/unset environment variables not, by default, being cause for immediate error and abort is the billion dollar mistake of the DevOps/infra world.


I agree but let’s hear the counter arguments so as to not sound like we aren’t listening.

Environment variables can be useful. The first version of some code can be oblivious to environment variables and the second version can use them for feature flags. Because it is entirely optional to provide these, the second version of the code can be deployed to the first version’s infrastructure, and the first version will happily ignore environment variables set in the second version’s infrastructure.

As a tool for incrementally moving heterogeneous parts of a system forward without needing atomic leaps, the use of environment variables as optional flags is very useful.

I just wouldn’t even start from there. Instead I configure everything in code, in one repo, with all new deployments appearing as new clusters where the only thing they have in common is that they take production traffic. This kind of production infra is very useful but there are downsides for sure. It makes database schema changes, for example, something between very hard and anathema to execute.


That is what you get when people reinvent the wheel and lifetimes/scopes are implicit. Docker could've used something like JSON5 [0] for their configuration format to make the lifetimes explicit. Another time when easy won over simple. [1]

[0] https://json5.org/ [1] https://www.youtube.com/watch?v=SxdOUGdseq4


JSON for something that essentially mirrors a shell installation process? Feels like you are trying to reinvent things with a golden hammer, not like actually making it easier.


I remember tripping over this. It was frustrating.


I'm seeing a lot of people complaining about Dockerfile being too complicated, but if you can wrap your head around local scopes and Docker's layering it's really not rocket science. Is Dockerfile that bad or is it an overall lack of understanding? I've used Dockerfile for years and have found it to be intuitive and powerful.


Not sure why you are coming here to rant about "a lot of people complaining about Dockerfile being too complicated" when the topic here is a gotcha that's not at all obvious and Docker working in a way that is not intuitive. If I specify ARG I obviously expect it to be set everywhere in the Dockerfile. Your rant is out of place because it generalizes and attempts to shift topics.


JavaScript is also not complicated. It's just poorly designed when it comes to matching programmer's intuition.

With this out of the way: Dockerfile as a concept, i.e. a minimalist (in simple cases) way to declaratively configure a container, is great. The execution of this concept is, on the other hand, awful. The behavior of gotchas like the one OP described isn't the reason for the awfulness; rather, it comes from the inconsistent-by-default builds. That is, a Dockerfile is not a full description of the contents of the built image. To know what will go into the image, you also need to consider the state of the entire world (because you can download things from the Internet during the build, execute code with unpredictable side effects, etc.)


ARG not being available in ENTRYPOINT or CMD is so annoying for usability.


You can use ENV to convert ARG into an environment variable which can be used in CMD through shell environment variable evaluation:

    FROM ubuntu:20.04

    ARG foo
    ENV foo $foo

    CMD ["/bin/sh", "-c", "echo $foo"]


To me this is just one of many infrastructureisms. Tools, platforms, etc are all leveraging metaphors that don't map well to the mental model of _application development_.


The CSS is whack on Firefox Mobile


tldr; ARG values are only valid in the build image, not anywhere else.

In a multistage build, they are only valid in the image that declared them (ie. they do not magically pass over to any other image).


It's kind of funky when you use them in the names of stages, too. IIRC you have to declare them right above.


I take issue with your use of "magically". Multistage builds look like an inner scope for each stage, and an outer scope that contains them. In most programming languages, variables in an outer scope are accessible within an inner scope.

I can see an argument for variables declared within one stage being inaccessible in another stage, as the FROM keyword starts a new sibling stage. However, the stages are still children of the outer scope, and should have access to variables in the outer scope. That they do not is rather unexpected.


Understood, sorry about that.

But that's how it feels to me.

It's mostly because my mental image of multistage builds is different, I see them as a way to make the final image smaller by throwing away everything from the previous image.

So it does make sense to me that ARG values are thrown away after a new FROM instruction and if you need them, you need to declare them again.

These two guides have led me to this view, I blame the author! ;) They're excellent BTW, I've revisited them a lot.

https://vsupalov.com/docker-arg-env-variable-guide/#arg-and-...

https://vsupalov.com/build-docker-image-clone-private-repo-s...


> Understood, sorry about that.

And apologies on my side, as I worded it a bit strongly. I agree with pretty much everything in your post, except for the implication that it would be magic. To me, the expected behavior is to follow lexical scoping, just like the vast majority of languages from the past several decades.

> So it does make sense to me that ARG values are thrown away after a new FROM instruction and if you need them, you need to declare them again.

True, and I'd expect an ARG from a previous stage to be undefined in the next stage. Each ARG belongs to the scope in which it is declared, and a new scope is entered with each FROM statement. However, ARG statements prior to the first FROM statement don't fit into that model. They are before any FROM statement, so they can't belong to any FROM statement.

There's a design decision on whether statements before the first FROM statement form a parent scope of, or are unrelated to, the scope used for each stage. They feel like they should be a parent scope, because these ARG definitions can be used in a FROM statement, and so it keeps surprising me when they are instead treated as unrelated scopes.




