, so that custom configurations in that directory win over the distribution-provided defaults set up after this Include directive inside /etc/ssh/sshd_config.
> In addition to /etc/systemd/system, the drop-in ".d/" directories for system services can be placed in /usr/lib/systemd/system or /run/systemd/system directories. Drop-in files in /etc/ take precedence over those in /run/ which in turn take precedence over those in /usr/lib/. Drop-in files under any of these directories take precedence over unit files wherever located. Multiple drop-in files with different names are applied in lexicographic order, regardless of which of the directories they reside in.
> ...
And this is just an excerpt which doesn't cover the details. Read the manpage for all the hilarity.
>Over-elaborated hierarchies where configuration files may reside, and their precedence schemes.
>Read the manpage for all the hilarity.
If you find it hilarious it's because you don't understand the reason. /usr belongs to the distribution and /etc belongs to the user. The distribution doesn't touch /etc and the user doesn't touch /usr.
Upstream ships a unit with its own opinion of the defaults. That unit goes in /usr.
Distribution would like to override a default. It can patch the file but that is somewhat harder to discover (it would require reading the package source). Alternatively it can add a drop-in. That drop-in goes into /usr too since it's a distribution file.
User would like to override a default. They should not edit /usr because that belongs to the distribution. They drop their override in /etc.
Some units are created at runtime on demand and thus ephemeral, like mount units generated by generators. These go in /run. These are functionally "system-provided units" because users must be able to override them, so they're below /etc in the hierarchy.
The "defaults" and "overrides" we're talking about are both at the level of the individual setting as well as at the level of the whole file. If a user wants one of the settings to be different, they can add a new 99-foo.conf file that unsets / changes it. If they want a whole file to be different, they can add a new file with the same name that is empty / has different values for its settings.
BTW `systemctl cat whatever.unit` will print all the files that systemd considered in the order that it considered them.
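As a concrete sketch of the setting-level case (the unit name and values here are hypothetical): if the distribution ships foo.service under /usr/lib/systemd/system and you only want different restart behaviour, you add a drop-in instead of copying the whole unit:

    # /etc/systemd/system/foo.service.d/99-local.conf
    [Service]
    Restart=always
    RestartSec=5

After `systemctl daemon-reload`, `systemctl cat foo.service` shows the shipped unit followed by this drop-in, in the order they were applied.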
>And this is just an excerpt which doesn't cover the details.
The only detail your excerpt doesn't include is the consideration of `service.d/`, `socket.d/` etc directories which provide overrides for all units of that type. It's just one point, not commonly used, and not that big of a deal either.
> If you find it hilarious it's because you don't understand the reason.
I think it's hilarious that you think users are supposed to understand "the reason" at this level of detail.
When we make a thing, we make choices about who it's for. Sometimes those choices are explicit. But this is a good example of an implicit choice. Clearly it grew into this shape for reasons, but those reasons are about the people doing the making, not the people using it.
Even if we grant that all of this hierarchy is necessary, it still imposes a large burden on the person trying to figure out "why is X doing Y?" And that might be fine if the makers said, "Well this is going to be hard, so what can we do to make it easy?"
But denying that users will reasonably find this hilarious (or frustrating or impossible) is just digging the hole deeper. Telling people that their understandable reactions are wrong doesn't change their reactions, it just gets them to stop mentioning the problems around you.
> it still imposes a large burden on the person trying to figure out "why is X doing Y?"
Systemd has tools for that, such as systemctl cat which shows all applicable configuration and what files they come from.
Personally I have found the systemd hierarchy to be very useful. It allows me to create overrides for certain properties in a Unit without having to worry about keeping the rest in sync with upstream, or my package manager overwriting my configuration. And it allows me to have different independent packages (or chef recipes or ansible playbooks) that create different override files for the same unit without stepping on each other's toes.
Yes it is a little complicated, but for most end users, you don't really need to worry about most of the complexity, you just need to know where you put your overrides. And for distributors and power users, that complexity serves a useful purpose.
> Systemd has tools for that, such as systemctl cat which shows all applicable configuration and what files they come from.
Sure, it could be that systemd does have amazing tooling for helping users with the burdens they have created. Or, given the very mixed feelings about systemd, maybe they've tried to be user focused but not hit the mark. I don't know enough to say.
But my point is that either way responses of the form "it's so easy, you just have to remember [9 paragraphs of detail users don't care about]" are part of the problem, not the solution.
Systemd didn't create this problem. But it solves the issue of distributions shipping config files as part of packages and then, on every package upgrade, having to reconcile between the distribution's config and your modifications. Now with the drop-in system you don't need to do that.
I am not an expert, but I think in general systemd has a lot of complexity, and it's there to handle existing issues in a better way. Some of the older init systems might be simpler to describe or get started with, but lead to more confusing situations in the long run.
"Make it as easy as possible, but not simpler". I think this applies perfectly here. Systemd has very complicated requirements due to its very unique position in a Linux distribution. And as a (power) end-user of it, you basically need you basically needs to touch only /etc
> Systemd has very complicated requirements due to its very unique position in a Linux distribution.
But are all those requirements actually necessary for the job it's doing? Or have its developers imposed artificial constraints through their choices in designing systemd that make it more complex than it could otherwise be?
The combination of distributions existing and believing they “own” parts of the system, and users needing to override settings means it is as simple as possible, and no simpler.
A distribution 'owning' a directory means that the package manager is allowed to change files in there, and allowed to assume nothing major was changed from what it installed.
A distribution that doesn't own any directories like this effectively can't make any changes. It's like writing a class without private fields.
That's because some features are targeted at different audiences. End users want simplicity and ease of use. Admins also want those, but they are secondary to determinism, consistency, and ease of tracking changes.
I find the best way to think about systemd is as the result of some people who probably knew the most about init systems remaking init to be consistent, documented, and able to handle a bunch of new features that different newer init systems had used to good effect.
Then, because regardless of how much those people knew, the vast majority of information about needs was sequestered in 1000 different teams and projects, they had to add a bunch of new features. Repeat that 100 times, and you get what we have today, which is a system that mostly works (and works well if you actually adopt their preferred patterns), but one that also has the craziest and hardest-to-remember directive names I have ever seen, where slight nuance about how they function means entirely different params.
In a perfect world it would be rewritten with the same features but with all the knowledge of what features are actually needed known and consistently implemented with sane naming from the beginning, but nobody wants to go through that again.
That last bit is especially what interests me about systemd. From the way you describe it, it sounds like they made it mainly to please themselves, and were then forced to accommodate the actual users. But that they did so grudgingly and partially, again favoring their interests.
Assuming that's how the history went, fine. Everything has a history, and we have to deal with the now.
But my point is that we are not going to deal well with the now, let alone the future, unless we recognize the impact the history has on people. So if people defensively shout "that's not funny" and imply users are wrong for being bothered by the mess, then we're on track to build another mess. We can't do better until we are honest about why it didn't turn out as well as we would have liked.
> it sounds like they made it mainly to please themselves, and were then forced to accommodate the actual users
I think it's less that than that there were real advances they were after, which they delivered. Poettering's original essays on what it could and should do and where the ideas came from (such as OS X's launchd and some things from Windows) referred to numerous things that every sysadmin I knew who read them at the time was excited to see delivered (but none of us really appreciated what it would mean to deliver them).
The issue is not that they initially built it to their own desires, but that the way SysV init worked, just launching bash scripts to start and control the process that should eventually be running, meant that a lot of crazy behavior was put into scripts over the years, because when you can write a program to control your program, sometimes it's easier to just put special behavior into your launch script than to patch the real code in question in some way.
When this crashed against systemd's desire to know what's actually running and to control the environment perfectly (which is beneficial; it was always a pain to have a process that started fine from your root user shell but not on startup), and to strongly disincentivize intermediary programs between systemd and the process in question (which is what a bash script is) for reasons of process and resource tracking (Linux kernel cgroups were just starting to be widely used to good effect around this time), there started to be a mass influx of feature requests for behavior that used to be handled by dedicated bash scripts, but that systemd now needed to account for.
Combine that with systemd adding real dependency tracking (as well as start-order tracking, which is related but not quite the same, leading to the difference between Wants and Requires), and what we have is a system that's just trying to do something vastly more complicated than what SysV init did previously (and that's before we get to the scope creep which caused them to want to take over some other early boot processes so they could be deterministic or whatever, which is also a PITA for many people).
If you're actually interested in the history of this, I recommend reading "systemd, 10 years later: a historical and technical retrospective" at https://blog.darknedgy.net/technology/2020/05/02/0/. That's where most of my info on the details comes from. I was around in the beginning and was excited to see systemd start, but I didn't actually use it until far later since most of the systems I administer don't get changes like that immediately (and it was in an extremely slow release cycle at that time).
If you read the above you may come away with a different perspective than me, but I will say that systemd is very consistent in its operation and very robust in how it tracks processes and whether there's been a problem and restarts are needed, etc., as long as you actually adopt their preferences (which is to say, run everything in the foreground and let systemd deal with STDIN/STDOUT and the environment, and have a good pid to start with). If you do that, writing a new service file is extremely easy, far easier than it ever was with SysV init. It's a different paradigm, and sometimes it can be a pain when some process doesn't want to play nice, but mostly it just works once you stop fighting it, which is nice, and cuts down on admin time spent figuring out problems (especially since it just grabs all STDOUT and STDERR for you, so random error messages that don't go to logs aren't lost either).
Thanks for the details. I will check out the link.
I should say that "real advances they were after" is not at all incompatible with "made it mainly to please themselves". The things that pleased them may well have been "real advances"; they often are. The question is the extent to which they were prioritizing the needs, capabilities, and interests of the whole audience being served, not just the things they personally thought important or preferable.
Users are only supposed to know about `/etc`. The rest of the directories are useful for distribution maintainers, people who package software for a distribution (but may not be maintainers), and some developers.
> Distribution would like to override a default. It can patch the file but that is somewhat harder to discover (it would require reading the package source). Alternatively it can add a drop-in. That drop-in goes into /usr too since it's a distribution file.
> User would like to override a default. They should not edit /usr because that belongs to the distribution. They drop their override in /etc.
Larry Wall elegantly solved these problems 38 years ago with the "patch" utility. Debian is a good example of how its use has been standardized to solve these problems.
The patch approach to managing distribution-provided configuration files is superior to the systemd unit approach for at least three reasons:
- Non-experts can read the configuration files from top to bottom in linear order, so the documentation can be inlined by the distribution provider, which means they can add maintainer's notes as necessary.
- It also works better than the systemd unit style of configuration for sophisticated users. In particular, if you add a now-incompatible option to the file in the suggested place (next to the documentation and commented out default version of the setting), then the package management system can detect that you've made a potentially breaking change before installing the new software, and ask you what to do. This works in the absence of sophisticated data modeling, schema mapping tools, etc.
- The implementation of all this is factored out and standardized across thousands of packages, so it only needs to be implemented once and understood once.
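For readers who haven't seen this workflow, a rough sketch (the hunk contents and file names are invented): the distribution keeps its deviation from upstream as a patch and applies it while building the package:

    $ cat harden-sshd.patch
    --- a/sshd_config
    +++ b/sshd_config
    @@ -10,3 +10,3 @@
     # Authentication:
    -#PermitRootLogin prohibit-password
    +PermitRootLogin no
     #PubkeyAuthentication yes
    $ patch -p1 < harden-sshd.patch
    patching file sshd_config

The administrator's own edits then live directly in the installed file, and the package manager prompts when upstream changes and local edits collide.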
Rebasing patches is pretty hard I'd say. Much harder than just explaining to someone the logic behind /lib, /etc and /run. I mean, these directories are separate for a reason. Knowing the idea behind the directory split (a.k.a. the Linux Filesystem Hierarchy Standard) is more essential than knowing how to use the patch command line tool. There are more and more people who don't want to use the command line. And that's completely fine.
> - It also works better than the systemd unit style of configuration for sophisticated users. In particular, if you add a now-incompatible option to the file in the suggested place (next to the documentation and commented out default version of the setting), then the package management system can detect that you've made a potentially breaking change before installing the new software, and ask you what to do. This works in the absence of sophisticated data modeling, schema mapping tools, etc.
Anecdata, but still: Every single time the package manager asked me what to do, it was not because of a breaking change, but simply because the base file changed in a way that the patch didn't apply cleanly anymore. That is, I, the user, had to work around limitations of the patch format. If the distro configuration and the user configuration had been separate files, things would have just worked fine.
You don't have to be an expert to read multiple files. If you do need expert assistance with something like that please contact me and we can work something out.
So it's not hilarious statements like this that make it hard to understand the reason? I almost cannot tell whether your whole post is sarcasm, or just showcasing irony.
You mean that the top-level Linux directory are hilariously badly named?
If so, I don't think anybody will disagree.
But if you meant that's a large problem, then well, no it isn't. There are 7 of those badly named ones. It's a matter of reading half a page of text to see what they do, and after 2 or 3 times you read it as reference, you have them memorized.
It would be nice to solve the issue, but it isn't anybody's priority.
Yes, "The user (syadmin) doesn't touch /usr, except /usr/local", where compiled in applications reside with full path so /usr/local/{bin,sbin,share,...}.
If you're deploying an especially big application, you may also prefer /opt/$APPNAME/{bin,etc,usr,...} (A full filesystem tree, basically).
We manage a fleet of systems at work, and I don't remember editing or touching a file under /usr, except for the period when I was developing .deb or .rpm packages to deploy on said fleet.
There are many written and unwritten rules in the Linux world, and sometimes people disregard or ignore these rules so blatantly that I can't tell whether the post I'm replying to is sarcasm or irony.
I think there is a gap here between those who routinely (exclusively?) used UNIX machines along with 50, 100, 1000 other people all logged in at once (generally pre-Linux era) and those who only ever got to see and enjoy a time when everyone could have a UNIX workstation on their desktop (Linux, Mac, or other x86 flavour, take your pick).
Yeah, I really feel like we're in a "horseless carriage" or "radio with pictures" era for computing. We have incredibly powerful machines available. But the dominant paradigm for managing that power is to slice it up into things that simulate much less powerful processors in software environments designed for a shared 1970s university department minicomputer.
In many ways I love it, of course. It's what I grew up on, and I'm sure I'll have a small Linux cluster in the retirement home, because that's how computers are "supposed to be". But I think the old paradigm has only lasted so long because we have so much computing power to burn.
I'm not sure I understand. When the "old paradigm" was new, computing power was much more limited. They couldn't afford to "burn" any computing power. But it's only stuck around, because now we have more computing power? That doesn't make sense
You might read some Kuhn. Paradigms change when the old one becomes unsupportable. But the vast increase in computing power per dollar means we can waste a lot of it without it being economically uncomfortable.
He works for a financial trading firm, where the competitive nature of it means that every processor cycle counts. For him, older paradigms are perplexing and irritating, because they are major barriers to getting the most out of the hardware.
Back in the shared era, the machines were better-suited to individual customization, and were more easily understood by system administrators. Now, they're optimized to... waste more computational resources when idle?
If you want to use /usr for yourself, be prepared to throw the FHS and compatibility with almost everything away and rebuild your whole userland with something like NixOS/Guix.
In particular, in our own version of the system, there is a directory "/usr" which contains all user's directories, and which is stored on a relatively large, but slow moving head disk, while the other files are on the fast but small fixed-head disk. — https://www.bell-labs.com/usr/dmr/www/notes.html
Isn’t that just a backronym? I read recently (here, I think) that on some original specific UNIX system, the user directories were there originally, and then other stuff that belonged elsewhere went in there because the other physical volume (where things like /bin were) ran out of space. Later another disk was added and profiles were moved there, to make /home.
It’s still hilarious. The fact that there’s a need for this hilarity doesn’t negate that. What a mess humanity has created… (Let’s write a suckless OS!)
systemd configuration lookups are lightweight and straightforward compared to Ansible's variable lookup rules:
--
Understanding variable precedence
Ansible does apply variable precedence, and you might have a use for it. Here is the order of precedence from least to greatest (the last listed variables override all other variables):
1. command line values (for example, -u my_user, these are not variables)
2. role defaults (defined in role/defaults/main.yml)
3. inventory file or script group vars
4. inventory group_vars/all
5. playbook group_vars/all
6. inventory group_vars/*
7. playbook group_vars/*
8. inventory file or script host vars
9. inventory host_vars/*
10. playbook host_vars/*
11. host facts / cached set_facts
12. play vars
13. play vars_prompt
14. play vars_files
15. role vars (defined in role/vars/main.yml)
16. block vars (only for tasks in block)
17. task vars (only for the task)
18. include_vars
19. set_facts / registered vars
20. role (and include_role) params
21. include params
22. extra vars (for example, -e "user=my_user") (always win precedence)
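As a hypothetical illustration of the two ends of that list (file names and the variable are made up), a role default loses to a group var, which in turn loses to an extra var on the command line:

    # roles/web/defaults/main.yml   (item 2: role defaults, lowest of the three)
    app_port: 8080

    # group_vars/all.yml            (items 4-5: group_vars/all)
    app_port: 8081

    # item 22: extra vars always win
    $ ansible-playbook site.yml -e "app_port=9000"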
IIRC, Ansible had this exact variable precedence long before they were acquired by RH.
I get that hating RH is in vogue currently, but let's keep it grounded in reality please.
EDIT: And for the record, these Ansible precedence rules are not a bad thing. Once you are somewhat familiar with them, it allows you to write incredibly flexible and extensible playbooks and roles.
As an Ansible person, I couldn’t agree more. Being able to manage variables at basically any level of precedence is extremely useful.
I am also ever more deeply concerned about IBM’s hell-bent purge of all the good left at Redhat. I had thought it would take them less time to destroy Redhat, but I’m sure they wanted to make sure they spent time “listening” before making all these ecosystem-eroding changes.
Chef is worse. It's discouraged to use `override` in Chef cookbooks but nothing says that you can't, and introducing spooky action at a distance is just part of your life now.
I love how there are 21 places to look if you're trying to figure out why a CLI parameter is being ignored. Does it at least print a warning or something?
According to the comment above, `cmd -u my_special_user` will only use the value "my_special_user" if none of the other 21 places have another configuration for the user, right?
".d" directories are a well-established pattern to split a complicated config file into logical units. In the case of something like systemd this is absolutely essential, as you wouldn't want the config for your web server in the same file as the mail server.
/usr/lib is the location where the package manager will drop config files. This is the default, distro-provided configuration. /etc/ is where the administrator can place config files, providing system-specific overrides. /run/ is for files "created at runtime", I am not quite sure what the point of those is.
/run is useful for configuration generation. Sure, you could maybe poke an API and have it do the right thing, but if "everything is a file" then why not put your configuration in a file? That way it's easier to see what's actually happening.
Systemd is also a useful demonstration of how to show people where configuration comes from -- `systemctl status <unit>` will include a list of all the files found during lookup, and `systemctl show <unit>` will give you the resolved unit configuration including all the defaults.
Drop-in files (config in .d folders) are a core extension point of systemd and not a symptom of an overly engineered config system. They are how services are meant to be custom-tailored to specific machines and even users' needs.
Let's say you need to pass a special flag to sshd on your system. A bad approach would be to go edit the distro-shipped sshd.service file: your change will work until the distro ships an updated file and you forget to hack in your change again.
Instead, place an sshd.service drop-in config file in the appropriate place to customize the service as necessary. Your config tweak takes precedence over the distro-shipped config, but doesn't live inside it or conflict with upstream changes.
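Concretely, such a drop-in might look like this (the flag and binary path are placeholders; the empty ExecStart= line clears the distro-shipped command before replacing it):

    # /etc/systemd/system/sshd.service.d/10-extra-flag.conf
    [Service]
    ExecStart=
    ExecStart=/usr/sbin/sshd -D -e

`systemctl edit sshd.service` creates and opens a drop-in like this for you, and `systemctl daemon-reload` makes it take effect.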
It's a little bit away from the domain of config files, but I think CSS is the most graceful example of handling this as a generalised problem. Style configurations cascade, and the last one read is the one that gets applied. There are overrides to that flow, but you have to very overtly add a piece of syntax to the overriding line like `!important`. Two overlapping overrides revert back to the "last read" approach. The approach is so fundamental to CSS that the entire language is named after it, so there's no ambiguity or forwards compatibility concerns in this regard whatsoever.
I’m sure you know this, but for readers who don’t, it’s much more complex than “last defined wins with a few override options”. It’s based on “specificity”, which ranks rules according to the selectors used.
Some examples:
1. #mydiv overrides .mydiv - IDs are more specific than classes
2. .outerdiv .innerdiv overrides .innerdiv - nested selectors are more specific
3. .mydiv overrides div - classes are more specific than elements
And a lot of other rules.
I can’t imagine I’d want this level of complexity in config files.
Rather, specificity is a triplet of (ID sub-selectors, class sub-selectors, element sub-selectors). When attributes conflict, the one whose rule has the more specific selector wins.
The specificities of the rules you wrote out are (1,0,0), (0,1,0), (0,2,0), (0,1,0), (0,1,0), and (0,0,1).
Inline attributes (@style) beat out-of-line ones except if they’re !important. Specificity also disambiguates between competing !important attributes.
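A small worked example using the selectors from the parent comment: both rules match the element, and the higher-specificity one wins regardless of source order, so the text is red:

    <div id="mydiv" class="mydiv">hello</div>

    #mydiv { color: red; }   /* specificity (1,0,0): wins despite coming first */
    .mydiv { color: blue; }  /* specificity (0,1,0) */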
> I can’t imagine I’d want this level of complexity in config files.
Lots of people don’t want it in their CSS either. It’s generally considered good practice to make everything the same specificity and avoid !important.
Sure, haven't seen that in a while in well structured code bases.
> make everything the same specificity
How would you achieve same specificity on everything? Only use either ids or classes? No type selectors / pseudo-classes / pseudo-elements / multiple selectors? I can't imagine how that would be feasible and am not aware of anyone recommending this.
With :is and :where it is trivial to trick specificity:
:is(selector_1, :where(selector_2))
always has the specificity of selector_1, and if you choose an impossible selector for selector_1 (the easiest approach is to give multiple ids or use nonexistent classes/elements) it only selects by selector_2.
There are a large number of companies that require ids as the basis for every selector, by policy. This may not be interesting to you, but I think this is what they meant. The fact they may or may not be familiar to you is incidental.
I'm familiar with policies that generally strive for specificity of (0, 1, 0) like BEM or (1, 0, 0), but I'd be very interested in any policy that rules out pseudo-classes like `:hover`, as prohibiting them would leave you to handle any interactivity you'd get for free from CSS via JavaScript.
> Specificity is the algorithm used by browsers to determine the CSS declaration that is the most relevant to an element, which in turn, determines the property value to apply to the element. The specificity algorithm calculates the weight of a CSS selector to determine which rule from competing CSS declarations gets applied to an element.
The rules of CSS specificity are a bit more involved than '!important' or not, and I frequently see beginners trip there, but apart from that I agree that the CSS rule set might be worth incorporating.
Fun trick if you use CSS post-processing (Less, Sass, etc.): "&&&" resolves to a very specific rule which overrides the other rules.
> Two overlapping overrides revert back to the "last read" approach.
IIRC, there's an important (heh) exception to that: user stylesheets. Without the "!important" override on the user stylesheet, the site stylesheet wins; with the "!important" override on the user stylesheet, the user stylesheet wins, even if the site stylesheet also had an "!important" for that rule.
The domains of style and configuration are very, very different. I'm not sure that what works okay for displaying a website in different layouts on vastly different unknown devices shares a lot of the same concerns as configuring programs on a machine.
While we're at it: I've made it a practice to expose an internal endpoint for every service to output its config. It has saved me on numerous occasions by catching config-related bugs.
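A minimal sketch of the idea (a hypothetical Flask-style service; the endpoint path, keys, and redaction list are placeholders), dumping the effective config while masking secrets:

    from flask import Flask, jsonify

    app = Flask(__name__)

    # resolved at startup from whatever layered sources the service uses
    RESOLVED_CONFIG = {"db_host": "db.internal", "db_password": "hunter2", "timeout_s": 30}
    SECRET_KEYS = {"db_password"}

    @app.route("/internal/config")
    def show_config():
        # report the effective configuration, masking anything sensitive
        return jsonify({k: "<redacted>" if k in SECRET_KEYS else v
                        for k, v in RESOLVED_CONFIG.items()})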
Could you explain a little more? I am not sure what exactly you mean by "exposing an internal endpoint" - do you mean that as a secured API endpoint which you can call? Isn't that hard to secure?
Depending on your environment, endpoints might not default to exposed. As an example where I work, we have to explicitly enumerate endpoints to expose (potentially including wildcards, like admin/*), and anything else is only accessible with direct access to the pod.
I have been leaning towards a different approach. Rather than reading multiple config sources at runtime, read them at build time, and emit a single config file with everything fully resolved. Put that on the filesystem, and have the app read it. You have an easy canonical reference for the configuration, with no need to expose anything. The lifecycle of the config doesn't have to match that of the app binary, so it could (and probably should) have a separate build and release process.
Amazon had a hierarchical configuration system for loading config that could override other config (or add to it, depending on the implementation). The Java framework would provide an endpoint to show you the final resolved config along with which file and line that value came from. Very helpful when trying to figure out why config was different than expected.
At least the inspector in Safari (and maybe other WebKit browsers) does something similar for CSS.
I’m all for it. It’s a pain when writing config systems (no longer just key/value, but key+value+meta), but very helpful. It can be a pain for things like JSON where libraries don’t give you that type of diagnostic information easily, however.
I had the distinct pleasure (?) of owning the C++, Python, and Java implementations of that config for a while and ... yeah, it had some positives, but the fact that it had two orthogonal hierarchies of configuration in addition to implementation-specific feature gaps and nuances made it something of a nightmare in practice.
Agreed; config retrieval code should log all this stuff: requests, where they are filled from, that sort of thing.
The problem with this approach, when it's enabled, is that log aggregators (e.g. Splunk) can end up including stuff like passwords which shouldn't be visible to the group able to view those logs, so some care is needed.
I'd rather have some config reserialization capability with provenance (perhaps as an opt-in). A good config reserialization can make itself useful in so many situations, if it has the right features. Like pointing out key-value pairs that don't have any effect, or simply writing out some documentation.
Logging has its merits, like discoverability when you stare at it without suspecting configuration to be the culprit, but I believe that there can and should be more.
Ant also had first writer wins. I tried to explain this so, so many times to so many people, and only rarely did it stick. On that project I spent way too much time fixing build errors, and ended up rearranging things to make the behavior more obvious rather than less, because without it people would fool themselves into thinking they had it right, because sometimes it would appear to work.
And the thing with developers is that they make lousy scientists. If they want something to work, they’ll take any information that supports that theory and run straight to the repository with it.
The trick is not to fool yourself, and you’re the easiest person to fool.
I would prefer a convention… but first wins is backwards IMO. It really sucks when a bad convention gets adopted, like this first wins, or spaces instead of tabs. ;)
First wins makes some sense from a security pov. You want to load the more "trusted" configuration files first because they must be able to further restrict the less trusted files, but don't want to let the less trusted files override the settings set by the trusted files.
Rejecting multiple definitions is better, unless you care about DoS (I would be pissed if I got locked out of ssh access due to an option being defined twice, for instance).
If there are multiple files in a wildcard pattern, what order are they resolved? I know for classpath loading, for example, it’s in directory order, which is file system dependent, and often based on the order the files were written to the directory and/or the size of the directory inode. Lots of fun figuring out why the wrong version of a class is consistently loaded on one or two servers out of twenty. (Always check your classpaths for duplicates kids.)
> If there are multiple files in a wildcard pattern, what order are they resolved
If you don't care (eg [0]) then do nothing and let the user bang their head and reinvent the wheel.
If you care, just sort it and mark it as 'platform dependent' or something. Nobody really cares about the exact rules, and a simple \d\d\-\w+\.conf pattern for the filenames is more than enough in 99% of cases.
Oh yes! Good thing if it only happens in the build server and not on deployments, but that can be "fun" enough. Bonus points if you have a culture of repeat builds being "special".
I've always seen this solved by sorting (lexicographical order of bytes, originally). First two characters were digits, for all files within a wildcard directory.
Because "first" and "last" don't convey priority. You have to know how the configuration parser works with defaults and overrides. Does the parser stop reading after the first match? Do later configuration directives override previous ones or are they appended to them, and if they are appended, in which order are they evaluated?
First and last are only ambiguous if we make the order ambiguous though. There's no reason for the order to be ambiguous, and if it is then obviously we can't use first/last. I don't think that first/last are either ambiguous or implementation dependent concepts. The other questions are simply other questions, and aren't directly related to whether the ordering is ambiguous or not.
There are plenty of applications that read a set list of configuration files and use either the first or last setting found. I agree with the responder that the last setting usually makes more sense.
Also: processes / running programs should provide a straightforward way to show their configuration.
While "sshd -T" shows the effective configuration from all configuration file parsing, there is no straightforward way to see the configuration of the/a currently running sshd process.
This has footgun potential, as someone could think their sshd is secured and hardened by checking with "sshd -T", while the actually running sshd process still uses old configurations.
Apache also makes it needlessly hard to find out its effective (and active / currently running) configuration. I don't know of any better way than awkwardly using mod_info ( https://httpd.apache.org/docs/2.4/mod/mod_info.html ) to get such insights.
Not sure I get the reasoning here. That requires you to actually go and proofread the configuration of each and every running process, at which point it's easier to just verify the effective configuration on disk and restart the service.
SSH is a special case because it leaves active sessions alive when it restarts, but that doesn't generalize to other daemons. Just restart if in doubt.
However, I do not think there is any standard API to follow; I used "info cfg", but many alternatives were possible. So sysadmins have to explore to find out which introspection command to give. Not ideal, but I don't think there is a solution.
Another good place for that is the --version output, if your --version is one of those verbose multi-line ones (instead of a basic single-line output).
Not relevant to tarsnap (and in general something I would be very careful about adding since people often don't realize what's in their environment) but I would say that command line options take priority over the environment and the environment takes priority over all configuration files.
Nice example, which opened a small rabbit hole for me: what does "first occurrence wins" mean for options that are meant to be used multiple times, such as "Port" in sshd_config. The man page only says "Multiple options of this type are permitted.", but now I wonder what happens when I put my own Port into /etc/ssh/sshd_config.d/whatever.conf but leave "Port 22" in sshd_config? Or is the (new since bookworm) default sshd_config commenting it out for this reason?
More generally, for multi-options do we maybe want "first file wins"?
Reading configuration directories in order from general to more specific and using the last value read makes the most sense to me. This is usually how I write my own applications.
Anyway, I thoroughly document the order and precedence, but the suggestion from Chris' blog to report more detail about the process is a good one.
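The pattern is simple enough to sketch (paths and file format are hypothetical); each later file overrides keys from earlier ones, and remembering the source per key gives you the "where did this come from" report almost for free:

    import json, os

    SOURCES = ["/usr/lib/myapp/defaults.json",
               "/etc/myapp/config.json",
               os.path.expanduser("~/.config/myapp/config.json")]

    def load_config():
        merged, origin = {}, {}
        for path in SOURCES:
            if not os.path.exists(path):
                continue
            with open(path) as f:
                for key, value in json.load(f).items():
                    merged[key] = value   # last occurrence wins
                    origin[key] = path    # remember where it came from
        return merged, origin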
>>>Some programs use a "last occurrence wins" approach, while other programs (e.g. sshd) use a "first occurrence wins" approach.
A bunch of applications at my job use a config-loading library that will fail on startup if the same option is set to two different values (on different lines). It's certainly frustrating at times...but it's also prevented a huge amount of ambiguity.
That would actually be my preference TBH. Failing quickly at startup is usually far better than unpredictable behaviour later on, particularly if it's a long-running service.
Though I've always thought that the behaviour of tools when launched interactively by the user, vs launched via an automated background process, should probably be different - it may well be that if a user/administrator isn't aware it failed on machine startup, the consequences of it not running at all are worse than an ambiguous configuration setting.
This is a bit ridiculous, but the fact is that the entire order of processing there is a result of practical needs evolved over likely thousands of organizations.
There are unit testing concerns, dev/test/stage/prod concerns, CLI args, OS vars, config files, IDE vs non-ide concerns, and more. And even some references to databases (if you squint and see JNDI as a database).
So yeah, finding out where property value X was set can be a bit maddening oftentimes.
"Configuration" is an afterthought in most languages and frameworks in initial iterations. So bespoke / one-off solutions will explode early on. As the original complaint details, "configuration" isn't just a simple property file. It gets complex pretty quickly, and a good configuration subsystem would be nice:
- supports most of the concerns above
- provides a means for logging the final config value set and WHERE IT SOURCED FROM
But... oh wait, config values aren't just key-value. Wait until you get to trees of values or other more complicated config. Then value overlaying gets pretty hard, and resolution code gets complicated too.
Think of infrastructure as code frameworks. The configuration is insanely deep and complicated. Many complicated systems have "configuration languages", and they are probably Turing Complete.
And once your configuration system is complicated enough ... your configuration system kind of implicitly needs ITS OWN configuration system.
So maybe the first config layer is actually a bootstrap for loading all the configuration info.
And.... then you have config values you need to secure. Hooo boy, now you have all the complexity of high security systems interfaces: authentication and authorization, decryption and hashing.
And invariably there are binary blobs encoded in text friendliness.
It gets crazy. Fast.
BUT THE AMOUNT OF SIMILARITY IS NUTS. NUTS. This is begging for a common approach or standard or library.
It's crazy that a year from now, no one is ever going to be thinking about this garbage ever again. By then, enough programmers will have discovered how insanely good GPT4 is at doing software configuration, and in the absence of the right answer, reading these verbose docs to finally just tell you what to do.
There should be an option to report where any currently active non-default configuration settings actually came from, individually. (including if that's shell environment vars, etc).
I forget what tool I use regularly that actually does this. Is it git?
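If the tool is git, the command is:

    $ git config --list --show-origin

which prints every setting together with where it was read from (system, global, or repository config file).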
It would also help if an active process said if it was running using the current config.
For example, if a change has been introduced to a config file, but the process has not been restarted to incorporate those changes, this should be conveniently mentioned.
This is the central problem with all programming today.
You can think of everything as configuration. Parameters of a function are just configuration for that function.
What we need is traceability of every value in a system, not just configuration. You click on any value and you see a graph/chain of all the transformations this data goes through right back to the source of truth. This is how we should code. Think about how difficult this is to do today with your current IDE/editor.
Every task in software is wanting to change what is displayed in a user interface of some sort. Usually you have to make your way through a labyrinth to figure out what to change where, and also what side-effects will occur.
If this is not smack bang in front of your face when you are coding, then you are just adding more and more tangled data flows and complexity to your system.
Microsoft IntelliTest (formerly Pex) [1] internally uses the Z3 constraint solver and traces the program's data and control flow graph well enough to be able to generate desired values to reach a given statement of code. It can even run the program to figure out runtime values. Hence the technique is called Dynamic Symbolic Execution [3]. We have the technology for this, just not yet applied correctly.
I would also like to be able to point at any function in my IDE and ask it:
- "Can you show me the usual runtime input/output pairs for this function?"
- "Can you show me the preconditions and postconditions this function obeys?"
There is plenty of research prototypes doing it (Whyline [4], Daikon [5], ...) but sadly, not a tool usable for the daily grind.
I suspect this is an area that is part of the essential complexity [1] of programming. Reachability, variable liveness, pointer aliasing, etc. are undecidable problems for arbitrary programs. Also, the control flow graph of most non-trivial programs is probably too complex to visualize in a usable way.
However, there is still a lot of room for improvement in compilers and debuggers. If I could tell the compiler to instrument my code to track the values that I care about, and then have the debugger visualize the flow of information through the system, that would be very useful. It wouldn't guarantee that the data flow analysis applies to subsequent runs of the program, but it could reduce the amount of time spent reading and debugging code.
Something that gives me hope is that I haven't seen anyone prioritize debuggability in a modern programming language or even a framework. Vale is maybe the first getting close: https://verdagon.dev/blog/perfect-replayability-prototyped.
I think there is a new era yet to come.
> probably too complex to visualize in a usable way
You can always abstract the complexity. A simplification of the control flow graph at the function level.
As an example, I want to understand the Postgres codebase better...there are a lot of very visualizable things going on there. If you look at lecture slides on relational DBMSs, there are lots of diagrams...so why can't we view those diagrams animating as the code executes?
I was thinking of instrumenting the code to get a trace of every function call and its arguments, and then filter that and visualize it.
The key to this approach working, though, is that I think we ultimately need to write and organize our code such that it can be easily visualized. The purpose of code is for humans to understand it, otherwise we'd be writing assembly.
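In a language like Python, the raw trace is nearly free to get; a rough sketch (printing rather than recording, with no filtering or visualization):

    import sys

    def tracer(frame, event, arg):
        if event == "call":
            code = frame.f_code
            args = {name: frame.f_locals.get(name)
                    for name in code.co_varnames[:code.co_argcount]}
            print(f"{code.co_name}{args}")

    sys.setprofile(tracer)  # from here on, every Python function call is reported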
> This is the central problem with all programming today.
Sweeping generalizations rarely age well.
> You can think of everything as configuration. Parameters of a function are just configuration for that function.
No, parameters of a function are contextual collaborators which participate in a function implementation. Configuration canonically influences a program aside from its individual run-time interactions.
Is a value given to the cosine function a configuration? Words matter.
> Every task in software is wanting to change what is displayed in a user interface of some sort.
This is a myopic perspective. Software processes data into information which provides value in context. This may never be "displayed in a user interface."
To wit, the network gear involved in my posting this reply most certainly qualifies as a "task in software." So, too, is the software running in the power companies involved. It would be quite a stretch to categorize them as "wanting to change what is displayed in a user interface."
This would be sooooo nice, especially in DI heavy languages like Java. It might not be turtles all the way down, but finding the final turtle can be so difficult.
It drives me nuts when systems are unnecessarily built with "magic DI" frameworks rather than initializer/constructor DI. Why is it so hard to pass a dep into the initializer? Make the initializer parameter a protocol/interface and you can pass a mock in at testing. And you can also grep the whole codebase for callers and find the exact concrete type passed in at runtime.
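A toy sketch of what that looks like (Python-flavoured, all names made up): the dependency arrives through the constructor as an interface, so tests can hand in a fake and grep finds every caller:

    from typing import Protocol

    class Clock(Protocol):
        def now(self) -> float: ...

    class ReportService:
        def __init__(self, clock: Clock):  # explicit dependency, no container magic
            self.clock = clock

        def stamp(self) -> str:
            return f"generated at {self.clock.now()}"

    class FakeClock:
        def now(self) -> float:
            return 0.0

    # tests: ReportService(FakeClock()); production: pass a real clock implementation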
IMO, automatic DI frameworks were a big mistake. An idea that looks useful at the time, but actually leads to spaghetti mess code that is harder to maintain than it would be without any automatic DI.
I agree, though I'd say in the more general case that overly flexible DI frameworks are the broader problem.
Designers will put much more thought into how best to interact with a framework, so start with a small, opinionated interface, then consider extension feature requests on a case-by-case basis.
Eh, it's not like it's much easier to find the last turtle in python. Or C++ when people are extensively using patterns like PointerToImpl*, which can really obfuscate things.
Object orientation kills traceability. Classes are instantiated in one trace, then methods are called in another trace, and if something goes wrong in the latter, it is not trivial to find the wrong settings passed in the former.
I think some functional programming languages solve that by running a good chunk (if not all) of the application in a single trace.
Yes, this is why we love functional programming! "What happened along the way" equals the call stack, as long as there is no field mutation involved.
And, of course, there are async/non-blocking calls, where tracing a call across different threads or promises may not be available all the time.
"""The venerable master Qc Na was walking with his student, Anton. Hoping to prompt the master into a discussion, Anton said "Master, I have heard that objects are a very good thing - is this true?" Qc Na looked pityingly at his student and replied, "Foolish pupil - objects are merely a poor man's closures."
Chastised, Anton took his leave from his master and returned to his cell, intent on studying closures. He carefully read the entire "Lambda: The Ultimate..." series of papers and its cousins, and implemented a small Scheme interpreter with a closure-based object system. He learned much, and looked forward to informing his master of his progress.
On his next walk with Qc Na, Anton attempted to impress his master by saying "Master, I have diligently studied the matter, and now understand that objects are truly a poor man's closures." Qc Na responded by hitting Anton with his stick, saying "When will you learn? Closures are a poor man's object." At that moment, Anton became enlightened."""
Not sure what you are trying to say. My point was that invoking a function that will act on arguments that have been defined away from the call site is not an OO-specific feature.
> And, of course, async/non-blocking calls, as tracing a call along different threads or promises may not be available all the time.
I think there is still hope on that front. Following structured concurrency patterns like the supervisor model should handle it. See the discussion about "Notes on structured concurrency, or: Go statement considered harmful" [1].
I disagree. Pretty much any program of enough size, including those written in functional languages, are going to have state that outlives the lifetime of a single call stack.
Simple example: the web server that takes in the HTTP requests? That one definitely got initialized outside the call trace that's serving the HTTP request.
No different from an object that got constructed and then a method called on it.
So it's an interesting thought experiment, but... I suspect the Rust compiler could statically figure this out for a large # of cases .... based on following the borrow/copy/clone tree for any particular variable. It could in theory be possible to write a tool which traces that graph back to the snippets of code where it originates, though that would usually end up being a command-line args parser, or a config file reader...
It's mostly feasible in other languages too, but probably not with the same level of rigor.
First thing I thought of after reading your comment was web inspectors which do help a little bit in figuring out stuff like "Why is this line in italics?" by showing you computed styles. But they could go a lot farther. I'd like to see a temporal graph of everything (e.g. HTML, JS, CSS, Virtual DOM, whatever) that had any visual or behavioral impact on an object, and the reasons why things changed in that order, why some changes overrode others, etc.
It seemed very promising until it showed DevOps engineers dragging blocks in a shared (synchronous) canvas like Figma.
Am I the only one that thinks pull requests (asynchronous) are a more appropriate medium of collaboration for this kind of work that requires strong consistency between elements?
This is a minor thing which I found useful- The "help" functions of all my bash-defined functions report which file they're defined in (such as: `echo "This function is defined in ${BASH_SOURCE[0]}"`). If I could do something similar for my environment-defined values (hmmm, now I'm thinking...), I would. (I could create a function that tracks at least the changes _I'VE_ made to these values, which could help...)
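Following that thought, a crude sketch of such a helper (the log location is made up): every value set through it gets a provenance line recorded:

    # hypothetical helper: record where each value was set
    setvar() {
        local name=$1 value=$2
        export "$name"="$value"
        echo "$name=$value set at ${BASH_SOURCE[1]}:${BASH_LINENO[0]}" >> ~/.env_provenance
    }

    setvar EDITOR vim   # appends e.g. "EDITOR=vim set at ~/.bashrc:42"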
A feature of actuarial software is generating dependency graphs of functions. Actuaries don't write the financial models that power reporting processes in standard languages (Python, Julia, C++) and instead have proprietary systems with features like this.
i wonder whether our current strong compile-time/runtime distinction can support this. it may be slightly easier for interpreted and vm-based languages but i’m not sure we can have it more broadly.
lately i’ve been playing with prolog and revisiting computer science/computer programming before imperative languages drew a curtain on innovation and i think we have lost a lot. our modern programming isn’t the programming the founders had in mind, it seems.
Sounds quite a lot like the Common Lisp experience when using Sly and Emacs. You can (trace) every function and see lexical scopes of variables, even bind them if you execute some s-expr that does not currently have some variable under that symbol.
The distributed aspect is a bit more difficult though. I don't know if there is a system that truly got this right. Maybe Erlang's BEAM VM together with a good debugger I don't yet know about.
FWIW, we do something like this in Spack[1] with `spack config blame`. There are different YAML configs, they can be merged and override each other based on precedence, but we tell you where all the merged values are coming from. Here's a simplified example -- you can see that some settings are coming from the user's "default" environment, some from Spack's defaults, and some from specialized defaults for Darwin:
The filenames are shown in color in the UI; you can see what it really looks like here: https://imgur.com/uqeiBb5
For the curious, there is more on scopes and merging [2]. This isn't exactly easy to do -- we use ruamel.yaml [3], but had to add our own finer-grained source attribution to get this level of detail. There are also tricky things to deal with to maintain provenance on plain old Python dict/str/list/etc. objects [4,5].
This is mostly due to the procedural "recipe" style programming paradigm that most developers seem to default to. It doesn't scale well past small programs, with the control flow being embedded in the code, so the only good way to follow the flow is to track things through the callers. This is one of the reasons I prefer dataflow programming styles that make the control flow explicit.
Yes, but wouldn't the logging take up quite a bit of memory? Something like O(compute cycles * (log(lines of code) + log(memory allocated))) before compression.
In complex prod systems, logging and traceability are generally more important than memory consumption and disk space. Memory and Disk are cheaper than a lawsuit if something goes awry
This is what's really needed. In one code base it's relatively easy for the dev team. Across your microservices it's practically impossible unless all teams are involved. This is a nadir in dev productivity currently
$ redis-cli INFO
...
executable:/Users/antirez/hack/redis/src/./redis-server
config_file:/etc/redis/redis.conf
...
In the case of Redis, it is possible to do much more. With the CONFIG command you can check specific configuration parameters at runtime, and even change most of them.
There are commands to get reports about latency and memory usage, performing a self-analysis of the instance's condition. You can see counters for commands called, and even obtain a live stream of the commands the instance is executing.
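For example, checking and changing a single parameter at runtime looks like:

    $ redis-cli CONFIG GET maxmemory
    1) "maxmemory"
    2) "0"
    $ redis-cli CONFIG SET maxmemory 100mb
    OK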
This should all be obvious features to make system software approachable, observable, and to make people able to act easily when shit happens, without having to guess where the configuration file is from the system startup scripts or alike. UX is not only about GUIs.
On Windows, Process Monitor does the same thing --- with the additional feature that, since configuration is often stored in the registry there, it will also show which registry keys it touches.
I use quick and dirty greps all the time, even when there is a "better" option available. It just works and is very intuitive in interactive contexts. Probably GP works in a similar way.
In lieu of strace, IDK how fast `ls -al /proc/${pid}/fd/*` can be polled and logged in search of which config file it is; almost like `lsof -n`:
pid=$$ # this $SHELL's pid
help set; # help help
(set -B; cat /proc/"${pid}"/{cmdline,environ}) \
| tr '\0' $'\n'
(while :; do echo \#$pid; ls -al /proc/${pid}/fd/*; done) | tee "file_handle_to_file_path.${pid}.log"
# Ctrl-C
It's good Configuration Management practice to wrap managed config with a preamble/postamble indicating at least that some code somewhere wrote that and may overwrite whatever a user might manually change it to (prior to the configuration management system copying over or re-templating the config file content, thus overwriting any manual modifications), e.g.:
## < managed by config_mgmt_system_xyz (in path.to.module.callable) >
## </ managed>
It still works if you turn off SIP. This is the same on Apple Silicon and Intel. However, for these purpose of tracking file accesses, I recommend using `eslogger` instead, as it doesn’t require disabling SIP and is faster, among other advantages.
It’ll show all file events. No need to disable SIP. SIP is doing a lot of good work for users and unless you’re doing kernel work or low level coding I’d keep it enabled. Obv. There are other cases but for the general public keep it on.
It’s personal preference. If the security people had their way we’d all be developing on iPads. If SIP interferes with your work: turn it off. Linux doesn’t have SIP and it’s just fine to develop on Linux as well.
100% agree. We would put everyone on chromebooks if we had our way. I don’t think it’s good for productivity and generating new ideas for a company so I never advocate for it.
There was a man who never existed named Thomas Covenant, created by Stephen R Donaldson. Decades ago this character said something that has stayed and will stay with me my entire life:
"The best way to hurt somebody who's lost everything is to give him back something that is broken."
For us, this thing is MacOS. I miss dtrace every damn day.
DTrace allowed you to ask the damn OS what it was doing. The man pages are random and don't match the command-line help, and new daemons constantly appear with docs like this:
NAME
rapportd – Rapport Daemon.
SYNOPSIS
Daemon that enables Phone Call Handoff and other communication features between Apple devices.
Use '/usr/libexec/rapportd -V' to get the version.
I do this much more frequently than I would like. But occasionally you have issues where the program lists a directory then only acts on specific names. So the strace output won't tell you what the expected name is.
Users should be aware that `sudo apt install sysdig` may require reconfiguration of UEFI Secure Boot, and that there is apparently no clean way to abort this. The raw-mode terminal screen contains the text:
┌────────────────────────┤ Configuring Secure Boot ├────────────────────────┐
│ │
│ Your system has UEFI Secure Boot enabled. │
│ │
│ UEFI Secure Boot requires additional configuration to work with │
│ third-party drivers. │
│ │
│ The system will assist you in configuring UEFI Secure Boot. To permit │
│ the use of third-party drivers, a new Machine-Owner Key (MOK) has been │
│ generated. This key now needs to be enrolled in your system's firmware. │
│ │
│ To ensure that this change is being made by you as an authorized user, │
│ and not by an attacker, you must choose a password now and then confirm │
│ the change after reboot using the same password, in both the "Enroll │
│ MOK" and "Change Secure Boot state" menus that will be presented to you │
│ when this system reboots. │
│ │
│ If you proceed but do not confirm the password upon reboot, Ubuntu will │
│ still be able to boot on your system but any hardware that requires │
│ third-party drivers to work correctly may not be usable. │
│ │
│ <Ok> │
│ │
└───────────────────────────────────────────────────────────────────────────┘
This leads to a second screen asking for a new password. There is a <Cancel> option, but selecting it merely loops back to the first screen.
Hitting C-c (Control-c) has no effect.
There are validity rules for the password, but they are presented only after you've entered an invalid password, twice, the second time to confirm.
After this installation proceeds, then terminates with a warning:
DKMS: install completed.
Processing triggers for man-db (2.9.1-1) ...
Processing triggers for libc-bin (2.31-0ubuntu9.9) ...
W: Operation was interrupted before it could finish
I suspect that's just because you're trying to install an old version (as I wrote sysdig used to have a kernel extension but now should use eBPF functionality provided in stock kernels). I can't easily verify (no ubuntu at hand), but presumably if you install the vendor supplied, up-to-date version (first google hit I found: https://www.linuxcapable.com/how-to-install-sysdig-on-ubuntu...) it will work without UEFI changes.
I wonder if there is any mainstream program with a more convoluted configuration scheme than bash? Just when you think you understand the various invocations of .profile, .bash_profile, .bashrc in login/non-login/interactive/non-interactive modes in home and/or /etc after digesting an article with flowcharts[1], you realize that any invocation can execute arbitrary code from any location, and different distributions (not to mention macOS, WSL, git-for-Windows, etc.) all source different stuff in their default configs (like executing .bashrc in .profile).
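One way to see what a given bash actually sources on your system, rather than what the flowchart says it should, is to trace it (a quick sketch; the grep pattern is just a guess at the usual startup file names):
strace -f -e trace=open,openat bash -lic 'exit' 2>&1 \
  | grep -E 'profile|bashrc|bash_login|bash_logout'   # start an interactive login shell, exit immediately, list the startup files it opened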
This is why Murex (my shell) has a ‘runtime’ builtin which lets you query which script updates what. Every function, variable, etc. can be queried to find where it was set, and even when (time-stamped) too.
I love this post. Such a crazy diagram for something that intuitively we believe should be simple. You can easily imagine most of the compromises and feature requests over the years that brought it to this state. All software that survives decades of care and upkeep tends to end up in a similar state.
Not only that, but past a certain complexity of the config, programs should have a way to dump their runtime config to the command line. Yes, I am talking to you, server daemons that take configs from 2-3 different rc-file locations, hardcoded defaults, envvars, the moon phase and the current market value of dried parsnips.
If you don't have a way to simply tell me what your current config is, then you have no earthly right to run it.
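A few daemons already do this, as far as I recall (worth double-checking the exact flags on your versions):
nginx -T       # test the config and dump every file nginx actually loaded
postconf -n    # Postfix: print only the settings that differ from the built-in defaults
apachectl -S   # Apache: summary of the parsed vhost/config settings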
Wasn't Windows Registry a (failed) attempt at unifying this stuff?
Now, hear me out. The problem is not the location of configuration. The problem is the crippled and anemic way UNIX (and the likes) programs take configuration. These can only predictably take environment variables and command-line arguments. Both ways are crippled due to size limit (especially the command-line arguments) and structure (you cannot even have lists / arrays in environment variables). This is the reason why every application author believes they must write their own configuration, with their own (crippled) configuration format and logic for locating their configuration.
On the bright side, I think, systemd is moving towards making this mess more uniform. On the not so bright side, systemd doesn't appreciate the difficulties associated with the problem and doesn't seem to have any sort of plan moving forward. So, unit files are kind of configuration, but because the format is so crippled and anemic, they immediately added a way to source random external files for process environment etc.
Finally, systemd, even if they understood the problem well and put a good effort towards solving it are still powerless against the format in which system supplies input to processes. What really needs to change is all the family of execve, execlp and so on. This was always a stop-gap / bad idea, but it was OK for a while, before programs started to require more complex input. It should've died in the 80s... but today it is so deeply entrenched that removing it will take a very long time.
Another problem here is the approach that "everything is a file". So, the only way to provide configuration beyond command-line arguments and environment variables is a file... But imagine an alternative reality in which the OS is capable of providing basic data structures, which can also be persisted and queried, kind of like you do with a relational database. I think this idea is so far from novel that it's older than most people commenting in this thread... And even though this would make it so much easier for so many application authors to provide a consistent, easy to inspect, better tested interface to their programs... nobody's doing that :) Because files are good enough.
Having spent many years in Windows land, I definitely get where you're coming from. However, I find the Unix way, where everything is a file, much simpler to work with.
Just take the simple task of copying a subset of some system's configuration to another one. On Unix it's just a matter of copying some files. On Windows (and our imaginary OS that provides object storage) we would need to find the relevant objects, then write a script to export/import them (or mess with export/import tools for a while).
Personally I see "everything is a file" as a strength, not a weakness. Beyond my personal preference, a centralised configuration system is not only restrictive in many ways, it presents a potential performance bottleneck and is an extra point of failure.
Additionally, I think most Unix software uses a fairly simple text format for config files. There really is little reason for using anything else. If you need complex data structures, use YAML. I've found many more proprietary binary config file formats on Windows (despite the registry) than on Unix. So even if the OS made something like this available, there would still be no guarantee all apps use it. It would just be an extra thing to deal with, for little benefit IMO.
JSON requires too much escaping to be reasonable to edit by hand in many cases. TOML is only marginally better.
With any significant level of nesting TOML starts to obscure the structure more than it helps anything.
I tend to start with a basic K=V (“dotenv”) format. Past that though I’m going straight to YAML.
I agree that the “everything and the kitchen sink” specification is ridiculous and rarely (if ever) use much of the functionality, but as far as representing arbitrary structured data in human readable and editable text… it sucks less than the alternatives if you stay away from the weird stuff.
For who? YAML gives you all the standard containers, tools to avoid repeating yourself, lots of ways to format strings so they stay readable, and arbitrary object serialization.
YAML only gets weird if you want to use complex objects as dict keys, but that complexity is on you then.
So do JSON and TOML. If you think you need more than what they provide, something has gone very wrong.
>tools to avoid repeating yourself,
Bad. Config is not code. Settings should be explicit. DRY is not helpful.
Because of the lack of anchors/aliasing/variables etc, JSON has the nice property that you can parse it with exactly one upfront memory allocation: N bytes of JSON requires at most N words of memory.
>lots of ways to format strings so they stay readable
Bad. Too many ways: https://stackoverflow.com/a/21699210/6107981 No normal human being can remember all the rules for how a string gets parsed. Therefore it isn't actually human-readable or human-writeable, despite masquerading as such.
>arbitrary object serialization
You can write out the fields as a JSON object. If this isn't enough, then what you're trying to serialize doesn't belong in a human-readable text config file.
>Yaml only gets weird if you want to use complex objects as dict keys but that complexity is on you then.
This assumes the guy who wrote the config schema is the same as the guy who has to later use and edit it, which is seldom the case. When people do needlessly complicated config like this, they're taking an enormous stinking shit on my lawn for me to clean up.
> Bad. Config is not code. Settings should be explicit. DRY is not helpful.
I disagree. From experience, there are enough times that configuration requires repetition, loops, interpolation, etc, that you often end up in this gross space of either repeating chunks manually (error prone and brittle), or using code/templating to generate config (requires everyone on the codebase to use a specific toolchain).
HCL2 (used by Terraform) has been my favorite configuration system. It's just the right amount of power that I don't need copypasta or feel the need to resort to configgen. TOML is great for a single human-defined configuration for a program, but HCL does just as well at that, while also working for systems.
JSON definitely fails as human-editable. Again, you need some tooling to ensure syntax correctness.
> JSON has the nice property that you can parse it with exactly one upfront memory allocation: N bytes of JSON requires at most N words of memory.
I really don't care how much memory is allocated during parse, within reason, as long as execution is bounded, can't blow up, or otherwise misbehave.
>I disagree. From experience, there are enough times that configuration requires repetition, loops, interpolation, etc, that you often end up in this gross space of either repeating chunks manually (error prone and brittle), or using code/templating to generate config (requires everyone on the codebase to use a specific toolchain).
In such a case, use a Python script to generate the configuration, and then commit that generated config into source control. Keep the script and the generated config together in the same repo. You can set up a commit hook to keep them in sync. That way you can see explicitly what the configuration is, while reducing the repetition in the editing process.
>JSON definitely fails as human-editable. Again, you need some tooling to ensure syntax correctness.
What kind of tooling are you talking about here, other than syntax highlighting? I'm really at a loss to imagine how editing JSON by hand could be hard, other than maybe the lack of trailing comma. There's just not that much to it.
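A minimal sketch of the commit-hook idea from above (the generator script and file names here are made up; adjust to your repo):
#!/bin/sh
# .git/hooks/pre-commit: refuse to commit if the generated config is stale
set -e
python3 gen_config.py > config.json.new       # hypothetical generator script
if ! cmp -s config.json config.json.new; then
  echo "config.json is out of date; rerun gen_config.py and re-stage it" >&2
  rm -f config.json.new
  exit 1
fi
rm -f config.json.new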
> In such a case, use a Python script to generate the configuration, and then commit that generated config into source control. Keep the script and the generated config together in the same repo. You can set up a commit hook to keep them in sync.
I can't really understand how all of that is simpler than having your config file have a little power in itself. Especially when that power is as easy as "use ruamel." You can't enforce commit hooks so you end up making it a build failure and that's honestly really annoying to have to push again because "right I forgot to run the config generating thing." And you're one ci.skip away from messing your system up.
Like, it's adding failure modes for such little gain. If you really must have the final generated output, then save it as a build artifact for inspection later.
YAML is a poorly engineered format. It doesn't think about storage at all, for example (in its definition, it's just a single text blob). What if you need to split it?
The format is awful for random access: you need to read the whole thing to extract a single data item. If you need to query such a configuration multiple times you'll do a lot of unnecessary work.
YAML poorly defines numerical types (how many digits can a float have? what about integers? are floats and integers the same thing?). YAML lists... allow mixed types. Not sure that's such a good thing from a performance and correctness perspective.
YAML has an unnecessary "hash-table" type. Hash-tables aren't a data format; they require a function in the reader that makes the asymptotic properties of hash-tables work. In fact, YAML just has a weird sub-type of list where elements come in pairs. Since it inherited all this from JSON, there's also the ambiguity related to repeated keys in such hash-tables: what an application should do with those, nobody knows.
YAML sucks for transmission due to the ambiguity caused by repeated keys in "hash-tables". You cannot stream such a format if you have the policy that the last key wins or if you have the policy that repetition is not allowed. Also, since YAML allows references but doesn't require that the reference lexically point to a previously defined element, you may have unresolved references which, again, will prevent you from streaming.
YAML defines mappings to "native" elements of different other languages... but what should you do if you are parsing it in Ruby, but get a mapping to Python?
YAML schema sucks. It would require a separate exposé to describe why.
YAML comes with no concept of users or namespaces which would be necessary for ownership / secure sharing of information. It also comes without triggers which would allow an application to respond to changes in data.
YAML is very hard to parse for no reason. It's very easy to make a typo in YAML that will not make it invalid, but will be interpreted contrary to what was intended.
YAML doesn't have a canonical form, which makes comparing two elements a non-trivial and in some cases undefined task.
YAML doesn't have variables, nor does it have forall / exists functionality. This results in a lot of tools that work with YAML overlaying it with yet another layer of configuration, which includes variables (eg. Helm charts, Ansible playbooks).
----
I mean, honestly, people who created YAML probably saw it as an inconsequential, sketchy project that should take maybe a weekend or two. They didn't plan for this format to be the best... they just made something... that sort of did something... but not really. I have no idea why something like this received the acclaim that it did.
I use ini files for my Windows application [1]. They are enough if you don't need multiple levels of nesting. The simpler the format is, the better.
When storing settings in config files, you can just copy the program folder to another computer and your settings will be transferred. This is called "a portable application" in the Windows world. When using the registry, you need to export the registry keys and import them, which makes it hard to backup all your apps with their settings or to set up multiple computers to use the same apps with the same settings.
Indeed, even the meaning of the name YAML admits failure: "Yet another markup language".
The year is 2023. Can we really not come up with better things for configuration?
I mean, we did, 25 years ago, with XML. The problem is that it was too much of a dumpster fire when it came to tooling, and folks just abused it until everyone was sick of it in the early aughts. But the fact remains: XML meets all the needs for configuration files, is flexible enough to handle large and small use-cases, and with the right tooling, can be a breeze to deal with.
But no. Instead we have half-solutions like json, and worse, yaml.
I have never seen any programming project back-pedaling to a previous design in recognition that it was better than the current one. Even though, unlike in many professions, we have the means (via source control).
The hope is that as time passes, the generation that had first-hand experience with the old thing will die off and the new generation will re-discover things unjustly forgotten.
> I mean, honestly, people who created YAML probably saw it as an inconsequential, sketchy project that should take maybe a weekend or two. They didn't plan for this format to be the best... they just made something... that sort of did something... but not really. I have no idea why something like this received the acclaim that it did.
One of the co-creators responded [1] to an article [2] about how Yaml is “not that great” on this site.
It comes off to me as: well, we did it for free and people use it, so I'm happy about it.
Well, if something is free, that does make it good, just not in the engineering sense...
It's valid to make things w/o thorough design or deep knowledge of the domain you work in. It's a good way to learn how to do things. The blame isn't with the authors. The blame is with the community which mistook YAML for a good configuration format.
Try configuring systemd (that's including all the services it has to run) in INI... I mean, it's sort of what it does with unit files... but this means you now have hundreds of INI files. They are all related, but you don't really know how. You could try versioning them, but you don't even know where all of them are, so every configuration change is a minefield.
Or, another, perhaps a simpler mental experiment: try writing a CloudFormation template in an INI format. After all CloudFormation templates are configuration for the service that creates virtual appliances. If you look at the current JSON / YAML way to define this kind of template, you'll see that YAML isn't expressive enough to do it, and they had to invent a lot of meta-language conventions to be able to express what they need. With INI your struggle will be monumental to accomplish that.
Though I use .ini, myself. I often think about changing to sqlite. It is insanely stable and provides all the goodness of a database. Since I speak SQL like a native, it would be easy for me to manage.
But, I stick with .ini still because, 1) I have a great access module that I would have to rewrite and, 2) there's always a text editor and sometimes I'd have to add a sqlite UI. Sounds like work to me.
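For what it's worth, the sqlite route can stay close to "there's always a text editor" if you lean on the sqlite3 CLI; a rough sketch (table and key names are arbitrary):
sqlite3 settings.db 'CREATE TABLE IF NOT EXISTS config(key TEXT PRIMARY KEY, value TEXT);'
sqlite3 settings.db "INSERT OR REPLACE INTO config VALUES('window.width','1280');"
sqlite3 settings.db "SELECT value FROM config WHERE key='window.width';"
sqlite3 settings.db .dump > settings.sql   # plain-text form for diffing and backups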
I once thought the same, but there is no universal answer on how to store data that delivers the right combination of performance and reliability for every use case. Files are the lowest common denominator and leave room for optimization, and they end up being a nice abstraction to hide the inner workings of decades of different storage devices and APIs.
It's simple when the task is simple, and it becomes a lot more complex than necessary when problems become more complex.
Consider the joy of sysfs / procfs and similar. These are filesystems that, through files, expose information that is numbers, strings, lists, even trees. There's no uniformity there, no way to predict where and what sort of files with what sort of data will appear... Sometimes you get extra files with "metadata" that describes the format found in other files, sometimes it's in the same file, and sometimes it just doesn't exist. Sometimes numbers are encoded as strings, and sometimes as bytes. Booleans can be encoded as numbers, (which are either strings or bytes).
There's plenty of weird behavior that isn't described by files / their properties. For example, suppose you want to delete a RAID device, but it fails because it turns out it's used by an LVM volume, and you didn't even know you were supposed to look for that...
It gets even worse with proprietary stuff that doesn't follow conventions. Take for instance NVidia GPUs. They do appear in sysfs as PCIe devices... but most interesting properties just aren't listed there. Instead, NVidia's driver creates "devices" in udevfs (i.e. /dev) with arbitrary properties that you cannot predict because the format is whatever the vendor wanted it to be at any version of their product.
In other words, files are a very poor mechanism for describing data. In order to work with them you need conventions that aren't part of any file mechanism. Take network interface names, or block device names, stuff like eth0 or nvme0n1p1 -- you need to know the convention and how to interpret the name (i.e. the first suggests that the device is using Ethernet driver and is the first one connected, the second means that the device is using NVMe driver, it's the first device of its kind attached to PCIe bus, first namespace, first partition.) Specifically, in the case of block devices, you could discover the same information given in the file name by traversing the relevant filesystem subtree, but in some cases the information is just encoded as a file name or as it contents, but it doesn't utilize the filesystem primitives. It just relies on convention.
----
> I think most Unix software uses a fairly simple text format for config files
How did you count?
Do you mean software that is part of UNIX standard or software that runs on UNIX?
What makes the majority an interesting metric? (I.e. suppose there are 10M different programs that run on UNIX, and 6M of them are simple enough that they don't require complicated configuration; there are still 4M that do, and some of them are probably used daily by millions of users.)
I mean, who cares if simple programs don't need complicated configuration? The problem is the configuration of complex programs, and they are indispensable in our daily lives.
>Wasn't Windows Registry a (failed) attempt at unifying this stuff?
The way I see it, Windows Registry "failed" for no reason other than a handful of CLI neckbeards refusing to let go of individual .ini and other such plain text config files written however and scattered wherever they damn well please.
The Registry is really well designed; it's a central repository of settings for everything that exists in that Windows installation, including Windows itself. It can hold data besides plain text. It can be manipulated both by the CLI and GUI and is easily accessed, either as individual keys or in batches. It can accommodate multi-user environments independently of the file system.
The Registry still exists and is still used by most Windows programs, though unfortunately even Windows is starting to disregard the Registry with its newer developments. The Registry for me is a reminder that the programming industry is, in many ways, one of the most regressive industries ever.
In my day job I do a lot of troubleshooting. That's most of my job... helping other IT folks figure out what the heck is happening with something.
I'd say that in 70% of things I work on, part of that time is figuring out what config files are used because the app/service/process/whatever is doing something the owner/tech/whatever doesn't expect, yet is rational, and they "didn't tell it to do that".
It'd be great if everything reported as the article mentions -- amazing, in fact -- but I feel like it's a pie in the sky wish. Like asking that all programs always exit cleanly or something. I feel like there'll always be inevitable edge cases -- libraries that themselves pick up config files or registry values or environment variables that aren't expected, who knows -- that'll need to be discovered. If things are already not going right with a program, I'm less likely to trust what it says it's doing and more likely to just watch it and see what it's actually reading and doing.
Yesterday I was wondering why the self-signed root certificate and the certificates I added to my Debian install didn't work with httpie (SSL warning) but did work with curl (and the browsers I added the root CA to).
I found the explanation: https://github.com/httpie/httpie/issues/480 httpie doesn't look into /usr/local/share/ca-certificates but apparently it's not httpie's fault, it's the python requests library that doesn't check that folder.
I don't know what to think of it and who is supposed to fix it. But I am back to curl with my tail between my legs (because I oversold httpie to my coworkers) for now. There's only one place where the buck stops with curl.
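If I understand the issue correctly, one workaround is to point the requests library at the distro's merged bundle via its REQUESTS_CA_BUNDLE environment variable (the path below is the usual Debian/Ubuntu location after running update-ca-certificates; verify it on your system):
export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt  # bundle that includes /usr/local/share/ca-certificates
http https://internal.example.org/                            # httpie should now trust the locally added root CA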
I read the thread after seeing your comment, and it strikes me that besides the first comment being a little brusque, sigmavirus24 comes across as both calm and helpful - doubly so given that the original issue appears to have been PEBKAC related
> I think I forgot to run "pip3 uninstall" and ran only "pip uninstall" as I had to use python3 to get working ssl in the first place.
The issue is still open, so it has nothing to do with PEBKAC. "A little brusque" is definitely underselling it. That was a fine display of internet assholery.
For once I would say it isn't Python's fault either. There are at least 10 different locations that certificates can be stored at on various Unix variants and Linux distros. Go checks them all, which is probably the only sane solution, but seriously come on Unix devs... Is it that hard to pick a standard location?
My favourite is a very bad pattern common in Docker images. Just configure it with env vars, and then a script plucks them into a config file somewhere.
And then document it misleadingly badly. These options are mandatory! (lists 5 of the 10 actual mandatory options).
Then a script branches on an env var value, which isn't one that's documented, and then fails opaquely if it took the wrong branch because you didn't know you needed to set that env var.
Best ones are the ones that consume your env vars and set _other_ env vars, it's great!
1) Document all the config options, and ensure you highlight any interactions between settings, and explain them. I'm biased because I used to work on this project at RH, but I really did like the documentation for Strimzi, here's how their env vars are documented [0].
To highlight a few examples of how I think they're good:
> STRIMZI_ZOOKEEPER_ADMIN_SESSION_TIMEOUT_MS
Optional, default 10000 ms. The session timeout for the Cluster Operator’s ZooKeeper admin client, in milliseconds. Increase the value if ZooKeeper requests from the Cluster Operator are regularly failing due to timeout issues.
We know if it's required, what it defaults to, and, what I love to see, why we might want to use it.
> STRIMZI_KAFKA_MIRROR_MAKER_IMAGES
...
This [provided prop] is used when a KafkaMirrorMaker.spec.version property is specified but not the KafkaMirrorMaker.spec.image
I like this, explaining when this env var will or will not be used.
> STRIMZI_IMAGE_PULL_POLICY
Optional. The ImagePullPolicy that is applied to containers in all pods managed by the Cluster Operator. The valid values are Always, IfNotPresent, and Never. If not specified, the Kubernetes defaults are used. Changing the policy will result in a rolling update of all your Kafka, Kafka Connect, and Kafka MirrorMaker clusters.
Firstly, always great to enumerate all accepted values for enum-like props. But what I really like here is that the consequences of altering this value are explained.
> STRIMZI_LEADER_ELECTION_IDENTITY
Required when leader election is enabled. Configures the identity of a given Cluster Operator instance used during the leader election. The identity must be unique for each operator instance. You can use the downward API to configure it to the name of the pod where the Cluster Operator is deployed. (Code snippet omitted)
Interactions between config options highlighted - if you set STRIMZI_LEADER_ELECTION_ENABLED to true, this option is now required.
We're told that this must be unique. And a suggestion on one straightforward way to do this, with example code.
One more thing to call out as good:
> The environment variables are specified for the container image of the Cluster Operator in its Deployment configuration file. (install/cluster-operator/060-Deployment-strimzi-cluster-operator.yaml)
Being told _where_ in the source code the env var is being consumed is great.
Now compare the Confluent docs for their Kafka Connect image. [1] A colleague of mine was using this image, and was connecting to Confluent Cloud, so needed to use an API key and secret. There's no mention of doing that at all. But you can. Just set the CONNECT_SASL_JAAS_CONFIG option, and make sure you also set CONNECT_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM and CONNECT_SASL_MECHANISM.
And very importantly, don't forget CONNECT_SECURITY_PROTOCOL, as not only will you have odd connection failures (usually manifests as the client dying at the ApiVersions negotiation stage of the Kafka protocol's handshake), but because a script run in the image uses the value of that undocumented setting to execute a prerequisite in different ways [2], and you'll get weird behaviour that obscures the real issue.
2) Support more than one way of configuring things - maybe I want to mount in a config file, maybe I want to provide a ConfigMap, maybe I want to do it via env vars. Well... [3]
If any of the frequent offenders are open source, I wonder if it would be worth your while to submit (or get someone else to submit) a change to add this functionality.
I don't think I've ever encountered a widely used program that doesn't have that information in its manpage. The manpage is the canonical source the article calls for.
With the exception of New Age software like Go, that for some reason doesn't use manpages, I'm not convinced that the problem implied here actually exists.
Doesn't the man page often list several potential locations where it looks for the file and the ordering behavior, leaving you to investigate and find which is actually loaded? It's still annoying.
But that is exactly the information you need in order to determine which config file is being read in all situations. A single location doesn't cut it if the reality is more complicated.
If the goal is to find out which particular file is being used at this particular time, on this particular system, the maximum verbosity log might have that information, and otherwise strace can help, as described in the other comments.
I really do not care about all the possible paths a config file might be read from. I want a flag that I can pass to a binary to see which paths it will actually read, and ideally, I'd like those paths to be logged to STDOUT on startup.
Yes, I want to know which files in which order. I don't want a static list of all possible locations; I want it to deduce what files would take effect in a given environment with the given config files. A program must already know this when it is run, so let's just be open and print said paths.
Absolutely. For example, if you pass "-v" to ssh, it prints paths to configs that it tried to open, paths to identity files that it tried to open, to known_hosts files, etc. This info is not needed so very often, but when it is, it's incredibly useful.
Yes. Exactly. Which is why the program should tell you which one takes precedence.
The program already knows what config files it's loading and in what order, so why is it left up to the human to do all that work again? There should be a flag like --manifest or something that would spit out what actually got loaded from where.
Out of interest, I checked out man man on Linux and found the search logic documented in man 5 manpath. In addition to system directories specified in /etc/manpath.config, the man command checks directories in the user's PATH and maps them to related man directories: pathdir, pathdir/../man, pathdir/man, pathdir/../share/man, and pathdir/share/man are all considered. The MANPATH environment variable overrides all other configuration and should not be necessary.
So, following current practices, installing binaries in ~/.local/bin and man pages in ~/.local/share/man should work without any additional configuration.
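For reference, with man-db you can also ask these questions directly (flag spellings from memory, so check your man(1)):
manpath                    # print the computed man page search path
man -w sshd_config         # print the file that would be displayed for that page
man -d ls 2>&1 | head -40  # debug trace showing how the search path was assembled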
> Ideally I'd like to avoid scanning all the way through the manual page or other documentation for the program to find out, because that's slow and annoying.
I maintained a CLI tool for a long time. I came to the conclusion that good config is a bit harder and more subtle than most people think. I agree with the premise of the article, but there are more layers to it, literally.
Depending on how much config you have, sooner rather than later you’ll have multiple sources: system defaults, user config files, sometimes profiles, env vars, and flags. Configuration may be sourced from a combination of these, so overrides need to be handled and naming needs to be consistent. Another source of mess is deeply nested config, such as for a user-defined project. Those are very awkward to add flags for, in which case it’s better to enforce a file (like kubernetes config).
To me, tools like git strike a good balance. And by that I mean that project-local config lives in a project root directory that's not necessarily checked in to VCS, but the config is picked up even if you're in a subdir. It rarely causes any unwanted surprises, but at the cost of being somewhat flag heavy, which is then alleviated with user-defined aliases.
I needed to update /etc/resolv.conf under RedHat the other day and I was surprised by how much this has changed. You don't edit /etc/resolv.conf directly anymore because it's maintained by Network Manager[1]. Instead you edit either one of the interface configuration files and tell NM (via nmcli) to reload it, or you use nmcli to update the interface config file.
Fine, I can deal with that.
What really surprised me were a couple programs that needed to be restarted to see the updated /etc/resolv.conf as they apparently don't use glibc for DNS resolution. I used lsof to figure out what was sending DNS requests to the old name server IPs.
It turned out to be osqueryd and Apache Traffic Server.
1. There are various ways to put /etc/resolv.conf back under manual control if you don't want NM to maintain it.
> Instead you edit either one of the interface configuration files and tell NM (via nmcli) to reload it
Yeah, and this would be fine if both /etc/resolv.conf told you what is creating that file and NM didn't rewrite its own configuration files all the time.
Anyway, it would be not that bad if NM was kept restricted to RedHat, because the entire philosophy of that distro is that configuration files are just suggestions and the system is free to change anything at any time. But instead, that behavior infected everything.
On Debian it only says it's auto-generated and you shouldn't edit it. It's not necessarily created by NM either¹, so it would be really helpful to have that information in the file.
1 - The policy of whatever configures your network last rewrites this file is spreading. Apparently systemd creates it first, and Debian has more than one network configurator for you to choose.
As a system administrator I want 2 things from config files:
* loading a directory of config files, so various Configuration Management tools have easier time deploying config (especially for multi-use apps like say syslog daemon, where multiple disparate apps might want to put a part of the config in)
* a way to dry-run config validation, so CM can alert on an invalid config instead of restarting an app into a broken one
I would go one further and say that the default should be that programs cannot read, open, or see files at all unless they're either explicitly allow-listed to open those files, or the user has given permission for them to open those files (i.e. by using the file name as a command-line argument, or selecting the file in an os-provided file selection dialog box).
I like this about Deno (not to suggest that the runtime is without its flaws). Whenever I try to run a script and it tells me I need to allow e.g. read access to the filesystem, I’m more grateful than annoyed. And it lets you choose the granularity of the permissions that you grant. Hopefully this approach will become more common.
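For anyone who hasn't seen it, Deno's permission flags look roughly like this (the script and host names are placeholders):
deno run --allow-read=./config --allow-net=api.example.com main.ts
# any attempt to read outside ./config triggers a permission prompt, or fails outright with --no-prompt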
Programs should just store all their files together. The traditional directory structure of Linux et al is wrong. I do not need a /bin folder full of unrelated programs. I need a /program_xyz folder for each program, with all the stuff in there.
Granted, back when disk space was scarce, this reduced the ability to share dynamically linked dependencies. There are other ways around that. And in the modern world, there are some retrofitted solutions where package managers indeed do it this way. And a lot of other tools out there, like Docker, overlap with this idea in that part of what they do is disentangle the file systems of unrelated programs.
But the real giga-brained answer, which most programmer-type people probably aren't emotionally ready to accept, is that not only are the ways we organize our directories wrong, but tree-based directories are just dumb in general. Everything should just go in one big directory and you can use names and tags to sort them out. Programmers are culturally obsessed with trees.
I would posit most things in a file system are better represented by graphs. A file type would have a program to run with, a program to edit with, maybe also reference files in relatively placed directories/locations, and potentially be pointed to by a cron job or other automation routines…
Trees and hierarchical layouts do make sense for some things, but as soon as you start symlinking, a proper graph works better - though I imagine computationally harder (a sacrifice I’m willing to make on most desktop machines tbh).
Your notion of sticking everything in a single space and using tags, searches, and filters is something I’ve wanted to explore for a while. Afaik it’s unexplored space, unless others could point me to some awesome tools or processes?
The UNIX filesystem has always been a graph. I haven't looked at the details for a couple of decades, but definitely all UNIX/POSIX/Linux filesystems operate on a graph model.
A distinction I used to make when I was teaching this stuff: on your filesystem tree, on Unix names (labels) are on the links (arrows), while on DOS/Windows names are on nodes (boxes).
>But the real giga-brained answer, which most programmer-type people probably aren't emotionally ready to accept, is that not only are the ways we organize our directories wrong, but tree-based directories are just dumb in general. Everything should just go in one big directory and you can use names and tags to sort them out. Programmers are culturally obsessed with trees.
Java will die before it gives up its practice of "let's organize code in a series of nested directories where each directory is the next qualifier in the package name".
I wish more programs had an obvious way of showing what configuration values are currently set. Something like `program --config` with no arguments should print all configuration keys and their current values.
Seconding this, for sure. "/usr/sbin/sshd -T" (T for "test") has been a life-saver more than once.
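Along the same lines, the OpenSSH client can dump its effective configuration too; roughly (the hostname is a placeholder):
sudo /usr/sbin/sshd -T | grep -i passwordauthentication   # effective server config after all includes/matches
ssh -G myhost                                             # effective client config for that destination
ssh -v myhost 2>&1 | grep -iE 'config|identity file'      # which config and identity files were actually read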
My strongest version of this is that any application that manages any state should be able to report on that state in a human- and machine-readable way.
This doesn't tell you the current running configuration, it tells you the current configuration on disk. These should be the same, but aren't always. I personally have a nasty habit of forgetting to reload configs. I make sure to leave them in a syntactically valid state, but I really don't care if someone can't access my webserver for a while; if I have to go, I add a quote and a curly bracket, save, and close my ssh session. If my 5 minutes away runs into tomorrow, then I no longer remember what's been changed (I manage this these days by making /etc a git repo with an ignore list and stashing incomplete changes...then forgetting to commit them when I'm done)
The main issue is to use configuration files residing somewhere in the filesystem. This looks like a global variable in a codebase (something we generally try to avoid).
Instead, the configuration file should be explicitly provided as a command line argument. Systemd sandboxing can also be useful to ensure the program only uses the expected set of files.
For instance, on my NixOS machine, the Nginx configuration is not in `/etc/nginx` but explicitly provided and can then be known with ps:
$ ps aux | grep nginx
nginx: master process /nix/store/9iiqv6c3q8zlsqfp75vd2r7f1grwbxh7-nginx-1.24.0/bin/nginx -c /nix/store/9ffljwl3nlw4rkzbyhhirycb9hjv89lr-nginx.conf
> This looks like a global variable in a codebase (something we generally try to avoid).
Aren't they more like global constants than variables? Loaded at startup, and never change during that run of the program. (With the exception of only explicitly being re-read on SIGUSR1 for daemon-like programs.)
And global consts, or #defines, or whatever, are things we generally don't try to avoid?
It's not a bad idea but it's not applicable to every piece of software. I don't think that passing a config file for every git command would be convenient.
You can change the commandline string at runtime. You could inject a fake "-c correct/path" even if it's not there. (That's useful for other things too, like injecting the git revision of the app into the commandline)
In Conduit (https://conduit.rs) there is no default config path. In order to start Conduit, you need to specify the CONDUIT_CONFIG environment variable, which is the path to your config. This will typically be done in a systemd service.
This has multiple benefits:
- You can't accidentally start Conduit with a wrong config (e.g. the config in the current working directory)
- You can have multiple Conduits running at the same time with different config
- It's easier to understand for the system administrator because there is no hidden information
I've mentioned this before in a comment long ago: I thought there should be a library for configurations, like libconfig. This would expose an API that a program could get its config from; then the user could decide which format they would like to keep their configs in, like INI, JSON, YAML, etc., and it could translate between them. Then you could get a unified config for your system.
Strong agreement. I'm building a tool for dynamic configuration and that makes this doubly fun, because now the configuration can come from a remote, live updating source as well.
In Prefab, the source can be a default file, env specific defaults, live values, or machine specific overrides when doing local development.
I think the best way to achieve this is by providing an OS API that results in the files always being created in the same place. Applications/libraries could still choose their own filenames and syntax, just the location would be OS controlled. I think there is room for a new desktop/laptop OS to emerge and one good idea from mobile OS design I would like to see is having everything be an API call that allows the OS to impose standardisation, permissions and user preferences rather than the free-for-all desktop OSes have (though I propose letting the user run non-compliant applications, and not porting the iOS app store only model into the OS).
The problem with the Windows registry, at least back in the day, was that it was a single file that could be corrupted and it would wreck your whole system.
I think having a standard utility API for *nix configs makes a lot of sense. I'm surprised it doesn't exist.
I tried to find one, and there are some libraries for reading and parsing in every language, but nothing that seems to cover everything.
> The problem with the Windows registry, at least back in the day, was that it was a single file that could be corrupted and it would wreck your whole system.
Exactly. But the Registry has been using NTFS and its capabilities for rollbacks and recovery. Therefore, the problem is mostly solved. There are occasional bugs causing corruption [0], but they are very rare.
Lately I've written all my tools to not only announce where they're reading their configs from (which isn't precisely what OP is asking for but is close), but also to have an option to generate the default config that will be presumed if a config file is not found. "mytool --gen-config /tmp/newconfig.txt" should produce a fully-formed config with all defaults explicitly set which can be moved into /etc/mytool.conf or wherever it's supposed to go, all ready for me to tweak without having to look up settings names.
A system-wide key-value store, that's distinct from file storage, seems like a pretty great idea. The Windows Registry is icky, we all know that, it somehow appeared out of nowhere in Windows 95 already crusty as hell. But if it were designed by people with taste and discretion, would it be good?
When you give people the freedom to use the KV store as they wish, you inevitably run into a meta-problem whereby the way they write their data structures in to the KV store is varied across applications. I don't know if you could standardise this structure beyond enforcing applications to use an OS-assigned private store, but I don't see how that would work with portable executables etc.
I suppose you'd need to be very opinionated for it to work, but people will still bodge it to make it work with their needs that your system doesn't natively satisfy (people putting JSON strings in the values, for example).
The filesystem is manageable by users (with few exceptions). You can't easily discover which part of a registry hive is associated with an application.
It's precisely as discoverable as the filesystem. It's a tree like the filesystem. With names given to nodes by the application, just like the filesystem.
It's indeed "configured using symlinks" but not in the sense that you're free to use symlinks the way they're supposed to work but in the sense that if you don't want to use systemctl to place them, you can place them "by hand" if you know all the magic ways in which systemd interprets them and will remove them as it sees fit when you invoke a management command.
I have no idea what you mean by that. A systemd target is a folder in /etc/systemd/system/ which contains a bunch of symlinks to unit config files. When the target is requested, systemd makes sure that all these units are running. This is how symlinks have been used for configurations since forever (for example by apache on Debian).
ls: cannot access '/etc/systemd/system/mydaemon.service': No such file or directory
It has no business removing that symlink. It was placed there by me, and it should be treated no differently than if it was a regular file.
I have no problem with systemctl enable/disable creating and removing links in the appropriate target.d directories, but this symlink had nothing to do with that mechanism. Leave my stuff alone.
This doesn't work because unit files have to be in one of the load paths. Have you tried adding /path/to/ to $SYSTEMD_UNIT_PATH? This is documented in the Unit File Load Path section of the manpage [1].
> This is how symlinks have been used for configurations since forever (for example by apache on Debian).
This is not how symlinks have been used for configuration "since forever." You didn't have to add the link target to some "load path" variable when the link itself was in the correct directory.
This article (And most notably its title) uses a narrow view of programs, how they're interacted with, and storage models. Should your car's ECUs do this? Your washing machine? A GUI program? What should they output if storage is the last page of flash memory? What if it uses 2 pages? If it's offboard, should it report the device ID? If it's a GUI, should it have a button that displays this? Would an import/export feature meet the intent?
I find the author's intent and meaning to be rather straightforward to understand: he's a Unix sysadmin, so I think it's understandable that he's speaking about programs executed on a Unix-like system that a sysadmin is likely to administrate.
So no, neither a washing machine nor a car ECU is expected to do this (not until sysadmins are paid to administrate them, at least).
As for technicians that might need to configure one someday, those people are usually expected to already know where to look, or at least to be able to read their documentation correctly. And anyway, car ECUs and washing machines usually provide a UI for configuration and don't rely on config files as a primary mode of configuration.
As for GUI programs, if they use a human-readable config file under the hood, or if they accept a config file as input (some let the user change simple settings through their UI but use a config file for more advanced settings), then yes, they should have a way to show where it is (either through a command, or through the settings UI, for example).
I don't really know what you are talking about with the pages of a flash memory... Since in Unix everything is a file, it will be read through a file anyway, not directly through pages. Same thing for offboard: just display the path to the file, wherever it might physically be. And if you really are in the rare and hypothetical scenario where your program reads its config directly from the pages of a flash memory, just have it display "The config is read from the last [two] page[s] of <this flash memory>" when invoked with --help.
The article makes sense in this context, and after a reread, the hint is the identification of the author's background in the first sentence. I misunderstood due to the title of "Everything that uses configuration files should report where they're located", and lines like "if a program uses a configuration file (or several), it should have an obvious command line way to find out where it expects to find that configuration file.", and the use of "program" in general.
Specifically, "program" in this case refers to "CLI program that runs on Linux, used by the IT field".
I've found strace | grep "open" particularly useful for locating config files. They're usually opened right at the start of the program as well.
(For reference, strace prints out every system call used by a program with the arguments that were called; occasionally, a program may use something other than `open` or `fopen` to read in a file, so that should be kept in mind when using this method)
strace has its own filtering mechanism, so you can do `strace -e open`. Though you might have to add `,openat,openat2` so maybe your way is easier in practice.
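Concretely, something like this usually narrows it down fast (the file-name pattern is just a guess at common config extensions; add openat2 to the trace list if your strace and kernel are new enough):
strace -f -e trace=open,openat -o /tmp/conf.trace someprogram --its-usual-args
grep -E '\.(conf|cfg|ini|ya?ml|toml|json)' /tmp/conf.trace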
I've had rare cases of needing nested data structures in the environment. My solution was to specify them as JSON strings and load the value in code. It works.
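A small sketch of what that looks like in practice (the variable name and jq usage are just for illustration):
export DB_REPLICAS='["db1.internal","db2.internal"]'  # a JSON list stuffed into a single env var
echo "$DB_REPLICAS" | jq -r '.[]'                     # expand it back into one item per line in shell; in code, just JSON-parse the variable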
- If your configuration has more than 5-10 options then env vars become a mess while a configuration file is more maintainable
- Nested configuration / scoping is a mature advantage of configuration files
- You can reload configuration files whereas you can't reload environment variables during runtime
- A configuration file is a transparent record of your configuration and easier to back up and restore than env variables. Env variables can be loaded from many different locations depending on your OS.
- In configuration files you can have typed values that are parsed automatically; with env variables you need to parse them yourself. This is just a difference, not that bad for env variables per se.
Programs reconfigurable at runtime are by far a rarity, as that's generally a pretty hard thing to do. But yes, just have a config file; ENVs are nice as a backup for small apps, but never as the only way.
It's definitely nice when it does but it's also a whole lot of complexity to add to the code. You have to re-create all structures relying on configs, reconnect if needed, clear all previous ones, but only after any ongoing processing of requests received before the reload signal finishes.
Some go half-way, like HAProxy, where it does spawn a new process, but that process just binds with SO_REUSEPORT and signals the old one to close the socket and finish all remaining connections. So you effectively get hitless reload; it's just that the old process is the one finishing in-flight requests.
Yeah, it def adds complexity but if you want it used in real world applications, it’s a must-have. With only env vars, it’s an impossibility to ever have (without supporting infrastructure and processes), which was the point I was trying to make.
Hardly. "Real world applications" need to tolerate single node failure anyway so reload functionality doesn't add all that much. Some system daemons, sure, them not working for a second can be disruptive so it is worthy endeavor there, but the amount of added complexity is nontrivial and the added complexity scales with apps' own complexity.
It's a bit different for desktop apps, as restarting app every time you change some config options would be far bigger PITA than some server app.
Unless you’re working for a tech company, it’s running on a single server (and it’s worth pointing out that most companies aren’t tech companies). That’s what I meant by “real world”, AKA, outside the tech bubble.
Yes. Env vars beat constants but have stunted our thinking, because we know deep down it's not a great idea to have too many.
Config files are better, but can lead to the mess described here and typically only give us crude axes for overriding.
It's time for us to embrace & tame the complexity. If you use a feature flag tool already, you know what it's like to target different values based on rules.
If I want to experiment with changing the http timeout for specific calls from a single service in a single region I should be able to roll that out.
It's time to expand our brains and bring this feature flag style thinking to the rest of our configuration.
> If I want to experiment with changing the http timeout for specific calls from a single service in a single region I should be able to roll that out.
...why? How often do you do it? What actual app feature does it fill? Why would you throw permanent code at something that you need to do once or twice?
That kind of thing is FAR better served by customizable hooks (a loadable .so/.dll, maybe even an attached Lua interpreter) than by a bazillion random flags for everything and checks cluttering the code.
For anything more than trivial configurations it can be nice to commit them to source control, giving some reversion capabilities, and easy replication on other projects.
I also vastly prefer it when error messages give a hint about what reconfiguration you can do to fix it, including correct syntax and googleable terms.
So don't do: farblewarble too low.
Instead:
FarbleWarble is 1234 but need at least 5678. Edit /etc/whatever.conf and add: FarbleWarble="5678"
There is a special hell for programs that talk about how the Burble is configured incorrectly, without any mention of FarbleWarble.
It would also be nice to have an “explain” option where it shows how or why a value was updated. I am specifically thinking of Ansible which has a multitude of locations from which it sources variables. It would be nice to be able to spy on a variable and report every value change and where that change came from.
I'm of the opinion that most programs should get their config files explicitly from the command line. None of that ~/.apprc or $APP_PORT=8080. That's inconvenient for tools you use often, like ls or grep. If they read config files, an option like --ls-config would be very welcome.
Every new account or OS installation, I have to hunt around to turn that off. I absolutely dislike colored output in almost all cases. Might be because I'm color-blind. But they keep moving those settings around. Perhaps that part is done in their spare time by the MS Office menu bar team.
This is a great idea. I feel that the best total situation is 2 things:
1. A way to print the paths to all files visible that will be read for configuration. This means that if the presence of the user config file means the system config file is ignored, print only the user config file path. Alternatively, print that the system file will not be read but show its path anyway. There are many ways to solve this, and it depends on the way the app loads configs.
2. A way to print the final configuration after it's been loaded and validated. This should show which settings override others. For example, if the user misunderstands the config merging, or erroneously quotes a numeric value causing validation to fail, this would be a great way to triangulate that.
I've seen quite a few apps implement #2 but rarely do they have #1.
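A minimal sketch of both points, assuming a made-up app name, TOML config files, and a simple "user overrides system" merge strategy:

    import json, os, tomllib  # tomllib needs Python 3.11+

    CANDIDATES = [
        "/etc/myapp/config.toml",                            # system, lowest precedence
        os.path.expanduser("~/.config/myapp/config.toml"),   # user, overrides system
    ]

    def load_config(verbose=False):
        merged = {}
        for path in CANDIDATES:
            if os.path.exists(path):
                if verbose:
                    print(f"reading {path}")           # point 1: what will be read
                with open(path, "rb") as f:
                    merged.update(tomllib.load(f))
            elif verbose:
                print(f"skipping {path} (not found)")  # point 1: what was considered
        return merged

    if __name__ == "__main__":
        config = load_config(verbose=True)
        print(json.dumps(config, indent=2))            # point 2: the merged result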
Maybe tangential to this, I’ve been thinking about the idea of “conceptually linked items” in a system and how to ensure things don’t become desynced. Static type systems help keep linked objects synced across code. But what about documentation? What about configuration code? It quickly falls out of sync.
I was thinking a possible naive solution to this in a file system would be some sort of pragma you could put on a file: other files it should be synced with, and a date of last change. If you update a file, any file it's supposed to be synced with must also be updated, even if only to bump the date of last change. It's sort of a forcing function to make sure everything that needs to be up to date actually is.
Could be onerous in a large system but I’m not sure.
This is a high-quality blog post. It raises awareness of a use case that's always been there but might never have surfaced, it goes straight to the point, and it provides clear examples.
It's a blog post where everyone ends up being slightly better after reading it.
Programs that use configuration files should provide ways to:
- show which sources of config data were used: files, CLI, env vars, etc.
  - with --verbose, also show which potential sources were not used
- show any conflicts or overrides from config sources
- show config settings differing from program defaults
  - in the same format as their source, or
  - in an appropriate format for a config file
For the last item, it should be possible to capture the output showing non-default settings and append it to a config file, without any editing or processing, to reproduce the program configuration state exactly.
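A sketch of that last point, assuming a simple key="value" config syntax and made-up setting names: print only what differs from the defaults, in the file's own syntax, so the output can be appended to a config file verbatim.

    DEFAULTS = {"Port": 8080, "LogLevel": "info", "Workers": 4}

    def dump_non_defaults(effective, defaults=DEFAULTS):
        # Emit only settings whose value differs from the built-in default.
        for key, value in sorted(effective.items()):
            if defaults.get(key) != value:
                print(f'{key}="{value}"')

    # e.g. after CLI/env/file merging produced this effective configuration:
    dump_non_defaults({"Port": 8080, "LogLevel": "debug", "Workers": 16})
    # LogLevel="debug"
    # Workers="16"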
This also applies to code generation. At my first job out of college, it was difficult to piece together where a particular bit of auto-generated code came from. At my second job, I pushed everyone to ensure that generated files pointed back to their sources (configuration files and generating code). Ultimately, one of our engineers designed it into our codegen framework so well that it basically came for free for our developers. And boy was it nice to not wonder where the knobs were that changed your output.
On a side note, we also checked in the codegened files and that was great for other reasons like reviewability and searchability.
I feel the lack of standards for config files is probably the biggest reason app containers got traction with advanced home users. Save your Docker containers, nuke your server install, move to a new distro, and reimport the app containers.
There are standards. The article even mentions the XDG base directory standard. The problem is that not everyone follows them, but a large number still do, and then it is easy to find them. I still agree that app containers are a good idea.
The main culprits are programs that were written before the standard existed, in addition to the many people who just don't know about it.
Who should enforce this standard? I would say community pressure does it pretty well at the moment. A standard does not need to be enforced and the XDG base directory one is still useful. The more awareness exists and the more people want this and think it is useful the better. It also has the advantage that no other competing standard exists as far as I know but it still has some downsides.
I'm sick of programs dumping config files in random places in my home directory instead of .local, and then trying to read other system configuration files from the home directory even though I've gone through the proper steps to move them to .local and labeled the fact that I've done so with environment variables.
More and more I'm feeling software deserves to be dumped into a Nix-style sandbox if it refuses to behave properly. I've had to recompile shit from source just to get it to respect my environment variables regarding where to look for local configs.
As I read this I wonder if an SBOM (Software Bill of Materials)[1] could be used. I'm not sure if it already has configuration files as part of its definition, but it doesn't seem like a stretch to add. BTW, I've never used SBOM in practice, but it seems like something worthwhile to implement.
Further: programs should use well-established libraries to manage config and data file locations. As easy as it is to roll your own, a library can and should handle the corner cases you don't want to have to worry about.
Python has platformdirs for this. It handles XDG and a number of corner cases.
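For example (the "myapp"/"myorg" names are placeholders), platformdirs gives you the right directories per platform and respects XDG_CONFIG_HOME on Linux:

    from platformdirs import site_config_dir, user_config_dir

    print(user_config_dir("myapp", "myorg"))  # e.g. ~/.config/myapp on Linux
    print(site_config_dir("myapp", "myorg"))  # e.g. /etc/xdg/myapp on Linux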
"just make everything a protobuf" (or whichever system you prefer). The config file is essentially an API and APIs really benefit when they're typed.
You can make an actual API to expose application configuration knobs too, which is nice, but you should tie this into your rollout/CI-CD system so you're just as careful pushing new configuration values as you are pushing new binaries (multiple envs, slow rollouts, blue/green, etc.).
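Not protobuf, but the same "typed schema" idea in miniature, with invented field names and a plain dataclass plus JSON standing in for a real schema language: unknown or missing keys blow up at load time instead of deep inside the program.

    from dataclasses import dataclass
    import json

    @dataclass(frozen=True)
    class Config:
        port: int
        log_level: str
        enable_cache: bool

    def load_config(path: str) -> Config:
        with open(path) as f:
            raw = json.load(f)
        cfg = Config(**raw)                # TypeError on unknown or missing keys
        if not isinstance(cfg.port, int):  # spot-check a value's type as well
            raise TypeError("port must be an integer")
        return cfg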
This is how we wind up with three implementations of a "Windows-style Registry" while also having all the other management modes: local dotfiles, system dotfiles, all of /etc, and whatever is allowed under /opt.
Be smarter about package management and this is a non-issue; checksums, install-time dumps of file modifications, and the like.
It’s why tools like CFEngine and the rest were invented, and I’m upset this is still an issue for people like OP twenty years later.
Configuration files are like parents’ advice - carefully thought out, well-intentioned, but often overwritten by the ‘environment variables’ of real life experiences!
I have this tool, which is used in a few different environments, so I gave it a “xxx config” command which tells which config files were read, which environment variables were used, which environment variables from the tool’s XXX_ namespace are unknown, what configuration values are effective etc.
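A sketch of that kind of report, with a hypothetical MYTOOL_ environment-variable namespace and made-up setting names: show which recognised variables are set, and flag anything in the namespace the tool doesn't know about.

    import os

    KNOWN = {"MYTOOL_PORT", "MYTOOL_LOG_LEVEL", "MYTOOL_CONFIG"}

    def report(prefix="MYTOOL_"):
        in_namespace = {k: v for k, v in os.environ.items() if k.startswith(prefix)}
        for name in sorted(KNOWN):
            print(f"{name} = {in_namespace.get(name, '<unset>')}")
        for name in sorted(set(in_namespace) - KNOWN):
            print(f"warning: unknown variable in namespace: {name}")

    if __name__ == "__main__":
        report()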
I disagree that it's the responsibility of the tool. The program should just reach out for its files, and some higher-order or meta runner should trace what the process does. One of the commenters mentioned `strace`; this is the correct approach, but it should be done in a well-engineered way.
Another way is to run the program in a symbolic interpreter and trace what it wants to read in any of the branches.
Why do you suggest this path over documenting the behavior explicitly?
What makes one approach better than the other? Is it faster to run a trace? Should a developer trace their own code and auto-generate documentation post-build?
And... one of my pet peeves with configurations is that people delete everything that is commented out.
Like Helm chart definitions. Now there is an upgrade of the chart, and I need to go and trace down which value changed.
Well, if we had kept the commented stuff, I could overlay it with the new defaults, showing me where the defaults have changed. At least that gives me some clue.
Consider writing a FUSE filesystem driver that generates configuration files on the fly in response to some secure source of truth. It skips one of the middlemen in terms of configuration management and makes programmatic reconfiguration more accessible.
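A minimal sketch of that idea using the fusepy bindings; the "secure source of truth" is just a lambda returning a dict here, and the file name and mount point are placeholders.

    import errno, stat, time
    from fuse import FUSE, FuseOSError, Operations  # fusepy

    class ConfigFS(Operations):
        """Read-only FS exposing a single app.conf rendered on every read."""

        def __init__(self, source_of_truth):
            self.source = source_of_truth  # callable returning a dict of settings

        def _render(self):
            return "".join(f"{k}={v}\n" for k, v in self.source().items()).encode()

        def getattr(self, path, fh=None):
            now = time.time()
            if path == "/":
                return dict(st_mode=stat.S_IFDIR | 0o555, st_nlink=2,
                            st_atime=now, st_mtime=now, st_ctime=now)
            if path == "/app.conf":
                return dict(st_mode=stat.S_IFREG | 0o444, st_nlink=1,
                            st_size=len(self._render()),
                            st_atime=now, st_mtime=now, st_ctime=now)
            raise FuseOSError(errno.ENOENT)

        def readdir(self, path, fh):
            return [".", "..", "app.conf"]

        def read(self, path, size, offset, fh):
            return self._render()[offset:offset + size]

    if __name__ == "__main__":
        # Stand-in for the secure source of truth (a secrets manager, a config
        # service, ...): here just static values.
        FUSE(ConfigFS(lambda: {"Port": 8080, "LogLevel": "info"}),
             "/mnt/configfs", foreground=True, ro=True)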
I think good programming languages enforce that. I really like that in Golang you import a library and then have to qualify any calls with that library's name. For example, if you imported library foo, calls will look like foo.Bar and foo.Gu. I think that's one thing Rust didn't steal from Golang when they should have.
Anaconda is by far the worst offender I have encountered. It spreads a ton of configuration over your home directory, .condarc, and your shell profile dotfiles, as well as further configuration in the install folder and of course the project's environment.yml file. Good luck finding out the channel precedence.
This would also really help the CLI in terms of auto-completion. I see people go above and beyond writing completion scripts, which could be simplified instead by running --autocomplete or checking some file somewhere.
It should also report the syntax and so on, and actually you want a way to directly modify configuration values. So if you think this through, you will end up with something like https://www.libelektra.org
I'd add: everything should use configuration files that are stored on the machine. I know services that use configuration files but expect you to upload them into the service using an IDE and a plugin, which then stores the config in its database.
Not supporting configuration files on disk allows a service to work without needing read or, worse, write access to the disk. Storing configuration in the database is a benefit (potentially even for security).
Agree! Also any other locations where any state that impacts the program behavior is stored: cache, temporary files, etc. Ultimately I want to be able to debug things or do a sort of a "factory reset". Or maybe I need to move to NixOS.
Perhaps we could implement some sort of standard switches, such as --locate-conf for reporting which configuration file(s) are being read (and in which order) or --show-conf to produce a comprehensive list of set configurations.
Yes. Yes. Yes. I always do, and I condemn everyone who doesn't. Also, it's nice to have a -showConfig flag that exposes the results of config file processing, minus secrets of course.
I am curious, as a UNIX user, why there is no global configuration directory in the home path where all the configuration files are stored. I hate that my home directory is cluttered with lots of .files and .dirs.
How many of us just upvoted on the title alone? Once you see it, it's just self-evident. Which doesn't mean we remember to do it when we are implementing things that use config files.
I just implemented a "--config" command in my latest code that reports exactly what was used for configuration, files and environment variables, and where the files are actually located.
I find "strace programname 2>&1 | grep open" is a good way to find out which files are being opened for config. It's quicker than digging out the man page.
i worked at a big company ($10B+) that was heavy on scala and akka. we had tons of bugs across almost every team trying to figure out which application.conf was setting something or getting overwritten from somewhere else. we had many unnecessary configs to force something we couldn’t figure out. we had maybe 2 people who actually understood it and could debug it. what a horrible system. no one ever walked away from akka and said i enjoy the decisions they made to configure it
Haha, I wonder if we worked at the same company. Anyways, I also remember that a lot of those same teams never read the documentation that explained how any of that stuff worked. In general, most of the Scala stuff was made out to be super hard and complicated by other people, when really they just needed someone to explain the basics of what was happening in simple terms.
I don't get to write Scala anymore and now I really miss it for certain things.
i wrote scala at work for like 12 years straight. it’s a great language and i really do miss it at times. scala code always achieved its goal of being less code. but there were some things like this where scala made it more difficult. and i would like to chalk this up to “someone just needed to learn the basics”, but why was everyone having this issue? it’s a bad interface
Perhaps there should be a semi-standard flag, similar to --help and --version, for showing help specifically about configuration and environment variables. --cfg-help maybe?
This has been the default (pun intended) on Apple OSes for a long time. At the end of the day, the developer has the ability to create/destroy files at will.
It should be easier to track file opening history of any running program. There is strace, but grepping/parsing the output for this purpose gets very messy.
Agreed! Few things anger me more than when a program simply reports, "File not found"... OK, which file? Where were you looking for said file?
I know how we get here: people simply report the error from the OS without adding context. This is why I love anyhow's error reporting in Rust; you can attach context (like the file name and path) to the error.
In log-store.com I report the default name of the config file (you can specify it via the command line if you want), and the three locations searched, in order, if the config file isn't found. I also have it report the file being used, so if you expected the file in /etc but you accidentally had one in your home directory, you'll know on startup.
php.ini is my experience of the worst of this. Finding which of the fifty-five billion config files is the one actually controlling the web server's PHP, and not the console's, is always a fun journey.
Especially if there are multiple places to read them from: current dir, PATH, classpath, $PROJECT_HOME, database, env vars... With env vars it would be handy to also know where they come from. Is there a way? Like having a timestamp of when a variable was set, and by what user and process.
The title, read without context, sounds like it's suggesting they should report which IP address/country/jurisdiction they're in. To any random user on the internet. Which would be insane. But it's really only saying "which system path they're in", to e.g. a sysadmin.
I am familiar with how software is configured, and the implied context is not clear from the headline alone, hence the headline needs fixing. "Everything" is a massive overstatement: bank websites use configuration files, so if hackers try to log in to the bank, should the website report where the config files are located? Nope. The headline is way too broad.
Some programs use a "last occurrence wins" approach, while other programs (e.g. sshd) use a "first occurrence wins" approach.
Buried in https://man.freebsd.org/cgi/man.cgi?sshd_config(5) and awkwardly expressed:
"Unless noted otherwise, for each keyword, the first obtained value will be used."
That's why e.g. Debian's /etc/ssh/sshd_config ( https://salsa.debian.org/ssh-team/openssh/-/blob/debian/1%25... ) starts with
    Include /etc/ssh/sshd_config.d/*.conf
, so that custom configurations in that directory win over the distribution-provided defaults set up after this Include directive inside /etc/ssh/sshd_config.