- I need to configure 50 virtual hosts, each containing a different set of web content - should I make 50 separate packages? or make one meta package?
- I need to install a java web application, and configure a database.properties file with a JDBC URL that might vary based on environment
- I have a set of cron jobs that I need to be configured based on the application sets that I have installed, but a different set that I need to be consistent across all systems. Now I need to build logic that figures out whether any of the cron jobs configured match jobs that already exist, or run the risk of them running more than once.
- I need to install tomcat 15 times with slightly varied configurations.
Now, you can say that some of these things are not relevant in development environments, or that you can do some of them with packages, and so on. But there are real advantages to using a config management tool to build your dev environment, and then, when you're ready to move to production, using the same config management model to build that environment.
It's not that you can't make packages do most or many of the things you want. It's about using the right tool for the right job.
2. Make one package for the web application. You can populate database.properties in a post-install configure section if you already know what it should contain, or have it run a script on the target that loads the correct variables. Or make a separate package just for database.properties. Again, it comes down to where you want to do maintenance.
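As a rough sketch of the "run a script on the target" option: a post-install step can render database.properties from a template shipped in the package. Everything here (APP_ENV, the @DB_HOST@ token, the hostnames) is made up for illustration, and the temp directory stands in for the app's real conf directory:

```shell
#!/bin/sh
# Hypothetical post-install step: render database.properties from a template.
# APP_ENV, the token, and the hostnames are assumptions for illustration.
set -eu

APP_ENV="${APP_ENV:-dev}"       # e.g. dev, staging, prod
CONF_DIR="$(mktemp -d)"         # stand-in for the app's conf directory

# The package would ship a template like this:
cat > "$CONF_DIR/database.properties.template" <<'EOF'
jdbc.url=jdbc:mysql://@DB_HOST@:3306/appdb
jdbc.user=appuser
EOF

# Pick the host for this environment (in reality this might come from a
# central metadata service -- which is where CM starts creeping in).
case "$APP_ENV" in
  prod)    DB_HOST=db-prod.example.com ;;
  staging) DB_HOST=db-staging.example.com ;;
  *)       DB_HOST=localhost ;;
esac

sed "s/@DB_HOST@/$DB_HOST/" \
  "$CONF_DIR/database.properties.template" > "$CONF_DIR/database.properties"

cat "$CONF_DIR/database.properties"
```

The same script works whether it runs from a %post section or from a deploy tool; the only real difference is where the environment metadata comes from.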
3. This one isn't too difficult. Use /etc/cron.d/ and name your cron jobs uniquely based on what they do. Then make packages however you want. Even if multiple packages deliver the same cron job (more than once), you're just overwriting a file that already exists, so no duplicates. If that causes a conflict you can deliver to a temp location and have a %post section test if the target cron file already exists, and delete the temp file.
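The "deliver to a temp location and test in %post" idea might look like the sketch below. The paths are sandboxed stand-ins for /etc/cron.d and the package's staging path, and the promote function plays the role of the %post scriptlet:

```shell
#!/bin/sh
# Sketch of the %post idea: the package delivers its cron file to a staging
# path, and the post-install step only promotes it if the uniquely named
# job isn't already present. Paths are stand-ins for illustration.
set -eu

CRON_D="$(mktemp -d)"          # stand-in for /etc/cron.d
STAGE="$(mktemp -d)"           # stand-in for the package's temp delivery path

# Package A delivers its copy of the shared job:
cat > "$STAGE/rotate-app-logs" <<'EOF'
0 3 * * * root /usr/local/bin/rotate-app-logs
EOF

promote() {
  # %post logic: install only if the target cron file doesn't exist yet
  if [ -e "$CRON_D/$1" ]; then
    rm -f "$STAGE/$1"          # duplicate: discard the staged copy
  else
    mv "$STAGE/$1" "$CRON_D/$1"
  fi
}

promote rotate-app-logs        # first package to deliver the job wins
echo "installed: $(ls "$CRON_D")"
```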
4. This is where slotted services (admittedly a term I made up) come in handy. No package manager really deals with this properly, which is where a completely virtualized service becomes a much easier way to handle it. But you can still install a chroot directory and run the service from it using just the package manager, no deploy tool required. Optionally, you can build a set of packages and dependencies, install them to version-specific directories, set up directories of symlinks to the versions you want, and target your application at the symlink directory tree that matches the versions you want to run. It can be a hassle if you don't have a tool to do it for you. I think there are some existing open source tools designed to do this, but I haven't used them.
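The version-specific-directories-plus-symlinks idea, in miniature (paths and version numbers are made up; the application only ever references the symlinked "current" slot):

```shell
#!/bin/sh
# Sketch of "slotted" installs: version-specific directories plus a symlink
# the application targets. Versions and paths are made up for illustration.
set -eu

ROOT="$(mktemp -d)"
mkdir -p "$ROOT/tomcat-9.0.80/bin" "$ROOT/tomcat-10.1.15/bin"
echo 'echo tomcat 9'  > "$ROOT/tomcat-9.0.80/bin/version.sh"
echo 'echo tomcat 10' > "$ROOT/tomcat-10.1.15/bin/version.sh"

# Point the "current" slot at the version this app should run:
ln -s "$ROOT/tomcat-10.1.15" "$ROOT/current"

# The application only ever references $ROOT/current/...
sh "$ROOT/current/bin/version.sh"

# Rolling back (or forward) is just repointing the symlink:
ln -sfn "$ROOT/tomcat-9.0.80" "$ROOT/current"
sh "$ROOT/current/bin/version.sh"
```

The repoint-the-symlink step is also why rollbacks under this scheme are fast: nothing is reinstalled, only the link moves.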
(That was for running 15 different versions of tomcat, by the way. If you just want to run tomcat with different configurations, just make your configs and run tomcat for each instance! The scripts that ship with tomcat already support instance-specific configurations: CATALINA_HOME is the shared install path, and CATALINA_BASE is the instance-specific configuration path.)
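For the multi-instance case, the per-instance layout can be sketched like this. Nothing is actually started here; the script just builds the skeleton directories a CATALINA_BASE instance expects, with instance names and ports invented for illustration:

```shell
#!/bin/sh
# Sketch of running multiple tomcat instances from one shared install:
# CATALINA_HOME is the shared install, CATALINA_BASE is per-instance.
# No tomcat is started; this only builds the expected skeleton layout.
set -eu

CATALINA_HOME="$(mktemp -d)"    # stand-in for the shared install
BASE_ROOT="$(mktemp -d)"        # parent directory for all instances
port=8005

for inst in app1 app2; do
  CATALINA_BASE="$BASE_ROOT/$inst"
  mkdir -p "$CATALINA_BASE/conf" "$CATALINA_BASE/logs" \
           "$CATALINA_BASE/temp" "$CATALINA_BASE/webapps" \
           "$CATALINA_BASE/work"
  # Each instance carries its own server.xml with its own ports:
  printf '<Server port="%d"/>\n' "$port" > "$CATALINA_BASE/conf/server.xml"
  port=$((port + 1))
  echo "$inst -> $CATALINA_BASE"
  # The real invocation would be something like:
  #   CATALINA_HOME=$CATALINA_HOME CATALINA_BASE=$CATALINA_BASE \
  #     "$CATALINA_HOME/bin/catalina.sh" start
done
```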
Everything you mentioned is relevant to development. And I'm not trying to discount real configuration management. If anything, it's critical to use a real configuration management tool to manage large, complex sets of configurations across large orgs. A package manager is just an easy deploy tool for the configuration; how you manage it is left as an exercise for the engineer.
(That being said: why don't deploy tools incorporate change management, persistent locking, user authentication, pre/post install hooks, and audit trails? We need more open source solutions that fulfill enterprise requirements)
2. Once you're "running a script on the target that loads the correct variables", you're doing configuration management. There needs to be a mechanism for retrieving the metadata from somewhere, usually centralized. You'd be better off delivering this through a CM tool to build and maintain the file.
3. This doesn't work. Not only do you have the conflicts to deal with (and a post-configure script is a hacky way to deal with them), but if you need to change a single cron job, you now have two unsatisfying options. I can make a new meta-package for that one cron job and overwrite the versions delivered by the five other packages, which will break any config validation you're doing (as one would hope you are). Or I can update all of the packages that provide that cron job, which works, but now I have to come up with a way to tell each package that, just in this case, it shouldn't restart services and the like simply because it was deployed. On top of that, there's an even worse issue: how do you know when the last cron job is removed? If I have five packages that all create that file, either removing the first one removes it, which breaks the other four, or I have to write post-remove script logic that tries to programmatically determine whether this is the last reference to the file, and remove it only then. If I did the latter, my meta-cron-job package update would break this model as well, and I'd have to remember to remove that package specifically as part of uninstalling the others.
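To make the "last reference" problem concrete, the bookkeeping it forces on you looks roughly like a reference count maintained across every package's %post and %postun. This is a made-up sketch (paths, job names, and the refcount directory are all invented), and the fragility is the point:

```shell
#!/bin/sh
# Sketch of the post-remove bookkeeping a shared cron file forces on you:
# a per-file reference count, updated by every package's install/remove
# scriptlets. Illustrative only -- every package must cooperate perfectly.
set -eu

CRON_D="$(mktemp -d)"           # stand-in for /etc/cron.d
REFS="$(mktemp -d)"             # stand-in for a refcount directory

add_job() {    # called from each package's %post
  n=0; [ -f "$REFS/$1" ] && n=$(cat "$REFS/$1")
  echo $((n + 1)) > "$REFS/$1"
  echo "0 3 * * * root /usr/local/bin/$1" > "$CRON_D/$1"
}

remove_job() { # called from each package's %postun
  n=$(cat "$REFS/$1")
  echo $((n - 1)) > "$REFS/$1"
  # Only the last package out actually removes the cron file:
  if [ "$(cat "$REFS/$1")" -eq 0 ]; then
    rm -f "$CRON_D/$1" "$REFS/$1"
  fi
}

add_job rotate-logs     # package A installs the shared job
add_job rotate-logs     # package B installs the same job
remove_job rotate-logs  # package A removed: the file must survive
[ -f "$CRON_D/rotate-logs" ] && echo "still present"
remove_job rotate-logs  # package B removed: now it goes
```

One package that writes the file without touching the counter (the meta-package case above) silently breaks the whole scheme.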
4. I guess this works, but now you're dealing with chroot'ed environments, which means deploying not just the specific stack you want, but all of the necessary libraries, and as you say, your original package manager idea doesn't really deal with this.
And tomcat gets a lot more complicated too, when you're trying to manage shared XML resource files. In fact, the whole package manager notion really requires the "X.d" style approaches to loading files.
But all of these challenges are why package managers are the worst solution for deploying configuration files. Package managers are great for deploying static software, shared libraries, and the like. I'll even concede that dropping code on a machine is fine with a package manager. But they're not designed to deal with dynamic objects like config files and system configurations.
In fairness, there's a third class of objects that both package managers and configuration management tools currently handle terribly, and that's things that represent internal data structures: database schemas, kernel structures, object stores, etc.
I've built a couple of configuration management tools and work for a company that has a few more, so this is something I've spent a lot of years working with. Package management as configuration distribution is attractive for its simplicity, but falls apart beyond the simple use cases. Model-driven consistency is vastly superior.
But you have a good point! Dupe files are hard to manage. Some package managers refuse to deal with them; others have complicated heuristics. The best solution would be to just deliver the files and let a configuration management tool sort out what needs to be done based on rulesets. This can still be accomplished with packages as a deploy tool and a configuration management tool doing the post-install work, instead of the %post section.
4. You already deal with chroot environments using lxc/docker/etc. They're just slightly more fancy. But even with docker's union fs you still have to install your deps if they don't match the host OS's. Unless, of course, you package all the deps custom so they can be installed along with the OS ones. Nothing is going to handle that for you, there is no magic pill. Both solutions suck.
Most configuration management eventually becomes a clusterfuck as it grows and gets more dynamic and complex. In this sense, delivering a static config to a host in a package is simpler and more dependable. I can't tell you how annoying it is to test and re-deploy CM changes to a thousand hosts, all with different configurations, only to find out that 15 of them have individual, unique failures, and then have to go to each one to debug it. On the other hand, you could break those configs out into specific changes and manage them in VCS. Or even pre-generate all the configs on a central host, verify they are as expected, and then package and deliver them. I have done both, and both have their own special (read: maddening) problems.
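The "pre-generate centrally, verify, then ship" workflow can be sketched roughly as below. The template, hostnames, roles, and the verification rule are all invented for illustration; a real pipeline would render from the actual CM data and run real validation before packaging:

```shell
#!/bin/sh
# Sketch of pre-generating every host's config on a central box, then
# verifying the lot before anything is packaged and shipped.
# Template, hostnames, and roles are made up for illustration.
set -eu

OUT="$(mktemp -d)"

render() {  # render one host's config from the shared template
  host="$1" role="$2"
  sed -e "s/@HOST@/$host/" -e "s/@ROLE@/$role/" <<'EOF' > "$OUT/$host.conf"
server_name @HOST@
role @ROLE@
worker_processes 4
EOF
}

render web01 frontend
render web02 frontend
render db01  database

# Verification pass before anything ships: every config must name its host.
for f in "$OUT"/*.conf; do
  host="$(basename "$f" .conf)"
  grep -q "server_name $host" "$f" || { echo "BAD: $f"; exit 1; }
done
echo "all configs verified"
```

The appeal is that every config you ship has already been inspected; the catch, as noted above, is that the per-host rendering logic still has to live somewhere.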
For reference, the sites I worked at that delivered configuration via package management spanned several thousand hosts on different platforms, owned by different business units and with vastly different application requirements. But you have to adjust how you manage it all to deal with the specific issues. Edit: much of it involves doing more manual configuration on the backend so you can 'just ship' a working config on the frontend. Sounds backwards, but (along with a configuration management tool!) it works out.