What if Ansible used XML for configuration management?

gecko · on June 2, 2014

This post is ridiculous.

A very valid complaint would be that Ansible isn't idempotent, which I think is what the author was trying to get at. NixOS would be a great example of an idempotent system, and you might be able to convince me that Docker would qualify, too. As much as I like Ansible, its lack of true idempotence is a weakness.

But that has nothing to do with YAML v. XML v. anything else. You could trivially write an idempotent description language in XML, and you can trivially write a horrible state-based system in a Haskell DSL or Prolog. As far as I can tell, the author is complaining that people think that Ansible is idempotent because it uses YAML. That is incorrect, but I don't think that it'd be better or worse if they'd picked something else; merely more verbose. (Look at Maven and Gradle for an example.)

xpe · on June 2, 2014

Yes, the post is somewhat ridiculous. I can take the blame for that. Still, I think the choice of language for these tools matters a lot. With so many experiments in configuration tools, my goal was to raise the point in a light-hearted way.

I'm really interested to see if or how a community with diverse preferences (developers and system administrators) might agree on a small number of configuration management tools. Or are we destined to have these tools splinter across languages?

gecko · on June 2, 2014

I doubt it, unfortunately. I think a NixOS-like approach is the closest you can get to having a single tool, simply because it's a fairly logical extension of apt/rpm/etc.-style systems, but its very operation makes it too alien for me to get it adopted anywhere. I currently suspect that a movement to more PaaS-like designs is actually going to be our saving grace, simply because it ought to take so much off the table. Then we can look at tools like Clojure, Haskell, Prolog, and other similar language concepts to try to tackle the other half of things.

peteretep · on June 2, 2014

    > I think the choice of language for these tools matters
    > a lot

I read the post, and still don't know _why_ you think it matters, when comparing two data-interchange languages.

xpe · on June 2, 2014

I'm saying that using a data interchange language (whether it be XML or YAML) is no substitute for a programming language.

xpe · on June 2, 2014

What I've watched and read, mostly from ansible.com, but also http://en.wikipedia.org/wiki/Ansible, suggests that Ansible is idempotent. What should I read that makes the opposite claim?

gecko · on June 2, 2014

That's unfortunate. The presence of the command module alone ought to make it clear that's not true.

I think Ansible aims to be more idempotenty than other environments, and I think it succeeds there. The lack of something like Chef databags helps, as does its insistence that $SCM is the source of truth. But it doesn't take a holistic enough stance to really get all the way there. It's sometimes very difficult to do things like handle the transition from one version of a playbook to another; while running a given playbook on a given blank system may be idempotent, it can't actually reliably model state transitions. In other words, it's only got half of the puzzle.

crdoconnor · on June 2, 2014

It aims to be idempotent, but the presence of things like the command module allows you to hack it in such a way that it isn't.

This is why I don't use command in general, and if I do I try to make sure that it's an idempotent command.
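
As a rough illustration of the difference, here is a sketch in plain Python rather than Ansible. The function names and the sample config lines are made up for this example: a blind append changes the result on every run, while a check-then-append converges after the first run.

```python
def append_line(lines, line):
    """Non-idempotent: every run adds another copy of the line."""
    return lines + [line]


def ensure_line(lines, line):
    """Idempotent: add the line only if it is not already there."""
    return lines if line in lines else lines + [line]


config = ["PermitRootLogin no"]
wanted = "PasswordAuthentication no"

once = ensure_line(config, wanted)
twice = ensure_line(once, wanted)
print(once == twice)  # True: a second run changes nothing

once_bad = append_line(config, wanted)
twice_bad = append_line(once_bad, wanted)
print(once_bad == twice_bad)  # False: the line is duplicated
```

A shell one-liner behind command tends to look like `append_line` unless you add the check yourself.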

qntmfred · on June 2, 2014

you said idempotent 5 times fyi

prutschman · on June 2, 2014

That's only a problem if "idempotent" is heterological.

rdtsc · on June 2, 2014

Good points.

Ansible playbooks are not always declarative. There is often stuff that looks like this:

    ---
    # Launch job to count occurrences of a word.
    - hosts: $server
      user: root
      tasks:
        - name: copy the file
          copy: src=inputfile dest=/tmp/inputfile

        - name: upload the file
          shell: ...

        <more names and shell pairs ...>

That is just specifying a set of steps. So then it calls for for loops, if conditions, and so on.

I noticed that pattern in examples when looking at Ansible and kind of shook my head.

But oh well, I guess it depends on how you arrived at Ansible. If you view it as "better than SSH-ing and running shell scripts", OK, it looks nice. But if you want a more mature declarative configuration system, then maybe you'll start to see various flaws or design issues with it.

Chef seems more powerful. I've also played with SaltStack. I like SaltStack better so far, but aside from playing with it I still use shell scripts and OS packaging pre-/post-install scripts to handle tasks like these.

btgeekboy · on June 2, 2014

You say that they're not always declarative like it's a bad thing. Personally, I find it nice that we have the playbooks to build out our infrastructure and the playbooks that perform a rolling upgrade of our application using the same syntax, variables, and inventories. If you keep your declarative playbooks separate from your procedural ones, you get more power than a declarative system alone can provide. (FWIW, we chose it not just for this, but because of its lack of an agent - we can use our configuration management to deploy software to customers' infrastructure without colliding with their own CM.)

Further, this (using a data language as a programming language) isn't without precedent. Both XSL and Ant have standard conditionals, loops, etc available for use.

jimbokun · on June 2, 2014

XSL and Ant are the poster children for why this is a bad idea.

xpe · on June 2, 2014

> Both XSL and Ant have standard conditionals, loops, etc available for use.

Yes, XSL and Ant got there first, but what fraction of us would like to see the same decision made again? :)

vertex-four · on June 2, 2014

Salt implements its declarative language as a separate thing (States) from its more imperative parts (Modules, Runners, and the like). That's probably a better abstraction.

krakensden · on June 2, 2014

Things that take serious logic and state should be re-written as modules (or ActionModules, which are awesome & underdocumented). I feel like people are far too scared of writing their own modules- it's not very hard.

yen223 · on June 2, 2014

People don't necessarily hate XML, but man do we hate tags. One huge feature of YAML is its clever use of whitespace to eliminate a lot of cruft from the syntax.

That said, I'm starting to warm to the fact that Django uses straight Python for its settings. If you're going to use a Turing-complete language to write your config file, might as well use a familiar one.

crdoconnor · on June 2, 2014

I hate XML. It's unnecessarily complicated, it breaks easily and it's verbose.

YAML is kind of sucky as a data transfer format (chopped up YAML is still valid which leads to all kinds of screwed up behavior), but its mirror equivalent JSON is good for that. Equally, JSON sucks for declarative configuration but YAML really shines.

Django's settings file is also one of the thornier bits of Django. I'm not sure if it should necessarily have become declarative, but making it a monolithic Python file causes a lot of headaches with integrating modules, state changes, etc. The fact that you can often move a bit of config from one end of the file to the other and change the behavior completely is a bit screwed up. I think it would work better if it were a bit more ansible-y.
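
A contrived sketch of that order dependence, in plain Python and not taken from any real Django project. `build_settings` is a made-up helper that mimics a settings file being executed top to bottom, and `LOG_LEVEL` is just an illustrative derived value:

```python
def build_settings(debug_first):
    """Mimic a settings module executed top to bottom."""
    settings = {"DEBUG": False}
    if debug_first:
        settings["DEBUG"] = True
    # Derived value: computed from whatever DEBUG is at this point.
    settings["LOG_LEVEL"] = "DEBUG" if settings["DEBUG"] else "WARNING"
    if not debug_first:
        settings["DEBUG"] = True  # same assignment, just moved further down
    return settings


print(build_settings(debug_first=True)["LOG_LEVEL"])   # DEBUG
print(build_settings(debug_first=False)["LOG_LEVEL"])  # WARNING
```

Both variants end with `DEBUG` set to `True`, yet the derived value differs depending on where the assignment sits.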

yen223 · on June 2, 2014

I never did like using JSON as a configuration format (I'm looking at you, Sublime Text). But it's still miles ahead of using XML as a config format.

I can see where you're coming from regarding Django's settings file. Lord knows how many hours I've wasted hunting down stray configurations thanks to frameworks which abuse inheritance for their settings config (I'm looking at you, Mezzanine). Still, the fact that it is Python, and can therefore be debugged like any other Python script instead of obeying its own special syntax, is a plus in my books.

crdoconnor · on June 2, 2014

I was actually thinking of mezzanine when I wrote that (and django-cms is worse!). Both packages gave me a real headache while setting up settings.py. There was no clean separation between code and declarative configuration.

sanderjd · on June 2, 2014

Yes! One of my little frustrations with Rails is that most configuration is done in Ruby, but then every now and again you're supposed to configure something with YAML, and then even weirder, some YAML files are actually implicitly interpreted as erb. It's weird and I'm not sure what the point is.

yen223 · on June 2, 2014

The idea is like what crdoconnor mentioned - it's good design to strictly separate 'data' from 'code', and most configurations are 'data'. Unfortunately, in practice there are plenty of times when a small loop or a conditional would greatly enhance DRY in config files.

colechristensen · on June 2, 2014

I hate XML. It is a basket of good intentions poorly implemented.

yen223 · on June 2, 2014

Amen to that.

I actually like the idea behind XSD - I have encountered situations where a "strongly-typed" configuration file would have made a lot of sense. Only problem is that the kinda-like-XML-but-not-really format of XSD is so clunky it becomes a second source of frustration.

vezzy-fnord · on June 2, 2014

The author's complaint isn't really related to YAML or XML, it's his disdain that Ansible uses a data serialization language, instead of a full-blown programming language.

rdtsc · on June 2, 2014

Another way to look at it though is the author wouldn't mind using either YAML or XML if Ansible scripts were fully declarative. Once the need for if conditions and for loops comes about, any configuration file language (or data transport) language is going to be a little awkward.

yen223 · on June 2, 2014

"Once the need for if conditions and for loops comes about, any configuration file language (or data transport) language is going to be a little awkward."

Occasionally, it would spawn a whole new language: http://www.lua.org/

lost-theory · on June 2, 2014

This wraps up my feelings about CM tools very well. They all seem to follow this "language agnostic" / "it's not really programming!" model, whether it's chef's ruby DSL or ansible's YAML. I don't understand what's so special about this field that it requires a totally different paradigm.

In chef's case, I would rather write simple classes & functions in plain ruby than use the DSL and write "Lightweight Resource Providers".

In ansible's case, I even did a little experiment to show what it would look like to use & call ansible modules as normal python code:

https://github.com/lost-theory/ansible/blob/module_experimen...

https://groups.google.com/d/topic/ansible-project/Zv3Veo77gP...

I would love to see a CM tool that embraced plain python or plain ruby and not try to go the pseudo-"declarative" route, which ends up being IMO too constraining for not much benefit and needlessly reinventing a bunch of wheels (e.g. looping & conditionals, as the article mentions).

stonith · on June 2, 2014

I think the intention is to appeal to traditional sysadmins with limited development experience, who don't know ruby, and more importantly don't want to know ruby.

I haven't worked with Chef so I can't comment, but I work extensively on the puppet-openstack project and my experience is that this aversion to actual languages is part of a pipe dream. The only people who are able to work with the modules in any meaningful way are the ones that understand the Puppet DSL and runtime in its entirety, and it's not much simpler than the Ruby language it's based on.

Honestly I don't know what the solution is: part of the DevOps movement is introducing development practices to the management of systems, but that brings with it an inherent requirement that the operators actually embrace the development practices. A quick example of where this can easily fall down is if you maintain a puppet environment of any complexity, you'll end up keeping both the data inputs and the modules in git repos, but if your sysadmins aren't comfortable with git, the adoption simply won't happen.

stormbrew · on June 2, 2014

> A quick example of where this can easily fall down is if you maintain a puppet environment of any complexity, you'll end up keeping both the data inputs and the modules in git repos, but if your sysadmins aren't comfortable with git, the adoption simply won't happen.

As someone who is traditionally more dev side, but has done a lot of casual system administration, this part of things like chef and puppet drives me nuts. I don't want to version my infrastructure in git and then throw it, with no versioning context, up to a server and have it version it again in its own unique, quirky, and honestly haphazard way.

krakensden · on June 2, 2014

Chef's DSL is particularly egregious, since if you step out of their slightly bizarre subset your code runs on the machine running knife. Newbies tend to get really confused.

sanderjd · on June 2, 2014

I used chef for years without getting past the really confused stage.

bryanlarsen · on June 2, 2014

Of course, Ansible doesn't really use YAML. It uses a YAML parser, but each individual command goes through quite a bit of parsing inside ansible as well. An ansible command like this:

   - command: chdir=/foo echo {{ bar }}
     when: bar is defined

Really parses down to something like:

    - action:
      - module: command
      - arguments:
         - chdir: /foo
         - command: echo {{ bar }}
      - when: bar is defined

More parsing is done by ansible than is done by YAML. I'm not necessarily saying it's bad, but it definitely leads to some gotchas and weirdnesses.
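
To give a feel for that second layer of parsing, here is a toy sketch in plain Python. It is not Ansible's actual parser: `split_task_args` is a hypothetical helper that ignores quoting and every other edge case, and only shows how a single YAML string can be split into options and a free-form command.

```python
def split_task_args(raw):
    """Split 'k=v ... free-form command' into (options, command).

    Leading tokens that look like key=value become options; everything
    from the first other token onward is kept as the command text.
    """
    tokens = raw.split()
    options = {}
    index = 0
    while index < len(tokens) and "=" in tokens[index]:
        key, value = tokens[index].split("=", 1)
        options[key] = value
        index += 1
    return options, " ".join(tokens[index:])


options, command = split_task_args("chdir=/foo echo {{ bar }}")
print(options)  # {'chdir': '/foo'}
print(command)  # echo {{ bar }}
```

As far as YAML is concerned, the whole right-hand side is just one string; the structure only appears after a step like this.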

How it mixes in Jinja2 is also strange. In the above example, there are two expressions: {{ bar }} and 'bar is defined'. Normally if you paired a templating language with a text format, you would expect the templating language to apply to the whole file. Ansible applies it afterwards, and selectively. That's the right thing to do, but it does make it confusing.

Many people choose ansible because they don't want to learn Ruby as well as a CM tool; ansible isn't much simpler. I'm not a huge fan of ansible, it's just better than everything else I've encountered so far.

tdicola · on June 2, 2014

I've been using Ansible a bit recently (and _really_ enjoying it) and agree that YAML can be a little wonky at times, like having to escape any template line with :{ in quotes because YAML interprets it as a dictionary. However I don't think Ansible wants you to code all your logic in YAML. You can easily write a plugin with Python and have all the proper syntax and capabilities of a programming language.

Where Ansible really shines is giving you a huge library of quality plugins/components and a simple YAML-based DSL for composing those plugins. Want to spin up an EC2 instance, copy over your code, and ensure a service is started? No problem, it's a 4 or 5 line Ansible script. When you want to do something more advanced, look at the extension points Ansible provides to plug in your own code: http://docs.ansible.com/developing_plugins.html

doxcf434 · on June 2, 2014

Right, if you need more complex logic, then you write a module in a normal programming language. The YAML is intended to keep the intention and orchestration logic simple and readable.

xpe · on June 2, 2014

Does this work well in practice? I'm new to Ansible.

walrus · on June 2, 2014

It works okay. One shortcoming, as mentioned at https://news.ycombinator.com/item?id=7831680, is that there is no clear model for state transitions. Let's say that on day 1, you want to use Apache:

  - apt: name=apache2 state=present

Then, on day 2, you realize that Apache isn't hip, so you switch to nginx instead:

  - apt: name=apache2 state=absent
  - apt: name=nginx state=present

It works, but you're stuck with entries in your playbook that are there for purely historical reasons.
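
For contrast, here is a minimal sketch of what a tool with a model of state transitions might compute, in plain Python. It is hypothetical and not something Ansible does: `plan_transition` and the idea of remembering the previous desired state are assumptions made for this example.

```python
def plan_transition(previous, desired):
    """Return (to_remove, to_install) given two sets of package names."""
    to_remove = sorted(previous - desired)
    to_install = sorted(desired - previous)
    return to_remove, to_install


day1 = {"apache2"}
day2 = {"nginx"}

print(plan_transition(set(), day1))  # ([], ['apache2'])
print(plan_transition(day1, day2))   # (['apache2'], ['nginx'])
```

With something like this, the day 2 description would only need to mention nginx; the removal of apache2 falls out of the diff instead of living in the playbook forever.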

doxcf434 · on June 2, 2014

It depends on your deployment model to a degree. In the case where you build a new image for your web server on every deployment, then this isn't an issue. For longer running systems, like databases, it would still be an annoyance.

I think that Ansible is still a transitional glue tech though, and projects like flynn.io may move things forward by making the CM part of the infrastructure a developer task vs. a devops one.

crdoconnor · on June 2, 2014

Works very well for me. I have some pretty complex configurations and they're all done declaratively with a bit of template logic. I think it's pretty rare to need a custom module (not that it's hard to write one).

Ansible is 100% intended for configuration management. Configuration is by its very nature declarative, so in any case if you were doing a lot of custom procedural code for configuration you're probably doing something a bit wrong.

akoumjian · on June 2, 2014

One of the things I like about Saltstack is that you can build your formulas using any renderers you like. You can try to remain strictly declarative in YAML, or if you like build your states completely dynamically in Python or a Python-DSL. The default uses an in-between that takes YAML but lets you template that YAML with Jinja.

[http://docs.saltstack.com/en/latest/ref/renderers/]

mavelikara · on June 2, 2014

Build tools also had a similar evolution. Ant's XML syntax first started out as all declarative, but then later on had to add more control structures (ant-contrib, IIRC). The currently popular build tools - Gradle, Rake, Grunt etc - all use a dynamically typed full-blown programming language for specifying tasks and their dependencies.

xpe · on June 2, 2014

As another example, Leiningen is largely declarative, but since Clojure code is also data, it is possible to insert code alongside. See line 211 in the sample project.clj.

https://github.com/technomancy/leiningen/blob/stable/sample....

crdoconnor · on June 2, 2014

Ant made the mistake of becoming Turing-complete, which meant all sorts of screwed up code got written in horrible XML.

To be fair, ansible allows the same sort of behavior with command, letting you turn your playbook into one long overly verbose bash script.

I have yet to see any code abortions written that way on a par with what I've seen with Ant or Maven, though.

benatkin · on June 2, 2014

I find Ansible's YAML to be code shoehorned into YAML, and I think it's ambiguous whether, under the GPL (the license itself, not AnsibleWorks' interpretation of it), that YAML is subject to the GPL. It seems to be class declaration, iteration, and function calls (to GPLed functions) to me.

crdoconnor · on June 2, 2014

The part of ansible that really kind of bugs me isn't the YAML, it's the default JSON output of commands.

I ended up integrating this into all my projects:

http://blog.cliffano.com/2014/04/06/human-readable-ansible-p...

Which fixes the problem (sort of), but I really don't understand why the default doesn't look something like that.

It's a real pain parsing through some JSON with a ton of \n...\n...\n...\n's in order to find the source of the errors.

rmrfrmrf · on June 2, 2014

I'd say leave the programming languages to modules and compose them together with a data language. Forced simplicity for configuration is a good thing.

xpe · on June 2, 2014

You'd prefer to allow composability in the middle and bottom layers (the modules) but have a different language on top?

I have to admit that I'm a fan of turtles -- the same kind of turtles -- all the way down. Many of the abstractions I see in configuration management tools seem to be fancy names for modules. I don't see why a role, a playbook, a play, and a task really need to be different things at all. Why is the hierarchy necessary?

rmrfrmrf · on June 2, 2014

> You'd prefer to allow composability in the middle and bottom layers (the modules) but have a different language on top?

Absolutely. IMO, the top layer should be a summary of what the configuration manager will be doing and should have the project-specific data in it. The rest should be reusable modules. When you let a full programming language into the top layer, you inevitably get people trying to put smaller routines in the top level itself mixed in with reusable modules, which results in bloated spaghetti.

crdoconnor · on June 2, 2014

>Why is the hierarchy necessary?

To make it easy to extract out reusable code in a consistent way.

They aren't all necessary, though. If you have a very simple playbook it can go all in one file.

Sort of like how python has classes (which correspond to roles), but you're not forced to use them.

jimbokun · on June 2, 2014

Any Java programmers here?

Thank God IntelliJ treats Spring XML as Java code, because so much logic and conditionals and variables live in those files. The actual program can be very different than what you would expect only looking at the Java files.

It pains me to see other tools make the same foolish mistake, even if I don't happen to use them.

mwsherman · on June 2, 2014

I suspect that Web Components will evolve this way as well, declarative language used imperatively. http://clipperhouse.com/2014/03/31/web-components-have-we-no...

_3u10 · on June 2, 2014

Yeah, why not use a language like bash that we use every day for config management instead of reinventing the wheel in YAML?

This is pretty much why I love fucking shell scripts. http://fuckingshellscripts.org/

Sae5waip · on June 2, 2014

Because modern configuration management tools like Ansible (or Puppet, or Salt, or whatever) allow you to do more things more easily.

Also, because Bash is a particularly horrible language. I have a lot of experience writing bash scripts, and I hate bash.

shiven · on June 2, 2014

Please! Not that horrible, horrid poop called XML. Keep it away from my beloved deployment/management tool. The world has plenty of other bloated packages that will gladly not care if they are bogged down with XML, just leave ansible alone!

I have a visceral hatred, a la Erik Naggum, towards XML. His arguments against XML capture a lot of my feelings, so I won't rehash them here.