Ansible Simply Kicks Ass (devo.ps)
185 points by hunvreus on July 3, 2013 | 153 comments

> Doing this with Chef would probably mean chasing down a knife plugin for adding Linode support, and would simply require a full Chef stack (say hello to RabbitMQ, Solr, CouchDB and a gazillion smaller dependencies)

It is throwaway lines like that where you really need to be careful since, no, you don't need RabbitMQ, Solr, CouchDB, etc. You can just use chef-solo, which can also be installed with a one-liner (albeit one that runs a remote bash script).

When comparing two products (especially in an obviously biased manner) you need to make sure you are 100% correct. Otherwise you weaken your case and comments like this one turn up.

The thing with one liners is that it is cool when they work, hell when they don't. I'm looking at replacing Puppet (because it's orders of magnitude too slow). Chef was a viable candidate, up until I tried installing it on two pristine installs (Scientific Linux 6 and, when it failed, Ubuntu Server). The instructions seem simple enough:


but when you run:

  sudo chef-server-ctl reconfigure
Chef goes into a frenzy of bootstrapping itself, installing a huge stack of dependencies; it takes ages, only to fail in the end with an obscure message about one of its dependencies failing to start (RabbitMQ, if I recall correctly).

Can I work around the failure? Sure! But what does it tell you about a piece of software that its installer fails on two pristine, different, and common Linux distributions? To me, it says that it is over-complex and poorly engineered. So, from a different perspective, I really identify with the "gazillion smaller dependencies" comment.

It helps me little that chef-solo is simple. I don't want chef-solo. I want a full, distributed config management solution that is dependable (which probably requires it to be lean).

My next candidate is Salt, and so far it looks good.

Be aware that Salt, for reasons passing understanding, implemented their own encrypted channel (instead of TLS) as an alternative to using SSH, and suffered a grievous vulnerability as a result. I'm not saying don't use Salt (I don't know much about it), but I would recommend using SSH with it, and not its custom transport.

TLS is not possible over ZeroMQ by design. There is work to build encryption directly into ZeroMQ, but it isn't there yet.

(Disclaimer: I used to contribute a lot of code to salt and still do when I have the time)

Why wouldn't it be possible to run zeromq over a TLS-encrypted TCP stream? It looks like people have already built that.

I had a look at how Ansible does 0mq encryption the other day, they do key exchange over SSH and then use keyczar to encrypt data over the socket, so that doesn't look too bad. I didn't look at the key exchange or review things in depth, but, assuming keyczar.Encrypt() does the right thing, it shouldn't be too bad.

Does Salt offer all the benefits of Chef server? To my knowledge, the former has no search engine nor any way to return feedback from the managed hosts. Shared storage is absolutely essential for building things like ssh_known_hosts files and Nagios configurations.

You can do that with Salt via the Salt Mine: http://docs.saltstack.com/topics/mine/index.html

I haven't used it myself, but I've been reading up on Salt quite a bit in hopes that I can replace Chef with it soon.
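From my reading of the Salt docs (untested, so treat this as a sketch; the function and targeting are illustrative), you declare mine functions in pillar or minion config, and then query the mine from a template:

```yaml
# Hypothetical minion config / pillar: publish each host's eth0 addresses
# to the mine so other minions' templates can read them.
mine_functions:
  network.ip_addrs:
    interface: eth0

# Then, in a Jinja template rendered on a minion, something like:
# {% for host, addrs in salt['mine.get']('*', 'network.ip_addrs').items() %}
# {{ host }} {{ addrs[0] }}
# {% endfor %}
```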

Ansible can trade data between machines using 'hostvars' as a variable, so you don't even need the analog of Puppet storeconfigs to assemble facts. (Sorry, I don't know the Chef term.)

Stop by the list if you'd like more info, but it's easy to generate nagios configs from templates this way.

Who stores hostvars? Does Ansible provide a way to extract a subset of them based on other hostvars? (e.g. "give me all the ssh host keys of my hosts in the intersection of datacenter X and hostgroup Y")

Ansible doesn't store them, but exchanges them in memory. (The assumption is you'd want them to be current anyway). We are going to add some optional caching (likely just requiring sqlite as it's all we need) in a future release.

But yes, it's easy to do that. Assuming you meant a template, you could do:

{% for host in groups["datacenterX"] %} {% if host in groups["Y"] %} {{ hostvars[host]["ssh_host_key_rsa_public"] }} {% endif %} {% endfor %}

Ansible is also good at carving up groups based on these things, like if you just wanted to talk to those hosts:

hosts: datacenterX:&Y

note that Hacker News ate my newlines.
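For context, here's roughly how that targeting pattern would sit in a play (group names are the hypothetical ones from the example above):

```yaml
# Hypothetical play: run only against hosts that are in BOTH the
# datacenterX group and the Y group.
- hosts: datacenterX:&Y
  tasks:
    - name: check connectivity to the intersection of the two groups
      ping:
```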


> note that Hacker News ate my newlines.

If you want to edit the post to include them, indent by at least four spaces the text for which newlines are important. Maybe like this:

    {% for host in groups["datacenterX"] %}
        {% if host in groups["Y"] %}
            {{ hostvars[host]["ssh_host_key_rsa_public"] }}
        {% endif %}
    {% endfor %}
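To make the thread concrete, here's roughly how such a template might be wired up. The filenames, the bastion group, and the ssh_host_key_rsa_public variable are assumptions carried over from the example, not anything Ansible ships with:

```yaml
# Hypothetical playbook: gather facts across all hosts first so their
# variables land in hostvars, then render the template onto one host.
- hosts: all
  gather_facts: yes

- hosts: bastion
  tasks:
    - name: assemble ssh_known_hosts from the other hosts' variables
      template: src=known_hosts.j2 dest=/etc/ssh/ssh_known_hosts
```

Inside known_hosts.j2, the loop body would reach through hostvars[host] to pull each host's key, as in the snippet above.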

> Ansible doesn't store them, but exchanges them in memory (The assumption is you'd want them to be current anyway).

What happens if the server crashes? Can a subsequent run before all the data is repopulated cause the derived files to have invalid or missing data?

It's saving them in memory now so there are no files to corrupt.

When you run again it gathers all the facts from the hosts.

I'm in no way proficient in Salt. It does have shared storage and the ability to go through other nodes' config data, though.

This video is a nice introduction: http://m.youtube.com/watch?v=jJJ8cfDjcTc

I haven't had trouble with installing Chef 11 server on Ubuntu.

As for installing a huge stack of dependencies: if we're talking about Chef 11, there's nothing to install. All of it is embedded in the omnibus package; the installer is just configuring them.

It's too bad about the RabbitMQ. Have you submitted a ticket?

I'm curious - for what do you use SL? I had tried it at v4 and it did not seem superior to a 'standard' Fedora install at that time, even for scientific analyses (ROOT and GEANT were no different to install IIRC and there wasn't the same healthy package system that comes with Fedora). Has it improved in some way since then?

It's an historic left-over. My company uses CentOS, and back in the days of the transition to v6, CentOS was delayed. SL6 came out at the right time and we switched.

SL is basically the same as CentOS: a community version of RHEL.

That's exactly why I'm on SL6. The delay, and better maintenance prospects with SL than CentOS as Fermilab is committed to the project.

(I don't think we work at the same place; there aren't any likely Sergios in the corporate directory.)

From the docs:

chef-solo is a limited-functionality version of Chef and does not support the following:

* Node data storage

* Search indexes

* Centralized distribution of cookbooks

* Environments, including policy settings and cookbook versions

* A centralized API that interacts with and integrates infrastructure components

* Authentication or authorization

* Persistent attributes

I love how the first result when you google "chef-solo" is a list of all the things that it won't do for you, all of which are stated in jargon that a beginner cannot understand.

It might be more direct to replace that page with a single sentence: "chef-solo: We're Not Interested And You Shouldn't Be Either."

Or, might we consider compiling a friendly list of the great things chef-solo could do for you? I was going to work on that, but now I'm tinkering with Ansible instead.

I don't know what other people think, but for me chef-solo is not even a product on its own. If you use Chef but need to either bootstrap the server or prepare some system image, then a stripped-down / hacked Chef is what you're looking for. And that's what chef-solo does.

But maybe it's more of a standalone tool for others?

I use chef-solo with Vagrant, I rarely use chef-server/client with Vagrant. My dev box rarely needs to separate out environments, for example.
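For reference, the Vagrant chef-solo provisioner setup is just a few lines; the box name, cookbook path, and recipe here are placeholders for whatever your project uses:

```ruby
# Hypothetical Vagrantfile fragment using the chef-solo provisioner.
Vagrant.configure("2") do |config|
  config.vm.box = "precise64"
  config.vm.provision :chef_solo do |chef|
    chef.cookbooks_path = "cookbooks"
    chef.add_recipe "myapp::default"
  end
end
```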

You do not need to bootstrap Chef 11. You download a single omnibus package and run a single line to install it. All dependencies are already embedded into the package. Under the covers, it bootstraps itself, but from the end user's perspective, It Just Works.

Came here to say exactly this. chef-solo _is_ good, particularly combined with things like Vagrant and Berkshelf

That's mostly just sales talk. The reality is that chef-solo works just fine for a lot of use cases, and that set of use cases has significant overlap with the use cases for ansible.

If a tree falls in the forest, but that tree is not mentioned in the documentation – indeed, if the documentation says something like "there's this forest, we used to live there, it's dirty and primitive and stuff, now all the cool kids live in Manhattan"...

The point is, the beginners never find the tree. They get lost, kind of like I've gotten lost in this metaphor, and they go try something else.

I would agree (if I understand the metaphor within a metaphor, that is) that documentation and best practices are probably the biggest weakness in the Chef community. Once you add in tools like berkshelf or librarian-chef where the tool creators recommend totally different workflows from those recommended by Opscode, not to mention the wide variety of testing stories... it's a mess. I've heard from someone at Opscode that fixing that mess is one of their priorities, but until then, it's definitely a problem.

Where we really need to be is a place where configuration management is just another part of application repositories (there's a parallel here to db migrations), so that web applications are completely self-contained. That includes versioning, unit testing, etc. I haven't yet seen any of the configuration management tools come forward with a solid unified best practice solution for treating applications this way, but there are a lot of people working in that direction, which makes me very hopeful for the near future.

There is a project called "littlechef" which allows you to use some of chef-server's features without a Chef server.


It doesn't seem to include node storage, so, for example, you'll need to hard-code a MySQL password, because if it generates a random one, you won't be able to retrieve it.

(This may have changed since last time I tried it.)

I should point out that Ansible can store passwords on the filesystem here, using the 'password' lookup plugin. So while there is no database, you can still easily access things exactly like this.
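A sketch of what that looks like in a task; the path under the lookup and the module parameters are illustrative, not anything prescribed:

```yaml
# Hypothetical task: the 'password' lookup generates a random password on
# first use and stores it at the given path on the control machine, so
# later runs read back the same value instead of generating a new one.
- name: create a database user with a generated, persisted password
  mysql_user: name=appuser password={{ lookup('password', 'credentials/appuser length=15') }}
```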

To be fair, I don't think Ansible has any of those, either.

We're also adding some neat things around authentication and access control in a forthcoming product release, if you do decide you want those kinds of "serverey" things. See here: http://www.ansibleworks.com/ansibleworks-awx/

But yes, you don't need these, Ansible can exchange variables between hosts in memory without needing a database, etc.

Can I ask why you chose GPL for Ansible? This makes it more challenging to integrate or embed Ansible in many types of commercial or non-GPL projects.

Chef [1] and Puppet [2] both chose the Apache license. Salt did as well.

If you were worried about competitors offering hosted versions of Ansible, I could understand going AGPL, but not GPL3.


[1]: http://www.opscode.com/blog/2009/08/11/why-we-chose-the-apac... [2]: https://puppetlabs.com/apache/

I come from Red Hat and I like the GPL, because it ensures the software remains free. I view GPLv3 as largely a language improvement on GPLv2, which makes it easier to understand and define how it applies.

Ansible being GPLv3 doesn't affect folks much. Your library modules can be licensed in any way you like, and it's fine to shell out and call ansible-playbook. AWX also exists as a nice API layer if you need a web services interface that abstracts you from the license question, so folks looking to do commercial integrations can tap into the REST layer.

AGPL is more restrictive than GPL, in some ways, as you can't use things in a hosted service. We didn't want to be that restrictive.

While the AGPL certainly is more restrictive (that is the whole point of the AGPL, after all), it does not, of course, prevent use in a hosted service; it just requires that the end user still has access to the source code (while with the GPL you could change large parts of the free code and never contribute that back to the users, if you so chose).

As long as there is a REST bridge/layer, any service shouldn't be considered a derived work -- only changes to "core" Ansible would be "covered" by the AGPL.

I'm sure you know this, but it is an important point (just like the GPL never dictates that you have to contribute changes upstream, just to "your" users (which of course might include upstream, and most people feel it is more constructive to give back in the more general sense...)).

I've seen some strange (imnho incorrect) interpretations of the AGPL, hence this comment.

Right! It would prevent people from being able to use it in hosted services that didn't share the source, and we wanted to make sure people could easily use it in those things without any problems.

Ansible is licensed as GPLv3. But it sounds like AnsibleWorks-awx is a commercial-only product? Are future 'pro' features going to only be released and maintained in commercial variants of AnsibleWorks or separate related projects? Would love more transparency about the relationship between open and closed-source products at AnsibleWorks.

I made a similar clarifying request on the Ansible user group forum a few weeks ago but my message was not approved. I believe that these are fair requests and deserve clarification.

We've posted quite a lot about this to the list in the past, I'm not sure what happened to your post.

Ansible is GPL, we're totally going to keep it just the way it is, and we'll continue to make it more awesome. (The application just invokes ansible-playbook.)

Why is it commercial? A lot of the stuff we are building is mainly interesting to enterprises -- RBAC, reporting, things like that.

The core things everybody needs to use Ansible today, all the modules, are free software and we will continue to add loads more there. Ansible received over 50 new modules in the last 2 releases.

K cool. Thanks for clarifying. Keep up the good work. :-)

How is that a fair request? Ansible is a GPL3 Python codebase. If the community doesn't like where AnsibleWorks takes it, they'll fork it.

For that list of features, you either don't need it because of how Ansible works, or there's an inventory plugin for it.

FWIW, environments should be in the next release.


Chef 11 doesn't use CouchDB, though it does use those other dependencies. However, you do not go and install those dependencies yourself; you install a single package that includes all of them.

Chef client does not require RabbitMQ, Solr, or CouchDB. It too has an omnibus package that embeds all the dependencies and isolates them from the system.

Both server and client are now one-line installs as well.

True, though chef-solo is a local-only solution. Ansible manages remote machines without that kind of setup, and can start managing them without installing anything on them.

Install chef-solo on server.

scp locally dev'd (with vagrant ideally) cookbook to server

Run cookbook

3 steps that can easily be wrapped in a little script (which I know a large company does because I saw their presentation about it on confreaks. Sorry I cannot remember the name of the company or presentation but it was a chef related one).

Still not exactly killing it in terms of complexity. I would avoid comparing ansible to chef-solo in that respect and focus on bits where ansible has a clear (IMHO) win.

Having said that I should say that I have not used ansible and am basing this on what I have read about it.

You don't even need to do that. You can type `knife solo prepare root@my-server` and it will install chef-solo on that machine. Then type `knife solo cook root@my-server` and you're good to go.

Well damn, I have been doing it wrong. That is awesome to know though.

I wonder if anyone has done a http://todomvc.com/ equivalent for cfengine, puppet, chef, salt, ansible etc.

Something like a simple webserver running Apache with mod_xsendfile, Passenger, Ruby 2.0.0, PostgreSQL, and a few firewall tweaks (why yes, I AM mainly a Ruby developer, why do you ask?)

Is there a way to have multiple "roles" while using those commands?

I mean, all my servers are 70% identical and share the same base, but the DB servers have a few different packages than the app servers, for example.

I would hate to have two different chef-solo installations that do basically the same thing 70% of the time.

I think -r allows you to select roles, IIRC. Not sure if it allows for multiple roles, but you can include roles within roles, so you can make a role called 'web' that includes 'db', 'webserver', etc.
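In Chef terms, that nesting is just a run_list entry referencing other roles. A sketch of such a role file, using the hypothetical role names from above:

```json
{
  "name": "web",
  "description": "Base plus the webserver bits",
  "run_list": [
    "role[base]",
    "role[db]",
    "role[webserver]"
  ]
}
```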

`knife solo bootstrap root@my-server` will run both prepare and cook commands.

Awesome! Thanks for that

> 3 steps that can easily be wrapped in a little script (which I know a large company does because I saw their presentation about it on confreaks. Sorry I cannot remember the name of the company or presentation but it was a chef related one).

I think you're thinking of Ben Rockwood from Joyent: http://www.confreaks.com/videos/1684-chefconf2012-chef-behin...

If I recall, he made a comment to the effect of "Chef solo is great for really small deployments and really big deployments."

(Replying to this comment as I can't reply to the other for some reason.) It is indeed! That'd be a brilliant idea actually. Sadly I only have a very minor amount of experience using Chef.

If you're interested though, I recently released our stack as a chef solo project at: https://github.com/FineIO/fine-io-chef

chef-solo does require Ruby > 1.8.7, RubyGems, and many gems, though. Their argument for Ansible is that it only needs basic Python, which is installed by default on most systems.

Python and Ruby are located at basically the exact same point in programming language space, so to me they're pretty interchangeable. Since these are Ops tools, the argument that anything is 'installed by default' scares me. Nothing should be installed by default; choosing what is installed and how it gets installed is one of the major responsibilities of the Ops team.

This is more a function of the fact that so many utilities written by Red Hat and Canonical folks use (and require) Python. And that's part of the reason Python is so popular among systems applications. One of the earlier starts to this was the system-config-* toolchain, and of course Anaconda (which isn't so much installed, but is a great example).

Another Ansible convert here. Our entire team uses it, we love it. We have a semi-unique situation where we have to do a lot of finagling to access some machines, don't always have control over what sits in front of them, and frequently have to comply with outside regulations. I won't go into too much detail, but Ansible is flexible enough and expandable enough to fit our needs. We were able to have Ansible reading from our custom inventory system in under a day, have expanded it to tie in with tons of internals, and all with very minimal effort. What would've taken weeks setting up with Puppet or Chef just falls into place. It's a beautiful thing. Highly recommend giving it a try, even if just to have a quick way of setting up your own dev machines.

> I won't go into too much detail

Actually, can you? I don't mean can you give me an exact lay of the land, but what kinds of problems in this arena are you facing? I presently work in gov't contracting and so far you've defined my exact deployment scenario. That is, infrastructure architecture requirements (hardware/network config) are ever changing, and won't stabilize for some time.

If you can't be more specific as to your problems, maybe you could at least let me know what about Ansible has made this easier to deal with?

The only weak point Ansible has is that its documentation doesn't show enough real-life scenarios. Fortunately, it's really easy to grok and extend when you see a script, so I wrote this to help:


I agree, we need to fold more playbook examples into the docs for sure.

Folks might be interested in checking out http://github.com/ansible/ansible-examples for some full-stack use cases showing playbooks that we do have.

I've just picked up ansible, and one thing I found lacking was an example ansible repository with a standard directory layout and config documented on the site or on github. For example, it's not obvious from the docs that a nice way to set things up is:


Have you seen http://www.ansibleworks.com/docs/bestpractices.html ?

Just noticed these don't mention "vars" in the roles directory, do need to fix that :)

I hadn't, doh! Thanks. Might be worth adding a link to that at start of the docs, because I found I was hesitant to start writing yamls without knowing the "right" structure for them.

Oh, thanks, that wasn't there when I started using Ansible. I'll take a look at those, they look very well structured, and very reusable.

These sound awesome. Any pointers on how to modify these for Ubuntu machines?

This is definitely cool, but the real hard problem isn't in these simple, easily scripted cases. The real hard-to-solve problem is managing all the complexity of similar-but-different hosts.

This article could just as easily have taken the complete opposite view of Ansible by saying things like "parallel ssh sessions don't scale, strong encryption costs too much CPU time; push can never work reliably therefore pull is the only viable model; etc. etc."

I feel one of Ansible's strongest points to champion is its low barrier to entry. It takes minimal understanding to get going; compare that with at least a month hands-on with CFEngine, or perhaps two weeks with Puppet, before you would consider yourself proficient. With Ansible it's 20 minutes or so.

We've got quite a few users managing hundreds and thousands of hosts, so I'm not seeing these kinds of complaints. If there were any, I'd feel it, but we don't :)

One of the things many people want to do is rolling updates too, and Ansible is remarkably good at them, having a language that is really good for talking to load balancers and monitoring and saying, "of these 500 servers, update 50 at a time, and keep my uptime". Folks like AppDynamics are using this to update all of their infrastructure every 15 minutes, continuously, and it's pretty cool stuff.

For those folks that do want to do the 'facebook scale' stuff, ansible-pull is a really good option. One of the features in our upcoming product is a nice callback that enables this while still preserving centralized reporting.

Happy to have the conversation, but I've definitely never heard the CPU time complaint. I think the one thing we see is that a lot of users are happy that Ansible is not running when it is not managing anything, rather than having daemons sucking CPU/RAM/etc., and folks are actually getting a little better performance from avoiding thundering-herd agent problems.

I just did some consulting on helping another team improve their hadoop cluster performance and the first thing I noticed is that all 40 boxes in the cluster were burning a CPU core with a puppet agent process that was running at 100% CPU for months.

That's one of the nicer things about the no-agent setup: when Ansible is not managing something, there is nothing eating CPU or RAM, and you don't have a problem with your agents falling over (sshd is rock solid), so you get out of the 'managing the management' problem as well as the 'management is affecting my workload performance' problem.

In particular with virtualization, running a heavy agent on every instance can add up (reports of the Ruby virtual machine eating 400MB occur occasionally).

How does Ansible effectively scale to thousands of hosts using ssh? My experience is that you can only run a few hundred ssh sessions at a time with reasonable performance, and that's on beefy hardware to begin with.

Several different options.

Many folks are actually not doing repeated config management every 30 minutes. Though I realize that may be heresy to some Chef/Puppet shops, there's also a school of thought that changes should be intentional. So there is often a difference in workflow.

LOTS of folks are doing rolling updates, because rolling updates are useful.

Many folks are also using ansible in pull mode.

You could also set up multiple 'workers' to push out change, using something like "--limit" to target explicit groups from different machines.

If you feed Ansible --forks 50, it's going to talk to 50 hosts at a time and then move on to the next 50 (it uses Python's multiprocessing module). If you also set "serial: 50", that's a rolling update of 50 nodes at a time, ensuring uptime on your cluster since you don't take all 1000 nodes down at once.
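A minimal sketch of that rolling-update shape; the group name and the task are placeholders:

```yaml
# Hypothetical playbook: with "serial: 50", Ansible fully processes 50
# hosts at a time before moving on to the next batch of 50.
- hosts: webservers
  serial: 50
  tasks:
    - name: roll out the new application package
      yum: name=myapp state=latest
```

Run with something like `ansible-playbook rolling.yml --forks 50` so each batch is also contacted in parallel.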

This is really more of a push-versus-pull architecture thing, while it presents some tradeoffs, it's also the exact mechanism that allows the rolling update support and ability to base the result of one task on another to work so well.

Ansible also has a 'fireball' mode which uses SSH for the initial connection for key exchange and then encrypts the rest of the traffic. It's a short-lived daemon that doesn't stay running, so when it is done, it just expires.

> Many folks are actually not doing repeated config management every 30 minutes, though I realize that may be heresy to some Chef/Puppet shops, there's also a school of thought that changes should be intentional. So there is often a difference in workflow.

I think this is a false dichotomy. Those who believe runs should be performed frequently often implement this to revert manual changes performed by people operating contrary to site policy.

Not so sure; I've heard that quite a few times. The use case of rack-and-do-not-need-to-reconfigure-until-I-want-to-change-something seems quite common, but I suspect it's often in better-organized ops teams where you don't have dozens of different people logging in and not following the process. There is of course --check mode in Ansible for testing whether changes need to be made, as is common in these types of systems. Thankfully, both approaches work, and you can definitely still set things up on cron/jenkins/etc. as you like, if you want.

You can actually manage a deployment methodology this way (update-x-number-of-hosts kinds of things) pretty handily using MCollective with Puppet. You can even script your own plugins for it to do basically whatever orchestration flow you want. Pretty cool toolkit, and I use it myself.

I keep hearing these arguments about scale. I haven't ever experienced such issues. Now yes, if you're running Facebook infrastructure you may run into issues. Though honestly, who here manages more than a few hundred hosts? I can't find the post that was making the point about encryption cost and parallel SSH sessions, but it'd be helpful to get actual hard numbers backing these claims, and to see how they compare to Puppet's or Chef's approach.

I agree about the complexity of similar-but-different hosts, and that's actually something we're set to solve with devo.ps (we'll see if we pull it off).

To some extent I agree: most will never scale, so why worry about it? X times out of Y you'll come out ahead by just ploughing on and not considering the future. Even if we are thinking up front, we can rationalise that it's just a gamble, often one worth taking!

It definitely doesn't take Facebook sized infra to outgrow a technology though. What if the gamble doesn't pay off? What if you planned to scale the central NFS server dependency by just adding an extra NFS server but have now found there's no rack space left / no budget / a purchasing delay / insufficient network capacity / cooling capacity / power capacity / a.n. other unplanned problem.

For the SSH question, I couldn't reliably get more than 250 concurrent outbound connections from one circa-2008 blade server. From memory, that would have had a dual-core CPU, maybe around 2.4GHz, with 8GB RAM using PAE, as it ran a 32-bit kernel (our spec; the cores would have been 64-bit). The blades were multiply connected at chassis level over Myrinet fabric in one DC and InfiniBand in another, and the resource being exhausted was CPU.

These days all the blade servers are gone but we see an absolute explosion of virtual machines so it's a similar and still relevant problem in many ways.

It's not a matter of taking the gamble that you'll stay small. It's that there are different kinds of big; only one time in a billion is there a single player (Google, Facebook, Amazon) that reaches an insane scale. By then, you'll have the resources to solve whatever issue you may hit.

As Michael (creator of the project) said here: it scales just fine even with serious players (thousands of instances). He has actual hard data and concrete use cases to back it up.

So far, all I've heard from detractors is the "OMG Chef is so hardcore Facebook uses it". Well, they use some. They'd probably do just fine with Ansible instead, provided they were putting half the brain power they put into Chef to make it work for their infrastructure.

You may get very big, just likely not Facebook big.

It is worth pointing out that Ansible does not need concurrent connections to each server to manage each machine, you can address as large of groups as you want and control parallelism with the --forks parameter.

I really don't intend this as a brag, but I just wanted to point out that I learned Puppet yesterday. No, I mean, I really learned it yesterday. I started with my company's vagrant-managed VM and took their existing puppet architecture, and armed with that I learned (from no previous experience) how to write Puppet manifests, modules, 'define's, etc.

My only point is that it doesn't take two weeks to learn Puppet. I'm not saying that Ansible is worse or anything like that, rather I just wanted to contribute another data point.

I think the issue is not so much how long it takes someone to learn a tool, but the repeated cost you incur from using it on a day-to-day basis. (I'd still be super impressed if you had storeconfigs and the spaceship operator nailed in a day!)

For instance, I worked for a major computer vendor doing an OpenStack deployment, and watched a simple deployment there suck up 20 developers for six months, where all of that time was in writing automation content.

Repeated hammering out of dependency ordering issues, coupled with the non-fail-fast behavior, and having to trace down where variables came from turned us into automation tool jockeys, so we couldn't focus on architecture and development. The project barely had deployments extending beyond 5 nodes in the end from all of the complexity.

Ansible already existed at this point, but it provided major fuel for me doubling down efforts on expanding it. The goal here is not just the basic language primitives, but making it really easy to find things in a large deployment, and really easy to skim/audit even if you aren't a really smart programmer.

That all being said, Puppet deserves major credit for pioneering a lot of concepts and revising CFEngine.

Ansible aims to be a cleaner config tool, but it also focuses on application deployment and higher-level orchestration on top, so you get some capabilities not found in those other chains (like really slick rolling update support).

Thanks for the helpful comment and also for Ansible in general! I'll take a good, close look at it - you might find me in your IRC channel sometime soon as I poke around Ansible, trying to see if it might be a good idea to port things over.

I know we were looking at Boxen as a way to roll out environments to our dev machines, and we are hoping that our existing Puppet configs will work well with that effort (since Boxen uses Puppet). Do you think it's at all possible that there will be some sort of adapter to allow Boxen to use Ansible? I have no idea if that would be a good idea or not (I haven't looked into either Boxen or Ansible enough yet), but that's the sort of thing that would likely help steer our decision process.

Sure thing -- Jesse Keating has been working on a side project called 'ansibox', which is effectively about taking boxen like ideas and applying them to Ansible. He's 'jlk' in #ansible and you should stop by and say hey. It's new, but it has the same kind of 'choose what you want and we'll make it for you' kind of workflow.

You're tinkering with an existing, working architecture and a company-maintained Vagrant VM?

Of course it's going well. But you should make sure to buy all the people who built all these well-engineered things a tasty drink at the next company get-together, because rest assured: Not every Puppet setup is easy to tinker with.

I absolutely do, and having a working environment is a huge help! However I found the puppet docs to be more than sufficient, and in fact I spent the day learning puppet not through our existing code (which actually needs a fair bit of work still) but rather using the learning VM that Puppet provides on their website.

I, on the other hand, tried to learn Puppet over the course of a weekend with a brand-new laptop and Boxen just after it was released. There is a nightmare of broken dependencies that still aren't resolved months later (librarian-puppet won't install apache because apache's version has a -rc in it, which isn't a valid Ruby version, blah blah blah...)

Now I know that using VMs/Vagrant is critical to sane server-orchestration development workflows, though, so there's that.

Interesting! We do use librarian-puppet, but we do not use apache. It's possible that we just managed to step around that land mine essentially by accident. Did you run into any other issues?

Used Chef before. In current gig, it's just me managing stuff. I picked Ansible over Chef and Puppet because of its low (really, no) barrier to entry and its simplicity. Also, the ability to run ad-hoc commands means I didn't have to use another tool for that.

In the year since, Ansible has exploded in capability. It isn't quite as low-barrier anymore (there's more to the config language), but it's still gobs lower than the others, while adding much sophistication.

I wouldn't even consider Puppet or Chef anymore. Salt or Ansible. Although, Python guy, so biased more than a little.

Unfortunately, we were forced to toss out Ansible immediately in our tool selection process because it did not support Windows OS clients without Cygwin.

Also, none of the "VM lifecycle management" tools we needed like Foreman or its commercial equivalents had any integrations with Ansible.

These concerns left Ansible dead on arrival for us. It sounds like a decent tool if it matches your needs.

(note: Ansible creator here)

Yeah, there's no Windows support now. If we do it, I'd want it to push PowerShell over native means. I'm still not 100% convinced it's something Windows admins want to do in large enough numbers to focus on it, but input is always good, and I'd be interested in hearing more about the use cases folks want.

As for lifecycle, Ansible has significantly less provisioning integration required than some of the other systems because all you need to do is get your SSH keys in there -- less bootstrapping. It also has a pluggable inventory system so you can get your list of hosts/variables from things like EC2 or OpenStack, so with things like cloud-init, that is usually what folks need. And there are some modules for spinning up new guests in various systems. (I'd like to see a vcenter inventory plugin too).

I'm also the original author of Cobbler and there's integration with that on the provisioning side, and for something else similar, it's largely the case of writing an inventory plugin and making sure the key is installed.

I'm glad to hear from the creator. That's one reason I love HN.

Our Windows folks are big Powershell fans here. You wouldn't find too many complaints with it. The biggest problem going with Powershell would be bootstrapping it into the environment if it's not already installed or if you depend upon a more recent version.

Since we're actively evaluating lifecycle management tools right now, this is our main concern. Many of the integrations to provisioning tools flow downhill from frontends which manage the state of many systems. We find ourselves choosing the best-integrated stack rather than evaluating provisioning software. The plumbing is very important, but a well-written lifecycle tool should make the copper pipes less visible.

I'm still not 100% convinced it's something Windows admins want to do in a large enough number to focus on it...

I think this is much more common now that windows in the cloud is quite easy. (My company seems to have as many windows instances in EC2 as linux.)

I do agree with your preference for powershell over cygwin.

I think there is significant demand for a good Windows solutions in this space. Chef and Puppet have Windows support, but I don't see much community momentum around it.

Powershell is the obvious choice, and, as of v3, I think it would be a viable transport for Ansible. If you or anyone on this thread wants to put some ideas on paper for a PoC, I'd be glad to help. Email in profile.

PS. Thanks for your work on Cobbler. We used it recently for automating VM template builds.

There is already SCOM+SCCM for this but you better have some bars of gold lying around that you can afford to toss out of the window.

Please tell us what you've decided to use. I have an occasional customer who is trying to deploy .NET servers on AWS using Puppet and he reports that there's not a lot of step-by-step guidance on how to do it. Just bootstrapping the thing is a pain, requiring some kind of Powershell incantation to impersonate "curl".

I don't have a simple answer yet. We're still eliminating options. Right now our long list still stands at:

Foreman, Red Hat's ManageIQ EVM, UrbanCode's Terraform, CloudBolt C2, vCloud Automation Center, BMC's Cloud Lifecycle Management, and IBM's SmartCloud Orchestrator.

Most of these use libvirt and one of Puppet, Chef, or Salt under the hood.

Seeing as Ansible is out of the question, what would be the best and/or simplest of these tools to learn and use on a Windows client (with Vagrant, remote Linux server etc).

I want to choose one of them but can't decide which one to pick.

The simplest one I've found other than Ansible that supports Windows is Saltstack. But it does require a master and isn't quite as intuitive as Ansible imo but close.

Having used Puppet and Chef in the past, Salt is a breath of fresh air. I've found it much simpler to use.

I can't speak to best or simplest yet. Puppet, Chef, and Salt all support Windows clients.

I would try writing the same recipe in each of them and see which works best for you. All of them can now support easy, domain-specific languages in JSON or YAML.

Puppet is the oldest and has the largest community. Chef is almost as old and large and seems most popular with the Rails community. Salt is the youngest and has the most active contributors.

I think you can place these tools on a spectrum from pull-based to push-based. I say spectrum because many of them are in reality hybrid. Tools running agents are naturally pull-based, whereas with push you don't necessarily need an agent (consider sandblasting ssh commands).

CFEngine is by far the most pull-based tool as it is underpinned by a theory mandating this behavior. Puppet is pull-based but with more push, Chef is even more push based, and Ansible and Salt are (I guess) mostly push.

In the end it depends on what is more practical for your situation. If you have a few machines, then more push will be practical, but the more machines you have, pull-based solutions scale better.

Slight correction: ansible has a pull-based mode, called ansible-pull, which simply scales with git. It's there if you want it! ansible-pull can also be easily deployed with regular ansible as well, if you want to do that; there's an example playbook.

If you want solid reporting with pull mode, we're going to have a product release in the next month that has a cool callback based feature, where you get centralized reporting and nodes can still phone in any time and request configuration of themselves. Easy to bootstrap, it will just require a one-line wget to invoke a configuration request.

I would not characterize Salt as being push-based in the same way. One of the things Ansible is really good at is orchestrating things "like an orchestra": if you want to talk to the woodwinds, brass, and percussion, and then go BACK to the woodwinds, you can. It's not simply saying this-then-that. This enables things like the rolling update feature and some pretty intricate multi-tier work.

One of the major inspirations for Ansible was when I was building Func at Red Hat and wanting to build a multi-tier architecture management system. I still strongly believe you need a push based system for that, because you don't want to wait 30 minutes + 30 minutes + 30 minutes for all of those changes to stack up.

Some other tools try to solve this by gluing layers on top; with Ansible, at least, the intent is to build that in.

So yeah, I think there are definitely times when you need both.

Yes, I think you need both pull and push, the ratio depending on the situation at hand.

A senior sysadmin once told me that he asked all incoming recruits about what this proportion generally should be. The answer: 1/6 (16.6%) push and 5/6 pull (83.3%). I found that answer slightly amusing as it seemed overly general. On the other hand, he felt very strongly about that point and was quite annoyed with juniors' natural inclination to use push by default when pull could practically be used.

The main thing that irks me (last I tried it) is that the pkg-repo functionality uses add-apt-repository. This makes it not work on Debian (6 was stable at the time) or with any repo that doesn't have a matching src repo.

edit: docs indicate it's still a problem http://www.ansibleworks.com/docs/modules.html/#apt-repositor...

We've got a pull request to fix that one up, somewhat in progress - https://github.com/ansible/ansible/pull/3117.

Awesome! Might be switching from Salt soon. I _really_ like that I can just pip install it.

I'm sorry, but even as fairly smart tech guy this page was not helpful to me. It didn't tell me anything useful about Ansible. I was left wondering ".... how the hell does it work? how do I use it?" and basically had to investigate their homepage to figure out why I should care. If it was a sales pitch to get me interested, it worked, but what the hell are sales pitches doing on HN?

"but what the hell are sales pitches doing on HN?"

I assume this is a rhetorical question :-) A number of folks on HN are either in a full-time or part-time dev/ops role and so tools that are developed in this space are interesting.

Unlike other sites there aren't specific subcategories of interest on HN so you get the devops stuff with the politics stuff with the new language stuff with the economics stuff with the funky physics stuff with the the i-did-a-thing stuff.

Ansible is an example of what is part of the Web 3.0 stack, dynamic but durable provisioning of resources out of a third party infrastructure.

If you want to find out more, perhaps you want to read http://ansibleworks.com/docs instead?

Well, that's my point... why have a page which basically just says "Ansible is cool! Go check it out!" I mean, that's cool that they like Ansible, but there wasn't any real content... it was just link bait, basically. I just find blatant advertisement or marketing to not be useful. I would rather they have posted a cool thing you can do with ansible, or really anything of any value other than a brief outline and a link.

I love ansible. Very straightforward to use--only problem is it's moving so fast I can't keep up. :) Good problem to have though.

It's also the first open source project I've committed code to. Huge fan. I use it to run a small 8 node openstack system, as well as deploy Apache VCL.

I agree ... Ansible is awesome.

I'm using a single set of playbooks for my infrastructure with Jenkins to check out changes and push them to production machines. The same infrastructure definitions are used with Vagrant to give each developer an exact copy of the production system.

How do you let jenkins install packages and make changes to the machine? Passwordless sudo?

I turn off password access to production machines, so nobody accesses them without a known key anyway. Ansible has its own key installed on the servers and its account is allowed to sudo without a password.

So if I can break into your ci machine (or just get jenkins to run random commands on your prod server, which is probably easier), I then have sudo access to your prod server?

Using ansible from a local machine is fine, because you can make your devs type in passwords and etc, but I can't think of a secure way to do it with continuous integration.

Don't assume that breaking into his CI instance is easier than breaking into the prod server. It's probably on a private subnet, in the first place.

That's one way, you could also allow SSH key access as root too.

Ansible also has a --ask-pass and --ask-sudo-pass if you want to provide a password.

So you store a password that will give you root/sudo on your production machine in your deployment code, or just on your ci server?

I would suggest just using keys in this case.

anyone with both ansible and saltstack experience to compare ?

I've used and continue to use, both. I like both tools, it's hard to choose one.

Ansible advantages:

+1 no client required. Python libraries may be required on the remote machine to take advantage of certain modules (e.g. Python's PostgreSQL library for some parts of the postgresql module)

+1 roles. A nice, best practice way to package up logical configurations (e.g. files, templates, tasks).

+1 separate host files for prod/uat/dev/whatever you like. Keep all generic configuration tasks/roles host-agnostic.

+1 very easy to write (and distribute) additional modules

+1 uses SSH for all transport. I love that it uses something tried and true, secure, and robust like SSH. Based on comments on Salt's Github, Salt may be receiving something similar soon. A ZeroMQ mode (aka Fireball mode) is available in Ansible for speed.

+1 sticks to Linux (Unix?) only. I like that Ansible is not targeting Windows (for now?); the Windows ecosystem is a very different community, with its own mindset and tools, and they'd prefer to use their own vs. something from Linux-land. This also allows Ansible to concentrate on getting really good on Linux/Unix.

Salt stack advantages:

+1 (+1 +1 +1) allows you to use Python (with Jinja2 templates) to express logic in configurations. Some people are against this (and tools like Puppet and Ansible try to hide programming from the user), but I feel this is a terrible idea. All that you end up doing is writing a configuration language that's Turing complete [joking!]. Some people will say logic doesn't belong in configuration, but when you need to express conditions or loops, let me use a standard (i.e. Python) programming language. I don't want to have to learn your DSL (which only your specific community knows about).

So in summary, I'd love a tool, that has:

- Ansible's "no client" setup

- Ansible's logical way to package up configurations (roles)

- Ansible's separate host file setup

- Ansible's simplicity of writing (and distributing) modules

- uses SSH (currently only Ansible, but might be available in Salt shortly)

- Salt's templating being available in Salt states (Ansible's "playbooks"). This point alone puts Salt on even ground with Ansible. I cannot stress this point enough. This is one of the main reasons I don't use Puppet and Chef.

Can you explain why templating in Salt is better than using conditionals[1], loops[2], and variables[3] in Ansible playbooks? You can even use templates[4] in Ansible that are Jinja2 as well.

[1] http://www.ansibleworks.com/docs/playbooks2.html#conditional...

[2] http://www.ansibleworks.com/docs/playbooks2.html#loops

[3] http://www.ansibleworks.com/docs/playbooks2.html#variable-fi...

[4] http://www.ansibleworks.com/docs/modules.html#template

As for logic, to me, it's important that the language remain a data format and be parseable and auditable, i.e. "human readable for non programmers".

We had one manager describe it as "automation even a manager can understand". I'm a developer, but I'm jaded about software development, and I definitely agree with the approach Ansible (and Puppet) take of describing infrastructure with a model, rather than making it a program. I'm not really even trying to make it Turing complete. I want to create a language that allows modelling how you want to describe the datacenter, but it's not trying to be a programming language. In fact, I'm rebelling against it trying.

There's a difference to a developer vs sysadmin mindset, but as something in between, I want to write applications code when I'm writing code, rather than writing infrastructure automation code. Code should be minimized, and that makes things more reliable in the long run, and enforces consistency IMHO.

I'm a little unclear about what you say about putting host specific information in Ansible roles. Pretty sure you can't do that, but I may be misunderstanding the question :)

Hi Michael, thanks for all your work.

About the second point (host specific information in Ansible roles): I was referring to this: http://www.ansibleworks.com/docs/playbooks.html#roles -- I was (incorrectly) under the assumption that you could put host file variable overrides (as you would in host_vars) under the vars/ section, but it looks like that's more of a generic, non-host, non-group specific variable directory? I have edited my comment and removed the incorrect section.

Sure thing -- basically the 'vars/' directory on a role loads in variables that are always set, basically constants.

Thanks for the write up... I am also considering both of these. I've hated every minute of working with Chef - particularly the Ruby DSL which as you note above is a terrible idea.

Haven't used it personally and hope my colleague can weigh in on that one, but it was on the list of tools we evaluated at the end of last year while working with Chef.

It uses agents ("minions") and is architected very differently. Ultimately, we settled on Ansible because it uses lean, out-of-the-box tech, doesn't push dependencies onto remote hosts, and uses YAML extensively (close enough to JSON; actually, I believe JSON is a subset of YAML).

I started with salt a while ago (not serious production use) and recently picked up ansible.

So far I'm coming down on the side of ansible. The config is a marginal improvement on salt (which is already pretty good), but I think the clincher is the explicit push nature rather than the declarative state declarations in salt.

I'm far from expert mind you and not using either in a serious production setting, so don't take my word on anything!

Very curious about the same thing.

My run-in with Ansible was that it was incredibly slow. What took puppet 1 minute to do took Ansible 7 minutes. Not great for pushing changes. Sure, it could probably use some tuning, but then we're out of the realm of 'just works' and into 'chasing down -foo-'. It certainly looks like it has its use-cases though.

Interesting. I'd like to hear more if you stop by the list and I would have hoped you would have brought it up so we could have explored more. Still want to do that.

Not a lot of data to go on in the above, but if you are specifying -c ssh, ControlPersist is good! Perhaps you were also trying to manage an AWS cloud from outside AWS. Better to install a control node inside the cloud in that case. Otherwise, I don't know, maybe a DNS issue? I'd need to know where your issues were to dig further.

One thing ansible is pretty sharp at is grouping yum transactions, so if you had every package on a separate line, with_items is a good thing to use too; that can make deployments extra zippy.

Another thing you might be interested in is 'fireball mode' which uses SSH for key exchange and uses a more direct transport for making subsequent changes.

Unfortunately I wasn't the one who set it up, so I don't know the details. The coder who set it up left, and the other coders who picked up his project were complaining about the time it took to push changes to staging. After trying it in Puppet, it turned out to be much faster, and we moved on to other things. I don't know any further details, sorry.

We haven't experienced that kind of latency. Quite the opposite, actually.

One unanswered question for me is how to manage packages on different systems. Apache, for example, has a different name on RPM- vs. apt-based systems, or you might have a different command to do the install (Arch vs. Ubuntu, for example).

Can ansible do this? Can I easily migrate a playbook from using say apt-get to pacman?

Typically your playbooks are more specific than 'do something vague' like 'install apache'. You would say specifically: using 'yum', install the package 'httpd'.

To see the package managers Ansible supports OOTB look at:


For examples on how this can be abstracted to handle multiple platforms see: http://www.ansibleworks.com/docs/bestpractices.html/#operati...

I found Ansible quite tricky to use.

I was trying to set up https://github.com/pas256/ansible-ec2 to manage some EC2 instances, and it seems like I needed to replace a static config file with a script? Wouldn't that prevent me from doing anything that wasn't EC2 with that instance of Ansible?

Debian also moved the inventory files around in their package so that made things confusing.

I've been using Ansible in an EC2 environment and it's been fantastic. It's powerful enough, but its requirements are slight and you don't end up fussing with bootstrapping it into being. A server with Python (and a few little things that can be installed via Ansible/Python) will get you to a place where your instances are being populated.

I'm very curious about the breakdown of developers versus operations people who are in the middle of this love affair with Ansible.

I'm ops, but the cool thing about ansible is that it is easy to get devs to help. They know their software packages, I don't, but I can point them at a playbook and they can fairly easily fill out the basics, I can work on the orchestration after they have done the initial playbook design and testing.

Chef (and formerly Puppet) had (for us) a much higher barrier to entry when it comes to getting an unfamiliar dev on board. We basically had to get reqs from the dev, and then we would build the recipes.

From surveys we've done, I think the breakdown is about 60% on the ops side, 40% on the developer side. Obviously in a lot of startups, especially folks using cloud stuff, lines blur a lot.

I wanted to make a language that was good for both, and easy to do ordered tasks, and a lot of app deployment requires ordered tasks and being able to do things on the result of other things. OTOH, sysadmins want a language that is easy to read and has the declarative state features, so it's kind of both at the same time.

> means minimal dependency on install (Python is shipped by default with Linux).

I love how the Ansible "Getting Started" guide [0] first says you need to install Puppet 2.6, as 2.4 on CentOS / RHEL is not supported :)

[0] http://ansibleworks.com/docs/gettingstarted.html


There's a very big difference given the nature of this article. I thought that Ansible was dependent on Puppet and was quite confused.

Yeah the web page does not say that unless you have a really sneaky IRC proxy in the way :)

Python 2.4 is all you need on remote nodes, but on the control machine you do need Python 2.6.

I have also needed a JSON lib installed on the target for python < 2.6. Is there some way around that?

Easiest is probably to use ansible's "raw" module to install it with yum, by feeding the http:// URL to yum, unless you have EPEL already configured.

We really wanted to use Ansible but couldn't get it to work with virtualenvwrapper. Has anyone done this (easily)?

Weird, I haven't had any issues in my virtualenv. All I did was `pip install ansible`. The only hiccup was I had to `apt-get install python-dev` on Debian.

For those new to Python/Debian, this is an important point: with python-dev and python-virtualenv installed (also look out for "gcc" and/or "build-essential"), a virtualenv made with "--no-site-packages" should generally have a fully working "pip" for installing most things.

Gcc is sometimes needed to build (often optional) c extensions.

Not sure about virtualenvwrapper, as the setup process does want to install some modules for you, but Ansible runs pretty easily from a straight checkout.

Source ./hacking/env-setup from a checkout and you are running live from there.

So you may wish to just install the python package and then set up your library directory.

Is there any automated or semi-automated way to create an /etc/ansible/hosts file from a ssh config file? Not fully supporting the ssh config file is a real drag for me, I currently have several dozen hosts configured and I just don't want to retype everything for ansible to work.

We use this over at useartisan.com, and though I'm not one of the devops-oriented people here, everyone who is using it seems to be really psyched on it. Glad people are contributing and it's gaining steam!

Not ruby therefore instant awesome. Man! I really don't like ruby.
