2016-12-08 First contact with Ansible security team
2016-12-09 First contact with Red Hat security team (email@example.com)
2016-12-09 Submitted PoC and description to firstname.lastname@example.org
2016-12-13 Ansible confirms issue and severity
2016-12-15 Ansible informs us of intent to disclose after holidays
2017-01-05 Ansible informs us of disclosure date and fix versions
2017-01-09 Ansible issues fixed version
(Well, that plus the fact that this seems like the sort of scary bug which will get exploited very quickly.)
I have always preferred plain SSH and shell scripts for automatically configuring servers.
That's why I had a look at Ansible, but despite its simple concept, it seemed too bloated to me: too many layers around a simple idea. I never expected to be proven right, though, let alone that this complexity would translate into 6 scary security holes in a row.
To propose an alternative, this is how I'm currently doing it. I'd love to hear what others think about it, or whether others have tried similar things:
Note: This works best for a relatively small and/or heterogeneous set of servers/VMs. So it's not really the full-blown Ansible use case. But then, how many people really do have billions of users and a large set of homogeneous servers at their hands? I bet the "long tail" of web applications doesn't need more than a few servers. Or, if your hardware is strong enough, just two servers, where the first one takes all the load, and the second one is a failover system on stand-by.
Use a plain shell and plain shell commands, which has the advantage that you can quickly try them interactively in any VM. Use "set -eu" so the script halts on the first error instead of blindly executing all following commands. Use appropriate commands, e.g. "install" instead of "cat+chmod+chown". Use "--backup=numbered" to keep backups of previous versions of files, if you want. Use only idempotent commands, i.e. commands that you can run multiple times without changing anything of significance. Prefer portable commands, but use distro-specific code as needed. By the time you switch to another distro you'll have a new script anyway, because you'll have a different system with different requirements. Write everything down in the style of a "copy&paste" documentation (does this count as "literate programming"?), but put all your remarks in shell comments. Then automate the "paste" part by making it executable.
Example file (executable file "myserver_config"):
#!/bin/sh
tail -n +3 "$0" | ssh -p 1234 root@myserver_ip ; exit
# Everything from line 3 onwards is sent to the remote shell.
set -eu
# Install packages
# Configure Foo
# Configure Bar
# Configure Nginx
# Configure Nginx so that the foo fits into the bar
# and boggles the foobar.
install --backup=numbered -o root -g root -m 600 /dev/stdin /etc/nginx/nginx.conf <<'EOF'
# (contents of nginx.conf go here)
EOF
# Do more stuff
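As a local illustration of the "set -eu" and "install" advice above, here is a minimal sketch that runs against a throwaway directory instead of a real server. It assumes GNU coreutils (for `install --backup=numbered`); the `demo_dir` variable, the `write_config` helper and the file contents are made up for the example.

```shell
# Hypothetical scratch directory standing in for /etc on a real server.
demo_dir=$(mktemp -d)

# write_config TARGET: install stdin as TARGET with mode 600, keeping
# numbered backups (TARGET.~1~, TARGET.~2~, ...) of earlier versions.
write_config() {
    install --backup=numbered -m 600 /dev/stdin "$1"
}

# Without "set -eu" the shell carries on after a failing command;
# with it, the shell stops and "reached" is never printed.
without_guard=$(sh -c 'false; echo reached')
with_guard=$(sh -c 'set -eu; false; echo reached' || true)

# Writing the same target twice keeps the old version as a backup.
write_config "$demo_dir/foo.conf" <<'EOF'
setting = 1
EOF
write_config "$demo_dir/foo.conf" <<'EOF'
setting = 2
EOF
```

After this runs, `foo.conf` holds the second version and `foo.conf.~1~` holds the first, so a bad change can be rolled back by hand.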
One is idempotency: it is not easy to run the above script on a bunch of hosts just to 'fix up' a config that was somehow broken, or to add or amend a directive in httpd.conf, that type of thing. With Ansible, you could re-run the playbook and it would only change what needed to be changed to bring everything in line with the playbook, i.e. just the directives in httpd.conf, and then reload httpd to bring in the changed config.
The other is inventory: with a one-line playbook command, a change like the one I outlined could be run against all Dev hosts only, or Dev and UAT hosts only, or Red Hat hosts only, etc. When you're managing a lot of hosts this is invaluable, even just for easily checking some config by running a shell command against some set of hosts.
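For a handful of servers, the inventory idea can be approximated with a plain text file. This is only a rough sketch: the "group host" file format, the host names and the `hosts_in` / `run_on` helpers are all invented for illustration, and this is not how Ansible stores its inventory.

```shell
# Hypothetical inventory: one "group host" pair per line.
inventory=$(mktemp)
cat > "$inventory" <<'EOF'
dev dev1.example.com
dev dev2.example.com
uat uat1.example.com
EOF

# hosts_in GROUP...: print the hosts that belong to any of the given groups.
hosts_in() {
    for group in "$@"; do
        awk -v g="$group" '$1 == g { print $2 }' "$inventory"
    done
}

# run_on GROUP COMMAND...: run COMMAND over SSH on every host in GROUP.
# "ssh -n" keeps ssh from swallowing the host list arriving on stdin.
run_on() {
    target_group=$1
    shift
    hosts_in "$target_group" | while read -r host; do
        ssh -n "root@$host" "$@"
    done
}
```

Something like `run_on dev 'nginx -t'` would then check the config on the Dev hosts only. It gets you the selection part; it does not give you per-host variables or anything else a real inventory offers.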
In my case, I simply overwrite all config files and restart all services. This is not "efficient" (but still sub-second), but I believe this is idempotent by all means.
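If overwriting and restarting every time ever becomes a problem, the "only change what needs changing" behaviour can be approximated in plain shell by comparing before installing. A minimal sketch, where `apply_config` and the scratch path are hypothetical and the service reload is left out:

```shell
# apply_config TARGET: read the desired content from stdin and install it
# as TARGET only when it differs from what is already there.
# Prints "changed" or "unchanged"; a real script would reload the
# service in the "changed" branch.
apply_config() {
    target=$1
    tmp=$(mktemp)
    cat > "$tmp"
    if [ -f "$target" ] && cmp -s "$tmp" "$target"; then
        echo unchanged
    else
        install -m 600 "$tmp" "$target"
        echo changed
    fi
    rm -f "$tmp"
}

# Hypothetical scratch file standing in for a real config path.
conf=$(mktemp -d)/httpd.conf
first=$(printf 'Listen 80\n' | apply_config "$conf")
second=$(printf 'Listen 80\n' | apply_config "$conf")
third=$(printf 'Listen 8080\n' | apply_config "$conf")
```

Here the first and third calls report "changed" and the second reports "unchanged", which is roughly the information a playbook run gives you per task.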
tail -n +3 <current-filename> | ssh -p 1234 root@myserver_ip ; exit
When you write `install`, is this another script that you have installed on the server beforehand or is it a function/alias or something? [EDIT: Duh, didn't realize this was actually a standalone program (standard but non-POSIX AFAICT) I haven't used.]
I came up with this trick to avoid nested here documents, which would make syntax highlighting unusable.
> When you write `install`, is this another script that you have installed on the server beforehand or is it a function/alias or something?
The "install" tool is very common on almost all Unix systems. It is usually called by a Makefile on "make install", but of course can be used by anyone.
Today I wouldn't use any of those anymore and, as you suggested, would use shell scripts in an immutable-infrastructure fashion (aka, reinstall the system when changes are needed).
More seriously, ansible is terrible and I have a long list of complaints, but it is still better than everything else at the moment. Being able to run playbooks from anywhere without a special controller host is very important, and seems to have been missed by most competitors.
Different tools for different scales. Ansible exists as an improvement over ad-hoc shell scripts, simple/ad-hoc inventories, copy-pasting etc.
I think it's very unfair to all the contributors to highlight some security issues and say "I told you so", and propose a shell script as an alternative.
EDIT: I see your edits, acknowledging more complex environments...
I'm all for disclosure, but seriously - if RH want Ansible to be used in enterprise they can't expect patches at this rate. The researchers releasing the exact exploits so quickly is just irresponsible IMO.
While I sympathize far more with the full disclosure people than with the patch choreography people, I'm really only pointing this out to demonstrate that you're not going to resolve this debate in the HN comments about an Ansible vulnerability.
Ansible has released new versions that fix the vulnerabilities described in
this advisory: version 2.1.4 for the 2.1 branch and 2.2.1 for the 2.2 branch.
Latest can be found here: https://github.com/ansible/ansible/releases or https://releases.ansible.com/
I couldn't understand from article or discussion.
If one of your servers is compromised, this is a vulnerability in the ansible client that lets that bad server take over your local computer and your other servers when you connect to it by sending you bad facts.
So it's pretty serious.
runs on a host, a JSON object with Facts is returned to the Controller. The Controller uses these facts for various housekeeping purposes. Some facts have special meaning, like the fact "ansible_python_interpreter" and "ansible_connection".
So while you may normally define it in the inventory, it sounds like, from the advisory, it can also be defined by the host.
I'll try to answer several questions so this might get long.
First, the procedure for disclosing the CVE is something we discussed internally (including with security professionals), as all over the internet there are 2 or more views on this. The decision arrived at doesn't please everyone (I doubt one exists that would), but it is a recognized way of dealing with security issues, so it is what we followed.
The CVE can be dense in explaining how the exploits work. The simple version: it is a rehash of an old problem which we had thought solved; the researchers proved us wrong by finding ways around our filtering. The vulnerability is pretty hard to trigger: it requires both a compromised system that intercepts the Ansible calls and returns faulty data, and intimate knowledge of the existing playbooks and systems used, in order to force the arbitrary execution.
All software has flaws; this is not an excuse, it is a fact. Not all software or flaws have the same scope though. As an automation tool that is often used to manage things with high levels of privilege, we take these things very seriously and we do our best to prevent them in the first place or remediate them as soon as possible. As an OSS project we appreciate the eyes and efforts many put into finding these flaws, which end up making the software better (and me a better programmer).
Ansible is not idempotent, it is declarative, which does help the user create idempotent plays. True idempotency depends on many things: the modules used, the problem addressed, how the playbooks are written, etc. Both of the following are valid Ansible tasks, but the implications are very different even if the result ends up being the same.
- shell: usermod -G user bcoca
- user: state=present name=bcoca groups=user
I hope this helps,
There was this "safe_eval" function which filtered input in a way quite inconsistent with its name. The Ansible team was responsive and pleasant to work with!
But I suspect lots of remote control and monitoring software products might have security bugs like this, where they assume that the information returned from systems under management is trustworthy.
Edit to add: Here's the patch made to safe_eval in 2014. I had suggested using literal_eval instead but I guess a Python 2.6+ requirement wouldn't work.
Edit again: Ansible is a pretty great product, and IMO one of the first of such tools to seriously improve the UX for sysadmins. Thanks for maintaining it!
Edit: LWN link was changed to this one
Some systems have Python installed in a rather uncommon location. For example, Python is not part of the FreeBSD base system, so it is installed at /usr/local/bin/python instead of the expected /usr/bin/python, and Arch has Python 2 installed at /usr/bin/python2 rather than /usr/bin/python.
Note that Ansible doesn't need to be installed on the remote host (IMHO one of its biggest selling points). It executes tasks by sending a packed version of the task to the remote host and running it with `ansible_python_interpreter` (e.g. `/usr/bin/python /tmp/ansible-tmp-a43bf412.py`).
No, the client specifies the interpreter to use inside itself.
I'm saying 'interpreter' instead of Python because you can create modules in any language. Ansible only ships with Python ones, but Perl, Ruby, etc. modules also exist and are usable by Ansible.