
Migrating away from Puppet to cdist (Python3) - mahmoudimus
http://www.nico.schottelius.org/blog/migrating-away-from-puppet-to-cdist/
======
mahmoudimus
We wanted push-based configuration management that actually had reasonable
error messages. We were looking at Puppet, but the indecipherable error
messages were such a pain!

We searched and found cdist. We're very much enjoying it. The reasons he lists
for switching away from Puppet resonated with me very much.

Plus, it's written in Python 3, which might be a turn-off for some, but it's a
pretty cool demonstration that Python 3 is making its way into the wild.

~~~
viraptor
I looked through the docs, but couldn't find anything interesting regarding
exchanging data between nodes. Everything seems to be isolated and the only
real exchange seems to be possible in local script generators. How do you do
stuff like "find all running frontend nodes and put them into this config
file", or "find a database node and create a config with credentials valid for
that node" using cdist?

~~~
eolamey
I haven't looked at puppet in a while, but I think this feature is only
possible in chef, where each node can execute search queries on the chef
server. It is actually one of the reasons I prefer chef to puppet. I guess
since the developers of the tool come from the puppet world, they did not
think about implementing it.

Whether this kind of feature should be implemented in the CM tool is another
debate :)

~~~
subway
Puppet provides a mechanism called Exported Resources allowing nodes to
provide bits of config for use by other nodes. For example I presently export
Nagios check resources for webservices on my web nodes, which are collected by
my Nagios server for building its config.
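A minimal sketch of that pattern in Puppet's DSL, assuming storeconfigs/PuppetDB is enabled (the check parameters here are illustrative): web nodes export a resource with `@@`, and the Nagios host collects everything of that type with the `<<| |>>` collector.

```puppet
# On each web node: export (@@) a Nagios check for this host.
@@nagios_service { "check_http_${::fqdn}":
  check_command       => 'check_http',
  host_name           => $::fqdn,
  service_description => 'HTTP',
  use                 => 'generic-service',
}

# On the Nagios server: collect every exported nagios_service resource.
Nagios_service <<| |>>
```

Exported resources only take effect once the Nagios node's next catalog run collects them, which is why this works as a cross-node mechanism rather than a live query.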

------
jes5199
I used to work on Puppet's code professionally, and it's true that all of the
listed problems affect the Puppet software to some degree. It's always a
challenge to fix problems in software that people are actively using - I can't
help but notice that one of the complaints is too many significant changes
between versions!

Puppet evolved to solve a lot of different problems for a lot of different
people, and I'm continually impressed by the ingenuity of the community, but
having to solve all those problems at once means that it's hard to change
behavior at all. My intuition is that a collection of more discrete components
- one that uses Unix commands instead of ruby libraries, for example - will
probably eventually replace Puppet (and Chef, and the others).

In the meantime, though, there's an impressive amount of sysadmin knowledge
baked into the Puppet codebase - it can accomplish very diverse tasks on very
diverse platforms - and there's no easy way to extract the results of its
evolution into other software.

~~~
viraptor
> one that uses Unix commands instead of ruby libraries, for example - will
> probably eventually replace Puppet (and Chef, and the others).

I hope not... There's a huge value in being able to operate on proper data
types easily. I can get a hash and output a corresponding .ini-like file in a
couple of lines in chef. This would be a nightmare to do in shell.
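viraptor's point is easy to demonstrate in plain Ruby (the language underneath Chef); the section name and keys below are made up for illustration:

```ruby
# Turn a hash of settings into an .ini-style file in a couple of lines.
settings = { "port" => 6379, "daemonize" => "yes", "loglevel" => "notice" }
ini = "[redis]\n" + settings.map { |key, value| "#{key} = #{value}" }.join("\n") + "\n"
File.write("redis.ini", ini)
```

The shell equivalent needs a loop plus careful quoting and escaping, which is exactly the nightmare being described.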

~~~
jes5199
that's not the part I meant! Clearly we need good programming languages to
manipulate data. I just think that for the part that touches the OS, using
shell commands and piping their output to a parser is more reliable than the
equivalent ruby libs.
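A sketch of the split jes5199 describes, assuming a Unix host: for the part that touches the OS, shell out to a ubiquitous command and parse its plain-text output, rather than binding to a platform-specific library.

```ruby
require 'open3'

# Ask the OS about itself via a standard Unix command ...
out, status = Open3.capture2("uname", "-sm")   # e.g. "Linux x86_64\n"
raise "uname failed" unless status.success?

# ... and feed the output through a trivial parser.
kernel, arch = out.split
```

The data manipulation stays in a real programming language; only the OS-facing edge is a shell command.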

------
zobzu
There's one thing I dislike: releases are git based by default.

Oh, it's great and all, but it's pretty much like Chrome and others: no testing.

this stuff is _critical_ to the infrastructure, and yet again: no testing.

just git pull updates away! yeah, right, then push configuration on this
untested stuff over thousands of machines and watch it potentially break...

Oh that, and that it's YA configuration management tool, and of course, you
don't switch those overnight "just to check out the newest kid on the block".

I'd much rather get puppet enhanced. (but yeah - some of the puppet issues are
design issues which aren't going to change)

What I do like:

- push. ffs! pull uses more resources and means you rely a lot on the host to
be working properly. with push you get errors back (can't connect, can't do X,
etc.)

- no ruby. woot. seriously, ruby's cool but it's just not installed by default.

- simple and straightforward. nuff said. KISS has been applied.

- since it's simple it's less likely to break in odd ways like puppet does, or
die off because the master just doesn't handle the load. The push model also
helps a lot here.

- easy to run off a laptop compared to puppet.

------
olegp
Funnily enough, I just released Bootstrap
(<https://github.com/olegp/bootstrap>) yesterday, which addresses the very
same issues I had with Puppet.

I'm initially focusing on Ubuntu Server LTS on AWS EC2. As a result, it's all
Bash and very simple. I'm still working on the documentation, but would really
appreciate any feedback or improvement suggestions you may have.

------
mixmastamyk
In the same vein, has anyone tried the py-based salt (saltstack.org) for
config management?

~~~
cmkrnl
Have not tried yet, but thanks for the link! Salt seems to be comparable to
Mcollective - except, where Mcollective uses middleware (ActiveMQ or RabbitMQ
or Apollo) which comes with its own server (broker), Salt uses ZeroMQ, with a
built-in broker.

~~~
SEJeff
Actually, ZeroMQ is brokerless by design[1] which prevents a hop in all
communications. The design is slightly more complex, but the speed increase is
very real.

[1] <http://www.zeromq.org/whitepapers:brokerless>

------
peterwwillis
I have to admit, it's very attractive to make it all essentially glorified
shell scripts. Not only is it simple enough for most people to understand by
default, it's easy to hack on and it's relatively portable.

The only thing I don't like about that is that you're essentially writing
code, which - since this is a sysadmin tool - means you're depending on
sysadmins to write code. This is usually a bad idea. With a restrictive,
structured language you can still provide all the features they've built
into their types/manifests/gencode/etc. without relying on someone being able
to program.

I was maintaining one part of a configuration management tool at a previous
gig. The idea was to have a simple key=value file which essentially set
identical environment variables. There was support for a few magic words like
"REQUIRED" and "OPTIONAL", some very rudimentary logic statements ("if KEY=VAL
then KEY2=VAL2") and an optional executable program for each config file to do
extra magic that might be needed before the real configuration management
kicked in (Cfengine2 at the time). In this way one or two people could write
some generic config management modules/scripts and the environment variables
would tell them what to do, so anybody could build and deliver configuration
with only a key=value file and no programming. Everybody could pick up the
format or copy an example because it was dead simple. Best of all: almost no
code to maintain.
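The format described above can be sketched in a few lines; the `REQUIRED` marker and the `if KEY=VAL then KEY2=VAL2` rule syntax below are modeled on the description, not on the actual tool, and Ruby is used here only for illustration:

```ruby
# Illustrative parser for the key=value format described above.
def parse_config(text)
  vars, rules, required = {}, [], []
  text.each_line do |raw|
    line = raw.strip
    next if line.empty? || line.start_with?("#")
    if line =~ /\Aif (\w+)=(\S+) then (\w+)=(\S+)\z/
      rules << [$1, $2, $3, $4]            # rudimentary logic statement
    elsif line =~ /\A(\w+)=(.+)\z/
      required << $1 if $2 == "REQUIRED"   # magic word: must be filled in
      vars[$1] = $2
    end
  end
  # Apply the "if KEY=VAL then KEY2=VAL2" rules after all assignments.
  rules.each { |k, v, k2, v2| vars[k2] = v2 if vars[k] == v }
  missing = required.select { |k| vars[k] == "REQUIRED" }
  raise "missing required keys: #{missing.join(', ')}" unless missing.empty?
  vars
end

conf = parse_config(<<~EOF)
  ROLE=web
  if ROLE=web then HTTP_PORT=80
EOF
```

Generic CM scripts would then read the resulting variables, so delivering configuration needs no programming, only a key=value file.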

~~~
moe
_means you're depending on sysadmins to write code_

You're not seriously suggesting to give the management of a complex deployment
into the hands of a person who can't program?

 _With restrictive, structured language you can still provide for all the
features they've built into their types/manifests/gencode/etc without relying
on someone being able to program._

Sorry, it's delusional to think you could operate puppet without being able to
program. In theory the puppet language is said to be "not quite turing
complete" or "mostly declarative".

In reality there is no such difference. It's just a (poorly designed)
programming language with domain-specific constraints. It _does_ have all the
concepts that give non-programmers headaches: variables, classes,
parameterizable classes, defines, inheritance, etc. And on top of that you'd
better know your Ruby, too, because you won't get away without writing custom
types, resources and functions. So... your admin is not just using the
language - he's also extending it!

~~~
cagenut
I'm a sysadmin who can't program, who has set up and used puppet very
successfully at two different comScore top-100 sites to provide faster, more
reproducible builds with better revision tracking and change control. There's
nothing delusional about it; it's working quite well and I'm being paid quite
well.

------
nodata
cdist: For anyone looking for a tutorial, this is (unfortunately) all there
is: <http://www.nico.schottelius.org/software/cdist/man/2.0.4/man7/cdist-tutorial.html>
More docs under:
<http://www.nico.schottelius.org/software/cdist/man/2.0.4/man7/>

puppet: Making the error messages a bit better and switching to ssh as a
transport mechanism would make a lot of people happy.

~~~
obtu
The docs at
<http://www.nico.schottelius.org/software/cdist/man/2.0.4/man7/cdist.html>
are outdated; they mention a quickstart and a cdist-env that don't seem to
exist: <https://github.com/telmich/cdist/issues/6>

------
gizzlon
Seems to rely on using password-less ssh as root! So one machine can ssh into
all your servers as root? [0]

Sounds dangerous... There's a reason you would normally protect your ssh keys
with a passphrase. Do you really want leaked file(s) to permit root login to
other servers?

I think it should at least:

- use authorized_keys to limit what can be done with the key

- use ssh-agent instead of bare keys, to somewhat limit the exposure
if/when the laptop with cdist is stolen/cloned/hacked.

[0]
<http://www.nico.schottelius.org/software/cdist/man/2.0.4/man7/cdist-tutorial.html>

~~~
nodata
It's not as crazy as you imply: it's a configuration management system - it
has _complete control_ over your machines. Giving it root or not is
irrelevant.

Password-less login: if your ssh key uses a passphrase, combine with keychain
or ssh-agent.

~~~
gizzlon
My point was that storing unprotected keys for all your servers on one or more
"unprotected" laptops/PCs seems like a bad idea.

Yes, to some extent it can be mitigated with ssh-agent, but that's not what
they do in their tutorial. And even with ssh-agent, can't someone steal all
your keys from memory?

 _"It's not as crazy as you imply: it's a configuration management system - it
has complete control over your machines. Giving it root or not is
irrelevant."_

This is crazy in its own right: giving one piece of untested, unhardened code
control over all your machines. At least the code base is small...

~~~
nodata
Okay. Out of interest, do you have any idea how you would design a
configuration management system that works without root access to all of the
machines you want to configure?

~~~
obtu
The problem isn't exactly with root (configuration management has root-like
privileges), but push vs pull.

Both rely on the confidentiality of a private key on the master (the server
identity if it is a pull, client identity if it is a push), but to fool a pull
you also have to subvert the DNS or the routing. Using ssh certificates that
expire and are tied to the master's address would be a good first step toward
making push safer against key interception.
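A sketch of that idea using OpenSSH's built-in certificate support (the key names, the one-hour validity, and the master's address are all arbitrary placeholders for this example):

```shell
# Throwaway CA and push key, passphrase-less purely for the demo.
ssh-keygen -q -t ed25519 -N '' -f /tmp/demo_ca
ssh-keygen -q -t ed25519 -N '' -f /tmp/push_key

# Sign the push key: valid for 1 hour, root principal only, and usable
# only from the master's (hypothetical) address.
ssh-keygen -q -s /tmp/demo_ca -I cdist-master -n root \
    -V +1h -O source-address=192.0.2.10/32 /tmp/push_key.pub

# Servers would then trust /tmp/demo_ca.pub via TrustedUserCAKeys in
# sshd_config; a stolen push_key-cert.pub expires within the hour.
ls /tmp/push_key-cert.pub
```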

~~~
peterwwillis
If someone can find a way to write to your master change management repo,
you're boned. (Is it sitting on NFS? If so, you may have a problem!) If
someone roots the master box that you're pushing from or that your clients are
pulling from, you're boned.

Technically you don't _have_ to use root anywhere. You can push from a master
box as an unprivileged user and specify any user on the target machine to
connect as, and ssh keys (along with an agent) will take care of the rest. The
target machine doesn't need to be running your config management as root; you
just need to set up a config management account with rights to the
users/groups that your services are running as. Your service may get
compromised but root wouldn't be - until they run a local priv escalation 0day.

I think you're confused. Since the tools are using either SSL or SSH, these
protocols (when properly implemented) prevent MITM or spoofing attacks on
things like DNS or routing protocols. Your key does not get intercepted, ever,
because you're never sending a private key. But some machine, somewhere, has
to have keys or you can't log in.

The only thing you have to worry about is someone getting both the keys AND
access to a server that's running an agent that you've authed with. If you're
super duper paranoid, ssh-agent has an option to never cache a given key's
authorization, so it can ask you for the password for the key every time you
use it. I'm pretty sure that would be super annoying, though, so better to
just run one agent once for an entire batch push/deployment and kill it once
your job is done.
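The batch pattern above, sketched with a throwaway key (substitute your real, passphrase-protected key, and add `-c` to `ssh-add` to require confirmation on every use):

```shell
eval "$(ssh-agent -s)"                             # one agent for this session
ssh-keygen -q -t ed25519 -N '' -f /tmp/batch_key   # demo key for the sketch
ssh-add /tmp/batch_key 2>/dev/null                 # load it once for the batch
# ... run the entire batch push/deployment here ...
ssh-agent -k > /dev/null                           # kill the agent once done
```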

------
dguaraglia
Good Lord, my eyes! The website needs some serious work.

~~~
cmkrnl
Web browsers usually allow the user to change a site's look quite a lot. Try
applying a custom stylesheet, or perhaps Readability.

~~~
nodata
His complaint is that the website is horrible.

Telling him to redesign it or use Readability doesn't change that.

~~~
Gigablah
It looks exactly like a PuTTY terminal, not surprising for a systems admin.

------
fierarul
I might just give this a try.

I postponed using Puppet/Chef because it didn't seem right that they were Ruby
and thus had their own dependency chain.

Seems to me automatic configuration should stay as close as possible to
whatever is available on a bare install, meaning C/bash.

~~~
mattyb
At least for (recent versions of) Ubuntu Server, a C compiler isn't available
on a bare install, so it'd need to just be Bash.

And what's with the aversion to dependencies? Installing Ruby is just an
apt-get invocation away from a bare server.

------
kuviaq
Anyone have a comparison between cdist and fabric?

~~~
obtu
Fabric sends a bunch of commands over ssh. It's very simple, and your task is
to write the right commands.

> Puppet in contrast to other systems emphasised on "define what I want"
> versus "define what todo", which is a great approach and we've [cdist]
> shameless cloned this approach.

With cdist and puppet you describe resources by instantiating resource types
like Package, CronJob, Service, File, etc. It's very concise and high-level.
Types are composable and easy to reuse, so if you see a common pattern in your
configuration, you can factor it into a type.

------
chrisatlee
It's very frustrating that none of these tools work well with Windows.

~~~
astrodust
It's very frustrating to work with Windows. The lack of a POSIX compliant
shell in the standard distribution is what's holding it back the most.

Mac OS 9 and prior was also super frustrating to work with for similar
reasons.

------
bryanwb
Besides the bootstrap problem, I don't see any of the issues they had with
puppet that chef doesn't solve.

Perhaps there were things that disqualified chef in their eyes?

------
eolamey
I would suggest that anyone interested in configuration management read the
article, even if they are pretty sure that their tool of choice is exactly
right.

I have used puppet in the past and am now a happy chef user. I find that node
attributes, data bags and search queries (from the nodes themselves) are very
compelling reasons to use chef.

I agree that installing either chef or puppet on even a somewhat recent distro
can be difficult, and this might be one of the reasons that CM is not as
widespread as it should be (I maintain RPM packages for chef, and it is not so
easy...). So I agree with one of the points made by the author. But I have to
admit I pretty much discarded his other ideas after reading the article at
first, even though I poked around in the documentation.

But then, I thought about the points made and I must say that a CM tool made
of small, autonomous, well tested "parts" makes a lot of sense. If you think
about it, a CM system's job is to configure UNIX servers (mostly), so why not
implement it the UNIX way? It is a bit like the Hubot implementation by Ted
Dziuba: back to basics!

Your CM tool, whether puppet or chef, must deliver files to nodes and then
execute code to ensure that they are in a proper state (it's a bit more
complex than that, but not that much). I'm not sure if it's better to use push
or pull for file delivery, but this should be an independent task that could
be implemented either way. Using SSH for security and access control seems
like a no-brainer. Then comes the compliance part, where code is executed to
ensure that the node is configured the way you want it. Why not have a bunch
of small programs, which could be implemented in different programming
languages, do it?

Let me give an example. Both puppet and chef work, at the lowest level, with
resources. You define them in containers (manifests or recipes) and then make
sure they are defined on the nodes. You could have "file", "package", "user"
or "template" resources (for example), like:

    
    
      # this is pseudo chef code for a fictional redis recipe
      package "redis" do
        version "1.0"
      end
      user "redis" do
        uid 230
      end
      template "/etc/redis.conf" do
        owner "root"
        group "root"
        mode "0644"
        notifies :restart, "service[redis]"
      end
    

Why not have each resource definition be a UNIX program that can be executed
independently? My understanding is that cdist uses shell functions, but I
would prefer regular programs, which could be called from anywhere:

    
    
      # in a shell script, distributed by the CM tool
      package redis --version 1.0
      user redis --uid 230
      template /etc/redis.conf --user root --group root --mode 0644 --restart redis
    

(I know this looks pretty much like what cdist does, but I'm trying to get to
the point made by the author without being distracted by the puppet critique)

I am convinced you still need attributes, bits of data attached to roles and
nodes, for your CM to be effective, but I don't see any reason this could not
be implemented in cdist.
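The attribute idea can be sketched independently of any particular tool: role-level defaults deep-merged under node-level overrides (the attribute names below are made up):

```ruby
# Merge two attribute hashes, recursing into nested hashes so that a
# node can override a single nested value without clobbering the rest.
def deep_merge(base, override)
  base.merge(override) do |_key, a, b|
    a.is_a?(Hash) && b.is_a?(Hash) ? deep_merge(a, b) : b
  end
end

role_attrs = { "redis" => { "port" => 6379, "maxmemory" => "1gb" } }
node_attrs = { "redis" => { "maxmemory" => "4gb" } }

attrs = deep_merge(role_attrs, node_attrs)
# attrs["redis"] keeps the role's port but takes the node's maxmemory.
```

A CM tool would then expose `attrs` to the small per-resource programs, e.g. as environment variables or template inputs.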

Anyway, this comment is getting really long and I am not sure I made my point.
I guess I'm more thinking out loud rather than commenting on the article :)

Do yourself a favor, read the article, put aside the implementation details
and website design, and think about the underlying points.

