Puppet or Chef? (philsturgeon.co.uk)
94 points by ichilton on Oct 30, 2012 | 59 comments

We started off writing the bulk of our server setup/deployment automation in chef, and have since completely abandoned it

The core problems we had with chef were:

• worse than ruby: the chef ruby DSL is like some bastardized, crippled ruby - e.g. ruby_block{}, just uggh

• way too slow & resource intensive: chef itself uses a lot of memory and CPU, has a slow boot time, and does stuff like execute apt for each desired package on each node on each run. this might work fine if you're running on beefy physical/virtual hardware, but not for managing hundreds/thousands of tiny LXC containers that need to be scaled on-demand in seconds

• not self-hosted: chef seemed to have real difficulty closing the loop and being the thing that deployed and configured itself. there's guides online for scripting yourself up to a basic chef setup, but what if you want your chef client to bootstrap with some custom rubygems? back to bash scripting - and then how does that script get on each of your nodes? chef isn't intended to build/deploy itself the way it does the rest of your stuff

We've now transitioned everything to heroku buildpacks + a build server, which create self-contained "slugs" and are therefore self-hosting (i.e. the build server can build itself), and which give us a single build/deploy process for everything


While I'm not a Chef fanboy, far from it, some of these comments are just not true.

"and does stuff like execute apt for each desired package on each node on each run". No, you just need to set it up correctly so that it keeps the timestamp of the latest apt-get update, and does not refresh it within the next x hours.
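
A minimal sketch of that timestamp check, written in Python rather than Chef's own DSL. The stamp file and the `max_age_hours` parameter are assumptions for illustration, not names Chef ships:

```python
import os
import time


def needs_apt_update(stamp_path, max_age_hours, now=None):
    """Return True if the apt cache should be refreshed.

    stamp_path is a hypothetical file touched after every successful
    `apt-get update`; its mtime records when the cache was last refreshed.
    """
    now = time.time() if now is None else now
    try:
        last_update = os.path.getmtime(stamp_path)
    except OSError:
        # No stamp file yet: the cache has never been refreshed.
        return True
    return (now - last_update) > max_age_hours * 3600


def record_apt_update(stamp_path):
    """Touch the stamp file after a successful refresh."""
    with open(stamp_path, "a"):
        pass
    os.utime(stamp_path, None)
```

A run would call `needs_apt_update(...)` first, and only when it returns True run the refresh and then `record_apt_update(...)`.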

"not self-hosted". Somewhat agreed, although you can easily get around this with the excellent https://github.com/tobami/littlechef

Would you care to explain/write about your experience and setup in a little more depth? We explored this path and found it really interesting.

  >  what if you want your chef client to bootstrap with some custom rubygems?
knife bootstrap[1]?

[1]: http://wiki.opscode.com/display/chef/Knife+Bootstrap

Using Heroku buildpacks means you have shifted to using Heroku instead of self-hosting?

No, we've combined bits & bobs that Heroku has open sourced (check ddollar's repos on github) with our own Ruby/CoffeeScript/Lua/Bash glue on dedicated hosting. There's a single OS image and compiled binaries for software stored in Ceph's S3-compatible distributed storage, and everything else is applications and buildpacks stored in git

The build server takes URLs for an app and a buildpack, runs the buildpack to do any dependency fetching/compilation/etc, bundles up the output into a .tar.gz file that contains everything necessary to boot up on the standard LXC image, and uploads it into the distributed storage. Then when we want to boot 1 or 2 or 10 of that app, we just grab the gzipped "slug" and boot it up on ephemeral LXC containers (i.e. a single base image + temporary overlay file system)
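
To make the bundling step concrete, here is a rough Python sketch of just the "bundle the output into a .tar.gz" part. It is not the actual build server code; `build_slug` and its arguments are hypothetical names, and running the buildpack and uploading to storage are left out:

```python
import os
import tarfile


def build_slug(build_dir, slug_path):
    """Bundle a buildpack's output directory into a gzipped tarball.

    Paths inside the archive are relative to build_dir, so the slug can be
    unpacked on top of a base image. slug_path must live outside build_dir,
    otherwise the archive would try to include itself.
    Returns the sorted list of archived file paths.
    """
    members = []
    with tarfile.open(slug_path, "w:gz") as tar:
        for root, _dirs, files in os.walk(build_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                rel = os.path.relpath(full, build_dir)
                tar.add(full, arcname=rel)
                members.append(rel)
    return sorted(members)
```

Because the archive only holds relative paths, the same slug can be unpacked into any number of containers started from the one base image.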

This system can build a rails web app, a node.js app, other services like mysql/redis/nginx, and even the build server itself

So the bootstrapping is a little tricky, but e.g. the process to run a build may go git -> api -> mysql/redis -> worker -> build server -> ceph/s3, and each of those pieces themselves is built/deployed/managed the same way, which we've found to be a huge win for maintainability

"ruby_block" is a Chef resource that allows you to execute your favorite ruby code as a separate resource.

You can always drop to regular, plain Ruby, anywhere you like.

you might want to look into pallet to achieve what you want if you're not afraid of clojure.

I never got on well with either. Puppet was mind-bogglingly slow (even locally without a master), both Chef and Puppet felt overwrought, and I prefer keeping dependencies to a minimum.

Just a pointer for anyone else that doesn't have much love for either: I found myself happy and productive with Salt. Fast, simple, and lets me do what I needed with less Byzantine setup: http://docs.saltstack.org/en/latest/

There's also Salty Vagrant and Salt Cloud for local and remote provisioning.

"I prefer keeping dependencies to a minimum."

The job of an OS not a human admin. "apt-get install puppet" on the clients and "apt-get install puppetmaster" on the puppetmaster. That's about it.

"Puppet was mind-bogglingly slow"

Was it a pause exactly equal to one DNS lookup timeout? The SSL inside puppet used to get all wound up about reverse DNS matching the forward or whatever exactly. You need working DNS to puppet. If DNS is dead you may as well forget debugging puppet until your local DNS is healthy.

Also it's possible to do unusual SSL configurations that can make it a bit slow. Vanilla out of the box should be reasonably fast. Starving a virtual image of CPU can make the SSL slow... a virtual 40 MHz 386 equivalent is not going to do SSL any faster than a physical 40 MHz 386 used to.

"The job of an OS not a human admin. "apt-get install puppet" on the clients and "apt-get install puppetmaster" on the puppetmaster. That's about it."

It might be a simple apt-get command, but consider setting up Chef Server. You're suddenly adding the following to your system: Ruby, CouchDB, RabbitMQ, Java, merb-assets, merb-core, merb-helpers, merb-param-protection, merb-slices, thin, solr-jetty. And then maybe libxml-ruby, merb-haml, haml, coderay. (From http://wiki.opscode.com/display/chef/Installing+Chef+Server)

That's a whole heap of stuff and moving parts I wasn't looking for. Compare that to the above mentioned Ansible or cdist on the lightweight end of the spectrum.

What distro are you running that's got current Puppet/Chef releases in its archives?

The story may be better for puppet, but with chef, it's pretty much "off to the racetrack" to get the latest and greatest Ruby, Chef, and other deps installed.

What was slow? The compilation time for the Puppet catalogs, or the run time on the clients?

Compilation is way quicker in 3.0, averages a couple of seconds per node for me. Client side is parallel, not sure how it could be quicker.

Sorry, I don't recall in particular. It was about a year ago.

It was only a basic setup of a user account, a directory tree, and ufw. It took over a minute for either a first or subsequent run on a clean install of Natty. Considering how much more config I had to add and too little time to dedicate to investigating and speeding it up, I had to put Puppet aside.

Good to know 3.0 is a lot faster. I'll give it another look in future.

So from what you write it sounds like the time it took for Puppet to apply the configuration changes was the problem: to install all the packages you wanted, and then to configure and check them. This always takes time though, this isn't Puppet specific.

Came here to comments to post about saltstack. Couldn't agree more. So blindingly fast. I think people who comment that puppet isn't slow just don't know how fast salt is. Tens of thousands of clients polled in just seconds is something puppet just can't approach. I also found not having to learn yet _another_ DSL refreshing.

I recently had occasion to look at Cfengine, Chef and Puppet.

I chose Puppet.

http://chester.id.au/2012/06/27/a-not-sobrief-aside-on-reign... (scroll down if you don't enjoy my windbaggery).

The short version is: Puppet best fits the way I think about how such systems should work. Despite ostensibly belonging to the same genre of system, the three of them have subtle but very important differences.

Could you elaborate about these subtle but very important differences, please?

I found the answer in the above referenced blog post.


I document my puppet configurations in Leo (http://webpages.charter.net/edreamleo/front.html) taking full advantage of Leo's ability to represent directed acyclic graphs natively.

There are lots of things I'd like to improve in Leo that involve chucking its current code base (e.g.: back end storage is XML and thus version control hostile; front end is not a web browser; acyclic should get dropped from "rooted, ordered, acyclic graph"), but until that bit of brilliance dawns over the world, I'll continue to use Leo.

I'd love to see more details about this. You may not return to this comment, but nigel@puppetlabs.com would love to chat about what you're doing here.

Been puppet'ing for years across maybe a hundred machines. Looks like almost everything Phil initially wrote about puppet got edited afterwards.

There is a third and fourth solution to the "Encrypted data bags for puppet" problem. The third, my solution, is to never, ever, store AAA in a configuration system. EVER! I do store calls to programs and such, or even just data files as a program. I admit sometimes the "program" to get certain passwords is something like "backtick cat somefile backtick" but usually I do better. Those AAA programs/repos are handled much more delicately and securely than an "everything goes" config system that everyone can mess with.
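
A rough Python sketch of the "store a call to a program, not the password" idea. `fetch_secret` is a hypothetical helper and the command it runs is whatever your own, separately secured tooling provides; nothing here is a Puppet feature:

```python
import subprocess


def fetch_secret(command):
    """Run `command` (an argv list) and return its stdout as the secret.

    The config system only stores the command; the secret itself lives in
    whatever separately secured store that command reads from.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

The "backtick cat somefile backtick" case would be `fetch_secret(["cat", "somefile"])`, with access to the file controlled outside the config repo.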

The fourth solution is the implied idea that you'd never rotate AAA credentials on a regular basis and never change infrastructure passwords when someone quits, which sounds pretty funny to me. Hey HN, my mysql root password for a month back in 1998 was: (insert something like line noise here)

Another old time puppeteer observation is everyone has a SSL nightmare eventually and even the mighty GOOG can't help you sometimes. Especially on restoral of backups, or replication of live systems, it can get pretty hairy. Also DNS malfunctions can horribly confuse puppet's SSL occasionally. This is something you'll only hear from an old puppeteer not a short experiment like the article.

I tried both and decided the answer was neither - I'd recommend giving Ansible serious consideration. (https://github.com/ansible/ansible)

Thanks for the mention!

FWIW, front page is http://ansible.cc and there's a FAQ there that explains a bit more.

Docs are http://ansible.cc/docs

Another one is Babushka, for serious simplicity and speed http://babushka.me

We use babushka at 99designs for setting up servers and development VMs. I highly recommend it.

Thanks for the Ansible reference, looks interesting.

No problem ... there's also a Google Group that's quite active and it's a great resource. The project's founder is very active there and it will give you a good feel for how the project operates. (http://groups.google.com/group/ansible-project/topics)

Last week I interviewed the product manager of Puppet [0] and asked him to differentiate Puppet from Chef and cfengine. He didn't fall for it, and just said something like "the most important part is that you do something."

I've used both (and cfengine) to varying degrees and would have to agree. Simply using a configuration management tool takes an incredible amount of work off your shoulders, you can't go wrong with either.

0 - http://linuxadminshow.com/2012/10/28/episode-4-puppet/

(edit - I said I interviewed the Chef guy, I meant Puppet)

I have the feeling that choosing a configuration management system is very similar to choosing a build system. There are a few big players where it's easy to do easy things and possible to do complex things, but hacking with them is very arcane (like GNU make). There seems to be quite a lot of documentation and contributed packages for both Puppet and chef, for example.

But, instead of trying to fix what is broken, it seems that a lot of people end up reinventing the wheel and writing their own "lightweight" configuration management system.

As someone that wants to switch to such a system, I believe that it will be easier to learn puppet since the existing code base is just enormous.

Why not add: or Fabric or Ansible or something else? I haven't found much that Fabric won't do for me and I haven't found much reason to install Ruby on my system [Python is a system package].

For a non-Puppet and non-Chef user, would someone explain why you'd limit your choices to puppet and chef?

One huge advantage that puppet and chef offer is abstraction from the underlying system. So if you write a configuration that says "Apache should be running, have PHP installed, and have the following 3 virtual hosts installed", then you don't care if you're running on Solaris, Debian, or CentOS. The manifest/recipe takes care of figuring out what the package is called and where the config files go.
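
As a toy illustration of that abstraction (in Python, not in Puppet's or Chef's language): the manifest names a logical resource and a per-platform table supplies the details. The table layout and the `resolve` helper are made up for this sketch, and Solaris is deliberately left out of it:

```python
# Hypothetical lookup table: one logical resource, per-platform details.
APACHE = {
    "debian": {"package": "apache2", "service": "apache2", "conf_dir": "/etc/apache2"},
    "centos": {"package": "httpd", "service": "httpd", "conf_dir": "/etc/httpd"},
}


def resolve(resource_table, platform):
    """Return the platform-specific details for a logical resource."""
    try:
        return resource_table[platform]
    except KeyError:
        raise ValueError("no mapping for platform %r" % platform)
```

The person writing "Apache should be running" never touches the table; only whoever maintains the module does.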

I'm not familiar with Ansible, but Fabric seems more like SSH with a for loop. Puppet and Chef are not about automating the typing as much as they are about ensuring consistency. In my Apache example above, if someone goes and changes a vhost file by hand, that file will be replaced with a good one and Apache will be restarted, even if that file was made by a complicated template.
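
The "replace the hand-edited file and restart" behaviour boils down to something like the following Python sketch. `ensure_file` and the `on_change` hook are hypothetical stand-ins, not how either tool is implemented:

```python
import os


def ensure_file(path, desired, on_change=None):
    """Make the file at path contain exactly `desired`.

    Returns True if the file had to be (re)written, False if it already
    matched. on_change is a hypothetical hook standing in for "restart
    Apache"; it only fires when something actually changed.
    """
    current = None
    if os.path.exists(path):
        with open(path) as f:
            current = f.read()
    if current == desired:
        return False
    with open(path, "w") as f:
        f.write(desired)
    if on_change is not None:
        on_change()
    return True
```

Running it repeatedly is harmless: when nothing has drifted it writes nothing and restarts nothing, which is the difference from a script that just re-copies files on every run.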

    then you don't care if you're running on Solaris, 
    Debian, or CentOS.
That does sound like an advantage, but I generally run the same OS across all machines (for a client), so that's not a huge advantage for me.

    Fabric seems more like SSH with a for loop.
Definitely. And that suffices fairly well for managing 10-20 servers.

    if someone goes and changes a vhost file by hand, that file 
    will be replaced with a good one and Apache will be restarted, 
    even if that file was made by a complicated template.
Ah, so Puppet is running on a machine, it will maintain the configuration of the machine even if some well-intentioned user fiddles?

> Ah, so Puppet is running on a machine, it will maintain the configuration of the machine even if some well-intentioned user fiddles?

Yes, at least for all resources under its control. So in theory all changes to resources managed via puppet should be done via puppet. I guess that's better this way when you're running a large farm of machines, the use-case puppet/chef/etc. were created for.

Ansible is definitely model based (and idempotent, like Puppet or Chef), unlike Fabric.

The resource model was reasonably inspired by Puppet's, even if other aspects were not.

I look at Fabric as a deployment tool, fine if you like it, but configuration management requires more things on top.

Ansible works in pretty much the same manner.

I use Fabric to run Puppet. Setting up a puppet server is overkill for my small setup, but I found Puppet to be better suited to managing server configuration and more scalable should my number of servers continue to grow.

Fabric + Cuisine is happy joy.

Thanks for the tip!

I have started using chef to automate build environments on top of vagrant. One thing that made me choose chef is windows support: chef uses winrm on windows, + things like powershell, and that seems to work better than puppet (e.g. I could not for the life of me make an unattended install of visual studio with puppet).

On the minus side, I don't find it particularly well documented (it feels like reading MSDN, where things are self-referential), and it is quite slow.

For those looking for a dead simple alternative to Chef or Puppet, I wrote Shoestrap. It's a pure Bash, no-BS set of scripts. Doesn't have all the bells and whistles of Chef, but it might be enough for your needs.



I too wrote my own system - which uses Perl. Mine is called Slaughter and the upcoming 2.x release allows you to fetch policies/instructions from remote servers via one of:

* rsync
* http
* git clone
* hg clone

etc. It is pretty extensible and covers my needs. Beyond that it is a little hard to know.

http://steve.org.uk/Software/slaughter/new.html for the upcoming 2.x release.

I've tried both Puppet and Chef. Puppet fits my mental model more, but I had far more success with actually getting Chef up and running.

Puppet, on the surface, has less 'messing about' to do in order to get things running - but my experience was that I encountered a few issues which basically caused me to spend way too much time mucking about rather than working.

Ideally I would like to use Puppet, I feel like it's cleaner and it seems more 'logical' to me at least - but I only have actual experience of Chef working, so make of that what you will!

Once Chef is up and running, it absolutely flies, which is always good!

There is also cdist (http://www.nico.schottelius.org/software/cdist/).

Some highlights:

• Python based

• Uses SSH key-based auth

• No central server required

• Only requires SSH and a Posix shell on the client

We tried both and finally ended up using puppet.

In our experience puppet is easier to set up and start with: the chef server stack is HUGE compared to just installing a package for the puppet master, and the certs/knife setup is a PITA.

The DSLs are pretty similar except for how they track dependencies and the general workflow: where chef is `procedural`, puppet internally builds a dependency graph from a declarative syntax, which can be hard to track down and understand in some situations.
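
To show the difference in a few lines: a procedural tool runs resources in the order you wrote them, while a declarative one derives the order from declared relationships. A Python sketch of the latter using the standard library's `graphlib` (Python 3.9+); the resource names are invented and this is not Puppet's actual implementation:

```python
from graphlib import TopologicalSorter


def apply_order(requires):
    """Return an order in which resources can be applied.

    `requires` maps each resource to the set of resources it depends on.
    A dependency cycle raises graphlib.CycleError.
    """
    return list(TopologicalSorter(requires).static_order())


# Hypothetical resources, deliberately declared "out of order".
requires = {
    "service[apache2]": {"file[vhost.conf]"},
    "file[vhost.conf]": {"package[apache2]"},
    "package[apache2]": set(),
}
order = apply_order(requires)
```

Here `order` comes out as package, then file, then service, regardless of how the dict was written. That is also why it can be hard to follow: the execution order is nowhere in the source, only the relationships are.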

Regarding cookbooks vs manifests, both have tons of modules around, but as with any plugin/module ecosystem, the basic stuff is generally covered; you will still need to get your hands dirty to make things work your way, and not all manifests/cookbooks are good enough to use.

We didn't find compilation and install time to be an issue: a complete, fully customized LAMP stack install takes less than 2 minutes per node.

We have a linux background and didn't find ruby hard at all to hack on and mess around with to solve some of our issues, like custom Facter facts and installing some rare stuff like nsis :)

I have used both extensively. I found Puppet much easier to get started with but the custom Puppet DSL becomes very quickly constraining. I know that a new pure Ruby DSL is in the works. I found that Chef has a steeper learning curve but many times greater productivity. I do feel that there is much better tooling support for Chef than Puppet, in large part because Chef's pure Ruby approach is much friendlier to Ruby developers. It remains to be seen whether Puppet's coming Ruby DSL will have the same level of functionality.

If you are managing a small application, Chef may feel heavyweight. However, imho, you can't manage a Service-Oriented Architecture without a serious configuration management system, be it Chef or Puppet. There are just too many moving parts.

The Ruby DSL for puppet has been around for a couple years, and recently got some improvements as part of GSoC.

http://puppetlabs.com/blog/ruby-dsl/ http://puppetlabs.com/blog/gsoc-project-ruby-dsl-for-puppet/

I pretty much hate puppet after 4 years of dealing with it. Scaling problems, having to run another bunch of crap to store configs, in an app that was designed to store configs.

I'm getting all riled up just remembering the hell.

Next time I get to pick, I'll use salt.

I migrated all my machines to salt 6 months ago after using Puppet for a few years. It's been pretty painless but I would check out Ansible also before committing to a Salt transition.

The answer to this question is always "yes".

Over the years, I've also used cfengine and a host of other systems and have decided that simplicity and the ability to customize a system to fit your needs are key. Turns out it's quite easy to roll your own, given the right tools. I now use:

gsl https://github.com/imatix/gsl and synctool https://github.com/walterdejong/synctool

The former is used to generate configurations for various classes of systems and the latter pushes them out over ssh. Works very well.

We are currently building out Chef to handle change management for 3000+ servers :), loving it so far but we are also a RoR house too.

Know of another big hosting company using Puppet and they love it, for about 10k+ servers.

The reason I chose Puppet was that it allows my servers to pull their configuration instead of my pushing it to them. This is important because my servers are all remote and behind firewalls that I don't control. Using Puppet, my servers can make an outbound connection to the Puppet Master without having to mess with firewall ingress rules.

Things like Salt or Fabric require direct access to the target servers and won't work in my situation. I'm not sure about how Chef works as it wasn't available (popular) when I was setting this up several years ago.

Salt can be used both in a pull or push fashion.

I like the topic, but have to say the article wasn't that great - the author didn't seem to have experience with puppet and had to be corrected in the comments section heavily.

While on the topic of devops, worth checking out:



I am a fan of chef, we use it at cloudpokerdb.com for all of our configuration management. The Ruby Scripts vs Config Files rings true (at least it did when I first compared the two a while back). I'm sure you can do it with puppet, but chef allows us to auto-deploy and set up any type of server environment (using openstack) one could need, via a couple of recipes and data bags.

You might want to also take a look at CFEngine, which inspired Puppet and Chef.

See "Relative Origins of Puppet, Chef and CFEngine" http://verticalsysadmin.com/blog/uncategorized/relative-orig...

It's not really something I know much about, but I recently went to a talk where the guy said the ultimate difference between Puppet and Chef is Chef is ruby scripts, but Puppet is a config file - which means that Puppet can guarantee that its actions are idempotent.

My first reaction to this was "Puppet must either have horrible documentation or be difficult to pick up."

I mainly gleaned this from him correcting himself about puppet all the time and took this to mean he had a harder time picking up puppet.
