I managed the process, and consider it a success. However, here are some points I would make:
1. It took a long time. Let's be generous and say it took 2-3 man-months of time to set up 4-5 different projects and roles. This was probably 10-20x what it would've taken to set up the servers directly. Why? Learning curve with chef for both our programmer and sysadmin. Figuring out how to make config changes automatable and idempotent.
2. The scale you get from chef is bigger than managing production infrastructure. We now use chef for not only production deployment, but also dev. Once paired with Vagrant, we are able to get new devs up with a complete stack in about 10 minutes of keyboard time. If we need to upgrade to some new version of something, only one person has to deal with the sysadmin; everyone else can just update their box.
3. I think it will save money in the long-term. A good sysadmin is $100/hr+. Unfortunately you have to pay that rate whether they're doing architecture, security review, or just editing text files. With chef, a non-sysadmin resource can generate recipes with just architectural advice and review from a sysadmin. This is much more efficient, especially for small shops where a sysadmin is an expensive and not immediately available resource.
(We recently hired a sysadmin "+").
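To illustrate the "automatable and idempotent" part of point 1, here is a minimal sketch in plain Python rather than a Chef recipe. The function name `ensure_line`, the file name, and the setting are all made up for illustration; the point is only that running the same change twice leaves the system in the same state and reports whether anything changed.

```python
import os
import tempfile


def ensure_line(path, line):
    """Make sure `line` is present in the file at `path`.

    Returns True if the file had to be changed, False if it was already correct.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as f:
            content = f.read()
    if line in content.splitlines():
        return False  # already converged, nothing to do
    with open(path, "a") as f:
        # Avoid gluing the new line onto an unterminated last line.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(line + "\n")
    return True


if __name__ == "__main__":
    # Hypothetical file and setting, used only for illustration.
    path = os.path.join(tempfile.mkdtemp(), "sshd_config")
    print(ensure_line(path, "PermitRootLogin no"))  # first run changes the file
    print(ensure_line(path, "PermitRootLogin no"))  # second run is a no-op
```

A naive `echo ... >> file` in a shell script would append a duplicate line on every run, which is roughly the kind of thing that takes time to unlearn.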
I interviewed several sysadmins over the years and no one that I deemed competent charged less than $100/hour.
I did find people charging as low as $50/hour but they didn't seem that great, or that reliable, or that available.
For me as well, you pretty much need to trust your sysadmin more than almost anyone else in your company (except people that can sign checks).
Trust comes at a premium.
I will yield that I don't like working with substandard people. I hate having to manage people and am willing to pay a premium for people I trust to work on the right things with the right skills in a timely manner. Besides, it doesn't scale. One of my business goals is to never have middle management.
I'm not looking to argue so much as to inject some more data into the price point that was casually dropped on this thread earlier. I do not think the other guy overpaid for sysadmin; if he's got an amazing admin, great! I can see paying a premium for that.
Another point I'd like to raise is, if you're paying $100/hr for sysadmin, and using them frequently, contracting instead of fulltiming sysadmin might be penny-wise, pound-foolish. But maybe not, if you're only paying $10,000/yr in sysadmin. We've used over 100 hours of admin in just the last couple weeks. A great hire; one we made "37signals-style", after realizing that doing all the sysadmin chores ourselves was subtly making us all miserable and ineffective.
Believe me, I know the difference between contracting rates and salary. ;)
I am not an experienced sysadmin, but when I feel like I could do a better job than the person I'm interviewing, I consider that substandard.
You could hire someone full time for about $120K a year but then with benefits and employer taxes it will end up being closer to $150K a year anyway.
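For what it's worth, the break-even between the two figures quoted in this thread is easy to work out. This is only a back-of-the-envelope sketch: the $100/hour and $150K/year numbers come from the comments above, while the 2,000-hour working year is my own assumption.

```python
def break_even_hours(hourly_rate, loaded_annual_cost):
    """Hours per year at which contracting costs as much as a full-time hire."""
    return loaded_annual_cost / hourly_rate


CONTRACT_RATE = 100        # $/hour, figure quoted in the thread
LOADED_SALARY = 150_000    # $/year with benefits and taxes, figure quoted in the thread
HOURS_PER_YEAR = 2000      # assumption: roughly 50 weeks * 40 hours

hours = break_even_hours(CONTRACT_RATE, LOADED_SALARY)
print(hours)                   # 1500.0
print(hours / HOURS_PER_YEAR)  # 0.75
```

So under those assumptions, contracting is cheaper until you need a sysadmin for about three quarters of a working year.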
Are you sure we're talking about "system administrator" and not a more specialized role like "devops"? I know to us nerds those are basically the same thing, but they really aren't. If you're doing sysadmin for the primary purpose of deploying and maintaining boxes designed to run one proprietary application, you might be a devops person, and not a sysadmin.
Again, this might be a valley thing. I've got a bunch of friends who have complained how hard it is to get sysadmin in SFBA. Just know, if that's the case, the dropoff in salary outside the valley is waaaaaaaaaay sharper than it is for dev, which is pretty much just COLA adjusted from place to place.
(We staff offices in Manhattan, Chicago, and Mountain View, for what it's worth).
It's probably wrong from a larger perspective to lump the two together, but for me sysadmin as a role has been supplanted with devops. If you are a sysadmin that doesn't want to automate away their work via devops, you are not doing it right.
More importantly though, bill rates and capability do not track each other. Bill rates track risk. There is a lot of amazing-but-unproven systems talent out there, and there are some very expensive pikers on the market too. Generally I would agree that the more you pay, the less likely you are to have to fire 3 months later.
It helps a lot to be able to do the job yourself, soup-to-nuts, so that you'll have a better shot at screening candidates.
I am absolutely confident that there's some role/vertical definition you can come up with where sysadmins are making $100k in the door. But I think you're discounting most of the market for sysadmins, just like the people on the other current HN sysadmin thread who all seem to believe that sysadmins also optimize SQL queries, fix C code, and teach systems programming to stupid developers.
Although it's very new, I think I'll try out juju.ubuntu.com for my next large deployment instead of sticking with chef or puppet, if only for the flexibility of the hooks-in-any-language thing and a chance to feed my bash fetish.
2-3 months is a long time.
We consistently hear 2-6 weeks to get to full production, with 30-120 minutes for some proof of concept work.
And one of the great things about Puppet is you can start very small - automate the parts that are hardest, most critical, etc., and do bits manually as it makes sense for your problem set.
And yeah, managing dev and prod the same way is absolutely critical for clean deployments.
I don't know what you mean by "execute receipts"; however, in the video they kick off a puppet run, which makes me think you could similarly kick off a chef run instead.
btw, I am copying your method of managing dotfiles with a Rakefile as we speak.
I counter that while it takes a while to learn chef, it also takes many times more effort to maintain a custom set of shell scripts over the long term.
FWIW, Puppet's dependencies are much simpler. I don't know much about Chef vs Puppet, but I can say that from an installation and dependency maintenance POV, Puppet wins.
To be clear, those dependencies are required for running your own open source chef server.
You can use chef without the server, as chef-solo, or let Opscode run it for you in the form of Opscode Hosted Chef. We do make it easy to install Chef Server through Chef Solo itself, or with our Ubuntu/Debian apt repository.
The chef databags/cookbooks tend to contain rather sensitive information (ssh-keys, passwords). Handing all that stuff over to a third-party borders on criminal negligence to me.
You can choose to encrypt the contents of a data bag using a locally generated (on your hardware, nothing we control) key.
There's also a pretty harsh difference between the security practices at Amazon and the practices that Opscode displays in their OSS-code.
I'm an opscode customer, but before that, I was playing around with their free plan.
Just to be clear, this has always been a core goal with Puppet - very low dependencies to make it easy to adopt.
There are some real downsides - we have to do a lot more coding, and it can be tough to get all of the hot newness - but we think the users benefit from a much simpler solution that's much easier to support.
The memory footprint is about 10 MB, install size maybe 30 MB.
It compiles into small binaries and is usable anywhere - in the cloud, on supercompute clusters, on the desktop or laptop, on a smartphone, in embedded devices.
1. The node object ends up being too large, which leads to memory issues when a search returns more than 200 nodes (800MB of memory).
2. Chef discourages declarative configuration.
3. Chef lacks a remote trigger mechanism.
Issue one starts to kill you once you have a large number of nodes. Dedicating a quarter of the available memory to configuration management seems like a poor financial choice. We've come up with workarounds at my company (generate files centrally, and distribute with chef remote_file syntax), but I still feel they are hacky.
Issue two is more serious. Reindexing on chef only occurs after a node has submitted its node object back to the chef server. This results in incomplete searches until a node successfully completes a run. If you wish to remove a node from a particular role or attribute from a host, you may have a hard time doing so until the next chef run completes.
Problem three is really a result of the expense of running chef. If the memory and CPU costs were lower, there wouldn't be any real issues running Chef more frequently. Some changes I need to go out immediately, some don't matter. I end up back in the world of the SSH loop too often with Chef.
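To put rough numbers on issue one: the 200-node / 800MB figure is from the list above, and everything else here is derived from it. The linear scaling in `search_memory_mb` (a hypothetical helper) is an assumption on my part, not something measured.

```python
SEARCH_NODES = 200       # nodes returned by the search, from the comment above
SEARCH_MEMORY_MB = 800   # memory used for that search, from the comment above

# Average cost of one node object in a search result.
per_node_mb = SEARCH_MEMORY_MB / SEARCH_NODES  # 4.0 MB


def search_memory_mb(node_count, per_node=per_node_mb):
    """Estimated memory for a search, assuming cost grows linearly with nodes."""
    return node_count * per_node


# If 800MB is "a quarter of the available memory", the box has about this much.
implied_total_mb = SEARCH_MEMORY_MB * 4  # 3200 MB

print(per_node_mb)
print(search_memory_mb(1000))
print(implied_total_mb)
```

At about 4MB per node, a search across 1,000 nodes would want around 4GB under that assumption, which is more than the whole box implied above.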
2. Chef definitely does not discourage declarative configuration. Chef recipes include declarative resources for configuring your infrastructure. Since recipes are an internal Ruby DSL, there may be nondeclarative code in them.
3. Chef itself doesn't have a remote trigger mechanism because the Chef run is all about configuring the local node. Nothing prevents you from using the Ruby language in a recipe to hook up some kind of remote trigger though. People in the chef community are doing this with projects like Noah and Pylon.
2. I should be more specific. Generally chef relies on the information that ohai provides, not on information enumerated by the administrator. There is a general assumption that the systems are properly configured, and chef is only furthering that, since the hosts provide most of the configuration details. (Yes, you could do stuff with data bags to address this issue.)
3. Thanks for the links. I've not seen those projects previously. I've solved the issue for myself using a much lighter weight solution.
2. There's no assumption that systems are correctly configured other than they start from a baseline configuration in the most common use case. We have worked with several customers managing existing infrastructures of running systems that had an unknown baseline and Chef was able to automate the pieces they cared about.
Again with "most common use case."
* Why not modularize chef? Right now it's a nightmare to get working; emerge makes a graph about 200 nodes long...
I plan to make a puppet / chef alternative at some point, because they both seem lacking.
Not to discourage you from innovating; just so you know what's out there today.
1) the declarative vs imperative aspect
2) Chef's heavy dependencies
As a Ruby developer, I like Chef's Ruby DSL; but I somehow feel that its imperative DSL will lead to something similar to Bash Hell. I'd like to read more about the declarative properties of Puppet vs the imperative way of Chef, and why one should prefer one over the other.
Secondly, as some earlier commenters mention, Chef's dependencies seem quite excessive.
The main difference with Chef is that it has imperative structure (i.e. ruby-based scripting) to fall back on, whereas Puppet forces you to go out to a script if you want to be imperative.
In the long run, it's arguable that a strict declarative structure has more interesting properties when dealing with query of status & handling changes or drifts rather than "stamping out a server".
Puppet has been great for me to detect and correct drift, for example, all the while ensuring pre-requisites are executed in the proper order. The tradeoff is that you have to think through the various dependencies at a detailed level, which can be difficult. An extreme analogy would be programming with a logic language vs. an imperative language.
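A toy model of the declarative idea might make this more concrete. To be clear, this is not how Puppet or Chef is implemented; it is a sketch in Python in which the resource names, the `DESIRED` table, and the `plan`/`converge` functions are all invented for illustration. Desired state is declared as data with explicit prerequisites, the tool compares it against actual state to find drift, and fixes only what drifted, in dependency order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Desired state, declared as data. Resource names and values are hypothetical.
DESIRED = {
    "package:nginx": {"state": "installed", "requires": []},
    "file:/etc/nginx/nginx.conf": {"state": "present", "requires": ["package:nginx"]},
    "service:nginx": {"state": "running", "requires": ["file:/etc/nginx/nginx.conf"]},
}


def plan(desired, actual):
    """Return the resources that have drifted, prerequisites first."""
    graph = {name: set(spec["requires"]) for name, spec in desired.items()}
    ordered = TopologicalSorter(graph).static_order()
    return [name for name in ordered if actual.get(name) != desired[name]["state"]]


def converge(desired, actual):
    """Correct drift on a simulated system and return what was changed."""
    changed = plan(desired, actual)
    for name in changed:
        actual[name] = desired[name]["state"]  # stand-in for the real work
    return changed


if __name__ == "__main__":
    # Simulated machine: package is fine, config file is missing, service is down.
    actual = {"package:nginx": "installed", "service:nginx": "stopped"}
    print(converge(DESIRED, actual))  # ['file:/etc/nginx/nginx.conf', 'service:nginx']
    print(converge(DESIRED, actual))  # []
```

The cost mentioned above shows up in the `requires` lists: someone has to think through and write down every dependency, whereas an imperative script gets its ordering for free from the order of the lines.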
My experience has been Puppet is growing in popularity with enterprises, not just web companies, and it seems Puppet Labs is targeting this audience more than Opscode is, comparing customer lists. I'm not quite sure why this is - could be attitude - Chef is more about "get it done now", Puppet more about "get it right for the long haul", but even that is a caricature.
We at Puppet Labs have always focused on building tools that anyone can use, not just the best hackers in the world. We've got some of the best sysadmins working with us, but we punish them by making them write software that even people who aren't the best sysadmins can use.
So yeah, we get a ton of enterprise adoption as a result, but we've also got a ton of web companies using Puppet. It's true that in the Rails web startup, we're not always the number 1 choice, though. :)
This way I'll be able to invest in the server part of chef later on, when I need it.