Hacker News
How Github uses Github to build Github (zachholman.com)
389 points by brown9-2 on Sept 22, 2011 | 66 comments

Every time I read one of these (Github, Etsy, ...), I feel a bit guilty that our tiny startup doesn't do continuous integration/deployment. I mean, we git push to Heroku to deploy (maybe once a week), but there is no automated push once the tests pass. It seems like this is a big psychological leap (deploy every week vs deploy every hour).

When does setting up something like Jenkins CI become worth it? When you have 2 people? 10 people? 100 people?

Is there tons of custom code to set it all up with Github or whatever deployment scripts exist? What CI systems are dominant? (I'm mostly curious about Ruby-centric ones, but don't really want to bias responses.)

Assuming your main product is on the web, you really only need a few things to do this right:

- Automated tests that run on push to any branch.

- Rolling deploys (no downtime) from any branch.

- Exception reporting system (Exceptional, Hoptoad, homegrown, whatever).

- Twitter search feed.

Push to a topic branch, wait for the tests to come in green, deploy from topic branch, watch exception notification and twitter. If everything looks okay after 10 minutes, push to master. If exceptions or twitter blows up, redeploy master.
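The flow above can be sketched as a small decision function. This is only an illustration: `DeployObservation`, `next_step`, and the thresholds are made-up names and values, not something from the talk or from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class DeployObservation:
    """What you saw while watching a topic-branch deploy (hypothetical shape)."""
    tests_green: bool
    new_exceptions: int      # new errors seen in the exception tracker
    complaint_tweets: int    # complaints seen in the Twitter search


def next_step(obs, max_exceptions=0, max_complaints=5):
    """Return the next action for a deployed topic branch.

    The thresholds are assumptions; pick whatever fits your traffic.
    """
    if not obs.tests_green:
        return "fix tests"
    if obs.new_exceptions > max_exceptions or obs.complaint_tweets > max_complaints:
        return "redeploy master"
    return "push to master"
```

For example, a quiet ten minutes with green tests gives "push to master", while any new exception gives "redeploy master".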

You should be able to set up the whole system in about the same time as it takes to develop a majorish product feature.

I've seen it mentioned a few times, but I've never gotten anyone to explain: how do you go about setting up a CI server (like Jenkins) to automatically track branches, new and old?

In Jenkins, builds can be triggered by simply calling a URL. In Github, it is really easy to add a post-receive hook which calls a URL. Just call the build URL, and every time a branch is pushed, Jenkins will build the project and do other things associated with the task (like deploying to staging, running tests on staging, and deploying to production if the staging tests don't return an error).
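A minimal sketch of what "calling the build URL" might look like, assuming the job has "Trigger builds remotely" enabled with an authentication token. The host, job name, and token below are placeholders, and depending on the Jenkins version and security settings the request may also need credentials.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen


def build_trigger_url(jenkins_base, job_name, token):
    """Build the remote-trigger URL for a job (all arguments are placeholders)."""
    base = jenkins_base.rstrip("/")
    query = urlencode({"token": token})
    return f"{base}/job/{quote(job_name, safe='')}/build?{query}"


def trigger_build(jenkins_base, job_name, token, timeout=10):
    """Ask Jenkins to start a build and return the HTTP status code."""
    request = Request(build_trigger_url(jenkins_base, job_name, token), method="POST")
    with urlopen(request, timeout=timeout) as response:
        return response.status
```

A post-receive hook would then only need to hit the URL returned by `build_trigger_url`.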

There is also a Plugin (https://wiki.jenkins-ci.org/display/JENKINS/Github+Plugin) which should handle this automatically.

At least with the latest jenkins and its git plugin, it builds/tests all branches by default.

It's not overly hard to set up; I just added git repos to the list. I'm still "testing" it locally on my workstation before we start using it in all of our environments.

Ahh, I probably should have updated jenkins and the plugin before asking that. Thanks!

Sorry, what do you mean by Twitter search feed?

A search for the product's name or handle on Twitter to detect a surge of activity post-deploy.

(Most surges are negative, eg. "Man, <YourService> is down AGAIN, this sucks.")

I assume it means watching a feed on Twitter for a search about your company or product. If people on Twitter start complaining en masse about your latest push, it's probably a good sign that you need to roll back.

Ok, this is what I also assumed - although I would never rely upon this as part of my continuous deployment process, I guess it does little harm to still monitor and be aware of such things...

At Flickr, I always have our system graphs in one tab and our help forum in another tab.

What exactly is a rolling deploy and how would you set it up?

So, as others have pointed out, what you're really asking for is continuous deployment for which CI is prerequisite.

I think doing CD as early as possible in the project is very worthwhile: It forces you to develop high confidence in your app from very early on. When committing your work means that users will be using your code in a few minutes' time, you get in the habit of making sure that you have appropriate test coverage for the code you're committing. This also makes it more difficult to take shortcuts or say "I'll add tests around this tomorrow".

If you don't get in the habit of CD early on, you risk developing bad habits, such as manual testing or hard-to-automate release steps, which make it progressively harder to start doing CD.

It's worth setting up Jenkins even with 1 person if it's a serious project. It is ridiculously easy to set up, and it's a "set and forget" procedure. This is orthogonal to continuous deployment, by the way.

While I agree that it is orthogonal I would highly recommend setting up continuous integration if you do continuous deployment.

Agreed; I can't imagine CD without CI. However, CI without CD is pretty normal.

Don't wait. Jenkins is literally one of the easiest pieces of web service software you could possibly set up. All you need to do to try it is download a war file and run java -jar jenkins.war. That's it!

I run it on my personal computer. I also set it up at our company, where I'd say the active user base is around 10 people. It's useful in both cases.

Jenkins has plugins for everything, and soon you'll be able to develop plugins in Ruby, and without maven, so if you don't have what you need you'll be able to add it.

Don't feel bad about it. Always do what makes sense for your business.

If you can handle doing it manually, do it. There will be a point where you are growing too much and too fast, and you lose too much time deploying; then look into implementing CI.

A lot of people will tell you to use CI from the start. And our startup has been using CI since the beginning, but there was no apparent benefit in the start. It was only 1 full time developer, and a second part time.

Right now we couldn't live without it, but this is because our application has grown considerably, and the speed at which we are implementing our ideas is fast. Losing time doing deployment and running tests manually would be a great loss of time and of flow.

CI is only worth it if there is a benefit for your project.

I recently set up Jenkins for a team of 6 with backend/frontend and it was easy and fast to set up. It is nice to automatically have reports added to JIRA when stuff builds correctly or fails, and XMPP notifications allow developers to be notified if something they pushed has broken the build.

What do you use the XMPP over? google chat? Or other XMPP chat programs?

We have a local XMPP server set up on the network, and we all use various different clients, I personally use Adium, we have some developers using Empathy on Linux, and others using Pidgin on Windows.

Jenkins has an account on said XMPP server (our AD logins actually) and is thus able to send messages to the developers that are signed on. We also have a "conference room" set up on XMPP that every developer is logged in to most of the time, and Jenkins is available there to run commands, so you can for instance say !build android-client and it will start building the android client, test it in the emulator and report results to the chat room.
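To give a rough idea of how a chat command like `!build android-client` could be routed, here is a small sketch. It is not how the Jenkins chat integration actually works internally; `parse_command`, `handle_message`, and the handler table are hypothetical names used only for illustration.

```python
def parse_command(message):
    """Split '!build android-client' into ('build', ['android-client']).

    Returns None for ordinary chat messages that are not commands.
    """
    text = message.strip()
    if not text.startswith("!"):
        return None
    parts = text[1:].split()
    if not parts:
        return None
    return parts[0].lower(), parts[1:]


def handle_message(message, handlers):
    """Dispatch a chat message to a handler; handlers maps name -> callable(args)."""
    parsed = parse_command(message)
    if parsed is None:
        return None  # not a command, so the bot stays quiet
    name, args = parsed
    handler = handlers.get(name)
    if handler is None:
        return f"unknown command: {name}"
    return handler(args)
```

A `build` handler would be the place to kick off the actual job and post the result back to the room.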

If you don't mind me asking what XMPP server are you using?

I'm guessing that this is a pretty powerful computer to run the emulator on?

I don't know what XMPP server we are using. I was not the one that set that up. I only have access to all of the FreeBSD/Linux machines in the company, someone else gets the pleasure (or should I say displeasure) of managing all of the Windows server infrastructure (AD, File servers, stuff like that), including where our XMPP service is being run because it requires AD access. Upsides and downsides to being the only developer in the company that knows Unix...

We have a gitorious server that also has a jenkins account (jenkins connects over SSH). It is running within a VMware virtual machine with access to 4 cores running at 2.4 GHz and 14 GB of RAM. Whenever jenkins notices a change in git (using polling every 5 minutes for now; eventually I'll get around to doing push notifications of some sort ...), it pulls down the latest source code and starts up the Google Android emulator (basically qemu for the ARM platform with the devices as an Android phone would have them). Once that is started up, it compiles the source code (java), installs it on the phone using adb, runs the test suite, and reports any errors back. Jenkins then shuts down the emulator.
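The polling part of that setup boils down to "compare the latest commit with the last one built". A rough sketch, with `get_head` and `run_pipeline` as hypothetical callables standing in for the git lookup and the emulator/build/test steps:

```python
def poll_once(get_head, last_seen, run_pipeline):
    """Run the pipeline if the branch head changed since the last poll.

    get_head:     callable returning the current commit id (assumed helper)
    last_seen:    commit id seen on the previous poll, or None on first run
    run_pipeline: callable taking the new commit id (assumed helper)
    Returns the commit id to remember for the next poll.
    """
    head = get_head()
    if head != last_seen:
        run_pipeline(head)
    return head
```

Calling this every 5 minutes gives the behavior described; a push notification would simply call `run_pipeline` directly instead.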

It isn't extremely fast, starting up the emulator takes about 50 seconds or so, and I am trying to get a faster server to use as a jenkins build slave, but for now it works wonderfully. Everything I mentioned is completely managed by jenkins.

I compete with myself on a daily basis to add some points on my coverage %. Having a tool like Jenkins to graph my progress helps a lot.

Zach, what software did you use to make those slides? They are beautiful. Thanks.

Keynote. I use SpeakerDeck to share them with people (love, love, love SpeakerDeck... they really nailed the project. Still in private beta for now, I think.)

> love, love, love SpeakerDeck...

Just two cents from a user's point of view (someone viewing not uploading the slides). I think the interface is very pretty, but it lacks a few features that seem essential to me. First, you can only advance or go back one slide at a time. There's no easy way to jump to a specific slide if you want to go back or forward and review something. Second, there's no download option. (From the point of view of some uploaders and SpeakerDeck, this may be a feature not a lack of one.)

I understand that it's still in beta, so this isn't a major complaint, but I would love to see these features added.

It's not apparent, but you can mouse over the progress bar for a real-time preview of any slide. You can then click to jump to that location.

There is a link in top left corner: http://speakerdeck.com/u/holman/p/how-github-uses-github-to-...

Seems like they are in some kind of closed beta.

I don't think SpeakerDeck makes slides. It's just a tool to share them.

He used speakerdeck.com

The slides were very well done. Here's another set not-that-much-related that I found: http://speakerdeck.com/u/pengwynn/p/accelerating-titanium-de....

I like the idea of teaching a chat bot to run various commands for you. Not only is it faster (one place to go), it probably makes it easier to teach new people the ropes. ("How do we do x?" "Tell Hubot to do it.")

This is exactly why Hubot was created in the first place. An excerpt from the original project README:

This is Hubot, a Campfire chat bot in node.js.

The goal is to make common systems stuff seem less like black magic by turning frequently performed manual processes into simple commands and making those interactions visible to the team via Campfire. Benefits:

• Fast access to server logs and machine information when something goes wrong.

• Everyone sees how common tasks are performed.

• Interactions are logged and available in campfire transcripts.

• Let's us do stuff from the phone with Ember when AFK.

• Can paste squirrel pics and stuff like that.

For new people starting at GitHub, they can kind of just lurk in the chat room for a couple days to get a feel for how most types of things are done because people are issuing Hubot commands all the time.

It's really powerful. Way more powerful than we suspected.

EDIT: Formatting.

How do you do the "who's here?" in Hubot? Of all the things involved in managing teams, knowing people's physical locations was always the pain.

I believe it reads the MAC addresses of machines connected to the wireless router.
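If that guess is right, the lookup itself would be simple. Here is a sketch under that assumption; how the list of connected MAC addresses is read from the router is left out, and `whos_here` and the owner table are invented for illustration.

```python
def normalize_mac(mac):
    """Lowercase a MAC address and use ':' as the separator."""
    return mac.strip().lower().replace("-", ":")


def whos_here(connected_macs, owners):
    """Return the sorted names of people whose known devices are connected.

    connected_macs: MAC addresses currently seen by the router (assumed input)
    owners:         mapping of MAC address -> person's name (assumed table)
    """
    connected = {normalize_mac(mac) for mac in connected_macs}
    present = {name for mac, name in owners.items() if normalize_mac(mac) in connected}
    return sorted(present)
```

Someone with two devices on the network still shows up only once, since names are collected into a set.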

Ever going to release the source for hubot? I was just digging through github for campfire bots and would love another repo to pull ideas from!

I saw the hubot presentation at RubyKaigi in Japan. Looks fantastic and fun. Hubot open the door!

So I'm guessing that you are giving your bot a pub/priv key to get stats?

I wrote a Campfire bot in NodeJS that can accept arbitrary commands. Have been using it here at gdgt for a while now. It's pretty nifty.


Some great nuggets of wisdom from what is obviously a very high-throughput and productive team. Zach always has something very interesting to say. The beautiful slides are a nice plus as well. What are you using to make them, Zach?

Keynote (and a lot of time!)

I love it when companies eat their dogfood and make it work while having the employees not hate it at the same time. That says a lot about your product.

That's supposed to be the intention of eating your own dogfood. If a company's too big, maybe it turns into an exercise in "company loyalty" with no concrete benefit because there's too much red tape between devs in one group and changing requirements and design choices in a completely separate group, but for Github I think it might actually be working as intended.

This is an awesome talk.

Any chance of video becoming available? The slides were awesome and colorful and I'd love to hear the words that went along with them.

Great stuff! =)

Frozen Rails didn't record talks. I'm giving this talk at a couple more places, though, and there's a high possibility one of them will.

Yes the slides are very nice. Anyone know what he used to create the deck?

Keynote. Some of these slides are a little messed up because the elaborate animations had to be scaled way down for PDF. For instance, the status.github.com pull request slide was actually a tall browser screenshot. Transitions slid to a further point in the pull request to highlight specific events.

It'd be pretty awesome if he used GitHub to build 'How GitHub uses GitHub to build GitHub'.

GitHub is a great Strange Loop. Probably why it's so good.

Loving that typeface

Yanone Kaffeesatz, it's available from Google Web Fonts:


I really love GitHub's way of working. I think this is the future of working in a team.

This presentation just inspired me to write a really emotional email to our teammates about speed and process using Github... we are already loving Github...but this presentation definitely hit the spot...

"internal twitter"? What's that?

was "don't reinvent the wheel your authentication can be free" intentional?

Also, CMake uses CMake to create its project files. Any others?

Lots of compilers are bootstrapped; GCC builds itself with GCC, GHC (the Haskell compiler) is written in Haskell and builds itself.

These are just two popular examples, but my experience is that most serious compilers for general purpose languages are bootstrapped.

(I'm working on a Perl 6 compiler that is mostly written in Perl 6. It's real fun!)

I'm curious what back end you're using for that project. Offhand I'd assume LLVM but I know there's others that might make a good choice too.

The current backend is the parrot VM (LLVM would be a bit too "low level"), though we plan CLR (.NET/mono) and JVM backends.

Coffeescript compiler written in coffeescript?

dotCloud runs on dotCloud, using a special bootstrap mode.

This is amazing.


This is one of the most commonly misspelled company names in hackerland, right up there with the various mutations of "37signals". Details like this are worth getting right.
