GitHub Flow (scottchacon.com)
327 points by schacon on Aug 31, 2011 | 63 comments



"Every branch we push has tests run on it and reported into the chat room, so if you haven’t run them locally, you can simply push to a topic branch (even a branch with a single commit) on the server and wait for Jenkins to tell you if it passes everything."

From this, it sounds like Jenkins is automatically picking up new topic branches, running the tests, and reporting on the results. Any suggestions on how to set something like this up? In my (very limited) experience with Hudson/Jenkins, this sounds like it wouldn't be possible without manually setting up a project for each branch.


Exactly! I'd love to see a post with this much detail covering the technical side of their deploy process. Using this free-wheeling branching model requires some very flexible deploy tools, which I don't think(?) exist in the wild. Please school me...


You use the github plugin with Jenkins, and tell Jenkins to build all branches. That's about it.

edit: more details here -- this is what Relevance uses for our CI: https://wiki.jenkins-ci.org/display/JENKINS/Github+Plugin


I don't use jenkins, I use buildbot, but I'm sure it is quite similar. If you use github's post-receive hooks it reports all branches not just master when something changes. So if jenkins can build from a posted JSON object, it is quite simple to do. On our top level project if anyone checks into a topic branch it is built and tested automatically.
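A minimal sketch of the post-receive idea, assuming the hook POSTs a JSON body whose `ref` field looks like `refs/heads/<branch>` (as GitHub's push payload does). The function name and the sample payloads are made up for illustration, and nothing here is specific to buildbot or Jenkins:

```python
import json

HEADS_PREFIX = "refs/heads/"

def branch_from_payload(body):
    """Return the branch name from a push payload, or None for non-branch refs (e.g. tags)."""
    payload = json.loads(body)
    ref = payload.get("ref", "")
    if not ref.startswith(HEADS_PREFIX):
        return None
    return ref[len(HEADS_PREFIX):]

# Hypothetical payloads, trimmed to the one field used here.
print(branch_from_payload('{"ref": "refs/heads/ghost-down"}'))
print(branch_from_payload('{"ref": "refs/tags/v1.0"}'))
```

Whatever receives the hook would then hand that branch name to the CI server as the thing to build.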


I'm curious about this too. I know Jenkins has a rest API; maybe it's attached to Github web hooks.


I think atmos spills the beans on how their builds with Jenkins work...

https://github.com/atmos/jinkies/wiki/Jenkins-Project-Setup


I believe Jenkins can be set to pick up the latest commit, regardless of branch. Either they've got a plugin to handle it or they just look over the builds manually.


Why would somebody downvote this?


I don't think it's possible with the typical git/github plugins out of the box. However, Jenkins lets you run any manner of scripts at different parts of the build.

One simple (perhaps too simple) way to implement this would be to run a script which enumerates branches and exports the name of the most recently touched branch to an environment variable. Then, parameterize the build on that environment variable.
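Something like the following could do the "most recently touched branch" part. It is only a sketch: it assumes the remote is called `origin`, relies on `git for-each-ref` sorting by committer date, and both function names are invented here.

```python
import subprocess

def most_recent_branch(sorted_refs, remote="origin"):
    """Pick the first real branch from ref names already sorted newest-first."""
    prefix = remote + "/"
    for ref in sorted_refs:
        ref = ref.strip()
        # Skip blanks and the symbolic origin/HEAD pointer.
        if not ref or ref == remote or ref == prefix + "HEAD":
            continue
        return ref[len(prefix):] if ref.startswith(prefix) else ref
    return None

def list_remote_refs_newest_first(remote="origin"):
    """Ask git for remote-tracking branches, most recently committed first."""
    out = subprocess.check_output(
        ["git", "for-each-ref", "--sort=-committerdate",
         "--format=%(refname:short)", "refs/remotes/" + remote],
        text=True,
    )
    return out.splitlines()
```

Inside a checkout, `most_recent_branch(list_remote_refs_newest_first())` would give the value to export. Note that "most recently committed" is only an approximation of "most recently pushed".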


Your approach is valid. We're doing something similar (e.g. auto-building Debian packages on each commit/push) and use git's post-receive and svn's post-commit hook to detect the modified branch and then trigger a parameterized Jenkins build which takes the branch as argument. You can even trigger different kinds of builds/tests/... depending on the branch name. Due to the nature of branching inside Git it's much more fun with Git, but it basically works for subversion as well.
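Roughly, the hook side of that could look like the sketch below. It assumes git's post-receive input format (one `<old> <new> <ref>` line per updated ref on stdin) and Jenkins' `buildWithParameters` URL; the job names, the `BRANCH` parameter, and the host are hypothetical, and a real setup would also need authentication when it actually sends the request.

```python
from urllib.parse import urlencode

ZERO_SHA = "0" * 40  # git reports this as the new value when a ref is deleted

def pushed_branches(hook_input):
    """Yield branch names from post-receive stdin lines: '<old> <new> <ref>'."""
    for line in hook_input.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        _old, new, ref = parts
        if new == ZERO_SHA or not ref.startswith("refs/heads/"):
            continue  # ignore deletions and non-branch refs such as tags
        yield ref[len("refs/heads/"):]

def job_for(branch):
    """Hypothetical routing: release branches get packaged, the rest just run tests."""
    return "build-packages" if branch.startswith("release/") else "run-tests"

def trigger_url(jenkins_base, branch):
    """Build the URL for a parameterized build; BRANCH is an assumed parameter name."""
    query = urlencode({"BRANCH": branch})
    return f"{jenkins_base}/job/{job_for(branch)}/buildWithParameters?{query}"

sample = "a" * 40 + " " + "b" * 40 + " refs/heads/topic/login\n"
for name in pushed_branches(sample):
    print(trigger_url("http://jenkins.example.com", name))
```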


I wasn't aware that you can open pull requests from within the same project (i.e. not from a fork). The idea of using this for quick code reviews before merging code into the production branch is really interesting to me...


Most people using GitHub for their everyday work do it the way Scott mentioned. GitHub's pull request system is fundamental to its success and to the workflow of everyone I know.


I'm familiar with the Pull Request system. I've just only ever used it to pull changes back into the main repo from other forks; I didn't know you could use them from within the same repo.


Well written and presented. Important not to miss out his closing comment:

"For teams that have to do formal releases on a longer term interval (a few weeks to a few months between releases), and be able to do hot-fixes and maintenance branches and other things that arise from shipping so infrequently, git-flow makes sense and I would highly advocate it’s use.

For teams that have set up a culture of shipping, who push to production every day, who are constantly testing and deploying, I would advocate picking something simpler like GitHub Flow."

So if you fall in the second category, this is a read for you.


Question about the chat deploy bot: there are a few lines in there that look like this:

    hubot deploy github/ghost-down to production

Is that deploying a branch directly to production, or does that cause a branch to be merged with master and then master deployed to production? If the former, why deploy a branch directly rather than sticking to the "master is production" idea?


It's deploying a branch to production.

If you think about it, this is exactly an example of the "master is production-ready" idea. We care so much about keeping master deployable that if we have code that may have issues and need to be rolled back, we deploy just the branch. That way if things break, you can "roll back" to master and everything is good again.


In any case, you can always do a `git checkout <hash>` (or `git co <hash>` if you have the usual alias set up).


Very good comparison between workflows of deploying several times per day versus much less often. While it might not be obvious to some, the exact same git "flow" won't work for both. Your tools should complement your corporate culture, not the other way around.

I think the most important thing to note from either method, though, is not to develop on master/trunk. Have a separate branch, or further branches off an entire "develop" branch. The tip of master should always be a stable build.


Yes, however the only factor here that implies never working off master is the peer review. If one dev can take responsibility for the stability of a patch, then it's perfectly okay to work off of master locally and push when it's stable. If you need to put this work on hold or it grows into something requiring a topic branch, at any time you can simply do:

git branch new_topic_branch && git reset --hard origin/master


I typically use master for developing, and another branch called "production" to store what's live.


This sounds like a feature branch strategy, which I've only used in 1 or 2 person teams, never on projects that big.

There have been some articles recently on the downsides of feature branching that my experience agrees with (http://continuousdelivery.com/2011/07/on-dvcs-continuous-int...). I'm curious if the GitHub people have hit the same issues.

So if 2 people are working on the same feature, they're probably working off the same named branch.

Are there any race conditions with merging to master? I'm assuming that only one head is allowed in master, correct? So that before a pull request is accepted and merged into master, the latest master must first be merged into the feature branch and have CI run all tests successfully on it before the pull request can go. Does GitHub stop you from merging into master if someone else just merged into master and you're about to create a new head?

Then you have to merge the latest master into your feature branch, run CI on it again and then merge to master after CI is successful (assuming someone else didn't beat you to merging to master again).

(I've got a lot more experience with Mercurial than Git so my mental model could be a little off)


> Are there any race conditions with merging to master? I'm assuming that only one head is allowed in master, correct? So that before a pull request is accepted and merged into master, the latest master must first be merged into the feature branch and have CI run all tests successfully on it before the pull request can go. Does GitHub stop you from merging into master if someone else just merged into master and you're about to create a new head?

One useful thing to keep in mind for this explanation is that GitHub doesn't do anything really, it's just one more git repo with a bunch of sugar.

Yes, there is only one HEAD for a branch in any given repo. When you push it expects that your local HEAD is a direct descendant of the remote HEAD. If this is not the case (due to someone pushing since you last pulled) then it won't do anything (you can force it, but that's almost always a bad idea). In practice though this is not a big issue. You don't have to remerge into the topic branch. Instead you can just reset your master HEAD to origin/master and then remerge the topic branch into master and then push.
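To make the "direct descendant" check concrete, here is a toy model of it. This is not how git is implemented; it is a sketch over a hand-written dictionary of commit parents, with made-up commit names, showing why the push is refused once someone else has pushed and accepted again after you remerge. (On a real repo, `git merge-base --is-ancestor` answers the same question.)

```python
def is_ancestor(parents, ancestor, descendant):
    """True if `ancestor` is reachable from `descendant` by following parent links."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False

def push_accepted(parents, remote_tip, local_tip):
    """A non-forced push is accepted only if it would fast-forward the remote branch."""
    return is_ancestor(parents, remote_tip, local_tip)

# Toy history: remote master was at B; someone else pushed C on top of B,
# the local branch added D on top of B, and M merges C and D.
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "M": ["C", "D"]}
print(push_accepted(history, "B", "D"))  # remote still at B
print(push_accepted(history, "C", "D"))  # remote has moved to C
print(push_accepted(history, "C", "M"))  # after merging C and D locally
```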

If you are just working on changes locally directly on master, it's even easier, you just do git pull --rebase and all your local changes are rebased to the latest HEAD.


wrt that article, we don't run into these issues i think largely because we do something that seems straightforward to me that is for some reason not addressed in that article (unless i'm reading it wrong) which is to reintegrate (merge) from master into your feature branch rather often.

the article you mention seems to think that you can either have short lived feature branches or integrate them into master all the time, where we (and most people, i think) do the opposite - integrate master into the feature branches so they're never that far behind and then do the opposite to fast forward the feature branch into master when it's ready to deploy.

and yes, your mental model is off a bit - a branch in git is more like a bookmark in hg - multiple heads for a branch doesn't really make sense - every head is a branch, there is no such thing as an unnamed head.


I actually just wrote up a blog post as well about why I don't like feature branches: http://www.pgrs.net/2011/08/29/why-i-dont-like-feature-branc...

My main reasons are that git history gets messy, builds don't run on feature branches (although github seems to have a work around here), and refactoring is harder.

My opinions are largely based on working on larger projects (more than 10 devs working in the same codebase).


Learn to use rebase and squash commits! It solves a lot of the issues you talked about.


Ironically, it seems like more feature branches is the solution to the "feature branch problem". :) With more, smaller branches, their changes will be laser-focused and easier to squash.


Squashing commits can fix up git history, but it does not address the bigger issues: refactoring, lack of builds, and having to deploy multiple branches to test features. The first two make it very hard to do any significant code reworking on a feature branch.


I intend no ill will with this response, but it is worth pointing out that "the first two" "bigger issues" have nothing to do with your revision control system.

Lack of builds is something about which you and your CI server need to have a chat, and refactoring is something about which you and your people need to have numerous chats before, during, and afterwards.


GitHub seem to have solved both the lack of builds issue and the challenge of deploying multiple branches (their bot can deploy a branch to staging with a single command) - which just leaves refactoring. I imagine they deal with refactoring by broadcasting a message to the team saying "I'm working on refactoring area X, try not to touch that code until I'm done if you don't want to deal with a painful merge".


Yes, there's always a race condition merging to master. Occasionally I'll want to deploy something, but jenkins builds pile up, or someone is busy test-deploying a feature branch. Since we deploy from our chat room, we naturally talk about what we're doing in the same spot.

Also, if you have too many developers on the same project, break the project up into smaller pieces. We have various systems in separate repos that can be rolled out independently.


Is it really zero-downtime deployment? I've read about Passenger 3's zero-downtime deployment strategy, but on my Passenger 3 setup, the server is still always a little unresponsive for a few seconds after a restart.


As far as I know, they're using Unicorn: https://github.com/blog/517-unicorn

They spin up new instances of the app and let the old instances finish serving requests before killing them.


I don't know how it works in the Rails deployment world, but for us Python/Django folks all we have to do is touch a wsgi file and/or do a soft reload of the web server for changes to be picked up, and they do so instantly. No downtime. Until then, even if files are replaced, the old site gets served (although I would recommend symlinking instead of flat-out replacing files and directories).


github uses unicorn behind nginx.


It's interesting that they abandoned CI Joe. I wouldn't say I saw this coming, but unless they wanted to write and maintain a full-blown CI server themselves, it would have gotten harder to manage across multiple projects.


Indeed. I'd love to hear the details on this. I was considering CI Joe for a project.


This isn't surprising to me. While CI Joe was decent and simple, it's extremely limiting once you want to do anything moderately complicated.

Jenkins has a crummy UI, but it is very powerful and has a lot of useful plugins. If you are moving to any sort of continuous deployment setup, as Github is close to, you really need something as powerful as Jenkins.


What do they use for a CI tool?


Jenkins.


For those who enjoyed this talk, Corey Donohoe gave an awesome presentation at Cascadia RubyConf that goes into more detail about what they use Hubot for, and also mentions deploying branches to a subset of their boxes. It was one of the best talks of the conference: http://confreaks.net/videos/608-cascadiaruby2011-shipping-at...


In a small web agency, mainly creating sites for clients, we find a mix of "git-flow"-style and continuous deployment works best.

In the weeks before a new site is launched, we work on our own feature branches and merge into master when a feature is complete. In the run-up to the site launch, when there are just CSS tweaks and the odd bug fix, people start working directly on master and deploying straight to staging servers.

When a site has been launched we normally keep working just on master, though occasionally creating feature branches for bigger changes.

This seems to work well for us as our DVCS needs change over time. I'd be interested to hear how other web agencies manage the different stages of developing clients' websites.


One question I've always had: How often do "regular" people commit? Should I be committing every time I hit save... or should I wait? (I don't work in a dev team, so I'm looking for the wisdom of developers who have to work in teams.)


I like to think about my commits as units of work that I can pull back or cherry pick if I want to. It doesn't always work out that way.

Make sure your commits are cohesive to the change you are making. I think that is a good rule of thumb.


Yes, it's definitely nice if you can make your commits atomic changes. The more easily you can write a nice summary line (50 chars please!), the better.
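If you want a nudge about the 50-character summary line, a small check like this can help. It is just a sketch: the limit is a convention rather than a git rule, and the function name and messages are my own.

```python
SUMMARY_LIMIT = 50  # the figure suggested above; a convention, not a git rule

def summary_problems(message, limit=SUMMARY_LIMIT):
    """Return a list of complaints about a commit message's summary line."""
    lines = message.splitlines()
    problems = []
    if not lines or not lines[0].strip():
        return ["summary line is empty"]
    if len(lines[0]) > limit:
        problems.append(f"summary is {len(lines[0])} chars, limit is {limit}")
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    return problems

print(summary_problems("Fix login redirect\n\nLonger explanation here."))
print(summary_problems("x" * 60))
```

One way to use it would be from a `commit-msg` hook, which git calls with the path to the message file as its first argument.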

On the other hand, some changes are big and messy. In this case I sometimes do intermediate commits, especially at the end of a day, just so I can keep yesterday's changes conceptually separate from today's. In the end I may rebase -i the whole thing and clean things up before pushing, but only if there are some obvious and quick ways to split it up.


I tend to do a lot of work upfront (without committing) and then go back and split the work up into smaller commits with "git add -p". I commit often, so it's never a huge list of changes I have to split.

I do this so I can cherry pick commits into other branches (e.g., fixing a bug in my current branch and merging it back to master).


It's really a judgement call, although I'd definitely recommend against committing on every single save. The only hard-and-fast rule I follow is to always commit whatever I have left over at the end of the day, so that I never lose work overnight or over a weekend.

Other than that, if you feel like you've taken a decently sized chunk out of whatever problem or feature you're currently working on, commit.


I think you are confusing committing with pushing? But even then, I do not see any harm in pushing often, you can always amend your commits before final merge or pull request.


I try to commit when I have made enough changes that builds and functionality don't break, even if not everything is 100% done. My commit comment will describe what I was doing and what I changed. If unit tests pass then I think it's fine to commit (if you aren't writing unit tests then I advise that you should). There are projects I have to share with other developers and projects that I am the only developer on, and I keep my rule of committing the same for both.


I tend to commit the first set of written tests, the first code & test fixes that passes them, etc. It's very organic though; if you do a "write one test pass one test" workflow that'd be a lot, but since I tend to work on multiple tests at once, it works fine.


Here's what I'm curious about that is not mentioned at all:

How do they manage deployment to staging? At my company we typically deploy topic branches directly to staging, but we have fewer developers and a slower pace. If multiple people need to deploy topic branches we set up an ephemeral staging branch that merges the multiple topic branches together, but I can imagine that getting super hairy on a team the size of GitHub's.

Do they just mostly deploy directly to production, thus severely minimizing staging contention?


it depends. when staging is in a good state we'll simply ask if anyone is using it. you can see a few deployments to staging in that screenshot i posted i believe. however, if the developer judges that a ci pass is good enough, a deployment directly to prod after they get the ci green light is also common.

this is also one of the benefits of deploying via a chat room - you can ask if you're going to be stepping on anyone before you do it.


How do you handle branches which require DB migrations?


I'm assuming they build a new db per branch on the ci server and auto run migrations on them. pretty typical


For reasons decided long ago, the company I'm at uses Mercurial, and I don't think we're in a position to retrain everyone and move to a private GitHub repo.

Anyone know of ideas for doing code reviews for the whole pull request, commit, or a single line like GitHub? This is probably the most beneficial part for us.


If you're using Mercurial, Kiln offers code reviews and repository management, hosted by us or on your server.

http://www.fogcreek.com/kiln/

(disclosure: I work on Kiln at Fog Creek)


Could you use a private BitBucket repo for this? BitBucket has the same kind of pull request functionality that I'd think you could use to emulate this continuous delivery style.


To be completely honest, I've never been satisfied with BitBucket. Doing anything is comparatively harder, and it just feels like I'm using an incomplete GitHub clone. (And I really don't like support teams that are "creatively bankrupt".)

I'll look into whether the issues feature works for us, but we might have to roll something of our own.

Doesn't mean I won't make the pitch for Git yet again.


I'm curious how CI is done on branches. It's mentioned but not elaborated on in the article.


Every branch is scheduled to build in Jenkins as soon as it gets pushed. We get success/failure notifications in Campfire.


So is there a separate Jenkins job per feature branch? How are these created?


I'm not sure it's how GitHub does it, but we use the jenkins gem to do pretty much the same thing: https://github.com/cowboyd/jenkins.rb


I wish they'd post a guide on how to do a separate CI job per feature branch. That'd make this approach really scalable.


Using something like the Jenkins gem (https://github.com/cowboyd/jenkins.rb), it's pretty easy to script. The hardest part is setting up the job configuration the branches will use.
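For a feel of the scripting involved, here is a sketch of the per-branch naming and configuration step only. It does not use the jenkins.rb gem; the template is a heavily trimmed, hypothetical stand-in for a real Jenkins `config.xml`, and the naming scheme is just one possibility.

```python
import re
from string import Template
from xml.sax.saxutils import escape

# Hypothetical, heavily trimmed job template; a real Jenkins config.xml has many more fields.
JOB_TEMPLATE = Template(
    "<project><description>CI for $branch</description>"
    "<branchSpec>$branch</branchSpec></project>"
)

def job_name(project, branch):
    """Derive a job name, replacing characters that are awkward in URLs."""
    safe = re.sub(r"[^A-Za-z0-9_.-]+", "-", branch).strip("-")
    return f"{project}-{safe}"

def job_config(branch):
    """Fill the template, escaping the branch name for XML."""
    return JOB_TEMPLATE.substitute(branch=escape(branch))

print(job_name("github", "topic/ghost-down"))
print(job_config("topic/ghost-down"))
```

Submitting the result to Jenkins (and deleting jobs for branches that no longer exist) is left out here.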



