
Let's deploy via Git - mnazim
https://coderwall.com/p/xczkaq?&p=1&q=
======
pvnick
This is a pretty neat hack, but not really good for a true production
deployment system. Rsync is a far superior alternative. That being said, git
should definitely be incorporated into the workflow such that, for example,
you have a "live" branch which always reflects what is to be on production
frontend nodes. From there you do 1) git pull origin live 2) rsync to live
servers 3) build/configure/restart/etc. Set -e on that script obviously...
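
A minimal sketch of those three steps, in case it helps. The function name, paths, host names and restart command are placeholders I made up for illustration, and it assumes the checkout is already on the "live" branch and that the servers are reachable over ssh:

```shell
#!/usr/bin/env bash
# Sketch only: the checkout path, destination path, host names and restart
# command are placeholders, not taken from the comment above.
set -eu   # abort on the first failing step or unset variable

# deploy_live CHECKOUT DEST_PATH RESTART_CMD HOST...
deploy_live() {
    local checkout="$1" dest="$2" restart="$3"
    shift 3
    # 1) update the local checkout of the "live" branch
    git -C "$checkout" pull origin live
    local host
    for host in "$@"; do
        # 2) mirror the tree to the server, leaving the .git directory behind
        rsync -az --delete --exclude='.git' "$checkout/" "$host:$dest/"
        # 3) build/configure/restart on the server
        ssh "$host" "$restart"
    done
}

# Example (hypothetical hosts):
#   deploy_live /srv/checkout/site /var/www/site 'sudo systemctl restart site' web1 web2
```

Because of `set -e`, a failed pull or a failed rsync to one host stops the run before anything later is restarted.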

Edit: I should also mention, if you are stuck on something like restricted
hosting with CPanel which severely limits your deployment options (some of my
clients are in this boat), then [http://ftploy.com/](http://ftploy.com/) is a
really cool solution. But you should really get your ass off cpanel asap.

Double edit: Some of the replies below have made some good points that I had
not considered which weaken my argument. So while I'm now more ambivalent than
dismissive towards the idea of using git to deploy, there are several
modifications that should be made to this particular system to make it
production-ready. See avar's and mark_l_watson's comments below and
mikegirouard's comment elsewhere for some ideas.

~~~
avar
We've been working on moving away from rsync for our code syncing to using Git
where I work.

I'm not saying there aren't uses for rsync, but your dismissal of git as not
being suitable for a "true production deployment system" isn't supported in
any way. And stating that rsync was "specifically made for this kind of thing"
without comparing any of the trade-offs involved is just appealing to
authority.

Some things you may have not considered:

    
    
     * rsync is meant to sync up *arbitrary filesystem trees*, whereas
       with Git you're snapshotting trees over time.
    
       When you transfer content between two Git repositories the two ends
       can pretty much go "my tree is at X, you have Y, give me X..Y
       please". You get that as a pack, then just unpack it in the
       receiving repository.
    
       Whereas with rsync even if you don't checksum the files you still
       have to recursively walk the full depth of the tree at both ends
       (if you're doing updates), send that over the wire etc. before you
       even get to transferring files.
    
     * Since syncing commits and actually checking them out are two
       different steps you can push out commits (without checking them
       out!) to your production machines as they're pushed to your
       development branches.
    
       Then deploying is just sending a message saying "please check out
       such-and-such SHA1" and the content will already be there!
    
     * You mentioned in another post here that rsync has --delay-updates,
       this is just like "git reset --hard" (but I'll bet Git's is more
       efficient). With Git you can do the transfer of the objects and the
       checking out of the objects as separate steps.
    
     * It's way easier for compliance/validation reasons to not get the
       data out of Git, since you can validate with absolute certainty
       that what you have at a given commit is what you have deployed
       (just run "git show"). If you check the files out and then sync
       them with some out-of-band mechanism you're back to comparing
       files.
    

Edit: One thing I forgot, it's distributed. Which gives you a lot of benefits.
Consider this problem, you have 1000 servers running your code and you've
decided that you want to deploy _now_ from a staging server.

Trying to rsync to 1000 servers at once from one box (the naïve
implementation with rsync) would take forever and overload that one box,
especially if you wanted to take advantage of pre-syncing things on every
commit so the commit will already be there if you want to roll out (constant
polling and/or pushing).

You can mitigate this by having intermediate servers you push to, but then
you've just partitioned the problem, what if you need to swap out those boxes,
they go down etc.

With Git you can just configure each of the 1000 boxes to have 3 other boxes
in the pool as a remote. Then you seed one of them with the commit you want to
rollout. The content will trickle through the graph of machines, any one
machine going down will be handled gracefully, and if you want to rollout you
can just block on something that asks "do you have this SHA1 yet" returning
true for all live machines before you "git reset --hard" to that SHA1
everywhere.
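
The "do you have this SHA1 yet" check and the final switch could look roughly like this on each box. This is a sketch, not what we actually run: the function names are made up, and how the answers get collected from all machines is left out.

```shell
# has_commit REPO SHA: succeeds if the commit object is already in REPO.
has_commit() {
    git -C "$1" cat-file -e "$2^{commit}" 2>/dev/null
}

# activate REPO SHA: check out an already-synced commit, or refuse.
activate() {
    if has_commit "$1" "$2"; then
        git -C "$1" reset --hard "$2"
    else
        echo "commit $2 not here yet" >&2
        return 1
    fi
}
```

The point of splitting it this way is that `has_commit` is cheap and read-only, so it can be polled as often as you like before anyone calls `activate`.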

~~~
tux1968
You've described some admirable utility that can be achieved by using Git.
However, it can all be accomplished with other tools and without needing the
entire deployment history stored on each production machine.

As for your comment about being "back to comparing files", that's all Git is
doing internally anyway. You can do the same with other deployment tools and
sha1 hashes etc.

~~~
avar

       > it can all be accomplished with other tools
    

Sure it can be accomplished with other tools, but if Git is sufficient
introducing other tools just increases the complexity of your stack, and the
complexity of e.g. validating that a Git tag corresponds to what claims to be
rolled out as that tag.

    
    
       > and without needing the entire deployment history stored on each
       > production machine.
    

This is a constraint a lot of people seem to think they need but they don't
actually need. If someone gets your current checkout they'll have current code
/ passwords (if you accidentally checked in a password but removed it you
should _change that password_ ). Getting the code history will just satisfy
historical curiosity. Hardly a pressing concern for an attacker.

    
    
       > As for your comment about being "back to comparing files", that's
       > all Git is doing internally anyway. You can do the same with
       > other deployment tools and sha1 hashes etc.
    

Yes, but the point is that it just gives you that for free without you having
to hack anything extra on top of your syncing mechanism.

You'd be pleasantly surprised how much checking/validation/syncing logic that
you have to write around e.g. rsync when syncing a Git repo just disappears
entirely if you just use Git to sync the files.

------
tux1968
I'm a huge Git fan and use it every day, but Git was never designed as a
deployment tool. There may be situations where you want the entire history of
your development to be included on your live server, but often this just isn't
appropriate. Also, when deploying to multiple servers you have to invent ad hoc
methods to handle configuration differences. Even native ssh seems like a more
prudent deployment method than Git. This smacks of using the closest hammer at
hand, rather than choosing the best tool.

~~~
snprbob86
> There may be situations where you want the entire history of your
> development to be included on your live server, but often this just isn't
> appropriate.

Are you concerned about being wasteful with disk space? Or is there some other
concern here? Some security issue perhaps?

~~~
ohwp
I once committed my DB settings (Mercurial) and noticed my mistake only later.
It's very hard to get it out of the history. Of course it could be fixed but
this is one example.

Imho version control could be used for deployment but only when you use the
release-branch of your project.

And of course NEVER put your config in version-control ;)

~~~
mnutt
_And of course NEVER put your config in version-control ;)_

I'm not sure I'd make that blanket statement. Version control seems like a
great place for configuration. It allows you to centrally manage configuration
details and provides an audit trail for debugging. You just want to make sure
it is in a separate, secure repository and not mixed in with your app
development.

------
lifeisstillgood
Aaargh !

Build on a build server

Scp to live server along with generated config

Install with native package tool and hook into native service manager

Use salt / puppet / chef to do everything after initial build on your target
servers.

Be nice.

~~~
robinson-wall
> Install with native package tool

I'm sad that so few people seem to build native OS packages for deployments.
My build system creates a release package and sticks it in an apt repo, then
puppet installs latest version of package when it runs.

~~~
AgentIcarus
I've just started looking at this sort of thing - can I ask what technology
you decided on for the apt repo?

(I'm especially interested in whether the restriction most of them impose on
having multiple versions of a package is something you are dealing with).

~~~
Tobu
I use mini-dinstall for repos. Multiple versions would complicate things
however; I'd probably do it by combining multiple suites.

------
mikegirouard
I use git to deploy about 30 sites and have found it to be a really useful
workflow. It's particularly useful over SSH when using key-based
authentication.

For my post-receive hook, I always add a tag to mark a deployment:

    
    
        git tag deployment-`date +'%Y%m%d%H%M%S'`
    

You can see all past deployments with a git log:

    
    
        git log prod/master --oneline --decorate
    

On all my developer machines, I have them add a `git-deploy` script to their
$PATH, which looks a little something like:

    
    
        #!/bin/bash
        # Usage: git deploy <remote>
        git push "$1" +HEAD:master
        git fetch "$1"
    

You can just run `git deploy prod` (assuming your deployment repository is
named 'prod').

The extra `git fetch` will pull down the auto-generated tags so you can see
them locally w/a simple `git tag`

Edit: Forgot to mention, that since git ships w/a bash shell for Windows, most
of this should work for Windows-based dev setups as well.

~~~
avar
You might be interested in checking out git-deploy. It's a tool we wrote to
manage tag creation and completely pluggable rollouts/rollbacks with sync
hooks you write:
[https://github.com/git-deploy/git-deploy](https://github.com/git-deploy/git-deploy)

It's basically a more advanced version of what you're doing.

------
huhtenberg
Pedantic nitpick -

    
    
      git remote add origin
      git push origin master
    

It shouldn't be called "origin", because it isn't really an _origin_. A more
fitting name would be "live".

------
klj613--
git is a SCM and should not be on production systems.

instead you should have a build server which builds up a package (rpm, deb,
tarball?) which is then used to deploy across the production environments.

you should also not compile JS/CSS etc on the production system; that is what
the build server is for.

anything installed on a production system should be 'required' for the app to
actually run.

-

that said, you can use capistrano (and other tools like this) to update 'demo'
environments and dev environments (with git) however the actual TEST and
STAGING environments should mirror the PROD environment (packaging).

~~~
j-kidd
On one end, you have stone age developers who need Visual Studio on production
systems because that's the only tool they know.

On the other end, you have hipster developers who need Git on production
systems because that's the only tool they know.

------
wise_young_man
I'm planning on writing about this in more depth later, but this is
essentially the route every cloud hosting company is taking right now and I
think it's bad to only allow that kind of deployment.

Don't get me wrong, I love Capistrano, git deploy hooks, ruby gems that do
deploys (heroku), but most cloud hosts are only offering this mechanism to
deploy apps. FTP became popular because of the ease of use for designers and
webmasters. You don't always need to deploy your entire application for simple
changes. Another big one is the ajax file editor in the browser.

For trivial changes a simple file change would suffice. When you do an entire
deploy for an app like this, depending on your dependencies and payload, it could
take a long time. What if you had the wrong price and need to make a change
immediately? Of course maybe now there are multiple environments which play a
factor too.

I do realize that was before we had multiple web servers running the app and
that is part of the reason, but there are still ways to make it work (file
mounts).

I'm hoping for more deployment options in the future and that cloud hosts realize
the need is still there from traditional hosting.

~~~
sjtgraham
> For trivial changes a simple file change would suffice.

Doesn't scale past one developer or one box.

~~~
wise_young_man
My thought is that could work together with other tools. Say you do your next
deploy with heroku, it says there are unsaved changes and asks you to first do
a 'heroku pull' to pull the changes and you can then commit them or you could
blow them away with 'heroku push -f'.

I'd say 80% of the sites online are managed by one or two people. They may
need the scalability of the cloud for traffic bursts, but we can't say cloud
is the future if all of our existing tools and workflows are completely
broken.

For the past year I was building a cloud competitor to Heroku. We had a
traditional host (like HostGator) and we talked to those customers about
moving to the new cloud infrastructure and all the benefits to why. Most
people said it was too complicated and were stuck in their work flows (FTP and
file managers). Which is why I wanted to chime in with FTP is not dead.

------
jebblue
I'm trying git to get my feet wet with it but man this looks way complicated
to me. I didn't even know about "git config" or even that there was a checkout
command for git. I usually cd into the directory I want to turn into a repo
and use "git init" then after changing files, run gitk or if in Eclipse I use
EGit. I'm not even sure what happens in gitk when I do a commit, is it that
long "push master origin" stuff? Does that mean master is my local repo and
master origin is like the overall master? I guess with SVN it's clear even at
the command line but with git there are just so _many_ options. Then there's
custom scripts to make all this work? I'll stick with scp or rsync for
distribution for now. The author might want to look into Hudson or Jenkins,
they work wonders.

~~~
ihodes
Check out the git book here: [http://git-scm.com/book](http://git-scm.com/book)

Worth the read if you're interested.

------
Wintamute
git push/pull is easy, but there's no getting around the fact that Git is a
distributed version control tool, not deployment software. Using Git for
deployment is probably fine for simple deployments where you're just getting a
bunch of static files onto a single box, but as soon as you stray into the
realm of non-trivial web application deployments then things change. Factors
like database migration, dev/prod environment parity, dynamically spinning up
new server instances and continuous integration etc. mean that the act of
simply copying your files become the least of your worries. Sure Git will play
an important part in getting a snapshot of the codebase from a dev's
workstation into the deployment flow, but that's where it ends and tools like
Chef and Puppet take over.

~~~
EnderMB
My thoughts exactly. This workflow is nothing more than hacking Git using
post-commit hooks to do something it was never built for.

If you want to deploy using Git then the smart thing to do is to use one of
the many continuous integration tools out there that were built specifically
for this kind of workflow. I use TeamCity to run my tests and to build/deploy
my website whenever I push to my default branch. This works really well for
some of my sites, and although I'm looking for a way to refine this so I can
also deploy database changes between local/staging/web servers I can't think
of a better way of doing this.

------
darkstalker
Don't forget to hide your .git directory. You could accidentally expose all
your source code to the web.

------
Xymak1y
The issue here is that there are still numerous web hosts who don't grant you
SSH access, so you're not able to set up any git repository there anyway and
are still stuck with FTP. Hopefully this will go away soon, or at least more
in the direction of Heroku and the likes.

~~~
chris_j
Which web hosts are they and what advantages do they have over web hosts that
_do_ grant ssh access?

~~~
Xymak1y
Take Hetzner (large German provider) as an example - while they do offer
managed servers and root servers, those are much more expensive. I'm not
advocating the use of such products, but merely pointing out that they're
still around a lot.

~~~
ahoge
They also offer VPS ("vServer") for about 8€/month. You can ssh into those,
can't you?

------
macnix
I came up with deliver
[https://github.com/gerhard/deliver](https://github.com/gerhard/deliver) to
address this very problem. It's a bash utility that automates git-based deploys
and comes with pre-built strategies for the most common deployment scenarios:
generated sites (think Jekyll), shared (WordPress, PHP etc.), ruby, node-js,
S3 etc. I did a talk on it at my London Ruby User Group in March:
[https://speakerdeck.com/gerhardlazu/deliver](https://speakerdeck.com/gerhardlazu/deliver)

~~~
zrail
That's very cool. I really dig the minimalism of it. I'm currently using a
system that involves about the same amount of config and leans on Capistrano
for the heavy lifting, but I think I might investigate using deliver for my
next project.

------
jakobe
I'm not sure I like the approach of serving files from your repository. I'm
not sure about how git works in detail, but are repository updates even
atomic?

When I deploy my website, I use a different approach: My webroot is just a
symlink. My deployment script exports the repository to a directory with a
unique name for every commit. When the export succeeds, the symlink is updated
to point to the new directory.

The advantage: Changing to the new version is instantaneous. If something
should go wrong, I can immediately revert by changing the symlink back to the
old dir.
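
The switch itself is only a couple of lines. A rough sketch, assuming GNU coreutils (`mv -T` is not available everywhere); the function name and the temporary link name are just for illustration:

```shell
# switch_release WEBROOT_LINK RELEASE_DIR
# Points the webroot symlink at RELEASE_DIR by creating a temporary link
# and renaming it over the old one, so the webroot never goes missing.
switch_release() {
    link="$1"
    target="$2"
    tmp="$link.tmp.$$"
    ln -s "$target" "$tmp"
    # -T treats the destination as a plain file, so the old symlink is
    # replaced instead of the new link being moved into the old directory.
    mv -T "$tmp" "$link"
}
```

Reverting is the same call with the previous directory as the second argument.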

~~~
bluetooth
> but are repository updates even atomic?

No, they are not. During the push, there will be a short amount of time in
which some parts of the website will be operating on new code, while others
will be operating on old code. If many components are in play (ie using
libraries) you may end up breaking things if a new request comes in at the
right time.

------
wubbfindel
Here's one reason not to use git for deploy - if you don't want your source
code on production servers where clients can access it, or where it could be
found by hackers.

I work on a closed source system, so we will never deploy our code via git and
then build on the server. So, in this case build locally (or on a build
server), and rsync from there using deploy scripts.

~~~
inthewind
That's all well and true, but you could export and build, then use Git for
deployment instead of Rsync (if you wanted to.)

------
jpb0104
I'm surprised Fabric [[http://fabfile.org](http://fabfile.org)] has not been
mentioned in this thread. I'm not a Python developer but I love Fabric
specifically for a tool to handle deploying code. If you feel like a Git
deployment is lacking, be sure to check out Fabric, especially for multi-
server deploys.

~~~
Ixiaus
Fabric is great for running _commands_. But what a lot of programmers do not
realize is that _deployment is a process_. That process involves running
automated tests, packaging all of the assets, using package managers to
upgrade/downgrade based on versioned release schemes, database migration,
post-upgrade scripts, and quick/painless rollback (downgrade) if there's a
major blocking bug for users.

------
krapp
FTP still works. FTP isn't 'broken.' This 'replacement' adds huge unnecessary
complexity and doesn't work on nearly as many servers as FTP does (which is
_all of the servers_ )

And yes, I have deployed with git, so I'm not speaking out of complete
backwards ignorance. I can still see a use for both.

~~~
viraptor
Actually it is broken by today's standards. No encryption, two connections,
nat issues, no standard (everyone does best-effort output parsing), etc.

It doesn't work on "all of the servers" either. All of my servers have ssh/scp
available and will never have ftp.

~~~
krapp
_No encryption_ What about ftps?

 _two connections_ Not a problem if you're only occasionally updating one site
at a time, though I'd agree it doesn't scale up the way git would.

 _nat issues, no standard_ valid points. I've never had issues with either but
I don't work on the kind of projects a lot of HN users do, so I really tend to
only care if it stripped the line breaks or not.

 _It doesn't work on "all of the servers" either. All of my servers have
ssh/scp available and will never have ftp._ I stand corrected.

It is good to actually see the arguments against FTP at least.

~~~
viraptor
Not sure what's the current state of ftps deployment, but when I checked
mid-200X, it was still a very rare occurrence. Maybe it's a bit better now.

It's hard to find out from google, because they claim I want to look for "ftp"
rather than "ftps" and they're the same thing :/

------
crististm
Yeah, Python and other technologies are also 90's or even older - what is your
point? Proven technologies that work are belittled to promote today's agenda?

I would tend to dismiss these kinds of articles and suggestions even if they are
OK - only because they promote by appeal to a fashion.

------
jfdi
I had written a very similar article a while back which includes some django
specific points and is also intended to be really simple to read thru:

[http://thomaswilley.com/?p=9](http://thomaswilley.com/?p=9)

------
cpa
And add some directives to your http server to not serve your .git directory,
too!

~~~
Tobu
I had good hopes when I saw `core.worktree`, but that was just cargo-culting.
The right way to do it is to set the worktree to an outside location.

------
chadfowler
A beautiful thing that this demonstrates is that no matter how old a concept
is, there's always room to explain it clearly so that those who didn't already
understand it have the benefit of finally being enlightened. I learned this
many years ago as an author, thinking that my most basic ideas weren't worth
writing down. It turns out that what's obvious to one person isn't obvious to
everyone else.

Thanks for the reminder and for the clear explanation of how git deploy might
work!

------
gboudrias
In the Drupal community, we're all stumbling over each other to find the best
Git deployment strategy. I'm surprised that Git deployment would still be news
for anyone. I don't know who this article can reach that's not already
competent enough to be using Git (at least for dev).

On the other hand, if your site is just static (HTML/JS) files, I think it
makes great sense to use Git to deploy, as there is no configuration to worry
about.

------
inthewind
Was looking for an email address for the author! And failed. Anyway just
wanted to comment - that there's no publish date attached to the article - or
one that is at least obvious. I have no idea when it was authored.

------
krallja
> (Remember, ^D is Control+D, or whatever your shell's EOT character is.)

Use a heredoc
[http://tldp.org/LDP/abs/html/here-docs.html](http://tldp.org/LDP/abs/html/here-docs.html)
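
For example, a hook file could be written in one go like this. It is only a sketch: the function name is made up and the hook body is my assumption of a typical checkout hook, not a copy of the article's.

```shell
# write_hook HOOK_PATH WORK_TREE: create a post-receive hook without
# typing into `cat` interactively. The hook body is an assumption.
write_hook() {
    cat > "$1" <<EOF
#!/bin/sh
GIT_WORK_TREE="$2" git checkout -f
EOF
    chmod +x "$1"
}
```

The delimiter is unquoted, so `$2` is expanded when the file is written; quote it (`<<'EOF'`) if the hook should contain the variable literally.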

------
hmind
And here I was thinking that this was obvious and most people used something
like Capistrano... It's shocking to me the number of people not using some
sort of SCM-based deployment method. =P

~~~
jasonlotito
I'm more surprised that people aren't using something more reliable and
proven, like the package manager their system already uses. Why introduce
another piece?

------
bliker
I personally prefer git-ftp. On a lot of hosts, ssh is still a luxury.

~~~
LoganCale
Seconding git-ftp. I use it almost daily and find it excellent for deploying
on shared hosting where ssh either isn't provided or is a hassle to set up.
git-ftp is an awesome, awesome tool.

[https://github.com/resmo/git-ftp](https://github.com/resmo/git-ftp)

------
clubhi
I'm from the 80s. I bet you want to replace me also.

------
nodesocket
Shouldn't it be:

    
    
       git --bare init
    

On the server?

~~~
tgasson
No, because it needs to be checked out (that's the deployment). As a usual git
server, yes --bare is the way to go. In this case you could alternatively use
--bare in a ~/repos/example.com dir and set the $GIT_WORK_TREE environment
variable to ~/www/example.com for checkout.
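
A rough sketch of that alternative, with a made-up function name and the branch passed in explicitly:

```shell
# checkout_to BARE_REPO WORK_TREE REV
# Checks REV out of a bare repository into a directory outside of it,
# so the served directory never contains a .git directory.
checkout_to() {
    mkdir -p "$2"
    GIT_DIR="$1" GIT_WORK_TREE="$2" git checkout -f "$3"
}

# Example (paths from the comment above):
#   checkout_to ~/repos/example.com ~/www/example.com master
```

Inside a post-receive hook git already points GIT_DIR at the repository, so only the work tree needs to be set there.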

------
miog
Balls of steel deployment

------
jackbauer
git push. _no no_ rather tag and checkout/sync etc

------
zobzu
make a tag, push the tag, at least.. at the very least :P

------
dutchbrit
Terrible idea..

------
stevewilhelm
Or use Heroku.

~~~
andyhmltn
That's no way to fix a problem

