
Use rsync to deploy your website - breck
http://breckyunits.com/code/use_rsync_to_deploy_your_website
======
kylec
If you're already using Git for version control, why not also deploy from it?
I personally use it to simultaneously push to GitHub and production and it
works very well:

<http://stackoverflow.com/questions/279169/deploy-php-using-git>
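
One way to get the one-push-updates-both behavior (a local sketch, with bare repos standing in for GitHub and the production remote; all names and paths here are made up):

```shell
# Two bare repos play the roles of GitHub and the production server.
git init -q --bare hub.git
git init -q --bare prod.git

git init -q site-repo
cd site-repo
git remote add origin ../hub.git
# Adding explicit push URLs makes a single 'git push origin' send to both:
git remote set-url --add --push origin ../hub.git
git remote set-url --add --push origin ../prod.git

git -c user.name=me -c user.email=me@example.com \
    commit -q --allow-empty -m "deploy"
git push -q origin HEAD:refs/heads/main
cd ..
```

A post-receive hook on the production repo can then check the new code out into the web root.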

~~~
dschobel
ditto (but with mercurial). then you have the bonus of having your entire
deployment configuration versioned for easy rollback.

------
gouki
I usually have a few files that I don't want on the server, or vice versa. The
option --exclude-from=ignore-files is useful for maintaining a list of files
not to be synced.

The file ignore-files would contain the _keywords_ or files to ignore, one per
line.
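
A quick local sketch of --exclude-from (file and directory names are made up; in real use the destination would be user@host:/path):

```shell
# Build a small tree with files we do and don't want synced.
mkdir -p site/notes
echo hello   > site/index.html
echo scratch > site/draft.psd
echo todo    > site/notes/n.txt

# One pattern per line, same syntax as --exclude.
printf '%s\n' '*.psd' 'notes/' > ignore-files

# Everything except the listed patterns lands in mirror/.
rsync -a --exclude-from=ignore-files site/ mirror/
```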

------
qjz
-a is already recursive, so -r is unnecessary. For most sites, it would also be wise to include --delete so renamed or locally deleted files would be removed on the destination.

------
peterwwillis
Here are some more useful options:

* "-C" to exclude all common revision-control working directory files. Lowest filter priority, consider the rest of your args.

* Look up backup rsync guides and add snapshot or "--backup-dir" support to your deploys for a cheap alternative to system snapshots and other deployment reversion techniques.

* "-E" and "-X" for when you just want to apply execute settings and not blow away whatever custom permissions you may have on your deployed files ("--chmod" is also helpful here)

* "--delete-after" and "--delete-excluded" to clean up files you've deleted from your source files.

* "--timeout=120 --contimeout=60" Because nobody needs to wait 5 minutes to find out their deploy failed.

* "--compress-level=9"

* If you have a huge deploy tree, it may be faster (though less reliable) to run
    
    
        find SRC -mtime -8 -print0 > files.txt && rsync OPTIONS --from0 --files-from files.txt DEST
    

This will make a list of only the past 7 days' worth of modified files to
attempt to rsync, which may cut down on the total time to deploy considerably.

* "--log-file=/big/partition/deploy.log" To get a better idea of how the deploy is going or how it went.

* Consider if you need --delay-updates to make the deploy more atomic. If files get deployed a few at a time, will this cause the user experience to suffer? Load as well as long file lists can cause long periods between updates.

* If you have more than one developer that can deploy to the site at the same time, consider that rsync should only be your file-transfer tool. You need a whole other layer to account for transaction locking, who is deploying or reverting files, and basic logging is very handy to track down problems. But that's a bit out of scope for this link :)

------
Vitaly
I cringe every time I hear something like this called a "deployment". This is
a lazy hack, not a deployment. You can't deploy with an
ftp/rsync/put_your_own_tool_here sync. Well, you can, kind of, but you'd
better not.

A "proper" deployment must be (at least):

* Completely automated. Should be just a simple command line.

* Atomic. As in all the files are changed at once.

* Easily revertible.

If you are deploying a db-backed application, the deployment process must also
manage db versioning (including the possibility of rollbacks).

All of the above is trivially provided by Vlad or Capistrano. And since they
are super easy to use even for non-Ruby projects, I fail to see any reason to
keep using ftp or rsync other than laziness.

I've seen some people using git for deployment, which is better than
ftp/rsync, but it still lacks atomicity, and rollbacks can be quite tricky.

------
pan69
We just do subversion checkouts. Of course, we have the proper .htaccess rules
set up to prevent access to the .svn folders. Subversion also allows us to
commit database exports straight back into the version control system from the
server.

~~~
potatolicious
I personally do not feel secure at all with production boxen having the
ability to push back into the repo.

I use svn export for deployments - all sorted by revision into static cached
directories, then symlinked to HTTP root.

This has the added advantage of being able to roll back to any previous
version instantly.

~~~
Legion
That's clever and I feel stupid for not having come up with it myself (the
keeping separate revisions and symlinking part). Thanks for sharing.

~~~
sjs
Don't feel stupid. Not everyone thinks of everything.

------
prodigal_erik
We've been putting our code in rpms and it works well. Every box knows exactly
what's deployed right now (along with dependencies and whether anything is
tweaked), local edits to config files can be preserved, and "ssh yum install"
gets a box from clean to production ready in under a minute.

~~~
peterwwillis
Wait - _preserve_ local edits? That sounds bad.

It is a handy tool for dealing with versioning and rollback. But i'd hate to
be the guy who deploys a hot fix and suddenly rpm is out of locker entries.

~~~
prodigal_erik
Renaming local edits out of the way or preserving them is up to you, depending
on whether you put %config or %config(noreplace) in the spec. It does need to
be handled carefully, but it can be handy if you want one machine giving
special treatment to a representative sample of your requests or something.
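
For reference, the distinction shows up in the %files section of the spec. A hypothetical fragment (paths and package layout are made up):

```spec
%files
/usr/share/mysite/
# keep a locally edited file on upgrade, installing the new one as .rpmnew:
%config(noreplace) /etc/mysite/site.conf
# replace on upgrade, saving the local edit as .rpmsave:
%config /etc/mysite/defaults.conf
```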

------
Xichekolas
I use this little bash script (as a daily cron job) to selectively mirror
directories on two hard drives in case I lose one (very poor man's backup
solution). I imagine it could be used for deploying code (modified for remote
machine of course):

    
    
      #!/bin/bash   # [[ ]] and &>-style redirection are bashisms, so use bash, not sh.
    
      RSYNC="/usr/bin/rsync"        # Verify with 'which rsync'
      DIRS="/home/xich /etc"   # Directories to be backed up.
      TARGET="/mnt/secondhd/backup" # Directory into which all backups are placed.
    
      # For instance, if TARGET is /mnt/secondhd/backup and DIRS is "/home/user1 /home/user2"
      # then after running, there will exist a /mnt/secondhd/backup/user1 and
      # /mnt/secondhd/backup/user2. DO NOT use trailing slashes for any of these paths,
      # as that will change the behavior of rsync.
    
      LOGFILE="/var/log/mirror_hds.log" # For errors and rsync output.
      : > "$LOGFILE"                    # Start a fresh log each run.
    
      for dir in $DIRS; do
      	INEX=""
      	if [[ -e "$dir/.mirror_include" ]]; then
      		INEX="--include-from $dir/.mirror_include"
      	fi
    
      	if [[ -e "$dir/.mirror_exclude" ]]; then
      		INEX="$INEX --exclude-from $dir/.mirror_exclude"
      	fi
      	# Append so earlier directories' output survives; a plain '&>' here
      	# would truncate the log on every pass, keeping only the last one.
      	$RSYNC -av --delete $INEX "$dir" "$TARGET" >> "$LOGFILE" 2>&1
      done
    

The .mirror_include and .mirror_exclude files are just newline-delimited lists
of file masks (they do what you would expect). I did it like this so each user
can modify his own exclusion/inclusion lists (the file belongs to the user),
and doesn't have to mess with the cron script (which belongs to root).
Inclusion takes precedence. As an example, my exclude file:

    
    
      .* # any file that starts with a period
      tv # my tv shows directory
      movies
    

And my include file:

    
    
      .mirror_include # since this would be filtered out above
      .mirror_exclude
      .conkyrc

------
noonespecial
Recommend _-e ssh_ with keys as well. Safer, no rsync port hanging out.

~~~
pingswept
A good point, but I believe that most recent versions (since around 2004, I
think) of rsync now use ssh by default, so the -e flag is not needed to use
ssh.

~~~
noonespecial
Ouch, my age is showing. Again.

------
javert
I do this too, and it's very convenient. Especially if you have multiple
websites.

I actually put the rsync command in a Makefile so I can update by typing
`make`.

I have a folder in my ~ directory which contains a folder for each of the
remote machines I work with. I can update any remote machine by going into its
folder and typing `make`.
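
Such a Makefile can be as small as this sketch (host, paths, and rsync flags are illustrative):

```make
# Hypothetical per-machine Makefile; adjust HOST and DEST for each folder.
HOST = user@example.com
DEST = /var/www/site/

.PHONY: deploy
deploy:
	rsync -avz --delete ./public/ $(HOST):$(DEST)
```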

------
steveklabnik
I dunno, this is fine for small projects, but pretty soon, something like
Capistrano works much, much better.

~~~
mcantor
Could you perhaps explain why?

~~~
mey
Automating rollbacks, re-huping associated services/servers, rolling out
database changes, or coordinating across a cluster of servers.

------
webology
We recently changed our deployment model from an all Capistrano model to a
Capistrano plus rsync model. Our old model would take 20 to 45 minutes to
update every server in our farm. Our new model updates one staging /
deployment server via Capistrano then rsyncs to each of our production nodes.
This process normally takes less than a minute and spikes to a few minutes for
really large changes. We still have the ability to roll back code, and this
model is actually quicker at fixing bad deployments than our old model was.

A majority of our time before was spent updating both our codebase then
checking external dependencies on each server. Rsync made this process much
quicker since these checks are completed once then pushed to each server.

------
Nycto
rsync works really well. It's what we use at our company. One trick we have
considered implementing is pointing the HTML root to a symlink that points to
the current release. When you rsync, create a new directory named after the
version. After you have verified the transfer, flip over the symlink to point
to the new code base. Doing it like this will make quick roll backs easy and
protect you from interrupted transfers or users hitting your site mid upload
(it has happened... we spotted an anomaly in the error logs and our "wtf"s per
minute shot through the roof).
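
A local sketch of that symlink flip (directory names are made up; on the real host the HTML root would point at 'current'):

```shell
VERSION=v42                      # hypothetical release label
mkdir -p checkout releases
echo hi > checkout/index.html

# 1. rsync the new code into its own versioned directory.
rsync -a checkout/ "releases/$VERSION/"

# 2. Once the transfer is verified, flip the symlink. -n makes ln replace
#    an existing link rather than creating a new link inside its target.
ln -sfn "releases/$VERSION" current
```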

------
DrJokepu
I love rsync, use it for deployment all the time. However, it is quite a pain
to set it up for Windows. All the Windows rsync ports I know of run on top of
Cygwin and Cygwin changes the permissions of the files to the Windows
equivalent of 000 all the time. It is possible to get around this but not
entirely straightforward. I'm seriously considering writing a step-by-step
quick guide for setting up an rsync daemon on Windows as it is not trivial at
all and it might save some time for other people.

------
riobard
My solution using git:

    
    
        #!/bin/bash
        # run in the project root 
        git add -A   # stage all changes, including deletions
        git commit   # local commit
        git push     # push to remote bare repository
        ssh REMOTE_HOST 'cd PROJECT_FOLDER && git pull'  # update the working copy on the server; the public www folder usually lives inside it
    
    

Extra benefit: two complete copies of the history, local and remote, for
backup.

~~~
cloudkj
How do you ensure that the .git directory isn't accessible?

~~~
Pistos2
In my case, it's never part of the publicly-served tree. public/ is a subdir
within the repo.

~~~
riobard
I have a public www folder too if I only want to open up part of the whole
repo. But since there is usually no sensitive data in a public-facing repo
anyway, I don't care if people can access .git or not. They may clone it if
they want! :D

------
njharman
I use this for simple, single server, small codebase sites.

But for various reasons (mainly rollback, history, and consistency: rsync
takes time, during which your site has files from different versions) I much
prefer the "schlep entire codebase into a new directory and then switch
symlinks when you're ready to go live" approach, as Fabric, Capistrano, or
your own custom deploy scripts do.

"schlep" is the technical term for scp/rsync/checkout/etc.

------
cmelbye
Wikipedia (and its sister projects) uses rsync to deploy new code updates and
configuration changes to the production cluster.

------
dangrossman
Springloops (<http://www.springloops.com>) hosts my Subversion repositories
and can also be set up to deploy each repository to a set of servers, either
manually or automatically upon each commit.

------
garnet7
In the article, the author seems to be switching their use of trailing
slashes. At the top, (paraphrasing) it's

    
    
        rsync -arvuz /src/foo /dest/foo/
    

but at the bottom it's

    
    
        rsync -arvuz /src/foo/ /dest/foo
    

Which is correct?

------
mark_h
I've been doing this using fabric, which provides the rsync_project built-in:
<http://docs.fabfile.org/0.9.0/api/contrib/project.html>

~~~
idebug
is that a bit like puppet?

------
anamax
How robust is rsync? How does it report failures? (Is that one line inside
something that tells a human that something went wrong?)

What about partial successes?

~~~
joubert
We use rsync to deploy all our front-end code/files.

It is pretty robust, but syncing multiple files is not an atomic operation
across the batch as a whole. If one file fails, you get an error report, but
the other files will still have been synced. Once you fix whatever the problem
is and re-run your script, that one file is synced as well.

In practice, rsync is really great for deploying. It is also neat for backups
- in fact, before I got my Mac and started using Time Machine, I used rsync
for backups (and in fact used rsync.net for offsite backups).

------
terrellm
I use a Rakefile with a deploy task that uses rsync to send the HTML to our
server and s3sync to send our images to Amazon Cloudfront.

------
nick007
nice tip... love it

