
Rclone: rsync for cloud storage - dcu
http://rclone.org
======
planetjones
Looks really good. I am pleased more projects are adding Google Cloud Drive
support now. What I really want to do is:

- create documents on my Mac which autosync to Cloud Drive in encrypted
format (this should tick that box)

- be able to access said documents on any device including iOS, which
transparently handles the encryption

The use case is I now scan all my documents into PDF format, but keeping them
secure and accessing them on iOS seem to be almost mutually exclusive.

I looked at some other solutions for this which had their own iOS app and
security mechanism (Boxcryptor mainly) and I didn't like it - I just didn't
feel in control. And I got nervous about what happens if Boxcryptor goes
under; I don't want to rely on them keeping their app up-to-date to read my
documents.

I know Apple will never allow it, but wouldn't it be nice to be able to mount
your own network drive which all apps could access?

~~~
keville
Apple's iCloud Drive has a decent web interface for the non-Apple subset of
"any device", and otherwise seems to offer what you want. Curious if you had
other reasons it doesn't meet your needs.

~~~
sneak
Apple either has or has realtime access to your encryption keys for iCloud
Drive.

~~~
mwfunk
I thought Google Drive did as well, is that not the case?

~~~
RachelF
No. Google Drive data is encrypted when being transferred, but not at rest.

My guess is that Google does not like encryption as it prevents de-duplication
between users.

For end-to-end Google Drive encryption you'll need an app like Boxcryptor or
SyncDocs.

~~~
sneak
Based on what I have read about Google's internal services, I don't believe
your assertion to be true.

~~~
gcr
The fact that Google can offer fulltext search of documents implies they
cannot be encrypted, because Google can read them.

~~~
skybrian
That depends on what you mean by encryption. The data can still be stored in
encrypted files where Google has the key(s).

[https://cloud.google.com/security/encryption-at-rest/#encryption_of_data_at_rest](https://cloud.google.com/security/encryption-at-rest/#encryption_of_data_at_rest)

~~~
gcr
By some definition, accessing gmail over TLS counts as "in-flight encryption"
too.

Encrypting when the organization has the keys is only effective when accessing
the ciphertext is easier than accessing the key material.

I'd prefer a solution where Google could only store encrypted blobs with no
access to my keys and all decryption happens on the client device.

------
kgtm
Unfortunately, it appears that binary diffs are not supported.

This is a really important aspect for many workflows dealing with large files
(like TrueCrypt containers). Contrary to what is stated by the rclone
developer [1], at least Dropbox supports binary diffs [2].

This should be looked into, at least for Dropbox.

[1] [http://rclone.org/faq/#why-doesn-t-rclone-support-partial-transfers-binary-diffs-like-rsync](http://rclone.org/faq/#why-doesn-t-rclone-support-partial-transfers-binary-diffs-like-rsync)

[2] [https://www.dropbox.com/en/help/8](https://www.dropbox.com/en/help/8)

~~~
lloeki
For the curious, the rsync algorithm is, at its core, ridiculously small[0]
and involves rolling checksums. I just keep wondering why nobody (but Dropbox)
bothers to implement it.

[0]:
[https://github.com/lloeki/rsync/blob/master/rsync.py](https://github.com/lloeki/rsync/blob/master/rsync.py)
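For a sense of how small the core really is, here is a sketch of the weak rolling checksum from the rsync paper in Python (function and variable names are mine, not taken from the linked implementation):

```python
# Weak rolling checksum from the rsync paper (Adler-32 style),
# computed mod 2^16. The point is `roll`: sliding the window one
# byte updates the checksum in O(1) instead of rescanning the block.
M = 1 << 16

def weak_checksum(block: bytes):
    """Compute the (a, b) checksum pair over a whole block from scratch."""
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a, b

def roll(a, b, out_byte, in_byte, block_len):
    """Slide the window: drop out_byte on the left, add in_byte on the right."""
    a = (a - out_byte + in_byte) % M
    b = (b - block_len * out_byte + a) % M
    return a, b
```

The receiver indexes the per-block checksums in a hash table; the sender rolls this cheap checksum over its local file and only computes an expensive strong hash on candidate matches.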

~~~
Chris2048
I believe gdrive allows you to store file properties:

[https://developers.google.com/drive/v3/web/properties](https://developers.google.com/drive/v3/web/properties)

I wonder if you could add hashes and checksums to files this way? The trick
would be getting a server-side app or service to manage these.
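As a sketch of that idea, assuming the Drive v3 `properties` field described in the linked docs (the actual `files().update` HTTP call is omitted; the property name is made up), the client-side part could look like:

```python
import hashlib

def build_properties_body(path: str) -> dict:
    """Build the metadata body for a hypothetical Drive v3 files.update
    call recording a SHA-256 of the local file as a custom property.
    Drive computes md5Checksum itself, but a custom hash could cover
    e.g. the plaintext of a file uploaded in encrypted form."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    # Custom key/value pairs live under "properties" in Drive v3.
    return {"properties": {"sha256": h.hexdigest()}}
```

The server-side question stands, though: nothing on Google's end recomputes or verifies these values for you.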

------
SEJeff
FWIW: tarsnap is also rsync for cloud storage and Colin (guy who founded and
runs tarsnap) also has won a putnam award for his work in mathematics and
crypto.

[http://www.tarsnap.com/](http://www.tarsnap.com/)

~~~
dsacco
_> > Colin (guy who founded and runs tarsnap) also has won a putnam award for
his work in mathematics and crypto._

Colin won the Putnam as an undergraduate student. The Putnam award is not a
mathematics research award like the Fields Medal or the Abel Prize. It's a
mathematics competition. As such, Colin didn't win the award for any
particular work.

It's still quite impressive though. I would say Colin's work developing scrypt
has more applicability to cryptography than his Putnam award.

~~~
SEJeff
Sorry, you're entirely right. I just can't get over how unbelievably amazing
this is:
[https://news.ycombinator.com/item?id=35079](https://news.ycombinator.com/item?id=35079)

One of the best comebacks in HN history.

~~~
jdeibele
Thanks for the reference. Interesting to see Drew Houston popping in, because
he was just starting getdropbox.com at the time.

------
niftich
This fills a real need for me. It does nearly everything I want.

Aside from the program itself, your documentation is really good, and special
+1 for documenting the crypto thoroughly (and another +1 for using NaCl's
building blocks in a safe way).

As a related point, I recently bought a Chromebook (still unopened), which
pushes you heavily towards storing your files in Google Drive. It makes me
uneasy to store certain things unencrypted, so I'll investigate writing a
compatible implementation for ChromeOS.

------
DanielDent
Rclone is great. I wrote an integration to use it with git-annex
([https://github.com/DanielDent/git-annex-remote-rclone](https://github.com/DanielDent/git-annex-remote-rclone)).

Some of the supported providers (e.g. Amazon Cloud Drive) have a reputation
for days-long service outages. Some users of Amazon Cloud Drive have even
reported files going missing on occasion.

But the great thing with git-annex is you can have your data on multiple
clouds (in addition to being on your own equipment), so partial or complete
loss of a cloud provider does not need to result in availability or durability
issues.

~~~
rsync
Does the git-annex integration run over SSH ?

I'm trying to find a recipe for Rclone to connect to rsync.net since it won't
work over the usual path (rsync over SSH) ... but we do support git-annex so
...

Rclone+gitannex, over SSH, to rsync.net ? Want a free account to try it ?

~~~
DanielDent
I'd be happy to give it a try. I suspect rclone is the wrong tool for the job,
but I haven't looked closely at the rsync.net service.

------
cobbzilla
Very cool program.

s3s3mirror [0] is another tool for copying data between S3 buckets or the
local filesystem. Full disclosure: I am the author.

At the time I wrote it, I only needed to work with AWS, and needed something
very fast for copying huge amounts of data. It works like a champ, but I do
think about what it would take to make it cloud-independent; it wouldn't be
easy to maintain the performance, that's for sure.

[0]
[https://github.com/cobbzilla/s3s3mirror](https://github.com/cobbzilla/s3s3mirror)

~~~
codegeek
For S3 and the local filesystem, couldn't you also install aws-cli and do a
sync between the two? I do that for my stuff. Any reason you wrote this
instead of using aws-cli?

~~~
cobbzilla
from the readme:

"I started with "s3cmd sync" but found that with buckets containing many
thousands of objects, it was incredibly slow to start and consumed massive
amounts of memory. So I designed s3s3mirror to start copying immediately with
an intelligently chosen "chunk size" and to operate in a highly-threaded,
streaming fashion, so memory requirements are much lower.

Running with 100 threads, I found the gating factor to be how fast I could
list items from the source bucket (!?!) Which makes me wonder if there is any
way to do this faster. I'm sure there must be, but this is pretty damn fast."

~~~
diggs
This approach works well enough for relatively small numbers of objects. Once
you start getting into the millions (and significantly higher), it begins to
break down. Every "sync" operation has to start from scratch, comparing
source and target (possibly through an index) on a file-by-file basis. There
are definitely faster ways of doing it that scale to much larger object
counts, but they have their own drawbacks.

It's a shame the S3 Api doesn't let you order by modified date, or this would
be trivial to do efficiently.

~~~
cobbzilla
I'm curious if you can share how to synchronize N files without doing at least
N comparisons.

The main innovations in s3s3mirror are (1) understanding this and going for
massive parallelism to speed things up, and (2) where possible, comparing
etag/metadata instead of all bytes.

So far it has scaled pretty well; I know of no faster tool to synchronize
buckets with millions of objects.
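A toy sketch of that compare-by-metadata approach (this is not s3s3mirror's actual code; bucket listings are simulated here as plain dicts mapping key to an `(etag, size)` tuple, and the copy step itself is omitted):

```python
from concurrent.futures import ThreadPoolExecutor

def needs_copy(key, src_meta, dst_index):
    """A key needs copying if it's absent from the target or its
    etag/size metadata differs -- no byte comparison required."""
    dst_meta = dst_index.get(key)
    return dst_meta is None or dst_meta != src_meta

def plan_sync(src_index: dict, dst_index: dict, threads: int = 8):
    """Return the keys that must be copied, checking entries via a
    thread pool (mirroring the massive-parallelism idea above)."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        flags = pool.map(lambda k: needs_copy(k, src_index[k], dst_index),
                         src_index)
        return [k for k, flag in zip(src_index, flags) if flag]
```

Note the N-comparison floor still holds: every source key is examined once, but each comparison touches only a few bytes of metadata rather than object contents.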

~~~
diggs
Sorry, I should have perhaps put a disclaimer in my original comment. I work
for a company called StorReduce and built our replication feature* (an
intelligent, continuous "sync" effectively). We currently have a patent
pending for our method, so I'm not sure if I can offer any real insight
unfortunately.

I haven't looked at your project, but based on what you've said I agree the
way you're doing it is conceptually as fast as it can be (massively parallel
and leveraging metadata) whilst being a general purpose tool that "just works"
and has no external dependencies or constraints.

* [http://storreduce.com/blog/replication/](http://storreduce.com/blog/replication/)

------
forgotpwtomain
How does this compare to duplicity (
[http://duplicity.nongnu.org/](http://duplicity.nongnu.org/) ) ?

~~~
reidrac
rclone operates like a limited rsync between different cloud providers (and
the local filesystem too), so it either copies files or mirrors two
directories; duplicity does incremental backups.

~~~
mshook
I read in duplicity's man page that "In order to determine which files have
been deleted, and to calculate diffs for changed files, duplicity needs to
process information about previous sessions. It stores this information in the
form of tarfiles where each entry’s data contains the signature (as produced
by rdiff) of the file instead of the file’s contents."

Where are these tarfiles stored? The cloud?

BTW rclone checks size/timestamp and/or checksum to determine what to upload,
the same way rsync does. So you don't have incremental "snapshots" the way
duplicity does.
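The size/timestamp-versus-checksum decision described above can be sketched as follows (the field names are illustrative, not rclone's internals):

```python
def should_upload(src, dst, use_checksum=False):
    """Decide whether a file must be re-uploaded: size and modification
    time by default, or size and hash with a --checksum style option.
    src/dst are dicts with 'size', 'mtime' and 'md5' keys; dst is None
    when the file is missing on the remote."""
    if dst is None:
        return True                       # not present remotely
    if src["size"] != dst["size"]:
        return True                       # a size change always wins
    if use_checksum:
        return src["md5"] != dst["md5"]   # compare stored hashes
    return src["mtime"] != dst["mtime"]   # default: modification time
```

Either way the whole changed file is re-uploaded; there is no partial transfer, which is exactly the binary-diff limitation discussed elsewhere in this thread.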

~~~
zzzeek
Yes, and those tarfiles are in an entirely difficult format, have to be
pruned, and are the source of many mysterious bugs; I've even had to write a
"synthetic backup" script of my own to automatically turn a large list of
"incremental" backups into a "full" backup every night. duplicity has been
doing the thing I need, but I spent weeks writing tools around it. Even
though duplicity is written in Python, it invents a lot of its own Python
installation idioms and is not organized to provide a usable API, so
scripting it pretty much means you have to exec it.

"encrypted rsync to S3" is all I ever wanted in the first place so very much
hoping this can replace it.

------
estefan
This looks awesome. I've made several attempts at something that could write
encrypted files with obfuscated file names to several backends but never ended
up with something I was happy with.

I'll definitely give this a try.

Edit: One feature I would like would be to split files into n chunks to
obfuscate the length of files (assuming it wasn't obvious which chunks go
together to make up a file), so instead of a 1:1 relationship there was a 1:n
for large files. I suspect this is a lot more work though...
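A minimal sketch of that 1:n idea, assuming a keyed hash is used to derive chunk names so the grouping isn't visible to the storage provider (the scheme and all names here are hypothetical, not an rclone feature):

```python
import hashlib, hmac

def split_file(path: str, key: bytes, chunk_size: int = 4 * 1024 * 1024):
    """Yield (chunk_name, data) pairs for a file. Chunk names are keyed
    hashes of (path, index), so without the key the remote can't tell
    which chunks belong to which file, or how many chunks a file has."""
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            tag = hmac.new(key, f"{path}:{index}".encode(),
                           hashlib.sha256).hexdigest()
            yield tag, data
            index += 1
```

As the comment notes, the last chunk's size still leaks the file length modulo the chunk size, so a real scheme would likely also pad chunks to a fixed size.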

~~~
brotherjerky
You may be interested in CryFS:
[https://www.cryfs.org/](https://www.cryfs.org/) -- it splits files in order
to obfuscate sizes and directory layout.

~~~
estefan
Thanks! I'll check it out.

------
schlowmo
Looks promising, but I'm not sure about the crypto-part. Can someone give some
notes about the security of NaCl Secretbox using Poly1305 as authenticator and
XSalsa20 for encryption?

Is it justified to assume that this is adequate crypto as long as the nonces
are chosen correctly (= as random as possible) and the key size is bigger
than 128 bits (rclone uses a 256-bit key derived from the user password)?

Documentation of the crypto part can be found here:
[http://rclone.org/crypt/](http://rclone.org/crypt/)

EDIT: added constraint regarding keysize.

~~~
Osmium
> Can someone give some notes about the security of NaCl Secretbox using
> Poly1305 as authenticator and XSalsa20 for encryption?

(Speaking as an unqualified outsider) Both Poly1305 and Salsa20 are creations
of Daniel Bernstein / djb, who seems about as highly respected as you can be
in the crypto community. And NaCl, the library they use that implements them
(also by djb), is often highly recommended as a 'good' crypto library to use.

That said, it does go against the usual advice not to trust code from people
who make their own encryption rather than using existing standards, but maybe
this is the exception?

There was an article recently with some good commentary about how uneasy some
people feel with how much of modern crypto being used in production is coming
from relatively few people, including djb, but I can't seem to find it now...

Older column by Aaron Swartz on djb:
[http://www.aaronsw.com/weblog/djb](http://www.aaronsw.com/weblog/djb)

Relevant tptacek comment:
[https://news.ycombinator.com/item?id=705165](https://news.ycombinator.com/item?id=705165)
(no idea if this is still valid)

~~~
niftich
Great summary. The '_don't roll your own crypto_' argument is mostly just
shorthand for '_defer to the opinion of experts, use ready-made constructs
when possible, and if not, then exercise caution when hooking crypto
primitives together in unproven ways_'. djb is without a doubt a crypto
expert, and his NaCl library provides sane defaults and good interfaces for
implementing crypto in your application.

The other relevant tptacek post is 'Cryptographic Right Answers' [1], which
suggests using the NaCl default for encryption (i.e. Secretbox [2][3]), so
the rclone author is deferring entirely to NaCl for crypto, as recommended.

[1]
[https://gist.github.com/tqbf/be58d2d39690c3b366ad](https://gist.github.com/tqbf/be58d2d39690c3b366ad)

[2]
[https://godoc.org/golang.org/x/crypto/nacl/secretbox](https://godoc.org/golang.org/x/crypto/nacl/secretbox)

[3]
[https://nacl.cr.yp.to/secretbox.html](https://nacl.cr.yp.to/secretbox.html)

------
mafro
Neat. I wrote my own hacky little Python app to upload to dropbox, but they
recently broke that with changes to the dropbox python library. I hadn't
bothered to fix it :)

I'll check this out instead - thanks for sharing OP.

------
ashayh
If you use Amazon Prime, you also get unlimited Amazon Cloud Drive storage
for photos. rclone works well for backing up all your photos to Prime.

------
rsync
Will this work with any remote host over SSH ? All of the example targets (S3,
google cloud, etc.) are things that you _can't_ rsync to.

That is, can you point it at rsync.net (or your own server that is only
running ssh) ?

If the author is here, please email us (info@rsync.net) if you'd like a free
account to test with.

~~~
0xmohit
> Will this work with any remote host over SSH ?

No. Quoting from notes.txt [0]:

    Ideas
      * support
          * rsync over ssh


[0]
[https://github.com/ncw/rclone/blob/master/notes.txt#L25](https://github.com/ncw/rclone/blob/master/notes.txt#L25)

~~~
rsync
Yes, I suppose rsync might be a good protocol for "rsync for cloud storage" to
support.

That's a good feature request.

~~~
Dylan16807
It exists to give you rsync to things that aren't rsync. If you want to rsync
to things that are rsync, use rsync.

I don't understand the snark.

~~~
ktta
I think the parent meant to say that this could be an all-in-one solution.

------
schlowmo
Does anyone know if rclone preserves Linux File Permissions regardless of the
cloud storage?

It's not in the feature list, and my guess is that this would be hard to
implement if you can't make assumptions about the underlying file system.

~~~
reidrac
No, it doesn't. Only timestamps on files are preserved.

It could store that information as metadata, like it does with the
timestamps, and then restore it when copying/syncing files back, if supported
by the destination file system.
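A sketch of what capturing and restoring permissions as sidecar metadata could look like on the client side (a hypothetical extension, not something rclone does):

```python
import os, stat

def capture_mode(path: str) -> dict:
    """Record POSIX permission bits (and mtime) as per-file metadata
    before uploading; the dict would travel with the file's entry."""
    st = os.stat(path)
    return {"mode": stat.S_IMODE(st.st_mode), "mtime": st.st_mtime}

def restore_mode(path: str, meta: dict) -> None:
    """Re-apply the stored permissions after syncing the file back
    down to a file system that supports them."""
    os.chmod(path, meta["mode"])
    os.utime(path, (meta["mtime"], meta["mtime"]))
```

Ownership (uid/gid) could be recorded the same way, though restoring it generally requires root.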

------
skrowl
I'm not sure that syncing to cloud is really the best for most personal users,
at least not anymore. If you have multiple devices and use SyncThing / etc to
sync between them, you're protected against device loss and damage without
having to put your personal files on a server controlled by someone other than
yourself.

I was about to install it anyway, but I saw that it doesn't have
bidirectional sync. If 3 people at work shared a Google Drive folder and they
all tried to sync to it, it sounds like whoever synced last would always win,
and it could potentially delete / alter some files.

------
kolp
I have this running on a Raspberry Pi and it's working 24/7, uploading media
files from my NAS to my Amazon Cloud Drive.

I use Kodi on a separate pi to stream the content from Amazon, thereby freeing
up space on my NAS.

~~~
vivab0rg
Which Kodi add-on do you use to play content from Amazon Cloud Drive?

~~~
kolp
This guy has 2 plugins - you need to download the test/beta one, which
supports ACD. The other supports the remaining cloud services, but not ACD.

[https://github.com/ddurdle/Amazon-Cloud-Drive-for-KODI](https://github.com/ddurdle/Amazon-Cloud-Drive-for-KODI)

------
NikolaeVarius
Does anyone else work mainly with Linux but use Google Drive?

95% of stuff I work with is Linux, but that last 5% is done in Windows for
work. I use Google Drive, but the lack of a Linux sync client is really
annoying. I also have a NAS running Linux that I would love to use to sync my
GDrive/Amazon Drive.

I've been brainstorming ideas, including but not limited to seeing if I could
use W10 IoT on the RPi and install Drive on there (pretty sure it's
impossible).

It boggles my mind that there isn't an elegant solution to this that doesn't
require me to pay for a service.

~~~
gcb0
instead of thinking of a solution that is cross platform and works on your
walled-garden devices, you want to bring a walled-garden solution to your
other platforms?

~~~
NikolaeVarius
Im open to anything but would prefer to integrate with what I have. Do you
have an alternative?

------
alpb
Worth mentioning Google Cloud Platform supports copying/syncing buckets from
S3.

[https://twitter.com/gregsramblings/status/770804483798421504](https://twitter.com/gregsramblings/status/770804483798421504)
[https://cloud.google.com/storage/transfer/create-manage-transfer-console](https://cloud.google.com/storage/transfer/create-manage-transfer-console)

~~~
crazy1van
I'm considering using this google service to backup an s3 bucket to google
cloud.

Does anyone have experience for how fast this transfers and is there any info
about how efficient the service is in terms of API calls? With a bucket with
millions of objects, needless extra calls to List or Get can really add up to
a ton of money.

------
swinglock
Has anyone had success with Amazon Drive? 60 USD for unlimited storage (or
just 12 USD for unlimited storage using steganography) is hard to beat. If it
works better for backup than Backblaze's or CrashPlan's terrible clients and
horrid performance, it would be a good alternative.

~~~
Veratyr
It works, but it'll unpredictably throttle you if you use it too heavily (I'm
talking writing at least hundreds of gigabytes).

If you don't abuse it, it works great.

~~~
swinglock
When you sell "Securely store all of your photos, videos, files and
documents" in a world where drives holding multiple TB of files have been
available at low prices for years, then using hundreds of GB certainly isn't
abuse in my book.

Are we talking temporary throttling after having transferred hundreds of GB
in a short time span (hours? days? do you know how fast they allow you to
upload?), or throttling more or less forever once you store just hundreds of
GB?

~~~
Veratyr
> Are we talking temporary throttling after having transferred hundreds of GB
> in a short time span (hours? days? do you know how fast they allow you to
> upload?), or throttling more or less forever once you store just hundreds of
> GB?

The former, from what I've heard from the people who run into it. These
people are generally uploading tens of terabytes, however.

------
bsg75
I have been using this successfully with Google Cloud Storage and our own
internal Swift object store.

For the latter, it uploads much faster than the shell scripts I had been
using, and it has similar utility as an "rsync for the cloud".

~~~
tdumitrescu
I'm looking at GCS syncing solutions right now - was there a specific point
that made you choose this tool over gsutil's builtin rsync feature?

~~~
bsg75
I was using rclone with Swift first, so it was a natural choice.

If I was only using GCS, I might default to a Google supplied utility, but I
have not compared the two.

------
tgarma1234
I genuinely don't understand the use case for this since, for example,
Dropbox already syncs just the changes you make to a file (not the whole
file), automatically and bidirectionally, which this tool does not. So if
anyone can help state more clearly what this adds over and above the features
the various cloud storage vendors already provide, I would benefit from the
explanation.

~~~
niftich
Arguably the most distinguishing benefit is transparent en-/decryption of the
contents of your sync set, such that the cloud copy is always encrypted with
your own key, which is only available at your endpoints --- as opposed to
being encrypted with a vendor-controlled key.

EDIT: Also, most 'official' cloud drive clients place a folder into your
homedir, like ~/Google Drive or ~/OneDrive or ~/Box Sync. Only files placed
into these folders get synced up and down. This client allows arbitrary local
paths to be synced up or down.
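As a concrete illustration of that setup, rclone's crypt backend is configured as a remote that wraps another remote; a sketch of an rclone.conf along these lines (remote names and paths here are illustrative):

```
[gdrive]
type = drive

[secret]
type = crypt
remote = gdrive:encrypted
filename_encryption = standard
password = *** obscured value written by rclone config ***
```

With something like this in place, a command along the lines of `rclone sync ~/Documents secret:documents` would sync an arbitrary local path upward, with file names and contents encrypted by keys that never leave your machine.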

~~~
tgarma1234
Thanks for the comment and the edit.

------
sengork
It would be interesting to set this up on FreeNAS. Developers should look into
providing this as a plugin to FreeNAS users.

~~~
victorhooi
I second that as well. The plugin/jail system in FreeNAS 9 is pretty terrible
though. Hopefully the Docker system in FreeNAS 10 will make it easier.

------
nypar
I have used it with a cheap VPS (OVH; I will test it soon with Scaleway) and
it worked fine transferring data between Google Drive and Amazon Drive. ;)
ps: I have not tested it with encrypted files, as that option is only a few
weeks old. pps: see also the reddit datahoarder board for examples. ;)

~~~
xrjn
Have you played with Online's bucket service, C14 [0]? They seem to have the
lowest pricing on the market.

[0] [https://www.online.net/fr/c14](https://www.online.net/fr/c14)

~~~
jo909
While not priced per GB used, hubic.com is the cheapest storage I know of, at
50€ per year for 10TB in the biggest package. It uses the OpenStack Swift
API.

Edit: "unlimited" services are even cheaper on paper of course, but I don't
really trust those.

~~~
nypar
Hubic by OVH has problems right now; see their official message on their
Hubic board. ps: never tried C14.

------
JohnGB
Does rclone support multi-threaded/multi-process transfers in a similar way
to how gsutil supports it via the -m option? As a clarification, I'm
referring to an equivalent of "gsutil -m rsync ...".

I haven't been able to find anything in the documentation mentioning this.

------
dpc_pw
I know it. :) I use it to off-site my asymmetrically encrypted backup
(created with [https://github.com/dpc/rdedup](https://github.com/dpc/rdedup))
to Backblaze B2.

------
allstate
Will this work with two different s3 buckets (2 different regions)?

~~~
ratsmack
I believe they address this in the FAQ:

[http://rclone.org/faq/](http://rclone.org/faq/)

~~~
allstate
Sorry, I couldn't find it. Since it effectively downloads the file and
uploads it again, I guess it should work.

------
topranks
Been backing my music up to Google Drive with this in a cronjob for past year.

Works a treat... highly recommended!

~~~
fowl2
Does uploading music to Google Drive work as an ingress point for Play Music
(and its 50k free song storage)?

Their uploader is... something.

------
iDanoo
I use this to regularly back up to Google Drive, once you link it up, it's
seamless!

------
blackfede
I've been using this in production for a year, installed on a Synology NAS to
back up to OVH storage. Please get the github version as the download on the
website is quite different.

~~~
0xmohit
> please get the github version as the download on the website is quite
> different

Both the website and github seem to feature the same release (v1.33 as of
now):

[http://rclone.org/downloads/](http://rclone.org/downloads/)

[https://github.com/ncw/rclone/releases/latest](https://github.com/ncw/rclone/releases/latest)

~~~
blackfede
Glad to know, as I had been using a really limited version for a long time :-)

------
totophe
Exactly what I was looking for! Thanks!

------
anfroid555
No softlayer?

------
HalfwayToDice
Another thumbs-up. I've been using it to mirror to Amazon Cloud Photos (now
called Amazon Drive I think) and it's rock solid.

