

S3sync – Tool belt for managing your S3 buckets - clarete
https://github.com/clarete/s3sync
I started contributing to s3sync before hearing about the official amazon s3 tool written in python. However, after testing the official tool for a couple minutes, I decided to spend more time working on s3sync and here's the result.

Thanks to Michael Grosser for his support and patches!
======
mfonda
How does this compare to the existing s3cmd [1] tool? One issue I've had with
s3cmd is it tends to use a lot of memory when syncing two large buckets. I'd
love to see a tool that was faster / more memory efficient than s3cmd. Would
be awesome to see some benchmarks and feature comparison in the README!

[1] [http://s3tools.org/s3cmd](http://s3tools.org/s3cmd)

~~~
clarete
So, basically, the main difference is the language they're implemented in. I
started working on s3sync because I found its code easier to read and I really
needed to practice my Ruby skills.

Although s3cmd is older and has more features, s3sync was designed to become
stable and well tested. I'm definitely planning to write benchmarks for s3sync
and improve its performance as much as I can.

~~~
bender80
Thank you. Does this support IAM Roles?

~~~
clarete
Not currently! Please feel free to open a ticket about that! Thank you! :)

~~~
bender80
Ok here you go.

[https://github.com/clarete/s3sync/issues/15](https://github.com/clarete/s3sync/issues/15)

Wish I could help, but I don't know ruby :)

------
jacobsenscott
A tool that could do parallel downloads of small files would be a winner. We
have a lot of small files and use s3cmd. It downloads one at a time. You can
finagle it with some xargs magic, but it would be nice to have it built in.

~~~
atonse
Totally expecting to see someone build this in Go in a couple weeks to do
exactly that sort of thing. :)

Wish I had the time, I just realized it would be a fun project to learn Go
with.

~~~
HeyImAlex
You could just modify s3cmd to use threads during uploads? The problem is I/O
bound, so the GIL isn't an issue, and you wouldn't have to go through the
trouble of implementing S3's auth header.
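
A minimal sketch of the thread idea, assuming nothing about s3cmd's internals: `fetch_one` is a hypothetical stand-in for a single S3 request (it only sleeps to mimic network latency), and `fetch_all` is an illustrative name, not an existing function in any of the tools discussed here. Because the work is waiting on the network, a plain thread pool is enough to overlap the requests.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_one(key):
    """Hypothetical stand-in for one S3 GET; sleeps to mimic network latency."""
    time.sleep(0.05)
    return key, len(key)


def fetch_all(keys, workers=8):
    """Fetch many small objects concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_one, keys))


if __name__ == "__main__":
    keys = ["logs/%04d.txt" % i for i in range(40)]
    start = time.time()
    results = fetch_all(keys)
    print(len(results), "objects in %.2fs" % (time.time() - start))
```

In a real tool, `fetch_one` would be replaced by whatever call already performs a single transfer, so the signing code stays untouched.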

------
kcorbitt
This is handy and extends S3's use case by making two-way sync
straightforward. My long-term storage is in Amazon Glacier; I wonder how much
effort it would take to extend this and make a straightforward process to pull
the data from Glacier, sync with S3 and then push back to Glacier.

------
afitz0
What does this provide that isn't already provided by the official AWS CLI?

[http://docs.aws.amazon.com/cli/latest/reference/s3/index.htm...](http://docs.aws.amazon.com/cli/latest/reference/s3/index.html)

~~~
clarete
Hi, thanks for asking! I definitely tried this guy before putting more effort
into s3sync. Unfortunately, its CLI experience is really poor, and its error
reporting didn't help me understand why synchronizing my stuff with S3 was not
working.

At the end of the day, I had an unreadable file inside the directory that I
was trying to back up. I found that out using the `--debug` option of the
official client, but I couldn't actually continue copying the files because of
that error.

When I tried with s3sync, as I expected, it just yielded a warning about the
single file I had with problems and kept working until my backup was done!

Sorry for the wall of text, I just think it's funny because this exact question
came to my mind a couple days ago and that's how I answered myself! :)
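
The warn-and-continue behaviour described above can be sketched roughly like this. It is not s3sync's actual code (which is Ruby); `sync_files` and `fake_upload` are hypothetical names used only for illustration, and the "upload" just reads the file locally instead of talking to S3.

```python
import sys


def sync_files(paths, upload):
    """Try every path; warn on a failure and keep going. Returns failed paths."""
    failed = []
    for path in paths:
        try:
            upload(path)
        except OSError as exc:
            # One bad file should not abort the whole backup.
            print("warning: skipping %s: %s" % (path, exc), file=sys.stderr)
            failed.append(path)
    return failed


def fake_upload(path):
    """Hypothetical uploader: reads the file, standing in for a real S3 PUT."""
    with open(path, "rb") as handle:
        handle.read()
```

Calling `sync_files(paths, fake_upload)` on a list that contains one unreadable file prints a single warning and returns that path, while every other file is still processed.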

------
rogueleaderr
Have you tried jets3t? ([http://www.jets3t.org/](http://www.jets3t.org/)) It's
got pretty comprehensive tools for managing S3 and has been rock solid for me
so far.

------
floodfx
You should check out this suite of S3 cmd line tools too -
[https://github.com/aboisvert/s3cp](https://github.com/aboisvert/s3cp)

------
lsb
What are the pros and cons of using s3sync versus s3cmd?

~~~
clarete
First things that come to my mind:

s3cmd:

* might be considered more robust and battle tested, as people already
  commented here;
* is written in Python, so if you have a Python environment running, you
  might be more comfortable with it.

s3sync:

* Smaller codebase, so it might be easier to keep things simple and well
  tested, achieving the same stability with less time/effort;
* Error reporting. One of my main reasons to keep working on s3sync was its
  better error reporting. I just go crazy when something goes wrong and I
  don't know why.
* It's in Ruby, so if you have a Ruby environment and you don't want to add
  any Python dependencies, you might choose it.

------
brryant
s3cmd is battle tested and hasn't had any memory issues for me. (100 buckets,
~40GB each bucket)

