Replace Dropbox with BitTorrent Sync and a Raspberry Pi (minardi.org)
207 points by doctoboggan on July 19, 2013 | 87 comments

I really wish they would open source the software. I'm using it in my local network for some unimportant data and it works fine, but I wouldn't trust it with anything remotely important unless I can see the source.

Yes I agree. I am hoping someone reverse engineers the sync protocol so we could build an open source client.

But like I mentioned in the article I am using this to replace a closed source cloud solution I was previously using so it is a step in the right direction.

Writing a new protocol with an open source reference implementation will probably be a better idea, because:

1) Reverse-engineering isn't clean and can be circumvented.

2) BTsync has not reached a point where it can't be beaten by a competing protocol.

We're actively developing an open source Dropbox type app that's powered by Tent (https://tent.io).

you should have a landing page with an email sign up form that you can link people to.

Tahoe-LAFS is a mature project of this nature: https://tahoe-lafs.org/

I am writing a similar project which is fairly immature: https://github.com/cryptosphere/cryptosphere

The particulars of the BT protocol are beyond me, but I thought it would already do a compare between the copy you had and the one on the external server, and only send the relevant bytes to bring everything up to sync again.

Between that and a "compare timestamps to figure out the newest file" I wouldn't think it would be that hard to re-create? What am I missing here?

BitTorrent creates hashes of your file. The file is broken down into chunks of a fixed size and a hash is taken of each chunk. If you already have a piece with a given hash, you won't download it. If a hash differs, the piece will be transmitted. The receiver verifies the hash. If verification fails, it re-requests that piece.

I assume the sync protocol works the same. It hashes the file, checks if any chunks have changed, and sends the hash metadata to the recipient, who then requests any pieces that have changed.
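A minimal sketch of the chunk-hash comparison described above. This is not BTSync's actual protocol (which is unpublished); the tiny chunk size, the choice of SHA-256, and the function names `chunk_hashes` and `changed_chunks` are all assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # tiny illustrative value; real clients use far larger pieces


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Return the SHA-256 hex digest of each fixed-size chunk of data."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def changed_chunks(old: bytes, new: bytes, chunk_size: int = CHUNK_SIZE) -> list[int]:
    """Return the indices of chunks in `new` that a holder of `old` must fetch."""
    old_hashes = chunk_hashes(old, chunk_size)
    new_hashes = chunk_hashes(new, chunk_size)
    return [
        i for i, digest in enumerate(new_hashes)
        if i >= len(old_hashes) or old_hashes[i] != digest
    ]


# Chunk 1 was modified and chunk 3 was appended, so only those two are fetched.
print(changed_chunks(b"aaaabbbbcccc", b"aaaaBBBBccccdddd"))  # prints [1, 3]
```

Note that fixed-size chunking like this only helps with in-place edits and appends; inserting a byte near the start shifts every later chunk, so every hash after it changes.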

What about hash collisions?

SHA-2 has no collisions found as of yet [1]. The collision attacks on md5 required a lot of junk data, so not sure how much of a problem that would be.

[1]: https://en.wikipedia.org/wiki/SHA-2#Comparison_of_SHA_functi...

Steve Gibson of GRC/Security Now has mentioned (if I remember correctly) that the protocol/spec will be published, or at least shared with people who have an interest.

That's better than nothing, I suppose, but unless the source is open it's still impossible to trust the proprietary binary.

> impossible to trust the proprietary binary

If the protocol is published properly you won't have to: people will write their own free software implementations.

If the spec is published it should be a lot easier to build an open source client.

Take a look at git-annex. It's not quite an apples-to-apples comparison with btsync, but I find that it meets my needs quite well.

I'm using git-annex-assistant right now, and while it is really nice and promises to replace Dropbox for me, it could still use a lot more loving. There seem to be some weird states that you can easily get it into: I've had files syncing properly while the webapp claimed that syncing was failing (or rather, that item is colored red, has an "!", and doesn't say anything else); it does not handle very large numbers of small files well (something git by itself does well); and there are a few other assorted problems.

Looks like the developer is going to be giving it more loving full time for the next year -- https://campaign.joeyh.name/ almost funded.

Yup! Joey is dedicated as hell which makes me really optimistic about the future of git annex.

Are you implying you actually investigate every piece of code you are running right now? I doubt it. And to do it properly you'd also have to review all the hundreds of thousands of lines of code needed to build and run this source, or else you'll never know if there's a backdoor somewhere. Oh man, and what about your BIOS? God knows what low-level packet sniffer is set up in there :]

Just kidding. Partly.

Then how would you know they were using the same source in their binaries?

Why use their binaries? I rather have Debian compile mine.

I wish BitTorrent Sync had an option to sync encrypted data with relatively "untrusted" nodes, so you could throw it up on a VPS or whatever without worrying too much about how secure the server is.

Kind of like tarsnap or Tahoe-LAFS, but auto-syncing, etc like Dropbox.

The first open source project to give me Dropbox-like ease of use/functionality with strong encryption will win my support.

git annex assistant is slowly getting to that point. see http://git-annex.branchable.com/assistant/

I'm using BTSync with EncFS to sync encrypted data - including on some machines which don't have the encryption keys or in one case a machine which doesn't even have EncFS installed. (This is a trick which works to secure Dropbox/GDrive/SkyDrive stored data too).

A workaround is to run encfs in reverse mode, which exposes an encrypted mount point of an unencrypted path. You could then add this mount point to BTSync.

(Accidentally replied to the wrong parent.)

I put encrypted files in Dropbox using encfs. I imagine the same would work for this. It does require one tiny extra step tho. But not enough of a burden for me to care.

Remember, first I have to trust the untrusted BitTorrent® "node"...

BTSync continues to be a very interesting project but I remain concerned about how vulnerable the shared secret is to brute-force attack.

I'm not clear what there is to stop someone iterating through all combinations (especially as many will use dictionary-based secrets) to discover shares. Additionally, the master server BitTorrent run to broker the connections presumably also has all of the shared secrets and is vulnerable to attack.

I'm really excited about BTSync and have been proud to be in the early beta program but these security issues don't appear to be addressed.

I would not believe that the central server has all of the shared secrets. One possible solution is to hash the shared secret and send it to the central server. When someone wants to connect, just hash the key given and send it to the central server. You would get back a list of peers and then connect to them using encryption. The only problem is possible brute forcing of hashes. There are probably other ways of doing it.

Yes, I hope they hash the secrets (and I guess someone could monitor the network traffic from the client to tell).

Also, if someone did have the master list of hashed secrets, they might still be able to manipulate their own client to send the hashed secret back to the server and gain access.

Being closed source, it's hard to know what the potential vectors are (granted, Dropbox is also closed source).

You are correct that it's hard to judge security when the application is closed source. However, in the model we are suggesting, having the hashed secret would not be sufficient to get access to the files. Although you could use it to find clients' IP addresses, you would not be able to connect with it, because the secret key would be the basis of any encryption between clients.
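The model suggested in this subthread can be sketched roughly as follows. This is a hypothetical illustration of the commenters' proposal, not BTSync's real design: the `lookup:`/`encrypt:` labels, the use of SHA-256 and HMAC, and all function names are assumptions.

```python
import hashlib
import hmac


def lookup_id(secret: bytes) -> str:
    """Value a client could reveal to a tracker to find peers for a share."""
    return hashlib.sha256(b"lookup:" + secret).hexdigest()


def encryption_key(secret: bytes) -> bytes:
    """Key material derived from the secret itself; never sent to the tracker."""
    return hashlib.sha256(b"encrypt:" + secret).digest()


def prove_knowledge(secret: bytes, challenge: bytes) -> bytes:
    """Answer a peer's challenge; needs the secret, not just its lookup_id."""
    return hmac.new(encryption_key(secret), challenge, hashlib.sha256).digest()


secret = b"example-shared-secret"   # placeholder value for the demo
challenge = b"random-nonce-from-peer"

honest = prove_knowledge(secret, challenge)
# Someone who only saw the lookup_id at the tracker cannot reproduce the answer.
impostor = prove_knowledge(lookup_id(secret).encode(), challenge)
print(hmac.compare_digest(honest, impostor))  # prints False
```

Under these assumptions a leaked list of lookup ids reveals which peers share a folder, but joining the share still requires the secret. A low-entropy, human-chosen secret would remain guessable offline from its lookup id.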

i don't understand. the faq says the secret is a generated 20 byte sequence. if we assume it's random then that's 160 bits, which is too expensive to brute force if it's used for a symmetric key (and it sounds like it is). and i don't see how dictionary attacks are relevant since it's a machine generated random chunk, not a small set of dictionary words.

so can you clarify what you're worried about?
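A back-of-the-envelope check of the 160-bit claim above. The guessing rate is an assumed, deliberately generous figure, not a measurement of any real attacker, and `years_to_search` is a name invented for this sketch.

```python
SECRET_BITS = 160                 # 20 random bytes, the FAQ figure quoted above
GUESSES_PER_SECOND = 10**12       # assumed attacker speed, purely illustrative
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_search(bits: int, rate: int = GUESSES_PER_SECOND) -> float:
    """Expected years to hit one secret, searching half the keyspace on average."""
    return (2 ** bits / 2) / rate / SECONDS_PER_YEAR


print(f"{years_to_search(SECRET_BITS):.2e} years")
```

Even at a trillion guesses per second the expected search time is on the order of 10^28 years, which is why the worry only applies if secrets are chosen by people rather than generated randomly.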

BTW, you can set the secret to a much larger value than the default. This should make even brute forcing very unlikely.

Not sure I see the point. Or the complaints people are having here.

The author really should have a backup plan in place. He still doesn't, even though this software lets him sleep easy at night. He's still riding the edge of disaster if a file ever becomes corrupted and then synced. Uh oh. Incremental backups, folks. It's really not that hard.

On the open source side of things, Dropbox and BT Sync aren't really that interesting. You could whip up similar functionality with a bash script, inotify, and rsync in a matter of hours. Dropbox is only relevant because they give you off-site always online dumb storage. There will never be an open source or "trustworthy" solution to a cloud storage service. This Raspberry Pi idea is pretty neat, though. But local encryption is always the answer.
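To make the "a few hours of scripting" claim concrete, here is a rough one-way mirror. It swaps in different tools from the ones the comment names: plain polling instead of inotify, and `shutil.copy2` instead of rsync. The names `mirror_once` and `watch` are made up for this sketch.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def mirror_once(src: Path, dst: Path) -> list[Path]:
    """Copy files that are missing or older in dst; return the relative paths copied."""
    copied = []
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(src)
        target = dst / rel
        if not target.exists() or target.stat().st_mtime < path.stat().st_mtime:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also copies the modification time
            copied.append(rel)
    return copied


def watch(src: Path, dst: Path, interval: float = 2.0,
          rounds: Optional[int] = None) -> None:
    """Poll src every `interval` seconds; runs forever unless `rounds` is given."""
    done = 0
    while rounds is None or done < rounds:
        for rel in mirror_once(src, dst):
            print(f"synced {rel}")
        done += 1
        time.sleep(interval)
```

What this leaves out is most of what makes sync hard: deletions, two-way conflicts, NAT traversal, and partial transfers are all ignored here.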

Also, if you have important data that should be encrypted, then it's a pretty stupid thing to have it synced all over the place. You don't want every device acting as a possible point of compromise. Principle of least privilege, and all that.

The article is about setting up your own simple syncing service for non-mission-critical data like notes. I don't think you see the point 'cos:

    a) You didn't read the article.
    b) Tinkering with technology to do fun things is of no interest to you.

I read the article. The author started by whining that he lost months of notes. His solution was to beef up his syncing. I still have no idea how he came to that conclusion when backup technology has existed since the dinosaurs. I guess that technology just isn't new or exciting enough to blog about.

I think the author really just wanted an excuse to use his Raspberry Pi.

I do love to tinker. But I don't do foolish things like replace backups with fragile dogshit.

Author here, I've had plenty of excuses to use my Pi in the past, this is just the latest one. I think you are incorrect in characterizing this as "fragile dogshit", this solution is in fact much more robust than my previous cloud solution.

BTSync also does built in versioning which solves the backup issue.

> BTSync also does built in versioning which solves the backup issue.

So for example, if a person was working on a thesis document and then accidentally overwrote that with a blank version which gets synced across all the nodes, it would be possible to reverse this action and recover the original thesis document ?

While I haven't exercised this functionality yet my understanding is yes that is possible. You can specify the number of days to store history. (defaults to 30)

I don't believe that is the case. I think it only stores deleted documents in the .SyncArchive folder, and does not log modifications.

Of course running a cronjob on your raspberry pi to do incremental backups would be a fun way of tackling that problem.
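One way such a cron-driven incremental backup could look, as a sketch only. It is similar in spirit to rsync's `--link-dest` option: files unchanged since the previous snapshot are hard-linked rather than copied. The `snapshot` function and its directory layout are assumptions, and hard links require both snapshots to live on the same filesystem.

```python
import filecmp
import os
import shutil
from pathlib import Path
from typing import Optional


def snapshot(src: Path, dest: Path, previous: Optional[Path] = None) -> None:
    """Create a snapshot of src at dest, hard-linking files unchanged since previous."""
    for path in src.rglob("*"):
        if not path.is_file():
            continue  # note: empty directories are not recorded in this sketch
        rel = path.relative_to(src)
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        old = previous / rel if previous is not None else None
        if old is not None and old.is_file() and filecmp.cmp(path, old, shallow=False):
            os.link(old, target)        # unchanged: share storage with the last snapshot
        else:
            shutil.copy2(path, target)  # new or modified: store a fresh copy
```

Each run would be given a new dated `dest` directory and the most recent earlier snapshot as `previous`, so an accidentally blanked file is still intact in older snapshots.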

Most here would keep their documents versioned in Subversion or git. I don't think technically minded individuals confuse syncing and backups (or versioning). This however does not apply to non-technical people. That is where Dropbox is a win, since they do versioning.

I use AeroFS (https://aerofs.com/) on a lowendbox VPS. Pretty cheap and works well for me. Will have to give BitTorrent Sync a try though.

I tried to replace Dropbox with BT Sync a few weeks ago. It didn't work up to my expectations.

I write code on several computers and I use Dropbox to keep the files in sync so that I am able to quickly resume working when switching computers. I've never had any kind of sync problems with Dropbox.

I tried doing the same with BT Sync. The biggest problem I had was that sometimes it didn't sync all files. Say I had a directory with 10 files and 3 computers in the BT Sync "network". It was not unusual that 8 files got synced between all computers but 2 of the files would never get synced to one of the computers. Then at a later time, randomly, it would resume sync-ing. After having this problem 3 or 4 times I went back to using Dropbox and never looked back.

Does btsync have file rollback like Dropbox does? That's saved my ass so many times.

As far as I know there is no way to get something back after deleting it using only BT Sync.

They released support for file recovery a few days ago, along with their android app (http://blog.bittorrent.com/2013/07/17/now-in-beta-bittorrent...). Instructions to recover files and set the recovery history time here: http://forum.bittorrent.com/topic/16410-bittorrent-sync-faq/.

No idea why BitTorrent Sync bothered to release a client that wasn't open source. People who care about their security won't use closed-source programs, and people who don't care use Dropbox.

Perhaps we could use git-annex for this?

Dropbox costs money, and it uploads everything to Dropbox.

With BTSync, I can sync hundreds of gigabytes over a LAN with very little effort. I do hope a good open source option will be available soon, but for many use cases this is a big upgrade.

AFAIK, you can sync over LAN with Dropbox for a while now.

I believe it still counts toward your quota though, mooting the subject.

I don't want to sound like a lamer but can't this basically be done with ssh and rsync? I've been storing and fetching files from my home cubox with a setup like this.

While I'm away from home I use a USB stick with a custom dev environment all set up. Then I just back up and fetch files from there as needed. Something like this gets the job done: rsync -avze ssh ~/my/backupfile user@(home ip address):~/backup/location

Probably a bit weird to understand but this is just one example command for backing up a local file (there are more advanced options too).
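For readers who find the one-liner above hard to parse, here is the same invocation assembled piece by piece. The `-a`, `-v`, `-z`, and `-e` flags are standard rsync options; the helper name `build_rsync_command` and the example host `203.0.113.5` (a documentation-only address) are placeholders, and the remote path is given relative to the remote home directory.

```python
import os
import shlex


def build_rsync_command(local_path: str, user: str, host: str,
                        remote_dir: str) -> list[str]:
    """Build the argv for an archive-mode, compressed rsync push over ssh."""
    return [
        "rsync",
        "-avz",                          # archive mode, verbose, compress in transit
        "-e", "ssh",                     # use ssh as the remote shell
        os.path.expanduser(local_path),  # expand ~ ourselves; no shell is involved
        f"{user}@{host}:{remote_dir}",   # relative remote paths start at the home dir
    ]


cmd = build_rsync_command("~/my/backupfile", "user", "203.0.113.5", "backup/location")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Swapping the last two arguments turns the push into a fetch from the home machine.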

It probably can, but the BitTorrent Sync use case is likely trying to cover a lot more. For example, making it easy to sync across internet, across NATs, across operating systems. Yes, with some thought, all that is possible with SSH and rsync as well but it's a PITA to maintain and not very mom-friendly. BitTorrent Sync would allow me to sync computers across the world and also share folders with family members.

This software is something which should be used only with encrypted file containers[ala truecrypt] as it is closed source and its own built-in OTW encryption is not the greatest.

Or better yet, OwnCloud which IS open source and can replace gmail (contacts and calendars) too.

Does OwnCloud run well on Raspberry Pi?

The server portion runs on anything that can run a PHP enabled web server.

The client portion runs on pretty much anything remotely modern.

I have no pi so I can't say how well it actually runs.

The Pi is fairly limited in terms of resources. I was wondering if it could support OwnCloud with its low-end hardware. I attempted to run OwnCloud on 128 MB of RAM and ran into a lot of problems.

This would be why I never bothered owning a pi.

I tried it but it was very slow.

no it doesn't.

I just love the fact I can have multiple "root" folders... something that is missing from Dropbox/AeroFS and pretty much everything else I have tried (I know about symlinks etc, but, native support is so much better!)

The whole single string security is a mixed blessing... It is what makes the application so easy, and at the same time, it makes it feel so insecure!

Alternative: Buffalo LinkStation Pro Duo[1] for $99, and rsync[2]. Mirrored disk rsync backup, open source, on basically any OS. Not as cool as BitTorrent and a Raspberry Pi, though.

[1] http://www.tigerdirect.com/applications/searchtools/item-det...

[2] http://buffalo.nas-central.org/wiki/Rsync_-_synchronizes_fil...

Additional actions I've done:

- Copied all my backup files to the thumbdrive first, so my main machine wouldn't need to transfer them all by network to the other computer. In my case it was a 13GB volume.
- Configured a no-ip account to have a fixed URL address for my backup machine. This is not necessary, but I think it has improved the performance a little bit.
- Configured the backup machine in my router to have a dedicated IP address.
- Forwarded bitsync TCP port in the router to the backup machine.
- Configured bitsync on my main machine to sync folders in that specific no-ip URL.

My impression is that files now sync, or at least start syncing a bit faster now.

Guys, so I have a question about Bittorrent sync. I realize that it's purely a sync tool but is there a way to make it backup data as well? Maybe there's an option that I'm missing that says "Don't delete, just backup"

It will put deleted files in a .SyncTrash folder, but previous versions of modified files will otherwise be lost in the absence of actual backups.

Edit: I stand corrected, BTSync apparently now has a previous version feature.

You can create a key/node with Read-Only permissions. Also when something gets deleted it goes to the .SyncTrash folder but keep in mind it doesn't take into account file corruption.

what RPi kit do i need to buy to make this work? Do I need the complete starter kit, will that block the usb?

edit: here's what i purchased:

[1] CanaKit Raspberry Pi (512 MB) Complete Starter Kit (Raspberry Pi 512 MB + Black Case + Micro USB Power Supply + Original Preloaded SD Card + HDMI Cable)

[2] Edimax EW-7811Un 150 Mbps Wireless 11n Nano Size USB Adapter with EZmax Setup Wizard

[3] SanDisk Cruzer Fit CZ33 32GB USB Flash Drive (SDCZ33-032G-B35)

[4] Preloaded SD Card for Raspberry Pi (16GB, Raspbian "wheezy")

[1] http://www.amazon.com/gp/product/B00DLUXD64/ref=oh_details_o...

[2] http://www.amazon.com/gp/product/B003MTTJOY/ref=oh_details_o...

[3] http://www.amazon.com/gp/product/B00812F7O8/ref=oh_details_o...

[4] http://www.amazon.com/gp/product/B00AYH22VY/ref=oh_details_o...

You do not need any particular kit. You will need:

* Raspberry Pi
* SD Card with OS (Raspbian Wheezy is the simplest to use)
* USB power cable
* Internet connection (Ethernet cable or USB WiFi dongle)

The rest is software that is described in my article.

Well... you should be fine now :)

You can now have your own free 30gb git repo as well.

Words of caution:

- Buy a second SD card.
- Once you have a running rPi tweaked to your liking, clone it with dd to prevent eventual data loss.

Why not just dd to a file on your hard drive?

That's what I meant.

I wasn't clear: the second SD card is to play with more than one OS or to be able to quickly swap in a working distro in case the first one goes wrong (dd to a file is of course the mandatory way to go, but having the second SD card plug-and-play is more convenient IMO: no fdisk involved).

Ah, alright. That makes sense then. I interpreted it as you suggesting to dd the first sd card to a second sd card which would then become your backup.
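For illustration, the "dd to a file" step discussed above amounts to a block-by-block copy. This sketch does the same in Python and adds a checksum so the image can be verified afterwards; `copy_image` and `file_sha256` are invented names, and reading a real SD card device would need the right device path and permissions, neither of which is shown here.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MiB per read, comparable to dd's bs=4M


def copy_image(source: str, image: str, block_size: int = BLOCK_SIZE) -> str:
    """Copy source to image block by block; return the SHA-256 of the bytes copied."""
    digest = hashlib.sha256()
    with open(source, "rb") as src, open(image, "wb") as out:
        while True:
            block = src.read(block_size)
            if not block:
                break
            out.write(block)
            digest.update(block)
    return digest.hexdigest()


def file_sha256(path: str, block_size: int = BLOCK_SIZE) -> str:
    """Re-read a file from disk and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Comparing the value returned by `copy_image` with `file_sha256` of the finished image is a cheap way to confirm the backup was written intact.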

This is amazing! I was contemplating getting SpaceMonkey / Plug , but looks like I can build my own thing that will work well enough and cost much less.

After you have the internet connection, why do you need esoteric stuff like a Raspberry Pi?

Also, on the topic of internet connection: at least in the contracts I've seen (AT&T, Verizon), you can't run any 'server' or accept any connection initiated from the outside on your home account. As crazy as it may seem, this is in the contract you probably signed to have internet.

The point of bringing in the RPi was to create a node that is always-on, because there is no central file server when using BTSync. RPi makes sense because it's cheap and low power, but just use your imagination. I really doubt it was intended to be esoteric. Instead you could use an extra computer you already own that can stay on 24/7, or a VPS.

For full (or even better) Dropbox-like solution, check also AjaXplorer[1] which is a PHP file manager.

It supports zipping/unzipping, file preview, editing, sharing and many other things.

And best of all - it's open source and you can create custom plugins for it.

[1] http://ajaxplorer.info/

Replace one proprietary service with another...

this sounds really cool and nerdy, i'd love to partake.

but then again, having dropbox just work is pretty swell too.

Just set up BTSync with a couple of friends for a shared folder. Seems to work pretty nice so far.

Why replace it for something else that's not open source either?

I'd like to mention unison as an open source alternative.

I would use a VPS in place of the Raspberry Pi.

backupsy op

Is backupsy any good? I'm really interested in setting something like this up.

gnunet will be better

Will be or is? How can we use it to replicate Dropbox/BT Sync?
