PrivateStorage.io: A secure and privacy-focused cloud storage solution (leastauthority.com)
194 points by aralib 11 days ago | 76 comments

FWIW, I've been using syncthing [0] for some years now [1] and am very pleased. Even though my data is unavailable on the cloud from any untrusted computer (like e.g. my corporate laptop), it's synced on my "fleet".

I'm not sure that PrivateStorage actually adds anything to the equation?

EDIT> The Tahoe-LAFS [2] model is more about spreading your data over multiple providers: a NAS at home, several VPS providers, or what have you. It feels like RAID over the network, and it allows very precise redundancy policies.

So syncthing actually only runs on trusted machines, whereas PrivateStorage will be able to run on both trusted (tightly managed) and untrusted machines (like a VPS in the USA).

[0] https://syncthing.net

[1] https://try.popho.be/byeunison.html

[2] https://tahoe-lafs.org/trac/tahoe-lafs

How much data do you sync? I'm syncing 60 GB with NextCloud and it annoys me frequently: every time I log in it spends 5 minutes scanning my data, pegging at least one core of CPU and using up a lot of my I/O capacity. And of course at a pretty annoying time, since I almost always want to be actually using my machine during the first 5 minutes after logging in. And I'd really like to be syncing more data. Anyway, wondering if syncthing does better in this respect.

I'm using Syncthing as well, and the Syncthing dir on my laptop currently holds 52 GB of data.

That includes a syncthing share for my automatic backups and my "dropbox" replacement (a simple directory for syncing between my phone and computers).

It works great. I haven't had any issues with it. The current release of syncthing is very stable. Earlier versions were a bit error-prone, but they seem to have fixed most, if not all, of the bugs I encountered.

I have about 10 GB in multiple folders synced in various combinations between:

- ec2 instance (light burstable, always on)

- desktop (debian)

- laptop (manjaro)

- desktop (windows)

- phone (one-way sync to get photos off the phone)

- tablet (kindle fire)

It has been great. Solid and trustworthy. It picks up changes made before the service was running, handles deletes and renames just fine, and updates are simple. The web-based UI is good.

I like that calmh and the team are not adding lots of features. There are lots of things they could add to make syncthing "better", but they want to make sure syncthing does one thing well.

One nit - the UI shows the "latest change" for each folder. The common understanding of this phrase would be "the file in that folder that most recently changed" but what syncthing actually shows here is "the most recent change that syncthing made to this folder". That means that if I change a file on the current device and syncthing picks it up and replicates it out to the other devices, that change will not be shown as the "latest change". If some other device changes the file and syncthing replicates that change back to the current device, then it will be shown as the "latest change". This is confusing. "latest change" should just show the file that most recently changed for any reason.

I'm syncing around 90 GB between my server, laptop, and LineageOS Pixel phone. I use it to sync my documents, music, passwords, and archived pictures. I also use it to sync photos taken by my phone camera as they are taken.


* Camera: 1.8 GB, 243 files

* Documents: 10.8 GB, 4604 files

* Music: 61.5 GB, 25077 files

* Passwords: 660 KB, 726 files

* Pictures: 16.5 GB, 6450 files

The passwords are managed by 'pass' [0], which is viewable on my phone using Password Store [1]. Cold-launching Syncthing takes ~10 seconds on my phone, but it does it automatically on boot and thereafter runs in the background. Battery impact seems to be negligible.

[0]: https://www.passwordstore.org/

[1]: https://f-droid.org/en/packages/com.zeapo.pwdstore/ and https://play.google.com/store/apps/details?id=com.zeapo.pwds...

So far I've been disappointed with sync issues with Spideroak, OneDrive, and Nextcloud.

Now I use Tresorit (which I only became aware of because of... an online ad!?). It doesn't seem to have sync issues for me. Dropbox didn't have issues either, but it wasn't as secure.

I wanted so badly for Nextcloud to work.

Given how rock-solid Syncthing has been, I wonder how hard it would be to bolt encryption onto it so anything that some specific nodes receive is always encrypted.

It has been requested since at least 2014 [0] and it just sounds like it isn't going to ever be a feature of Syncthing.

The ability to have untrusted nodes is the one feature that has kept me using Resilio Sync.

[0] - https://github.com/syncthing/syncthing/issues/109

It wouldn't be only specific nodes, but would something like encfs[1] work?

[1] https://github.com/vgough/encfs

Unfortunately that's harder than just always leaving a Raspberry Pi on at home, especially given that I want to be able to sync files to my phone, where EncFS probably doesn't work at all (or easily).

I'm unfamiliar with syncthing, but could you run two daemons, one that does encrypted sync to e.g. dropbox, and one that does plain sync to your phone and such? Or would the two instances stomp on each other or get into an infinite loop? e.g.:

    plain <-> syncthing <-> phone
    encfs <-> syncthing <-> dropbox
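That kind of split might be workable with EncFS's reverse mode, which exposes an encrypted view of an existing plaintext tree. A rough sketch, assuming encfs and two pre-made syncthing configs (all paths and config locations here are hypothetical):

```
# EncFS reverse mode: mount an *encrypted* view of the plaintext directory
encfs --reverse ~/Sync ~/Sync-ciphertext

# Instance 1 shares ~/Sync (plaintext) with trusted devices such as a phone
syncthing -home ~/.config/syncthing-plain

# Instance 2 shares ~/Sync-ciphertext with untrusted nodes (VPS, Dropbox-ish)
syncthing -home ~/.config/syncthing-cipher
```

The two instances watch different directories, so they shouldn't loop. The catch is that reverse mode is essentially read-only, so the ciphertext replica works as a one-way encrypted backup rather than a full two-way sync.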

That might be doable, but then I'd need an always-on trusted computer to do the decryption, and if I have that I don't need the VPS...

Apparently syncthing uses fairly strong crypto in transit, or at least that's what I read recently.

That's not what StavrosK is asking for.

Some syncthing nodes could host only the encrypted data, without the keys to decrypt them. This adds the benefit of having some nodes host the data, without being able to access it. Think: VPS, etc. that have very good availability track record, but some doubts about whether your hosting company can spy/might be coerced into spying.

Exactly. If I could be sure that the VPS couldn't read or mess with my files without me knowing, I'd definitely add a SyncThing node on my VPS and have increased availability along with security without any hassle.

I think it has been considered, but unfortunately there doesn't seem to have been much movement towards making it a feature.

A big thing is that Tahoe LAFS can be run on untrusted computers.

Is Tahoe LAFS a viable solution for people without sysadmin skills?

Tahoe-LAFS by itself is probably not (you do have to configure and keep some Python-based daemon software running), but PrivateStorage is a managed service.

So you share your data on untrusted machines that you bet will live long enough to hold what you want to keep.

Sounds risky.

Do storage providers have an incentive to provide the service reliably à la filecoin? [0]

[0] https://filecoin.io/

I trust S3, B2, or Google's blobstore more than some rando's machine running Filecoin. Tahoe-LAFS gives you the assurance that the actual backend storage only sees encrypted data. The big clouds have the advantage that they are probably more reliable and faster, with lower latency, better uptime, and lower prices.

The Tahoe-LAFS model is more about spreading your data over multiple providers: a NAS at home, several VPS providers, or what have you. It feels like RAID over the network, and it allows very precise redundancy policies.

In the PrivateStorage case the machines aren't "trusted" but they are all run by the service you're paying for -- so the incentive to keep them running properly is indeed there.

For other kinds of Tahoe deployments, no, there's nothing built in to incentivize storage-server operators. That part is up to whoever is organizing and running the Grid (what Tahoe calls a group of storage-servers). For example, friends could agree to host storage-servers for each other and create redundancy + trust that way.

The difference between Tahoe and things like Storj / FileCoin is that those services intend to be "a single, global service" whereas Tahoe is software that can be deployed in several different ways -- one of which is a professionally managed Grid such as PrivateStorage.

If you are interested in these topics I'd encourage you to join #tahoe-lafs on Freenode or one of the Tahoe development meetings. These are definitely things I've seen discussed, but I think Tahoe-LAFS is far more likely to introduce a concept of "federated Grids" than "a single global Tahoe service".

https://docs.blockstack.org/storage/overview.html. While we are rolling out AMIs that auto-launch Gaia hubs, and hosting some to make it user-friendly for people without sysadmin skills, you can docker-compose up here: https://github.com/blockstack/gaia/blob/master/hub/README.md on a machine you own or trust.

We are working to make doing even this as user friendly as possible.

Untrusted doesn't mean randomly selected.

> Do storage providers have an incentive to provide the service reliably à la filecoin?

Yes, via a currency called Dollars.

You can configure Tahoe-LAFS to store data wherever you want, but I guess PrivateStorage will have its own settings and you won't be able to select a NAS at home. Just a guess though.

An expert could figure out how to get the PrivateStorage Tahoe client to use other storage servers, but yes in general it is "a managed service" and I don't think using your own storage-servers will be "a supported configuration".

You can use your own server to host a Gaia hub with docker-compose up: https://docs.blockstack.org/storage/overview.html , or use one of the cloud AMIs we are rolling out here: https://docs.blockstack.org/storage/overview.html

Making user-owned storage user-friendly for people without sysadmin skills is a challenge, but something we are working towards.

Because our Gaia hubs are associated with users' IDs, we are also working on automating SSL as much as possible for individual users. This is another technical challenge that makes it difficult for the average person to set up their own trusted environment where they control their own data.

At this point you'd just go to the source and use Tahoe directly.

Sure, yes you could do that -- I mean, PrivateStorage is just shipping you a "real actual Tahoe client". The main feature you're getting is the managed storage-servers.

So if you happened to "not completely trust" the availability of those, you could also configure a storage-server of your own and point your client(s) at both it and the PrivateStorage servers. That is, hedging against PrivateStorage going away so suddenly that you can't retrieve your data.

But, I agree: if you're doing that you're likely able to run your own Tahoe grid on VPSes or similar.

How do you link to a single file you want to share?

Looks like it's not actually ready yet? PIA has a great track record, though, so this seems promising.

I also like this when you give them your email:

>> This is not a mailing list, and your email will be permanently removed after we send a one‑time notification when PrivateStorage is available to the general public.

There are a number of alternative paths in this space if you're truly focused and willing to invest a bit, but if you care about privacy enough to seek a service like this out and just want to minimize mental overhead, this seems like a good choice.

Tahoe-LAFS makes some impressive claims, like maintaining confidentiality while running on untrusted machines. I think a lot of folks now would assert that any machine running x86 should in fact be considered untrusted, due to Intel ME and the AMD equivalent.

I'm not in a position to criticize though, this is just from a cursory glance at the summary page, and frankly I used PIA as my own VPN provider for a number of years and had only positive experiences.

(author of Tahoe here, although I'm not much involved these days)

> Tahoe-LAFS makes some impressive claims like maintaining confidentiality while running on untrusted machines. I think a lot of folks now would assert that really any machine running x86 due to Intel ME and the AMD equivalent should in fact be untrusted.

To be precise, our claim is that you can use untrusted servers, since the client encrypts the data before it leaves your machine. You are, of course, entirely reliant on your own client being trustworthy. Nothing can save you if your client is compromised, whether via ME, a BIOS infection, an OS rootkit, or a boring old userspace compromise.

The Tahoe-LAFS client runs pretty well on ARM and Raspberry Pis, in case that feels better.

"There are a number of alternative paths in this space if you're truly focused and willing to invest a bit, but if you care about privacy enough to seek a service like this out and just want to minimize mental overhead, this seems like a good choice."

It feels to me like 'borg'[1] is becoming the de facto standard for this use-case. There were a number of similar tools (like duplicity) for years but borg seems to have buttoned up all of the issues.

Some call it the "holy grail of backups".[2]

[1] https://borgbackup.readthedocs.io/en/stable/

[2] https://www.stavros.io/posts/holy-grail-backups/

Borg et al are specifically made for backups. Tahoe-LAFS is for general use.

So on the one hand, you just need some good open source software for that; there's plenty of cloud, and there's no reason you wouldn't choose the cheapest one if everything is client-side encrypted and you can add more redundancy. On the other hand...

> the system runs on Tahoe-LAFS

that got me very interested.

Should anything truly private be stored in the cloud? I have never seen a solution that doesn't boil down to trusting someone. The claim is that the code is open source. But I don't know how I would verify that that's the actual code they are running on their servers. I also don't understand the payoff. For information that's not truly private (like your music collection) but that could possibly be data mined, then a very basic level of privacy you get from something like Dropbox should be enough, right? What does this service offer that other cloud storage providers don't offer? For information that's truly private, why would I risk it becoming eventually available to hackers by putting it somewhere in the cloud? What am I missing?

The data is encrypted on your client before it leaves your computer. You're relying upon the servers to hold onto your ciphertext (i.e. availability), but not to keep it secret (confidentiality). And the client can detect changes to the ciphertext, so you aren't relying upon the servers for integrity either.

You have to trust the client code, for sure, but that's something that you're at least nominally in a position to inspect and verify. https://github.com/tahoe-lafs/tahoe-lafs
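As an illustration of that integrity property (this is not Tahoe's actual on-disk format — just a toy encrypt-then-MAC scheme in stdlib Python), a client can detect any server-side modification of the ciphertext before producing a single byte of plaintext:

```python
import hashlib
import hmac

def keystream(key: bytes, n: int) -> bytes:
    # Toy CTR-style keystream from SHA-256; real systems use AES.
    out = b""
    ctr = 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def seal(key: bytes, plaintext: bytes) -> bytes:
    # Encrypt, then append an HMAC tag over the ciphertext.
    ct = bytes(p ^ k for p, k in zip(plaintext, keystream(key, len(plaintext))))
    tag = hmac.new(key, ct, hashlib.sha256).digest()
    return ct + tag  # the server stores this and learns nothing about the plaintext

def open_sealed(key: bytes, blob: bytes) -> bytes:
    ct, tag = blob[:-32], blob[-32:]
    if not hmac.compare_digest(hmac.new(key, ct, hashlib.sha256).digest(), tag):
        raise ValueError("ciphertext was modified on the server")
    return bytes(c ^ k for c, k in zip(ct, keystream(key, len(ct))))

key = hashlib.sha256(b"client-side secret").digest()
blob = seal(key, b"private document")
assert open_sealed(key, blob) == b"private document"

tampered = bytes([blob[0] ^ 1]) + blob[1:]
try:
    open_sealed(key, tampered)
except ValueError:
    pass  # tampering detected before decryption
```

The server only ever sees `blob`, so flipping even one bit of it fails the MAC check on the client.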

I'm a programmer. And I still don't think I'm in a position to verify if something is cryptographically secure. It's quite possible that a client has been built with an extremely subtle backdoor already in mind. One that crypto experts won't find for years.

Yes, but it's like when you're at a cafe and need to go to the bathroom so you ask the random guy next to you to watch your laptop. Sure he could steal it, but you reduced the attack vector to just him.

It's a reasonable analogy. To extend it, I'm suggesting you don't trust anyone with your laptop and bring it to the bathroom with you. If something is truly private and/or valuable information, don't put it in the cloud. I'm not alone in that thinking. When it comes to storing people's digital currency you hear about things like cold storage. For very good reason.

You can use gaia hubs, which are user owned stores, to host your data wherever you like. Gaia focuses on user owned data, and leaves the work of network consensus and replication to the identity associated with your gaia hub: https://docs.blockstack.org/storage/overview.html where the identity is defined here: https://docs.blockstack.org/core/naming/introduction.html

We have an Amazon EC2 AMI and are working on others, but the idea is you could bootstrap the docker-compose on any VM you like, or even a Raspberry Pi if you want: https://github.com/blockstack/gaia/blob/master/hub/README.md

There are IPFS driver requests, and now requests for drivers to support PrivateStorage by Least Authority as well, if you also want to replicate your data temporarily across some node network.

While Gaia fundamentally does not require using the comprehensive Blockstack API, we are working on tutorials that cover using only Gaia without Blockstack. They are designed to be functionally independent of each other: just as people can use Blockstack authentication without Gaia, the reverse can be true: https://docs.blockstack.org/storage/overview.html

Currently, I want it to be even easier than bootstrapping the docker-compose for users to host Gaia on their own machine, or a Raspberry Pi, or what have you. We are working on that as well as cloud-hosted solutions.

I would like for people to be able to launch a VM with a preinstalled image locally on their own machine, not just on Google Cloud, Amazon, DigitalOcean, etc. The groundwork for a secure and minimal VM is mostly in place. We need to write more instructions for this, but feel free to launch the docker-compose and give it a whirl in your environment of choice if you don't want any of the cloud AMIs we currently offer.

I wonder how this compares to Cryptomator [1].

[1] https://cryptomator.org

Or ARQ if it’s just for backup...

I'm curious what the pricing will be when this is opened up to the public. Some years ago when I compared encrypted online storage, I found Least Authority to be quite expensive. It still seems to be ($25 a month). [1]

[1]: https://leastauthority.com/

Very interesting; I may use this. I'd still layer in a VPN, as I've had leaks in the past [1].

[1] https://vpntoolbox.com/disabling-webrtc-browsers/

Does anyone know what kind of impact client-side encryption would have on sync speed for potentially-large files (as opposed to simple text messages)?

Tahoe-LAFS does "erasure coding" on the chunks of data. This increases the size of the data (adding redundancy) so that you can recover a file without recovering every single chunk. These parameters are decided client-side. In the smallest possible case (i.e. every chunk required) there is some slight overhead from the zfec and Tahoe headers.

If you are using redundancy of any kind, it will inflate the size of the ciphertext versus the plaintext thus affecting sync speed.

Tahoe-LAFS does split everything up into fixed-size chunks, though, so the total size of the file doesn't really matter -- it will still be uploaded in 128 KiB (default) segments to the storage servers.

So, it's not the encryption that has an impact but the erasure-coding (which gives the "RAID-like" features) and you can configure it to have zero redundancy and thus only some slight increase in the total amount of data to send.
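To put rough numbers on that (assuming Tahoe's documented default 3-of-10 encoding, and ignoring the small per-share headers):

```python
def stored_bytes(needed: int, total: int, ciphertext_bytes: int) -> int:
    # Each of `total` shares holds ciphertext/needed bytes, so the grid
    # stores total/needed times the ciphertext; any `needed` shares suffice
    # to reconstruct the file.
    return ciphertext_bytes * total // needed

# Default 3-of-10: a 3 MB ciphertext costs 10 MB of storage and upload...
assert stored_bytes(3, 10, 3_000_000) == 10_000_000
# ...while a "zero redundancy" 10-of-10 config stores just the ciphertext:
assert stored_bytes(10, 10, 3_000_000) == 3_000_000
```

So the sync-speed hit scales with the total/needed ratio you pick, not with how big the individual file is.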

Hadn't even thought about a difference in size; I was thinking the CPU overhead. If I save a 1GB file, how much processor time will it take to re-encrypt the whole thing so it can be sent off? Or does the chunking apply here too; i.e. only the chunk of the file that's changed has to be re-encrypted?

I don't know the exact answer to that, but "not much" in comparison to the time to send the bytes over the network. The actual contents are encrypted using AES, which modern processors often support with built-in instructions, so it's very fast. The vast majority of the time here is upload time.

Tahoe does use "convergent encryption" (basically, the key is based on the contents) so that the same file encrypted by the same client results in the same ciphertext (and thus, doesn't need to be re-uploaded).

I believe that only happens at the "capability" (i.e. file) level, though, not each chunk. So, if you had a directory of 10 files each 100MB and changed one, you'd only have to upload the new directory-descriptor and the one changed file -- but if you change a few bytes of a 1GB file, you'd have to upload all the ciphertext for that file again.
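The convergence idea fits in a few lines of stdlib Python (a sketch, not Tahoe's exact derivation — the real scheme also mixes in a per-client "convergence secret", which is why deduplication only happens for the same client):

```python
import hashlib

def convergent_key(convergence_secret: bytes, content: bytes) -> bytes:
    # The encryption key is derived from the file contents themselves,
    # salted with a per-client secret to resist confirmation attacks.
    return hashlib.sha256(convergence_secret + content).digest()

secret = b"per-client convergence secret"
k1 = convergent_key(secret, b"the same file contents")
k2 = convergent_key(secret, b"the same file contents")
k3 = convergent_key(b"another client's secret", b"the same file contents")

assert k1 == k2  # same client + same plaintext -> same key/ciphertext: no re-upload
assert k1 != k3  # a different client produces different ciphertext
```

Identical key means identical ciphertext, so the client can skip re-uploading a file the grid already holds.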

Thanks for the well-informed answers!

Does this have a way to sync only partially? So if I have my huge music library there, can I sync just one particular folder to my phone?

Based in the US apparently. I guess everyone here knows what that means. Not a great jurisdiction if you are concerned by privacy.

Still, PIA has a good standing in avoiding data/log access requests so far for their VPN service.

I'd still lean towards Iceland, Switzerland, or Romania, although I'm still not sure I should trust any EU country on these topics.

Why not?

https://restoreprivacy.com/5-eyes-9-eyes-14-eyes/ Mostly that, but also some of the news I see over here about parliaments trying, with more or less success, to pass laws to take down websites or force them to comply for questionable reasons.

I understand that there could be reasonable arguments behind, but I feel very uneasy about it.

Least Authority is actually a German company now (although it was started in the US): https://leastauthority.com/about-us/

How does this compare to tarsnap?

tarsnap's target audience is sysadmins and other UNIXy gurus. My grandma and my dad, who would benefit most from a secure sync mechanism, would probably be unable to use it.

Also, I'm not sure how you could use tarsnap cost-effectively for p2p sync.

[0] https://www.tarsnap.com/

Tarsnap is for backup; I don't think it can really be used for sync in the general sense. It's also hard to predict how much it's going to cost you if you don't know exactly how much data you're going to upload (and it's IMO prohibitively expensive for even moderately-sized backups).

I love the tech behind tarsnap, I love that the client is open source, I love the whole philosophy of it, but I really struggle with the pricing.

> the client is open source

No, it's source available.

Ah, you're correct of course. Thank you for pointing it out.

time for me to make a tarsnap GUI?? XD

Hopefully they accept cryptocurrencies which tarsnap no longer supports.

This is interesting. I currently use Sync.com which works great on my Mac but unfortunately doesn't have a Linux client.

What is the advantage over ARQ + B2/S3/etc...

This sounds like just an inferior Storj.

Interesting. How does this compare to hosted Nextcloud offerings?

They're calling this service S4 and not expecting to get sued into oblivion by AWS?

It's also not very creative. Says a lot about the maturity of their thinking when there's such an obvious naming similarity.

It does appear to be based on the phrase "Simple Secure Storage Service". Maybe 4S or SIV would have been better.
