RemoteStorage – An open protocol for per-user storage on the Web

Roritharr · on June 12, 2018

In theory this seems like a great solution to many data privacy problems, but I strongly suspect that in reality it leads to products that are very hard to support, since you can't make many assumptions about the data stores. The performance and reliability of data storage make up a large chunk of the user experience of any web app, so having this out of one's control is a tough proposition for anyone actually wanting to build a business on top of this.

chatmasta · on June 12, 2018

In reality everyone’s data will be stored at the same four cloud providers.

fiatjaf · on June 14, 2018

Do you think this is worse than having it all in one cloud provider?

The number of providers doesn't matter, actually. It could be just one, as long as it is easy to switch whenever you need. See GitHub.

hanniabu · on June 13, 2018

Sadly this will be true for 95% of the storage

microcolonel · on June 13, 2018

It seems to me that an API intended to make this storage a commodity would lower the barrier to entry, possibly increasing the natural number of participants in the market.

y4mi · on June 13, 2018

Don't worry, these competitors will host their service on one of these providers

Freak_NL · on June 13, 2018

Is that a big problem if data is end-to-end encrypted?

microcolonel · on June 13, 2018

Cynicism like this makes its own truth each time it's repeated.

binbasti · on June 14, 2018

Actually, this has been pretty much a non-issue for us in production over the last 5 years. As the reference JS client library works offline first, it'll just sync data whenever the remote becomes available again. In fact, that's a nice bonus for offline-first web apps in general, not just with remoteStorage.
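
The offline-first behaviour described above can be sketched roughly as follows. This is a minimal illustration of the general idea, not how the reference JS client library is actually implemented: `OfflineFirstStore` and the `push` callback are hypothetical names, and a real client would also have to pull remote changes and resolve conflicts.

```python
from typing import Callable, Dict, List


class OfflineFirstStore:
    """Local-first key/value store that queues writes until the remote is reachable."""

    def __init__(self, push: Callable[[str, str], None]) -> None:
        self.local: Dict[str, str] = {}   # always readable and writable, even offline
        self.pending: List[str] = []      # keys with unsynced changes, oldest first
        self._push = push                 # hypothetical uploader; raises ConnectionError when offline

    def put(self, key: str, value: str) -> None:
        # Writes land locally first; the remote copy is updated later.
        self.local[key] = value
        if key not in self.pending:
            self.pending.append(key)

    def sync(self) -> int:
        """Push pending keys in order, stopping at the first failure. Returns the number pushed."""
        pushed = 0
        while self.pending:
            key = self.pending[0]
            try:
                self._push(key, self.local[key])
            except ConnectionError:
                break  # remote still unavailable; keep the queue and retry later
            self.pending.pop(0)
            pushed += 1
        return pushed
```

Calling `sync()` on a timer or whenever connectivity returns is enough for the app to keep working while the remote is down, which is why storage outages matter less than they would for an online-only app.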

Fellshard · on June 13, 2018

You'd need shared protocols for data, of some sort. Probably similar to the original dream for XML schemata.

kevincox · on June 13, 2018

I have created a number of small tools in the past, and I have generally kept them stateless because I didn't want to manage (or pay for) storage of user data. If something like this were widespread, I could trivially allow users to save their data without needing to manage that myself.

For complex applications you might want a richer API but I think that for now making it easy to adopt is a key feature and I generally just want to stick a serialized state in a small number of "files" anyways.

I really hope this takes off at some point.

flanbiscuit · on June 13, 2018

I've been thinking about building a simple todo app with some features I haven't seen in existing ones, and I want to release it into the wild as a PWA/site and hybrid app. I don't want to deal with storing user data in a central database, but it would be nice if users could sync their data so they are not limited to one device. This is a perfect solution for a hobby app I want to create.

fiatjaf · on June 14, 2018

Hey, if you want this to become widespread you can just add optional support for it in your stateless apps, like I did with https://templates.alhur.es/, for example (you won't find anything simpler and more stateless than this).

vackosar · on June 13, 2018

Will that work with GDPR? It sounds OK, but I am not sure.

kevincox · on June 13, 2018

I'm not a lawyer, but I think it should, because you aren't storing the user's data. The one concern might be explaining in a nice way that the user is sending their data to a third party (of the user's choice).

marktangotango · on June 12, 2018

The disconnect I have with this is that it’s a key-value store. Sure, you can build on top of that, but any non-trivial application would require more flexibility. Has there been any work on a remote SQL spec, for example?

rzzzt · on June 12, 2018

OData comes close to one, although it does not use SQL as its query language: http://www.odata.org/getting-started/understand-odata-in-6-s...

marknadal · on June 13, 2018

We built a remote (and encrypted) data sync spec for graphs; it can be used for traditional table (relational) and document storage as well as key/value data (https://github.com/amark/gun). It has push built in, is fully decentralized and running in production today (terabytes of traffic), and handles concurrency and offline conflicts out of the box.

There are some half baked GraphQL query engines for it as well as some SQL query prototypes on top, but not quite ready yet. What were you wanting to build?

mobitar · on June 13, 2018

You can also check out Standard File, which aims to accomplish something similar (namely, trustless servers for end-to-end encrypted client applications). It's currently being used in Standard Notes with great success.

https://standardfile.org

Fellshard · on June 13, 2018

One consideration for where this could be very useful: if you're constructing autonomous agents who can seek and aggregate information and dispatch actions on your behalf, giving these agents access to segments of a personal datastore like this could be an interesting starting point.

Oddly enough, I keep going back conceptually to the 'gevulot' concept from Rajaniemi's 'The Quantum Thief' as an idealized implementation of this.

binbasti · on June 14, 2018

I use my RS accounts for exactly this, in combination with Huginn for example. It's super simple, because you just PUT or GET, with the bearer token for one segment (called "category" in RS) in the Authorization header. And you can also PUT things in the special /public category, so they're world-readable. Example: when I check in on Swarm, Huginn uploads the entire check-in data to my RS for archiving, as well as updates a public RS document with my current location, which my website then shows publicly: https://sebastian.kip.pe
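
The PUT/GET pattern described above could look roughly like this in Python. It is a minimal sketch under stated assumptions: the base URL, category name, document path, and token are placeholders, and `build_request` is a hypothetical helper rather than part of any remoteStorage library. It only constructs the request; nothing is sent.

```python
import urllib.request
from typing import Optional


def build_request(base_url: str, category: str, path: str, token: str,
                  body: Optional[bytes] = None,
                  content_type: str = "application/json") -> urllib.request.Request:
    """Build a GET (no body) or PUT (with body) request for one document in a category."""
    url = f"{base_url.rstrip('/')}/{category}/{path.lstrip('/')}"
    # The bearer token for this category goes in the Authorization header.
    headers = {"Authorization": f"Bearer {token}"}
    method = "GET"
    if body is not None:
        method = "PUT"
        headers["Content-Type"] = content_type
    return urllib.request.Request(url, data=body, headers=headers, method=method)


# Placeholder values for illustration only.
req = build_request("https://storage.example/alice", "checkins",
                    "2018/06/14.json", "TOKEN", b'{"venue": "somewhere"}')
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the actual upload or download.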

mikece · on June 12, 2018

How is this fundamentally different than storing files in S3 or Azure BLOB storage or SFTP or.... is WebDAV still a thing?

binbasti · on June 14, 2018

The main difference is that the app developer/provider doesn't have to see, secure, or pay for storing user data. Users themselves are in full control of their data, and they can permit any app to access segments of it. Check out this comparison on the RS website: https://remotestorage.io/#explainer-unhosted

rspeer · on June 13, 2018

How is it the same? If I'm hosting a Web app, my users can't write to S3.

mikece · on June 13, 2018

So this would be in place of a service like Dropbox -- or S3 or Azure Blob -- where the user passes a token to the app to allow reading/writing data to the user's storage account/location... but with a standard protocol and authorization scheme?

Vinnl · on June 13, 2018

Somewhat, but then users are able to choose their own hosting provider, rather than having to use Dropbox, S3 or whatever that you happen to support.

chatmasta · on June 13, 2018

Yes they can. I don’t know the exact name of the process off-hand, but basically the web app requests a temporary upload token which it provides to the user. The user can then use this token to upload directly to an S3 bucket as configured by the web app.
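
The general shape of such a temporary upload token can be sketched as below. This is an illustration of the idea only and is not AWS's actual signing scheme (S3 calls its version presigned URLs): the token format, `issue_upload_token`, and `verify_upload_token` are all hypothetical. The app signs an object key plus an expiry time with a secret the user never sees, and the storage side checks the signature before accepting the upload.

```python
import hashlib
import hmac
import time
from typing import Optional


def _sign(secret: bytes, object_key: str, expires: int) -> str:
    message = f"{object_key}\n{expires}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def issue_upload_token(secret: bytes, object_key: str, ttl_seconds: int,
                       now: Optional[float] = None) -> str:
    """App side: grant permission to upload one object until the expiry time."""
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    return f"{expires}.{_sign(secret, object_key, expires)}"


def verify_upload_token(secret: bytes, object_key: str, token: str,
                        now: Optional[float] = None) -> bool:
    """Storage side: accept only an unexpired token signed for this exact object key."""
    try:
        expires_text, signature = token.split(".", 1)
        expires = int(expires_text)
    except ValueError:
        return False  # malformed token
    expected = _sign(secret, object_key, expires)
    current = time.time() if now is None else now
    return hmac.compare_digest(signature.encode(), expected.encode()) and current < expires
```

Because the token is bound to one object key and a short lifetime, the user can upload directly to storage without ever holding the app's long-lived credentials.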

rspeer · on June 14, 2018

Most users do not have AWS accounts and are not about to get them.

chatmasta · on June 14, 2018

They don’t need AWS accounts. The app has an AWS account and provides a token to the user to upload a file via HTTP to an S3 bucket associated with the app’s AWS account.

Still, you do raise a good point - that S3 bucket is owned by the app, not the user. But it doesn’t have to be this way forever, and AWS is an Amazon product. Amazon has the credit card information of nearly everyone in the world. I would not be surprised if in the future, we see a sort of federated model, where users of an app pay for their own storage at S3 by linking their Amazon account. The app would be like a reseller/affiliate of Amazon, and could pass storage costs directly onto their users without worrying about complexities like pricing tiers to account for variable customer requirements.

bo1024 · on June 13, 2018

Sounds cool, but I'm missing something basic.

1. I update some data on my desktop.

2. I shut down my desktop.

3. I power on my laptop.

How and when does my laptop get the updates made on the desktop?

(edit1) One idea for a solution is asking peers for help: my data gets encrypted and seeded to many peers, so it is always available somewhere.

(edit2) While I like the "own your data" aspect of this, I dislike the "everything in a browser" aspect. At what point is this easier as just a native program?

hexmiles · on June 13, 2018

From what I understand, remoteStorage has different storage plugins you can use. If you want your data to be available across multiple computers, you should use a plugin that stores data in cloud storage like Dropbox, Google Drive, or even S3. But I think you should be able to implement something like what you proposed by combining remoteStorage with WebTorrent.

binbasti · on June 14, 2018

Most RS apps will automatically sync the data to the remote storage when it's changed on your desktop. And when you then access it on your laptop, maybe even in a different app than on the desktop, it will sync down whatever you changed from the remote storage.

mirimir · on June 13, 2018

Is data encrypted?

icebraining · on June 13, 2018

Seems that encryption support is being added to some specific implementations[1], but not mandated by the spec.

[1] https://community.remotestorage.io/t/encryption-option-in-li...

mkesper · on June 13, 2018

For being able to be processed it needs to be plaintext.

needz · on June 13, 2018

Wouldn't users be able to tamper with their own data? Seems like another attack vector.

eximius · on June 13, 2018

If you write web apps, you should be used to treating user input as hostile. You just need to write your application with a clear perimeter around user supplied input.

needz · on June 13, 2018

Let's say the web app is a game and it keeps track of your high score when you play it. If the high score is stored somewhere you have complete control over, what's to stop someone from modifying that score? Substitute high score for any variable a server tracks about a user that isn't explicitly supplied by that user.

eximius · on June 14, 2018

The only way to prevent cheating in a game is to run the code on the server based on transmitted user actions.

Even with a server side DB, they could lie about their results or hack the game to do better.
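
Running the game logic on the server from transmitted actions might look roughly like this. It is a toy sketch: the `POINTS` table and both function names are hypothetical, and a real game would replay a full simulation rather than sum a lookup table.

```python
from typing import Iterable

# Hypothetical scoring rules, known only to the server.
POINTS = {"coin": 10, "enemy": 50}


def replay_score(actions: Iterable[str]) -> int:
    """Recompute the score from the transmitted actions instead of trusting a client total."""
    score = 0
    for action in actions:
        if action not in POINTS:
            raise ValueError(f"illegal action: {action!r}")
        score += POINTS[action]
    return score


def accept_submission(actions: Iterable[str], claimed_score: int) -> bool:
    """Accept a score only if replaying the actions reproduces it."""
    try:
        return replay_score(actions) == claimed_score
    except ValueError:
        return False
```

This only proves the action log is consistent with the rules. As the comment above notes, a player can still fabricate a plausible log or automate the game.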

gojomo · on June 13, 2018

I don't notice any authorization model in this, where (for example) you could grant a service read/write on a portion of your data but commit yourself to read-only access. But maybe it's in there, or possible.

If that were a part of the conventions, or even if it isn't, services could sign their own versions of data in legal states – as with signed/encrypted cookies. Then out-of-agreement edits could be detected, and possibly rejected as errors, rather than causing other surprises. (This could make for some ugly partial-failure/unexpected-state cases, though.)
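
The signing idea above can be sketched with an HMAC, in the same spirit as signed cookies. This is a minimal illustration under stated assumptions: `seal`, `unseal`, and the envelope layout are hypothetical and not part of the remoteStorage spec, and the secret is assumed to stay on the service's side.

```python
import hashlib
import hmac
import json


def seal(secret: bytes, state: dict) -> str:
    """Service side: serialize a legal state and attach a tag only the service can produce."""
    payload = json.dumps(state, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return json.dumps({"payload": payload, "tag": tag})


def unseal(secret: bytes, document: str) -> dict:
    """Service side: reject a stored document whose payload no longer matches its tag."""
    envelope = json.loads(document)
    payload = envelope["payload"]
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(str(envelope["tag"]).encode(), expected.encode()):
        raise ValueError("state was modified outside the service")
    return json.loads(payload)
```

The user stores the sealed document in their own storage and can read it freely, but an edited payload fails verification. The sketch does not stop a user from restoring an older, validly signed document; detecting that would need a version counter or timestamp tracked by the service.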