Hacker News
RemoteStorage.js 1.0.0 released (remotestorage.io)
85 points by walterbell on Nov 10, 2017 | 42 comments

I've been watching "unhosted" for a while, and the key thing to know is they're talking about data. The idea is that web apps following the "unhosted" pattern will get/put data from your storage. And this is an api that presumably storage providers/services would implement to enable that. So you'd either host a storage provider on your server somewhere, or pay a provider of such a service.

This is vastly different than some web app, ie a single page app, storing data on a cloud database somewhere (DBaaS for example). Because, the web app owner still owns and controls that cloud database. I'm very interested to see where this goes.

Edit: reading the docs, I have a lot of questions. This appears to be a wrapper for browser local storage that can then sync to a third-party provider (local first). But browser storage is basically a key/value store. Also, reading the docs, they talk about using data modules defined with JSON Schema.

That's all fine, but consider the use case of a note-taking app. I store all the user's data as JSON objects that are eventually persisted to the storage provider. So to query the user's data on a new device, to get last week's notes for example, I have to pull down all the user's note data. So basically, with this model you are pushing large sets of JSON around from device to device, with no way to query or fetch a range of objects, for example.

This is less appealing than it first seemed. Am I missing something?

Yes, it is a key/value store as base technology. The important thing is that you can build anything you want on top, which includes for example indexes and such. The idea is that this can/will be implemented on top of a simple base protocol, which is easy to implement, and to use data modules (and shared utility modules) to add more complex functionality.
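To make the "indexes on top" idea concrete, here is a minimal sketch in Python, assuming a plain dict stands in for the remote key/value store. The key layout (`notes/…`, `notes-index/…`) and the function names are hypothetical and not part of remoteStorage.js; the point is only that a per-day index document lets a client fetch a date range without pulling every note.

```python
from datetime import date, timedelta

# A plain dict stands in for the remote key/value store (hypothetical layout).
store = {}

def put_note(note_id, created, text):
    """Store the note and record its id in a per-day index document."""
    store[f"notes/{note_id}"] = {"created": created.isoformat(), "text": text}
    ids = store.setdefault(f"notes-index/{created.isoformat()}", [])
    if note_id not in ids:
        ids.append(note_id)

def notes_between(start, end):
    """Fetch only the notes created in [start, end], using the day index."""
    found = []
    day = start
    while day <= end:
        for note_id in store.get(f"notes-index/{day.isoformat()}", []):
            found.append(store[f"notes/{note_id}"])
        day += timedelta(days=1)
    return found

put_note("a", date(2017, 11, 1), "old note")
put_note("b", date(2017, 11, 8), "recent note")
put_note("c", date(2017, 11, 9), "another recent note")

last_week = notes_between(date(2017, 11, 4), date(2017, 11, 10))
print([n["text"] for n in last_week])  # ['recent note', 'another recent note']
```

The trade-off is that every write now touches two keys, so the index can drift if one of the two writes fails.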

However, having used RS apps for years now, I think it's worth mentioning that most people considerably over-estimate the amount and size of the personal data they use daily, and in the case of notes, I never had performance issues with Litewrite pulling all my notes whenever/wherever I connect my storage: https://litewrite.net/

That's not just because this is actually pretty fast with multiplexing on HTTP/2, but also because you usually don't connect and disconnect all the time on all your devices. So you will usually have your existing data cached and available on app startup (including when you're offline), and only sync remote changes when you get online. This is also what similar proprietary apps like e.g. Google Keep do.

Yeah, I would understand that the data pull would be biggest on the initial setup of a newly introduced device... but then smaller pulls of only deltas going forward (like rsync, etc.). At least, I hope that is the direction taken eventually.

This is roughly how the library works. It will compare the directory listings with the cached items and only download new and updated ones, as well as remove deleted ones.
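The comparison described above can be sketched roughly as follows. This is not the library's actual code: it assumes each listing is a mapping from item name to some version marker (for example an ETag-like string), and `plan_sync` is a hypothetical name.

```python
def plan_sync(remote_listing, cached_listing):
    """Compare two {name: version} listings and decide what to transfer.

    Returns (download, remove): items that are new or changed remotely,
    and items that are cached locally but no longer exist remotely.
    """
    download = sorted(
        name
        for name, version in remote_listing.items()
        if cached_listing.get(name) != version
    )
    remove = sorted(name for name in cached_listing if name not in remote_listing)
    return download, remove

# Assumed example data: version markers are opaque strings.
remote = {"a.json": "v2", "b.json": "v1", "d.json": "v1"}
cached = {"a.json": "v1", "b.json": "v1", "c.json": "v1"}

download, remove = plan_sync(remote, cached)
print(download)  # ['a.json', 'd.json']
print(remove)    # ['c.json']
```

Only `a.json` (changed) and `d.json` (new) are fetched; `b.json` is untouched, which is why later syncs stay small.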

Cool, congrats to team getting to 1.0.0

We (draw.io) 100% believe in this model, it's absurd that the data should live with the SaaS application provider. It's user lock-in, plain and simple.

Thanks! I love how many different providers you support already. Ping us on IRC or the forums if you need help integrating RS into your app.

I get that the SaaS model of owning user data is wrong. However, I've noticed that some comments seem to imply that it combats being "locked-in" to a provider. Surely a web app's data structure is so tailored to its service that you couldn't switch to another provider easily, with local storage.

What I'm getting at here is that remote storage isn't service agnostic really, is it? Unless the data structures are really standardised/simplistic.

Nonetheless, user ownership is worth enough in and of itself.

Yes, it's merely an enabler: multiple providers will still have to be interoperable. That's not as infeasible as it sounds, though. For some applications it is, but consider browser bookmarks: most browsers can read each other's export files (perhaps they've already converged to the same one by now?).

Likewise, this might make sense for open source PWAs. They can now be developed without needing to provide a server, and can simply be hosted on e.g. GitHub Pages or S3. People can now serve different instances of it and people can switch between those.

Yeah, I agree with you... however, there should eventually be some sort of de facto standardization for the most frequent types of data sets, right? One would think, I guess? For example, LibreOffice, OpenOffice, and MS Word can all open and edit .doc files. So, while the myriad web apps would have different data structures which are not compatible for the long tail... I can imagine scenarios where common app categories (e.g. doc editing, note taking, to-do apps) would eventually crystallize around some sort of "common enough" standard that would allow remoteStorage users to benefit. Hey, Markdown isn't yet as consistent, but the files are "transferable enough", right? Anyway, I'm optimistic about remoteStorage and certainly hope it gets used a lot more.

It gets somewhat easier to reverse engineer one of these apps, since the structured data can easily be made available. Not that it means that much, but it's something.

Also we can hope that the next step is standardized ways to store some data (e.g. a common key and data structure for calendar events) allowing multiple apps to read the same data =D
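As a rough illustration of what "a common key and data structure" could look like, here is a small Python sketch. The key prefix `calendar/events/` and the required fields are assumptions made up for this example, not an existing standard, and a dict stands in for the user's storage.

```python
# Hypothetical shared convention: where events live and which fields they carry.
EVENT_PREFIX = "calendar/events/"
REQUIRED_FIELDS = {"title": str, "start": str, "end": str}

shared_storage = {}  # stands in for the user's storage account

def is_valid_event(event):
    """True if the object has every agreed-upon field with the agreed type."""
    return all(
        isinstance(event.get(field), kind)
        for field, kind in REQUIRED_FIELDS.items()
    )

def app_a_save(event_id, event):
    """App A writes events, refusing anything that breaks the convention."""
    if not is_valid_event(event):
        raise ValueError("event does not match the shared structure")
    shared_storage[EVENT_PREFIX + event_id] = event

def app_b_list_titles():
    """App B, written independently, reads the same keys."""
    return sorted(
        event["title"]
        for key, event in shared_storage.items()
        if key.startswith(EVENT_PREFIX) and is_valid_event(event)
    )

app_a_save("1", {"title": "Dentist", "start": "2017-11-13T09:00", "end": "2017-11-13T09:30"})
app_a_save("2", {"title": "Call Bob", "start": "2017-11-14T15:00", "end": "2017-11-14T15:15"})
print(app_b_list_titles())  # ['Call Bob', 'Dentist']
```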

Core team member here. This is indeed something you can already do today, but more importantly which is our next area of focus for development and improvements: https://remotestoragejs.readthedocs.io/en/latest/data-module...

We'd be happy about any and all feedback and ideas about this specific topic. Our community forums are a good place for that: https://community.remotestorage.io

This is very interesting. I've been thinking about what kind of storage solutions to use for several tiny PWAs (progressive web apps) with which I've been tinkering.

I'm pretty skeptical that this is really practical for most people. I think it's a lot like 'programming for everyone' tools. Most people are basically doomed to fail in some kind of way because both maintaining data and programming are just inherently difficult.

Hell, I'm pretty confident in my programming abilities, but most possible interesting projects require a significant amount of time, if nothing else. Similarly, even tho I'm pretty comfortable with servers, command line shells, etc., managing a working and reliable personal backup system – that's actually resilient to infrequent but inevitable catastrophes – is not a trivial amount of work. And getting the damn thing set up in the first place has been a very significant amount of work!

I'm not sure most people are making a wrong decision with respect to all of the relevant tradeoffs. Most data is probably not that valuable and so the effort to actually secure it and protect it is wasteful.

That's why you can choose a storage provider to do it for you. But as with email, you're not locked in to a single provider, and small ones will probably not scan your personal data using AI/ML algorithms, like e.g. Google does.

Edit: the library in question also supports using Dropbox and Google Drive, so your users don't even have to know about RS in order to sync your PWA's data to their own account with those.

> That's why you can choose a storage provider to do it for you. But as with email, you're not locked in to a single provider

I wrote this before I realized you're right:

> Yeah, I get it. But the real benefit, that outweighs the cost in time and attention, and maybe money too, is that you can control your data. Not having your own copy, and your own backups, or not having the technical ability to access relevant info in your data, as I'd expect the vast majority of people to not have, doesn't seem like it means that much effectively.

Thinking about email was what opened my eyes. Even tho most people don't understand email technically, it's a standard feature, effectively, that pretty much anyone can have a new email provider import their emails from their old provider. That's a big benefit and just that benefit alone is worth implementing something like RS for a lot of webapps, if not nearly all of them.

If only some company would release a "homepod" RemoteStorage endpoint with easy backups, this could have a chance at becoming widespread.

Hellz YES!

Would be nice if RemoteStorage were included as an app in Cloudron.io and/or Sandstorm.io, or even better, in NextCloud/Owncloud.

I seem to recall that Nextcloud actually supported this, but I can't find it right now. In any case, I know that there are people who are at least somewhat involved with both projects.

Yeah, that would be me. :) At some point we supported it, but it turned out that for security reasons you would have to have a separate domain for your remoteStorage than your Nextcloud – which kind of defeats the purpose of easy integration.

At least that is what I recall. Still think it would be great to integrate remoteStorage properly with Nextcloud! :)

Hmm, maybe you could use Content-Security-Policy to force the files to be treated by the browser as if they came from a different origin?

I think so, too. Let's make it happen! :)

How is remotestorage.io different to https://tent.io ?

Are there any other data protocols aimed at helping users "own" their data out there?

As far as I can see, Tent unfortunately seems to be inactive. Last commit in 2013: https://github.com/tent/tentd, last blog post 2014: https://tent.io/blog

so we have a failed protocol from 4 years ago, what makes this remotestorage more likely to succeed?

Tent's goals sound similar on the front page, but are actually very different. It's more of a communication protocol than a data storage one. This page is slightly more informative, albeit not much: https://tent.io/about

Tent is built around posts. Each Tent server stores a single user’s posts and sends copies to the user’s subscribers.

I seem to recall that tent - the protocol - used to be an effort for decentralized social networks...though I've lost interest years ago...so not sure what is happening recently, or if they've pivoted to different space.

My immediate concern with the unhosted model diagram on the page is: how do you make sure the data hasn't been tampered with? The web application needs to see it, but now there's the added problem of not knowing if it's legit, because the data source is the end user.

Another side idea that comes with this is something like IPFS or even torrent, where bits of the file storage are stored on individual nodes but incapable of meaningful data unless put together with other bits of data from other nodes.

I admit I'm not even sure how BitTorrent makes sure the other "bits" are valid. I would love to know if there is some p2p-storage.js you could drop on your webpage.

Why do you need to worry if the data has been tampered with? The user can only hurt themselves.

If you need access to the users' data then you will need to run your own server anyways, so you deal with the trust problem as any other web application does.

> I admit I'm not even sure how bittorrent make sures the other "bits" are valid...

Each peer has a SHA-1 hash of each piece.

> p2p-storage.js...

Maybe webtorrent.min.js is what you're looking for.
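A minimal sketch of per-piece hash checking, assuming the list of expected SHA-1 digests comes from a trusted source (in BitTorrent, the torrent's metadata). The tiny piece size and the function names are for illustration only.

```python
import hashlib

PIECE_SIZE = 4  # unrealistically small, just to keep the example readable

def split_pieces(data, size=PIECE_SIZE):
    """Cut the payload into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def piece_hashes(data):
    """SHA-1 digest of every piece, published alongside the file."""
    return [hashlib.sha1(piece).digest() for piece in split_pieces(data)]

def verify_piece(index, piece, expected_hashes):
    """Accept a piece from an untrusted peer only if its digest matches."""
    return hashlib.sha1(piece).digest() == expected_hashes[index]

original = b"hello world!"
expected = piece_hashes(original)
pieces = split_pieces(original)

print(verify_piece(0, pieces[0], expected))  # True
print(verify_piece(1, b"XXXX", expected))    # False: corrupted or forged piece
```

A downloader can therefore accept pieces from any peer and discard whatever fails the check.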

WebSQL when?

@Firefox devs: please add WebSQL support. Now that you lost the IndexedDB devs who voted against WebSQL, it's about time. Mind you, NoSQL and NewSQL nowadays coexist next to each other. Safari and Chrome-based browsers (90% global market share) support WebSQL; Firefox does not.

Until WebSQL gets a full spec, I personally am completely against it.

Right now it's defined as "What SQLite does", which is not a way to implement a standard.

Not to mention that even if they did create a full "specification" for the SQL that it uses, it seems like way too much to force onto a browser to develop for not all that much benefit.

Safari and Chrome both support it, but it's not going to get any major updates and only exists in the browser today because it was added at one point.

I will be happy to write a spec for a cut-down SQL language (a subset of the language understood by SQLite) for WebSQL, and provide a "strict" flag for SQLite to force it to understand all and only the spec language. The prerequisite for doing this is that you (or anybody else, as long as it isn't me) need to get all the browser vendors on-board and promising to implement WebSQL if I write the spec.

The cut-down SQL would include just the basics: CREATE TABLE, CREATE INDEX, INSERT, UPDATE, DELETE, SELECT. No triggers or views. No support for partial indexes or indexes on expressions, common table expressions, or other such features. There would be a minimal set of built-in SQL functions, but with support for application-defined SQL functions written in javascript.
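For a feel of what that subset covers, here is a sketch using Python's built-in `sqlite3` module as a stand-in for a browser engine. The table, the index name, and the `word_count` function are invented for this example, and a Python lambda plays the role that an application-defined JavaScript function would play in the proposal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Application-defined SQL function (Python here; JavaScript in the proposal).
conn.create_function("word_count", 1, lambda text: len(text.split()))

# Only the statements named in the proposed subset are used below.
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, created TEXT, body TEXT)")
conn.execute("CREATE INDEX notes_created ON notes (created)")
conn.executemany(
    "INSERT INTO notes (created, body) VALUES (?, ?)",
    [
        ("2017-11-01", "old note"),
        ("2017-11-08", "a recent note"),
        ("2017-11-09", "scratch"),
    ],
)
conn.execute(
    "UPDATE notes SET body = ? WHERE created = ?",
    ("another recent note here", "2017-11-09"),
)
conn.execute("DELETE FROM notes WHERE created < ?", ("2017-11-04",))

rows = conn.execute(
    "SELECT created, word_count(body) FROM notes "
    "WHERE created BETWEEN ? AND ? ORDER BY created",
    ("2017-11-04", "2017-11-10"),
).fetchall()
print(rows)  # [('2017-11-08', 3), ('2017-11-09', 4)]
```

No triggers, views, partial indexes, or common table expressions are needed for this kind of range query.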

A competing implementation would not need to understand the SQLite file format nor the complete SQLite language nor would it need to implement the SQLite API. All it would need to do is support the basic SQL subset defined by the spec.

Get the browser vendors on board with this idea, and I'll write the spec.

While I'd love to see this happen, I'm not even sure where I would begin.

But if someone out there has experience working with whatwg or W3C on these kinds of things and wants someone to help out, I'd love to be a part of it.

But like you said, the hardest part is getting the different browser engines to promise support (or even take it seriously). I remember reading on HN a post from a woman who was involved in trying to get SVG in the browser standardized (sadly I can't seem to find it any more!), and just reading about the struggles she went through to even talk to someone involved who, while receptive, basically told her that it wasn't going to get done.

It seems like there would need to be an "advocate" who is part of the teams on EdgeHTML and Gecko that would be willing to fight for it before it would even begin, otherwise, like you said, it would all basically be wasted effort.

The JavaScript standardization system is quite nice, I wish that browsers could get together and come up with something similar for Web APIs...

Rules are great, and you're right that there's a rule that something cannot be a browser-standard unless it's a real standard, with real generic specs.

BUT... what about making an exception to the rule? SQLite is sort of a unique beast -- public domain license, used almost everywhere (most smart phones have it embedded, Chrome and Safari have it), and rock solid? Also, it's not like it's a 100 million LOC app.

Because despite how great SQLite is (I use and love it myself), removing the "competition" between multiple implementations is removing one of the main cornerstones of the web platform.

A real spec is "loose", it allows wiggle room in many areas, it allows for multiple ways to implement things, it allows implementations to act differently, and in most cases that comes out as a win for the user. Saying the spec should do what a single implementation does is about as close to saying "you aren't allowed to improve this" as you can get.

I really believe that a monoculture is a bad thing. Luckily on most platforms SQLite (while the best in most cases) doesn't have a monopoly on what it does, and while it's a de facto standard in many cases, there are alternatives at every step of the way. Saying that a platform must implement SQLite takes that de facto standard and codifies it. It's forcing a monoculture. It doesn't just disincentivise improvement, it almost explicitly forbids it.

I see you got a response from "SQLite" elsewhere in this thread, with same thought I had -- just document the subset of SQL to be supported. I would say it's too late, but... SQL has been around for decades, and I think it will be around for decades more, so it probably deserves to be baked into the browser.

But isn't "What SQLite does" effectively a standard? It's got a spec AFAIK.

A core part of web technologies is the fact that they have multiple competing implementations.

For example, javascript changes can't be put into a standard until they have (IIRC) 3 major browsers which implement them.

Pinning a standard to a specific implementation (and a specific version) is the opposite of every other web standard. Plus it means that any alternate implementations must reimplement SQLite themselves if they don't want to or can't use SQLite themselves.

I'm confused about what would be implemented. Is it a database engine that supports the SQL dialect that SQLite uses? In which case, (at least) two browsers have already implemented it, right?

If you're referring to SQLite itself being implemented multiple times, I don't understand why that's relevant to Web SQL. You wrote that Web SQL shouldn't be supported, i.e. implemented, until there's a spec. It seems like there's a 'spec' now, if not an official formal spec. But now you're claiming that Web SQL can't be a standard because there aren't multiple implementations. But it certainly seems like there are already multiple implementations:

- [Can I use... Support tables for HTML5, CSS3, etc](https://caniuse.com/#feat=sql-storage)

But still, your logic seems circular: Web SQL shouldn't be implemented for all the browsers because there's no spec and there's no spec because there aren't multiple implementations.

I imagine there's a distinction you're trying to convey that I'm missing. What is it?

The implementations currently are just SQLite hooked into the browser, they aren't different in any real way.

It is a bit circular, but generally something gets proposed as a standard, multiple browsers will implement it, then it becomes a finalized standard.

In this case it was proposed, the standard says "your implementation needs to do what this other implementation does" (and no other information is given about the SQL system), 2 browsers (really 1 engine at the time, webkit) used SQLite directly, and it never went anywhere else.

I guess my point isn't that there's not a "spec", but that the spec that there is doesn't say anything about the SQL part of WebSQL, it just says what the interface to the SQL engine looks like, and then mandates that the SQL engine act, look, and work like the one in a specific version of SQLite.

No, this is not WebSQL. This allows remote hosting, so data across multiple devices can be kept in sync.

One of the four 'features' mentioned on the home page is "Offline" so presumably some kind of browser local storage is supported.
