Solid aims to radically change the way web applications work (mit.edu)
354 points by doener on Feb 11, 2018 | 125 comments

I wish the landing page had a clearer explanation of how Solid lets the owner of the data control which people and applications can access it, as well as what the process is for moving one's data from one store to another.

This project looks like it has huge potential, I just want to be able to understand at a glance what privacy controls are in place without reading the source code before committing my personal information.


Found the specifications documentation linked to from some of the example apps[0][1]. It would be great to have more of this information on the landing page for the project.

[0] https://github.com/solid/solid-spec

[1] https://github.com/solid/solid/blob/master/README.md

Reading the landing page left me scratching my head too. Great lofty goals and all, but zero information on _how._ The fact that you had to go to links from the example kind of makes it a failure.

Exactly. "Solid (derived from "social linked data") is a proposed set of conventions and tools for building decentralized social applications based on Linked Data principles. Solid is modular and extensible and it relies as much as possible on existing W3C standards and protocols." That sounds like a pitch from some ICO. The site has the buzzwords. It's got the neckbeards. What it doesn't have is a convincing use case. It comes across as some really complicated scheme for address book synchronization.

They have three sample applications, yet all you can click on is somebody's blog entry. Clicking on the "publishing" app gets you a screenshot of the abstract of someone's paper. It's a single page web site, like all the cool kids have now. Clicking on the top menu items just scrolls the page.

It has MIT and Berners-Lee behind it, so it can't be totally bogus. If those names weren't on this, I'd assume it was from someone either clueless or crooked.

There's a decent description of Solid on Github.[1] From there, you can see the real problem. It's only useful if the big players adopt it. Which they won't, because it breaks their walled gardens.

This looks like another try at Berners-Lee's "semantic web" - hammer as much content as possible into standard formats so it can be machine processed. This is an old idea, and tends to break down once you get beyond contact lists and library catalogs.

There have been major efforts to make that work in the business sector, where it's called "electronic data interchange", and parties want to exchange purchase orders, invoices, and bills of lading.[2] It's worth looking at that area to see how hard this is for even simple-seeming problems like that. And they have cooperation - buyer, seller, and shipper all want that data to flow smoothly between the parties. Trying to do this in today's world of competing closed web empires is much tougher.

The medical data records people have it even worse, I hear.

[1] https://github.com/solid/solid-tutorial-intro

[2] https://www.edibasics.com/what-is-edi/

My experience in companies is that data exchange, data reuse, and reference data are a nightmare. My mental model of a company is an insane amount of customized ETL on top of web services, obscure IDs in each data/service silo representing the same thing, and poorly documented (or undocumented) data exchange formats that require massive boilerplate code. I think we are still in the Middle Ages. I have been a huge fan of Linked Open Data, and its lack of adoption profoundly disturbs me. I still don't understand how people can use Google's Knowledge Graph daily and dismiss the need for something similar in companies, or something widely available for open data.

If machines can read, why does data need to be machine-readable? Seems like a transient.

For the same reason an organization will ask you to fill in a standard form page (even on paper) rather than writing a long form essay.

Why do they do that, again? I'm not arguing against summaries. Just saying that if machines can read as well or better than humans, then machine-readable in the sense of using a simplified ontology, grammar, or alphabet is unnecessary.

My point is that even humans make fewer mistakes when they're reading an established form structure with pre-defined fields over long-form text. Hence you'd expect a machine to process them better as well, even if it could read like a human.

Because even if you can read you need context, or you need to be able to identify what a new word means. Reusing a shared vocabulary is useful.

I don't understand. If you already agree on a shared vocabulary, then you're not using new words. Context is more present in the long-form un-pre-processed text than in the short-form text.

SOLID is a spec that TBL & co (like Dmitri Zagidulin) have worked on. It is quite good and based on linked data. It was mostly worked on the other year, but I think it is only now starting to be marketed - which means TBL thinks it is ready for people to implement.

We wound up implementing something very similar, about the same time (I wish I had known about SOLID earlier), except using CRDTs and graphs (which can support Linked Data). Our implementation is live and functioning already, including:

- Realtime updates across a decentralized network.

- End-to-end encryption with P2P identities.

- Backs up to localStorage/disk (if you Electron-ify/etc. it).

- Can also be backed up by remote storage services.

It is as if IPFS and Firebase had a love child, and we've tried really hard to reduce the API to just a couple lines of code to get fully working P2P social networking dApps in place.

Check out:

- Intro http://hackernoon.com/so-you-want-to-build-a-p2p-twitter-wit...

- 4min interactive coding tutorial https://scrimba.com/c/c2gBgt4

I know nothing about the subject, how does file removal work in decentralized networks? I understand once something is on the internet it's out of your control but what if I accidentally post a naked picture when I'm trying to sell something on decentralized-bay and I try to delete it immediately or a few minutes after? What about illegal content that I don't want to redistribute without my knowledge?

Um... it is hard. Really hard. Let us just hope you don't have WiFi turned on, and can revert your changes first!

In our case, we have a tombstone delete method. So as long as every peer that saved your data (and people aren't gonna just save it for free) comes online at some point, after you've nulled/tombstoned the data, then it will be deleted.

This is only true because of our CRDT system that lets you update/mutate data. It is not true or a general property of other decentralized systems though.
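
For readers new to the idea, here is a minimal sketch of tombstone deletion, assuming a simple last-write-wins map keyed by timestamps. It is not the commenter's actual implementation; `LWWMap` and its methods are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple

# Sentinel stored in place of a value to mark a deletion (the "tombstone").
_TOMBSTONE = object()


@dataclass
class LWWMap:
    """Hypothetical last-write-wins map: each key holds (timestamp, value)."""
    entries: Dict[str, Tuple[int, Any]] = field(default_factory=dict)

    def put(self, key: str, value: Any, ts: int) -> None:
        current = self.entries.get(key)
        # Keep only the write with the newest timestamp; stale writes are ignored.
        if current is None or ts > current[0]:
            self.entries[key] = (ts, value)

    def delete(self, key: str, ts: int) -> None:
        # A delete is just a newer write whose value is the tombstone.
        self.put(key, _TOMBSTONE, ts)

    def get(self, key: str) -> Optional[Any]:
        entry = self.entries.get(key)
        if entry is None or entry[1] is _TOMBSTONE:
            return None
        return entry[1]

    def merge(self, other: "LWWMap") -> None:
        # Syncing with a peer: replay the peer's entries through put().
        for key, (ts, value) in other.entries.items():
            self.put(key, value, ts)


# The author posts a photo, a peer copies it, then the author deletes it.
author, peer = LWWMap(), LWWMap()
author.put("photo", "oops.jpg", ts=1)
peer.merge(author)             # the peer now stores the photo
author.delete("photo", ts=2)   # the author tombstones it

still_there = peer.get("photo")  # the peer has not synced yet
peer.merge(author)               # the peer comes online and syncs
after_sync = peer.get("photo")
```

Until the peer syncs, it still serves the photo; after the merge, the tombstone wins because it carries the newer timestamp. This only removes data from peers that cooperate, which matches the caveat above.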

We have to learn to treat all the content we post like the words we speak publicly. If it's out, it can't be taken back.

> huge potential

I was about to say "but there's no information" but then you added the Github links. Thanks!

I don't see why that's not referenced on the main page. I mean it might be a marketing tool to get funding, but it still needs to have links to some prototypes for the technical audience (even a little github cat logo to the repo would have been good).

Even a non-tech audience would understand a link to "see the spec" and know what Github is, even if they wouldn't be able to understand any of what's there.

personal robots.txt style disallow rules that platforms can choose to honor?

The fact that they're still using RSA makes me want to avoid it: https://github.com/solid/solid/blob/master/proposals/auth-we...

Last Updated Feb 2016.

I'm sure when they come back to this they'll change the SHA1 to SHA256 (I'd hope) .. might just do that myself and submit a pull request.

What's wrong with RSA? DSA has been deprecated in most tools. Do you think they should use ECDSA?

I think they should use EdDSA for signatures and X25519 or X448 for ECDH. RFC 7748 / RFC 8032.

Although I agree with the aims of the project, trying to understand it leads you down a rabbit hole of complexity that ultimately never pays off. Ontology, vocabulary, RDFa, OWL, FOAF, etc.

I assume this is a continuation of - or somehow related to - the semantic web project that W3C spent a lot of time spinning its wheels on back in the early '00s. Back in the day, I bought into the hype that this would be the next big thing, but it never gained traction. Nobody understood it. It was too meta.

Trying to do anything with semantic web specifications was like writing an academic treatise on the philosophy of meaning, and ultimately delivered no more value to users than a hacked up <table> layout.

People should all reread the essay 'Metacrap' once in a while about this.


However, metadata is really useful in some contexts. Say you have a huge collection of scientific data from a particle accelerator, astronomy database, satellite imagery or sensors.

How do you set that up for search?

How do you make it worthwhile for academics to release data like this and get credit as they do for writing a paper?

How do you have provenance for derived data?

How do you set up a unique identifier so the data can be referenced and found as required?

You have data about the data. You have metadata. If you're smart you standardize it and bingo. You have a use for metadata.

CERN uses OPC UA for ops-level organization and integration in the large:




The information model is quite flexible, and not tied to any specific platform... but it's not for the faint of heart.

Semantic web is not that hard to understand. But Facebook/Google/etc. will NEVER adopt it. They are not interested in opening (meta)data in highly precise and machine-readable form to 3rd parties. So the only adopters are geeks/scientists.

I can see where you're coming from, and your point of view is certainly not baseless, but I just thought I'd point out that at least Google has pushed for the development and adoption of linked data formats like JSON-LD [0] and standardized vocabularies like schema.org [1]. They make use of it for "knowledge graph" [2] features, as well as in Gmail for what they call "actions and highlights" [3] (things like displaying flight reservation details, for instance).

[0] https://json-ld.org/

[1] https://schema.org/

[2] https://developers.google.com/knowledge-graph/

[3] https://developers.google.com/gmail/markup/getting-started
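
To make the JSON-LD plus schema.org combination concrete, here is a small sketch that builds a flight reservation description as JSON-LD. The property names come from the schema.org vocabulary; every value is invented, and whether Gmail or any other consumer accepts this exact shape is an assumption that is not verified here.

```python
import json

# Hypothetical flight reservation expressed with schema.org terms.
# All values (names, codes, times) are made up for illustration.
reservation = {
    "@context": "https://schema.org",
    "@type": "FlightReservation",
    "reservationId": "RXJ34P",
    "reservationStatus": "https://schema.org/ReservationConfirmed",
    "underName": {"@type": "Person", "name": "Jane Doe"},
    "reservationFor": {
        "@type": "Flight",
        "flightNumber": "110",
        "departureAirport": {"@type": "Airport", "iataCode": "SFO"},
        "arrivalAirport": {"@type": "Airport", "iataCode": "JFK"},
        "departureTime": "2018-03-04T20:15:00-08:00",
    },
}

# Serialized form, e.g. for embedding in a page or message.
markup = json.dumps(reservation, indent=2)
print(markup)
```

The point is that the structure is plain JSON with a shared vocabulary on top, so a consumer can pick out the flight number or airport code without scraping free text.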

Yes, Google would love for you to mark up your data so that they can better consume it. But good luck trying to get Google to make any of their data more interoperable. Google Plus, YouTube, Google Photos... they do have somewhat limited APIs, but they are not federated and standardized. Semantic web in, limited proprietary access out. Walled gardens are a business tactic, no semantic web technology can change that.

I think you are correct. Semantic web is a decent technology. It has some rough edges, but it has solved the technical aspect of the data interoperability problem. The only barrier is a social/political/business one: privatizing and monetizing user data is the business model of most of Silicon Valley. I always say that a federated protocol like email would never be adopted today, the business incentives just do not exist.

Don't trust google on metadata recommendations, they change it way too often: https://aaronparecki.com/2016/12/17/8/owning-my-reviews

The W3C docs are very developer-unfriendly. Additionally, there is a significant issue with OWA (the open-world assumption) vs. the default developer mindset of CWA (the closed-world assumption). Also, OWL and RDFS are not the type of schema a dev is used to, and until very recently there was no option to actually verify and control the shape of the data (there is SHACL now). There are also no sensible examples of using the stack; using FOAF to publish open data isn't the job of most devs! Finally, the rubbish tooling, clunky DBs, and Java being the only viable option make the whole thing a quagmire. However, it has tons of value if you can get past all that friction: Enterprise Linked Data is an order of magnitude better than the alternatives.

> "Nobody understood it."

I don't think the semantic web is hard to understand. The semantic web is a distributed database that everyone can contribute to and access. It allows you to write queries to explore the data on the web, rather than having to browse through it.
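
As a rough illustration of "query instead of browse", here is a toy in-memory triple store with wildcard pattern matching. It is only a sketch: real deployments use RDF and SPARQL, and the `ex:` identifiers below are invented rather than taken from any real dataset.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Toy dataset of (subject, predicate, object) triples. The identifiers are
# invented for illustration and do not come from DBpedia or any real dataset.
TRIPLES: Set[Triple] = {
    ("ex:TimBL", "ex:invented", "ex:WorldWideWeb"),
    ("ex:TimBL", "ex:worksAt", "ex:MIT"),
    ("ex:Solid", "ex:ledBy", "ex:TimBL"),
    ("ex:Solid", "ex:hostedAt", "ex:MIT"),
}


def match(s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for triple in sorted(TRIPLES):
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


# "Which things are connected to MIT, and how?"
at_mit = [(subj, pred) for subj, pred, _ in match(o="ex:MIT")]

# Join two patterns: "What is led by someone who works at MIT?"
led_by_mit_staff = [
    project
    for project, _, leader in match(p="ex:ledBy")
    if any(match(s=leader, p="ex:worksAt", o="ex:MIT"))
]
```

The second query joins two patterns, which is the kind of question that is awkward to answer by clicking through pages but natural once the data is exposed as statements.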

For those that are new to the idea of the semantic web, DBpedia is a decent example:


The BBC website also makes use of semantic web technologies:


I have to agree - once I was in the not so comfortable position of having to deny funding for this - you don't say no to TBL lightheartedly, but SOLID doesn't offer anything that couldn't be done more simply with existing tools and techniques. It's byzantine and ROI is unclear.

> It's byzantine

How so? Can existing tools achieve the goals that SOLID is trying to deliver?

Distributed is hard, but it's not impossible.

The whole Internet has become so centralized that it's about due for something to come along that will blow it up and move power back to the edges of the network. That's what happened when minicomputers shattered centralized mainframes, and then when PCs shattered centralized minis. The Internet leveraged all of those PCs to move even farther out to the edge for a time, and now it's back to a centralized system again.

You almost sound like the voice of the status quo tech investment establishment. I'm not saying you are, but back in the late 80's we were repeatedly turned down for funding using almost the same language that you're using.

And I agree, from that point of view whatever the next BIG thing that comes along and shatters the existing system will look at first like a bad investment whose ROI is unclear.

I don't know if SOLID is the next big thing or not. From what I've seen from the spec I don't think it's likely. But these ideas which have been around for 20 years are starting to gain traction among the smartest people in the industry.

As Kevin Kelly said, something can be inevitable, but no one knows the form it will take. Distributed is one of those things. We don't know what it will look like when it does take off, but most of the pieces are in place and waiting for the right implementation at the right time to catch fire and move power back to the edge of the network again. ROI will soon follow after that as all of the latecomers pile on with the only goal of making money and try to stop others from making money from it. And so the cycle will start again towards centralization.

Arguments following this pattern are often made to defend technologies: The current state A is bad, and we should aim for state B. Technology X aims for state B, so X is good.

This pattern is fundamentally flawed. X needs to be useful on its own, no further assumptions made. Otherwise it will never get traction, regardless how much we wish B to come true.

Decentralization advocates make lots of specific arguments for advantages they claim it has.

Yes, and rightfully so. But to convince someone that SOLID is the right approach for decentralization, one needs to refer to the specific technology, and to specific use cases where specific user groups have an incentive to switch to it.

We know from the mainly failed P2P wave that just wishing for decentralization is not enough.

Yeah these ideas have been floating around for 20 years. It sounds great in theory, but the devil is in the details. And the whole problem with "owning your data" is that copying data is essentially free. So your only actual recourse is to invoke the power of government, which is unlikely to be on your side versus Google, Facebook, the credit bureaus, and law enforcement.

Can anyone explain what this actually is? I get that it's apparently an exciting new project, but what actually is it? A framework? A programming library? A data specification? I honestly can't tell.

They refer to "applications built using the Solid stack" at one point, which implies that it's an application platform. But elsewhere they talk about how it's "a proposed set of conventions and tools", which implies that it's an interchange data format.

What am I missing?

I'm not sure either what exactly this is, but it reminds me of https://blockstack.org/ when it comes to each user owning their own data storage which the app runs on.

Blockstack essentially lets you pick for example Dropbox as the datastore and an app would save files in a directory on your Dropbox instead of a server controlled by the app developer.

Solid seems to talk about also keeping track of your social connections. So I suppose it also tracks people's profile data and public keys locally?

I heard a funny story once about an interaction between Tim Berners-Lee and Ryan Shea. Woulda been about 4 years ago and I can see how it might have prompted TBL to take a crack at the same problem. Maybe you're right?


this should answer all of these questions

Interesting. Is that actually linked on the Solid project homepage? I'm not finding it.

> Can anyone explain what this actually is? I get that it's apparently an exciting new project, but what actually is it? A framework? A programming library? A data specification? I honestly can't tell.

Apparently it's a drop-in replacement for Google Wave.


The reason there isn't a clear explanation is that there probably isn't even a functioning beta. Hell, there isn't even a drafted design depicting what it is. This is part of a PR hype machine. Put MIT and Berners-Lee on a project page to drum up interest.

Unvoting this one. Not going to help them achieve their mission.

No idea why the website offers so little information but here is the project: https://github.com/solid/

Feels that way. The question of whether it's actually Solid seems to be immaterial.

"As the project director, but also as a web developer, Tim Berners-Lee is involved in the overall planning and evolution of Solid."

'Web developer' as in 'developer of the web' :-)


Google, Facebook, and big data will kill any chance of a standardized linked data format or protocol. Google's main product isn't search, it's you. Facebook doesn't care about messaging or social apps, they care about mining as much data about you as possible. If they had to ask for this data upfront and in a clear way they probably wouldn't exist today.

Well, they will kill any big changes, but as tech people, we need to make small changes to start with. I've added Let's Encrypt to all my personal websites. I'm currently working on getting my Mastodon server up. I have a little Docker system to self-host things.

You could write some scripts to publish your personal blog to ZeroNet, and add some "also available on ZeroNet" links to the regular HTTP version.

It takes little steps and it has to start with us in the tech community. Even if our distributed tools don't grow, at least we can say we tried.

Sure, they shouldn't have that choice.

Currently, when faced with regulation, they can say with a straight face that it's technically not possible to let users own their own data. Google has to control the data it uses so as to operate its business, and privacy is just an unfortunate casualty of that. Projects like this (if successful) demonstrate that you can actually have your cake and eat it too, and will (at least potentially) give regulators a lot more leeway to force the hand of companies.

While it's true that Google and the like benefit from closed or partially closed data ecosystems and formats, as I've pointed out in another comment elsewhere in this thread [0], they've at least been pushing for the development and adoption of linked data formats like JSON-LD and vocabularies like schema.org, and they're using both of those pretty extensively in products like search and Gmail.

It's a far cry from things like Solid's vision of a fully decentralized web of linked data, but at least it's something.

[0] https://news.ycombinator.com/item?id=16356193

I have no idea what they are planning, but my simple approach to the problem would be to let browsers offer an API to store their users' data in the cloud.

That might sound a little abstract, so I will give an example: A year ago I built a PWA which had the ability to use a WebDAV server as a backend to store data and sync multiple clients. While some might argue that WebDAV is not the perfect protocol, most of its disadvantages can be worked around. The real issue is that the user has to enter their credentials into every app which wants to store some data for them.

In my opinion, that is something browsers could make a lot easier and just let the user grant/refuse access to the 'cloud-storage'. That cloud storage would need a standardized protocol (e.g. WebDAV) to manage its access, but that way browsers could offer web apps a large storage and the user could select its own storage provider (in the browser settings).

Most users would probably stay with Google Drive or Dropbox, but at least their data would not be stuck in the walled gardens of the various web app providers and some might even choose to store their data in a Nextcloud (at least that is what I do with my PWA).

Really what we need is first-class IPFS support in Firefox/Chromium.

Img tags should have an href and an ipfs option. The browser can then choose to pull from either source, depending on user preferences or which one is faster. Users can be prompted if they want to store/serve that data in their own ipfs cache.

I think the FF59 plans for IPFS just make it easier to interface with your existing IPFS server (if it's running). Making it first class would be a game changer, but don't expect to see it in Chromium or Webkit any time soon.

Could you please elaborate on how IPFS would solve the problem? As far as I understand IPFS it is just like a public cloud. So either you run your own node (which would be similar to running your own server in terms of owning your data) or you upload to the network in which case you would not care on which server the data would live (similar to uploading an ?encrypted? file to Dropbox?). The real difference is how files are served as soon as you request one.

So what I don't understand is how it would make app developers give you the control over your data?

Regarding your img-tag remarks: what should href and ipfs attributes do? I mean href is an attribute used for linking, not for loading data like the src attribute, and adding attributes for different protocols isn't a good idea either, as for that use case we have URLs. So what's wrong with this?

  <img src="ipfs://example.com/cat.jpg">

I never used IPFS so I might completely miss the point as I just read about it and saw a few videos, so please tell me if I misunderstood something.

I was thinking more along the lines of the first steps towards distributed assets. Maybe something like

    <img src="images/cat.jpg" some-new-attr="ipfs://mdgpreefl215jfeiwef2456/cat.jpg">

Or maybe `some-new-attr` could be a 2nd src? In any case, it's only to deal with asset distribution. One of the challenges with something like YouTube is simply the infrastructure to host all that content, as well as content/videos disappearing when YouTube decides they don't want to host it. With IPFS, you could reduce the server load on the content creator and also give the asset permanency if it's popular enough to be seen/rehosted by others.

The img tag example isn't that great of an example I'll admit. But it could eventually be extended to permanence of the entire page/site. Although there are already better tools suited for that, like ZeroNet.

This would have been possible with foreign fetch in Service Workers, but guess what... Google didn't like the idea.

Thanks for the hint. If someone else wonders why foreign fetch was canceled, this seems to be the relevant issue:


Closest comparison I can draw is with urbit[0], mainly in regard to data ownership. Beyond that there's not much else that's similar between them. I'm disappointed by the lack of information on Solid.

[0] https://urbit.org/

Even after reading several of the github linked specs and examples, I still struggle to understand what this intends to be, how it will work with e.g. CORS, or how anything will be able to flexibly work with other things playing in the same data-sandbox.

Like, we already have XML DTDs and schema.org and whatnot. Everything appears to be in place to build everything that I can see this doing, and has been for decades. But it hasn't turned into anything because the problem isn't how to store the data, it's how to use it, and allowing everyone to manipulate all data seems to repeatedly be demonstrated as a failure.

Is it really just "let websites manipulate my data, stored on my machines"? What would possibly incentivize websites to do so, rather than pull in data and subtly break it for others (intentionally or by accident)? Users are absolutely not going to understand why site X broke site Y, only that site Y is broken.

Or am I missing something fundamental? Entirely possible, I can't figure much out at all.

I don't see how the spelling errors in the spec can possibly discredit the technical merits of it.

There are some parts of the project that I don't like or maybe simply don't understand. For example, its user stories include an utterly simplistic privacy system [1].

[1] https://github.com/solid/solid-spec/blob/master/UserStories/...

What if Ian starts spamming everyone on the entire web (let's call this 'root node') with his "you've got a file from Ian" notices? Some kind of rate limiting system is required in this case, but is it really possible to decentralize such a system?

I imagine a system of many communities that can be subscribed to, think of subreddits, with their own behavioral rules (code of conduct?) requirements, groups, permissions, blacklists etc. So if Ian and Jane both are subscribed to the same community that grants them the permission to do the described actions (thus Ian is not banned nor is over his rate limit to send his notice to Jane, and Jane's privacy settings permit people like Ian to send their notices to her), they can be performed. I'd call that a 'third party node'.

Such a system would also solve the problem of discoverability. I expect the rise of the githubs and gitlabs of Solid if this problem is not accounted for early on.

Let's say these two users already got to know each other and they want to decouple from the restrictions of the third party node they met at. How can they pair their 'profile nodes' so that there are no more third party constraints to limit their interactions? Let's say the profile nodes include a social, facebook-y functionality in them. Ian sends a direct pairing request to Jane. She accepts by adding him to a personal custom group named 'new friends' that will restrict him to seeing only a few of her photos (maybe based on the tags that were used on the photos, maybe based on creation timestamp ranges so he is able to see only her most recent/probably less embarrassing ones, who knows, she's the one to decide). On her personal node, it's her rules. Much better control than the current social media sites provide.

This calls for a really privilege-centered system. Can Solid provide it?
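
The group-based visibility idea described above could be sketched roughly as follows. This is an illustration of the rule only, not anything from the Solid spec; `ProfileNode`, `Photo`, and the tag names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Photo:
    name: str
    tags: Set[str]


@dataclass
class ProfileNode:
    """Hypothetical personal node: the owner defines groups and their rules."""
    photos: List[Photo] = field(default_factory=list)
    # group name -> tags that members of the group may see
    group_tags: Dict[str, Set[str]] = field(default_factory=dict)
    # contact id -> group name
    membership: Dict[str, str] = field(default_factory=dict)

    def accept(self, contact: str, group: str) -> None:
        # Accepting a pairing request means placing the contact in a group.
        self.membership[contact] = group

    def visible_to(self, contact: str) -> List[str]:
        group = self.membership.get(contact)
        if group is None:
            return []  # unknown contacts see nothing
        allowed = self.group_tags.get(group, set())
        # A photo is visible if it carries at least one allowed tag.
        return [p.name for p in self.photos if p.tags & allowed]


jane = ProfileNode(
    photos=[
        Photo("hike.jpg", {"outdoors", "recent"}),
        Photo("party.jpg", {"private"}),
    ],
    group_tags={"new friends": {"recent"}, "family": {"recent", "private"}},
)
jane.accept("ian", "new friends")

ian_sees = jane.visible_to("ian")       # only photos tagged "recent"
stranger_sees = jane.visible_to("bob")  # not paired, so nothing
```

Default-deny for unknown contacts is the design choice that matters here: Jane's node answers only for people she has explicitly placed in a group.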

The most compelling "what's next for the web" proposal I've seen is Douglas Crockford's Seif Project [1]. Though it has its flaws, I'm willing to give the inventor of JSON the benefit of the doubt.

[1] https://youtu.be/lVezcdfjWis

In Dec 2016, I visited Tim Berners-Lee and his team up at MIT with my cofounder. Our company had a lot of overlap with the Solid project, and we wanted to explore working together.

That didn't really happen, because our approach was different at the end of the day. We wanted to raise VC money and get a lot of user adoption, and they were focusing more on promoting RDF, SPARQL, ontologies and so on:


However, I did meet a lot of cool people at the W3C and now some of them are our advisors!

PS: If you watch that youtube video, let me know what you think. Is it a clear explanation? Do you feel there is a need for this? It's all available online already btw.

I've watched many minutes, but it is too long. A good video, but I couldn't get to the part where you explain the product.

Well crap, he beat me to the punch. I was going to write an article on data ownership and how we should be able to maintain our own data and only let networks curate and syndicate.

You think any app writer would allow you to do that? Just take yourself away from the platform like that?


Bigger and more influential apps will always roll you into their corner. It's what every single vendor would do. It's what you would do.

Yeah I was thinking the same thing. This whole concept is a tin-foil pipe dream. It will never see mainstream adoption because data can be really profitable. What business would choose to eliminate a revenue stream for no reason like this?

Plus, as a user, I don't like analytic data being collected. As a software provider, analytics drive nearly every decision.

I like the spirit of what Solid is trying to do, but as long as targeted advertising works better than randomizing every ad you see (it always will), there's zero reason for any profit-seeking entity to choose it.

https://indieweb.org is more or less about this. May I suggest that instead of writing about it, you implement something that's actually usable for others as well? There has been enough writing, but very few actual solutions.

There have been thousands of such articles over the last 20 years, but nobody ever seems to ship anything useful. Even Solid has been on Hacker News many times.

This is where I think that the RDF/Semantic web part went wrong. That and there were other networks that tried to lock down everything.

Hey, great minds think alike. It's truly a compliment and kind of nice to have TBL on your side!

My idea was that you should manage your own data, and have them tagged with the rights to broadcast or syndicate.

The gist of it would be:

If I write something as a tweet, wordpress would be the first to pick it up and possibly manage it. WP would be more of a CMS for your data. Write a tweet, it can be shown on your blog, and possibly to twitter and/or facebook. This would work for other content types. The value of twitter and fb is the network effect and curation.

I've recently become niggled by a very similar idea of a personal data store with a layer for managing rights of broadcast and syndication.

A simple example is that I would want any food delivery app, or web-connected restaurant to know my dietary data instantly, but not really much else. Any new-media site I connect to I want to instantly know what kind of content I find boring or distracting, but not really anything else.

If you like this idea, maybe you'll like https://remotestorage.io/.

Sounds a bit like IPFS. Which makes me wonder what TBL thinks of it. Has he said anything on the topic?

I mean, IPFS is designed to replace http, which is TBL's baby, but lately he is really into web decentralization, and IPFS is farther along on that than any other technology, from what I understand. Maybe http was a kluge and he would be happy to have it replaced. Maybe Solid could run on top of IPFS, or is that not possible.

Hmm ipfs has many issues for solving that problem, and the whole filecoin nonsense adds an additional layer of problems.

But ipfs is charging forward. Very often the technology that gets adopted is not perfect, but succeeded because it attracted more support than any other one.

And web decentralization is not going to have any real impact unless there is a standard set of technologies that can be the basis for all the main use-cases, and that the average, computer-illiterate user can adopt by clicking a link or two, or better yet comes built into their browser. IPFS is much further along here than anyone else, so it seems like the Solid people should be looking at integrating with it, if that is technologically possible.

Can't speak for the rest of the world, but in the U.S. people would completely panic if the Govt had an extensive registry with everyone's info, and that's exactly what we have, except a couple of corporations have it.

Nobody really cares cause the email is free.

The bright side.. Greed usually tends to overreach at some point, after which may come more public outcry.

> in the U.S. people would completely panic if the Govt had an extensive registry with everyone's info

Not in the same vein as Google having my internet-based data, but the US govt very clearly and publicly has many such extensive registries that are widely known, relied-on parts of the country's basic infrastructure.

How is this different from Kenton's sandstorm?

TL;DR: Solid focuses on storage and data formats, Sandstorm focuses on compute and protocols.

AIUI (disclaimer: haven't spent a huge amount of time studying it), Solid focuses on data and storage, not on compute. It proposes that each user's data should be stored in that user's own storage, under their control, in formats that are standardized so that they are compatible across multiple applications. The applications themselves, though, still execute as they do today, probably on servers owned by the developer. Applications interact with each other by virtue of supporting common formats. A big part of Solid is defining these standardized data formats.

Sandstorm is focused on compute: it says that applications running on your behalf should run on your own server, in isolated sandboxes. An application's storage format is a private implementation detail, and applications never directly access each other's storage. Instead, applications communicate via standardized protocols.

Abstractly, you can think of Sandstorm as object-oriented, in that it combines data and compute into an "object" that implements an "interface".

In my opinion (as the architect of Sandstorm), protocols are the correct place to find interoperability; data is the wrong place. The data format inherently defines the feature set which can be implemented on top of it, and thus forcing apps to use a standard data format tends to prevent them from implementing new features. It is much easier to create protocol compatibility because a complex app with a larger feature set can implement compatibility shims for other protocols using code.

Also, of course, Sandstorm's sandboxing model has huge security benefits. Solid doesn't seem to provide any security benefits since the apps still run remotely on the developers' servers where they could do anything they want with your data.
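As a rough illustration of the compatibility-shim point above: an app keeps its own private storage format and exposes a simpler, agreed-upon interface through a small adapter. The names here (`SimpleNotes`, `RichNotebook`, `SimpleNotesShim`) are hypothetical and are not Sandstorm's actual API.

```python
from typing import Protocol


class SimpleNotes(Protocol):
    """Hypothetical minimal protocol that several apps agree to speak."""

    def list_notes(self) -> list[str]: ...


class RichNotebook:
    """App with a private storage format and more features than the protocol."""

    def __init__(self) -> None:
        self._pages: list[tuple[str, str, set[str]]] = []  # (title, body, tags)

    def add_page(self, title: str, body: str, tags: set[str]) -> None:
        self._pages.append((title, body, tags))


class SimpleNotesShim:
    """Adapter: presents a RichNotebook through the simpler protocol."""

    def __init__(self, notebook: RichNotebook) -> None:
        self._notebook = notebook

    def list_notes(self) -> list[str]:
        # Tags have no equivalent in the simple protocol, so they are dropped.
        return [f"{title}: {body}" for title, body, _tags in self._notebook._pages]


notebook = RichNotebook()
notebook.add_page("Groceries", "milk, eggs", {"home"})
shim: SimpleNotes = SimpleNotesShim(notebook)
```

The notebook can add features (tags here) without changing what the shim exposes, which is the flexibility the comment argues a shared data format would take away.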

I couldn't find a link to the GitHub on the page, but that appears to have a lot of useful info as well.


"This site can’t be reached"


The phrase World Wide Web is great because while it remained ambiguous it still perfectly summarized the ambitious nature of the project. Also you can’t beat that alliteration.

Calling this project “Solid” reeks of “unsinkable ship.”

I am optimistic that something like this could work. I know there will never be "one true database schema" that everyone agrees on, but I don't think there has to be.

Does a site that sells widgets really have nothing in common with a site that sells gizmos? Surely that's not right.

So the question is: do they share most of an application, or a protocol, or data? I think data is the best answer here. It requires more coordination, but the ultimate benefits are much greater, both to users and to new businesses trying to innovate (less friction to bring in new customers).

There doesn't have to be one database schema, or even one data interchange format. There only have to be agreed-upon semantics that can be mapped onto different data interchange formats. Here is where linked data/semantic web comes in. For instance, JSON-LD allows you to use JSON as you normally would, but add a schema that allows others to know what all your fields actually mean.
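A minimal sketch of that idea: two apps use different field names, but each ships an `@context` that maps its keys to the same vocabulary term. The `expand_keys` helper is a toy written for this example and handles only a flat context; real JSON-LD expansion does far more. The schema.org IRIs are real terms, the records are invented.

```python
import json

# App 1 calls the field "name"; app 2 calls it "fullName".
doc_a = {
    "@context": {
        "name": "http://schema.org/name",
        "homepage": "http://schema.org/url",
    },
    "name": "Alice",
    "homepage": "https://alice.example/",
}
doc_b = {
    "@context": {"fullName": "http://schema.org/name"},
    "fullName": "Alice",
}


def expand_keys(document: dict) -> dict:
    """Replace short keys with the IRIs from a flat @context (toy version)."""
    context = document.get("@context", {})
    return {
        context.get(key, key): value
        for key, value in document.items()
        if key != "@context"
    }


expanded_a = expand_keys(doc_a)
expanded_b = expand_keys(doc_b)
print(json.dumps(expanded_a, indent=2))
```

After expansion both records carry the key `http://schema.org/name`, so a third app can match them without knowing either app's private field names.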

Radical change often happens unnoticed, it doesn't come pre-announced.

I'm not sure if this can really help users protect their data. Tons of personal data is already out there, and companies that want it can buy it fairly readily and cheaply. Users have shown a willingness to exchange personal data for useful applications. Tons of data isn't even individual; who owns your personal social graph?

PII is the most circulated currency on the internet and anything that stands between companies and their cashflow (privacy/data ownership projects like this) seems like it would be a nonstarter.

I don't think those are invalid points, but one consideration is that if a standardized format for encapsulating and controlling PII did emerge and was adopted, the value of the previously-released personal data would decay over time.

That is, if your current PII suggests that you are a fan of (Brand X) right now, it is more valuable than if it shows you were a fan of (Brand X) 5, 10, 20 years ago. There will still be correlations that can be drawn from historical data, and some data's value doesn't decay as a function of time (DOB, for example), but it would still be an incremental improvement over the current system.

As technologists, I don't think throwing up our hands and saying "It's too [late|difficult|expensive] to solve this problem!" is the right solution in most cases.

Furthermore, unsanctioned data would lose its value even quicker in the face of systematic data poisoning.

Such a campaign against unsanctioned data would be a likely consequence of the success of Solid or a similar protocol.

Not immediately, as others have pointed out, but those who develop products that incorporate this approach, and those who embrace it swiftly, will provide us choices.

For example, if "MeWe" were to incorporate this I think they would grow faster and at some point put enough pressure on Facebook to do the same.

The landing page is horrible.

It seems like a scam.

There's spelling mistakes everywhere, and the top navigation doesn't appear to work. How are we supposed to take this seriously?

Because the inventor of the web helped make it.

Perhaps their server is overloaded, the page is not loading for me. Surprising considering the mit.edu domain.

This sounds interesting, I will check back. I would love to own my own social media profile, where I get to install it on any server or service I wish to, and have full control over it. I'm guessing that's what this is.

The problem is that people should agree beforehand on the data format and schema for every new app. That's impossible.

Actually, any interchangeability is impossible. Apps are different, thus they have different data schemas. If they had the same schema they would be the same app, just with a different color.

Yes and no. Think of something like a travel reservation. The schema for travel reservations, I argue, has remained unchanged for centuries, from horse-drawn carriages, to passenger ships, to trains, to buses, to planes: departure location, departure date-time, destination, arrival date-time, etc. Thousands of firms have come and gone offering similar services, all operating under the same basic schema. Or take the modern-day example of social media feeds: Twitter, Facebook, Instagram, Tumblr all share at their base the same schema. Maybe they add a proprietary concept here or there.

Indeed, I believe this was conceived with Instagram/Facebook/Twitter/Tumblr/Mastodon/all open source copies in mind, but this is the SINGLE USE CASE of the entire standard. And it makes it worse, I think. I don't want all "social networks" to be the same like they are today; there are other alternatives.

About reservations, I don't see why you would want to move your data. Reservations are tied to companies in a way no standard can solve. And by design.

> About reservations, I don't see why you would want to move your data. Reservations are tied to companies in a way no standard can solve. And by design.

Maybe company A has a history of all your travel reservations with them, and company B has a history of all your travel reservations with them. Then company C offers a service where they collect travel reservations from various services to create a business expense report. You, the supposed owner of the data, would like to authorize C to import your data from A and B.
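Sketched in code, assuming A and B both export the basic reservation schema mentioned upthread (departure location and time, destination, arrival time). The `Reservation` class, the `price_cents` field, and the sample trips are all invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Reservation:
    provider: str
    origin: str
    departure: datetime
    destination: str
    arrival: datetime
    price_cents: int  # assumed extra field, needed for an expense report


def expense_report(*histories: list[Reservation]) -> dict:
    """Company C's job: merge histories from several providers into one report."""
    trips = sorted(
        (r for history in histories for r in history),
        key=lambda r: r.departure,
    )
    return {"trips": trips, "total_cents": sum(r.price_cents for r in trips)}


# Exports the user authorized from company A and company B (sample data).
from_a = [Reservation("A", "Boston", datetime(2018, 2, 1, 9, 0),
                      "New York", datetime(2018, 2, 1, 13, 0), 12000)]
from_b = [Reservation("B", "New York", datetime(2018, 1, 15, 8, 0),
                      "Chicago", datetime(2018, 1, 15, 10, 30), 8050)]

report = expense_report(from_a, from_b)
```

Because both exports share one schema, C needs no per-company import code; the merge is a sort and a sum.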

In fact there are other use cases, for example: Trello and its copies. In this case no standard could have predicted that Trello would arise, thus there would be no standard for it.

There are probably many other examples of something that came unexpectedly from nowhere and suddenly there were copies.

Sure, but once a de facto standard schema exists between these services, it is trivial to formalize that standard and implement it in the various services. Of course there is still the business problem where no one is willing to expend any resources to make it easier for users to migrate.

This is a solvable problem but you have to solve it with markets not with standards committees.

Markets are there, right? Why haven't they solved it yet?

The site seems to be completely down? I ran a few external checks and it is completely inaccessible right now.

I heard about this from a CNI keynote https://youtu.be/o4nUe-6Ln-8 -- seems like linked data's take on decentralized web / diy facebook replacement.

I get so angry about this shit. It's not possible to make a landing page, and actually describe what's going on.

Why do I have to search for information?

Why is it, that all those shitty, shiny looking new homepages are the same. NO CONTENT.

Ok, I'm done. Not interested anymore.

There's a really simple solution that everyone is overlooking. Stop trying to make applications in browser and actually make applications that run native on an OS. And no, I don't mean running on some trimmed down JS engine.

in a world of unlimited resources... sure. However, I, as an independent software maker, want to target as many screens as possible with the least amount of friction. I am not overlooking anything nor are the companies and other indies that target the web... that is called choosing pragmatically.

Yup. Now all your client's data can be stolen at once, it isn't protected from government intrusion due to the third-party doctrine, and network problems cause application problems. All these things make it worse for the end user, and more.

But it's easy for devs so none of it matters.

This is incredibly eerie, I literally just started working on this idea this weekend https://github.com/byod/byod-home

I had a similar idea way back when:


It's likely a pretty common thought process with open source internet devs.

My impression is that this is basically Urbit. Am I wrong?

Cool project. Thank you for sharing this here.

Goals sound similar to https://holochain.org

<3 holochain

Does anyone else find it slightly awkward that they include past contributors on their website?

> autehntication ... autentication ...

That’s why you don’t roll your own authentication.

Tim Berners-Lee as a web developer! THE web developer is more accurate actually

It sounds like a real world blockchain project.

I thought it was actually a blockchain application in the real world

I can't figure out what this is from the about page...

>> The project aims to radically change the way Web applications work today, resulting in true data ownership as well as improved privacy.

What is the product? How does it achieve its means?


Please don't post unsubstantive comments to HN, and especially not ones tinged with personal attack.



it's not entirely false though, he has expressed regret in the past about how it worked out. the web isn't really a great idea but it caught on anyway. arguably it's been the cause of new scams, hacks and certain politicians getting into office.


i do remember alan kay not being too happy about the whole situation too


Lots of people and organizations, myself included, built "web browsers" in the late '80s. Unfortunately, it was the design of TBL that caught on.

What was your web browser?

What were the features of your, and/or others' web browsers, that distinguished them from TBL's work?

Mind: there were several other alternatives out there, with Viola being amongst the more interesting -- it aimed at becoming a suite of web-related tools and capabilities, based on what I've seen.


I am referring to things that had no name and never escaped the confines of the corporate environment where they were built. Mine was for browsing nuclear power plants. Had full vector graphics rendering engine and a limited but capable scripting engine.
