A self-authenticating social protocol (blueskyweb.xyz)
87 points by wmf on April 6, 2022 | 56 comments



This early on, rather than a focus on technical problems, I would have liked to see the essential question addressed first: how Bluesky will deal with the issue that moved us from protocols to platforms to begin with.

The reason the internet is now increasingly in vertically integrated platforms is governance. Companies can innovate quickly. The question any protocol driven approach must answer is how it is going to develop and push its protocol forward in the face of dynamic competition from proprietary systems.

The reason why we're all now on Slack or Discord or Signal and WhatsApp rather than on IRC or ActivityPub social networks is because firms, that is to say top-down organizations can push features quickly. Decentralization comes at very high costs. If a protocol is too slow in offering more features, what will naturally happen is that closed systems will come, fracture it and eat its lunch. This is exactly what already happened once. And given that it already happened, how that can be avoided ought to be answered first.


Yeah, this is a really interesting question. It's the point Moxie brings up when people ask why Signal isn't federated or otherwise decentralized. You basically have to solve multi-stakeholder collaboration. You want projects/companies to have the flexibility to add features, but you also don't want subtle incompatibilities to plague the system like "I can't see the picture on your tweet because you use the MediaEmbeds schema and I used the EmbeddedMedias schema." This is the topic I get nerd-sniped about a lot.

There are a couple of ideas I want to try. One is machine-readable schemas that get published on the network. That's kind of conceptually similar to what smart contracts represent; there's a program that essentially "lives on the network" and which people can build against, reference by hash/pubkey, etc. It's a coordination primitive. I like how intuitive that feels.
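
A minimal sketch of what "reference a schema by hash" could look like, assuming schemas are plain JSON documents. The `schema_id` helper, the `sha256:` prefix and the field names are all made up for illustration; nothing here is taken from Bluesky's actual design.

```python
import hashlib
import json


def schema_id(schema: dict) -> str:
    """Return a content hash that identifies a schema regardless of who hosts it."""
    # Canonical form: sorted keys and no extra whitespace, so that
    # semantically equal schemas serialize (and hash) identically.
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical schema; the field names are invented for this example.
media_embeds = {
    "name": "MediaEmbeds",
    "fields": {"url": "string", "alt": "string"},
}

# The same content written with a different key order.
reordered = {
    "fields": {"alt": "string", "url": "string"},
    "name": "MediaEmbeds",
}

# Both spellings resolve to the same identifier.
print(schema_id(media_embeds) == schema_id(reordered))
```

The point of the canonical serialization is that two parties who agree on the schema content also agree on its identifier without asking a registry.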

Another idea is schema negotiation, essentially metadata that says "this post uses MediaEmbeds, here's what needs to happen if you don't support it."
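
One way to picture that negotiation metadata, as a rough sketch only: each post names its schema and carries a fallback instruction for clients that don't support it. The `Post` type, the `fallback` values and the `SUPPORTED` set are hypothetical names, not part of any published spec.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    text: str
    schema: str                        # schema used by the extra payload
    payload: dict = field(default_factory=dict)
    fallback: str = "show_text_only"   # what to do if the schema is unsupported


# Schemas this hypothetical client understands.
SUPPORTED = {"PlainText", "EmbeddedMedias"}


def render(post: Post) -> str:
    if post.schema in SUPPORTED:
        return f"{post.text} [rendered with {post.schema}]"
    if post.fallback == "show_link" and "url" in post.payload:
        # The author told us how to degrade: show the raw link.
        return f"{post.text} (attachment: {post.payload['url']})"
    # Default fallback: degrade to the text instead of failing.
    return post.text


post = Post(
    text="Look at my cat",
    schema="MediaEmbeds",
    payload={"url": "https://example.com/cat.png"},
    fallback="show_link",
)
print(render(post))
```

A client that only knows EmbeddedMedias still shows something useful for a MediaEmbeds post, which is the incompatibility described above.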

A third idea is using graphs of fact-triples like RDF, which gives you a lot of flexibility but I worry may not solve coordination well enough.

And a fourth idea is to use lenses like Ink and Switch talked about with Project Cambria.

I think this is a really interesting topic, I hope we find some good answers.


Talking about IRC in particular, the reason I'm on Discord is:

(a) UX far beyond anything I've seen with IRC

(b) super easy for people to create new Discord "servers"

(c) super easy to discover or link to Discord servers

Talking about social microblogging/commenting (aka Twitter), the reasons I joined then later exited the ActivityPub "fediverse":

(a) Twitter's discovery is massively better

(b) My "identity" was not portable between instances (that's different from user on Network A interacting with user on Network B), also no easy way to link multiple instance identities

If bluesky can solve all these identity/discovery/onboarding UX issues in the context of an open protocol, it'll be a massive step forward.


> Twitter's discovery is massively better

Do you mean recommendation algorithms, or just the number of users, i.e. network effects?

> My "identity" was not portable between instances

Your identity isn't portable between email servers either. Or between HN and Reddit. Isn't that part of what Keybase is about?

I can imagine a couple of ways of achieving this, but it runs into a number of thorny issues regarding namespaces. Regardless of the approach taken, truly portable identities would require lots of refactoring of lots of software in a FOSS ecosystem with fairly limited developer resources available.

Merely linking identities should be much easier, but each piece of software still has to support it. Simpler just to link to your other accounts in your bio, the same as on any other website.


> Isn't that part of what Keybase is about?

was


> My "identity" was not portable between instances (that's different from user on Network A interacting with user on Network B), also no easy way to link multiple instance identities

That is by design. It's a federated system! Just like you don't need multiple phone numbers to talk with people in different networks, or different email addresses depending on the recipient, you don't need different identities to communicate with different servers.

Unless you want to keep your identities separate, there is no point in having multiple accounts.


It's not about having multiple accounts, it's about the fact that federation is a bit meaningless when only a megacorp can run a server.

Using a small server is just asking to lose everything when it goes away, and small things come and go all the time. Porting one account to a new server (Or better yet not tightly binding it in the first place) is essential.


> Or better yet not tightly binding it in the first place

Yeah, I agree. As I repeated elsewhere on this thread: having your own domain is the most straightforward way to guarantee that you can port your identity.

But this is a separate problem from this idea of multiple "accounts". This mostly comes from Mastodon's weird culture, where people make this association between themselves and the in-groups from the instances. I wrote about that on my blog [0].

[0] https://raphael.lullis.net/federations-and-identity/


> Unless you want to keep your identities separate, there is no point in having multiple accounts.

Until an instance goes down either temporarily or permanently and it was your only account. Then you'll wish you'd had a few extras. How many email addresses do you have? From how many independent providers?

Accounts aren't portable but there's no inherent reason that accounts need to map 1:1 to identities. In fact they already don't, a single identity already commonly has multiple accounts. The software just isn't aware of this fact and has no standardized way to communicate it.


> Until an instance goes down either temporarily or permanently and it was your only account.

That's a technical problem, but not a UX or system design one. If a server goes down, my backup account is only going to allow me to keep participating, but it is not going to save the conversations I had on the lost one.

> Accounts aren't portable but there's no inherent reason that accounts need to map 1:1 to identities.

Again, you are talking about identity portability, which is not the same as "creating one different identity for each different server".

In any case, I totally understand the use case and I tried to get the Pleroma devs to focus on this idea as well: https://mastodon.communick.com/@raphael/106825107786781891


> my backup account is only going to allow me to keep participating, but it is not going to save the conversations I had on the lost one.

The same issue exists for other services including email, irc, and matrix. The obvious simple and immediate solution is to retain copies of things you care about locally instead of expecting a particular service to stick around forever. With email this used to be standard practice before the rise of webmail. Even now most desktop clients sync to and retain a copy of everything on the local device.

> > Accounts aren't portable but there's no inherent reason that accounts need to map 1:1 to identities.

> Again, you are talking about identity portability, which is not the same as "creating one different identity for each different server".

It seems like it is though? Servers need a way to handle authorization so they presumably need accounts. With a simple protocol, accounts map 1:1 to identities as far as the software and protocol are concerned. If you make a new account then you also make a new "identity" from the perspective of the software. The obvious way to solve that is to map any number of accounts to one (or possibly multiple) identities, where an account is the thing the server is concerned with and an identity is something else such as a private key.
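
Roughly what I mean, as a sketch under assumptions: an identity is a public key, an account is a server-local handle, and many accounts can point at one identity. The `Identity`, `Account` and `Directory` names are hypothetical, and a real system would require each account to prove control of the key (for example with a signature), which is left out here.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Account:
    handle: str  # server-local name, e.g. "alice@server-a.example"


@dataclass
class Identity:
    public_key: bytes
    accounts: set = field(default_factory=set)

    @property
    def fingerprint(self) -> str:
        # Short, human-comparable digest of the key (length is arbitrary).
        return hashlib.sha256(self.public_key).hexdigest()[:16]


class Directory:
    """Maps any number of accounts to the identity that backs them."""

    def __init__(self):
        self._by_account = {}

    def link(self, identity: Identity, account: Account) -> None:
        # Assumption: proof that the account controls the key was checked elsewhere.
        identity.accounts.add(account)
        self._by_account[account] = identity

    def same_person(self, a: Account, b: Account) -> bool:
        ia, ib = self._by_account.get(a), self._by_account.get(b)
        return ia is not None and ia is ib
```

With this split, losing one server costs you an account, not the identity.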

There's an additional question about the mechanisms for account creation and authentication. Maybe you don't want to maintain an email and password based login for a long list of accounts that are each tied to the same backing identity. There are multiple approaches available there as well, from oauth providers to public key based authentication possibly including things like cross signing.


> The same issue exists for other services including email, irc, and matrix (...) expecting a particular service to stick around forever.

It doesn't have to be like this. This is the point where identity portability comes into play. If the identity belongs to the user and they carry to different servers, the user is independent from the provider.

In the case of email/irc/matrix, as long as you own the domain name, you can change the service provider whenever you want.


You seem to be describing hosting your own service, changing the underlying hosting provider, and tying your identity to ICANN DNS.

There are two separate things here, identity portability and data portability. If your identity is portable (say a key pair) but the service hosting the data goes down and you don't have a backup or there's no way to import the data to a new provider then you've got a problem. Similarly, if the service goes down and you do have a backup of your data and a way to import it to a new provider but your identity isn't portable (perhaps it was user@domain and you don't own the domain) you also have a problem in that case but it's a different problem.

And neither data nor identity is the same as account, at least the way I was using the words.


> You seem to be describing hosting your own service

No, it still gets to be hosted by someone else. And just because I am using a custom domain does not mean it requires a separate server. "Shared hosting" has been a thing for web and email forever, and there is nothing stopping* a Matrix or an ActivityPub server from processing requests for different users on different domains.

> tying your identity to ICANN DNS.

Not necessarily. If you see the linked conversation I had with Alex, it was about using Ethereum ENS. It could also be something based on OpenID. The important thing is that the actorid in activitypub can be anything that can be authenticated by the server, it doesn't have to be only the internal username.

* There is currently a limitation in Matrix Synapse (the homeserver can serve only one domain), but I don't see anything in the protocol that forbids the implementation of a multi-tenant server. Perhaps it is not a top priority because such a server would incentivize big hosting companies to offer it as a service and possibly re-centralize it.


Most federated services DO heavily focus on identity as username@server though, and don't have any real account sync.


This is half-true. For plenty of services it is easy to add "custom domain" support, and then your identity is your domain.


Fair point. But OTOH Twitter isn’t really innovating anymore either. Arguably the “move fast and break things” phase is over. Now we’re heading into the “monetize or commoditize” phase.


The wireless telecom industry does this on the regular. It requires tons of collaboration, tons of design & specification, tons of testing & verification, etc. etc. With all that, I suspect the reason why Bespoke Walled Garden 5G Wireless™ doesn't become a regular occurrence is because the barrier to entry in both buying spectrum and rolling out infrastructure is so incredibly high.

I have no idea what if any lessons there are to learn from that, but it is at least an example of where an entire multi-vendor, multi-layer industry migrates forward more akin to the "protocol" model mostly all together.


> because firms, that is to say top-down organizations can push features quickly

I'm not convinced that's the case. I think it's more an issue of resource investment. Firms are capable of funding things. Twitter at a low level barely has more "features" than email if you stop and list off what you actually need in order to do what it does. But it has reams of UX stuff built on top of the base functionality, recommendation algorithms, moderation tools, lots of servers, technicians to maintain those servers, etc, etc.

FOSS stuff is continually at a resource disadvantage in most domains.


Decentralization can push features very fast, and support all kinds of extensions and variants that spread and become standards.

ActivityPub and all the rest are either not decentralized, or they are blockchainful with all the associated disadvantages. Namely, it doesn't work on isolated LANs, throwing away one of the coolest killer apps of P2P. Federated protocols often tie your identity to some random hobby scale project server.

What stops P2P from adding features is just the fact that it's a lot of work and nobody seems all that interested, and most P2P projects are held back by a very narrowly defined scope, whereas companies like Facebook go for the kitchen sink.

The other thing that slows dev is multiple unrelated apps. If you control everything (which is totally possible in FOSS, if one group is the de facto leader that most people go with), you can update as fast as you want, as long as your protocols have primitives for arbitrary message types.

If Tox or Jami had a million dollars, or Syncthing were interested in expanding scope, or a few dozen people wanted to make a social network on BitTorrent, they could do just fine, but everyone wants to build yet another Ethereum token, or doesn't have the resources to support all platforms and continually add features.


All of these problems go away the moment that we stop accepting non-free software.

If we as consumers start saying "no, even if your product is better I will not sacrifice my long-term freedoms for the short-term advantage and convenience of your product", companies would still innovate, but they would be playing on a more level field.


How about free software creators start innovating first, instead of insisting people should simply be willing to put up with poor quality because "freedom?"

This way of thinking is the reason Gimp will never be a viable competitor to Photoshop. Meanwhile, Blender is actually used by professionals because its quality makes it worth the effort.


Who said anything about "putting up with poor quality"?

> This way of thinking is the reason Gimp will never be a viable competitor to Photoshop.

Have you contributed to Gimp in any way? How do you expect a barely funded project to compete with a multi-billion dollar company?


> Who said anything about "putting up with poor quality"?

you did...

>> If we as consumers start saying "no, even if your product is better I will not sacrifice my long-term freedoms for the short-term advantage and convenience of your product (...)"

>Have you contributed to Gimp in any way?

No. I don't have the time or the specific domain knowledge, and as a tool I don't find it useful enough to use. Have you? I didn't see a name on the contributors' page[0] that resembles your username.

>How do you expect a barely funded project to compete with a multi-billion dollar company?

I don't know. I do know that isn't an argument that's going to convince anyone using Photoshop to switch over to Gimp. Maybe one of the millionaires that hang out here can do it.

[0]https://www.gimp.org/about/authors.html


No, I did not. Making a priority to have free software in my stack is not the same as accepting anything "just because it is free".

> I do know that isn't an argument that's going to convince anyone using Photoshop to switch over to Gimp.

Gimp vs Photoshop is a somewhat bad example to the topic at hand (closed systems taking over free software that implement open protocols), so if you don't mind maybe we can go back to those?

In this case, I am a backer of Mastodon. I also contributed to Pleroma and to the developers of Conversations (XMPP client). Nothing much (~100€/year), but I feel a lot better contributing to them than I would ever feel paying for Slack or Discord.

The point is, it doesn't take a millionaire to do that. If companies took 5-10% of their IT budget to contribute to free alternatives of the closed software they use, perhaps they would quickly lessen their dependency on the closed software and free their resources.


FOSS was innovating a lot, and then it took a major hit when people decided code and features were evil and they didn't want to risk any extra attack surface, and then all the P2P types seemingly mostly moved to blockchain dev.

Plus, I suppose most P2P was piracy driven and piracy is no longer an everyday conversation topic, or something a lot of people would even think of.

That might also have something to do with why the privacy conversation moved away from things like neutrality and right to encrypt, to focusing almost entirely on extreme hate for telemetry.


The linked ecosystem overview[1] is a great exploration of the space. However, there's no mention of IndieAuth[2] in the exploration of decentralized and self-sovereign identity. I'd love to better understand why it's been left out because it seems really appealing in my own explorations of the space.

[1] https://gitlab.com/bluesky-community1/decentralized-ecosyste...

[2] https://indieauth.net/


Strong agree. It would even allow for more seamless transitioning between silos and sovereign identities.


Forgive my cynicism, but it's been two years already since bluesky was announced and all we see is the occasional update from what seems a research project. Nothing really tangible is released, no extension proposal to the existing protocols is made, not one single proof-of-concept that shows "look, protocol X can not do Y, so we implemented Z".

Every update from them feels more and more like an elaborate PR piece. Just some hot air to keep the illusion that Jack really cares about decentralization.


Two years were spent talking among projects in the community. You're seeing stuff now because we formed a company at the beginning of this year and started building. (We explain in this post why none of the existing projects were picked and why we decided to build new.) Backstory here: https://blueskyweb.xyz/blog/2-28-2022-how-it-started


No, you do not really explain.

"None of them fully met the goals" is not an explanation, it's an excuse. Which goals are missing? What is it about each of these protocols that is falling short? Can they be remedied, or are there inherent limitations in all of them that warrant working on something completely new?


The "Our Team" link near the end points to https://blueskyweb.org/blog/2-31-2022-initial-bluesky-team

Take a close look at that URL, there's a terrible bug in the date handling of whatever CMS they are using.


There's no mention here of moderation, a key ingredient to any social network of people who lack real-world ties to each other.


It is, and we're talking with people who do trust & safety at Twitter to make sure we're approaching it right. To be clear, we're really in the early stages of the project. This post is just the stuff we have a clear concept about, and even that's just an early prototype (which we'll put on github soon). I could talk about some of the approaches we're looking at for moderation but it's just really early, and I don't want to talk out the wrong end. All I can say is, we talk about it a lot.


It's hard to take seriously a project that calls for decentralization while using Twitter's trust and safety people as a source. The company that killed a relevant true story during a presidential election should disqualify them. Unless the advice is used as a template for what not to do.


One controversial decision does not define that team, which spends the vast majority of its time dealing with uncontroversially awful content.


I often discover in hindsight that what I thought was the core mechanic was really not in the top 3.

Distributing content is number one from the user's perspective. It is what we think we want. What we really want is interesting content and interesting dialog. Getting rid of garbage should be a side effect of that, something you get for free along with it. I imagine it is easy to make the mistake of aiming to get rid of trash and putting much effort towards that, while really the effort should be zero. One might even choose to tolerate a subset of trash if the author performed sufficiently.


I really appreciate that you're at least thinking about it. Community is a bit like security: you can't just layer it on after building the "cool parts" of your system. The name "trust & safety" makes the parallel even more clear.

It's harder to set rigorous definitions for community than it is for security. So it will likely change as the project evolves. But if you ignore it entirely, and just hope it will work itself out, you'll be in a continuous state of crisis-fighting.


Could you also talk to the web team about re-adding the vertical scrollbar? It's an important UI element.


I'd go further and say that our toxic social media environments are a far more damaging, dangerous, and unsolved problem - possibly more than literally any problem in the world, IMHO - than decentralized communication, and this project addresses the latter.

And I strongly believe in decentralized social media, far prefer Usenet (RIP) to Twitter, and have from the start. But it's a solvable problem (Signal is working toward it, Matrix, existing solutions like Mastodon, etc.), they just need to attract development investment and, most of all, the hardest part, attract users.

And Twitter seems to have bought into the toxicity, bringing one of the few leading popularizers and users of that subculture - trolling, mis- and disinformation, manipulation, corruption of power, etc. - onto their board. And rather than decentralizing power in our society, they've concentrated more of it in the richest person in the world, who already adds to that a cult of personality, and now has significant influence over one of the largest public communication forums.


I don't know anything about bluesky, but I applaud any initiative that could provide a path away from the highly centralized web we find ourselves in today.

Also excited to see verifiable computation as an underpinning of this particular initiative. It's a fun topic and technology I haven't seen too broadly explored.


This rambling post from a web company I've never heard of touches on some interesting theories and attractive buzzwords, but its overall main point is about as indecipherable as if the post had been autogenerated by GPT-3 using the headline as the prompt. My unanswered questions are: who is Bluesky? What is their relationship with Twitter? Are they developing a tangible protocol that has concrete specifications, a roadmap, and path to adoption? Or is this just a rambling blog post about some ideas that could be applied to a social protocol?


I believe it's part of (or at least funded by) Twitter: https://en.wikipedia.org/wiki/Bluesky_(protocol)


It is/was a project of Jack Dorsey, Twitter CEO

https://www.cnbc.com/2019/12/11/twitter-ceo-jack-dorsey-anno...


Hello Fellow Internet User, In general, it is your own responsibility to keep yourself informed when commenting on something. The World Wide Web makes this easy with something called a 'hyperlink', for example, the one at the top of the article you read that says 'blog': https://blueskyweb.xyz/blog


The parent comment, not the GP, is against HN guidelines. Just downvote the GP and demonstrate appropriate communication.


The responsibility to communicate well does not fall on the reader. This blog post did not communicate well.


I believe platforms will always exist but imagine data will be more portable in the future. I think the focus should be on data.

I propose the Newton Web Protocol.

For each action that a user makes, an equal action is pushed back for the user to consume.

When a user interacts with a website (comment, like, save, upload, etc.) the website pushes data/an event which can be consumed only by the author. When consumed the data/event could be processed: saved or replicated to other sites or the authors own site.
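
A toy sketch of that idea, assuming a per-author outbox on each site. `Site`, `act` and `consume` are hypothetical names, and the `requester` check stands in for real authentication, which this sketch does not implement.

```python
from collections import defaultdict, deque


class Site:
    def __init__(self, name: str):
        self.name = name
        self._outbox = defaultdict(deque)  # author -> pending events

    def act(self, author: str, action: str, data: dict) -> None:
        # The site handles the action, then queues an equal event for the author.
        event = {"site": self.name, "action": action, "data": data}
        self._outbox[author].append(event)

    def consume(self, author: str, requester: str) -> list:
        # Only the author may drain their own events.
        if requester != author:
            raise PermissionError("events are readable only by their author")
        events = list(self._outbox[author])
        self._outbox[author].clear()
        return events


site = Site("example-forum")
site.act("alice", "comment", {"text": "hi"})
site.act("alice", "like", {"target": "post-1"})
for event in site.consume("alice", requester="alice"):
    print(event["action"], event["data"])
```

What the author does with the drained events (save them, replicate them to their own site) is outside the sketch.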


For those unaware, it is/was a project of Twitter's Jack Dorsey. You can find many news stories covering it from when it was announced. Here's one:

https://www.cnbc.com/2019/12/11/twitter-ceo-jack-dorsey-anno...


How would bluesky address the “political decentralization” aspect that is present in some blockchains? i.e. ensuring the protocol ends up being widely distributed rather than hosted and managed by a centralized few. Cryptocurrencies seem to address this by creating an incentive for any user to validate state and secure the public proofs (PoW/PoS), but I’m interested to hear how bluesky would approach this?


I don't think you need to. With cryptocurrencies there's a need to obtain global consensus regarding where a coin is transferred from and to in order to prevent double spend. With communications you only need the ability to cryptographically verify the authenticity of the data and who published it. There's no double spend problem.
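
To make "verify the data and who published it" concrete, here is a small sketch using Lamport one-time signatures. That scheme is chosen only because it can be written with Python's standard library (which has no Ed25519 or similar); it is an illustrative stand-in, not what Bluesky or any of the projects above use, and each key here may safely sign only one message.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _bits(message: bytes):
    # Yield the 256 bits of the message digest, most significant bit first.
    for byte in _h(message):
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def keygen():
    # Private key: 256 pairs of random secrets. Public key: their hashes.
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def sign(private, message: bytes):
    # Reveal one secret per digest bit. Never reuse a key for a second message.
    return [pair[bit] for pair, bit in zip(private, _bits(message))]


def verify(public, message: bytes, signature) -> bool:
    if len(signature) != 256:
        return False
    # Each revealed secret must hash to the public value selected by that bit.
    return all(
        _h(revealed) == pair[bit]
        for revealed, pair, bit in zip(signature, public, _bits(message))
    )


private, public = keygen()
post = b"post: hello world"
signature = sign(private, post)
print(verify(public, post, signature))
print(verify(public, b"post: hello w0rld", signature))
```

Anyone holding the public key can run `verify` on a copy fetched from any host, so the host does not need to be trusted for authenticity and no global consensus is involved.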

Looking at email, there is an issue with people tending towards a few large providers and with some providers refusing to federate with other ones for various reasons. The former probably isn't possible to solve directly via technical means. The latter is addressed by the described "self-authenticating" aspect. Peer to peer data transfer coupled with cryptographically based identities means an identity isn't tied to a particular provider.


Who would host the data? BitTorrent protocol has a notorious “seeder/leecher” problem. And how would that hosting mechanism be resistant to collusion and manipulation?

For example, a signature or zero knowledge proof is published onto a social network by the president’s cryptographic identity, how can we (the people viewing the network) verify that the data we see on the network is in fact published by the president, and not a malicious actor who is hosting their own node? At a scale of just one individual, we could easily establish “Person A = public key hash X” and verify that, but at a scale of millions of users it seems you will need some sort of decentralized state/data tracking (like a distributed blockchain ledger).


> Who would host the data?

Anyone who wanted to. If you wanted to make sure your personal data remained available when you turned your computer off, or that it could be downloaded quickly, then you'd need to arrange for paid hosting. Pick up a cheap VPS and share it with your family and friends. Whatever.

> how would that hosting mechanism be resistant to collusion and manipulation

It doesn't need to be, that's the point of using strong cryptography.

> it seems you will need some sort of decentralized state/data tracking (like a distributed blockchain ledger)

No, you just need to keep track of keys. Your identity is your key. We already do this. Certificate authorities, ssh keys, etc. The issue you're describing is one of using a centralized authority to solve the problem which is hardly the only solution and certainly doesn't require a blockchain.
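
A sketch of the simplest form of "keeping track of keys": remember the first key seen for a name and flag any later change, similar in spirit to how SSH clients remember host keys. The `KeyStore` class and its return strings are hypothetical, and this is only one of several possible trust models.

```python
import hashlib


class KeyStore:
    """Remember the first key seen for each name and flag later changes."""

    def __init__(self):
        self._pins = {}  # name -> fingerprint of the pinned key

    @staticmethod
    def fingerprint(public_key: bytes) -> str:
        return hashlib.sha256(public_key).hexdigest()

    def check(self, name: str, public_key: bytes) -> str:
        fp = self.fingerprint(public_key)
        pinned = self._pins.get(name)
        if pinned is None:
            # First contact: trust and remember this key.
            self._pins[name] = fp
            return "pinned"
        return "match" if pinned == fp else "MISMATCH"


store = KeyStore()
print(store.check("president", b"key-A"))
print(store.check("president", b"key-A"))
print(store.check("president", b"key-B"))
```

A malicious host serving a different key for a known name shows up as a mismatch on the client, with no ledger involved.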


Presumably Bluesky's social media protocol will need to operate on some data/ledger structure, to maintain a large social graph (follower/follows), user records (public key hashes, posts, etc). If too few nodes are hosting this data, it would be malleable to manipulation by the hosts.

The same occurs in blockchains that are not widely distributed enough: a blockchain with 5 validators (hosts) is more susceptible to manipulation (i.e. hosts changing records/data for their own gain) than a blockchain with 50K validators, because the cost to control the network becomes exponentially expensive. Strong cryptography has nothing to do with this, and does not prevent this kind of manipulation. Imagine a sort of 51% attack on Bluesky's social graph, in order to manipulate a user's associated public key hash or alter the social graph for some nefarious reason.

A blockchain ledger probably is not the only solution, but it does create a strong financial barrier to manipulation (i.e. 51% attacking a widely-distributed PoS blockchain ledger would be prohibitively expensive and short-lived). I am interested if Bluesky is considering these problems and whether they have alternative non-crypto/blockchain solutions.


Sounds like Taligent, Pink and Longhorn all in one.


Interested, but it seems a bit light on impl details. Presumably too early?



