On Dat (kickscondor.com)
246 points by unicornporn on June 12, 2019 | 68 comments

> The prolonged obsolescence of distributed protocols like Bitcoin and torrents means we’re maybe skeptical or jaded about any new protocols

Funny choice of examples, personally I would have pointed to bitcoin and bittorrent as two examples of distributed protocols with runaway success

Yeah, Bittorrent is actually getting fairly old compared to many other technologies in use today; it's _older than_ Windows XP. :D

A newcomer compared to nntp, which arguably is a kind of p2p network!

Isn't nntp more of a store and forward network?

nntp relied on central servers which clients connected to, so not fully decentralized.

That the servers did inter-server replication and forwarding doesn’t really change that.

But the servers are p2p between them, i.e. they propagate the content to the other servers. If you take down one of them, you can still connect to any other server to get the same data. Keeping in mind this is 1980s technology, it looks pretty close to p2p to me.

Federation (email, nntp, matrix) is decentralization, but it's not p2p.

Bittorrent has succeeded in that it's now widely used by publishers to distribute their content.

As a piracy/anti-censorship tool it overwhelmingly relies on the generosity of a handful of power seeders... which could theoretically be taken down, though with some effort.

Also to thank is the laxity of the Russian government (and maybe others) in fighting these seeding hubs.

Ok, good point. I've changed the word 'obsolescence' to 'lifespan'. I think I was trying to convey something similar to what you're saying: that these protocols are old and don't seem to hold as much promise as they used to.

I tried beaker when it first featured on HN. My website isn't particularly heavy. It completely crashed the client trying to upload it. We're talking < 100mb of data. I believe they're still working on the issue [1].

[1] https://github.com/beakerbrowser/beaker/issues/952

The next major release of Beaker (0.9) is going to have a new version of Dat that improves scaling quite a bit. Mafintosh and his team have done some amazing work using trie indexes for much faster reads across the network. Beaker's also gotten much better at syncing with folders, which was the issue in your link.

I believe the issue was with the client, not the protocol :)

Beaker didn't run on my Ubuntu. I opened an issue and was ignored.

I guess Dat is too much JavaScript to be worth it anyway. The tooling and documentation are very poor, and most of the stuff is outdated. They claim to have a vibrant community but it's mostly two or three guys loving their Beaker browser and caring none about the rest of the world. Perfectly fine for a closed community, but not for a protocol that should be aiming for widespread adoption.

While we're at it, could someone share their experience with Secure Scuttlebutt vs dat vs ipfs? My interest is to have the simplest thing that could possibly work to make existing static and moderately dynamic sites p2p-publishable. From what little I know, ipfs seems too heavyweight, requires/repurposes git with GPL license, etc. and isn't rooted in HTML/http concepts, while dat/beaker is at least targeting my use case directly. But I'm open to being convinced otherwise.

I don't have a lot of experience with these, but I tested each of them a little bit and know mostly how they work.

I believe that for small sites, the main aspect that differs between them is how peer discovery/name resolution is done. For this, IPFS has the most distributed approach (using a DHT and IPNS) but is also the most fragile and highest latency one. DAT has plans to switch to a DHT in the future (hyperswarm) but relies on centralised DNS discovery servers + local network multicast right now. Secure Scuttlebutt takes a more social approach and organises its network in terms of pubs. There is no central discovery, you can only find users that you "meet" in a pub.

This means that DAT has the most website-like feel because their solution provides low latency (only DNS lookup) and is global. I also think that DAT is much more pragmatic compared to IPFS, so at the moment it is much more stable and its usability is better (also because it has more focus on stable, core components compared to IPFS, which is a bigger project trying to do lots of things at once). I like secure scuttlebutt's idea of focusing on communities but that is a different approach to how the web works right now.

> Secure Scuttlebutt takes a more social approach and organises its network in terms of pubs. There is no central discovery, you can only find users that you "meet" in a pub.

The caveat is: those users can’t meet _you_. I used SSB for a few months, joining pubs, commenting, trying to participate. No responses. (Not a huge surprise—you can get lost in the piles of people on Twitter, Reddit, etc.)

However, eventually I discovered that no one could see my contributions unless they added me—and SSB (or Patchwork, in this case) gave me no way of advertising my presence. This was pretty self-defeating. So now I don’t have to just build a ‘presence’ inside the network, I also have to build a ‘presence’ outside the network to announce that I’m somewhere inside the network. The SSB tools also give you no inkling that this is the case. So just know to bring friends!

Yes, they consider this a feature, not a bug, though. Their social network is proudly invite-only, so you need someone to pull you in. Once somebody subscribes to you, your content will be visible to all their subscribers.
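To make that visibility rule concrete, here is a toy model of it in Python. It only encodes the rule as described in this comment (you are seen by your followers and by the followers of your followers); `can_see` and the `follows` mapping are made-up names for illustration and say nothing about how SSB or Patchwork actually replicate feeds.

```python
from typing import Dict, Set

def can_see(follows: Dict[str, Set[str]], viewer: str, author: str) -> bool:
    """True if viewer follows author directly or follows someone who does."""
    direct = follows.get(viewer, set())
    if author in direct:
        return True
    # One extra hop: anyone the viewer follows who in turn follows the author.
    return any(author in follows.get(friend, set()) for friend in direct)
```

With `follows = {"alice": {"bob"}, "bob": {"carol"}}`, carol becomes visible to alice once bob follows her, while alice stays invisible to carol, which is the asymmetry complained about above.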

In this recent essay, Darius Kazemi suggests that Scuttlebutt and Dat (as well as Activitypub) should be regarded as complementary efforts rather than as re-decentralization silver bullets: https://blog.datproject.org/2019/03/22/three-protocols-and-a...

I wouldn't put Secure Scuttlebutt into the same box as the other two. It doesn't actually try to tackle some of the harder problems like huge files/huge directories etc. that they do.

SSB is more in the realm of ActivityPub, but more p2p where ActivityPub is federated.

IPFS doesn't require git, and it's licensed under MIT. IPFS doesn't have any concept of file or directory versioning; are you mixing it up with Dat?

I'm not sure what you mean by "isn't rooted in HTML/http concepts". IPFS lets you share files and fetch them in a content-addressed way individually or grouped in directories, like a re-imagining of BitTorrent+magnetlinks built for the browser and website use-case. IPFS is usable through web gateways with regular browsers. An IPFS gateway can either be a public gateway (like gateway.ipfs.io), or it can be set up to serve a specific IPFS url for a given domain name. I can make mydomain.com have a DNS record that says the site's content is available at a certain IPFS hash, and then I can set up A/AAAA records pointing at a server which runs the standard ipfs daemon and serves only that IPFS hash, so regular browsers can access my site normally, and people with the ipfs companion extension or future ipfs-compatible browsers will automatically fetch my site's content from the ipfs p2p cloud when they visit mydomain.com, which is useful if my server ever goes down and other people have my content pinned. Cloudflare.com actually has a service where they'll run the ipfs daemon for your domain, so you can set the DNS ipfs record and just worry about keeping your content available in ipfs (possibly with a pinning service) without needing to keep your own web servers up.

As far as I can tell, Dat doesn't support or at least doesn't emphasize this web gateway use-case. From reading about it, Dat seems to have a bunch of features like directory versioning, allowing clients to publish files within a page, and some browser WebRTC-like p2p swarm thing, which sound neat, but also require its own browser and sounds like it's making its own separate browser ecosystem. To me, IPFS feels like it's doing one thing, it's easy for me to picture how to slot it into my current understanding of the web next to existing technologies, and it's easy for me to envision a future where it's incrementally adopted (in the graceful fallback style that most web progress has followed) and maybe even becomes part of browsers.

(Though I feel IPFS still has a number of UX and reliability problems, like unpredictability in ability to fetch a resource, lack of information about fetching/pinning progress, and web gateways and the ipfs companion extension are still a bit janky. There needs to be some nice front-end for pinning sites you look at, enforcing that your pins don't take too much bandwidth or disk space, and updating your pinned content as sites update. But it seems all of these things could be solved with it still retaining roughly the same interface.)

> I'm not sure what you mean by "isn't rooted in HTML/http concepts".

Yeah, sorry, what I meant was that it has a protocol spec (eg. as an RFC or equivalent), and not just one canonical client and server impl, resp., and aligns with HTTP's concept of a network entity, URL, or even ETag, etc., because that'd be natural for my use case.

Don't forget Swarm as well.


>IPFS is really cool—but how do I surf it?

If a site has a domain that's set up with an IPFS DNSLink record, then you just go to its domain name and access it over HTTP(S). But if you have the IPFS companion extension, then your browser will access the site through IPFS peers instead of HTTP. So even if the site's webserver is down, you'll still be able to access the site if any other IPFS peers have its content pinned. You could then pin the site's content and help host it.

This all would work through a normal web browser, and gracefully degrades to a plain HTTP connection to a web server for people without the IPFS companion extension.

If you have a domain name and are able to keep your site content available in IPFS (on your own machines you keep up, on a pinning service, on a vps, on some friends' machines who you convince to pin it, etc), then you could have someone else, such as Cloudflare (https://www.cloudflare.com/distributed-web-gateway/), host an ipfs web gateway scoped to your domain. Then you don't need to keep your own web servers up, and the only maintenance you have to do for the ipfs web gateway on your domain is to keep your IPFS DNSLink record correct. Regular users will access your content through Cloudflare (who fetch it from IPFS and then aggressively cache it), and IPFS companion users will access it directly through IPFS.

There's also the concept of IPFS links (like ipfs://QmQB1L5PDwcEcMW6hWcLQrNMTKWY3wxX4aDumnKi385KPN/introduction/usage/), which you can access directly if you have the IPFS companion extension, or you can access through a public ipfs web gateway (like https://ipfs.io/ipfs/QmQB1L5PDwcEcMW6hWcLQrNMTKWY3wxX4aDumnK...). But you probably don't want to give links like either of those to your users unless you don't care for domain names. I think the IPFS documentation makes a mistake by emphasizing raw IPFS links so much as opposed to DNSLink records.

This issue boils down to "absolute links don't work". See here: https://discuss.ipfs.io/t/how-do-absolute-links-work/5267/3.

> So, there’s some good news: we will be moving to putting the CID/IPNS-key/HASH in a subdomain by default (so every site gets its own origin). That is, websites will be hosted from https://HASH.ipfs.dweb.link/... instead of https://ipfs.io/ipfs/HASH/.... The nice side effect is that absolute links will “just work”.

So every IPFS website is going to be https://HASH.ipfs.dweb.link/? I like that I can stick with the familiar dat://kickscondor.com. And it works today.
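For reference, the path-to-subdomain rewrite being discussed can be sketched in a few lines of Python. This is only an illustration under assumptions: `to_subdomain_url` is a hypothetical helper, the default gateway host is simply the one named in the quote, and the hash is copied as-is (a real subdomain gateway would also need it in a case-insensitive encoding, since DNS labels ignore case, which this sketch does not handle).

```python
from urllib.parse import urlsplit, urlunsplit

def to_subdomain_url(path_url: str, gateway_host: str = "ipfs.dweb.link") -> str:
    """Rewrite https://<gateway>/ipfs/<HASH>/<path> to https://<HASH>.<gateway_host>/<path>."""
    parts = urlsplit(path_url)
    segments = parts.path.split("/")  # e.g. ['', 'ipfs', 'HASH', 'docs', 'index.html']
    if len(segments) < 3 or segments[1] != "ipfs" or not segments[2]:
        raise ValueError(f"not a path-style /ipfs/ gateway URL: {path_url}")
    content_id = segments[2]
    rest = "/" + "/".join(segments[3:])  # everything after the hash, rooted at "/"
    host = f"{content_id}.{gateway_host}"
    return urlunsplit((parts.scheme, host, rest, parts.query, parts.fragment))
```

The reason absolute links start working is visible with `urllib.parse.urljoin`: resolving `/style.css` against `https://ipfs.io/ipfs/HASH/index.html` gives `https://ipfs.io/style.css`, which drops the hash, whereas resolving it against `https://HASH.ipfs.dweb.link/index.html` keeps the site's own origin.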

>So every IPFS website is going to be https://HASH.ipfs.dweb.link/?

If someone sets up an IPFS DNSLink record for their domain, then it's just accessible directly from their domain. The user goes to https://example.com, and if they have a normal browser, they fetch it as normal, and if they have the ipfs extension, they fetch it over ipfs instead.

(There is also support for https://hash.ipfs.dweb.link/, but as I said, I don't think that's what people should give to users if they can avoid it. Though that does have the nice benefit over dat:// links in that it works for normal browsers even if your web server goes down.)

Too complicated for very little gain.

Assuming Cloudflare is being used, then the effort on your side is to run "ipfs add -r directory_of_your_site_content" on a machine that stays up (or a pinning service), and then you copy the hash into a DNS record.

The gain is that you don't manage any webservers, and other users can help host your site content and can keep your site alive after you stop hosting it on ipfs.

I find the concept of being able to make a site that outlives my ability to host it (and hopefully outlives me) as long as people find it worth pinning is super interesting.
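The "copy the hash into a DNS record" step can be sketched roughly as follows. As far as I know, DNSLink uses a TXT record on the `_dnslink` subdomain with a value of the form `dnslink=/ipfs/<hash>`; the helper names below are hypothetical, and the hash in the usage note is a placeholder rather than a real one.

```python
from typing import Tuple

def dnslink_record(domain: str, content_hash: str) -> Tuple[str, str]:
    """Return (record name, TXT value) for a DNSLink entry pointing at an IPFS hash."""
    if not content_hash or "/" in content_hash:
        raise ValueError("expected a bare hash, as printed by `ipfs add -r`")
    return f"_dnslink.{domain}", f"dnslink=/ipfs/{content_hash}"

def parse_dnslink(txt_value: str) -> str:
    """Extract the content path (e.g. /ipfs/<hash>) from a DNSLink TXT value."""
    prefix = "dnslink="
    if not txt_value.startswith(prefix):
        raise ValueError("not a DNSLink TXT value")
    return txt_value[len(prefix):]
```

For example, `dnslink_record("example.com", "QmPLACEHOLDER")` yields the record name `_dnslink.example.com` and the value `dnslink=/ipfs/QmPLACEHOLDER`. Updating the site then means re-adding the content and swapping the hash in that one record.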

You don't have to do any of that, just browse:


Of course, it would be better if you ran your own node.

It's complicated for the developer, but easy for the user. Referencing the actual ipfs link on a public gateway is easy for the developer, but off-putting to the user.

With Dat, they "solved" it by building their own browser. Which I think makes it easy for the developer, but adds a lot of friction to your users.

I'm hoping both projects meet somewhere in the middle.

The thing that always rubbed me the wrong way was how much Dat seems to embrace mutability. That the content behind every URL is mutable is one of the things I dislike the most about the current internet. A lot of the content and data I consume on a daily basis is rarely-changing, so it would be fine to be immutable.

I feel that dragging along all the problems of mutability-by-default that plague the current internet when rebuilding it in a more decentralized way is a big mistake. Yes, I know that there are some fringe efforts to enable more immutability with "hypercore-strong-link" etc., but they look like an afterthought, that won't be supported by most of the Dat ecosystem if it finds major adoption.

Interesting! Immutability by default is what initially rubbed me the wrong way about IPFS and SSB.

I've come to see now though that both have valid uses.

This is a nuanced issue related to privacy and user safety.

>A lot of the content and data I consume on a daily basis is rarely-changing, so it would be fine to be immutable

Would you want to see typos corrected, facts checked and other improvements on that content?

You could simply make all content versioned, with a full history available. Browsers would probably want to show the "latest" content by default, but they could implement a function to view previous versions or even to highlight changes. Ideally versions would come with Git-style comments so you could read the reason for the change.

This would provide an amazing archive for anyone wanting to see, for example, what was on a "front page" on a certain date. It would also let you see if someone had altered a specific article for whatever reason.

This is true of Dat, though. You can jump back to a website’s prior history at any time, so long as those chunks are still seeded.

So in theory an archive.org type entity could continue seeding old versions of websites, allowing you to go back in time, and it could be verified that it was an older version? If so, that's a good compromise.

For sure! And the URLs aren't theoretical - you can view an older version of my own blog by adding '+version' to the end of the URL. Like so: dat://kickscondor.com+1600/.

Each atomic file change creates a new version.
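A minimal sketch of pulling the version out of such a URL, assuming the '+version' suffix is numeric and always follows the host part as in the example above. `parse_dat_url` and `DatUrl` are made-up names for illustration, not part of any Dat library.

```python
from typing import NamedTuple, Optional

class DatUrl(NamedTuple):
    host: str               # domain name or archive key
    version: Optional[int]  # None means "latest"
    path: str

def parse_dat_url(url: str) -> DatUrl:
    scheme = "dat://"
    if not url.startswith(scheme):
        raise ValueError(f"not a dat:// URL: {url}")
    # Split "host+version" from the path, then split off the optional version.
    authority, slash, path = url[len(scheme):].partition("/")
    host, plus, version = authority.partition("+")
    if plus and not version.isdigit():
        raise ValueError(f"version must be a number: {version!r}")
    return DatUrl(host, int(version) if plus else None, slash + path)
```

So `parse_dat_url("dat://kickscondor.com+1600/")` gives host `kickscondor.com`, version `1600` and path `/`, while a URL without the suffix gives a version of `None`, meaning the latest one.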

Would you want to see code you depend upon arbitrarily change, content you linked to become invalid through a 404, etc.?

I didn't say "mutability needs to be completely eradicated". It's mutable-only (what we currently have) vs mutable-by-default vs immutable-by-default. Each of them has their own trade-offs.

FWIW Dat is going to add "strong links" in the near future, which are links that include a version & content-hash for pinning to a specific version.

The end of my first comment explains why I don't think that's good enough:

> Yes, I know that there are some fringe efforts to enable more immutability with "hypercore-strong-link" etc., but they look like an afterthought, that won't be supported by most of the Dat ecosystem if it finds major adoption.

Why do you figure? If it's part of the protocol, wouldn't clients use it?

Same reasons that you can't use many edge cases of email addresses (or even common "+"-addresses) with a lot of websites, even though they are valid according to the spec:

- Most developers don't read the spec and add e.g. validations in end-user applications where they think they are being clever

- Multiple fully fledged implementations of the core protocols are usually hard to come by. This means that rarely used features that are not supported in all implementations won't be used and so organically wither

I think that a solution would involve locally copying anything you look at (never mind the outdated notion of copyright).

Otherwise the problem becomes enforcing immutability on bad actors, who would change what they had published anyway.

On the question about how a search engine would work: the Dat team has thought about (and implemented) quite a bit around building performant indexes for distributed file systems. I've wondered whether, if we relaxed the time requirements for search, there are other processes we could use that would be more robust. "Who has" seems like it might be a little bit too low level, but what if it was "Who has the most reliable data about the publishing industry between 1500 and 1600?", with expiration in 30 minutes and an extension of up to 90 minutes for each preliminary result with more than two nodes responding.

Could it work? What kinds of use cases could it support? How would a bot that had a good model of which forums to post things to do in comparison if it had 2 hours, 24 hours, 1 week? What about if we put _lower_ bounds on the amount of time that had to be spent looking (this seems like a fun challenge for proof of 'real' work).

While this is at a much higher level than the current dat protocol, it is in a sense quite similar to the query "Who has 353904391670d2803b34990e37f4d2e96f49351998e162d0e335b16812daf592e0f71470af7bee31f6a1da03744d03bcde659d73a0ebf56fd4a9fc6ef67edf60 that is 5 bytes long?"
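To make the comparison concrete, here is a toy version of the low-level "who has" query: a lookup keyed by digest and length. Everything in it is an assumption for illustration. `Peer` and `who_has` are invented names, SHA-512 is chosen only because it produces a digest of the same length as the one above, and none of this reflects Dat's actual wire protocol.

```python
import hashlib
from typing import Dict, List, Tuple

Key = Tuple[str, int]  # (hex digest, length in bytes)

def content_key(data: bytes) -> Key:
    """Address a blob by its SHA-512 digest and its length."""
    return hashlib.sha512(data).hexdigest(), len(data)

class Peer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[Key, bytes] = {}

    def add(self, data: bytes) -> Key:
        key = content_key(data)
        self.store[key] = data
        return key

def who_has(peers: List[Peer], key: Key) -> List[str]:
    """Names of peers holding content that matches the digest and length."""
    return [p.name for p in peers if key in p.store]
```

The higher-level question differs mainly in that there is no exact key to match on, so an answer cannot be verified by hashing it, which is where the timeouts and the "reliability" judgement would have to come in.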

> "Who has the most reliable data about the publishing industry between 1500 and 1600?"

So, to verify I understand: the backend of this would be handled by grad students (working with some kind of 30/90 minute timer alarm), and would experience distributed consistency failures caused by differences in which departmental tradition they did their major area exams?

Pretty much. Though for that question you might have to go all the way up to tenured faculty, which could incite a full on Byzantine war over which particular source was more reliable, however I think at that point it is clear that we have lucked out and found two really great sources of information instead of just one and potentially triggered the creation of a third :)

Part of the 0.9 release of beaker will be the https://unwalled.garden spec, which is a bunch of file formats geared toward improving discovery and search (and general applications). Beaker 0.9 has some internal FTS indexes that let you query content published by people you follow. It's a kind of social search.

I wrote a small essay about Dat for a course on distributed systems: https://bernsteinbear.com/dat-paper/

In it I do some analysis of its strengths and weaknesses.

Having to stay on a website 24/7 to get new content sounds not very usable or user-friendly.

Great for engagement metrics though.

Haha! :) Actually, as long as one peer is connected 24/7, everyone will see everything. Which seems reasonable.

Has anyone looked into PJON? I see a lot of work done in the direction of decentralization, but it is all at a really high level. Shouldn't we re-work layers 2, 3 and 4 to be compatible with a decentralized structure before thinking about content ids? See https://github.com/gioblu/PJON

Some of this high level stuff is really important for thinking about stuff like data structures and "asynchronous ux design". These problems don't strictly depend on the exact underlying transport or discovery mechanisms, although there is definitely work to do there as well!

Thanks for sharing this interesting thing I'd never heard of!

Ciao macawfish, absolutely :) The only issue is: until we rework the existing low-level standards and build our own private physical network infrastructure (each user takes care of his own router and provides others with connectivity for free, without intermediation of any third party or service provider), we will not have a truly "decentralized" network.

I just tried the Beaker browser and their own main site is currently inaccessible through it.

Thanks for mentioning it, I just deployed a fix

If you need a custom browser for this to work then why not get rid of the entire HTML+JS+CSS+Wasm+whatever else complexity monstrosity and build something much simpler based on almost three decades worth of hindsight?

Because I suspect whatever solution they would come up with will end up being as complex a monstrosity when trying to interoperate with other browsers, other OS, other organisations, other developers, various form factors and devices etc...

Also, if I need to learn a whole new stack to use a whole new product that has no-one using it yet, chances are I won't bother.

We may win on learning from the past and not having some of the tech debt the web has, but I've seen this industry repeating the same mistakes over and over again so I wouldn't bet on it too much.

What I meant is getting rid of html, css, js, etc. and the entire stack and replacing it with something simpler. There isn't an issue with interoperability with other browsers as there won't be any other browsers, there is no need to interoperate with other organizations as there won't be any other organizations, and any interoperability with operating systems can be addressed by simply using cross-platform tech when building the browser.

You may not bother but others will (and really you can say that about anything that tries to come up with something new; that is not a reason to stop trying to come up with new stuff).

Yesterday I was making a bunch of joke marketing lines and "Sorry we still use HTML" was one of them.

It's really just a matter of constraining the novelty. We're trying to make some specific improvements on the applications & networking stack of the Web. Redoing the entire web platform is out of scope (for now).

Seriously, replace it with something which is unfriendly to advertisement, yet still enables a freemium web experience.

Ipfs with filecoin sounds reasonable at first sight.

When it comes to gaining real adoption, compatibility with existing technologies is far more important than having a flawless technology stack. Dat already has an uphill battle to get people to use it; jettisoning the largest application development ecosystem currently in existence would make it a non-starter.

It should be possible to build a DAT-capable browser for simple types of content (RST, MD, orgmode, etc); no HTML capability included. If you made it download media content instead of displaying it in-browser, it might even be simple enough to implement this without spending too much time chasing down security issues.

One could even go down to a slightly modified Gopher protocol. Just let publishers publish in whatever format they see fit (PDF, docx, html, ...) and let the users configure the viewer in their browsers.

Yes, that is what I had in mind.

Why do we need node?

This blog has a really cool page layout

Multi-writer not working, and still requiring hacking workarounds?

This bothers me.

P2P Reddit ( https://notabug.io/ ) has been doing this for over a year.

All content there updates in realtime too, fully decentralized. (WebRTC or daisy-chained socket relays)

I did not build NAB, but I work on the underlying P2P protocol that powers it ( https://github.com/amark/gun ) which competes with DAT - but I've always recommended DAT to people because I thought it already had working multi-writer apps on it.

Paul, isn't this already possible? You've shown me demos!

I had a look at https://notabug.io, oh my ... this is a great reminder that full decentralization may not actually be the most desirable thing to pursue.
