How the Dat Protocol Works (datprotocol.github.io)
707 points by nanomonkey on July 5, 2019 | 75 comments

The diagrams on this page are FANTASTIC.

The dat homepage[1] is also nicely done. It would be interesting to see how much and how it's being used. Since there is a central tracker does anyone know if there are stats available?

[1]: https://dat.foundation/

Props to Duncan Keall for creating the diagrams [1][2].

> Duncan is a technical writer and illustrator interested in internet infrastructure, peer-to-peer protocols and net art.

[1] https://blog.datproject.org/2019/01/21/how-dat-works/

[2] https://github.com/datprotocol/how-dat-works/commits?author=...

He's great! He also did the Scuttlebutt protocol guide:


I agree. This is one of the best, modern protocol documentations I've seen. I've been tempted to redo old, but critical RFCs (FTP, HTTP, PNG, TCP, IP, etc) in a similar manner.

I would pay good money for well thought-out docs on all those protocols, including HTTP and DNS and how they relate to and differ from mDNS / Avahi / Bonjour and other modern implementations, as well as the IRC protocol. I would love to read RFCs front to back, but I learn by talking with someone about tech, or by seeing schematics, diagrams and pics, and lastly through sheer experience.

For HTTP I highly recommend https://github.com/for-GET/http-decision-diagram/blob/master...

Scroll to Overview and then there are links to each section in the diagram

Great on desktop, unreadable on mobile though.

What is it about the diagrams that you find to be high-quality? I’m a technical writer and just curious about what stands out to readers.

Low noise, small palette, straight lines, good use of space. Basically textbook Tufte philosophy with a good eye for proportion/symmetry/rhythm.

The scuttlebutt doc linked elsewhere in here is even more impressive IMHO. Kudos to the designer.

Any specific tooling generating these or just a good designer?

Duncan Keall is just talented. He also did the Scuttlebutt Protocol guide (https://ssbc.github.io/scuttlebutt-protocol-guide/).

Here is a blog post describing how it was made: https://blog.datproject.org/2019/01/21/how-dat-works/

Looking at the site repo, they are SVGs generated by Inkscape, so likely drawn by hand by a good designer.

They could be done easily with Excel.

I was going to comment on the same thing. I’m squinting at my mobile phone screen so I bookmarked the site so I can check it later on a proper screen indoors. It just looked so good I thought I’d check back later to learn something about lovely documentation :)

Dat (and Beaker Browser) is one of those things that I think about playing with every time I see it pop up somewhere. The technology truly feels like something special.

Then when I sit down to actually start playing with it, I struggle to think of anything to build that would actually take advantage of the tech aside from purely static sites. I would really love to see more complex examples folks have seen in the wild.

A while ago I made a demo of a P2P chat application that runs on the beaker browser and uses the experimental P2P APIs. It's running at dat://feathers-chat.hashbase.io (slides from the talk at https://vanjs-decentralized.hashbase.io). If the peers can see each other you should be able to register with any email address and send chat messages between browser instances.

What I realized is that this technology ticks a lot of the boxes of what we right now think only the big cloud providers can do. By using a more decentralized protocol it is by design

- Actually Serverless

- Offline-first

- Real-time

- Auto-deploying

- Live-updating

- 100% uptime

I really think there is something there from both a developer and a user experience perspective. The problem is that a lot of it is still very experimental and far from the usability and maturity of e.g. a Firebase or Heroku.

dude I love how many people in Vancouver seem to be interested in or even working with dat :) thumbsup


Can you explain how P2P chat is saved into dat?

How can someone without the keys add their messages to the feed? Wouldn't only the initial creator have the keys necessary to add to the feed?

This is a really cool idea, but how should I be using it? Trying to test it out just now I opened up two tabs in Beaker but both rooms just have the one user in them.

Love the design, by the way.

I believe this is happening because both tabs are using the same peer connection. If you start two separate instances of the browser (e.g. in a VM or on another machine) you should be able to see both users.

The design is taken from the Chat guide for https://feathersjs.com. Feathers is a JS library that lets you architect APIs so that they are protocol independent, which worked great in this case because I just had to swap out the existing REST/websocket Feathers adapter for a DAT/Beaker API Feathers adapter.

> Then when I sit down to actually start playing with it, I struggle to think of anything to build that would actually take advantage of the tech aside from purely static sites.

Actually as someone that deals heavily with static sites, if dweb was only used for static sites and perhaps pushed us back to a more document-oriented web I and a lot of others would be quite happy with that. I don't think it needs to be capable of feature-complete replacement of Twitter and Facebook to be successful, and requiring it to be that way as a precondition for success is IMHO a losing strategy for dweb. Start with document-oriented and slowly increment.

There is the Dat installer for installing Android applications (https://github.com/staltz/dat-installer/).

Also a bunch of zines and music albums are shared over DAT.

There is hypercore (https://github.com/mafintosh/hypercore), a distributed append only log library.

Rotonde, a DAT based social network (https://github.com/Rotonde)

Secure Scuttlebutt now supports dat attachments...

Cabal (https://github.com/cabal-club/cabal-cli), a chat program.

Thanks for these! Cabal looks especially fascinating, I'm going to have to try and get some friends to join one and test it out.

Rotonde [0][1] is/was a pretty fascinating social network built over Dat/Beaker which basically involved each user helping to seed each user-feed they followed, resulting in a pretty stable/persistent ecosystem. I think it has mostly fallen by the wayside, but of course you should still be able to access a lot of instances due to hashbase.io seeding. To kickstart the platform we started a really basic "user list"[2] where you could find your friends' Rotonde URL based on their twitter username, since dat URLs are not exactly memorable.

[0] https://wiki.xxiivv.com/#rotonde

[1] https://github.com/Rotonde

[2] https://github.com/Rotonde/People

Every once in a while these xxiivv guys pop up and I find myself back in the hole of their website, wandering and exploring for hours, it's such an interesting personal wiki

Who are they? I want to know, but I felt like I would end up going down a lot of rabbit holes on their website to find useful information.

I worked on a Muxtape clone (making and sharing mixtapes in the browser) that I've been quite happy with: dat://duxtape.kickscondor.com

Code is here: https://github.com/kickscondor/duxtape.

Dat tape is a missed opportunity.


True dat!

Duxtape is really nice, I enjoyed the writeup as well which was posted the other day. Thanks for writing cool things! :)

Taking the time to encourage someone is pretty sweet too. Glad you stepped out of the crowd to say hi - I'm enjoying looking through your HN comments.

I built a Connect Four clone you can play multiplayer over the Dat protocol.

- dat://connect-four.hashbase.io - https://github.com/rjsteinert/four-in-a-row-game

Yeah same here- it reminds me a lot of "the old days" of the internet where you just stumbled across cool things from sites with lists of links and no real decent search engine or e-commerce going on.

There is some interesting stuff people are playing with in Dat - e.g. their Twitter clone (1) has the actual posts distributed in what they are calling their WebDB. I.e. you as an individual publish posts at a dat:// address you control, and the central site just references that address when other users load the feed. Your posts are distributed across the other users so there is no SPOF.

There is the kernel of something really interesting there in this GDPR-world where people need to be more aware of who has their data, and what can be done to control that.

Interesting stuff.

1 - https://github.com/beakerbrowser/fritter/blob/master/README....

Does anyone know when the unwalled.garden extensions land in beaker?

This is something I struggle to understand as well. Can anyone explain what the steps would be to take say a server rendered MVC app or on the opposite end a JAMstack app and deploy it to Dat?

Beaker browser misses some simple stuff like dark themes which makes it look weird in my dark themed desktop manager. If anyone can point me to that, I will play more with it.

As a tech illiterate, what makes DAT superior to the standard p2p and blockchain technology? If my understanding is wrong, can someone explain in layman's terms why DAT matters for the future?

It seems closer to bittorrent than anything blockchain related.

Specifically it seems like a better version of bittorrent. It seems to be more resilient than bittorrent, better able to do updates than bittorrent (they apparently promise live updates by multiple writers, implying one could host, for example, a live database at a dat URL), and better able to find optimal peers than bittorrent, with better security around discovery and a flexible data model (e.g. the data being shared does not have to be files and folders, whereas bittorrent would require wrapping it in a file, which allows transparency through the protocol layer).

Say you have a 1 GB file that you want to share with everybody in this thread. How do you do this? What if you have thousands of such files? In theory, DAT and similar DHT-based file sharing protocols can enable us to implement something like YouTube, but without the enormous costs of running the datacenters. In practice, ISPs won't let this happen and DAT-like protocols will be limited to the world of VPSes that have static IP addresses.

My hope is that asynchronous swarm based applications ("dweb") will work so well without ISPs (e.g. on dynamic local networks and mesh networks) that they will pose an existential threat to ISPs.

"Think WAN, act LAN"

or is it

"Think Wan, net LAN"?

I don't know.

>Say you have a 1 GB file that you want to share with everybody in this thread. How do you do this?

You'd just create a dat feed for it and share that address in the thread. You can do that directly from Beaker Browser easily.

If everyone on the thread was getting it, you might only have to upload slightly more than 1 GB of data, as all the chunks could be shared between the other people.

Same as bittorrent, but even better because you can easily add more files to the feed later.
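To put rough numbers on the "slightly more than 1 GB" estimate above, here is a back-of-the-envelope sketch. It assumes an idealized swarm in which the original sharer sends each chunk about once and peers exchange the rest among themselves; the 5% overhead figure and the function name are made up for illustration and do not come from Dat itself.

```python
def upload_split(file_size_bytes: int, downloaders: int, overhead_percent: int = 5):
    """Return (bytes uploaded by the original sharer, bytes uploaded by peers).

    Idealized model: every downloader needs the whole file, the original
    sharer sends roughly one copy plus some overhead, and the remaining
    traffic is carried by the downloaders sharing chunks with each other.
    """
    total_needed = file_size_bytes * downloaders
    # Integer arithmetic keeps the result exact.
    one_copy_plus_overhead = file_size_bytes * (100 + overhead_percent) // 100
    originator = min(total_needed, one_copy_plus_overhead)
    peers = total_needed - originator
    return originator, peers


# 1 GB file, 75 downloaders (roughly the size of this thread).
originator_bytes, peer_bytes = upload_split(1_000_000_000, 75)
# Serving every downloader directly would instead cost the sharer
# file_size * downloaders bytes.
direct_bytes = 1_000_000_000 * 75
```

Real swarms fall somewhere between the two extremes, depending on how long peers stay online and how much upload bandwidth they contribute.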

Is there a reason why that last risk cannot be worked around via dynamic dns?

When you read "blockchain" always think "decentralized: cool; wasteful: not cool". I think it was designed like this on purpose, but besides creating a value store it's not a reasonable choice. Also it's a list structure, which inherently does not represent reality. E.g. if you and I both agree about the blockchain until September 13 2018, but afterwards have different opinions about how it should go on, there is no way to represent that disagreement in the technology. (It's possible if you put an abstraction layer on top, but that's another topic. You can put almost all protocols/algos on top of each other.)

So it's wasteful and can't represent reality well. So it's quite limited to how far we can use it as a whole species.

It's less the Amazon of algorithms and more the Myspace. In 15 years it will still have some impact in the abstract sense of how it influenced technology, but blockchain itself is unlikely to still be there.

Genuine question to hopefully help others as well: What are the major differences with dat and IPFS?

DAT seems very well thought through, and having never seen beaker before that looks amazing! However, how private is a P2P service like this? One major problem for BitTorrent is how trivial it is to snoop on other people's history (esp if you’re the RIAA or MPAA, or even iknowwhatyoudownload). Is overcoming this a design goal for dat? Is it even possible to overcome it?

There's some information about this in the protocol description:

"Discovery keys are used for finding other peers who are interested in the same Dat as you.

If you know a Dat’s public key then you can calculate the discovery key easily, however if you only know a discovery key you cannot work backwards to find the corresponding public key. This prevents eavesdroppers learning of Dat URLs (and therefore being able to read their contents) by observing network traffic.

However eavesdroppers can confirm that peers are talking about a specific Dat and read all communications between those peers if they know its public key already. Eavesdroppers who do not know the public key can still get an idea of how many Dats are popular on the network, their approximate sizes, which IP addresses are interested in them and potentially the IP address of the creator by observing handshakes, traffic timing and volumes. Dat makes no attempt to hide IP addresses."
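The one-way relationship described in the quote can be sketched with a keyed hash. As far as I recall, Dat derives the discovery key as a 32-byte BLAKE2b hash of the constant "hypercore", keyed with the public key; treat that exact construction as an assumption to check against the spec. The example key below is made up.

```python
import hashlib


def discovery_key(public_key: bytes) -> bytes:
    """Derive a 32-byte discovery key from a 32-byte public key.

    Assumed construction: BLAKE2b over the constant b"hypercore",
    keyed with the public key. Anyone holding the public key can
    recompute this, but the hash cannot be reversed to get the key.
    """
    if len(public_key) != 32:
        raise ValueError("expected a 32-byte public key")
    return hashlib.blake2b(b"hypercore", key=public_key, digest_size=32).digest()


# A made-up public key for illustration only; not a real Dat archive.
example_public_key = bytes(range(32))
example_discovery_key = discovery_key(example_public_key)
```

This is also why an eavesdropper who already knows a public key can confirm which Dat a pair of peers is discussing: they compute the same hash and compare it with what they see on the network.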

So as I understand it, if there was a popular movie shared via dat and people were sharing it (the key would have to be publicly known for people to discover it at all), it would be trivial for the MPAA to snoop and see what IPs were downloading it, just like torrents now.

Is that correct? Would those snoopers be able to prove who is sharing and who is downloading?

The first thing that popped to mind for me was that this seemed like a good protocol to build some sort of decentralized/private social network, then in their docs I managed to end up looking at this: https://github.com/Rotonde/rotonde-client

"Hypercore assumes a linear history. It can not support branches in the history (multiple entries with the same sequence number) and will fail to replicate when a branch occurs. This means that applications must be extremely careful about ensuring correctness. Users can not copy private keys between devices or processes without strict coordination between them, otherwise they will generate branches and 'corrupt' the hypercore."
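The failure mode in that quote can be illustrated with a toy append-only log. This is only a sketch of the idea (two entries with the same sequence number cannot both be accepted); it is not Hypercore's actual data structure or API, and every name in it is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


class ForkError(Exception):
    """Raised when two logs disagree about an entry at the same sequence number."""


@dataclass
class LinearLog:
    entries: List[bytes] = field(default_factory=list)

    def append(self, data: bytes) -> int:
        """Append an entry and return its sequence number."""
        self.entries.append(data)
        return len(self.entries) - 1

    def replicate_from(self, other: "LinearLog") -> None:
        """Copy entries we are missing, refusing to merge divergent histories."""
        shared = min(len(self.entries), len(other.entries))
        for seq in range(shared):
            if self.entries[seq] != other.entries[seq]:
                raise ForkError(f"conflicting entries at sequence number {seq}")
        self.entries.extend(other.entries[shared:])


# Two devices share the same private key and both start from the same history.
laptop = LinearLog([b"post 1"])
phone = LinearLog([b"post 1"])

# Each writes a different entry at sequence number 1 without coordinating.
laptop.append(b"post 2 from laptop")
phone.append(b"post 2 from phone")

try:
    laptop.replicate_from(phone)
    fork_detected = False
except ForkError:
    fork_detected = True
```

A reader that is simply behind, with no conflicting entries, replicates without trouble; only the uncoordinated second writer causes the branch.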


That's a beautiful guide. Anyone know how the diagrams were created?

The accessibility of that guide with a screen reader is horrible. The text is readable, but that's about it. Why, oh why couldn't people just write good, textual articles like they used to, instead of all this picture based crap? Instagram, Tinder and snapchat are bad enough as they are; tech was mostly free of this, but now we see stuff like this appearing. I won't really be surprised if I am not able to read 50 or 60% of the content in 10 years if current trends continue.

Here is the original 100% textual spec. https://www.datprotocol.com/deps/

The post here is an alternate visual version. Best of both worlds!

What would help you screen read a visually oriented guide? Would detailed textual descriptions of the figures help?

Personally I think that the way they're using diagrams to illustrate some of these technical concepts is really helpful. I wonder if the format would translate to spoken text? I.e. in parallel with the figures, having little descriptive "imagination breaks" in the text which would provide opportunity for focusing on some key concept or relationship that might otherwise be illustrated using a visual figure. Maybe even using spatial language?

IMO there should be way more aurally oriented technical materials.

Recently I was listening to math lectures on YouTube while driving and realized that they were much easier to follow than I expected. Still, I'd love it if there was an aurally focused higher math practice. Sometimes I find it easier to focus when I'm listening rather than using my eyes.

In this case, not really descriptions, but equivalents. I don't care that a figure shows two red boxes connected in a particular way, I care that packets flow from node x to y, but not in the other direction (just an example).

How would a screen reader work with the ASCII art of RFCs? https://tools.ietf.org/html/rfc793#section-3.1

Not really that well. I know it's there, I know the rough shape, I know the order of the fields, but I don't know the spacing and alignment. I could count spaces by hand but that would be tedious. That's way, way better than the OP, where images are just indecipherable SVG blobs that the screenreader does not understand at all. The only thing I don't understand here are the numbers above the diagram, I don't know what they're for and it's not obvious for a SR user. I think I could understand it, though, especially that everything seems to be described below, with bit lengths and such. As far as I understand this RFC, you can get by without the diagram. What could help with it, though, would be a screen reader that has excellent braille display support (JAWS) and a good 80-cell braille display. I think that would let me see the spacing and understand it much better, but I don't have such fancy and expensive hardware at hand to test this out.

I noticed GNUnet [0] here on HN the other day, and it seems to me that there could be a bit of overlap here, especially for the filesharing system... Is there any collaboration happening between these two systems? From my couch perspective, they both look great and seem as though they could complement each other really well.

[0] https://gnunet.org/en/#gnunet-0.11.5-release

They really need to work on a better explanation of what exactly this is! Something like "It's similar to BitTorrent except..."


>Academic Torrents [13] uses BitTorrent to share scientific datasets, and BitTorrent has many drawbacks that hinder direct use by scientists. BitTorrent is for sharing static files, that is, files that do not change over time. Dat, on the other hand, has the ability to update and sync files over the peer-to-peer network. BitTorrent is also inefficient at providing random access to data in larger datasets, which is crucial for those who want to get only a piece of a large dataset. BitTorrent comes close to the solution, but we have been able to build something that is more efficient and better designed for the data sharing use case.

Much better. They should stick that on the home page, not in the FAQ.

Is this similar to Syncthing?

It's kinda like syncthing crossed with git. Lower level than syncthing, focused on stuff like p2p shareable, versioned data structures and cryptographic verification.

Although you can build applications on top of syncthing's API/protocol, it's primarily a high level application, whereas Dat is a protocol meant for building applications.

It would make sense to build something like Syncthing on top of Dat.

I didn't see IPFS mentioned on the page so I was wondering if someone here could help me understand the difference between this and IPFS which seems to share at least some of the same goals.

IPFS, as I understand it, doesn't have a built-in ability to alter files once uploaded and has to use IPNS to handle pointing to newer versions over time. The Dat protocol does this for you, as I understand it, though I haven't dug in too much.

Dat seems like it could be a good way for those infosec people and others that don't like youtube to share their videos.

Is there any way to stream video using OBS via dat?

I'm sorry, but accessibility and UX is a primary focus, and then you explain that urls consist of a gigantic hex string?

I can't share a file without having to copy paste the url, and I can't look at two similar urls and immediately determine which is which. No thanks.

This seems like yet another product that seems good on paper, and seems great if you're a nerd, but this will never catch on with the average consumer.

They don't usually. You can put the key in DNS and then use regular DNS names.

The keys are akin to IP addresses, which are also not so human friendly.

People are still trying to wrap their heads around decentralized naming systems. There are lots of ideas, and nothing has stuck yet in a groundbreaking way.

But yeah, the naming problem is independent of what this protocol gives us.
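To make the "put the key in DNS" idea above concrete, here is a small sketch that picks a Dat key out of a domain's TXT records. It assumes the record looks like `datkey=<64 hex characters>`, which is my recollection of the Dat DNS convention and should be checked against the spec. The function name and the sample records are hypothetical, and no DNS lookup is performed.

```python
import re
from typing import Iterable, Optional

# Assumed record format: datkey=<64 hex characters>
_DATKEY_RE = re.compile(r"^datkey=([0-9a-f]{64})$", re.IGNORECASE)


def key_from_txt_records(records: Iterable[str]) -> Optional[str]:
    """Return the first Dat key found in a list of TXT record strings, or None."""
    for record in records:
        # TXT values are often shown wrapped in double quotes; strip them.
        match = _DATKEY_RE.match(record.strip().strip('"'))
        if match:
            return match.group(1).lower()
    return None


# Made-up TXT records for an imaginary domain.
sample_records = [
    '"v=spf1 -all"',
    '"datkey=' + "ab" * 32 + '"',
]
resolved_key = key_from_txt_records(sample_records)
```

A client would then use the returned hex string the same way it uses the key in a raw dat:// URL, so the human only ever types the domain name.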

I think the best idea for naming is probably going to be something like giphy or what3words (though their implementation could use work since they're not directly translatable between multiple languages) where a hex address gets encoded in N words which are generally easier for people to grok than N*m (m being 2/4/8 depending on how many hex characters are encoded in which words). Choosing the words is hard though especially as you add more languages.
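A minimal sketch of the word-encoding idea above, assuming a 256-entry word list so that each word stands for one byte (two hex characters, i.e. m = 2). The syllable "words" are generated here purely for illustration; a real scheme would need a curated list, which is the hard part the comment mentions.

```python
# 16 onsets x 16 rimes = 256 distinct pronounceable tokens, one per byte value.
ONSETS = ["b", "d", "f", "g", "h", "j", "k", "l",
          "m", "n", "p", "r", "s", "t", "v", "z"]
RIMES = ["a", "e", "i", "o", "u", "an", "en", "in",
         "on", "un", "ar", "er", "ir", "or", "ur", "ay"]
WORDS = [onset + rime for onset in ONSETS for rime in RIMES]
WORD_INDEX = {word: value for value, word in enumerate(WORDS)}


def hex_to_words(hex_key: str) -> str:
    """Encode a hex string as hyphen-separated words, one word per byte."""
    return "-".join(WORDS[byte] for byte in bytes.fromhex(hex_key))


def words_to_hex(phrase: str) -> str:
    """Decode a hyphen-separated word phrase back into a lowercase hex string."""
    return bytes(WORD_INDEX[word] for word in phrase.split("-")).hex()


# A made-up 64-character hex key; it encodes to 32 words.
example_hex = "0123456789abcdef" * 4
example_phrase = hex_to_words(example_hex)
round_tripped = words_to_hex(example_phrase)
```

With a 65,536-word list each word would cover four hex characters instead, halving the phrase length at the cost of a much larger vocabulary to choose and translate.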

Can you share on dat via tor?

is this different from ipfs?

