
PeerTweet – Decentralized feeds using BitTorrent's DHT network - sktrdie
https://github.com/lmatteis/peer-tweet
======
autoreleasepool
I think it's clear that, in light of what's been going on, we need a more
diverse ecosystem of P2P networks in order to protect the integrity of our
private communications. So I support any effort to experiment in this area

~~~
supergreg
Is there any library/framework that helps build p2p things? That could help
the ecosystem the same way web frameworks have helped build websites faster
and more secure than ever.

~~~
lewisl9029
I'm not aware of any batteries-included _frameworks_, but here are some of
the p2p-ish libraries that I've either worked with personally, or looked into
and found promising, along with some commentary:

Telehash: [http://telehash.org/](http://telehash.org/)

Provides a p2p-capable messaging stack. I've worked with an earlier version
(v2) for Toc Messenger [1]. V2 included an integrated DHT for public peer
discovery along with the p2p messaging stack. Was actually surprisingly robust
and reliable considering how young the project was at the time.

V3 seems to have pivoted towards the private mesh networking side, and spun
out the DHT into a separate project, but V3 itself doesn't seem to have a
clear story for integrating with an external, public DHT the last time I
checked, so the intended use cases aren't quite the same as V2.

[1] [http://toc.im/](http://toc.im/)

Matrix: [http://matrix.org/](http://matrix.org/)

Doesn't allow for fully peer-to-peer messaging, but is decentralized and
federated like XMPP. Probably the closest thing on this list to a framework
(handles both messaging and storage).

Last time I looked at it, my main concern was how much implicit trust users
needed to place in homeserver providers (messages and user profiles are stored
in plaintext). Unless this has changed, I don't find it viable for any app
that values user privacy. Even federated networks eventually gravitate towards
a large degree of centralization once the tech becomes mainstream (see email).
If any servers are used at all in a privacy-minded app, they need to be zero-
knowledge as far as I'm concerned, for actual user data at the very least, if
not metadata as well (I realize the latter is an unsolved problem).

RemoteStorage: [https://remotestorage.io/](https://remotestorage.io/)

A decentralized, federated protocol and library for data & file storage. Also
used this for Toc. Worked as advertised but I'm looking to explore fully p2p
alternatives for my next project.

The main problems I ran into (aside from the non-fully-p2p architecture) were
the rather clunky API (for instance, it requires developers to provide a
schema to fully specify objects they want to store, but doesn't provide a
versioning and migration mechanism to take advantage of the schema specs,
although it is planned), and the fact that encryption for data at rest wasn't
implemented (until I finished working on a custom encryption layer for Toc
built on top of the library).

Kinto:
[http://kinto.readthedocs.org/en/latest/](http://kinto.readthedocs.org/en/latest/)

Similar to RemoteStorage, but decentralized discovery is planned and not
implemented yet. It's built by Mozilla and apparently used in Firefox and
Firefox OS (I'd assume for Firefox Sync?).

They also have a comparison table with RemoteStorage as a part of their docs
if you're interested in a more detailed (but possibly biased) comparison
[2].

[2]
[http://kinto.readthedocs.org/en/latest/overview.html#compari...](http://kinto.readthedocs.org/en/latest/overview.html#comparison-with-other-solutions)

Swarm: [https://github.com/gritzko/swarm](https://github.com/gritzko/swarm)

A decentralized data replication library that's built on ops-based CRDTs [3].
Some very impressive work here. Although I recently found out that V1 of the
library no longer supports fully p2p replication due to architectural changes,
which made it a lot less interesting for my use cases.

[3] [https://en.wikipedia.org/wiki/Conflict-free_replicated_data_...](https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type#Operation-based_CRDTs)

Replikativ:
[https://github.com/replikativ/replikativ](https://github.com/replikativ/replikativ)

The one I'm most excited about personally. Similar to Swarm, a CRDT-based data
replication library, but even more ambitious (they envision an open data
exchange whereby users fully control their own data, and can specify any
number of applications to have access to that data, instead of the status quo
where user data is kept in closed, per-application silos). There's a
comparison with other alternatives, including Swarm, in their README [4].

They also plan to have pure browser-based p2p replication through WebRTC,
which is exactly what I'm looking for.

It's built in and for Clojure and ClojureScript, but JavaScript support is in
the works. As if I needed any more incentive to get started on my first
ClojureScript app. =)

[4]
[https://github.com/replikativ/replikativ#alternatives](https://github.com/replikativ/replikativ#alternatives)

GunJS: [http://gun.js.org/#step1](http://gun.js.org/#step1)

A decentralized, replicated graph database. The pitch is very impressive
sounding, but I'm hesitant to actually use this because they promise _a lot_
of replication magic, but the replication is built on a home-grown algorithm
rather than something rigorously peer-reviewed and implemented industry-wide
like CRDTs.

I'd love to hear from anyone who has actual working experience with it though.

IPFS: [https://ipfs.io/](https://ipfs.io/)

IPFS and its suite of associated libraries has already been mentioned
elsewhere in this thread, so I won't elaborate on it.

~~~
marknadal
Author of gun.js here, great comments/concerns.

First, Aphyr (Kyle Kingsbury of Call Me Maybe fame, Jepsen database testing
suite) did some tweets about GUN (
[https://twitter.com/aphyr/status/646302398575587332?ref_src=...](https://twitter.com/aphyr/status/646302398575587332?ref_src=twsrc%5Etfw)
), and I met with him at a conference in Berlin where we were both speaking. He
had some concerns (he wanted the conflict resolution to be plug-and-play with
your own CRDT, and noted that if journaling is disabled some stale/old updates
would be dropped), but overall seemed positive about GUN (check his own
tweets). I hope to have him do a more serious review at some point, but I also
hope that his initial thoughts provide some initial industry peer review.

Tailing the first point, part of the reason why we didn't receive a more
critical response from him is because we make NO claims of serializability or
linearizability, which is what most academic research focuses on. In fact, GUN
takes a somewhat controversial view with regards to classical things like
Strong Consistency. For more information on how GUN's algorithm works, check
out this recent podcast we did with the LondonJS community -
[https://youtu.be/qJNDplwJ8aQ?t=32m45s](https://youtu.be/qJNDplwJ8aQ?t=32m45s)
.

Second, GUN does use CRDTs. CRDT stands for "Conflict-free Replicated Data
Type" (the state-based kind is also called a "Convergent Replicated Data
Type"), which is a fancy way of saying that you'll get the same end result
regardless of the ordering of the operations involved. Because GUN is
offline-first, it is incredibly important that it can retry an update multiple
times to
handle network partitions - this fundamentally results in operations not
having a consistent ordering, especially when dealing with multiple peers
updating (offline) simultaneously. If you watch the "Holy Grail Demo" (1min
long, [https://youtu.be/-i-11T5ZI9o](https://youtu.be/-i-11T5ZI9o) ) you see
that this does not wind up being a problem.

Third, there are other CRDTs that can be implemented on top of GUN's core. A
specific instance of where GUN's default will fail you is in commutative
operations (something Kyle was also concerned about): by default you cannot do
increment (adding numbers) operations with GUN, partly because GUN treats a
single value as being atomic. If you have an HN upvote count of 5 and you
naively use GUN's out-of-the-box CRDT, and two people add 1 to their local
value at the same time, they would both converge to the same end result
(5 + 1 = 6), but the intended result is 7. The intention-preserving CRDT
needed here is called a Commutative Replicated Data Type, which is a specific
variant of the general category. However, this does not mean that it is
impossible to do this with GUN. You can, because GUN's underlying CRDT is
designed to be emergent, allowing other CRDTs to be built on top. At some
point we (or somebody in the community) will provide these as plug-and-play
modules/extensions to GUN, so that you won't have to build them yourself.
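
A rough sketch of the upvote case, in Python rather than GUN's own API (the `merge_atomic` function, the `GCounter` class, and the peer names are all hypothetical, made up for illustration). It contrasts a value treated as atomic with a grow-only counter, one of the standard commutative CRDTs:

```python
from dataclasses import dataclass, field


def merge_atomic(a: int, b: int) -> int:
    """Treat the whole value as atomic: peers simply agree on one winner.

    max() stands in for whatever deterministic rule picks the winner.
    """
    return max(a, b)


@dataclass
class GCounter:
    """Grow-only counter: one slot per peer, merged by per-slot maximum."""
    counts: dict[str, int] = field(default_factory=dict)

    def increment(self, peer: str, amount: int = 1) -> None:
        self.counts[peer] = self.counts.get(peer, 0) + amount

    def merge(self, other: "GCounter") -> "GCounter":
        peers = self.counts.keys() | other.counts.keys()
        return GCounter(
            {p: max(self.counts.get(p, 0), other.counts.get(p, 0)) for p in peers}
        )

    @property
    def value(self) -> int:
        return sum(self.counts.values())


# Both peers start from 5 upvotes and each adds one while offline.
alice_atomic, bob_atomic = 5 + 1, 5 + 1
atomic_result = merge_atomic(alice_atomic, bob_atomic)  # one upvote is lost

base = GCounter({"origin": 5})
alice = GCounter(dict(base.counts))
bob = GCounter(dict(base.counts))
alice.increment("alice")
bob.increment("bob")
counter_result = alice.merge(bob).value  # both upvotes survive

print(atomic_result, counter_result)  # 6 7
```

Because each peer only ever touches its own slot and the merge takes the per-slot maximum, merging in either order (or merging the same state twice) gives the same total.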

Fourth, "home-grown" is somewhat true, but I think that is giving me too much
credit. GUN's underlying conflict resolution algorithm is based on a hybrid
lexical, vector, and timestamp clock - hardly magic at all, in fact they are
somewhat naive intentionally. The nice thing is that they "just work" for the
majority of web based apps right out of the box, which is probably why it
seems magical. However, they won't work (alone) for more complicated business
logic, and that is why we encourage the UNIX/NodeJS philosophy of adding those
algorithms on top. It is important that your bread-and-butter machine is not
based on a monolithic black box (like a lot of databases are), instead you
should understand the base of the system and then plug your business-specific
needs on top.
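
To give a feel for why a naive clock-based rule still converges, here is a minimal Python sketch. It is not GUN's actual algorithm: it assumes only a timestamp plus a lexical tie-break, leaves out the vector-clock part entirely, and the `Update`, `resolve`, and `converge` names are hypothetical.

```python
from itertools import permutations
from typing import NamedTuple


class Update(NamedTuple):
    timestamp: float  # writer's clock reading when the value was set
    value: str


def resolve(a: Update, b: Update) -> Update:
    """Newer timestamp wins; equal timestamps fall back to lexical order of the value."""
    return max(a, b, key=lambda u: (u.timestamp, u.value))


def converge(updates) -> Update:
    """Fold a sequence of updates into one state, in the order they arrive."""
    state = updates[0]
    for update in updates[1:]:
        state = resolve(state, update)
    return state


updates = [
    Update(10.0, "blue"),
    Update(12.0, "green"),
    Update(12.0, "red"),   # same timestamp as "green": the lexical tie-break decides
    Update(10.0, "blue"),  # a retried (duplicate) delivery
]

# Try every possible arrival order and collect the resulting states.
final_states = {converge(order) for order in permutations(updates)}
print(final_states)
```

Every arrival order ends in the same single state (the "red" update at timestamp 12), because picking a maximum under a total order is commutative, associative, and idempotent. That is also why retries and reordering are harmless here, and why anything that needs to combine values rather than pick one has to be layered on top.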

Fifth, we're looking for more academic and peer review! I've already had
numerous discussions with a Distributed Systems PhD at Carnegie Mellon who
got poached by Uber's self-driving car division; he's trying to help me find
some people to put together a paper. That said, if you know anybody or could
help connect me to that world, it would be greatly appreciated.

Thanks so much for including us in the list, please shout if you have any more
questions! :)

~~~
lewisl9029
Thanks Mark, that definitely cleared up a lot of my concerns on Gun. To be
honest, I only had a cursory glance at it when I first heard about it, but
reading your comment definitely sparked enough interest for me to take a
deeper look.

One question still on my mind after quickly skimming your docs: Can Gun sync
directly between browsers (using something like WebRTC) without involving any
servers? If so, could you point me to where it might be documented? If not,
please consider this a feature request. =)

~~~
marknadal
The answer is "Yes, it is in development" by one of our contributors:
[https://github.com/PsychoLlama/gun-rtc](https://github.com/PsychoLlama/gun-rtc).
But you'll realistically have to wait (or help) with it, before it is
usable.

As you know though, WebRTC unfortunately requires some initial servers
(STUN/ICE, or a signaling server, etc.) so it is not fool-proof P2P sadly.

------
PuercoPop
[http://twister.net.co/](http://twister.net.co/)

A more mature similar open source implementation of the same idea.

~~~
sktrdie
Totally appreciate Miguel's work and I've been following them for a long time.
I think what they're doing is awesome and people should definitely check
Twister out. I should probably have a related work section on the README.
Anyway the approaches differ in some areas even though they seem to have the
same goal in mind.

First off Twister is its own network. They use libtorrent and therefore most
of BitTorrent's tech (with some alterations). PeerTweet on the other hand
actually uses BitTorrent's mainline DHT network. Twister also includes the
ability to assign names to your feeds based on blockchain mining; again, this
is their own blockchain. PeerTweet doesn't do that and instead relies on
others to build that functionality on top of it (hint: Namecoin).

I'm not saying that having your own network for everything is bad, I'm just
stating the differences :)

Also I think we need different tools that try to do the same thing (ex.
Twitter) using different methods.

~~~
synchronise
From what I've seen, some of the Namecoin devs have distanced themselves from
the project because they want to introduce some sort of zero knowledge
authentication such as Zerocash to improve anonymity, but because it may be
incompatible with the current network they want to focus on either making a
parallel or competing solution.

------
metasean
DHT - Distributed Hash Table
[https://en.wikipedia.org/wiki/Distributed_hash_table](https://en.wikipedia.org/wiki/Distributed_hash_table)

------
jimktrains2
> "d": <unsigned int of minutes passed from epoch until when item was
> created>,

I know it's a simple /60, but why change from the more-or-less standard
seconds from epoch? To try to squeeze more time out of a 32bit unsigned
integer? Since there is no discussion of what size this integer is, I'm
expecting you're just defaulting to the json standard, which would support a
53-bit signed integer, iirc.

~~~
sktrdie
Yeah it's just a matter of saving space. Here's the actual code:
[https://github.com/lmatteis/peer-tweet/blob/master/app/compo...](https://github.com/lmatteis/peer-tweet/blob/master/app/components/Tweet.js#L77)

I'm really bad at binary arithmetic so I might have done something wrong, but
essentially I calculate how many bytes I need to use for the integer, and I
store it in a buffer.

I might change this if the space required is the same for seconds. I just
figured I'd save some space, and I only need minute precision really.
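
For a quick sanity check on the space question, here is a small Python sketch of the "use only as many bytes as the integer needs" idea. It is an assumption about the approach, not a port of the linked Tweet.js, and `min_bytes` and `encode_uint` are hypothetical helper names.

```python
def min_bytes(n: int) -> int:
    """Smallest number of whole bytes that can hold the unsigned integer n."""
    return max(1, (n.bit_length() + 7) // 8)


def encode_uint(n: int) -> bytes:
    """Big-endian encoding of n using as few bytes as possible."""
    return n.to_bytes(min_bytes(n), "big")


seconds = 1_456_000_000  # an arbitrary moment in February 2016
minutes = seconds // 60  # 24_266_666

print(len(encode_uint(seconds)), len(encode_uint(minutes)))  # 4 4
```

Under these assumptions both values take four bytes today: minute counts outgrew three bytes (2^24 minutes) back in 2001, and second counts keep fitting in four bytes until 2106. So with a variable-length encoding like this one, switching to seconds would not cost any extra space for roughly the next ninety years.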

~~~
dTal
My personal intuition is that the tiny space savings isn't worth the creation
of a new timestamp format that will undoubtedly cause frustration down the
line, if this catches on.

------
levemi
> PeerTweet

Since it's not actually using Twitter and you're not turning into a little
bird to share a song can we call writing short messages something other than
"tweets"? You might also run into trademark issues with Twitter, but that
doesn't bother me as much as grown adults referring to general status updates
and conversing as "tweeting", especially when Twitter isn't even involved.
We just like need to draw a line in the sand somewhere.

~~~
Klathmon
Do we? Is it really that big of a problem?

If someone says they read a "Tweet", I instantly know that it was a very short
message most likely made with very little thought or planning behind it.

It's a word, just like any other. This whole idea that we should stick to more
vague words because they are "more adult" is silly to me.

------
patrickaljord
Some of Twitter's top features are search, trending topics, easy
discoverability, etc. In other words, all the goodness brought by
centralization (which also comes with all the bad: censorship, no
open/free data, etc.). The first one to crack this (offering great search and
discoverability together with decentralization) will win the internet. But it
just seems super hard. People tend to concentrate and group no matter how
decentralized they start; the same thing is happening with bitcoin right now.
I'm not saying it is impossible, but I don't think it's coming in the near
future. I sure hope to be proven wrong though.

~~~
rasengan
I think a twitter-decentralized-blockchain could solve this?

~~~
toyg
Check out twister, it's exactly that: a blockchain-backed twitter, where
direct messages are handled through DHT. It is very promising, although it had
its birth pains: decentralization inevitably brings spammers and squatters...

------
jorgecurio
What's really holding back the explosion of trustless, decentralized, secure,
p2p applications is the very people using the internet. Most people don't give
a rat's ass about, or even have the capacity to appreciate, something like
this, what's going on in the background, and _why_ it matters.

All it takes is some teeny boppers like a snapchat demographic to adopt new
technology. Facebook, twitter, all worked because we were all young once and
so was everyone around us.

Some applications by nature will not be possible such as Google but one can
dream right? A decentralized, distributed Google that's as good as the real
thing but without any of the inherent creepiness? DuckDuckGo is a real market
reaction to the ongoing battle between politics and technology, but it just
isn't as good as the real thing.

I think that, all in all, decentralizing applications means a complete
paradigm change in how we value the end user. No longer is a browser
fingerprint of some schmuck an exploitable piece of data for advertisers or
government bodies.

I believe the decentralization movement is happening now, and things are
moving far faster than I had expected. I really underestimated how quickly
adoption happens on the internet: in months, if not a year, people's habits
drastically change, suddenly a new monster appears, and everyone loses it.

~~~
arahant7
Every event like the lavabit shutdown, the sybil attack on Tor and even the
current Apple issue will add further push to the decentralization movement.
Hopefully in the next year or two we will see many completely surveillance
proof systems come up.

------
nyan4
What's the threat model here? Is the potential adversary able to do MITM?

- The DHT does not anonymize the poster's IP address, the post contents, the
posting time, or other metadata

- The same goes for the reader's IP address, what was read, and when

~~~
fiatmoney
Based on the timing, I would imagine the threat model is ad purveyors taking
over a previously relatively open application provider and engaging in soft-
to-hard censorship of controversial topics until the only acceptable content
is tweeting about how happy you are about #brands and #socialjustice.

~~~
IncRnd
That isn't a threat model for this but speculation that this is a mitigation
of a threat in twitter's threat model.

It isn't clear what threats the design of this system guards against, or that
much thought was given to the types of attacks, as opposed to creating
something shiny and decentralized.

------
lisowski
Very Cool! I've been thinking a lot about decentralized social networks and
the value that they could bring. With companies I love like Soundcloud so far
in the red, it is only a matter of time until they go under or change
fundamentally. I
think this idea of decentralized feeds fits very well with the current state
of music, many different artists, collectives, and labels that basically only
exist on the internet. If anyone wants to explore the idea with me I am very
open to it!

------
rakoo
> "a": <utf8 http url of an image to render as your avatar>,

HTTP, again ? Ah, if only there was a decentralized way to exchange content
without the need for any central authority, all while being sure of the
authenticity.

~~~
sktrdie
HTTP is easy and well supported. Why not mix centralized solutions with
decentralized ones? The important part is that you can edit your tweet with
different URLs if they go down.

~~~
ihuman
Plus, you could always host your own image. No one is forcing you to use a
centralized image-hosting service.

~~~
Taek
You still end up relying on things like BGP; the centralization goes a long
way down.

------
im_dario
I did a decentralized Twitter in 2010 with a very similar design:
[https://github.com/imdario/qantiqa](https://github.com/imdario/qantiqa)

------
TheAndruu
Reminds me of
[https://github.com/TrsstProject/trsst](https://github.com/TrsstProject/trsst)
and others like it before, like App.net.

This isn't a new idea and I'm 100% behind a more decentralized and free
interwebs. The question is what will it take to gain traction. Ease of use is
helpful, but there must be something compelling for people to adopt it.

------
yazriel
Does this create a new DHT entry per tweet or per user ?

I think the DHT itself can quickly become overloaded/slow - DHT does impose
some b/w overhead for each node

~~~
sktrdie
One DHT entry per tweet. I do agree that it does create overload on the DHT.
Initially in fact I was looking at implementing it as such:
[https://github.com/bittorrent/bittorrent.org/issues/19#issue...](https://github.com/bittorrent/bittorrent.org/issues/19#issue-110508430)

Problem is that you don't get the "hosting" effect that you get with the DHT
so you always need a peer seeding your feed or else it won't be reachable.

Anyway, I hope others take on the challenge of implementing alternatives
looking at different methods.

~~~
yazriel
Actually - i have been playing around with building a feed on top of a single
torrent - so it would be one torrent per user - and subscribers to the torrent
would (hopefully ;)) get a feed of updates

Could this interest u ?

~~~
juanpabloaj
Do you have an example of this?

~~~
yazriel
yep.. seems to work in alpha..

No changes required to other DHT nodes. Which is cool. And i was specifically
aiming for a low/scalable bandwidth overhead

Each new "announce" propagation is kinda slow.. 1-10min to spread through the
DHT..

------
flashman
Call me cynical but a decentralized alternative to Twitter (or the domain name
system for that matter) won't work unless:

a) there's a mechanism for brands to get their associated nickname, and

b) spammers can't create millions of accounts, or doing so is worthless.

Even BitTorrent only really works because central authorities (torrent sites)
confer some level of authority on the contents of particular torrents.

~~~
sktrdie
BitTorrent's DHT is constantly being spammed and has been so since its
inception several years ago.

The whole point of DHT feeds is to actually filter out spam by only trusting
specific feeds. Torrent site owners can publish their torrents on PeerTweet
and avoid having to change their DNS every week or so (therefore having to
rebuild their reputation).

------
cm3
Do browsers prompt, like they do for mic and camera, before opening p2p connections?
Nobody wants a letter from the MPAA because a random website's javascript
starts downloading/seeding stuff in the background without the user's
knowledge.

~~~
uptown
P2P isn't always mic and camera.

Here's an example: (WARNING: Don't open this link if you're on mobile, or on a
capped data connection)

[https://webtorrent.io/](https://webtorrent.io/)

~~~
cm3
That's what I'm saying. It should also prompt for permission if there's no
audio/video involved. Visiting a random site can make you upload and download
illegally at the same time. Therefore there's no justification for it to be
enabled by default if just JS is allowed.

------
ilostmykeys
For security: this should be combined with something like CertCoin. HTTPS is
Evil™ 8)

------
reitanqild
Very interesting, wanted to install and try until I read npm.

I do not feel very dumb, but dealing with node takes more energy than I think
I have right now.

So, to summarize: still very interested, but won't try unless someone explains
how to like node.js.

~~~
mrmondo
I see you've been downvoted, but I felt the same way. I loved the idea, then
saw that it was in nodejs, and while I don't discount its merits I can't
bring myself to try it out if it requires nodejs & friends.

~~~
reitanqild
Thanks. I have no reason to hate JavaScript; it's just that I find Node.js
hard and I fail to see the reason for using it for a lot of things. (Then
again, I found Java hard before I started and someone showed me the basics of
Eclipse, so this is possibly more a cry for help : )

------
Humjob
Super cool idea - I'll try this out today.

