Decentralized Twitter? You just don't get it. (volaski.tumblr.com)
21 points by volaski on Aug 19, 2012 | 48 comments



Why does a decentralized service entail high latency?

The article's author takes the single example of email, and, extrapolating from that, assumes that it's impossible to build low-latency decentralized publishing protocols.

And his statement, "Protocols are for when there’s no good centralized way of getting things done," is so ridiculous, I don't even know where to begin.


Please begin though, since I will never know why you disagree with me if you don't begin.

"Protocols are for when there's no good centralized way of getting things done" is actually not as crazy a statement as you think, since it also means "If there's a good decentralized way of doing something via protocols, you don't need a centralized approach".

What I meant by "getting things done" was efficiency. Decentralized and centralized systems each have their pros and cons, and efficiency is definitely not one of the pros of decentralized systems.

I do apologize for the incoherency in the post, but I still don't think I made any incorrect points. It would be great if you could provide a counterexample to my statements.


> efficiency is definitely not one of the pros of decentralized systems

The economy is a simple example of a system where decentralized == efficiency and centralized == inefficiency.


Great point! I agree: once a system becomes really huge it becomes hard to maintain, and a centralized system that used to be efficient becomes inefficient. However, when the system CAN be maintained, centralized systems are generally more efficient IMO. For example, many developing countries go through a stage where the government makes centralized plans. Even the U.S. went through this, up to the point where the scale itself brought forward inefficiency. However, when it comes to an internet service like Twitter, which CAN be maintained in a centralized manner--at least so far--I don't think a decentralized approach has any benefit in terms of efficiency :)


It makes sense if you read it as "[server to server] protocols are for when there’s no good centralized way of getting things done,"

But then it's tautological.


I get the impression the author doesn't have a clue what 'realtime' means. I appreciate that he is trying to say 'within seconds', but that is not realtime; true realtime is a completely different game.


Yep, Twitter is not the classical realtime taught in school. He probably would have done better by referring to it as "everyone in the same living room commenting on the TV". Its timeliness is directly linked to how many people are at / watching the same event. The tweets about a NASCAR race or the plane coming down in the Hudson really would be a pain / confusing if received hours later. For a lot of stuff, this doesn't matter so much (like e-mail).

"Pause and Consider" is pretty much a lost concept in the online news world. Look at the Apple screw fiasco for a case where the repeaters didn't. Maybe having a distributed network of sorta timely servers wouldn't be such a bad thing. I seem to remember debates happening fairly rapidly on USENET.


> Maybe having a distributed network of sorta timely servers wouldn't be such a bad thing. I seem to remember debates happening fairly rapidly on USENET.

I believe this is what twitter does. At least that is the impression I have as a user. Having 'watched' a number of live events on twitter, I tend to get floods of 10-30 tweets at a time, all in the same language. I presume this means there is a delay related to geographic servers packing up multiple tweets for replication.
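
If that guess is right, the batching might look something like this toy sketch (pure speculation on my part; replicate_to_peers and the thresholds are invented, and this has nothing to do with Twitter's actual internals):

    import time

    def replicate_to_peers(batch):
        # Stand-in for the actual cross-region network call.
        print(f"replicating {len(batch)} tweets to peer datacenters")

    class RegionBuffer:
        """Accumulate tweets, then ship them to other regions in batches."""

        def __init__(self, flush_every=5.0, max_batch=30):
            self.buf = []
            self.last_flush = time.monotonic()
            self.flush_every = flush_every
            self.max_batch = max_batch

        def add(self, tweet):
            self.buf.append(tweet)
            too_full = len(self.buf) >= self.max_batch
            too_old = time.monotonic() - self.last_flush >= self.flush_every
            if too_full or too_old:
                self.flush()

        def flush(self):
            if self.buf:
                replicate_to_peers(self.buf)
                self.buf = []
            self.last_flush = time.monotonic()

A subscriber in another region would then see tweets arrive in bursts, which matches what I observe.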


There ought to be a way to bring IRC into all this...


Exactly. IRC networks have been realtime AND decentralized for decades; they even solved the redundant-data problem.


Why do you get that impression? I do have a clear understanding of realtime in the sense you are implying. That kind of "realtime" system is actually closer to the point I'm trying to make: generally, a centralized architecture is better than a decentralized one for building robust and reliable systems.

In fact it would be great if you could share some counter examples.


'Realtime' may have many definitions, but those I'm familiar with relate to near-instantaneous feedback-based control systems. More generally it applies to systems with mandatory and very limited time constraints. A common high-level software example would be algorithmic stock trading. A more historically proper 'realtime' system is the code/hardware that put the MSL rover on the ground on Mars. These are not things which Twitter does. Their UI may give the impression of 'realtime'; their systems, however, are not, nor do they even pretend to be, realtime systems.
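
To illustrate the distinction with a contrived sketch: a realtime constraint is a hard deadline, where missing it means the system has failed, not merely slowed down (the 5 ms figure is made up):

    import time

    DEADLINE_MS = 5  # hard constraint; the number is invented

    def control_step(read_sensor, actuate):
        # In a hard-realtime system, exceeding the deadline is a
        # failure of the system, not just degraded performance.
        start = time.monotonic()
        actuate(read_sensor())
        elapsed_ms = (time.monotonic() - start) * 1000
        if elapsed_ms > DEADLINE_MS:
            raise RuntimeError(f"deadline missed: {elapsed_ms:.2f} ms")

Twitter makes no guarantee remotely like this; a tweet that arrives a minute late is still a delivered tweet.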

In short, you've taken a term and misappropriated it based on marketing hype. That is a reasonable thing to do when pitching to a bunch of consumers/marketing folks; it doesn't usually sit all that well with engineers.


Like you said, "realtime" has many definitions. And I disagree it's a misappropriated term based on marketing hype. Twitter IS realtime, just in a different context. If you stick to your strict definition, it would be incorrect to even call things like XMPP realtime.


I have no idea why you're bringing XMPP into this, but just like Node, AMQP, ActiveMQ, ZeroMQ, Jetty, or <insert messaging or async tool here>, it has no singular impact on a system being realtime or not; these are just tools. If you want to consider a system which involves these tools and has a realtime constraint, e.g. X arrives before Y time or we're seriously screwed, then sure, I'll happily concede and call the system, not the tool, realtime.


Clearly, you've never worked in distributed systems. Twitter (and other large systems that are "centralized" by your definition) is actually a huge, loosely coupled decentralized system.

The real difference is whether that decentralized, loosely coupled system is run by and owned by a single corporation, or not.

It's the design of the system itself that determines latency, not the centralized vs. decentralized argument.

You're speaking as if twitter.com is a single giant machine on a single IP address. In fact, it's thousands of loosely coupled machines and services all working in tandem across nearly the entire globe. Sounds like a typical decentralized system to me.


I have a feeling that everyone in this particular branch of the thread is criticizing me based on their disagreement with the definitions of the terms I used.

I understand it can be confusing because I used these terms interchangeably as well as in a looser context, and I apologize for being incoherent. I tried to elaborate on my logic throughout the rest of this thread here and there, so if you have time, please read through those comments.

Trust me. I do know a thing or two about the technologies I mentioned in the post.


The only thing using a centralized service buys you is a single point of failure.

(FTR, this is hyperbole, but not by much.)


That's exactly my point. A single point of failure means a single point of responsibility, which means that single point must do a damn good job of maintaining the service.

That's why a "non-profit" approach will never work.


So, by that measure, "The Web" as a nonprofit (w3c.org?) will never work and should be a huge failure, right?


Did you get to read my post by any chance? That's not what I talk about in the post. My point was that sites like Wikipedia are successful as non-profits, but the type of value Twitter provides is very different from the value the w3c or Wikimedia provides and should be considered in a different frame.


The only value twitter provides is the value of its users. There's virtually no value in the platform itself.


I have been playing with the concept of a distributed Twitter-like thing built on Google's App Engine. You download it, edit one file (or run a setup program), and upload it to your own account. Unless you become insanely popular, you should be well under the free-tier limits.

Google would probably hate me.


I already did it: http://code.google.com/p/qantiqa/ It was my CS degree final project.

It is based on an obscure P2P framework from my university (itself based on another one), and it uses GAE as the overlay's entry point. The network is fully functional and independent of the entry point, but the entry point is needed to do authentication.


I imagined relying on Google accounts for the App Engine application and on https for internode communication. Pushes to subscribers could require a confirmation callback, and https would ensure you'd be confirming with the right sender.
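
Roughly the flow I have in mind, as a sketch (the endpoint paths and payloads are invented for illustration):

    import requests  # third-party HTTP client, assumed installed

    def push_to_subscriber(subscriber_url, post):
        # Deliver the post over https to the subscriber's inbox.
        resp = requests.post(subscriber_url + "/inbox", json=post, timeout=10)
        resp.raise_for_status()

    def confirm_sender(sender_url, post_id):
        # Subscriber side: fetch the post back from the claimed sender
        # over https. The TLS certificate ties the response to the
        # sender's domain, so a forged push fails this check.
        resp = requests.get(f"{sender_url}/posts/{post_id}", timeout=10)
        return resp.status_code == 200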


Wouldn't it be easier to solve sender identity with PGP keys? Just a random idea.
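
It wouldn't have to be PGP proper; the core idea is just signed messages. A minimal sketch of that idea with Ed25519 keys via the Python cryptography package (key distribution, the genuinely hard part, is waved away here):

    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import (
        Ed25519PrivateKey,
    )

    private_key = Ed25519PrivateKey.generate()
    public_key = private_key.public_key()  # would be published in the profile

    message = b"had a sandwich for lunch"
    signature = private_key.sign(message)

    try:
        public_key.verify(signature, message)  # raises if forged
        print("sender verified")
    except InvalidSignature:
        print("forged or corrupted message")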


XMPP is cooked into the platform. I wonder how hard it would be to use it with other implementations.

PGP is probably too complicated for the average user. I wouldn't like to expose people to that (and failing to do so while relying on it would open a lot of attack vectors).


Two words: do it! If you can do it, and if you'd learn something useful for yourself, think about the consequences later as long as it's legal :P Google might just as well hire you ^^


> Google might just as well hire you

I interview with them every once in a while. Unfortunately, the opportunities for engineers in São Paulo are very limited.


Honestly, if my tweets showed up at the speed of email instead of the speed of an IM, my twitter experience wouldn't suffer one iota.


Twitter is just hosted, proprietary pub/sub. You could do the same thing with an XMPP server and client for a long time before Twitter appeared. I think the only reason people are asking for a decentralized Twitter is that they're not aware that pub/sub protocols that do the same thing exist.
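
Stripped of persistence and scale, the pattern is tiny. A toy sketch (the abstract model, not Twitter's or XMPP's actual implementation):

    from collections import defaultdict

    class PubSub:
        def __init__(self):
            # publisher -> list of subscriber callbacks
            self.subscribers = defaultdict(list)

        def subscribe(self, publisher, callback):
            self.subscribers[publisher].append(callback)

        def publish(self, publisher, message):
            # Fan the message out to every subscriber immediately.
            for deliver in self.subscribers[publisher]:
                deliver(message)

    bus = PubSub()
    bus.subscribe("alice", lambda m: print("bob sees:", m))
    bus.publish("alice", "140 characters or fewer")

Everything else Twitter adds is hosting, discovery, and scale on top of that pattern.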


In fact, "Twitter is just hosted, proprietary pub/sub" is exactly the type of thinking that I was criticizing in the post. Sure, the technology is not difficult to implement, but the point I was trying to make was that Twitter's value is not in the technology but in the realtime delivery network it created.


But XMPP does the exact same thing, and it works in a distributed fashion. I can have an instant-message conversation in my IM client over XMPP with messages being exchanged effectively instantaneously. There's no appreciable delay between me hitting Enter and the other party seeing my message, barring problems on the network between me and them; and those problems would degrade my perception of Twitter's performance too, if they were between me and Twitter. While centralized instant-message services exist (AIM, for instance), in 15 years of using instant messaging more or less daily I've never ever had the experience of feeling like a conversation I was having over AIM was happening "faster" than one I was having over XMPP. Never. It has literally never happened.

XMPP is performant enough for Google to use it as the IM component of Google Talk. Google Talk is instant messaging. Not "eventual messaging." Instant. Just like the centralized alternatives.

Centralized services have some pros over distributed ones -- ease of discovering other users is a big one. But you're just making an assertion that only a centralized system can pass messages back and forth fast enough to be perceived as effectively instant without providing any evidence to back that assertion up.


Let's say you hosted your Diaspora or Status.net or whatever on your heroku server. You post something, it gets huge traffic, and the server goes down. And this doesn't happen just to you but to lots of people on average. Who will maintain this? In this type of environment, you cannot make any assumptions about realtime delivery. That's what I mean by it's not a technology problem. You can use XMPP in Google Talk since most of the time you're doing 1:1 chat. Twitter does 1:N (where N goes up to several million), and this becomes a big deal. Does this make sense?
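
To put rough numbers on the 1:N point (the figures below are assumptions for illustration, not Twitter's actual costs):

    # Naive fan-out-on-write: one tweet is copied into every
    # follower's timeline at post time.
    followers = 5_000_000   # a big celebrity account (assumed)
    write_cost_ms = 0.1     # per timeline insert (assumed)

    total_ms = followers * write_cost_ms
    print(f"{total_ms / 1000:.0f} seconds of writes for one tweet")
    # -> 500 seconds of work triggered by a single post, which is why
    #    a hobby server can't just "speak the protocol" and still
    #    promise realtime delivery at this scale.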


So AT&T isn't decentralized? I'm not an AT&T customer, nor do I live in the States, but I assume AT&T is a decentralized network which uses a communications model, or let's call it a "protocol", to talk to other networks.

If you tell me AT&T is centralized because it's covering a country, or because you can call all your family and never leave AT&T's network, that doesn't mean it's centralized per se; it might be centralized for its own services, which makes sense for any company.

But it's still part of a decentralized network of communication providers, which use a protocol... A live one actually, kind of like what twitter could become a king node of (after all, the celebrities are there, right?).

Oh... I see what you meant, I might have misread that whole post!


Imagine tweeting what you ate for lunch, only to have it show up at dinnertime.

I believe this is what the author means when he says the value of twitter is in realtime delivery.


We already have an existence proof of near-real-time 140 character messages crossing networks. It's called SMS.


Are there any statistics on the average delivery time for email? Meaning delivery to the mailserver, not final retrieval by the client. While email tends to be used more asynchronously (i.e., most people don't constantly check it), is it really much more inherently latent than Twitter?


The important thing is this has nothing to do with technology. It's more about psychology--what type of assumptions people make when using these technologies.


> by definition decentralizedness means no one is responsible

That is flat out wrong.

> In this sense, I would go as far as to say “Protocols are for when there’s no good centralized way of getting things done.”

That isn't even logically connected to the previous sentence at all. I don't understand this bit, either:

> Realtime protocols like PubSubHubBub and RSSCloud are there because that’s the only way to make the Web work in realtime. It doesn’t work the other way around.

The only way I can parse this is as refuting the strawman that "the web works in real-time because there are realtime protocols". But of course it does; even Twitter has internal protocols that work in real-time.

Would GMail be faster if they had invented their own protocol? No, it would just be dumb.

So... in summary: wait, what?


I agree my post wasn't coherent. Forgive me; I just wrote down what came to mind and posted it.

That said, I don't understand what you are trying to say either.

It would have been nice if you had discussed why my argument "by definition decentralized means no one is responsible" is "flat out wrong".

Also, regarding my statement "Realtime protocols like PubSubHubBub and RSSCloud are there because that's the only way to make the Web work in realtime": maybe I didn't state it clearly enough, but it's not as obvious as it sounds, in my opinion. Before RSSCloud and PubSubHubBub, if you wanted to run an aggregator you had to actively poll sites, which was extremely inefficient. With these protocols, aggregators can now deliver content in near realtime because they just have to implement those protocols. The point I was trying to make was that even these efforts will never be completely reliable, because they are built on top of protocols and standards. Protocols and standards work only when "good enough" is good enough.
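
For context, here's the difference in sketch form (the URLs are made up, but the hub.* parameter names are the ones the PubSubHubbub spec actually uses):

    import time

    import requests  # third-party HTTP client, assumed installed

    def process(feed_xml):
        pass  # parsing and de-duplication omitted

    # Before push protocols: every aggregator polls every feed.
    def poll_forever(feed_urls, interval=300):
        while True:
            for url in feed_urls:
                process(requests.get(url, timeout=10).text)
            time.sleep(interval)  # content is up to `interval` seconds stale

    # With PubSubHubbub: subscribe once, and the hub POSTs new entries
    # to your callback URL as they are published (near realtime).
    def subscribe(hub_url, topic_url, callback_url):
        requests.post(hub_url, data={
            "hub.mode": "subscribe",
            "hub.topic": topic_url,
            "hub.callback": callback_url,
        }, timeout=10)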

For example, building a decentralized version of a stock ticker system sounds cool but in practice its performance will never be as good as the original centralized version.


In a decentralized system like email or jabber there is responsibility: @example.com is responsible for delivering messages to its users, and for the necessary communication with other servers.

And of course you'll want to use push messages for communication between servers. It really doesn't have to add all that much latency; a few roundtrips are perfectly doable. That's a very small price to pay for the benefit of being independent of the whims (or unreliability) of a single entity.
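
To put a rough number on it: assuming a cross-server round trip costs on the order of 50-100 ms, the extra hop from the sender's server to the follower's server adds maybe 100-300 ms on top of a centralized delivery. No human reading a feed can tell the difference.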

That said, it's probably not going to happen soon, but not because of technical issues. I think it's just a matter of inertia among users, just like people are still using MSN Messenger or Skype instead of jabber.


I think when we talk about Twitter, the definition of "reliability" is entirely different. When we talk about email, we can say it's reliable as long as my email gets delivered one way or another. For example, if I send a mass email to 20 friends, I don't assume anything about when each of them will receive my message. I just think that it will get delivered in a reasonable amount of time.

However what Twitter brings to the table is this: It is safe to assume when I tweet, all my followers will be able to see it immediately. This is a huge difference not because of the delivery time, but because of the type of behavior it enables. Being able to assume that my message is delivered in realtime is very different from not being able to assume. And this is not a technical issue, so even if you build a "good enough" realtime system using decentralized architecture, it wouldn't be half as good as Twitter.


I think the difference between 0.1s and 1s, as I imagine it would be, is quite irrelevant. The immediacy comes from the push; adding another hop for a federated server does not have to add significant latency.


> It is safe to assume when I tweet, all my followers will be able to see it immediately.

Unless they do something else with their computer, or are doing some chores in the room with the big blue skybox.

And you can only assume this because you know Twitter has fast servers etc. - not because it's a proprietary black box.

The same would hold true for your favourite, lightning-fast diaspora or OStatus pod of choice. Yes, ONLY those followers who ALSO care about super instant real-time speed, will be signed up to fast pods in turn -- but guess what, if you forced them to be on twitter they still wouldn't read, much less react, to your stuff in real-time. So you gain nothing.

Is there a delay when posting a status to identi.ca before it shows up on rstat.us, for example? Have you even tested this before accusing people of "not getting it"?


Stock tickers are decentralised. Trading houses 'subscribe' to ticker 'feeds' from many different exchanges and receive updates in 'real time'.


But the protocol used and the latency are not linked at all. You can use a proprietary protocol to run a centralized service, you can use a proprietary protocol to run a decentralized service, you can use a public protocol to run a centralized service, you can use a public protocol to run a decentralized service. And of course you can also mix them, insofar as they interface.

> For example, building a decentralized version of a stock ticker system sounds cool but in practice its performance will never be as good as the original centralized version.

Well, yes - but the question is, how fast is fast enough, for most applications? And wouldn't it make more sense for the people who need to have really, really fast exchange of status updates, to sign up for the largest, fastest provider -- instead of everybody having to be there, just because some people need it real-time, all the time?

Personally, I prefer some latency, actually. I like being able to edit thoughts after I've posted them, and so on. I like being able to search stuff and organize it -- Twitter fails dismally in that aspect, not least because of the "that's a joke, right?" limit of 140 characters. And if something really important happens, it will show up on blogs and news sites a few days later. Friends will tell me about it. Personally, I need information I can ACT on, not just... stuff... that clogs me up. New software or knowledge is usually still fine when it's a week old, or more. It's not like I can instantly learn it. And that something is popular is not something I need to know while it's popular; actually, a delay filters quite a lot of lame stuff out.

And again, you can still build the exact same thing centralized and interoperable with slower leaf nodes, on open protocols. People just don't want to, because then they don't get to license it to others and brand things that are much prettier and more useful unbranded. To me, the proof that it's a people problem is that the technical problems are more or less solved, or at least proven to be solvable (there's always room for improvement; none of our protocols are utterly fantastic). The people with the most energy are the ones that are kinda greedy. There are sparks of genius or excitement here and there, but they disperse too easily, while the ones that keep at it are the ones who should give it a rest, so the rest of us can think and build things that are by and for everybody, forever.


Sorry for being so snarky, but I really can't find any valid/coherent point in what you wrote. Let me make a counterpoint; maybe someone can find holes in it:

Imagine Twitter supported OStatus.[1]

Anything going on between users signed up on Twitter would be instant, just as it is now. Twitter could still have ads, and it could justify those ads by offering great features (OStatus et al. don't tell you what you can do, after all; they just restrict what you can share with other apps that implement them), and great speed.

Twitter users could then still subscribe to someone hosting their website from their crappy DSL connection; just those updates would be slower. And it would be up to that person to either get a really fast server, and/or pay for a CDN etc., or simply suffer the cost of the delay, and the fact that some people might not even subscribe to them because of that delay.

[1] Now imagine GitHub supporting OStatus... Wordpress, forum software, everything. Why not? Turns out the content aggregator IS THE INTERNET, and the people who "don't get it" are those who desperately try to see it as a pie that can and should be carved up. But you cannot fool all the people all the time, and given the chain of logical fallacies based on wrong claims/assumptions, I really wonder what audience this is even aimed at. Hackers, or people with money/eyeballs?

We - and I don't mean "hackers", I mean most people if you asked the question straight - do not need or want middlemen for something as basic as communication or trust. That's what it boils down to. Just like everybody should be able to read and write and do basic math, everybody should at least have the option to have their own address on the web, without being forced into one of the gated communities.

There is hope for an ironic twist: should protocols and DIY take off, Facebook and Twitter and the others had better support RSS, OStatus, etc.; because otherwise they'll just have walled themselves into a tomb, while we play in the actual garden. After all, a garden is where exciting things can happen, where people play games for fun, where they build crappy things with stones and branches and leaves, and where it's boring most of the time - in contrast, the alley is where people mistake drug addiction for fun, the mall where they mistake consumption for creativity, and the whorehouse is where the fun never stops. I know where I'd rather be. (no offense to prostitutes intended, I'm just running out of metaphors)

To finish my rant: in the wake of the app.net enthusiasm, I often read that even 10,000 users would be great if those were "the right people". Well, I will not ever be able to get to know, much less dedicate time to, even 1,000 people. So give me a network of 10,000 people who are willing to read manuals and fiddle, if they can keep their independence and dignity that way. I don't mean hackers; I'll never be a good programmer myself. I mean people who are semi-intelligent and stubborn like me. People who write for themselves, not for audiences, who not only don't care what's on TV, but don't even know. That's enough.

A person in a rented apartment must be able to lean out of his window and scrape off the masonry within arm's reach. And he must be allowed to take a long brush and paint everything outside within arm's reach. So that it will be visible from afar to everyone in the street that someone lives there who is different from the imprisoned, enslaved, standardised man who lives next door.

-- Friedensreich Hundertwasser


> We - and I don't mean "hackers", I mean most people if you asked the question straight - do not need or want middlemen for something as basic as communication or trust. That's what it boils down to.

The fatal flaw in this argument is asking the person. Very few will say 'I want a middleman in my news', but at the same time few will question that middleman. Quite the contrary: they will spread that middleman's link to everyone they know through Facebook, Twitter, Reddit, etc. This even happens on Hacker News (e.g. the constant TechCrunch and HuffingtonPost articles).

Oftentimes people will say what paints them in the best light. They will answer questions not in the context of what they would actually do, but of what they would do if they were who they imagine themselves to be.



