Building a Modern Web Stack for the Real-time Web (igvita.com)
169 points by igrigorik on Jan 18, 2012 | hide | past | favorite | 33 comments



> The messages are delivered into a PUSH (publish) ØMQ socket (ala Mongrel2) ... Backend subscribers use a PULL (subscribe) socket to process the SPDY stream

I might be fuzzy on my 0MQ knowledge and Mongrel2 but I thought PUSH/PULL are distinct socket types from PUB/SUB...


They should call them “backend workers” instead. Backend “subscribers” (think “workers”, please) don't use PUB/SUB, because then the same message would be delivered to every worker; you hardly want that behavior when processing web requests. A PUSH socket delivers each message to exactly one PULL socket.
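The distinction can be sketched without 0MQ itself; this is just a stdlib simulation of the two delivery semantics (the function names are hypothetical, not the pyzmq API):

```python
from itertools import cycle

def push_pull(messages, workers):
    """PUSH/PULL semantics: each message goes to exactly one worker,
    distributed round-robin — what you want for web requests."""
    inboxes = {w: [] for w in workers}
    ring = cycle(workers)
    for msg in messages:
        inboxes[next(ring)].append(msg)
    return inboxes

def pub_sub(messages, subscribers):
    """PUB/SUB semantics: every subscriber receives every message."""
    return {s: list(messages) for s in subscribers}

reqs = ["req1", "req2", "req3", "req4"]
print(push_pull(reqs, ["w1", "w2"]))  # each request handled exactly once
print(pub_sub(reqs, ["w1", "w2"]))    # each request duplicated to both
```

With PUSH/PULL each request is processed once; with PUB/SUB every worker would process every request, which is only what you want for fan-out notifications.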


The most interesting part of this article for me is the comment from Jay Levitt about AOL's old stack. Fascinating.

The reimplementation side of things is one more strike against software patents, at least.


> This architecture supported 1.5 million simultaneous users - with buddy lists and IMs, which means a full-mesh n^2 pub-sub system

Nowadays I can hold 1.5M open sockets on a single server ...


SPDY is here to increase efficiency between browser and server (this is sometimes called scaling). 'Dynamic network topologies' as presented means moving the network routing stack into software on the CPU. Not sure I would bridge these two technologies together to solve my 'interesting worker topology patterns'. This is classic top-down design.


Right, I'm not linking SPDY and "dynamic topologies". Rather, I'm linking the message-oriented communication of SPDY and 0MQ (and 0MQ is not the only way to achieve this).

Once a request enters your "back office" it should all be message oriented, over persistent connections, with flow control / QoS, and the like.


Btw, there's another project that acts as a gateway between HTTP/WebSockets and 0MQ here: http://tailhook.github.com/zerogw/

Also, there's an attempt for 0MQ module for Nginx: https://github.com/FRiCKLE/ngx_zeromq


On one hand, you have TCP, which already allows you to multiplex multiple streams: every bit of software understands it, and your back-office web application servers fully support it. Yes, it has limitations (slow start applying to individual streams instead of to all the streams between two computers seems to be the primary one), but instead of trying to fix it (which admittedly would take a lot of work and would be a slow process), you take a single TCP stream and implement multiplexing all over again, and now you have to build support into all your applications from top to bottom. Not surprisingly, this also takes a lot of work and is a slow process.


TCP is a transport layer and we're not talking about replacing TCP - not going to happen, and no reason to do that. 0MQ, SPDY, etc, run at application layer, and TCP flow control, window sizing, etc, work in tandem with the application layer controls.


Why do you think multiplexing multiple independent streams at the application layer instead of the transport layer is a good thing? Now that people are using it, they find they have to re-implement the other things the transport layer does (like demuxing the stream and sending each part to the correct backend service).

Yes, replacing or modifying TCP would be a long, slow process, but so is re-implementing all its functionality at the application layer. Google introduced SPDY over two years ago (Nov 11, 2009), and as the article points out, there is still a whole lot of work that needs to be done before it lives up to its promise.


Really? I was under the impression that SPDY used SCTP and not TCP?


"To minimize deployment complexity. SPDY uses TCP as the underlying transport layer, so requires no changes to existing networking infrastructure." http://dev.chromium.org/spdy/spdy-whitepaper


Should we be expecting, in a few years, that web sockets will be advanced enough for us to leave HTTP/1.1 behind? Aside from chat/messaging, I'm not sure how web sockets can provide additional functionality beyond what we have now.

Edit: I confused web sockets with SPDY. I have to go read up on what these things actually are.


WebSockets is not a replacement for HTTP/SPDY. Don't forget that a WebSocket is effectively an upgraded HTTP / SPDY connection (in fact, there is work in progress to enable WS over SPDY).

In many cases, you don't need bi-directional communication, which is also why the HTML5 spec introduced Server-Sent Events (SSE): http://www.igvita.com/2011/08/26/server-sent-event-notificat...
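For reference, the SSE wire format is just line-oriented text over a held-open HTTP response; a small sketch of the framing (the helper name `sse_event` is hypothetical, but the `event:`/`id:`/`data:` field names come from the spec):

```python
def sse_event(data, event=None, id=None):
    """Format one Server-Sent Events message for a text/event-stream
    response. A blank line terminates each event."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    if id:
        lines.append(f"id: {id}")
    # multi-line payloads become multiple data: lines
    lines.extend(f"data: {chunk}" for chunk in data.splitlines())
    return "\n".join(lines) + "\n\n"

print(sse_event("new mail", event="notify", id="42"))
```

On the client side, `new EventSource(url)` handles reconnection and dispatches these as DOM events — no bi-directional channel required.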


> Aside from chat/messaging I'm not sure how web sockets can provide additional functionality from what we have now.

Ever edit a document with multiple people, at the same time? Much easier with web sockets.

Ever get a notification that a new email has arrived the second it arrives? Much easier with web sockets.

And yes, there's chat and messaging.

There are many more scenarios that will become evident as people start using the tech. It is a step up from XHR.


>It is a step up from XHR

Thanks, this is the simplification I was after.


So if someone wants to implement EtherPad or Google Wave, they don't have to write a complex system utilizing Operational Transformation algorithms?

Or... don't have to deal with concurrent editing/locking mechanisms? WebSockets do that for us?


> WebSockets do that for us?

Nope. However, it will allow you to contact any connected client at will from the server-side. Try doing that with XHR without constantly polling the server.


What's wrong with long polling and keep-alive connections, handled with evented socket code on the server side? Is it mainly because of the HTTP request overhead?


With long-polling you have to reconnect after every message, which is a very expensive operation (relatively speaking). HTTP headers often form the bulk of the request, and the number of connections you have to maintain is high. All of these things add up.

In addition, TCP slow start, window sizing, etc, work against you when it comes to optimizing for latency: http://www.igvita.com/2011/10/20/faster-web-vs-tcp-slow-star...


> Is it mainly because of the HTTP request overhead?

Aside from the overhead you and others mention, there is a bigger problem. For most server-side languages, once a request is initiated, it is very difficult to feed new data into that request and make it perform additional functions. Most requests get their input data from GET or POST. What if an external server outside of the one that is handling the request needs to contact one of the clients?

Using long polling/SSE, you essentially have to have the thread/process handling the request also poll external sources for any events that might need to be triggered. This turns into a huge mess very quickly, especially since each polling/SSE request thread/process is isolated from the rest, forcing you to do really funky stuff to coordinate everything.

This isn't as much of a problem with web sockets. Your web socket server has access to all of the connected clients and you can create a web socket interface in any external servers to allow them to talk directly to the proper clients. It's cleaner.

WebSockets aren't perfect. I had to implement a proper client for web sockets in PHP so that it could talk to my Node.JS websocket server. It was horrible. But it was less horrible than setting all of that up with long polling.
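The "server has access to all of the connected clients" point can be sketched as a registry of per-client outbound queues; everything here (`ClientRegistry`, `demo`) is a hypothetical illustration, not any particular framework's API:

```python
import asyncio

class ClientRegistry:
    """One outbound queue per connected client, so any part of the
    server (or a bridged external service) can push to a client at
    will — no client-side polling involved."""
    def __init__(self):
        self.clients = {}

    def connect(self, client_id):
        q = asyncio.Queue()
        self.clients[client_id] = q
        return q  # the connection handler drains this onto the socket

    def push(self, client_id, message):
        # server-initiated send to one specific client
        self.clients[client_id].put_nowait(message)

async def demo():
    reg = ClientRegistry()
    inbox = reg.connect("alice")
    reg.push("alice", "doc updated")  # e.g. triggered by another server
    return await inbox.get()

print(asyncio.run(demo()))  # → doc updated
```

With long polling, by contrast, the message has to be parked somewhere until the client's next request arrives to collect it, which is exactly the coordination mess described above.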


The problem you mention is solved by evented sockets.


There is nothing inherently wrong with long polling, but when you consider a small payload (let's say 4 bytes, for fun), the HTTP request headers add significant overhead. WebSockets have very little overhead for a 4-byte payload (2 bytes unmasked, 6 with the 4-byte masking key the spec requires for client-to-server frames), and the latency is less than half. It's just more efficient.
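As a rough size check, here is a sketch of the RFC 6455 framing for payloads under 126 bytes (the `ws_frame` helper and the sample headers are illustrative assumptions, not a full implementation):

```python
def ws_frame(payload: bytes, masked: bool = False) -> bytes:
    """Build a minimal WebSocket text frame (FIN set, payload < 126
    bytes). Client-to-server frames must carry a 4-byte masking key."""
    header = bytes([0x81, (0x80 if masked else 0) | len(payload)])
    if masked:
        mask = b"\x00\x00\x00\x00"  # fixed all-zero key, for size math only
        return header + mask + bytes(b ^ 0 for b in payload)
    return header + payload

# typical headers re-sent on every long-poll request
http_headers = (
    b"GET /poll HTTP/1.1\r\nHost: example.com\r\n"
    b"Cookie: session=...\r\nAccept: */*\r\n\r\n"
)
payload = b"ping"  # the 4-byte message from the comment above
print(len(http_headers))                                    # dozens of bytes, every poll
print(len(ws_frame(payload)) - len(payload))                # 2 bytes of framing
print(len(ws_frame(payload, masked=True)) - len(payload))   # 6 bytes with masking
```

So the per-message framing is a handful of bytes either way, versus full HTTP headers on every long-poll round trip.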


Pretty much. That, and I don't think you can really send data back to the server over a long-polling connection - if you can, it's a giant hack - but WebSockets is explicitly designed for bidirectionality.

SSE, on the other hand, basically is long-polling and keep-alives, but with better defined behavior and a nicer API on the client side.


Thank you Ilya for sharing this today - a small oasis of information in a desert of SOPA spam.


HTTP has always just seemed... clunky to me. I think it is the statelessness.


The statelessness constraint is there to provide for cacheability, so that resource usage doesn't scale on a per-connection basis. HTTP's architectural constraints might not always be the best fit for your application's architecture, but they aren't arbitrary.


Are you too good for cookies?! Who doesn't like cookies.


ØMQ?

How does one pronounce that? Yur-m-k?

http://webcache.googleusercontent.com/search?q=cache:zsS5aXX...


How do you pronounce the letter "Ø"?

http://answers.yahoo.com/question/index?qid=20100101105722AA...

(but really, it's ZeroMQ)


ZeroMQ


That's no zero - I'd say it's a backwards null.


So, basically Mongrel2 with SPDY?



