Show HN: A Unixy approach to WebSockets (websocketd.com)
421 points by joewalnes on Feb 14, 2015 | 140 comments

> Write programs to handle text streams, because that is a universal interface.

I always hated that part of UNIX. It would be so much better if programs could handle data-structure streams instead. Having text streams causes every single program to implement ad-hoc parsers and serializers.

I now use the command-line only for trivial one-time actions. For everything else I'll use a scripting language and forget about reading and writing text streams altogether.

The answer is probably the same as why PowerShell isn't as usable as a Unix shell. Which in turn has a lot to do with why we're still programming in text files and not clicking fancy objects, despite that seemingly being a more powerful model, and despite the many projects that have tried to take advantage of it.

Text is a useful common denominator. Text is possible to version control, tie to bug trackers, and handle with configuration management systems.

The same is true for the command line. If you handle structured data, or objects, you communicate using APIs. While it's not theoretically impossible to still use version control and configuration management, it turns out that it's much more difficult in practice. Plain text is a useful lowest common denominator.

We're already creating ad-hoc APIs using cut, sed, awk and grep (to name but a few) all the time in order to massage the data into a format the next program in the chain will understand. This sometimes involves non-trivial invocation chains, and I always feel like I'm working on a representation of the data rather than the data itself.

I would much rather have functional primitives (map, filter, reduce, zip, take, drop, etc) doing this work.

It would seem to be better in theory, but I don't think it's much better in practice. I could never get on with PowerShell, though that's a further step beyond what you suggest - not just structured streams, but object streams.

It's like the difference between static and dynamic typing. Solving the type system's constraints adds complexity over and above the irreducible complexity of the problem. Static typing pays for its added complexity by proving certain things about the code, but for ad-hoc, short-lived code it usually isn't worth it. And most code (by frequency, if not by importance) using streams is ad-hoc, on the command line.

With a structured stream, there are only a handful of generic utilities that make sense: map, filter, reduce, etc. (and they better have a good lambda syntax). Whereas the advantage of unstructured streams is that utilities that were never designed to work together can be made to do so, usually with relatively little effort.

For example, suppose you have a bunch of pid files in a directory, and you want to send all the associated processes a signal. What kind of data structure stream does your signal sending program accept? What needs to be done to a bare number to convert it into the correct format? How do you re-structure plain text from individual files? Structure in streams seems to have suddenly added a whole lot of complexity and work, and for what?


    cat "$pid_directory"/*.pid | xargs kill -USR1
(I don't really see how a scripting language solves your issue. You still need to parse the output and format the input of all the tools you exec from your scripting language. Or maybe you're not actually using tools written in lots of different languages? Because this is one of my main use cases for the shell using streams: gluing focused programs together without constraint on implementation language.)

>For example, suppose you have a bunch of pid files in a directory, and you want to send all the associated processes a signal. What kind of data structure stream does your signal sending program accept?

What program? A single line of shell code would work fine. Kill itself need only take a pid, or an actual handle if Unix had such a thing.

>What needs to be done to a bare number to convert it into the correct format?

If a "bare number" isn't the correct format, why would you have them at all?

>How do you re-structure plain text from individual files?

The whole idea is not to use plain text at all.

>Structure in streams seems to have suddenly added a whole lot of complexity and work, and for what?

Structuring your data doesn't add complexity; when you consider the hoops one jumps through to strip data of its structure at one end of a stream and reconstitute it at the other, it's really reducing it. It's only if you insist on also using unstructured representations that complexity is increased.

Of course, as long as Unixes and their shells only speak bytestreams and leave all structuring, even of program arguments, to individual programs, it's a moot point. He's still right about it being a shitty design, though.

> When you consider the hoops one jumps through to strip data of its structure at one end of a stream and reconstitute it at the other, structured data is really reducing complexity.

Exactly this. I think HN doesn't have much experience with PowerShell, which is why you're currently being downvoted. So let's have a practical example: consider killing processes started in the last 5 minutes using posh:

    ps someprocess | where { $_.StartTime -and (Get-Date).AddMinutes(-5) -gt $_.StartTime } | kill

Now try the same on bash, and spend time creating a regex to extract time from the unstructured output of Unix ps.

    kill `ps -eo pid,etime | grep -P ' (00|01|02|03|04):..$' | cut -d " " -f 1`

Not really a complex regexp though. I almost exclusively use Linux and thus bash/zsh etc. And yes, my piece above looks uglier and like a bit of a hack, but that's not the point. It's easy because it's discoverable. These are one-liners you write ad hoc and use once. But PowerShell in my experience lacks the discoverability that bash has: you can't look at what some tool outputs and then know how to parse it; you need to look at the underlying object first. Granted, I have maybe one day of experience with PowerShell, but I don't know anyone who uses it for their primary interaction with the computer. For bash, though...

(And yes I'm aware that you can also create huge complicated bash scripts, but you could also just use python)

Find the name of the CPU using powershell and have fun looking up the correct WMI class and what property you need.

Here's bash: grep name /proc/cpuinfo

> my piece above looks uglier and like a bit of a hack, but that's not the point.

Well it was precisely my point.

Get-WmiObject seems pretty discoverable to me. You can browse the output and pick what you want.

For the sake of completeness: your regex doesn't perform the task either.
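For what it's worth, newer procps `ps` can print elapsed time in seconds (`etimes`), which sidesteps the etime regex entirely. A sketch assuming procps-ng; the `echo` makes it a dry run, so nothing is actually killed:

```shell
# List PIDs of processes younger than 5 minutes (300s), then show the
# kill command that would run; drop `echo` to actually send the signal.
ps -eo pid=,etimes= | awk '$2 < 300 {print $1}' | xargs -r echo kill
```

`xargs -r` skips running the command when no PIDs match.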

> Structure in streams seems to have suddenly added a whole lot of complexity and work, and for what?

Being able to stream collections of bytes (and collections of collections of bytes, recursively) is one capability that I find myself wanting when sending data between programs at the command line.


  ls "$pid_directory" | xargs rm
This, of course, has problems for some inputs because ls is sending a single stream and xargs is trying to re-parse it into a collection on which to use rm.

If there were some way to encode items in a collection OOB, you could pipe data through programs while getting some guarantees about it being represented correctly in the recipient program. (Sometimes you see scripts that do this by separating data in a stream with NUL delimiters, but this doesn't work recursively or if your main data stream might have NUL in it.)
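For the single-level case, the standard idiom is NUL framing with `find -print0` / `xargs -0`. A quick sketch showing it surviving a filename with a space (which would break a naive `ls | xargs`):

```shell
# NUL-terminated records survive spaces (and even newlines) in names.
dir=$(mktemp -d)
touch "$dir/a b.pid" "$dir/c.pid"
find "$dir" -name '*.pid' -print0 | xargs -0 rm --
ls "$dir"   # directory is now empty
```

As noted above, though, this buys exactly one level of structure and fails if the payload itself may contain NUL.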

>Being able to stream collections of bytes (and collections of collections of bytes, recursively) is one capability that I find myself wanting when sending data between programs at the command line.

If you don't mind using JSON as intermediary format, you might like to have a look at jq: http://stedolan.github.io/jq/

jq can also convert raw (binary) data to JSON objects containing strings (and vice versa) for further processing. Naturally, jq filters can be pipelined in many useful ways.
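For instance (assuming jq is installed; the record fields here are made up), one JSON object per line can be filtered structurally rather than with a positional regex:

```shell
# Select records by field name instead of by column position.
printf '{"pid":101,"comm":"bash"}\n{"pid":202,"comm":"vim"}\n' \
  | jq -r 'select(.comm == "bash") | .pid'
# prints: 101
```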

There's work ongoing on FreeBSD to add libxo support to tools in the base system, which will allow you to get (amongst others) JSON out of various commands: https://github.com/Juniper/libxo

The .NET shell does this, but that's why it's not Unix, and it's not universal.

If programs passed data structures then either you're forcing a certain data structure model (i.e. it's not universal, because it's not compatible with anything else), or your data structures are so general (i.e. a block stream) that your applications are going to be parsing anyway... and that's going to be even nastier than if everything was stupid text streams in the first place.

Programs already need to have their input in the proper format in order to parse it. I'd rather the mismatch error come from the invocation of the program than halfway down the parsing.

For this to work data structures would have to be nothing more than scalars, sequences and mappings without specific concrete types. Just like JSON, YAML, and the rest do it now.

The main problem is that there are many kinds of incompatible "data-structure streams". Using a scripting language doesn't solve the problem, it just standardises on one particular ad-hoc parser/serialiser combination (say, Python's pickle, for example).

That's fine for personal use, or in a single project, but doesn't scale like "dumb" text does.

If there was a non-ideal but reasonable standard serialization format, like maybe IFF [1] or BSON [2], it would still simplify things, and common tools like `cat` might support decoding them.

It would be easy to wrap an arbitrary other packed format inside a binary string, or an arbitrary text format inside a text string.

IFF is quite successful in certain areas, BTW.

[1]: http://en.wikipedia.org/wiki/Interchange_File_Format [2]: http://en.wikipedia.org/wiki/BSON

You could always pass a message per line (so to speak) and use msgpack or json... if you want compression you could use json+gz

Of course, you could also do something similar with a 0mq adapter for a request/response server, which is available pretty much everywhere you might need it (language-wise)... It's an interesting generic interface... however, as I mostly use node for newer development, it's so trivial to use sockets or websockets directly for this that I'd just as soon go that route... or for that matter, something like sockjs/shoe.

There's a very good reason that UNIX is data-structure agnostic: supporting the lower level (text streams) offers far more flexibility. If you want to use a specific data structure type among your own suite of scripts, you are free to do so; UNIX does not get in your way.

At the same time, if you want to write a utility like grep that is agnostic to the structure of your text, it can exist and work. If UNIX cared about data structures, this wouldn't work.

Here's a good rule for you: if you find yourself thinking that the way UNIX (or any other long-existing and widely supported system) does something is dumb and you know better, assume you are wrong and look for the reasons that the people who are smarter than you chose to do it the way it was done.

This is a tough lesson, and for many the only teacher is experience. For the longest time I avoided what I considered abstract math, despite the general wisdom of its importance to computer science. I spent weeks writing software that could predict customer complaints based on supplier action. Given the first three days of complaints following an action, I could reliably predict the total number of complaints a year into the future. When I proudly showed one of my coworkers, who didn't have the aversion to math that I did, he informed me that I had just reinvented Poisson regression. I learned my lesson that day. This guy will likely figure it out after a couple of tries as well :)

Have you read the Unix Haters Handbook?


I hadn't; looks like an hour or two of amusement.

While there are some occasional good points that could be converted into solutions, I can't help but think that this book is more amusement than well-packaged constructive criticism.

It's a collection of some of the best rants from the Unix Haters mailing list that existed in the late eighties/early nineties. The people that were on the list usually had experience with operating systems that were more advanced than Unix. Constructive criticism was never really the point of the mailing list.

My point was that the parent's suggestion that anyone who criticizes Unix obviously isn't smart enough to understand it is just flat out wrong and offensive.

> the parent's suggestion that anyone who criticizes Unix obviously isn't smart enough to understand it

That wasn't what I was saying at all. But thanks for playing.

How exactly would that work? How would you pass one kind of data structure from one program to another so that they both could understand it without involving parsers or deserializers? To be concrete, let's start with the simplest possible data structure: the linked list.

That's no different from text. You can't pass one kind of text from one program to another without both understanding it. That's why ls has a million different and confusing quoting options.

The advantage of using a proper data format is

a) You don't have to do in-band signalling, so it will be far more reliable (you still can't have spaces in filenames for a lot of unixy things).

b) The encoding is standard. Using text for pipes still requires some kind of encoding in general, but there are many different ways (is it one entry per line? space separated? are strings quoted? etc.)

A linked-list is already too specific, a sequence is all you need to express any linear arrangement of data, be it an array, linked list, vector, or any of the myriad of other concrete sequence types.


And how do you convert any data structure to an s-expression? You serialize the data. How do you get the s-expression into a form your program can understand? You parse it.

In other words you still haven't solved the fundamental problem of passing data back and forth between different programs. In fact if you are going to mandate a specific serialization/deserialization format then JSON, XML, or even ASN.1 are better options than s-expressions.

My point was more, let your language do the parsing and deserialization for you. S-expressions are merely a textual representation of linked lists. The parsing and evaluation of text is already written as part of the language.

The other point was that we're ultimately stuck with serial forms of communication, be it wires, pipes, sockets etc. If we want to easily transfer structured data through these serial channels, we should probably build up our structures from serial primitives, and S-expressions are much more handy than plain strings (which we may not even be able to parse without ambiguity), or XML, JSON or whatnot. First, because the parser is already implemented as part of the language, and second, because you can transfer code in addition to data and evaluate it on the remote end to bring into scope more "structured" data like records.

I did try to include a bit more in the previous post, but I'd accidentally hit save, and I was unable to edit the post afterwards.

Somehow it always felt natural in Lisp Machines.

How does PowerShell do it?

PowerShell cannot pass anything structured unless the other end of the pipe is a cmdlet, and even then there are times when the other end of the pipe is forced to interpret whatever is passed to it as a string instead of a more structured object like a dictionary.

So even within a controlled environment like PowerShell where everything is a cmdlet it is still impossible to pass only structured objects between commands.

Since there is so much pointless discussion about the use of text streams under this comment:

When someone has an example of a situation where binary, JSON or other communication between websocketd and the program is needed, please just file a ticket. It would be great to see a practical situation instead of just arguing with each other about text streams/unix principles/JSON and other stuff.

At the risk of perpetuating the meme that hacker news is overly enamored of Golang, I'd like to say that providing "data structure streams" is exactly Go's sweet spot, and it's about as awesome as you would hope it would be.

No risk of that here. websocketd is written in golang.

You're on the right track. This does however present us with a challenging problem: developing a single, consistent, efficient, cross-platform, language-independent form of storing and transmitting data between programs.

It's something to ruminate upon.

This is already a partly solved challenge. Text streams don't magically come into existence either; programs need STDIN, STDOUT and STDERR at their disposal to use them.

All you really need are scalars, sequences and mappings to fully express any data structure. Tagged types would be sugar on top but it could quickly add unwanted complexity.

All you need is a shell capable of marshalling data between the different programs piped together. Powershell is a nice idea but it only runs .NET code and that's a limitation I can't live with.

Most JSON stringifiers will create JSON with no line terminators that aren't inside escaped strings... so a JSON object terminated by a CR would seem to be a pretty logical data format for open text...

There's also msgpack, protocol buffers as well... I think the plain text of json that is readily a line per message is far simpler and easier to handle though.
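A minimal sketch of the line-per-message idea: because each message is one newline-terminated compact JSON object, ordinary line-oriented tools still work on the structured stream:

```shell
# Count messages in a newline-delimited JSON stream with plain grep;
# no JSON parser needed for line-level operations.
printf '{"user":"alice","msg":"hi"}\n{"user":"bob","msg":"yo"}\n' \
  | grep -c '"msg"'
# prints: 2
```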

you never got to learn and understand the unix philosophy. your loss.

The "unix philosophy" is out-dated and wrong in many cases. You're an idiot if you follow it blindly.

A philosophy is neither right nor wrong. It's a philosophy, one that you choose to subscribe to, or not.

OP here. Something I forgot to mention on the page is how much this simplified life for a server admin. Using "ps" you can see each connection, the memory overhead, CPU usage. It's easy for a sysadmin to kill bad connections without taking down the whole server.

I like the project. But it says:

> Full duplex messaging

And the examples only show single direction. Is my understanding correct that everything received goes as STDIN? And is it also possible to run websocketd as a client?

In the source code /examples folder most of the languages have a bi-directional example called "greeter"
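For reference, a bidirectional handler along the lines of those greeter examples can be a few lines of shell: websocketd delivers each message from the browser as a line on the process's stdin, and each line written to stdout is sent back over the socket. (The `printf` pipe at the end just simulates one inbound message.)

```shell
# greeter: reply to every inbound line on the WebSocket.
greeter() {
  while read name; do
    echo "Hello $name!"
  done
}
# Simulate one message arriving from the browser:
printf 'world\n' | greeter
# prints: Hello world!
```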

Can I use other WebSocket clients like socket.io?

WebSockets after the upgrade are not message based in the same way as socket.io (which adds fallback/fall-forward); sockjs does this too, providing a stream interface...

If you don't need to support a web browser client, you don't need the extra overhead... and if you do, you're better off just using socket.io or shoe and having your server in node.js. As an aside, I'd probably just use raw sockets in node if I didn't need browser support.

WebSockets are message based.

I realize this... I meant to say that socket.io adds a lot more than just websockets.

Unless you need to handle a very small number of concurrent connections, using 1 process per connection seems to be a huge overhead, although I can think of some use cases.

However, I can imagine a similar tool doing multi/demultiplexing, eg the handler process would take input text lines prepended with a connection identifier (eg. "$socketID $message") and output using a similar formatting. Pretty much like websocketd but with multiplexing and unixy/pipe-friendly (eg. you can apply grep/awk/etc before and after).

How would this fit compared to websocketd?
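That framing could be demultiplexed with ordinary tools, which is the appeal. A sketch with made-up connection IDs, selecting one connection's messages and stripping the prefix:

```shell
# Lines are "<socketID> <message>"; pick out connection 7's traffic
# and drop the ID prefix with awk.
printf '7 hello\n9 ping\n7 bye\n' \
  | awk '$1 == "7" { sub(/^7 /, ""); print }'
# prints: hello, then bye
```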

Indeed, there is no free lunch.

As it stands, this is only really workable for low traffic (so it doesn't eat memory) where connections do not come and go frequently (so it doesn't eat process-management CPU).

Once you start doing multiplexing for the sake of making this more reasonable in terms of resource usage, the simplicity benefits kind of fall away as you move closer to a full concurrent web framework.

I guess it really depends what you're tuning for, what your use case is, and how much hardware budget you have to throw at the problem.

CGI fell out of favor for this reason, but WebSockets have a different runtime profile: instead of having to deal with 10K shortlived requests per second, WebSocket endpoints have much fewer but longer lived connections. This is why the CGI model actually works well on WebSockets.

BTW, there is a VM for Dart that is experimenting with different concurrent modes to provide an alternative to async programming: https://github.com/dart-lang/fletch

You can read its short wiki for some clues: https://github.com/dart-lang/fletch/wiki/Processes-and-Isola...

I like Fletch's idea very much. Imagine not having to worry about Async all the time.

Not sure how everything is implemented in Fletch, but I think I heard that Fletch can share one thread among many processes if need be. And they have been trying hard to save memory while implementing those features.

If you want to run some tests to compare with, I created some small samples using different Dart implementations and NodeJS here: https://github.com/jpedrosa/arpoador/tree/master/direct_test...

Fletch also supports a kind of Coroutine: https://github.com/dart-lang/fletch/wiki/Coroutines-and-Thre...

> Imagine not having to worry about Async all the time.

I'm nitpicking, because Fletch truly sounds very cool indeed, but when I use Elixir, Erlang, or Go, I never worry about async either. From that wiki page, I can't really see what the difference with the Erlang VM is.

(that's a good thing, the Erlang VM is awesome, and being able to write server code on an Erlang-like VM and still share code with the browser sounds like the thing that could make me adopt Dart)

About how many connections on an average machine with 8GB RAM would be deemed 'OK' with this tool?

I was just playing around with this tonight using PHP, and each process was about 5MB of RAM. I'd imagine if you wrote your server code in, say, C instead, then the memory footprint would be much smaller.

There's also a limit to the number of processes allowed. On my OSX laptop with 16GB RAM, for example, the default limit is 709 (kinda strange number???). The command

    ulimit -a

will tell you the value of "max user processes" for your machine.

FYI I built simple count and greeter versions in Nim and they used around 350k each. Some napkin math theorizes that's over 10k concurrents on a 4GB VPS for, say, a simple chat service backed by redis. I'm not sure how well websocketd will hold up at that point though...

Per process could mean easier scaling too. Like round robin connections balanced between multiple app servers backed by a beefy shared redis for example. I've never really understood how best to scale websocket services, but this could make it easier.

Thanks for that experiment!

The problem here of course is that a CGI-like approach does a fork plus execve for each request, which does not give you much benefit from sharing.

If you have a simple forking socket-based server, on Linux (I assume OS X is no different) the amount of memory per process is much lower, because the kernel uses copy-on-write pages for forks and it's largely the same process.

That must be an OSX oddity. I just checked my Ubuntu laptop with 3GB RAM: 23967 processes, and a Debian server that I happened to be logged in to, with 0.5 GB: 127118 processes.

Of course, with 3GB you could only get 600 connections at 5MB a pop.

There is a closed ticket (github.com/joewalnes/websocketd) where we were battling this out; you're always welcome to join with good ideas, that would be appreciated.

It's just so easy to write a WebSocket server these days in Go and other languages that it might not be worth the trouble. I can only see something like this being useful in odd-ball use cases, none of which really come to mind right now, but I'm sure I'd recognize one when I was faced with it.

Sounds basically like the difference between CGI and FastCGI.

I used this a while ago on my own insane project. I had a Webdav server and websocketd providing a Web interface to a Linux box.

Even at its young age (bug reports #5 and #7 were mine), it allowed me to progress a lot further in my project before I needed to write a server designed specifically for the task at hand.

In the end I wrote userserv https://github.com/Lerc/userserv for the one must have feature I needed. I needed logins and responses delivered from a process with the UID of the CookieToken.

So, thanks Joe. Notanos got further because of websocketd, and while I'm not currently using it, there's a high chance of me doing so in future projects.

Great to hear! Thanks :)

I guess the next logical step is to build a virtual filesystem such that each connection is represented by a file. Then you can further decouple the connection handling from the actual application. Applications can start later to talk to a connection established before.

That would be less 'unixy' and more 'plan-niney' :)

I like the idea of having a FS interface, e.g. using named pipes made available by the daemon.
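A rough sketch of what that FS interface might look like with named pipes (the directory layout here is entirely hypothetical):

```shell
# One directory per connection; the daemon would write inbound messages
# to "in" and read application replies from "out".
root=$(mktemp -d)
mkdir "$root/conn-1"
mkfifo "$root/conn-1/in" "$root/conn-1/out"
ls "$root/conn-1"
# prints: in, then out
```

An application started later could then attach to an existing connection just by opening those files.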

That's the idea behind https://thefiletree.com/. The decoupling works well for applications tied to a textual format (see eg. https://thefiletree.com/heap/diagram.sequence). It means collaborative SVG editing, for instance, is not a challenge. On the other hand, binary data (bitmap images, sound, video) would need to be treated specially, I think, unless we had a common structure format and protocol that would map to each of those.

(The code is here: https://github.com/garden/)

There was SpockFS here on HN a while back: https://github.com/unbit/spockfs

I don't understand how I'd use this. WebSockets are generally for server-initiated events, but this doesn't make it very easy to initiate events on the server. Usually it will be some pubsub situation, so I'd want a server process to be able to emit a message to one or many connections - but instead the connection is tied to one process for its life. I'd like to see an example like a simple chat room.

Actually, I think you're looking for Server-Sent Events [1] for something of a pub/sub nature. WebSockets, on the other hand, are intended for realtime bidirectional communication over a single TCP connection.

Pub/Sub is much more one-directional. A single subscribe event initiates a stream of all published events. Server-Sent Events are more suited for this because the server will never have to bother checking the TCP connection for incoming data.

[1] http://www.w3.org/TR/2011/WD-eventsource-20110208/

I've built chatrooms with websocketd. To do this, you use a shared pub/sub store. Redis is great; so is Postgres (with its pub/sub add-ons). Storing chat data in a store is more durable than keeping it in memory on a single machine.

The solution in question could well be used for pub/sub servers or for request/response patterns. I'm needing to expose a very small utility that uses C# as a micro service, and I'm considering this for that reason. Aside from that, I think it's really pretty cool.

A project in a similar vein is websockify ( https://github.com/kanaka/websockify ) , which makes an existing app with a TCP endpoint available over websockets.

What are the differences? Do I miss anything?

1. websockify allows binary, while websocketd is text-only

2. websockify does I/O through a socket that the program uses (and that it sniffs through rebind.so), while websocketd relies on stdin/stdout

3. The arguments are different

    websocketd --port=8080 my-program

    websockify 8080 -D -- my-program

Also, websocketd is multiplatform and does not require Python. Neither good nor bad, just another difference :)

Also, not sure if websockify can, but websocketd also has a --dir argument and can supervise and route many sockets to many different programs.

websockify can be run on Node.js as well as python. Both versions seem to be highly reliable (I'm using them on multiple production servers with paying customers for web conferencing). The python version doesn't seem to work on Windows servers, but the Node version works fine.

I haven't used websockify. I'd be interested to have users try out both and write up their experiences.

Nice! I hadn't seen that. Yeah, that's pretty sweet.

> Each inbound WebSocket connection runs your program in a dedicated process. Connections are isolated by process.

That sounds bad; it is like “CGI, twenty years later”, as they say. In 2000 at KnowNow, we were able to support over ten thousand concurrent Comet connections using a hacked-up version of thttpd, on a 1GHz CPU with 1GiB of RAM. I’ll be surprised if you can support ten thousand Comet connections using WebSockets and websocketd even on a modern machine, say, with a quad-core 3GHz CPU and 32GiB of RAM.

Why would you want ten thousand concurrent connections? Well, normal non-Comet HTTP is pretty amazingly lightweight on the server side, due to REST. Taking an extreme example, this HN discussion page takes 5 requests to load, which takes about a second, but much of that is network latency — a total of maybe ½s of time on the server side. But it contains 7000 words to read, which takes about 2048 seconds. So a single process or thread on the server can handle about 4096 concurrent HN readers. So a relatively normal machine can handle hundreds of thousands of concurrent users without breaking a sweat.

On the other hand, Linux has gotten a lot better since 2000 at managing large numbers of runnable processes and doing things like fork and exit. httpdito (http://canonical.org/~kragen/sw/dev3/server.s) can handle tens of thousands of hits on a single machine nowadays, even though each hit forks a new child process (which then exits). http://canonical.org/~kragen/sw/dev3/httpdito-readme has more performance notes.

On the gripping hand, httpdito’s virtual memory size is up to 16kiB, so Linux may be able to handle httpdito processes better than regular processes.

Difference with CGI is that CGI programs live and die with each user request. websocketd programs are more like "one per user session"... Makes sense when your user sessions are lengthy.

Yes — that’s exactly the problem. Handling ten thousand concurrent users with CGI is easy, even on 2000-era hardware. Ten thousand concurrent users might be four requests per second. But ten thousand concurrent users using websocketd means you have ten thousand processes. And if you’re doing some kind of pub/sub thing, every once in a while, all of those processes will go into the run queue at once, because there’s a message for all of them. Have you ever administered a server with a load average over 1000?

Still, the O(1) scheduler work in current Linux might make that kind of thing survivable.

Yes, I often have 10K-plus processes running on a production server. It's caused trouble at times due to misbehaving processes, but mostly it's been OK. Linux is surprisingly good at this (it wasn't always the case).

For the times when some of my processes were misbehaving, it was easy to identify which processes were misbehaving with "ps", "top", etc and resolve with "nice", "kill". This killed the bad connections without bringing the rest of the app down. Sysadmins like me.

Have you been waking up all 10K of them at a time? Handling 10,000 sleeping processes is not so surprising.

rapind did a nice experiment in the comments above.

Looks very nice for quick prototype kind of things in any language or to make existing tools quickly available to a small circle of users!

For larger-scale usage the overhead is probably too big, though.

OP here. The overhead is really dependent on your process overhead. I typically use it for Python/Ruby/C apps which are relatively lightweight compared to something JVM based.

The runtime profile of WebSockets tends to be different to typical HTTP requests too. With typical HTTP you're often optimizing for 1000s of short requests per second, whereas with WebSockets the requests are much longer lived.

The JVM is optimized for long-running processes as it takes time for the JIT to kick in. Other than the memory footprint your code gets quite a performance boost; I'm always surprised how much of a difference it makes before and after.

You wouldn't want to boot a JVM process per connection but rather implement the listening socket yourself and dispatch connections within that same process with all of the required services already initialized. A WebSocket server is no different.

It depends on the job. If your job involves long-lasting connections, the overhead of setting up the websocket connection and spawning a new process wouldn't matter.

But this clearly won't cut it for typical static HTTP content.

So if your server side keeps connection for a long period of time (think IRC server, or netcat?) - this might be good.

The overhead isn't the setup. It's the memory footprint of a process. A minimal process (just sleeping) will use ~128KB. One process per connection means a server will need well over 1GB memory per 8000 connected users. A python process just sleeping will be almost 3MB, which means you get 333 connections per GB.

Compare this with an efficient green threads implementation. In Erlang you can have 100k sleeping "processes" in just 200MB of memory.
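The arithmetic above can be sketched as a quick back-of-envelope calculation (the per-connection footprints are the rough figures quoted in these comments, not measurements):

```python
# Connections per GiB for different per-connection memory footprints.
# These footprints are the rough numbers quoted above, not benchmarks.
footprints_kib = {
    "minimal C process": 128,    # ~128KB per sleeping process
    "Python process": 3 * 1024,  # ~3MB per sleeping interpreter
    "Erlang green process": 2,   # ~200MB / 100k processes
}

GIB_IN_KIB = 1024 * 1024  # KiB in one GiB

for name, kib in footprints_kib.items():
    print("%-22s ~%d connections per GiB" % (name, GIB_IN_KIB // kib))
```

With binary units the minimal process works out to ~8192 connections per GiB (hence "well over 1GB per 8000 users") and the Python process to ~341 (the ~333 above uses decimal GB).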

Yes, there is memory overhead, just like there is programmer overhead in creating a daemon that does what you need instead of just piping an existing tool's stdio into a websocket. I guess each tool should be used for the tasks it is good at.

I had come across websocketd a while ago, and to try it out, wrote a simple Python program and a web page. The two posts about it are here:

1. Use WebSockets and Python for web-based system monitoring:


2. websocketd and Python for system monitoring - the JavaScript WebSocket client:


Note: As it says in one of the posts, you have to:


for the Python program to work as a websocket server, though it works fine without that if only run directly at the command line (without websocketd). Thanks to Joe for pointing this out.

websocketd is a nice utility.

This is a good idea.

Myself, I noticed that almost all my websockets projects could easily share one single code base, and finally I just made a Websockets boilerplate repo that I can pull from for any given project. This is what node.js really excels at, and it's an execution model fundamentally different from websocketd: being a message broker between the client and the request-based web server.

Some people don't like the separation between web server and websockets server, but when you think about it they don't belong quite on the same level of abstraction. Plus, it's usually orders of magnitude easier to reason about single requests than to reason about a complex persistent application server's state.

For simple one-directional communication (Server to client(s)) as shown in the example, it may for many people be simpler to use EventSource [1]

I've used this for UI's where the server continuously sends/pushes updates to the clients. Really handy, and multiple implementations and libraries available in most languages.

Of interest is perhaps also the spec [2]

[1]: https://developer.mozilla.org/en-US/docs/Web/API/EventSource

[2]: https://html.spec.whatwg.org/multipage/comms.html#the-events...
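The wire format EventSource consumes is just plain text, which is part of why it proxies so easily. A tiny helper (hypothetical, not from any library) shows the shape of one event per the text/event-stream spec:

```python
def sse_event(data, event=None, event_id=None):
    """Format one server-sent event (text/event-stream).

    Multi-line data becomes multiple 'data:' lines;
    a blank line terminates the event.
    """
    lines = []
    if event is not None:
        lines.append("event: " + event)
    if event_id is not None:
        lines.append("id: " + event_id)
    lines.extend("data: " + chunk for chunk in data.split("\n"))
    return "\n".join(lines) + "\n\n"

print(sse_event("hello"))                # data: hello
print(sse_event("a\nb", event="tick"))   # event: tick / data: a / data: b
```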

Not supported in any version of IE... https://status.modern.ie/serversenteventseventsource?term=Ev...

And WebSockets has worked since IE10: https://status.modern.ie/websocket?term=WebSocket

Due to its nature, EventSource can quite easily be implemented as a polyfill for legacy browsers.

It's also MUCH easier to set up and run in a multi-layered stack (e.g. Pound/HAProxy > Varnish > App Server).

As per usual, WebSockets is what the "cool kids" use, even when it's often much less appropriate and much less flexible.

So what would you use to implement the polyfill for IE10? Is there something more appropriate than websockets? If not, are we not back to square one where you may as well use websockets from the start?

There are multiple polyfills for EventSource - mostly all it needs is a working XHR implementation.

The key thing is that a proxy/load balancer doesn't need special cases to support EventSource like it needs (and often doesn't have) with websockets.

And then you're polling instead of pushing.

In one browser.

If your server side stack (https terminator, load balancer, caching proxy, app/web server) can't support websockets, you'll be falling back to long polling in every browser.

No, you are polling for every IE user, which is likely a majority of your visitors. You can control your server-side stack; you can't control your visitors' browsers.

My argument is that since IE does not support EventSource a large portion of your userbase will have to use an alternative implementation. If this implementation polls then you get minimal gain since now the majority of your userbase is polling again.

It isn't 2003, IE is not likely to be a majority of visitors any more.

Caniuse.com reports 75% global relative support for EventSource and just 85% for websockets.

To claim "you can control your stack" as a solution for a protocol that has well-known issues with multiple components in the path between client and server is frankly somewhat naive.

As for the "pull vs push" debate: event source lets you open a channel to a resource on the server and wait for it to send you data.

XHR long polling works exactly the same way: you open an asynchronous connection and wait for data. The only difference is you need to handle reconnects, and the message parsing is done in userland code rather than in C (or whatever).

Long polling is definitely not about making a new xhr request every couple of seconds to check for more data.

There is nothing wrong with server side events. I even played with them already.

But for bi-directional communication we do have to use WebSockets, and it's the right tool for the job.

If you'd like to see EventSource added to websocketd as an additional protocol, start a conversation in the issues; nothing is wrong with pitching an idea. We already have CGI and static HTML support, because it's not always about websockets; sometimes you want other things handy.

The thing with WebSockets is that they are message oriented. WebSockets endpoints are presented with a series of distinct messages, whereas stdin/stdout are stream based and you have to build messaging on top, if that's what you want. I guess the idea here is that you are just using '\n' as the message delimiter?

Pretty much. Good example is Joe's vmstats https://github.com/joewalnes/web-vmstats
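A sketch of that convention, assuming '\n' framing as described: a handler is just a filter over stdin/stdout, one line per WebSocket message (the `echo:` behavior here is illustrative, not from websocketd's examples).

```python
#!/usr/bin/env python
import sys

def handle(line):
    # One incoming WebSocket message arrives as one stdin line;
    # the returned string is sent back as one outgoing message.
    return "echo: " + line.rstrip("\n")

if __name__ == "__main__":
    for line in sys.stdin:
        sys.stdout.write(handle(line) + "\n")
        sys.stdout.flush()  # flush per line, or messages sit in the stdio buffer
```

The per-line flush matters: block-buffered stdout would hold messages back until the buffer fills.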

Elegant approach. Reminds me of inetd.

I'm surprised this isn't just implemented as an inetd handler, given the "do one thing and do it well" mantra.

It's the 21st century. People don't look backwards any more. inetd is passé.

I'm assuming you're being sarcastic. This tool wants to "do things the Unix way"... which was invented in the 1970s.

Why is this better than netcat?


Because it handles the WebSocket protocol for you.

Oh my! I have been waiting for this, trying to write my own websocket server implementation in multiple esoteric languages! Fantastic!

This looks great, without knowing much I was able to get:

./websocketd --staticdir=. --port=8123 ps

to give me a simple output of processes. What I'd really LOVE is to get this to work:

./websocketd --staticdir=. --port=8123 htop

It'd be great if I could see the output of htop on the web...from anywhere. I guess htop is setting up a different video mode or something that isn't compatible?

I think you would need to feed what you are getting to a Javascript terminal emulator. (e.g. https://github.com/chjj/tty.js/)

This is perfect for admin tools. I'd rather parse stdout strings than code a custom API.

I'll try this. Looks good!

Oh, this is going to be so fun for doing terrible things.

The unwritten rule is not to run "sudo websocketd --devconsole bash" :)

The browser will not implement terminal controls. You could get by though. This sort of scenario is why ed is never obsolete.

ed is for the weak. Real developers edit bits using the magnetic forces of oceans.

Could anyone give an example of what this is used for?

It allows you to write a WebSocket end point, in any programming language, without having to deal with building socket servers, dealing with WebSocket protocol handshake/parsing/formatting/etc, worrying about threading etc. Just write a script that reads and writes to stdin/out.

It appears to me that it is useful for eliminating the "middle man" scripts/code often written for web application and server interaction. If I want to create a web application that displays the status of my server (CPU, memory, I/O, etc) I can use this software to call vmstat directly instead of writing an API (PHP, Node.js, RoR, etc) that executes the same command or uses a library that requires even more code to get the exact same data.

I'm sure this is probably because I don't have to do this for work, but from that description it's hard for me to tell what the advantages are. A script to feed vmstat (or other command data) over a Python socket would be ~50 lines or so.

This is cool. I can see this being used for prototyping and validating an idea before going full steam ahead with a more scalable framework.

It's a petty comment, I know, but:

    for count in range(0, 10):
      print count + 1
makes me feel sad.

Please ignore complexity of supplied examples... It's really not for that kind of use. I have lots of scripts that do something like this (and most of them are not in bash :)...):

    while read ARG; do
          if validate $ARG; then
                run_something $ARG
          fi
    done
Most of my "validate" pieces are bash functions and most "run_something" return lengthy CSV datablocks.

Then, there is the javascript that uses user mouse (and other signals) to generate arguments to send to websocketd and draw pretty visuals based on data that arrives back.

I assume you are treating `stdin` and `stdout` as just a stream of bytes rather than trying to do anything clever a la Python 3 ?

Yep. The only magic thing is a line break (e.g. \n) is used to terminate each message.

> "Write programs that do one thing and do it well."

So why does this WebSocket daemon also serve static files and CGI applications?

Because you need to implement a tiny HTTP server for WebSockets anyway (if you want them to work from browsers). So adding the ability to serve static files on top of that is trivial.

Not sure why they added CGI, though.
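That "tiny HTTP server" is mostly the RFC 6455 upgrade handshake: the server answers an ordinary HTTP GET by echoing back a Sec-WebSocket-Accept header derived from the client's key. A sketch of just that computation (not websocketd's actual code):

```python
import base64
import hashlib

# Fixed GUID from RFC 6455, appended to the client's Sec-WebSocket-Key.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def accept_key(client_key):
    """Compute the Sec-WebSocket-Accept value for a handshake response."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

# The worked example from RFC 6455 section 1.3:
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

After this exchange the connection stops being HTTP and becomes a framed message stream, which is exactly where websocketd bridges to stdin/stdout.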

That was indeed a dilemma. Although it focuses on doing one thing well, there were also a few core things that are useful to support a websocket-based app: namely a place to serve the actual web content from (static), and the ability to use websocketd via vanilla HTTP instead of websockets (cgi).

Beautiful website! May I ask which template/framework it was written with? (Is it on GitHub by any chance?)

Didn't use any templates or frameworks (apart from font-awesome for icons and prism for syntax highlighting the code examples).

Just some good old fashioned hand written HTML + CSS. https://github.com/joewalnes/websocketd/tree/gh-pages

Holy shit that's cool

How are message boundaries handled?

WebSockets are message-based. UNIX streams are not.

websocketd treats each line as a message. i.e. boundaries are marked with \n

Link is down =(

this is fucking cool. I got it instantly just glancing at the code example. I immediately understood how it worked. That is powerful stuff.

My only worry of course is, how would you scale this up? What's really going on beneath the hood?

I'm really excited and trying to think of something so I can use it as an excuse to use this.

The only other suggestion I would make is maybe change the name to something more catchy and brandable. Websocketd...okay like systemd...but I don't know, something as good as this deserves a brandable name like Jupiter, or some Greek goddess or clever hacky name.

Thanks. And I suck at naming stuff.
