Fun with Unix domain sockets (simonwillison.net)
212 points by edward 55 days ago | 66 comments

I wish browsers could talk to Unix domain sockets (and maybe named pipes on Windows). That would make it possible to run web servers locally and more securely: domain sockets live on the file system with the usual access controls available, while binding to a port on localhost offers no such easy and familiar control.

I use this all the time for mocking / testing REST-ish services, since otherwise the test has to consume a port. Ports are limited in number, contended (e.g. with real services or other tests), and may even be exposed to the public if your firewall is misconfigured.

On the client side of the test, curl makes this easy with CURLOPT_UNIX_SOCKET_PATH.
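For readers who want to try the same idea without curl, here is a minimal sketch using only the Python standard library. It serves a mock HTTP endpoint on a Unix domain socket and fetches it from a client, with no TCP port involved. The class names, the socket file name `mock.sock` and the response body are all made up for illustration, and this is not how the commenter's own tests are written.

```python
import http.client
import http.server
import os
import socket
import socketserver
import tempfile
import threading


class HelloHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical mock endpoint that answers every GET with a fixed body."""

    def do_GET(self):
        body = b"hello over a unix socket\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # The default logger expects an (ip, port) client address, which a
        # Unix socket peer does not have, so logging is silenced here.
        pass


class UnixHTTPServer(socketserver.UnixStreamServer):
    """Plain stream server bound to a filesystem path instead of a port."""


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP client that connects to a Unix socket path instead of host:port."""

    def __init__(self, path):
        super().__init__("localhost")  # only used for the Host header
        self.unix_path = path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.unix_path)


def fetch_over_unix_socket():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "mock.sock")
        with UnixHTTPServer(path, HelloHandler) as server:
            thread = threading.Thread(target=server.serve_forever, daemon=True)
            thread.start()
            conn = UnixHTTPConnection(path)
            try:
                conn.request("GET", "/")
                response = conn.getresponse()
                return response.status, response.read()
            finally:
                conn.close()
                server.shutdown()


if __name__ == "__main__":
    print(fetch_over_unix_socket())
```

Because the socket lives in a private temporary directory, parallel test runs do not contend with each other and nothing is reachable from the network.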

Agree with you about support in browsers.

Here's a fun and little-known fact about Unix domain sockets on Linux: They have two independent SELinux labels. One is the regular label stored in the filesystem, and the other has to be set by the process creating the socket (using setsockcreatecon). On a system with SELinux enforcing and a decent security policy you usually have to set both. I just posted a patch for qemu to allow the second one to be set: https://lists.nongnu.org/archive/html/qemu-block/2021-07/msg...

Check your local "listening" API. Unix permits you to just ask for "a port that is open", in a range defined by your kernel that is usually left untouched by everything else and will have no forwarding associated with it. Then you don't have to worry about the port selection process.

However, this is a prime example of a feature that many people writing abstraction layers around networking leave out of their layer, due to either not knowing it exists or not knowing the use case and figuring that nobody could ever have a use for it. So you may very well find it is not available to you (or not available conveniently) in your preferred environment. (See also "what are all these weird TCP flags? Eh, I'm sure nobody needs them. Abstracted!")

If you have Unix sockets working, by all means stick with them, but if you need TCP sockets this feature can help. You can even run multiple such tests simultaneously because each instance will open its own port safely. (I have test code in several projects that does this.)

If you call bind with port 0 on Linux, Windows, or OSX, the OS will bind to a random unused port within the unrestricted port range. Whether there is any forwarding or firewall associated with that port is outside the scope of the code that opens the port. If the application is assigned a random port such as 55555, there could be forwarding or a firewall on that port at the OS level that the application is unaware of. If an application is opening random listening ports, it just needs to record that information somewhere so clients can actually connect to it. In the case of using this for testing localhost servers, the application can just print out the port number or save it to a file so that you can look it up easily without going through netstat or something.
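A minimal sketch of the bind-to-port-0 approach in Python, assuming a test that only needs a loopback listener. The function name is made up for illustration. The port the kernel picked is read back with `getsockname()` so it can be printed or handed to the client.

```python
import socket


def listen_on_ephemeral_port():
    """Bind to port 0 so the OS picks an unused port, then report which one."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen()
    port = server.getsockname()[1]  # the port the kernel actually assigned
    return server, port


if __name__ == "__main__":
    server, port = listen_on_ephemeral_port()
    print("listening on port", port)
    server.close()
```

The caller still has to pass `port` to whatever needs to connect, which is the coordination step discussed in the replies.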

Opening a random listening port to serve content to clients which initiate communications to it is less common than the use case of initiating a communication to a listening port elsewhere and sending content through that socket, though both of these use cases open random local ports.

But that’s a worse solution unless you feel the need to test the kernel’s tcp stack.

You’re back to opening local ports that anyone on the box can access. You can work around that by isolating the tests into a container/namespace, but now you have more stuff to orchestrate.

Finally, the problem with binding to 0 and letting the kernel pick a port is now you have to wait for that bind event to happen to know which port to connect to from your test side. With domain sockets you can set that up in advance and know how to communicate with the process under test without needing a different API to get its bound port number.

It's definitely the last point which is the main problem. We can start the server side and tell it to pick a port, but then we have to somehow communicate that port to the test / client, and often that channel of communication doesn't really exist (or it's a hack like having the server print the port number on stdout - which is what qemu does). Unix domain sockets by contrast are an infinite private space that can be prepared in advance.

I don't have a problem running my test code on boxes with hostile people logged in to them, nor do my socket connections offer them anything they couldn't already do if they have that level of access. You sound like you may have a very particular problem, and if this is your situation I'm not convinced "Unix sockets" are the answer anyhow... you seem to have bigger problems with unauthorized access.

Isolation on a machine is a very basic security primitive. Binding to localhost circumvents all of that and you end up with vulnerabilities like this: http://benmmurphy.github.com/blog/2015/06/09/redis-hot-patch

Remember, anything you put on localhost can be reached by your browser (unless you use iptables with the pid owner check) and arbitrary webpages you are on can hit those endpoints in the background.

The solution you are offering is worse both from a security and a usability standpoint.

FYI- that link returns 404. "There isn't a GitHub Pages site here."

If you run tests through the loopback interface, you should be able to use any of the 127.0.0.0/8 addresses, instead of just 127.0.0.1 (a.k.a. "localhost"). Each address has its own independent port space, so you would avoid collisions/contention.

You can use the whole 127/8, but only if you've added the address to your loopback interface.

Unix sockets skip all of tcp though, so I'd recommend them vs loopback, if possible. I hear Linux short circuits a lot of tcp for loopback, but there's probably still a bunch of connection state that doesn't need to be there. FreeBSD runs normal tcp on loopback, and you can get packet loss if you overrun the buffer, and congestion collapse and the whole nine yards. Great for validating the tcp stack, not the best for performance; better to skip it.

Hmm. I've not added to lo interface but I can ping it. Start a listening server and connect to it. I didn't have to add anything in 127 - because it's a /8

I just tested this and it seems to work by default on Linux but not on FreeBSD. On both the address is configured as, i.e the whole block. Go figure.

This discussion [1] sheds some light on the matter, but does not fully explain the reasons for differing behaviour across OS's.

[1]: https://serverfault.com/questions/293874/why-cant-i-ping-an-...

Interesting. Linux also does that with any other address/subnet added to "lo":

  # ping -c 1
  1 packets transmitted, 0 received

  # ip addr add dev lo

  # ping -c 1
  64 bytes from time=0.023 ms
  1 packets transmitted, 1 received

The 127 I get. This one seems odd though.

It seems to me (based on reported behavior), that Linux is special casing the loopback interface: when you add an address with a netmask to the loopback interface, it considers all of those addresses as local addresses.

As opposed to a normal interface where only the specific address is a local address, but the rest of the network specified are accessible through that interface.

Maybe one /8 wasn't enough addresses for you, so you added more? Doesn't seem like an unreasonable way to behave, even if it's different than BSD behavior; it certainly makes it easier to use lots of loopback addresses.

The annoying thing with Unix sockets is making sure to delete them and recreate them. IIRC, there’s no file system equivalent of SO_REUSEPORT.
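The usual workaround is to remove a stale socket file before binding. Below is a minimal Python sketch of that pattern. The helper name is made up, and it assumes no other live process is still listening on the path, because unlinking would silently take the name away from it.

```python
import os
import socket
import stat


def bind_unix_listener(path):
    """Bind a listening Unix socket at path, removing a stale socket file first."""
    try:
        # Only remove the existing entry if it really is a socket, so a
        # mistyped path cannot delete a regular file.
        if stat.S_ISSOCK(os.stat(path).st_mode):
            os.unlink(path)
    except FileNotFoundError:
        pass
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)  # raises OSError if a non-socket file is in the way
    server.listen()
    return server
```

Closing the listener does not remove the file, which is why a second plain `bind()` on the same path would otherwise fail with "Address already in use".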

UNIX domain sockets also support an abstract namespace, not a part of the filesystem [1].

An excerpt of [2]:

> The abstract namespace socket allows the creation of a socket connection which does not require a path to be created. Abstract namespace sockets disappear as soon as all open instances of the socket are removed. This is in contrast to file-system paths, which need to have the remove API invoked in code so that previous instances of the socket connection are removed.

It worked quite well when I tried it (also in addition to using SO_PEERCRED for checking that the connecting user is the same as the user running the listener in question).

[1]: https://unix.stackexchange.com/a/206395/33652

[2]: https://www.hitchhikersguidetolearning.com/2020/04/25/abstra...
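A minimal Linux-only sketch of the combination described above, in Python: a listener in the abstract namespace (a name starting with a NUL byte) plus an `SO_PEERCRED` lookup on the accepted connection. The socket name and function names are made up for illustration, and the struct layout `iII` assumes Linux's `struct ucred` (pid, uid, gid).

```python
import os
import socket
import struct

UCRED_FORMAT = "iII"  # assumed layout of Linux struct ucred: pid, uid, gid


def peer_credentials(conn):
    """Return (pid, uid, gid) of the process on the other end of conn."""
    raw = conn.getsockopt(
        socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize(UCRED_FORMAT)
    )
    return struct.unpack(UCRED_FORMAT, raw)


def abstract_peer_credentials():
    # A leading NUL byte selects the abstract namespace: no file is created
    # and the name disappears once the last socket using it is closed.
    name = "\0example-abstract-%d" % os.getpid()  # hypothetical name
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(name)
        server.listen()
        client.connect(name)
        conn, _ = server.accept()
        try:
            return peer_credentials(conn)
        finally:
            conn.close()
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    pid, uid, gid = abstract_peer_credentials()
    print("peer pid=%d uid=%d gid=%d" % (pid, uid, gid))
```

A server could compare the returned uid with `os.geteuid()` and drop the connection when they differ, which is the same-user check mentioned above.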

That’s fantastic. Too bad it’s Linux only though. Is there an alternative for other systems outside of adding a delete step first?

I second this, I have been using this in production for years.

It's reliable, and very convenient.

Firefox can talk to a unix socket. In your proxy settings specify the socket.

E.g., it is how you can more safely proxy through Tor (which can also listen on a socket), with Firefox running inside a network namespace without access to a network interface. Things like WebRTC cannot leak your real IP.

Op means typing a socket into the address bar. Using a socket for proxying is not the same thing.

It's still strictly better than no way at all!

Using something like OmegaSwitcher, it's easy to configure some DNS suffix like `.local.test` to be mapped to local Unix sockets. Not as easy as built-in support, but at least doable.

I don't see a field for this in the Connection Settings dialog. Where do you type the socket name/path?

We recently used them at work to easily implement role based access control to an internal REST service since unix sockets have the useful ability of being able to tell which user is reading/writing to them. So you don't need to provide passwords or certificates because if the user is logged in on the system, they are already authorized.

su -u $otherUser

Wouldn't that mess it up?

You need the other user's password to do that, or be root, so it is not a problem. Also, in our system, non-admin users don't have Linux shell access anyway; they get a custom shell that uses the REST server as the backend. The admin user can get a Linux shell and run commands as other users or do whatever they want, but that is expected.

You kind of can already, I know this isn't what you mean, but you can connect to unix sockets via a native application.

"Native messaging enables an extension to exchange messages with a native application, installed on the user's computer. The native messaging serves the extensions without additional accesses over the web."


This a million times. Imagine how much less annoying e.g. syncthing UI would be!

(Granted programs can sort of do this with TCP sockets too, at least on Linux: read from /proc/net/tcp/somepathicantremember and use that to find out the U credentials of the client currently connected to your server...)

>Imagine how much less annoying e.g. syncthing UI would be!

Assuming you mean the local web app needing a self-signed certificate...


>Note: Firefox 84 and later support http://localhost and http://*.localhost URLs as trustworthy origins (earlier versions did not, because localhost was not guaranteed to map to a local/loopback address).

So if syncthing is reachable on a port on localhost you can just switch https off.

There was a blog post about this recently:


Yes me too. What would it take to get there? Does the URL format allow for a socket to be specified, how would such a URL look? http:///path/to/socket:/uri-path ? Would that be distinct enough?

Rather than Unix sockets, I wonder if you could do something with TLS client certs. Use https to localhost, then require the user to generate a TLS client cert and somehow register that with the app.

Personally, I think I would add a browser extension forwarding fooapp:// to a native messaging process gatewaying to the socket.

OpenBSD's netcat can bind to AF_UNIX endpoints; perhaps that can be brought to bear to solve this?

That still requires listening on tcp.

“It turns out both nginx and Apache have the ability to proxy traffic to a Unix domain socket rather than to an HTTP port, which makes this a useful mechanism for running backend servers without attaching them to TCP ports.”

That’s one useful nugget of information! I had no idea.

On webide.se (free shell and web IDE) I use Unix sockets a lot, for example an nginx proxy from https://foo.user.webide.se to /home/user/socket/foo so that users can test their apps using HTTPS/SSL, as many browser features need an HTTPS URL to work. Unix sockets are also used for accessing a shared MySQL server, and an X11 server so you can test out "native" apps and run an Android emulator in the browser.


Now Docker should provide some way to bridge container ports to UDS. That would be perfect.

You can bind mount the socket in a volume. You can also do this with named volumes shared between containers. You’ll need to make sure the GID/UIDs match, but it works great. It’s often faster for inter service communication than the Docker proxy.

Quick hack for you:

socat TCP-LISTEN:4000,fork UNIX-CONNECT:/run/yoursock

HAProxy can as well (as well as listen on unix sockets)

And Envoy Proxy (with an example [0])

0. https://gist.github.com/moderation/5d9c5352842ca068781f367a6...

Unfortunately and annoyingly not all applications support listening on Unix domain sockets.

socat is the tool for this. No need for a full blown webserver.

socat is a "dumb" tool. With a proxy that understands HTTP you have a lot more influence over what goes where and can, for example, avoid directing requests to a backend instance that is down.

Are you challenging me to build a socat loadbalancer in bash? In all seriousness, I concur. But if you just need to redirect some data streams socat is a rock solid and simple solution.

> Are you challenging me to build a socat loadbalancer in bash?

It’s a Saturday and all, so yeah, I’m challenging you (for fun, not because I don’t think it’s possible).

From a quick read of the manpage, I don't think it's possible either, unless you mess with EXEC or create multiple listeners to a single address.

Anyway, the Linux kernel does have a builtin loadbalancer in the form of shared sockets that you can socat outside of the machine. But I wouldn't call that a socat loadbalancer.

One lesser-known facility I have used with UDS is SCM_RIGHTS. It is used to send open file descriptors for actual files, shm regions, or socket connections. The receiving process gets a handle to the same open file, as if the kernel had dup'ed it into the receiver's process space. This is very useful for implementing things like a graphics compositor, where specialized memory regions need to be passed around to various processing stages.
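A minimal sketch of SCM_RIGHTS in Python, assuming Python 3.9 or later, where `socket.send_fds` and `socket.recv_fds` wrap the underlying `sendmsg`/`recvmsg` ancillary data. To stay self-contained it passes the write end of a pipe between the two ends of a socketpair inside one process. A real use would have the two ends in different processes. The function name and payloads are made up for illustration.

```python
import os
import socket


def pass_descriptor():
    """Send a pipe's write end over a Unix socket and write through the copy."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        # The descriptor travels as SCM_RIGHTS ancillary data next to b"pipe".
        socket.send_fds(sender, [b"pipe"], [write_end])
        message, fds, _flags, _addr = socket.recv_fds(receiver, 1024, 1)
        duplicate = fds[0]  # a new descriptor number for the same open pipe
        os.write(duplicate, b"hello through a passed descriptor")
        os.close(duplicate)
        return message, duplicate != write_end, os.read(read_end, 1024)
    finally:
        os.close(read_end)
        os.close(write_end)
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(pass_descriptor())
```

The received descriptor has a different number from the original but refers to the same open file description, which is the dup-like behaviour described above.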

For people curious: https://man7.org/linux/man-pages/man3/cmsg.3.html

This is used by a lot of classic Unix tools (dbus). The "modern" API for Linux is https://man7.org/linux/man-pages/man2/pidfd_getfd.2.html .

It's good to have that API, but a significant difference is that it requires clients (and even servers) to know the PID of the process they are sending this fd to. This is unnecessary when you can make do with a socket connection (inherited or otherwise).

The UDS API has a particular property that allows the receiver to know the PID of the caller. The SPIFFE/SPIRE project uses this to perform introspection of a process and distribute certificates based on the results of that inspection. It is a really great tool to use in systems engineering.

I think you can do even more and pass other file handles (FDs) through the socket, in order to provide the peer access to those resources.

Yeah, and when you add kernel TLS to the mix, you can handle the certificate part of TLS termination in a guarded user/process, which never needs to touch HTTP headers.

You can do this for TCP too by pulling out the info about the process that has connected to your socket from /proc. Not as nice an API but at least it's there.

I read this from Wikipedia: "In addition to sending data, processes may send file descriptors across a Unix domain socket".

Does this mean you can send a Unix domain socket via a Unix domain socket?


Yes. You can send any file descriptor. You might also be interested in socketpair[1], which creates a pair of FDs connected via an anonymous Unix domain socket. Sending a socketpair over a Unix domain socket is a nice way to establish a new channel of communication.

[1] https://man7.org/linux/man-pages/man2/socketpair.2.html

You can even send a Unix socket over itself. The kernel has to be careful to handle that correctly. :)

That is pretty far-out, conceptually.

But in practice I assume there would be little need for that right? If you have the socket already and you get the same socket again from somebody else, would that give you any new capabilities?

If a socket is shared among multiple actors, can the same message be read by all of them? Or does the first reader "remove" the message from it?

Yes although I inherited something that did this once for no good reason and it made me scratch my head for several hours before I understood it properly. YMMV on that.

With Unix sockets you can even have in-order datagram semantics, which I think is just neat!
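A small Python sketch of that property, using a `SOCK_DGRAM` socketpair. Each `send` arrives as one whole message, and on Linux (per unix(7)) Unix domain datagrams are reliable and not reordered. The function name is made up for illustration.

```python
import socket


def datagram_roundtrip(messages):
    """Send each message as one datagram and read them back one at a time."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        for message in messages:
            left.send(message)
        # Each recv returns exactly one datagram, so message boundaries survive.
        return [right.recv(4096) for _ in messages]
    finally:
        left.close()
        right.close()


if __name__ == "__main__":
    print(datagram_roundtrip([b"one", b"two", b"three"]))
```

With a stream socket the three payloads could instead be coalesced into a single read, so the receiver would need its own framing.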

I recall using this on a SunOS 3.x system (BSD 4.2-based) to build a message passing system. Over the years I expected it to be used more often for machine-internal communication. I think it was overshadowed by its sibling the network socket because that could be used for local as well as network communication.

We used LDAP over domain sockets with SO_PEERCRED many years ago, to tightly couple other services that needed different authorization to remote users.

Unix domain sockets are cool and awesome.

It's all fun until some local application decides that it should bridge a third party server and some service on your system. Do you want your browser to facilitate Google talking to your docker daemon? What about your dbus?
