I’ve been lurking on the 9fans mailing list for years and kept a Pi running Plan 9 as a sort of 24/7 “home console” (to log in to other machines) for a long time, but I recently replaced it with a 3B+ running Raspbian due to the lack of a good enough web browser. Having a cluster of my own (now running k3s), I would have liked to see a reference configuration of some sort.
The Plan9/Inferno ecosystem has always been fascinating to me, but it’s so closed in on itself that it’s hard to, say, go out onto GitHub (or equivalent) and find actively maintained tools (or even blog posts with updated info), and the source archives seem to be maintained by fewer and fewer people...
Untrue, the community is quite active and alive. I've lurked for years, but this past spring I stumbled on 9gridchan. I was hooked, drawn in by its simple interfaces, clean APIs (just look at the man page for dial(2)), a fresh take on a C library, and great documentation.
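To give a flavor of that API: dial(2) takes a single address string of the form net!host!service and hands back a connected file descriptor, with none of the usual sockaddr boilerplate. Here's a rough Python re-creation of the idea (the `dial` helper and its TCP-only scope are my own sketch; the real call is C and also handles other networks and symbolic service names):

```python
import socket

def dial(addr: str) -> socket.socket:
    """Connect using a Plan 9-style dial string: net!host!service.

    Illustrative sketch only -- the real dial(2) lives in the Plan 9
    C library and returns a plain file descriptor.
    """
    net, host, service = addr.split("!")
    if net != "tcp":
        raise ValueError("only tcp is sketched here")
    return socket.create_connection((host, int(service)))
```

Compare with the one-liner in Plan 9 C: `fd = dial("tcp!example.com!80", 0, 0, 0);`.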
Second, pure Bell Labs Plan 9 is pretty much unmaintained. 9front is the most up-to-date distribution, complete with a very active community and a small yet extremely talented group of core developers. You also have 9gridchan, which is maintained by another talented individual. 9gridchan is two things: a 9front fork called ANTS, and a publicly accessible grid system where users can connect to share files, chat in an IRC-like program (which is just a shared text file), and plumb messages, code, and images to each other. In addition, there's spawngrid, which provides on-demand CPU environments similar to Linux containers.
Plus the number of e-mails I get has been steadily decreasing over time...
The 9front camp might be off-putting to some, but I don't mind. It's good to see young people be passionate about something and actively participate.
There's plenty of ongoing work. The ANTS stuff in particular, coming from mycroftiv, is pretty interesting, though as usual somewhat whacky.
Just like you outline, I would expect some mention of which machines in the cluster run cpu, auth, venti, etc., since splitting those was actually one of the primary design criteria for real clusters (there used to be a nice doc on the Plan9 wiki for that).
Also, Richard’s image (and 9front’s) runs all services on the same node. To break those apart and run a cluster, you really need to reconfigure it, and there is no mention of that either.
EDIT: There is an open issue about updating that page.
Most of that stems from the use of the Plan 9 C library, which is not ANSI but has many of the same functions. The portability gap to ANSI C/Unix/POSIX is bridged via the APE (ANSI/POSIX Environment) library and the cpp preprocessor.
Maybe Slack could release a client that does that, running each connection on a separate node.
Why, so it can consume ram on even more machines at once? :P
I think the word he's looking for is "retrofuturistic" ;)
We invented something better than Unix. We just didn't want to use it that badly.
I'm not sure how a Lisp machine of Symbolics or LMI heritage, or a Smalltalk system, would deal with distributed functionality across untrusted networks. From the language's point of view, it seems natural for Smalltalk.
The only thing is that your OS will have to do some cluster management to make sure data will be close to the programs using it.
Case in point: the other commenter mentioning Jupyter is an excellent example. Jupyter is a classic case where something like QNX would shine: it's a multi-process system that we expose over HTTP to get remote transports to other clients. In QNX, IPC is both the remote transport and the local transport, so the distinction between running the Jupyter notebook/kernel locally, split across machines, or entirely on another machine is relatively transparent. This goes all the way up and down the stack -- from the core process layer to the GUI itself (so even GUI programs could be remote, with the desktop protocol proxying the command buffers to you to render locally). Jupyter, as a system, always has an underlying transport layer for talking between processes, computing and transferring results. So your "GPGPU" being at the other end of a network socket is already a very common case -- in fact, one that it is designed for explicitly (for basically anyone who does DL, for instance).
In something like QNX I'd be able to simply type the command `jupyter notebook` and the kernel would start on the machine in my other room (Threadripper with a nice GPU) and the notebook UX itself would start locally, they would talk immediately (due to policy/authorization being baked into the IPC/user/process mechanisms -- no HTTP Auth, etc) and there would be no need at the API layer to distinguish between local shared memory or remote network transports. It would always just work. I could boot up a GCloud machine with 8x TPUs and a $10,000 CPU and just "add" it to my network, run Jupyter again, and it would all be the same (except some latency, of course). I could just use a Raspberry Pi as my thin client for most purposes, honestly. Compute resources would be completely disaggregated, more or less.
Jupyter already does things like "compute the ggplot2 of some data on a remote machine, convert to png, tunnel it over HTTP into browser for display" -- what's the difference between using a socket and HTTP? Not much. You could even use HTTP as the layer-7 protocol over QNX IPC, if you wanted...
In retrospect, it's probably not a coincidence that HTTP has exploded in popularity as an L7 application-layer protocol. Remote compute is a vital component of many systems today, and HTTP is one of the easiest ways to accomplish it thanks to the ubiquity of browser protocols (think of how much stuff tunnels over HTTPS now!). All mainstream operating systems make very hard distinctions between remote and local IPC mechanisms -- so you might as well use HTTP, bind to /run/app/local.sock or 0.0.0.0:443, and just issue GET requests. Boom, you have a local and remote system. It's the easiest way to get "local" and "remote" application transport in one package, even if it's error-prone and crappy as hell.
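A minimal sketch of that "same code, different bind address" pattern in Python (the paths, ports, and `serve_once` name are placeholders of mine, not from any of the systems discussed): the application loop is identical whether the transport is a local Unix socket or a remote-capable TCP socket.

```python
import os
import socket

def serve_once(addr) -> None:
    """Echo one message back to one client.

    The transport is the only thing that varies: a string address
    means a local Unix-domain socket, a (host, port) tuple means TCP.
    The application logic below the bind is shared.
    """
    if isinstance(addr, str):      # filesystem path -> local-only
        fam = socket.AF_UNIX
        try:
            os.unlink(addr)        # clear a stale socket file
        except FileNotFoundError:
            pass
    else:                          # (host, port) -> reachable remotely
        fam = socket.AF_INET
    with socket.socket(fam, socket.SOCK_STREAM) as srv:
        srv.bind(addr)
        srv.listen(1)
        conn, _ = srv.accept()
        with conn:
            conn.sendall(conn.recv(1024))  # the "application": echo
```

`serve_once("/tmp/app.sock")` and `serve_once(("0.0.0.0", 8443))` run identical application code; only the bind address decides local vs remote, which is exactly the property QNX-style IPC gives you without the HTTP detour.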
And, of course, if you are playing a game -- there's nothing stopping you from running everything locally at native speed!
Instead of systems like QNX which elegantly handle distributed computing at the core of the IPC/process/network mechanism in a single place, though -- we basically look doomed to reinvent all of it over bespoke transport/application/distribution protocols throughout the stack. It's a huge shame, IMO.
Burroughs B5500, nowadays sold by Unisys as ClearPath.
OS/400, nowadays known as IBM i.
Using a database as file system, like OS/400 catalogs.
Containers were invented on OS/360, improved on other IBM models, and still offer resource management features not yet mainstream.