Good call-out, and I think that's a more practical approach for most systems.
For this project, one of my goals was to impose the fewest dependencies possible on the loaded executables, and give the illusion that they're running in a fully independent process, with their own stdin/out/err and global runtime resources.
There's a rich design space if you impose "compile as .so with well-known entry point," and certainly that's what I'd explore for production apps that need this sort of a model.
> I think it can work if you want processes with different lib versions or even different languages
This is exactly right: unrelated binaries can coexist, as can different versions of the same binary, etc.
> it sounds somewhat risky to pass data just like that
This is also right! I started building an application framework that could leverage this and provide some protections on memory use: https://github.com/jer-irl/tproc-actors , but the model is inherently tricky, especially with elaborate data structures, where ABI compatibility can be so brittle.
Thanks! The idea of launching additional components nearly "natively" from the shell was compelling to me early on, but I agree that shared libraries with a more opinionated "host program" is probably a more practical approach.
Explicit shared memory regions are definitely the standard approach to this sort of problem if you want isolated address spaces. One area I want to explore further is allocators that are aware of explicit shared memory regions, and perhaps ensuring that the regions get mmap'd to the same virtual address in all participants.
I wasn't, but I'll have to read more! There's some good relevant discussion here too: https://news.ycombinator.com/item?id=7554921 . I wanted to keep this project in user-space, but there's a lot of interesting ground on the OS side of things too. Something like Theseus <https://github.com/theseus-os/Theseus> is also interesting, providing similar protections (in theory) via enforced compiler invariants rather than hardware features.
Could you clarify what you mean by that? This does heavily rely on loaded code being position-independent, because the memory used will go into whatever regions `mmap` (without `MAP_FIXED`) returns.
I think it was meant not in a literal sense. ASLR is meant to make it hard to access memory which isn't yours. Your system makes it easy to access memory which isn't yours.
I wonder if the Rust checker could be made "extra process" aware in your scenario and thus allow rust programs to "connect to each other" in this shared memory space.
I'm not familiar with CPython GC internals, but I think there are mechanisms for Python objects to be safely handed to C/C++ libraries and used there in parallel? Perhaps one could implement a handoff mechanism that uses those same mechanisms. Interesting idea!
Not negative at all, thanks for commenting. You're right that the answer is "nothing," and that this is a major trade-off inherent in the model. From a safety perspective, you'd need to be extremely confident that all threadprocs are well-behaved, though a memory-safe language would help some. The benefit is that you get process-like composition as separate binaries launched at separate times, with thread-like single-address space IPC.
After building this, I don't think this is necessarily a desirable trade-off, and decades of OS development certainly suggest process-isolated memory is desirable. I see this more as an experiment to see how bending those boundaries works in modern environments, rather than a practical way forward.
One of my hobby open source projects includes multiple services, and I don't want to have to start and stop them individually just to test anything. They're designed to run standalone since it's a distributed system, but having to launch and stop each individual process was adding friction that lowered my enjoyment of working on it.
I recently ended up redesigning each service so they can run as a process or within a shared process that just uses LoadLibrary/dlopen (when not statically linked) to be able to quickly bring everything up and down at once, or restart a specific service if the binary changes.
Sure, a crash takes everything down rather than just one service, but it's lightweight (no need for complex deployment setups) and it made the project far more enjoyable to work on. It's adequate for development work.
Another plus has been a much cleaner architecture after doing the necessary untangling required to get it working.
It's certainly an interesting idea and I'm not wholly opposed to it, though I certainly wouldn't use it as a default process scheduler for an OS (not that you were suggesting that). I would be very concerned about the security implications: if there's no process boundary saying that threadproc A can't grab threadproc B's memory, there could pretty easily be unintended leakage of potentially sensitive data.
Still, I actually do think there could be an advantage to this if you know you can trust the executables, and if the executables don't share any memory; if you know that you're not sharing memory, and you're only grabbing memory with malloc and the like, then there is an argument to be made that there's no need for the extra process overhead.