Once the API is high-level enough, it becomes unusable for the major users: high-end networking, GPU and graphics libraries, and low-latency audio.
Nobody else truly needs to bypass the kernel. Even the low-latency audio case can work with good real-time task handling, which leaves exactly two classes of users, both of which already have special DMA handling in hardware. If that means introducing special-case kernel bypasses for high-scale computing, it has already been done, and the low-level APIs just get wrapped.
And the Achilles' heel is security.
If it's arguing for making all hardware a kernel-free fabric, it's essentially moving everything into firmware. Worst case, we get zero memory protection and unfixable bugs.
The reality of 100G+ or low-latency networking is that the kernel can't keep up with per-packet interrupts from the hardware, so you turn them off (adaptive coalescing / ethtool -C for Ethernet), and userspace TCP/IP stacks such as Intel / Linux Foundation's DPDK, Solarflare's OpenOnload, Mellanox's VMA, and Chelsio's Wire Direct exist to fill this need. Heck, even the BBC wrote their own kernel-bypass networking layer! Note that Solarflare, Mellanox, and Chelsio are all heavily used in High Performance Computing supercomputers as well as in finance, such as electronic trading. If there were no need, the market wouldn't have produced so many options.
Source: have worked in electronic trading as a Linux engineer for 11-12ish years.
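To make the bypass concrete, here's a minimal sketch of a DPDK-style poll-mode receive loop. It assumes DPDK is installed and that port 0 has already been configured and started (queue setup, mempool creation, and error handling are elided):

    #include <cstdlib>
    #include <cstdint>
    #include <rte_eal.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    int main(int argc, char** argv) {
        // Initialize DPDK's environment abstraction layer (hugepages, device probing).
        if (rte_eal_init(argc, argv) < 0)
            rte_exit(EXIT_FAILURE, "EAL init failed\n");
        // Assumes port 0 was configured, RX/TX queues set up, and rte_eth_dev_start() called.
        struct rte_mbuf* bufs[32];
        for (;;) {
            // Poll the RX ring directly from user space: no interrupts, no syscalls, no kernel copy.
            uint16_t n = rte_eth_rx_burst(0 /*port*/, 0 /*queue*/, bufs, 32);
            for (uint16_t i = 0; i < n; i++) {
                // Packet bytes live in bufs[i]; access them via rte_pktmbuf_mtod().
                rte_pktmbuf_free(bufs[i]);
            }
        }
    }

The hot path is a busy-poll loop pinned to a core, which is why these stacks can keep up where per-packet interrupts can't.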
KeyDB can easily get 2-3x the QPS with half the latency.
You and I disagree vehemently on this (hence the fork), but I really think you're optimizing for your own simplicity, not the user's. It should be the opposite, since the developer has the most insight into the software.
This seems counter-intuitive.
Hardware limitations belong to a different level of abstraction than OS-level APIs, which are what get exposed to applications.
Even POSIX does not expose hardware limitations.
Rather, "high-level" in the paper means an interface suitable for a wide range of applications, i.e., high-level in the sense that it is meant to be used directly by applications as a portable interface.
You said you don't want to make users deal with flow control and hardware details. Does that imply a userspace bypass library which handles that stuff for us? Does it look POSIX-y?
Memory-safe languages with rich runtimes only need a mini-kernel to run on bare metal.
Windows has been pushing for user space drivers for a while now, including GPUs.
Android is following the same path with Project Treble, and who knows what will happen with Fuchsia.
Likewise on many high integrity OSes for embedded deployment.
.NET has Netduino and eventually Meadow, although it is a subset.
Erlang has GRiSP.
OCaml has MirageOS.
To come back to your question, Go has TinyGo, gVisor, emgo, and Biscuit.
And you can have a look at this as well, https://nanovms.com/dev/tutorials/running-go-unikernels
And here http://unikernel.org/projects/
One third of the cost is actually expensive!
Also, the ScyllaDB NoSQL database (a C++ clone of Cassandra) uses the Seastar framework to achieve high I/O throughput.
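For a flavor of what that looks like, here's a minimal Seastar hello-world sketch (assuming Seastar is installed). Seastar runs one shared-nothing reactor per core and does all I/O asynchronously, optionally on top of DPDK:

    #include <seastar/core/app-template.hh>
    #include <seastar/core/future.hh>
    #include <iostream>

    int main(int argc, char** argv) {
        seastar::app_template app;
        // app.run boots one reactor (event loop) per core; application code is
        // expressed as futures/continuations rather than blocking threads.
        return app.run(argc, argv, [] {
            std::cout << "hello from the Seastar reactor\n";
            return seastar::make_ready_future<>();
        });
    }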
I am wondering whether this is a missing or simply a different understanding of the concept.
Sorry, I did not really get the difference between a library OS and unikernels.
There is still a lack of references explaining how the two relate.
RDMA and DPDK both use user-space drivers, which is necessary for kernel-bypass. I'm not advocating for a particular kernel-bypass solution. I'm arguing that if we use kernel-bypass for I/O, we should have a common, efficient, high-level interface for it.
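As a purely hypothetical illustration of what such a common interface could look like (all names here are invented for this sketch, not taken from any existing library):

    #include <cstddef>
    #include <functional>
    #include <span>

    // Hypothetical portable byte-stream interface; DPDK, RDMA verbs, or plain
    // sockets could be plugged in underneath without the application changing.
    struct ByteStream {
        virtual ~ByteStream() = default;
        // Completion callbacks fire from the runtime's poll loop, so the
        // application never touches rings, doorbells, or flow control.
        virtual void async_recv(std::span<std::byte> buf,
                                std::function<void(std::size_t)> on_done) = 0;
        virtual void async_send(std::span<const std::byte> buf,
                                std::function<void()> on_done) = 0;
    };

The point isn't this particular shape; it's that applications get one efficient, high-level target while the bypass plumbing stays behind it.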