
Uvm: a BSD virtual memory system (2016) - fanf2
http://blog.pr4tt.com/2016/02/02/BSD-virtual-memory/
======
drewg123
The author writes: _Page loanout is when a process loans its memory to another
process. This is useful particularly in networking, in which data can be sent
to the kernel’s network stack simply by loaning the appropriate pages. This
avoids the need for costly copy operations._

Does NetBSD/OpenBSD actually have a zero-copy sosend()?

I wrote one once for FreeBSD back in the 90s for some research work I was
doing. FreeBSD, even at that time, had the primitives needed for a userspace
application to loan pages to the kernel for transmit. However, unless the
application knew that it was using a zero-copy socket, it was a bit of a mess
in practice: most applications would not benefit, because they took COW page
faults when re-writing memory that was loaned to the kernel (and was marked
read-only). The other big problem was handling the mapping changes (e.g.,
marking memory read-only, and then restoring RW access). The whole thing
was crying out for a better interface. I probably should have required
applications to use aio_write(), or something similar. That would have removed
a lot of the overhead.
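
To make the mapping changes concrete, here is a rough userspace sketch of the read-only / restore-RW cycle. It is only an analogy under stated assumptions: it uses plain POSIX mmap()/mprotect(), not the in-kernel FreeBSD code path, and loan_cycle_demo() is a hypothetical name. A write during the read-only window would fault here, whereas in the kernel scheme it would trigger a COW copy instead.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: walk one page through a "loan" cycle.
 * Returns 0 on success, -1 on any failure. */
int loan_cycle_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    /* Anonymous private memory, like an ordinary application buffer. */
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memcpy(buf, "payload", 8);

    /* "Loan": drop write permission for the duration of the send. */
    if (mprotect(buf, len, PROT_READ) != 0) {
        munmap(buf, len);
        return -1;
    }

    /* Reads are still allowed while the page is read-only. */
    int ok = (memcmp(buf, "payload", 8) == 0);

    /* "Return": restore read/write access once the send is done. */
    if (mprotect(buf, len, PROT_READ | PROT_WRITE) != 0) {
        munmap(buf, len);
        return -1;
    }

    /* The application may write again without faulting. */
    buf[0] = 'P';
    ok = ok && (buf[0] == 'P');

    munmap(buf, len);
    return ok ? 0 : -1;
}
```

Calling loan_cycle_demo() from main() should return 0 on a POSIX system. Each send pays for two mapping changes, which is the overhead described above.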

~~~
rjsw
Isn't sendfile() the usual API to trigger zero-copy send?

It isn't available in NetBSD yet.

~~~
drewg123
Yes / no. Sendfile is for files. Sometimes you want to send normal anonymous
memory.

~~~
cryptonector
Why do CoW in sosend(2)? Just have a system call that requires the caller
not to touch the memory being loaned until the writes are completed.
CoW for memory is way too expensive.

This reminds me that CoW filesystems (think ZFS) and writing through mmap()
don't play well. You end up having to use msync(2), which many apps assume
is unnecessary, and msync(2) is often terribly slow (ISTR at least one
system ended up doing page-at-a-time sync writes!).

Now, what happens if you have a sosend(2) that requires the caller to leave
the memory alone, and the caller touches it anyway? Undefined behavior.
Possible outcomes include: some CRC/hash/MAC fails to verify, the
modification comes too late and is not included, or the modification comes
soon enough to be included.
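
A toy sketch of the first outcome, assuming the checksum is computed when the send is issued and the payload is read later at transmit time. The additive checksum is a hypothetical stand-in for a real CRC/hash/MAC, and both function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy 16-bit additive checksum; a stand-in for a real CRC/hash/MAC. */
uint16_t toy_checksum(const unsigned char *p, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return (uint16_t)(sum & 0xffffu);
}

/* Simulates a loaned buffer: the checksum is taken at "send" time,
 * the caller optionally scribbles on the buffer afterwards, and the
 * payload is read again at "transmit" time.
 * Returns 1 if the receiver's verification would pass, 0 otherwise. */
int verify_after_late_write(int modify)
{
    unsigned char buf[] = "hello";
    size_t len = sizeof(buf) - 1;          /* exclude the trailing NUL */

    uint16_t sent_sum = toy_checksum(buf, len);

    if (modify)
        buf[0] = 'j';                      /* caller breaks the contract */

    return toy_checksum(buf, len) == sent_sum;
}
```

verify_after_late_write(0) returns 1 and verify_after_late_write(1) returns 0. In a real stack the outcome depends on timing, which is what makes it undefined rather than merely wrong.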

~~~
drewg123
Yes, as I said above, I should probably have required a new API.

~~~
cryptonector
It could just be a new flag.

------
ncmncm
UVM used to be really cool, when there was only one core and one memory map.

Then those busybody CPU manufacturers gave us lots of cores, and each core its
own TLB, and then we had to trash everybody's TLB whenever somebody re-mapped
something. That made working by flapping page mappings slow everything else
down, and UVM became a specialist technique for embedded systems small enough
to have just one core, but big enough to have mapped memory.

Meanwhile, memory systems have got better and better at copying -- mostly just
by adding bandwidth, but also by having lots of registers to slurp bytes into
and spew out elsewhere -- to the point that gymnastics to avoid copying are
often slower than just copying.

