
Linux's Vmalloc Seeing “Large Performance Benefits” with 5.2 Kernel Changes - gmiller123456
https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.2-vmalloc-Performance
======
saagarjha
> Currently an allocation of the new VA area is done over busy list iteration
> until a suitable hole is found between two busy areas. Therefore each new
> allocation causes the list being grown.

Sounds accidentally quadratic?
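
To make the concern concrete, here is a minimal Python sketch (not the kernel's code) of a first-fit search that walks an address-sorted busy list until it finds a hole. The names `find_hole_linear` and `alloc_many` are hypothetical, and the model is deliberately simplified: equal-sized areas, no alignment, no frees.

```python
def find_hole_linear(busy, size, start=0, end=1 << 30):
    """First-fit search over `busy`, a list of (addr, length) sorted by addr.

    Returns (addr, insert_index, nodes_visited), or None if no hole fits.
    """
    cursor = start
    visited = 0
    for i, (addr, length) in enumerate(busy):
        visited += 1
        if addr - cursor >= size:      # hole before this busy area is big enough
            return cursor, i, visited
        cursor = addr + length         # skip past this busy area
    if end - cursor >= size:           # hole after the last busy area
        return cursor, len(busy), visited
    return None


def alloc_many(n, size=4096):
    """Allocate n equal-sized areas; return (busy list, total nodes visited)."""
    busy = []
    total_visited = 0
    for _ in range(n):
        found = find_hole_linear(busy, size)
        if found is None:
            raise MemoryError("address range exhausted")
        addr, index, visited = found
        busy.insert(index, (addr, size))
        total_visited += visited
    return busy, total_visited
```

In this model the i-th allocation walks all i existing areas, so n allocations visit n(n-1)/2 nodes in total: `alloc_many(100)` visits 4950 and `alloc_many(200)` visits 19900, roughly four times the work for twice the allocations.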

~~~
kazinator
Sounds like "vmalloc won't be heavily churned, unlike mmap, so who cares".

------
ncmncm
Still pointer-chasing, I see. It is remarkable that performance was ever
tolerable. Probably the need to zero the pages before delivering them to user
space masks almost any amount of inefficiency.

~~~
namibj
I wish they would consider using cacheline-sized B+ trees instead of dumb RB
trees. The latter are not making proper use of pipelined superscalar
processors (AKA any modern CPU that can run at over 1 GHz).

~~~
Circuits
Couldn't you (or someone) write the code and submit it for approval? I was
under the impression anyone could hack on the kernel (fairly new to Linux) and
make submissions for review.

~~~
ncmncm
The natural choice of language to code well-optimized data structures in is
C++, but the Linux old guard have shown themselves irrationally hostile to
integrating anything coded in C++.

Coding data structures in C is a formula for wasting your time, because at
each next use you have to start over nearly from scratch. That is why kernels
are such heavy users of ancient data structures user-space has largely
abandoned.

------
olliej
Sorry, my reading of this is that they had an important allocator using an O(N)
free-block search? That makes lots of algorithms and operations become
quadratic really easily - especially given no one expects linear allocation
cost.

RB tree is an interesting choice; presumably there’s a benefit vs. B-trees
(maybe reduced metadata cost?)

It’s also kind of frustrating when articles like this say things like “up to
X% faster”. That’s way underselling it: this is asymptotically faster - the
performance increase gets larger and larger as the number of allocated areas
grows, it’s not a simple multiplier :-/

------
kazinator
> _It uses a red-black tree that keeps blocks sorted by their offsets in pair
> with linked list keeping the free space in order of increasing addresses._

I.e. what has been used by the regular mmap for user space allocations for
like two decades.
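
As a rough illustration of the quoted design (address-sorted lookup plus cheap access to address-order neighbours), here is a Python sketch. It is not the kernel's code: a sorted Python list with `bisect` stands in for both the red-black tree and the linked list, the class name `FreeAreas` is hypothetical, and callers are assumed never to free overlapping ranges.

```python
import bisect


class FreeAreas:
    """Free address ranges kept sorted by start address.

    A sorted list stands in for the tree (binary search via bisect) and for
    the address-ordered linked list (neighbours are at index - 1 and index).
    """

    def __init__(self):
        self.starts = []   # sorted start addresses of free ranges
        self.ends = {}     # start -> end (exclusive)

    def free(self, start, end):
        """Return [start, end) to the free set, merging adjacent ranges."""
        i = bisect.bisect_left(self.starts, start)
        # Merge with the predecessor if it ends exactly where we begin.
        if i > 0 and self.ends[self.starts[i - 1]] == start:
            i -= 1
            start = self.starts.pop(i)
            del self.ends[start]
        # Merge with the successor if it begins exactly where we end.
        if i < len(self.starts) and self.starts[i] == end:
            succ = self.starts.pop(i)
            end = self.ends.pop(succ)
        self.starts.insert(i, start)
        self.ends[start] = end

    def ranges(self):
        """List the free ranges in increasing address order."""
        return [(s, self.ends[s]) for s in self.starts]
```

Finding the neighbours takes a binary search instead of a walk over every area. The list's `insert` and `pop` still shift elements, which is a cost of the stand-in that a real balanced tree would not have.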

------
egberts1
I see a big boost for high-speed network drivers.

~~~
viraptor
Why? I expect all high speed network drivers to be already zero-alloc in their
sending path. We have zero-copy interfaces as well. Is any network operation
really held up by allocs anymore?

~~~
cesarb
And even if they did allocate in the send or receive path, it would probably
be with kmalloc (possibly with GFP_ATOMIC), not with vmalloc.

