Linux's Vmalloc Seeing “Large Performance Benefits” with 5.2 Kernel Changes (phoronix.com)
109 points by gmiller123456 on May 20, 2019 | 14 comments


> Currently an allocation of the new VA area is done over busy list iteration until a suitable hole is found between two busy areas. Therefore each new allocation causes the list being grown.

Sounds accidentally quadratic?
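
Roughly, the scheme described in that quote amounts to something like this (a simplified userspace sketch, not the actual kernel code; the real allocator also handles alignment and guard pages): every allocation walks the address-ordered busy list looking for a hole, so each call is O(n) in the number of busy areas and a burst of n allocations costs O(n^2).

    /* Simplified sketch of a linear hole search over an address-ordered
     * list of busy areas. With n areas every call is O(n), so n
     * back-to-back allocations degrade to O(n^2). */
    struct busy_area {
        unsigned long start, end;   /* [start, end) of an occupied range */
        struct busy_area *next;     /* next area, in increasing address order */
    };

    /* Return the start of the first hole of at least 'size' bytes in
     * [lo, hi), or 0 if none exists. */
    static unsigned long find_hole(const struct busy_area *head,
                                   unsigned long lo, unsigned long hi,
                                   unsigned long size)
    {
        unsigned long cursor = lo;
        for (const struct busy_area *a = head; a; a = a->next) {
            if (a->start >= cursor && a->start - cursor >= size)
                return cursor;
            if (a->end > cursor)
                cursor = a->end;
        }
        return (hi > cursor && hi - cursor >= size) ? cursor : 0;
    }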


Sounds like "vmalloc won't be heavily churned, unlike mmap, so who cares".


Still pointer-chasing, I see. It is remarkable that performance was ever tolerable. Probably the need to zero the pages before delivering them to user space masks almost any amount of inefficiency.


I wish they would consider using cacheline-sized B+ trees instead of dumb RB trees. The latter do not make proper use of pipelined superscalar processors (AKA any modern CPU that can run at over 1 GHz).
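
Roughly what I have in mind (an illustrative sketch, not anything from the kernel; the fanout and key width are made up to fill a 64-byte line): pack a whole node into one cacheline, so each level of a lookup costs at most one cache miss, and the key comparisons within a node are independent loads the CPU can overlap, instead of the serial pointer chase an RB tree forces.

    #include <stdint.h>

    /* Hypothetical B+-tree node packed into one 64-byte cacheline:
     * 4 + 7*4 + 8*4 = 64 bytes. A lookup touches one line per level. */
    #define BPLUS_FANOUT 7

    struct bplus_node {
        uint32_t nkeys;                      /* number of live keys            */
        uint32_t key[BPLUS_FANOUT];          /* sorted separator keys          */
        uint32_t child[BPLUS_FANOUT + 1];    /* child indices into a node pool */
    } __attribute__((aligned(64)));

    /* Pick the child slot to descend into. The loop is branch-free and the
     * comparisons are independent, so a superscalar core can overlap them. */
    static unsigned bplus_slot(const struct bplus_node *n, uint32_t k)
    {
        unsigned slot = 0;
        for (unsigned i = 0; i < n->nkeys; i++)
            slot += (k >= n->key[i]);
        return slot;
    }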


Couldn't you (or someone) write the code and submit it for approval? I was under the impression anyone could hack on the kernel (I'm fairly new to Linux) and make submissions for review.


The natural choice of language to code well-optimized data structures in is C++, but the Linux old guard have shown themselves irrationally hostile to integrating anything coded in C++.

Coding data structures in C is a formula for wasting your time, because at each new use you have to start over nearly from scratch. That is why kernels are such heavy users of ancient data structures that user space has largely abandoned.


It takes time, effort and interest to do so.

Additionally, one must learn the politics, code-style, idiosyncrasies, etc before submission will be successful. And of course, the architecture of the project itself.

Open Source / FOSS creates the opportunity for anyone to offer code; it does not mean it will simply be accepted, or should be. And it does not mean you can, or should. But if you wish to, a path always exists (if they just plain don't want it, and you really want to add it... fork!)


Sorry, my reading of this is that they had an important allocator using an O(N) free-block search? That makes lots of algorithms and operations become quadratic really easily - especially given no one expects linear allocation cost.

RB tree is an interesting choice, presumably there’s a benefit vs btrees (maybe reduced metadata cost?)

It’s also kind of frustrating when articles like this say things like “up to X% faster”. That’s way underselling it: this is asymptotically faster - the speedup keeps growing as the number of allocated areas grows, it’s not a simple multiplier :-/


> It uses a red-black tree that keeps blocks sorted by their offsets in pair with linked list keeping the free space in order of increasing addresses.

I.e. what has been used by the regular mmap for user space allocations for like two decades.
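
In other words, the bookkeeping is roughly of this shape (a simplified sketch, not the verbatim kernel struct): each area hangs in a red-black tree keyed by its start address for O(log n) lookup, and on an address-ordered list so neighbours and free gaps can be walked in order, much like vm_area_structs for userspace mmap.

    #include <linux/rbtree.h>
    #include <linux/list.h>

    /* Simplified sketch of an address-range descriptor tracked two ways:
     * in an rb-tree for fast lookup by address, and on a list kept in
     * increasing address order for in-order traversal of neighbours. */
    struct va_area {
        unsigned long va_start;
        unsigned long va_end;
        struct rb_node rb_node;   /* keyed by va_start */
        struct list_head list;    /* areas in increasing address order */
    };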


I see a big boost for high-speed network drivers.


Why? I expect all high-speed network drivers to already be zero-alloc in their sending path. We have zero-copy interfaces as well. Is any network operation really held up by allocs anymore?


And even if they did allocate in the send or receive path, it would probably be with kmalloc (possibly with GFP_ATOMIC), not with vmalloc.


Drivers mostly allocate things from kmalloc, not vmalloc. vmalloc is only used for large allocations that exceed the maximum size that kmalloc can provide. It's traditionally not expected to "churn" either.

One use for vmalloc is for allocating loadable modules: when you insmod a driver, the space comes from vmalloc. Needless to say, there are few use cases for inserting and removing a driver thousands of times per second.
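
To make the distinction concrete, here is an illustrative (made-up) driver fragment: small hot-path objects come from kmalloc, with GFP_ATOMIC where sleeping is not allowed, while vmalloc is reserved for large, long-lived, virtually contiguous buffers set up once at probe time.

    #include <linux/slab.h>
    #include <linux/vmalloc.h>
    #include <linux/errno.h>

    static void *rx_metadata;   /* small, frequently allocated object */
    static void *big_table;     /* large table, allocated once        */

    static int example_alloc(void)
    {
        /* Hot path / atomic context: small object, no sleeping allowed. */
        rx_metadata = kmalloc(256, GFP_ATOMIC);
        if (!rx_metadata)
            return -ENOMEM;

        /* Probe time: 8 MiB of virtually contiguous memory, may sleep. */
        big_table = vmalloc(8 * 1024 * 1024);
        if (!big_table) {
            kfree(rx_metadata);
            return -ENOMEM;
        }
        return 0;
    }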


How come? Are skbuffs allocated with vmalloc? I thought they were allocated from low memory.



