
You can do that, if you can accept the overhead of a system call per outgoing/incoming frame.




Those work and will reduce the system call overhead. But testing showed that syscall overhead isn't actually the main culprit (e.g. you might gain 5% efficiency by batching syscalls).

A far bigger bottleneck with the kernel stack is that e.g. the route lookup, iptables rules and similar things are evaluated per packet, which takes up most of the time. That happens regardless of whether you deliver one packet per system call or several.

UDP generic segmentation offload (GSO - https://lwn.net/Articles/752184/) reduces that overhead by amortizing the in-kernel and driver operations over batches of datagrams. It makes a far bigger difference in efficiency than purely reducing syscalls (e.g. up to +100% efficiency - but it all depends on the other work the application and QUIC stack do and what the drivers support).
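
To make the GSO point concrete, here is a minimal sketch in Python, assuming a Linux kernel with UDP GSO support. The helper name send_segmented is made up for illustration, and the constant 103 for UDP_SEGMENT is taken from linux/udp.h because the Python socket module may not export it. The caller hands the kernel one large buffer plus a segment size, and the kernel splits it into datagrams after doing the per-packet work once. If the option is not supported, the sketch falls back to one send per datagram.

```python
import socket

# UDP_SEGMENT as defined in linux/udp.h; assumed here in case the
# socket module on this Python version does not export the constant.
UDP_SEGMENT = getattr(socket, "UDP_SEGMENT", 103)


def send_segmented(sock, payload, segment_size):
    """Send payload on a connected UDP socket as segment_size-byte datagrams.

    Hypothetical helper: tries UDP GSO first (one syscall, kernel splits
    the buffer), then falls back to one send() per datagram.
    Returns "gso" or "fallback" so the caller can see which path ran.
    """
    try:
        # Ask the kernel to cut every following send into segments.
        sock.setsockopt(socket.IPPROTO_UDP, UDP_SEGMENT, segment_size)
        sock.send(payload)
        return "gso"
    except OSError:
        # Not Linux, kernel too old, or the device path cannot do GSO.
        pass

    try:
        # Switch segmentation off again so plain sends behave normally.
        sock.setsockopt(socket.IPPROTO_UDP, UDP_SEGMENT, 0)
    except OSError:
        pass

    # Fallback: one system call (and one trip through the stack) per datagram.
    for offset in range(0, len(payload), segment_size):
        sock.send(payload[offset:offset + segment_size])
    return "fallback"
```

The last segment may be shorter than segment_size. The whole buffer still has to fit the maximum UDP payload size, and the kernel also caps how many segments one call may carry, so a real sender would chunk its queue accordingly.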


You can always use sendmmsg(2), writev(2) or other iovec-based APIs.
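
A small sketch of what the iovec part looks like, assuming a connected UDP socket (send_gathered is a hypothetical helper name). Note that on a datagram socket a gather write such as sendmsg(2) or writev(2) joins the buffers into one datagram; it saves a copy, not a syscall per datagram. Sending several datagrams in one call is what sendmmsg(2) is for, and that is not shown here.

```python
import socket


def send_gathered(sock, parts):
    """Send several buffers as ONE datagram with a single sendmsg() call.

    Hypothetical helper: parts is a list of bytes-like objects, e.g. a
    packet header and its payload, which the kernel gathers without the
    caller having to concatenate them first.
    Returns the number of bytes sent.
    """
    return sock.sendmsg(parts)
```

For example, send_gathered(sock, [header, body]) arrives at the receiver as a single datagram containing header followed by body.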



