Those work and will reduce the system call overhead, but testing showed that syscall overhead isn't actually the main culprit (e.g. you might gain 5% efficiency by switching to them).
A far bigger bottleneck in the kernel stack is that things like the route lookup and iptables rules are evaluated per packet, which takes up most of the time. That happens regardless of whether you deliver one packet per system call or several.
UDP generic segmentation offload (GSO - https://lwn.net/Articles/752184/) reduces that overhead by amortizing the in-kernel and driver operations over batches of datagrams. It makes a far bigger difference in efficiency than purely reducing syscalls (e.g. up to +100% efficiency, though it all depends on the other work the application and QUIC stack do and on what the drivers support).
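
For illustration, here is a rough sketch of what using GSO from userspace can look like on Linux. It is not taken from any particular QUIC stack. The helper names `build_gso_buffer` and `send_batch` are made up for this example, and `UDP_SEGMENT = 103` is the value from `<linux/udp.h>`, hardcoded here on the assumption that Python's `socket` module does not export it. The idea is to hand the kernel one large buffer plus a segment size, so the per-packet stack traversal is paid once per batch.

```python
import socket

# Socket option from <linux/udp.h>. Hardcoded because this sketch assumes
# the Python socket module does not provide the constant.
UDP_SEGMENT = 103


def build_gso_buffer(datagrams):
    """Join datagrams into one buffer suitable for a single GSO send.

    GSO splits the buffer at fixed offsets, so every datagram except the
    last must have the same length; the last one may be shorter.
    Returns (segment_size, buffer).
    """
    if not datagrams:
        raise ValueError("need at least one datagram")
    seg_size = len(datagrams[0])
    if seg_size == 0:
        raise ValueError("segment size must be non-zero")
    for d in datagrams[:-1]:
        if len(d) != seg_size:
            raise ValueError("all but the last datagram must be equal-sized")
    if len(datagrams[-1]) > seg_size:
        raise ValueError("last datagram must not exceed the segment size")
    return seg_size, b"".join(datagrams)


def send_batch(sock, datagrams, addr):
    """Send a batch with GSO if possible, else one sendto() per datagram.

    Returns True if the GSO path was used. The kernel also limits the
    number of segments and the total size per call; this sketch does not
    check those limits and simply falls back on any OSError.
    """
    seg_size, buf = build_gso_buffer(datagrams)
    try:
        # Ask the kernel to split subsequent sends into seg_size chunks.
        sock.setsockopt(socket.IPPROTO_UDP, UDP_SEGMENT, seg_size)
        sock.sendto(buf, addr)
        return True
    except OSError:
        # No GSO support (old kernel, non-Linux) or the send was rejected:
        # disable segmentation again and send the datagrams one by one.
        try:
            sock.setsockopt(socket.IPPROTO_UDP, UDP_SEGMENT, 0)
        except OSError:
            pass
        for d in datagrams:
            sock.sendto(d, addr)
        return False
```

A caller would create a normal `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` and pass a list of same-sized QUIC packets to `send_batch`. Setting the segment size per socket is the simplest variant; it can also be supplied per call as ancillary data, which is probably the better fit when packet sizes vary between batches.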