Tokio has io_uring support through the separate tokio-uring crate (https://github.com/tokio-rs/tokio-uring), so perhaps once it's mature and battle-tested, it'd be easier to transition to it, if Cloudflare aren't using it already.
Last time I benchmarked io_uring, it was good enough, but I was able to get measurably faster results from plain syscalls and a well-crafted userspace thread pool, for storage I/O to NVMe.
(Note: the I/O thread pool I wrote came out noticeably faster than fio, the benchmark tool, so don't assume fio results are the best possible.)
For that reason, I have a small application which can be configured for either io_uring or the thread pool, and the thread pool is preferred for performance.
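
To make "syscalls plus a userspace thread pool" concrete, here is a minimal Rust sketch of the general idea. It is not the pool described above: `IoPool` and `ReadJob` are hypothetical names, it is Unix-only, and it leaves out what a tuned pool would likely need (O_DIRECT, aligned buffers, queue-depth control, CPU pinning). Each worker simply issues a blocking positioned read (pread) and sends the result back over a channel.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// One positioned read request; the result is sent back on `reply`.
/// (Hypothetical type, for illustration only.)
struct ReadJob {
    offset: u64,
    len: usize,
    reply: mpsc::Sender<io::Result<Vec<u8>>>,
}

/// Hypothetical fixed-size pool of threads issuing blocking pread calls.
pub struct IoPool {
    tx: Option<mpsc::Sender<ReadJob>>,
    workers: Vec<thread::JoinHandle<()>>,
}

impl IoPool {
    pub fn new(file: File, threads: usize) -> Self {
        let file = Arc::new(file);
        let (tx, rx) = mpsc::channel::<ReadJob>();
        let rx = Arc::new(Mutex::new(rx));
        let workers = (0..threads)
            .map(|_| {
                let file = Arc::clone(&file);
                let rx = Arc::clone(&rx);
                thread::spawn(move || loop {
                    // The queue lock is held only while dequeuing, not during the read.
                    let job = match rx.lock().unwrap().recv() {
                        Ok(job) => job,
                        Err(_) => break, // sender dropped: the pool is shutting down
                    };
                    let mut buf = vec![0u8; job.len];
                    // Blocking positioned read; fails if the range runs past EOF.
                    let res = file.read_exact_at(&mut buf, job.offset).map(|_| buf);
                    // The caller may have dropped its receiver; ignore that case.
                    let _ = job.reply.send(res);
                })
            })
            .collect();
        IoPool { tx: Some(tx), workers }
    }

    /// Queue a read and return a receiver that will yield its result.
    pub fn read_at(&self, offset: u64, len: usize) -> mpsc::Receiver<io::Result<Vec<u8>>> {
        let (reply, result) = mpsc::channel();
        self.tx
            .as_ref()
            .expect("pool already shut down")
            .send(ReadJob { offset, len, reply })
            .expect("all workers exited");
        result
    }
}

impl Drop for IoPool {
    fn drop(&mut self) {
        // Closing the channel makes every worker's recv() fail, ending its loop.
        drop(self.tx.take());
        for w in self.workers.drain(..) {
            let _ = w.join();
        }
    }
}
```

Usage would be along the lines of `let pool = IoPool::new(File::open(path)?, 8);` followed by `pool.read_at(offset, len).recv()`. Several reads can be queued before collecting any results, which is what keeps multiple requests in flight at the device.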