> there’s a trick to turning these kinds of I/O operations into something you can put into an event loop: you run the desired operation (mkdir() in this case) in another thread, and then wait for the thread to finish with a timeout

This isn't a trick; this is expected. Calls return when they're supposed to return, or not at all. If you need to return BEFORE the system call is done, you need to do the work somewhere else (like in a new thread or process, or on another node). This is also not limited to I/O but applies to basically any system call: if you return before it's done, it may break something, so interrupting it might not give you a good timeout mechanism.
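
For concreteness, the trick from the quote looks roughly like this on a POSIX system. A minimal sketch, assuming the GNU-specific pthread_timedjoin_np(); the path and the two-second timeout are made up for illustration:

    #define _GNU_SOURCE
    #include <errno.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <time.h>

    /* The blocking call lives in its own thread; nothing here can cut it short. */
    static void *do_mkdir(void *arg)
    {
        if (mkdir((const char *)arg, 0755) != 0)
            perror("mkdir");
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        struct timespec deadline;

        pthread_create(&tid, NULL, do_mkdir, "/tmp/example-dir");

        /* Wait for the worker, but give up after roughly 2 seconds. */
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec += 2;

        if (pthread_timedjoin_np(tid, NULL, &deadline) == ETIMEDOUT) {
            /* We stop waiting, but mkdir() is still in flight in the worker. */
            printf("mkdir timed out; detaching worker\n");
            pthread_detach(tid);
        }
        return 0;
    }

Which is exactly the point above: the mkdir() itself is never interrupted; you've only moved where you wait.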

Something else to ask yourself is why you need to return before the call is done. It's similar to the NFS hard-vs-soft-mounting argument. Soft mounting can cause damage when improperly interrupted; hard mounting prevents this by waiting until the system is behaving properly again, with the side effect of pissing off the users.



In some cases, you know you're about to do a whole bunch of I/O operations, sometimes all at the same time, sometimes not. (It doesn't really matter.) Ideally, I'd like to transfer this knowledge (that is, the list of I/O operations) to the kernel wholesale, so that it has complete knowledge of the task at hand and can figure out the best way to complete it (which it really can't do if it can't see the whole picture). This might mean scheduling disk operations more efficiently to avoid seeking, batching multiple network requests into a single packet, etc.

You can't do that with synchronous APIs without hackery, since the very structure of the API is self-defeating when it comes to getting the complete picture to the kernel. If I have 1000 I/O operations, I do not want to spawn 1000 hardware threads: relative to the amount of information required to describe an I/O op, threads are incredibly expensive.

I don't want the syscall to represent the entirety of the work, but simply the request to have the work performed. The kernel's response is then essentially "Acknowledged, beginning this I/O. Here's a handle/means¹ to obtain the result of the operation." Then I can batch-request notifications of results through some kernel I/O event queue, e.g., kqueue or epoll.

¹ If handles are too much, you could instead agree to have the result placed in some sort of queue of results, which might itself be usable with kqueue/epoll.
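
For what it's worth, Linux's io_uring ended up providing roughly this model: submit a batch of requests in one go, then reap completions from a queue. A minimal sketch using liburing, assuming a recent kernel with liburing installed; the file path, buffer sizes, and request count are made up for illustration:

    #include <fcntl.h>
    #include <liburing.h>
    #include <stdio.h>
    #include <unistd.h>

    #define NREQS 4

    int main(void)
    {
        struct io_uring ring;
        static char bufs[NREQS][4096];
        int fd = open("/etc/hostname", O_RDONLY);   /* path just for illustration */

        if (fd < 0 || io_uring_queue_init(64, &ring, 0) < 0)
            return 1;

        /* Describe all the work up front; nothing blocks here. */
        for (int i = 0; i < NREQS; i++) {
            struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
            io_uring_prep_read(sqe, fd, bufs[i], sizeof bufs[i], (__u64)i * 4096);
            io_uring_sqe_set_data(sqe, (void *)(long)i);   /* our "handle" for this op */
        }
        io_uring_submit(&ring);   /* one syscall hands the whole batch to the kernel */

        /* Reap completions as the kernel finishes them, in whatever order. */
        for (int i = 0; i < NREQS; i++) {
            struct io_uring_cqe *cqe;
            io_uring_wait_cqe(&ring, &cqe);
            printf("op %ld finished with result %d\n",
                   (long)io_uring_cqe_get_data(cqe), cqe->res);
            io_uring_cqe_seen(&ring, cqe);
        }

        io_uring_queue_exit(&ring);
        close(fd);
        return 0;
    }

The submission only describes the work; the completion queue plays the role of the kqueue/epoll-style result queue the footnote describes.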


Do you want this bulk I/O syscall to inform you every time an operation is complete, or in stages? Do you want it to prioritize latency over bulk operations? Do you want it to take up more or less CPU? Will interrupts get thrown each time you query the status? Do you want to know when the operation is complete on the spindle, in the on-disk cache, or in the filesystem cache? Do you want it to handle network filesystems differently? Do you want it to take advantage of multichannel NCQ and other features, or implement your own in the kernel? Do you want this new I/O scheduler to affect the rest of the system's I/O, or only your application's? Do you want multiple applications to use different command queues, or for yours to trump them (priority)? Do you want the kernel to implement its own batch ordering or rely on vendor firmware? (It sounded at first like you were describing vectored I/O, but I assume you want something more abstract than that, kinda like a more generalized blk-multiqueue?)


All good questions, but none of these seem possible in today's POSIX APIs either. (Most, I feel, probably are best just implemented as "options" to the syscall in either the sync or async view of the world.) The point was more to have async operations be possible, whereas today, they're not.

> It sounded at first like you were describing vectored i/o but I assume you want something more abstract than that, kinda like a more generalized blk-multiqueue?

Asynchronous I/O, not so much vectored. Vectored is similar, but I want to stay away from that term, since most of the APIs I've seen for it (e.g., readv/writev) aren't actually asynchronous; they're just more efficient user-to-kernel bindings.
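
To illustrate the distinction: readv() gathers several buffers into a single syscall, but the call itself still blocks until the whole transfer is done, so it's vectored, not asynchronous. A quick sketch (the file path and buffer sizes are made up for illustration):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/uio.h>
    #include <unistd.h>

    int main(void)
    {
        char hdr[16], body[4096];
        struct iovec iov[2] = {
            { .iov_base = hdr,  .iov_len = sizeof hdr  },
            { .iov_base = body, .iov_len = sizeof body },
        };

        int fd = open("/etc/hostname", O_RDONLY);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        /* One syscall, two destination buffers -- and we sit here until it's done. */
        ssize_t n = readv(fd, iov, 2);
        printf("read %zd bytes into 2 buffers\n", n);

        close(fd);
        return 0;
    }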


In our case at $WORK, it's because we have hard timeouts to meet our service level agreements with the Monopolistic Phone Companies. We need to return a response within X time, no exceptions.


In my past work with teams that had network-service SLAs, they had to design a robust multithreaded backend app and modify the frontend service to ensure all HTTP transactions finished within 60 ms. Timeouts were one of the smaller concerns...



