(I should caveat what I'm about to say: I'm primarily concerned with writing robust, highly performant programs. I believe that should be a broad focus, but in practice it's a niche.)
That's the thing, though. It's arguably even more important for distributed code. If we abstract away the state machine too much, the code becomes difficult to reason about precisely because of the abstraction. The complexity that was explicit in the state machine doesn't go away; it resurfaces as confusing behavior in the abstracted version. Using lightweight threads or another high-level abstraction that approximates blocking code gets a program out the door faster, but at lower quality. Two examples to illustrate my point. First, you mention scatter-gather, but the base concept is orthogonal to sync/async; I/O, however, is characteristically async, so the underlying mechanisms are async anyway. Second, io_uring is showing that async can be good for performance without being a difficult interface.
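To make the scatter-gather point concrete, here's a rough Python sketch of the same fan-out/fan-in pattern written once over blocking calls and once over async calls. The `fetch_blocking` and `fetch_async` functions are hypothetical stand-ins for a remote request (the sleep and the squaring are made up for illustration); only the way the waiting is expressed differs between the two versions.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-in for a blocking remote call (not a real API).
def fetch_blocking(shard: int) -> int:
    time.sleep(0.01)  # pretend to wait on the network
    return shard * shard


# Hypothetical async equivalent: same wait, but it yields to the event loop.
async def fetch_async(shard: int) -> int:
    await asyncio.sleep(0.01)
    return shard * shard


def scatter_gather_sync(shards: list[int]) -> int:
    # Scatter: one blocking call per shard, each on its own pool thread.
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        results = list(pool.map(fetch_blocking, shards))
    # Gather: combine the per-shard results.
    return sum(results)


async def scatter_gather_async(shards: list[int]) -> int:
    # Scatter: start one coroutine per shard; gather waits for all of them.
    results = await asyncio.gather(*(fetch_async(s) for s in shards))
    # Gather: combine the per-shard results.
    return sum(results)


if __name__ == "__main__":
    shards = [1, 2, 3, 4]
    print(scatter_gather_sync(shards))
    print(asyncio.run(scatter_gather_async(shards)))
```

Both versions scatter, wait, and combine in the same shape. The sync one just hides the waiting inside threads, which is the kind of abstraction over an async mechanism I'm talking about.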
Sync code makes things easier to reason about, but also kind of not. I think the big issues with async are, first, that the OS hasn't done a good job of allocating responsibility for async interfaces and, second, that async is fundamentally difficult. The former makes async seem less efficient than it could be. That perception is fair insofar as a sync-over-async interface is better than an async-over-sync-over-async interface, but the async interfaces should still be accessible. The latter probably feeds a bias against touching async at all, even where mixing async and sync would give the best blend of performance and programmability.