I think you would get a much higher long-term payoff with a custom scheduler. Dask does something like this, both for scheduling and for when it has to "drain".
We considered that approach, but even plugging in a custom scheduler would not let us own scheduling end-to-end; there’s still latency introduced by events going through etcd. In the end, the complexity of Kubernetes got in our way, and we realized we were using it as a glorified OCI runtime API, so we decided to cut it out.