Regarding the section on `vectorcall`, I think the System V x86-64 ABI (the one used on UNIX-like systems) specifies that floating point/SSE parameters are passed in the XMM registers (the first eight go in XMM0 through XMM7), so they should stay in registers the whole time.
OTOH I am not sure whether GCC actually does this in practice. Whenever I've had to read x64 code produced by GCC with -S, I've noticed it tends to spill to the stack quite a lot.
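If anyone wants to check this themselves, here is a minimal sketch of the kind of test I have in mind. The names `v4sf` and `add_twice` are made up for illustration, and the vector type relies on the GCC/Clang `vector_size` extension rather than anything standard. Compiling it with something like `gcc -O2 -S` and reading the resulting assembly should show whether the arguments stay in XMM registers or get spilled.

```c
/* Hypothetical example, assuming GCC or Clang (uses the vector_size extension). */
typedef float v4sf __attribute__((vector_size(16)));

/* noinline keeps the call boundary visible in the generated assembly,
 * so the argument-passing registers can actually be inspected. */
__attribute__((noinline))
v4sf add_twice(v4sf a, v4sf b)
{
    /* Element-wise: a + a + b */
    return a + a + b;
}
```

I haven't pasted any assembly output here, since it will vary with the compiler version and optimization flags.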
To be honest, I have only limited experience with vectors at the moment, so I'm not entirely sure. I scanned the ABI[0], but in my current sleep-deprived state I can't make all that much sense of it, sorry. You might have better luck!