Nice. The way I read the cmove version, it's more or less this except the trick line goes
res += (c == 's') ? 1 : (c == 'p') ? -1 : 0
I haven't done C in decades, so I don't trust myself to performance-test this, but I'm curious how it compares. Pretty disappointed that TFA didn't go back and try that in C.
So I actually did try that, but IIRC it didn't produce a CMOV with either gcc or clang. I didn't put it in the repo because it wasn't an improvement (on my machine) and I decided not to write about it.
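For anyone who wants to try it themselves, here is a rough sketch of the ternary line quoted above dropped into a full loop. The function name `count_ternary` and the null-terminated input are assumptions for illustration, not the code from the article's repo, and whether the compiler turns this into a CMOV depends on the compiler, version, and flags.

```c
/* Hypothetical helper: adds 1 for each 's', subtracts 1 for each 'p',
 * and ignores every other character. Assumes a null-terminated string. */
int count_ternary(const char *input)
{
    int res = 0;
    for (const char *p = input; *p != '\0'; ++p) {
        char c = *p;
        /* The nested ternary from the comment above, unchanged. */
        res += (c == 's') ? 1 : (c == 'p') ? -1 : 0;
    }
    return res;
}
```

To see what actually gets emitted, compile with something like `gcc -O2 -S` or `clang -O2 -S` and look for `cmov` in the loop body.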