I would also not expect it to pessimize code generation, since the final dereference should always be of a non-volatile pointer, though I suppose an optimizer bug might make it behave otherwise.
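For what it's worth, here is a minimal sketch of what I mean by the final dereference being non-volatile. It assumes the pattern under discussion is the one where `volatile` is only a compile-time tag that gets cast away once a lock is held. `LockedAccess` is a name I made up for illustration, and the underlying object must not itself have been defined `volatile`, or the cast is undefined behavior.

```cpp
#include <mutex>

// Hypothetical helper: accepts a volatile-tagged reference, takes the lock,
// and hands out a plain (non-volatile) pointer for the lifetime of the guard.
template <typename T>
class LockedAccess {
public:
    LockedAccess(volatile T& obj, std::mutex& m)
        : ptr_(const_cast<T*>(&obj)),  // strip the tag; only valid if the object was not defined volatile
          lock_(m) {}                  // the mutex is held until this guard is destroyed

    // Every real access goes through a non-volatile pointer, so the compiler
    // is free to optimize these loads and stores as usual.
    T* operator->() const { return ptr_; }
    T& operator*() const { return *ptr_; }

private:
    T* ptr_;
    std::lock_guard<std::mutex> lock_;
};
```

Usage would look like `LockedAccess<int> a(shared, m); *a += 1;`, where `shared` is a `volatile int&` bound to an ordinary `int`.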
You can combine it with atomics; they're not substitutes. It could let you implement two versions of an algorithm: a lock-free multithreaded one, and a single-threaded one that uses relaxed accesses (or even fully non-atomic accesses, if C++ allowed that). You'd then auto-dispatch on the volatile qualifier. The possibilities are really endless; I'm sure the limiting factor here is our imagination.
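A rough sketch of that dispatch, under my own assumptions: `Counter`, `Path`, `increment()` and `get()` are hypothetical names, and the return value exists only so you can see which overload was picked. Overload resolution prefers the non-volatile member function for a non-volatile object, and only the volatile one is viable for a volatile-qualified object.

```cpp
#include <atomic>

// Hypothetical names throughout: Counter, Path, increment(), get().
enum class Path { LockFree, SingleThreaded };

class Counter {
public:
    // Selected when the object is reached through a volatile-qualified
    // reference or pointer, i.e. tagged as shared between threads.
    Path increment() volatile {
        value_.fetch_add(1, std::memory_order_seq_cst);  // atomic read-modify-write
        return Path::LockFree;
    }

    // Preferred for a non-volatile object: the caller asserts exclusive
    // ownership, so a relaxed load followed by a relaxed store is enough.
    Path increment() {
        int v = value_.load(std::memory_order_relaxed);
        value_.store(v + 1, std::memory_order_relaxed);
        return Path::SingleThreaded;
    }

    // Callable on both volatile and non-volatile objects.
    int get() const volatile { return value_.load(std::memory_order_acquire); }

private:
    std::atomic<int> value_{0};
};
```

The same object can be used both ways: call `increment()` on it directly while one thread owns it, then pass it around as `volatile Counter&` once it is shared.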
I've thought about the other types of split for a long time too, and I haven't managed to come up with other compelling use cases, even though I also feel they should exist. It would be interesting if someone could come up with one, because the ability to have commutative tags on a type seems really powerful.