That would make the delay much worse if you mispredict the user's action. Games would also have to add an additional 100 ms of lag for all important events.

If you had a big enough server, you could render multiple frames, one for each possible user action...

True, that could work, and the two frames are probably very similar, so it wouldn't even require that much more bandwidth. As someone here pointed out, however, the game controller is connected to the cloud directly, so the display doesn't even know about the inputs until the round trip is already done.
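
For what it's worth, here is a rough toy sketch in Python of the "render one frame per possible action" idea. Everything in it is made up for illustration: `render_speculative`, `select_frame`, and `toy_step` are hypothetical names, and a "frame" is just an integer standing in for a rendered image. It also assumes the display side can see the actual input locally, which is exactly what doesn't hold if the controller talks to the cloud directly.

```python
from typing import Callable, Dict, Iterable


def render_speculative(
    state: int,
    actions: Iterable[str],
    step: Callable[[int, str], int],
) -> Dict[str, int]:
    """Server side (hypothetical): produce one candidate frame per possible action."""
    return {action: step(state, action) for action in actions}


def select_frame(candidates: Dict[str, int], actual_action: str, fallback: int) -> int:
    """Display side (hypothetical): pick the pre-rendered frame for the real input.

    Falls back to the last known frame if the action was not predicted.
    """
    return candidates.get(actual_action, fallback)


def toy_step(state: int, action: str) -> int:
    """Toy game logic: the 'frame' is just the player's x position."""
    return state + {"left": -1, "right": 1, "none": 0}[action]


# The server renders every candidate ahead of time and ships them all.
candidates = render_speculative(10, ["left", "right", "none"], toy_step)

# The display picks locally, with no extra round trip, once the input is known.
frame = select_frame(candidates, "right", fallback=10)
```

The fallback is the misprediction case: if the real input isn't among the candidates, you're back to waiting for a normal round trip.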

