The Quake 3 engine added procedural curved surfaces, computed on the CPU. Meanwhile, the Unreal engine simply optimized the rendering of arbitrary meshes, which let artists build any shape of surface they wanted and let the GPU brute-force it.
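For the curious: Quake 3's curved surfaces were 3x3 biquadratic Bezier patches, tessellated into triangle grids on the CPU. A minimal sketch of that idea (illustrative only, not id's actual code; the function names are mine):

```python
# Sketch of CPU-side evaluation of a 3x3 biquadratic Bezier patch,
# the primitive Quake 3 used for curved surfaces. Not id's code --
# just the math the CPU had to grind through each tessellation.

def bezier_quad(p0, p1, p2, t):
    """Evaluate a quadratic Bezier curve at parameter t (points are tuples)."""
    a = (1 - t) ** 2
    b = 2 * (1 - t) * t
    c = t ** 2
    return tuple(a * u + b * v + c * w for u, v, w in zip(p0, p1, p2))

def eval_patch(ctrl, u, v):
    """Evaluate a 3x3 control-point patch at (u, v) in [0,1]^2."""
    rows = [bezier_quad(ctrl[i][0], ctrl[i][1], ctrl[i][2], u) for i in range(3)]
    return bezier_quad(rows[0], rows[1], rows[2], v)

def tessellate(ctrl, steps):
    """Produce a (steps+1) x (steps+1) vertex grid on the CPU."""
    return [[eval_patch(ctrl, i / steps, j / steps)
             for i in range(steps + 1)] for j in range(steps + 1)]
```

Every vertex of every patch comes out of `eval_patch` on the CPU each time the tessellation level changes, which is exactly the cost the Unreal approach avoided by just feeding the GPU a prebuilt dense mesh.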
The Doom 3 engine added a unified lighting model based on stencil shadows. Again, this was a heavily CPU-based technique while the industry was trending towards being CPU-bound. The chosen lighting equations also produced a very plastic-like look. Competitors used shadow maps, which are generated by the GPU, and those remain the standard for speed and quality in most games today.
The latest engine he is working on pushes "mega textures", which effectively emulate a hardware paging technique in software. "Mega textures" are the next logical step beyond mip-mapping, but mip-mapping has been performed by the graphics card for over a decade now. The SeaDragon team at Microsoft Research and Live Labs built a superior version of this idea before Carmack started on his, and they are actively working with hardware vendors and the Direct3D team to speed it up for use in games.
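To make the mip-mapping comparison concrete, here is a rough sketch of the level-of-detail selection hardware does for free, and which a software virtual-texturing scheme has to reproduce to decide which pages to stream. The function names and the simple clamp-to-chain policy are my own assumptions, not any particular engine's:

```python
import math

# Hedged sketch: picking a mip level from the screen-space texel footprint.
# GPUs do this per-pixel in hardware; a software "mega texture" system has
# to do the equivalent to know which texture pages it needs resident.

def mip_chain_levels(width, height):
    """Number of mip levels from the base texture down to 1x1."""
    return int(math.log2(max(width, height))) + 1

def mip_level(texels_per_pixel, num_levels):
    """Select a mip level as log2 of the texel footprint, clamped to the chain."""
    if texels_per_pixel <= 1.0:
        return 0  # magnification: sample the base level
    level = int(math.log2(texels_per_pixel))
    return min(level, num_levels - 1)
```

The point of the comparison: this tiny log2-and-clamp is done by fixed-function hardware per fragment, while a software paging scheme has to approximate it on the CPU and then service the resulting page misses itself.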
Once again, Carmack has opted to trade precious CPU cycles -- better budgeted for game logic, physics, and AI -- for generality. He seems to be fighting a crusade against special cases in his engines at the expense of the games made with that engine.
There is a great reason why the Unreal Engine has been the most popular middleware platform since Quake 3...
"Again, this was a heavily CPU-based technique while the industry was trending towards being CPU-bound." -- isn't the industry now trending in the opposite way? i.e., everything is becoming a CPU (and it has more and more cores)
GPUs are becoming more CPU-like. They are gaining more general-purpose instructions and support for more complex computation.
CPUs are becoming more GPU-like. They are increasingly pipelined and parallelized.
However, there are still fundamental differences between CPUs and GPUs that make them suited to different tasks. CPUs tend to support branch prediction (although not always; see game consoles) and rich synchronization primitives for threading. GPUs gain much of their speed by making assumptions about side effects and ordering in their calculations. Branching is rarely, if ever, needed at runtime, and almost all of the operations performed by most applications can be encoded as vector operations.
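The "branching becomes vector operations" point is easy to show: a per-element conditional can be rewritten as a branch-free blend of two inputs, which is the form GPUs favor. A toy sketch in plain Python lists (the `select`/`step` names mirror shading-language conventions but this is illustrative, not any real API):

```python
# Sketch: a data-dependent "if" expressed as branch-free arithmetic,
# the way GPU code avoids divergent branching.

def step(edge, xs):
    """Elementwise threshold producing a 1.0/0.0 mask (like GLSL's step())."""
    return [1.0 if x >= edge else 0.0 for x in xs]

def select(mask, a, b):
    """Elementwise a-where-mask-else-b, done purely with arithmetic:
    the mask (1.0/0.0) blends the two inputs, so no branch per lane."""
    return [m * x + (1.0 - m) * y for m, x, y in zip(mask, a, b)]

# Example: clamp negatives to zero without an if on each element.
xs = [-2.0, -0.5, 0.0, 3.0]
relu = select(step(0.0, xs), xs, [0.0] * len(xs))  # -> [0.0, 0.0, 0.0, 3.0]
```

Both sides of the conditional are computed and blended, which looks wasteful on a CPU but is exactly what keeps all lanes of a wide vector unit busy on a GPU.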
The future of powerful computers is probably a hybrid of various processing units, memory architectures, and special purpose hardware. We're going to need software abstractions to deal with this complexity.
When designing rendering engines, the general goal is to target 80% to 90% of the market conditions at the time your game ships. Considering most games have a 1 to 3 year development cycle, designing around today's hardware seems like a weak bet.