Not all GPUs have the Z-Buffer fillrate to make that approach viable (especially on mobile). I've seen more than a few cases where it was actually faster to turn off Z-Buffering and accept the overdraw. More than a few architectures share Z-Buffer bandwidth with other pipelines.
On tiled architectures it'll also increase your tile count, which can impact your per-drawcall overhead.
For low-triangle-count things like UI you're better off doing the rect culling yourself in software (ideally on another thread) and only falling back to the Z-Buffer if you have actual 3D transforms that need per-pixel culling.
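To make the software rect-culling idea concrete, here is a minimal sketch in Rust. It is only an illustration under stated assumptions: the `Rect` and `Item` types and the `cull_occluded` function are hypothetical names, items are axis-aligned and untransformed, and an item is dropped only when one single opaque rectangle painted later covers it completely. It is not how any particular browser or engine does it.

```rust
/// Axis-aligned rectangle in device pixels; `x1`/`y1` are exclusive.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Rect {
    pub x0: i32,
    pub y0: i32,
    pub x1: i32,
    pub y1: i32,
}

impl Rect {
    /// True if `self` fully covers `other`.
    pub fn contains(&self, other: &Rect) -> bool {
        self.x0 <= other.x0
            && self.y0 <= other.y0
            && self.x1 >= other.x1
            && self.y1 >= other.y1
    }
}

/// Hypothetical UI draw item: a rectangle plus an opacity flag.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Item {
    pub rect: Rect,
    pub opaque: bool,
}

/// `items` is in paint order (back to front). Returns the indices, still in
/// paint order, of items that are not completely hidden by a single opaque
/// item painted after them.
pub fn cull_occluded(items: &[Item]) -> Vec<usize> {
    let mut occluders: Vec<Rect> = Vec::new();
    let mut visible: Vec<usize> = Vec::new();

    // Walk front to back so every potential occluder is seen before the
    // items it might hide.
    for (i, item) in items.iter().enumerate().rev() {
        if occluders.iter().any(|o| o.contains(&item.rect)) {
            continue; // fully covered, skip the draw call
        }
        visible.push(i);
        if item.opaque {
            occluders.push(item.rect);
        }
    }

    visible.reverse(); // restore back-to-front paint order
    visible
}
```

Note that this is quadratic in the number of items and says nothing about rounded corners or partial overlap, which is where the real cost of doing it on the CPU tends to show up.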
Not in our experience. WebRender 1 used to do the rectangle culling on CPU and it ended up being way slower than using the Z buffer on every architecture we tried, including mobile. (Overdraw was even worse.) There are a surprisingly large number of vertices on most pages due to glyphs and CSS borders. Note also that rounded rectangles are extremely common on the Web and clipping those in software is a big pain.
Generally, we are so CPU bound that moving anything to the GPU is a win. We had to fight tooth and nail to make WebRender even 50% GPU bound...
Fair enough, my data was from about 4 years ago, so it may be out of date. There are some embedded GPUs that have some pretty 'interesting' architectures.
I would argue, though, that if overdraw vs. Z-buffer hurts your performance then you are more than 50% GPU bound :).
> Not all GPUs have the Z-Buffer fillrate to make that approach viable (especially on mobile). I've seen more than a few cases where it was actually faster to turn off Z-Buffering and accept the overdraw. More than a few architectures share Z-Buffer bandwidth with other pipelines.
Uh, mobile GPUs tend to be way, way ahead of desktop GPUs in Z-Buffer bandwidth (especially in relative terms).
And you can't increase your tile count; there's a fixed number of tiles in the framebuffer (unless you're referring to not drawing some tiles at all, in which case set your scissor rect appropriately. Oh, and if you can find out the GPU's tile size, round your rectangle up to cover whole tiles).
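For what it's worth, the "round your rectangle up to cover whole tiles" part is just integer arithmetic. A rough sketch in Rust, assuming you somehow know the tile size (the 32x32 in the usage note is made up, not any real GPU's value) and with `PixelRect` and `snap_to_tiles` as hypothetical names:

```rust
/// Rectangle in framebuffer pixels; `x1`/`y1` are exclusive.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct PixelRect {
    pub x0: u32,
    pub y0: u32,
    pub x1: u32,
    pub y1: u32,
}

/// Expands `rect` outward so every edge lands on a tile boundary, then clamps
/// the result to the framebuffer. `tile_w` and `tile_h` must be non-zero.
pub fn snap_to_tiles(
    rect: PixelRect,
    tile_w: u32,
    tile_h: u32,
    fb_w: u32,
    fb_h: u32,
) -> PixelRect {
    // Round the top-left corner down to the start of its tile.
    let x0 = rect.x0 / tile_w * tile_w;
    let y0 = rect.y0 / tile_h * tile_h;

    // Round the bottom-right corner up to the end of its tile, but never
    // past the edge of the framebuffer.
    let x1 = ((rect.x1 + tile_w - 1) / tile_w * tile_w).min(fb_w);
    let y1 = ((rect.y1 + tile_h - 1) / tile_h * tile_h).min(fb_h);

    PixelRect { x0, y0, x1, y1 }
}
```

With 32x32 tiles on a 128x64 framebuffer, a dirty rect of (10, 20)-(70, 40) becomes (0, 0)-(96, 64); that origin plus width and height is what you would then pass to something like `glScissor`.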
Not quite. Tile-based GPUs tend to be better on Z-Buffer bandwidth, but not all mobile GPUs are true tile-based GPUs.
The number of tiles can definitely change; it's something Qualcomm calls out directly in one of their talks [1]. As your tile count increases, so does the setup cost for your drawcalls.
Browsers don't typically do very good occlusion culling in general.
WebRender aims to change that, by using the hardware Z-buffer. :)
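For anyone wondering what "using the hardware Z-buffer" can look like for 2D content, here is a rough Rust sketch of one common arrangement, not a description of WebRender's actual code. The assumption is that every primitive gets a depth from its paint order, opaque primitives are drawn front to back with depth test and depth write enabled so hidden pixels get rejected early, and blended primitives are then drawn back to front with depth test on but depth write off. `Prim`, `Passes`, and `build_passes` are hypothetical names.

```rust
/// Hypothetical 2D primitive, identified by `id`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Prim {
    pub id: u32,
    pub opaque: bool,
}

/// Two draw lists of `(id, z)` pairs; a larger `z` means closer to the viewer.
#[derive(Debug, PartialEq)]
pub struct Passes {
    /// Front to back; intended for depth test on, depth write on.
    pub opaque: Vec<(u32, u32)>,
    /// Back to front; intended for depth test on, depth write off.
    pub blended: Vec<(u32, u32)>,
}

/// `prims` is in paint order (back to front).
pub fn build_passes(prims: &[Prim]) -> Passes {
    let mut opaque = Vec::new();
    let mut blended = Vec::new();

    for (i, p) in prims.iter().enumerate() {
        // Paint order doubles as depth: later items sit in front.
        let z = i as u32;
        if p.opaque {
            opaque.push((p.id, z));
        } else {
            blended.push((p.id, z));
        }
    }

    // Draw the front-most opaque items first so the depth test can discard
    // the hidden fragments of everything behind them.
    opaque.reverse();

    Passes { opaque, blended }
}
```

The point is that the CPU only does a sort and a split; the per-pixel "is this covered?" question is left to the depth hardware.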