Did you profile the layout code? How many UI elements do you display normally? And what is the O complexity of the layout algorithm above? My intuition is that even if it looks like a lot of code, it should be incredibly fast for at least hundreds of elements.
I profile aggressively using Superluminal. All of the passes are O(N), it's mostly an issue of the amount of time it takes to go through and lay out a few thousand boxes with constraints and configuration flags set. There aren't many 'bottlenecks' and it's more just a bunch of CPU time spread across the whole algorithm.