I don't have a problem with 4-layer boards. It's not worth wasting the brainpower (and debugging time) anymore to route something on 2 layers when you can have great ground and power planes for a marginal increase in cost.
The issue seems to be keeping all the RAM in the FPGA. That's pretty expensive.