There were some pushes for wafer-scale stuff in the 80s. I think we are better positioned from an algorithms, EDA, and architecture standpoint now to actually make it work. A GPU is actually a pretty good test bed, and eventually GPU and FPGA functionality will merge into a single programmable compute fabric.
Yield on a wafer-scale FPGA could actually be much better than on a special-purpose chip. The faulty logic elements / LUTs could just be marked as bad and not used.
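To sketch the idea (this is a hypothetical defect-map structure, not any real vendor toolflow): place-and-route would keep a map of LUT sites that failed wafer test and simply never assign logic to them.

```python
# Hypothetical defect map: LUT sites that failed wafer test (example data).
FAULTY = {(0, 3), (2, 1)}

def usable_sites(rows, cols):
    """All LUT sites on the fabric except those in the defect map."""
    return [(r, c) for r in range(rows) for c in range(cols)
            if (r, c) not in FAULTY]

sites = usable_sites(3, 4)
print(len(sites))  # 10 of the 12 sites remain usable
```

The yield win comes from the fact that losing a handful of sites costs a little capacity rather than the whole die.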
An ASIC (GPUs are ASICs) running at 300 MHz can do a lot more per cycle than an FPGA at the same technology node running at the same frequency. A lot. Think order-of-magnitude more.
> The faulty logic elements / LUTs could just be marked as bad and not used.
This will screw up your timing unless you reserve more setup slack (which in turn hurts the achievable performance).
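Back-of-envelope version of the problem (illustrative delay numbers only, not from any real part): setup slack is the clock period minus the total register-to-register delay, and detouring nets around bad LUT sites inflates the routing term.

```python
def setup_slack_ns(t_clk, t_cq, t_logic, t_route, t_setup):
    """Setup slack = clock period minus total register-to-register delay (ns)."""
    return t_clk - (t_cq + t_logic + t_route + t_setup)

# Nominal path at 300 MHz (3.33 ns period) -- made-up but plausible numbers.
print(round(setup_slack_ns(3.33, 0.2, 1.8, 0.9, 0.1), 2))  # 0.33 ns: meets timing

# Same path detoured around a faulty LUT: extra routing delay, negative slack.
print(round(setup_slack_ns(3.33, 0.2, 1.8, 1.4, 0.1), 2))  # -0.17 ns: fails timing
```

Reserving margin up front (budgeting for the worst-case detour) keeps timing closed, but that margin is exactly the performance you give up.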
Edit: Or it might have been a PGA, rather than an FPGA.
Fab equipment also has limits on how big a single chip can be due to the "reticle limit", i.e., the size of a single mask exposure, and a chip normally has to fit in a single exposure (per layer).
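Rough numbers (approximate; the common maximum exposure field is about 26 mm by 33 mm): a 300 mm wafer covers dozens of reticle fields, which is why wafer-scale parts have to stitch many exposures together.

```python
import math

reticle_mm2 = 26 * 33          # ~858 mm^2: a common maximum exposure field
wafer_mm2 = math.pi * 150**2   # 300 mm wafer -> ~70,700 mm^2 of silicon

print(reticle_mm2)                        # 858
print(round(wafer_mm2 / reticle_mm2, 1))  # ~82 reticle fields of wafer area
```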
The CAD software (place+route) might actually be a limiter here too. Even if, in theory, you had an FPGA with enough logic blocks to match a modern GPU's raw gate count (i.e., circuit size), an FPGA is just a sea of identical configurable blocks and parts of the circuit are assigned to the blocks and connected with wires using something like simulated annealing. This is why FPGA place+route takes so long. In contrast, a custom chip can put exactly the right gates and wires at exactly the right places, and especially for something like a GPU or a large cache where there are lots of repeated elements, this human involvement makes things much more tractable. (It's also why e.g. the caches on a modern CPU die look so pretty, like a big grid of farms or something -- an FPGA-synthesized CPU cache would look nothing like that, and take much more space!)
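A toy version of annealing-based placement, to make the "sea of identical blocks" point concrete (a deliberately tiny sketch; real tools like VPR work on the same principle at vastly larger scale, which is why FPGA place-and-route takes so long):

```python
import math
import random

random.seed(0)

# Toy placer: assign 4 blocks wired in a ring to the cells of a 2x2 grid,
# minimizing total Manhattan wirelength via simulated annealing.
NETS = [(0, 1), (1, 2), (2, 3), (3, 0)]  # block pairs connected by wires

def wirelength(pos):
    return sum(abs(pos[a][0] - pos[b][0]) + abs(pos[a][1] - pos[b][1])
               for a, b in NETS)

pos = {blk: (blk // 2, blk % 2) for blk in range(4)}  # arbitrary start
temp = 2.0
while temp > 0.01:
    a, b = random.sample(range(4), 2)
    before = wirelength(pos)
    pos[a], pos[b] = pos[b], pos[a]            # propose swapping two blocks
    worse_by = wirelength(pos) - before
    if worse_by > 0 and random.random() >= math.exp(-worse_by / temp):
        pos[a], pos[b] = pos[b], pos[a]        # reject the uphill move
    temp *= 0.95                               # cool the schedule

# Final greedy pass so this toy deterministically reaches the optimum.
improved = True
while improved:
    improved = False
    for a in range(4):
        for b in range(a + 1, 4):
            before = wirelength(pos)
            pos[a], pos[b] = pos[b], pos[a]
            if wirelength(pos) < before:
                improved = True
            else:
                pos[a], pos[b] = pos[b], pos[a]

print(wirelength(pos))  # 4: every ring edge maps to adjacent cells
```

Even this 4-block problem is a search over placements; a GPU-sized netlist makes the same search astronomically larger, whereas a custom layout just puts each gate where a human (or structured generator) decided it belongs.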
All of that said -- from an academic point of view, this GPGPU core is an impressive piece of work, especially given that it includes an LLVM backend!
FPGAs for ASIC simulation of larger chips are typically arrays of FPGAs rather than single chips.
You get stuff like this:
I'm not aware of anybody shipping a wafer scale FPGA.
Now I get it. Sorry for the misunderstanding, appreciate the expansion.
I wonder if an OpenCL stack can be made...