I find it curious - and slightly scary - that as the world is stampeding towards increasingly-parallelized computing models, most of us in ASIC design are becoming increasingly thwarted by the limitations of functional simulation - which, by and large - is pretty much single-threaded. I mean, we're supposed to be designing to keep up with Moore, and our simulator performance has pretty much flat-lined. And even more alarming, I've heard very few ASIC people even talk about it.
I'm curious of your take on that.
Our software stack uses adaptive compilation to reconfigurable hardware, so we can identify hotspots in the code that can be compiled to the FPGA at runtime. Eventually we will be able to write and debug the whole ASIC in our software at runtime on the FPGA.
Simulating a single core is not to hard because our microcode processors is small. The ring network connecting cores, caches, memory and four 10 Gbps off-chip communication channels are harder to simulate tough.
Our SiliconSqueak processor is in an FPGA stage right now, we will make the first ASIC version this year. At this time I have no hard numbers on the watt/performance and price/performance of the ASIC. The energy use of the FPGA per result is an order of magnitude below that of an Intel x86 core and we expect the ASIC to be much better.
Regardless of the efficiency of the code, I see no need to re-introduce determinism down the line.
Some older information can be found on http://www.siliconsqueak.org/