> They'll probably still have to do advanced packaging putting HBM on top to save things.
This is where the interesting wafer-scale packaging TSMC does for the Dojo D1 supercomputer comes in. Cerebras has demonstrated what can be a superior approach for inter-element bandwidth, because connections can be denser than they are with an interposer, but the ability to combine elements built on different processes is also important, and it is used on the D1 slab. Stacking HBM modules on top of a Cerebras wafer might help with that. I'm sure the smart people there are not sleeping on these ideas.
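To make the density argument concrete, here is a rough back-of-envelope sketch (Python) of how connection pitch along a die edge translates into aggregate cross-boundary bandwidth. Every number in it is a placeholder I picked purely for illustration - none of the pitches, edge lengths, or per-connection rates are actual figures for TSMC's Dojo packaging, interposers, or Cerebras wafers.

```python
# Toy model: finer connection pitch => more connections per mm of die edge
# => more aggregate bandwidth across that edge. All values are hypothetical.

def edge_bandwidth_gb_s(pitch_um: float, edge_mm: float, gb_s_per_connection: float) -> float:
    """Aggregate bandwidth (GB/s) across one die edge, assuming one row of
    connections spaced at the given pitch along that edge."""
    connections_per_mm = 1000.0 / pitch_um      # connections per mm of edge
    connections = connections_per_mm * edge_mm  # total connections on the edge
    return connections * gb_s_per_connection

# Hypothetical parameters: coarser interposer-style bumps vs. a denser
# wafer-scale-style interconnect. Same edge length and per-connection rate.
interposer_style = edge_bandwidth_gb_s(pitch_um=40.0, edge_mm=25.0, gb_s_per_connection=0.5)
wafer_scale_style = edge_bandwidth_gb_s(pitch_um=10.0, edge_mm=25.0, gb_s_per_connection=0.5)

print(f"interposer-style edge bandwidth : {interposer_style:8.1f} GB/s")
print(f"denser wafer-scale-style edge   : {wafer_scale_style:8.1f} GB/s")
```

With these made-up inputs the denser pitch wins by the same 4x factor as the pitch ratio, which is the whole point: for a fixed edge and link speed, bandwidth scales directly with connection density.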
For ultra-low-latency uses such as robotics or military applications, I believe a more integrated approach similar to IBM's Telum processors is better - putting the inference accelerator on the same die as the CPUs gives them that latency, and they are also much smaller than a Cerebras wafer (and its cooling).
Gene Amdahl would have loved to see them.