Fugaku as a whole draws about 30MW; this job's share of the machine works out to roughly 2.6MW.
There isn't really a published figure for how much compute time it takes to train this thing, but 8x H100s have about 32PF of AI compute among them. This job had 2,100 (half precision[1]) PF-in-fugaku / 158,956 nodes-in-fugaku * 13,824 nodes-in-job ≈ 182 PF-in-job, implying it can get the job done about 5.7x faster, or a little over ten days at the most optimistic.
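The arithmetic above can be sketched directly (peak half-precision figures from the comment, not sustained throughput):

```python
# Back-of-envelope: throughput of this Fugaku partition vs. an 8x H100 box.
fugaku_total_pf = 2100      # half-precision PF for the full machine [1]
fugaku_nodes = 158_956      # total Fugaku nodes
job_nodes = 13_824          # nodes used by this training job
h100_box_pf = 32            # rough AI compute of 8x H100

# Pro-rate the machine's peak compute down to the job's node count.
job_pf = fugaku_total_pf / fugaku_nodes * job_nodes    # ≈ 182.6 PF

speedup = job_pf / h100_box_pf                         # ≈ 5.7x
days = 60 / speedup                                    # ≈ 10.5 days

print(f"job compute: {job_pf:.1f} PF")
print(f"speedup vs 8x H100: {speedup:.1f}x -> {days:.1f} days")
```

This assumes the job scales linearly with node count and ignores interconnect and efficiency differences between the two architectures.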
Electricity costs for these nodes for ten days look fairly similar to the rental costs of 8x H100s for 60 days, according to my research. Lambda Labs seems to have very cheap 8x H100 instances, but AWS and its ilk are much higher. However, the comparison is a little weird, as Fugaku is also a few years old now, and the contemporary GPU at the time of its release was the A100 (1/13th of an H100). The next Fujitsu chip may well narrow the power/performance gap between itself and (say) Blackwell or whatever is current at the time.
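For a rough sense of that cost comparison: every rate below is an assumption picked for illustration (ballpark industrial power price, ballpark on-demand GPU rates), not a quote from any provider.

```python
# Illustrative cost comparison; all prices are assumed, not quoted.
job_mw = 2.6            # job's share of Fugaku's ~30 MW
days_fugaku = 10.5      # estimated wall-clock time on the partition
power_price = 0.20      # $/kWh, assumed industrial rate

kwh = job_mw * 1000 * 24 * days_fugaku          # ≈ 655,200 kWh
electricity_cost = kwh * power_price            # ≈ $131k

days_h100 = 60
hours_h100 = days_h100 * 24
cheap_rate = 8 * 3.0    # $/hr, assumed ~$3/GPU-hr budget cloud
pricey_rate = 98.0      # $/hr, assumed hyperscaler 8x H100 instance

cheap_rental = cheap_rate * hours_h100          # ≈ $35k
pricey_rental = pricey_rate * hours_h100        # ≈ $141k

print(f"electricity: ${electricity_cost:,.0f}")
print(f"rental range: ${cheap_rental:,.0f} - ${pricey_rental:,.0f}")
```

Under these assumptions the electricity bill lands inside the rental range, nearer the expensive end, which matches the "fairly similar, but cheap clouds undercut it" observation.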
[1] https://www.fujitsu.com/global/about/innovation/fugaku/speci...