Hacker News | pldpb's comments

It's a dead end. SRAM doesn't scale on advanced nodes.

Similar to Tenstorrent, who chose GDDR instead of HBM: they thought production AI models wouldn't get bigger than GPT-3.5 due to cost.


I don't think they rely on SRAM very much for training. https://cerebras.ai/blog/the-complete-guide-to-scale-out-on-... outlines the memory architecture, but it seems they keep most of the storage off-wafer, which is how they scale to hundreds of GB of parameters with "only" tens of GB of SRAM.
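The idea described above (parameters stored off-wafer and streamed through a much smaller on-chip SRAM, one layer at a time) can be sketched roughly as follows. This is a minimal illustrative model, not Cerebras's actual API; all names and sizes here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "off-wafer" parameter store: the layers together are far
# larger than what the on-chip SRAM could hold at once.
layers = [rng.standard_normal((64, 64)) for _ in range(8)]

SRAM_CAPACITY = 64 * 64  # pretend SRAM holds exactly one layer's weights

def forward(x, layer_store):
    """Stream weights layer by layer: only one layer's parameters ever
    occupy the (simulated) on-chip buffer at a time."""
    for weights in layer_store:
        assert weights.size <= SRAM_CAPACITY  # must fit in the buffer
        sram = np.array(weights)              # "DMA" weights on-chip
        x = np.tanh(sram @ x)                 # compute with buffered weights
    return x

x = rng.standard_normal(64)
y = forward(x, layers)
print(y.shape)
```

The point of the sketch is that peak on-chip memory stays at one layer's worth of weights regardless of total model size, which is consistent with scaling parameter counts well past SRAM capacity.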


