> The software titan is rather late to the custom silicon party. While Amazon and Google have been building custom CPUs and AI accelerators for years, Microsoft only revealed its Maia AI accelerators in late 2023.
They're too late for now. Realistically, hardware takes a couple of generations to become a serious contender, and by the time Microsoft has a chance to learn from its hardware mistakes the "AI" bubble will have popped.
But, there will probably be some little LLM tools that do end up having practical value; maybe there will be a happy line-crossing point for MS and they’ll have cheap in-house compute when the models actually need to be able to turn a profit.
At this point it will take a lot of investment to catch up. Google relies heavily on specialized interconnects to build massive TPU clusters. It's more than just designing a chip these days, and engineers who work on interconnects are a lot rarer than engineers who can design chips.
> hardware takes a couple generations to become a serious contender
Not really, and for the same reason Chinese players like Biren are leapfrogging: much of the workload profile in AI/ML is "embarrassingly parallel", which reduces the need for individual ASICs to be bleeding-edge performers.
If you are able to negotiate competitive fabrication and energy supply deals, you can mass produce your way into providing "good enough" performance.
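The "good enough at scale" argument can be sketched with a toy throughput calculation. This is just back-of-envelope arithmetic with made-up numbers, assuming an embarrassingly parallel (data-parallel) workload where scaling efficiency stays high:

```python
# Toy comparison: aggregate throughput of many "good enough" chips
# vs. fewer bleeding-edge ones. All numbers are illustrative, not real specs.

def cluster_throughput(chips: int, tflops_per_chip: float, scaling_efficiency: float) -> float:
    """Effective cluster TFLOPs for an embarrassingly parallel workload."""
    return chips * tflops_per_chip * scaling_efficiency

# Fewer, faster chips vs. more, slower chips at the same scaling efficiency.
bleeding_edge = cluster_throughput(chips=1000, tflops_per_chip=1000, scaling_efficiency=0.9)
good_enough   = cluster_throughput(chips=2500, tflops_per_chip=400,  scaling_efficiency=0.9)

# If fabrication and energy deals make the extra chips cheap enough,
# mass production closes the per-chip performance gap.
assert good_enough >= bleeding_edge
```

The catch, of course, is that "scaling efficiency stays high" is exactly where interconnects come in.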
Finally, the kind of customer who cares about raw hardware performance in training isn't in the market for cloud-offered services.
As I understood it, the main bottleneck is interconnects anyhow. It's more difficult to keep the ALUs fed than it is to make them fast enough, especially once your model can't fit on one die/PCB. And that's in principle a much trickier part of the design, so I don't really know how that shakes out (is there a good-enough design you can just buy as a block?)
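The "keeping the ALUs fed" point is essentially the roofline model: attainable throughput is capped by whichever is lower, the compute roof or the bandwidth roof. A minimal sketch, with illustrative (not real) hardware numbers:

```python
# Back-of-envelope roofline check: can memory/interconnect keep the ALUs fed?
# Numbers below are assumptions for illustration only.

def attainable_tflops(peak_tflops: float, bandwidth_tbs: float, flops_per_byte: float) -> float:
    """Roofline model: min of the compute roof and the bandwidth roof."""
    return min(peak_tflops, bandwidth_tbs * flops_per_byte)

# A large matmul reuses data heavily (high arithmetic intensity): compute-bound,
# so faster ALUs actually help.
print(attainable_tflops(peak_tflops=500, bandwidth_tbs=3.0, flops_per_byte=300))  # 500.0

# Streaming weights with little reuse (e.g. token-by-token decoding): bandwidth-bound,
# so faster ALUs sit idle -- the "feeding" problem described above.
print(attainable_tflops(peak_tflops=500, bandwidth_tbs=3.0, flops_per_byte=2))    # 6.0
```

Once the model spans multiple dies, the interconnect bandwidth replaces memory bandwidth as the binding roof, which is why that part of the design matters so much.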
And current LLM architectures map onto hardware differently than the DNNs of even a decade ago. If you have the money and technical expertise (both of which I assume MS has access to), then a late start might actually be beneficial.
Most of the big players started working on hardware for this stuff in 2018/2019. I worked in MSFT's silicon org during that time; Meta was also hiring my coworkers for similar projects. I left a few years ago and don't know the current state, but they already have some generations under their belt.