
I actually think chip-level HW-SW co-design is a good idea. It opens up more opportunities to mitigate communication issues than optimizing the mapping for a fixed chip and system design does. For example, the number of GPUs per server limits the maximum tensor model parallelism size, because you don't want to do tensor parallelism across servers given the low bandwidth between them. Here the number of chips per server depends on chip size, cooling, etc. So you probably want to do the co-design, since you have the chance. It's difficult though.
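To make the tensor-parallelism constraint concrete, here is a rough sketch of how the chips-per-server count caps the tensor-parallel degree. The function names are made up for illustration, and the rule that the degree must evenly divide the attention-head count is an assumption borrowed from common Megatron-style setups, not something stated above.

```python
def max_tensor_parallel_size(gpus_per_server: int, num_attention_heads: int) -> int:
    """Largest tensor-parallel degree that fits inside one server.

    Assumptions (hypothetical, for illustration only):
    - tensor parallelism never crosses a server boundary;
    - the degree must evenly divide both the GPUs in a server
      and the number of attention heads.
    """
    for tp in range(gpus_per_server, 0, -1):
        if gpus_per_server % tp == 0 and num_attention_heads % tp == 0:
            return tp
    return 1


def parallel_layout(total_gpus: int, gpus_per_server: int, num_attention_heads: int) -> dict:
    """Split the GPUs into an intra-server tensor-parallel degree and
    whatever is left over for pipeline/data parallelism across servers."""
    tp = max_tensor_parallel_size(gpus_per_server, num_attention_heads)
    return {"tensor_parallel": tp, "other_parallel": total_gpus // tp}


if __name__ == "__main__":
    # Same 64 GPUs, different chips per server: the cap on tensor parallelism moves.
    for gpus_per_server in (4, 8, 16):
        print(gpus_per_server, parallel_layout(64, gpus_per_server, 96))
```

With the chip and server fixed, `gpus_per_server` is a constant you have to work around; with co-design it becomes one of the knobs.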


Having hardware and software talk to each other before tape out is a really good idea. The early Graphcore work was done on a whiteboard with people from both sides writing on it.

There are still a lot of compromises and tradeoffs to be made:

> We observe that the inter-chiplet communication issues can be effectively mitigated through proper software-hardware co-design

Doubtful, especially given it's all vapourware. Co-design is not magic enough to handwave this one away.



