You can read bits at that rate yes, but keep in mind that it’s 250 GiB /paramete... | Hacker News

toxik on June 23, 2022 | on: YaLM-100B: Pretrained language model with 100B par...

You can read bits at that rate, yes, but keep in mind that it’s 250 GiB of /parameters/, and matrix-matrix multiplication is typically somewhere between quadratic and cubic in complexity. Then you get to wait for your intermediate results to be paged out, and so on.

It’s difficult to estimate how slow it would be, but I’m guessing unusably slow.

lostmsu on June 23, 2022

The intermediate results will all fit into a relatively small amount of memory.

During inference you only need to keep layer outputs until the next layer's outputs are computed.
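To illustrate the point about layer outputs, here is a minimal sketch in plain Python, not how any particular framework does it. The names `make_layer`, `apply_layer`, `stream_inference`, and `demo_loader` are hypothetical, and the toy "layer" is just a dense matrix-vector product with ReLU. The loader stands in for reading one layer's weights from disk; only the current input and the output being computed are alive at any time.

```python
import random

def make_layer(rng, d_in, d_out):
    # Hypothetical stand-in for one layer's weights (d_out x d_in).
    return [[rng.uniform(-1.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]

def apply_layer(weights, x):
    # Dense matrix-vector product followed by ReLU.
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

def stream_inference(layer_loader, num_layers, x):
    # Track the largest number of activation values held at once.
    peak = len(x)
    for i in range(num_layers):
        weights = layer_loader(i)        # e.g. read layer i from disk
        y = apply_layer(weights, x)
        peak = max(peak, len(x) + len(y))
        x = y                            # the previous output is dropped here
    return x, peak

def demo_loader(i, width=8):
    # Assumption for the demo: every layer is width x width.
    return make_layer(random.Random(i), width, width)

if __name__ == "__main__":
    out, peak = stream_inference(demo_loader, 5, [1.0] * 8)
    print(len(out), peak)
```

With a fixed layer width, the peak activation count stays at twice the width no matter how many layers there are; the weights are what dominate memory.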

If we are talking about memory bandwidth, it is the space requirements that matter, not so much the time complexity.
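A rough back-of-the-envelope version of that argument, as a sketch: assume generation is bandwidth-bound and that every parameter is read once per generated token (a dense model, one token at a time). The 250 GiB figure is from the thread; the bandwidth values below are made-up placeholders, not measurements, and `seconds_per_token` is a hypothetical helper.

```python
def seconds_per_token(param_gib, bandwidth_gib_per_s):
    # Lower bound on time per token if all weights must be
    # streamed once per token at the given sustained bandwidth.
    if bandwidth_gib_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return param_gib / bandwidth_gib_per_s

if __name__ == "__main__":
    PARAM_GIB = 250  # from the thread
    # Placeholder bandwidths in GiB/s (assumptions, not benchmarks).
    for bw in (1, 5, 25):
        print(bw, "GiB/s ->", seconds_per_token(PARAM_GIB, bw), "s/token")
```

Under these assumptions the estimate depends only on bytes moved and bandwidth, which is why the size of the weights matters more here than the asymptotic cost of the multiplication.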
