
Just skimmed the paper. It seems like this paper wants to optimize transformer inference end-to-end, i.e. from the ASIC level all the way up to the cloud.

I'm not exactly convinced, though, since all the results seem to be purely theoretical or simulated. I would've liked to see a prototype built across several FPGAs, with clock speeds extrapolated to ASICs.



I think FPGAs would make an awesome prototype, but maybe they're too constraining in terms of resources? The extrapolation might be so far out that it's no more accurate than their simulated model...
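
To make that concern concrete, here's a toy back-of-envelope sketch of what an FPGA-to-ASIC clock extrapolation looks like. All the numbers below are illustrative assumptions of mine, not from the paper:

    # Toy back-of-envelope: extrapolating ASIC clock from an FPGA prototype.
    # Every number here is an illustrative assumption, not from the paper.

    fpga_clock_mhz = 250          # assumed measurement on the FPGA prototype
    fpga_to_asic_speedup = 5.0    # assumed rule-of-thumb midpoint for the scaling
    speedup_uncertainty = 2.0     # assumed spread around that midpoint

    low = fpga_clock_mhz * fpga_to_asic_speedup / speedup_uncertainty
    mid = fpga_clock_mhz * fpga_to_asic_speedup
    high = fpga_clock_mhz * fpga_to_asic_speedup * speedup_uncertainty

    print(f"Extrapolated ASIC clock: {low:.0f}-{high:.0f} MHz (midpoint {mid:.0f})")
    # Extrapolated ASIC clock: 625-2500 MHz (midpoint 1250)

A 4x spread end to end with even modest uncertainty in the scaling factor, which is the point: the error bars on such an extrapolation can easily be as wide as those on the simulated model it's meant to validate.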


It seems fine to say "others have proved that this math makes a good LLM; we have designed an ASIC that can do this math fast; therefore we can make a good, fast LLM."


Yes, but saying that shouldn't be mistaken for "we can make an ASIC that runs some model fast". There's a wide implementation gap between the two.


Yep, it's a research paper in comp arch: the initial proof-of-concept study before you go and spend real money on it.



