
That's called a CPU cache. It doesn't require "mixture of experts" (whatever that would mean); it just needs transistors for SRAM.

That's one example of the more general category of what I am talking about. But I was trying to get just a little more specific.

Can you give another example and explain how "mixture of experts" gets data closer to a CPU?

I'm talking about GPUs and don't know the details very well. It was a rough idea.

> the more general category of what I am talking about. But I was trying to get just a little more specific

Can you get more specific, then? Can you give any details or an overview? There must have been some information that led you to post this originally. Can you link it?

