
"this ability to change from functioning as a bit of memory to being a bunch of functional logic on the fly at the speed of a memory read?"

My excitement is tempered by considering the areal demands on the silicon for the putative "smart memory". Suppose, just for the sake of argument, you want your smart memory to be able to take a 4K block of 64-bit integers and add them together. It happens incredibly quickly, sure, though you'd need an expert to say what the adder fan-in can be before guessing how many cycles it takes. But you're now looking at massively more silicon to arrange all that. And adding blocks of numbers isn't really that interesting; what we really want to speed up is matrix math. Assuming hard-coded matrix sizes, that's a whackload of silicon per operation you want to support; it's at least another logarithmic whackload factor to support a selection of matrix sizes, and yet more again to support arbitrary sizes. In general, it's a whackload of silicon for anything you'd want to do, and the more flexible you make your "active memory", the more whackloads of silicon you're putting in there.
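
To put rough numbers on the cycle question: the obvious structure is a pairwise adder tree, and its depth (not the fan-in of any one gate) is what sets the cycle count. A minimal sketch, assuming a 4K block and 64-bit wraparound adds (both assumptions for illustration):

    # Pairwise adder tree over 4096 values:
    # depth = ceil(log2(4096)) = 12 levels, built from 4095 adder instances.
    import random

    MASK = (1 << 64) - 1
    values = [random.getrandbits(64) for _ in range(4096)]

    def tree_sum(xs):
        levels = adders = 0
        while len(xs) > 1:
            nxt = [(xs[i] + xs[i + 1]) & MASK for i in range(0, len(xs) - 1, 2)]
            if len(xs) % 2:              # odd element passes through untouched
                nxt.append(xs[-1])
            adders += len(xs) // 2
            levels += 1                  # one tree level ~ one cycle, if a level fits in a cycle
            xs = nxt
        return xs[0], levels, adders

    total, levels, adders = tree_sum(values)
    print(levels, adders)                # -> 12 levels, 4095 adders
    assert total == sum(values) & MASK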

It may sound weird to describe a single set of addition gates as a "whackload", but remember you need to lay at least one down for each thing you want to support. If you want to be able to do these operations from arbitrary locations in RAM, every single RAM cell is going to need its own gates implementing every single "smart" feature you want to offer. Even just the control silicon for the neural nets is going to add up. (It may not be doing digital work, and it may be easier to mentally overlook, but it's certainly going to be some sort of silicon-per-cell.)
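
A back-of-envelope way to see why the per-cell overhead bites; every constant here is a made-up illustrative assumption, not process data:

    # If every 64-bit word carries its own logic for every "smart" op,
    # the overhead grows as words * ops * transistors_per_op.
    WORDS = (4 * 2**30) // 8        # a 4 GiB array of 64-bit words
    OPS = 8                         # hypothetical number of smart ops offered
    TRANSISTORS_PER_OP = 1000       # rough 64-bit ALU-ish cost per op (assumed)
    TRANSISTORS_PER_BIT = 6         # classic 6T SRAM cell

    storage = WORDS * 64 * TRANSISTORS_PER_BIT
    smart = WORDS * OPS * TRANSISTORS_PER_OP
    print(f"storage: {storage:.1e}  smart logic: {smart:.1e}  "
          f"overhead: {smart / storage:.1f}x")   # ~21x more silicon than the RAM itself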

Even if you were somehow handed a tech stack that worked this way already, you'd find yourself pressured back toward the architecture we already have, because the first thing you'd want to do is take all that custom silicon and isolate it into a single computation unit on the chip that the rest of the chip feeds. That way you get a lot more space for RAM, and you can implement a lot more functionality without paying for it per memory cell. And with that, you come perilously close to what is already a GPU today. All the silicon dedicated to tasks you aren't doing right now is silicon you'll wish were memory instead, and anyone making the smart memory is going to have a hard time competing with people who offer an order of magnitude or three more RAM for the same price and tell you to just throw more CPUs/GPUs at the problem.

RAM that persists without power is much more interesting than smart memory.




Yeah, this seems to run into the exact same problems we already have with FPGAs.

1. They're really hard to program for in a way that is easy to understand and scale.

2. They eat a ton of power. Power means heat, and heat caps your max speed. If you can't do better than today's ASICs, the existing ASICs are still going to be used.

The combination of ASICs + a reconfigurable serial process (aka sequential programming) strikes a really sweet spot between power and flexibility that I think is going to be hard to unseat.

I think FPGAs are incredible, but if this were true I think we'd already see them taking over the world.


> 2.

So you don't see the potential in combining the two?


No, you don't need more gates in each memory cell to support each possible operation. The memory cell turns into gates. They are the same thing, just in a different configuration. And yes, it would be a ton of silicon, likely more than you're imagining. It would subsume all CPU silicon, all cache, all main memory, all mass storage, into one fabric which at any point can either store data or compute. And yes, it persists without power too.
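
For a concrete picture of "the memory cell turns into gates", this is essentially what an FPGA lookup table already does. A toy sketch of the idea (the class and method names are mine, just for illustration, not anything from the article):

    # A 4-input LUT is just 16 bits of memory. Filled with data it is
    # storage; indexed by logic inputs it computes any 4-input boolean
    # function. Same cells, different use.
    class LUT4:
        def __init__(self):
            self.bits = [0] * 16        # 16 one-bit cells

        # "memory mode": plain store / load
        def write(self, addr, bit):
            self.bits[addr] = bit & 1

        def read(self, addr):
            return self.bits[addr]

        # "logic mode": the inputs form the address, the stored bit is the output
        def eval(self, a, b, c, d):
            return self.bits[(a << 3) | (b << 2) | (c << 1) | d]

    lut = LUT4()
    lut.write(0b1111, 1)                # program the truth table of a 4-input AND
    print(lut.eval(1, 1, 1, 1))         # -> 1
    print(lut.eval(1, 0, 1, 1))         # -> 0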


You've jumped to the idea that somehow this research makes microarchitecture disappear. It won't.



