Interesting. That makes me wonder what architectures are (or used to be) out there that do support rapidly reconfiguring the cores. Obviously, if deploying the code is so time-intensive that it erases the parallelism gains for dynamic use cases, there's no point. It's a little reminiscent of the situation with GPGPU, where the overhead of marshalling all the data can eat away at your computational speedup.
The program I'm working on (a kind of JIT compiler) is inching ever more towards a strict dataflow model in which the computations are divided into blocks and each block knows exactly which downstream blocks to send its outputs to. The ultimate output of the computation is simply the last downstream block that collects whatever outputs were originally requested and pushes them back to the client. Seems like a sweet spot for an array-based architecture. However, it would need to rapidly reconfigure the cores. Each time a block finishes its work, its core would go back into the available pool and (hopefully soon afterward) be reprogrammed with the code for some other block as the compiler produces it. The scheduler would also try to position blocks near their downstream neighbors. All of this would have to be dynamic. So, a poor fit for the GA144 given what you're saying.
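To make that concrete, here's a minimal sketch of the block model (illustrative only; the names `Block`, `receive`, `fire` are hypothetical, not the compiler's real API):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Block:
    """One unit of compiled work; knows exactly which downstream blocks get its output."""
    op: Callable[..., object]
    arity: int = 1
    downstream: list["Block"] = field(default_factory=list)
    inbox: list[object] = field(default_factory=list)

    def receive(self, value: object) -> None:
        self.inbox.append(value)

    def ready(self) -> bool:
        return len(self.inbox) >= self.arity

    def fire(self) -> None:
        # Consume inputs, compute, and push the result downstream.
        args, self.inbox = self.inbox[:self.arity], self.inbox[self.arity:]
        result = self.op(*args)
        for nxt in self.downstream:
            nxt.receive(result)

# The final downstream block just collects whatever the client asked for.
results: list[object] = []
collect = Block(op=lambda v: (results.append(v), v)[1])
add = Block(op=lambda a, b: a + b, arity=2, downstream=[collect])

add.receive(2)
add.receive(3)
for blk in (add, collect):
    if blk.ready():
        blk.fire()
print(results)  # [5]
```

On the hypothetical array machine, each `Block` would be roughly a core's worth of code, and `receive` would become a write to a neighboring core's port.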
If the above makes sense, I'd be curious to hear what alternatives come to mind.
Running this code on such an architecture is not currently a priority, but it does excite my curiosity—it seems so obviously doable in principle.
Well, it's hard to do, partly because you need code to load code, and each core has only 64 words of RAM. But suppose you've already gotten the code to the core and you're only changing 32 words, roughly five to ten functions: the reload itself would take about 32 × ~10 ns = 320 ns. Not too shabby, but you have to get it there, and that means routing through other cores. It's hard, but in theory there's no fundamental reason it couldn't work. Is the GA144 a poor fit? Unless you crack the problem of getting the code there, yes, it's a poor fit.
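To put a rough number on the "getting it there" part, here's a crude cost model. The ~10 ns per word is the figure above; the per-hop forwarding cost and the pipelined-relay assumption are made up for illustration, not measured GA144 behavior, and I'm ignoring any overlap between transport and store:

```python
# Crude reload-latency model. WRITE_NS is the ~10 ns/word figure above;
# HOP_NS (per-word cost to forward a word through one neighboring core)
# is a placeholder assumption, not a measured GA144 number.
WRITE_NS = 10
HOP_NS = 10

def reload_ns(code_words: int, hops: int) -> int:
    # Assume the relay pipelines: the first word needs `hops` port
    # transfers to arrive, and each later word follows one transfer behind.
    transport = (hops + code_words - 1) * HOP_NS
    store = code_words * WRITE_NS  # writing the words into the target's RAM
    return transport + store

print(reload_ns(32, hops=1))   # adjacent core: ~640 ns under these assumptions
print(reload_ns(32, hops=10))  # ten hops across the mesh: ~730 ns
```

If the relay really does pipeline like that, distance adds surprisingly little; the dominant cost is per-word handling. The hard part remains having code on every intermediate core that knows to forward rather than execute.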
I'd forgotten that there's so little RAM to work with. It might be doable; the blocks I'm talking about are mostly very primitive operations.
Do you think the benefits are valuable enough to be worth all the trouble of figuring out how to program this architecture, or is it more just a fun puzzle?