
Nyuzi: An experimental FPGA multicore GPGPU processor - jdmoreira
https://github.com/jbush001/NyuziProcessor
======
zymhan
Just thinking about writing VHDL again gives me the heebie-jeebies. But this
is very cool, and I love the possibilities enabled by FPGAs and OSS. However,
we're still a long way from having an entire open source FPGA development
stack.

~~~
brian-armstrong
I've been really wanting to try Chisel
[https://chisel.eecs.berkeley.edu/](https://chisel.eecs.berkeley.edu/)

They have a RISC-V implementation in it, so it can't be too bad.

~~~
sebastic
Chisel doesn't help. It just converts to Verilog. The Verilog is then
converted by a closed-source program to a closed file format, which is then
uploaded by a closed-source program to closed FPGA hardware.

------
Cieplak
Looks like he got it running on an Altera DE2-115 board, which has these specs:

    
    
        114,480 logic elements (LEs)
        3,888 Kbits embedded memory
        266 embedded 18 x 18 multipliers
        4 general-purpose PLLs
        528 user I/Os
    

[1]: http://www.terasic.com.tw/cgi-bin/page/archive.pl?Language=English&No=502

------
zackmorris
Does anyone know if there's an opposite analog of this? I would very much like
to run a parallel language like VHDL or Verilog on the GPU since:

1) OpenCL/CUDA have an OpenGL-inspired syntax with a steep learning curve and
limited generalizability

2) FPGAs don't seem to be gaining the economies of scale of GPUs

I simply want to be able to emulate thousands of CPUs (millions of gates) for
physics, AI, big data, etc., in a way that's accessible, affordable, and won't
catch fire. I'm thinking MATLAB or Octave, but with near-ideal speedup for
embarrassingly parallel problems.

~~~
jasonwatkinspdx
Do you already know VHDL or Verilog? Most people would not consider them
simpler or more productive than OpenCL IMO.

Julia fits your last sentence.

------
tkinom
Is this GPGPU only, or does it also support GLES or OpenGL?

~~~
jdmoreira
I guess you could implement GLES on top of it. They implemented a renderer for
Quake maps:
[https://github.com/jbush001/NyuziProcessor/tree/master/softw...](https://github.com/jbush001/NyuziProcessor/tree/master/software/apps/quakeview)

~~~
vvanders
Looks like it runs at about 1 FPS on a 50 MHz core (with screenshots):
[http://latchup.blogspot.com/2015/06/not-so-fast.html](http://latchup.blogspot.com/2015/06/not-so-fast.html)

Still crazy awesome, that's a ton of work.
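
For a sense of scale, here is a rough back-of-the-envelope sketch of what "about 1 FPS at 50 MHz" means as a cycle budget. The clock and frame rate come from the comment above; the 640x480 resolution and the function names are assumptions made purely for illustration, not figures from the linked post.

```python
def cycles_per_frame(clock_hz, fps):
    """Clock cycles available to render one frame."""
    return clock_hz / fps


def cycles_per_pixel(clock_hz, fps, width, height):
    """Clock cycles available per output pixel at a given resolution."""
    return cycles_per_frame(clock_hz, fps) / (width * height)


CLOCK_HZ = 50_000_000      # 50 MHz, as quoted in the comment
FPS = 1.0                  # "about 1 FPS", as quoted in the comment
WIDTH, HEIGHT = 640, 480   # hypothetical resolution, not from the source

if __name__ == "__main__":
    print("cycles per frame:", cycles_per_frame(CLOCK_HZ, FPS))
    print("cycles per pixel:", cycles_per_pixel(CLOCK_HZ, FPS, WIDTH, HEIGHT))
```

This counts clock cycles only. It says nothing about how many cores or vector lanes are busy during those cycles.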

------
wyldfire
Man, it'd be super sweet if we could get an OpenCL frontend for this target.

~~~
jeffbush
Technically it already supports OpenCL, as it has an LLVM backend and Clang
port. However, it will generate scalar code that doesn't take advantage of the
vector unit. To support it properly, it would need extra passes for SPMD
vectorization.
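
To illustrate what an SPMD vectorization pass would have to do, here is a minimal sketch in plain Python rather than real compiler output. The kernel, the function names, and the lane count `LANES = 16` are all assumptions chosen for the example. The idea is that the per-work-item kernel is rewritten to process one group of work items per step, with the branch turned into a mask and a per-lane select.

```python
LANES = 16  # assumed vector width, for illustration only


def kernel_scalar(i, a, b, out):
    """One work item, the way a scalar-only backend would execute it."""
    if a[i] > 0:
        out[i] = a[i] * b[i]
    else:
        out[i] = -b[i]


def run_scalar(a, b):
    """Launch the kernel once per work item."""
    out = [0] * len(a)
    for i in range(len(a)):
        kernel_scalar(i, a, b, out)
    return out


def run_spmd(a, b):
    """Same kernel after a hypothetical SPMD vectorization pass.

    Each loop step handles up to LANES work items. Both sides of the
    branch are computed for every lane, then a mask picks the result.
    """
    n = len(a)
    out = [0] * n
    for base in range(0, n, LANES):
        lanes = range(base, min(base + LANES, n))
        va = [a[i] for i in lanes]                # vector load
        vb = [b[i] for i in lanes]                # vector load
        mask = [x > 0 for x in va]                # vector compare
        then_v = [x * y for x, y in zip(va, vb)]  # "if" side, all lanes
        else_v = [-y for y in vb]                 # "else" side, all lanes
        result = [t if m else e for m, t, e in zip(mask, then_v, else_v)]
        out[base:base + len(result)] = result     # vector store
    return out
```

Both versions produce the same output. The difference is that each list comprehension in `run_spmd` stands in for what would be a single vector instruction on hardware with a vector unit.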

~~~
wyldfire
I think there's a lot of follow-through beyond just LLVM and Clang support in
order to make a full OpenCL platform: device enumeration, etc. Plus, I don't
think Clang distributes a complete front end (headers, type definitions, etc.).
There are some open source projects that could supplement this, though.

