

Write hybrid CPU/GPU programs in Haskell - dons
http://parfunk.blogspot.com/2012/05/how-to-write-hybrid-cpugpu-programs.html?m=1

======
tmurray
(insert standard disclaimer about being responsible for CUDA here)

I'm definitely happy to see more languages with GPU support, but schedulers to
distribute work between CPUs and GPUs are a particular interest of mine. The
most full-featured I've seen is StarPU:

<http://runtime.bordeaux.inria.fr/StarPU/>

But there's still a lot of work to be done; it would be very interesting to
remove the need for the developer to estimate time spent on CPU (or one type
of processor) versus time spent on GPU and see the effects on developer
productivity, for example.

~~~
rrnewton
Thanks for the StarPU reference! This definitely looks worth checking out.

------
meric
My final year thesis was on implementing some algorithms with Accelerate, and
one of the things I noted was that on a 2009 MacBook Pro (integrated Nvidia
GPU with 256 MB of memory), a single-threaded C program runs faster than
Accelerate, even when all it does is multiply each element of an array by
two. The performance discrepancy is even greater for more complicated
problems. So, before you jump in expecting better performance on
embarrassingly parallel problems, make sure your Nvidia GPU is not integrated
and has lots of memory.

Of course this new package is different because it uses both CPU/GPU...

I also found Accelerate programs hard to debug. You cannot use "trace" to
print out stuff during computation because that is a CPU instruction.
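For context, the benchmark described above really is a one-liner in Accelerate. This is a hedged sketch against the 2012-era `accelerate` / `accelerate-cuda` API (module names and the backend's `run` function may differ in other versions):

```haskell
{-# LANGUAGE TypeOperators #-}

import Data.Array.Accelerate      as A
import Data.Array.Accelerate.CUDA (run)  -- the CUDA backend's `run`

-- Multiply each element of a vector by two on the GPU.
doubleAll :: Vector Float -> Vector Float
doubleAll xs = run $ A.map (*2) (use xs)

main :: IO ()
main = print (doubleAll (fromList (Z :. 5) [1..]))
```

Even this trivial kernel pays for transferring the array to and from GPU memory, which is one reason an integrated GPU with little memory can lose to plain single-threaded C. It also shows the debugging problem: the body of `A.map` is compiled to CUDA, so a CPU-side call like `Debug.Trace.trace` has nowhere to run inside the computation.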

~~~
Symmetry
It seems like NVidia has much less of an advantage in GPGPU in the most recent
generation of graphics, with AMD backing off from VLIW with their new graphics
cores, and NVidia shifting their balance of scheduling/compute to something
that favors graphics more than GPGPU.

------
kaosjester
My wife worked on this a bit with Adam (<https://twitter.com/#!/acfoltzer>)
and Ryan. There is a pending submission to ICFP.

The reason they went with CUDA was to plug into Accelerate's existing
framework without reinventing the wheel. As meric mentioned, Accelerate is a
pain to do anything with, and you can bet dollars to donuts that this package
will generate the hard parts for you.

IIRC, ParFunk also has some nice framework in place for distributed
computation (though I'm not certain it's completely in working order yet).

------
wtracy
Holy cow, I didn't even know that we had Haskell -> CUDA compilation working.
Very awesome stuff!

------
tikhonj
That's really cool.

I also like how the blog post is available as a literate Haskell file. I think
that's a great way to make an introduction more useful, and I wish more
language communities would take an approach like that in their articles.

------
hypervisor
Where is the OpenCL version?

~~~
Periodic
This is a generalized frontend for modeling parallel computation. OpenCL is a
future backend target.

~~~
rrnewton
There's a not-quite-complete OpenCL backend here:

* <https://github.com/HIPERFIT/accelerate-opencl>

Either OpenCL, or a freely available CUDA-on-x86 implementation, would really
make hybrid CPU/GPU programming more useful in this case. We might already
have what we need for CUDA x86; trying PGI Accelerator and GPU Ocelot for
this purpose is on our TODO list:

* <http://code.google.com/p/gpuocelot/>

* <http://www.pgroup.com/resources/cuda-x86.htm>

