

CUDA Performance: Maximizing Instruction-Level Parallelism - hhuuggoo
http://continuum.io/blog/cudapy_ilp_opt

======
iskander
Vasily's approach to CUDA really revolutionized how I think about GPU
programming, and I'm glad the Continuum folks are giving ILP on the GPU a
broader audience. Can anyone testify to the quality of Continuum's CUDA
wrapper? Is it nicer to work with than PyCUDA?
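
For anyone who hasn't read Vasily's slides: the core trick is to give each
thread several independent operations in flight, instead of relying on
occupancy alone, so the hardware can overlap their latencies. A rough sketch
of the pattern in Python, using the open-source numba CUDA API as a stand-in
for Continuum's wrapper (the ILP constant, kernel name, and launch parameters
are mine, purely for illustration):

    from numba import cuda
    import numpy as np

    ILP = 4  # independent results per thread

    @cuda.jit
    def saxpy_ilp(a, x, y, out):
        n = x.shape[0]
        start = cuda.grid(1)
        stride = cuda.gridsize(1)
        # The ILP iterations below are independent of one another,
        # so their memory latencies can overlap within a single thread.
        for j in range(ILP):
            i = start + j * stride
            if i < n:
                out[i] = a * x[i] + y[i]

    n = 1 << 20
    x = np.random.rand(n).astype(np.float32)
    y = np.random.rand(n).astype(np.float32)
    out = np.zeros_like(x)
    threads = 128
    blocks = (n + threads * ILP - 1) // (threads * ILP)
    saxpy_ilp[blocks, threads](np.float32(2.0), x, y, out)

Hand-unrolling the loop into separate registers is closer to what the slides
actually benchmark, but the point is the same: fewer threads, each carrying
several independent updates.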

~~~
alcari
I haven't dealt much with PyCUDA recently, but Continuum's wrapper is
interesting in that it compiles Python code (or at least a subset thereof) to
run natively on the GPU, via LLVM if I'm not mistaken. As far as I'm aware,
PyCUDA only lets Python code call kernels written in CUDA C.
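
To make the contrast concrete: with PyCUDA the kernel itself is still CUDA C
in a string, which PyCUDA compiles and wraps for you at runtime. A minimal
sketch (the kernel name double_them is made up):

    import numpy as np
    import pycuda.autoinit  # creates a CUDA context as a side effect
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    # The kernel is CUDA C source; PyCUDA compiles it at runtime and
    # hands back a Python-callable handle.
    mod = SourceModule("""
    __global__ void double_them(float *a, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) a[i] *= 2.0f;
    }
    """)
    double_them = mod.get_function("double_them")

    a = np.random.rand(256).astype(np.float32)
    double_them(drv.InOut(a), np.int32(a.size),
                block=(256, 1, 1), grid=(1, 1))

With Continuum's wrapper (and numba's CUDA target generally), the body of a
decorated Python function is itself lowered to PTX via LLVM, so there is no
CUDA C string at all.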

~~~
lmeyerov
A labmate did some great work similar to Continuum's wrapper and has
continued it at NVIDIA:
[http://copperhead.github.io/](http://copperhead.github.io/) . He basically
identified an ML-like subset of Python (sort of like asm.js vs. JS) and
specializes it.
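
The canonical example from the Copperhead paper looks roughly like this (from
memory, so check the repo for the exact API): you decorate a function with
@cu, stick to the data-parallel subset, and the runtime specializes and
compiles it for whichever place you select:

    from copperhead import *
    import numpy as np

    @cu
    def axpy(a, x, y):
        # Only the side-effect-free, data-parallel subset of Python
        # is allowed inside @cu functions; the runtime type-specializes
        # the function and compiles it for the selected place.
        return map(lambda xi, yi: a * xi + yi, x, y)

    x = np.arange(100, dtype=np.float64)
    y = np.arange(100, dtype=np.float64)
    with places.gpu0:
        z = axpy(2.0, x, y)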

For me, the big surprise is that Copperhead departs from NESL-like flattening
transformations (e.g., those used by Data Parallel Haskell). It's a bit less
surprising when you realize the creator is a GPU expert :)

Edit: Vasily, the guy behind the paper advertised in Continuum's blog post, is
also from our lab ;-)

~~~
iskander
Is Bryan still working on Copperhead?

Also, do you know if the DPH folks ever managed to iron out a version of
higher-order flattening that gives a predictable performance gain?

~~~
lmeyerov
I think Bryan has been doing a follow-up to Copperhead; it's probably easiest
to just ask him :)

I don't know what you mean by predictable performance. Flattening is a direct
transformation and seems simple to reason about on SIMD architectures, though
the recent dynamic-scheduling (work-stealing) approach for
multicore/distributed has the usual caveats. (I tend to avoid it for HPC.)
Given the 10+ year history of the researchers involved, it seems like a
slow-but-steady project.
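
To unpack "direct": flattening mechanically turns a nested map over ragged
data into one flat data vector plus a segment descriptor, so the cost model
is essentially just total element count. A NumPy sketch of the idea (my own
illustration, not DPH's actual machinery):

    import numpy as np

    # Nested source program: [[x * x for x in row] for row in rows]
    rows = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]

    # Flattened representation: one data vector + segment lengths.
    data = np.array([x for row in rows for x in row])  # [1 2 3 4 5 6]
    segs = np.array([len(row) for row in rows])        # [2 1 3]

    # The nested map becomes a single flat map over `data`; the
    # segment descriptor is untouched.
    squared = data * data

    # Segmented operations (e.g., a per-row sum) consume `segs`:
    offsets = np.concatenate(([0], np.cumsum(segs)))
    row_sums = np.add.reduceat(squared, offsets[:-1])  # [5. 9. 77.]

The work and memory traffic fall straight out of the flat sizes, which is why
it's comparatively easy to reason about on SIMD hardware.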

------
MisterNegative
Why is this on the front page? This is just a copied, dumbed-down version of
the original presentation. Also, the practical use of this idea is extremely
limited.

Also the poster seems to have an agenda; this is just marketing.

~~~
pwang
> Also the poster seems to have an agenda;

True. Most people who take the time to write content that they publish on the
internet have an agenda.

> this is just marketing.

False. This is an informative and useful distillation of a 75-page, deeply
technical presentation into a few screens of text, and it shows actual Python
code (and benchmarks) to demonstrate the principles.

> Also, the practical use of this idea is extremely limited.

Care to elaborate? A substantive discussion about the subject of the original
post would actually be constructive and add value for the HN community.

~~~
MisterNegative
I'm sorry, and you are right. What I really meant was that I dislike that
they don't give enough credit, making themselves look smart off Vasily's
work. That kind of marketing feels immoral to me.

> Care to elaborate?

This really isn't the right place for that, so I didn't bother. The right
place would be a thread/forum discussing the original presentation. Also,
there wouldn't be much to elaborate on, since my opinion is based on general
insight; it is not a provable fact.

