
Yeah, I got the runtime down from 40min to 10min. Then I implemented a Python->CUDA compiler and got it down to 30sec. :)



I implemented a Python->CUDA compiler

Wow, interesting. Is it released somewhere? Googling turned up the Copperhead project[1]; is that what you used?

I'm not sure whether "implemented" means you wrote your code to target an existing compiler, or you implemented the entire compiler yourself. :)

[1] http://code.google.com/p/copperhead/


No, it's a custom implementation of a simple compiler; nothing complicated. It converts Python to C++ and compiles that with nvcc, and it supports numpy arrays. It doesn't do any complex optimization passes like a full compiler. It's more like Cython, actually, with type annotations supplied via an @gpu decorator. This allowed me to take my Python image processing code almost verbatim and just annotate it with @gpu. The code isn't released yet.
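
To give a rough idea, here's a simplified sketch of what an @gpu-annotated function can look like. The exact decorator syntax here is illustrative only, and the stub decorator just runs plain Python:

    import numpy as np

    # Stand-in for the real decorator: the actual compiler would use the
    # type annotations to emit C++ and compile it with nvcc; this stub
    # ignores them and runs the pure-Python version unchanged.
    def gpu(*arg_types, returns=None):
        def wrap(func):
            return func
        return wrap

    @gpu("float32[:, :]", "float32", returns="float32[:, :]")
    def brighten(img, offset):
        # Plain numpy-style image processing code; the compiler would
        # map this element-wise loop onto GPU threads.
        out = np.empty_like(img)
        for y in range(img.shape[0]):
            for x in range(img.shape[1]):
                out[y, x] = img[y, x] + offset
        return out

    image = np.random.rand(512, 512).astype(np.float32)
    result = brighten(image, 0.1)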

I originally wanted to use Copperhead and got in contact with the developer a year ago, but it was too early even for "private" beta testing, so I never got access to their code. Also, my compiler is specialized for image processing, so Copperhead probably wouldn't have worked anyway. I'm only jealous of Copperhead's type inferencer. :) But then again, I have to finish my thesis, and a type inferencer wouldn't help with that goal. ;)


Interesting, thanks for explaining. :)



