Modern GPU: a collection of well-written GPGPU programming tutorials (moderngpu.com)
175 points by profquail on Sept 16, 2011 | 20 comments

If you want an "easy" way to get into GPU development, AMD recently launched an interesting (open source) project that transparently compiles JVM bytecode into OpenCL: http://blogs.amd.com/developer/2011/09/14/i-dont-always-writ...

Definitely the easiest way I've seen to begin mucking around with GPU development.

Yep, I've talked to the AMD engineers who wrote it -- bunch of very smart guys.

My startup (TidePowerd : http://www.tidepowerd.com) has a product called GPU .NET which JIT-compiles CIL (.NET bytecode) directly into GPU machine code; essentially, we've extended the .NET VM onto the GPU to make GPGPU coding as seamless as we possibly could.

We'll be releasing a new version next week with a much-improved API; if you're experimenting with GPGPU coding, please give it a try -- feedback is super-helpful in shaping the API into something that's both really powerful and really easy to pick up and start coding with.

Oh, and GPU .NET is written in F# -- which we don't support just yet for writing your GPU code, but we're hard at work to add that (likely around the end of November)!

How do you deal with documenting the way the code gets mapped on to the GPU architecture? I've found that small tweaks in how things are laid out can cause huge performance gains (or losses) and if you've managed to automate that it would be really great.

Well, we've managed to automate some of it -- things like the memory allocation/transferring we've got down fairly well and the new API we'll be releasing soon will take care of any edge cases.

For some things, like how your structs are organized/laid out, we haven't automatically optimized that yet -- but one advantage of using .NET (vs. native code like CUDA or OpenCL) is that the CLR spec allows a lot of freedom in implementation; so in the future, we could pretty easily implement some code to analyze your data layout / access patterns and reorganize things under the hood for better performance. All without you needing to rewrite your code, of course ;)

As time goes by, though, and we solidify the rest of our codebase, we'll be able to spend more time adding optimizations to the JIT compilers to get your code running as fast as the hardware allows, as often as possible.
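
To make "reorganize your data layout" concrete, here is a rough Python sketch of one common transformation: turning an array of structs into a struct of arrays, so that each field is stored contiguously. This is only an illustration of the general idea, not what GPU .NET does; the particle data and the `to_soa` helper are made up for the example.

```python
from array import array

# Hypothetical data: each tuple is one (x, y, z) record, i.e. an array of structs.
particles_aos = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)]

def to_soa(records):
    """Regroup (x, y, z) records into three contiguous float arrays."""
    # zip(*records) transposes rows into columns; assumes records is non-empty.
    xs, ys, zs = (array("f", column) for column in zip(*records))
    return xs, ys, zs

xs, ys, zs = to_soa(particles_aos)

# A pass over a single field now reads one contiguous buffer
# instead of striding across whole records.
x_total = sum(xs)
print(list(xs), x_total)
```

On a GPU, neighbouring threads reading neighbouring elements of `xs` is the access pattern that tends to coalesce well, which is why this kind of regrouping is a popular manual optimization.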

Cool, really! Is there a way to 'hint' to the JIT compiler what layout you'd like to be used?

Kind of -- there are three different layouts you can use for structs in .NET: "auto", "sequential", and "explicit".

"auto" lets the JIT compiler organize the fields in any order, with any padding bytes, etc. it wants to. This is the default, and changing it is basically a tradeoff between speed now (where the JIT compiler may not recognize where it can optimize something) and speed later (when we add a new optimization and your code automatically executes faster).

"sequential" requires the JIT compiler to layout the fields in the order they're defined, but it can add any padding bytes, etc. it wants to.

"explicit" forces you to specify the offset of each field, and forbids the compiler from re-ordering the fields or padding them in any way. It's rare to use this unless it's to handle interop'ing with a C library which uses some weird data structure as a parameter. You might get a speedup from using it in your GPU code, but since you've taken everything out of the hands of the compiler, there's little room for improvement/optimization.

Check out the docs on [StructLayout] for more info: http://msdn.microsoft.com/en-us/library/system.runtime.inter...
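
The effect of field order and padding is easy to see outside .NET too. The following is a minimal Python sketch using `ctypes` and `struct`, not .NET's `[StructLayout]` itself; the record (a 1-byte flag, an 8-byte double, a 1-byte tag) and the class names are hypothetical. It compares a declaration-order layout with padding, the same fields reordered widest-first (the kind of freedom "auto" gives the compiler), and a fully packed size with no padding.

```python
import ctypes
import struct

class DeclaredOrder(ctypes.Structure):
    # Fields stay in declaration order; ctypes inserts padding so each is aligned.
    _fields_ = [
        ("flag", ctypes.c_uint8),
        ("value", ctypes.c_double),
        ("tag", ctypes.c_uint8),
    ]

class Reordered(ctypes.Structure):
    # Same fields, widest first, so the two single bytes can share one padded slot.
    _fields_ = [
        ("value", ctypes.c_double),
        ("flag", ctypes.c_uint8),
        ("tag", ctypes.c_uint8),
    ]

declared_size = ctypes.sizeof(DeclaredOrder)
reordered_size = ctypes.sizeof(Reordered)
# "=" means standard sizes and no alignment padding: the byte count a
# hand-specified, tightly packed layout of these fields would need.
packed_size = struct.calcsize("=BdB")

print("declared order:", declared_size, "bytes, value at offset", DeclaredOrder.value.offset)
print("reordered:     ", reordered_size, "bytes, value at offset", Reordered.value.offset)
print("packed:        ", packed_size, "bytes")
```

The exact padded sizes depend on the platform ABI, but the reordered struct should never be larger than the declaration-order one, and the packed size is always 10 bytes.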

Very cool. I've been waiting for this sort of thing to happen for a long time. You still have to manage things by hand, but I bet in a year or two that won't be the case.

On the other hand, as CPUs get more/better vector units what's to stop a rep add or rep mul instruction from automatically vectorizing/parallelizing things for you?

Plenty of compilers/languages already perform autovectorization (to take advantage of SSE, for example); however, the problem in most cases is that the compiler won't (ever) be able to determine, at compile-time, how your code is going to run, so it'll have to take the safe route and not vectorize it, or only vectorize some of it.
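
One concrete reason a compiler has to "take the safe route" is aliasing: it often cannot prove that the input and output buffers are different. A small Python sketch (the function names are made up for illustration) shows the scalar loop and a vector-style version agreeing when the buffers are separate and disagreeing when they are the same buffer:

```python
def shift_add_scalar(dst, src):
    """One element at a time, in the order a plain loop would run."""
    for i in range(len(src) - 1):
        dst[i + 1] = src[i] + 1

def shift_add_vector(dst, src):
    """Vector-style: load every input first, then store every output."""
    loaded = [x + 1 for x in src[:-1]]
    dst[1:] = loaded

# Separate buffers: both versions produce the same result.
src = [0, 0, 0, 0]
scalar_out, vector_out = [0] * 4, [0] * 4
shift_add_scalar(scalar_out, src)
shift_add_vector(vector_out, src)

# Same buffer for input and output: each scalar step sees the previous
# step's write, while the vector version only sees the original values.
scalar_alias = [0, 0, 0, 0]
vector_alias = [0, 0, 0, 0]
shift_add_scalar(scalar_alias, scalar_alias)
shift_add_vector(vector_alias, vector_alias)

print(scalar_out, vector_out)      # identical
print(scalar_alias, vector_alias)  # different
```

Unless the compiler can rule out the second case at compile time, vectorizing the loop would change its meaning, so it either skips the optimization or guards it with a runtime check.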

Now, what follows is just my opinion: I think in the (relatively near) future you'll see a lot of data-crunching, high-performance code switching away from C to some of the newer functional languages, or even back to some older languages like FORTRAN -- it's easier to express certain kinds of data-parallelism in those languages, which makes less work for the developer while also making it easier for the compiler to generate optimal, vectorized code.

There is also a package for Python, PyCUDA (even easier than JavaScript, in my personal opinion).

I find it interesting that you can write for the GPU using Haskell's GPipe: http://www.haskell.org/haskellwiki/GPipe/Tutorial

Nice Link!

We need more of these technical links on HN.

Back in the old days...

Yeah, like, back in the old days we had a link to an 8085 instruction reference card that got some attention. I think that was about four months ago... ;)

I dunno, I just did some random sampling of old HN frontpages on archive.org, and there's not much technical stuff--- mostly startup news.

yeah, way too many "LOOK AT MY COOL STARTUP!" posts these days lol

You've been on this website for less than two months. You do realize that this site used to be called "Startup News", right?

> You do realize that this site used to be called "Startup News", right?

Did it? I've been here for over 1300 days and I don't remember it ever being called Startup News.

I guess my memory isn't what it used to be:)

Actually, I've been around longer than that; I just finally made an account because there was something worth commenting on.

Keep downvoting, though, if you don't like hearing the truth about how this site is turning more into what I described above, or better yet, into emo posts about how xyz startup is failing and the founders want to know why people don't like them.

There is a downvote button?

There's a karma threshold before you can see the downvote arrow.

