
Modern GPU
http://nvlabs.github.io/moderngpu/
======
gjm11
The title here, although it matches that of the linked page, conveys little
information to anyone who doesn't already know what it's about.

("Modern GPU is code and commentary intended to promote new and productive
ways of thinking about GPU computing. This project is a library, an algorithms
book, a tutorial, and a best-practices guide.")

Perhaps something like "Modern GPU: NVIDIA's guide to making effective use of
CUDA"?

~~~
bluehex
Totally agree. However, every single time I've tried to editorialize a title
like this one to make it better, the mods have changed it back to the page's
title. I imagine a lot of people have given up on even trying to write
informative titles. It's the worst part of this forum, in my opinion.

~~~
flatline
And the suggestion to write a blog post about it and submit that instead just
encourages blog spam, which typically gets flagged into oblivion. I know I
would flag it myself: a one- or two-sentence blog post should not, IMO, be
submitted in lieu of the article itself. I personally thought the old
approach, where you could editorialize and only flamebait titles were changed
by the mods, was much better. Yes, it led to arbitrary decisions by the mods,
but I see two problems with the current system:

- Information is often missing from the original, as it is here

- The mods are not always quick to rename, and it's frustrating when an
article I previously read is changed to another title, leaving me confused
when I see it again.

Further, the current system actually incentivizes over-editorializing a title
to get a post boosted onto the front page, knowing the mods will casually
re-title it in due course.

------
seanbaxter
It's my site. I didn't realize it had even been posted here (I did post it on
reddit).

To address some points:

The title of the site is kind of vague, agreed. I'm going to write an abstract
for the index page to describe the gist of the design methodologies.

It's not part of an NVIDIA sales pitch, at all. I work in NVIDIA Research and
have a lot of autonomy. I like doing programming-patterns and algorithms
research. At some point I had amassed a lot of material and thought it would
be nice to share everything with the community; there was no coordination
with the marketing folks. My employer was nice enough to let me release it,
but this is my project, not an official company one. This flexibility is one
advantage of working in research.

I don't want to get dragged into a vendor or API debate. The MGPU project is
about programming concepts. It wouldn't be hard for a developer to translate
all the functions into another API--"hackability" is a goal that I talk about
in the introduction. I chose CUDA because it's succinct (important for code
readability) and it's what I use on a daily basis.

~~~
baggers
Cheers for the great work, I'm really looking forward to working through this.

------
yan
For those interested in articles like these, there's a really great course on
iTunes U (and YouTube, among other places) from UC Davis titled "Graphics
Architecture", taught by John Owens. [1]

I clicked on it out of a tangential interest in GPUs and ended up watching
all the lectures over the next few days with rapt attention. The topic turned
out to be far more interesting than I expected, and it was delivered in a
very approachable way.

[1] <https://itunes.apple.com/us/itunes-u/graphics-architecture-winter/id404606990>

~~~
DocSavage
There's also a GPU programming course being taught at Udacity:

<https://www.udacity.com/course/cs344>

------
compilercreator
Is it CUDA only, or can we submit code written in OpenCL, DirectCompute,
C++ AMP, etc.?

~~~
sliverstorm
It looks like NVIDIA is hoping to highlight CUDA here. You can always try
though.

~~~
seanmcdirmid
To be honest, for the target audience of this book, CUDA is the only thing
that matters right now. OpenCL and DirectCompute will possibly be significant
in the future if AMD and Intel can catch up with NVIDIA in GPGPU performance.

~~~
Jach
Here's a question you might be able to answer... what exactly do Nvidia cards
do better than AMD cards? In the bitcoin and litecoin worlds, Nvidia is
horribly outclassed at hash algorithms like SHA-256 and scrypt. In the gaming
world it's closer, but ATI's 79xx series still wins. (Especially when you
consider that in the cases where a single 7970 isn't enough, which is true
for several games, you can get two 7970s for less than the price of a GTX
Titan and win that way. Or a single 7990.) 32-bit and 64-bit floating-point
benchmarks favor AMD, and the OpenCL Sala and Room benchmarks in LuxMark favor
AMD... (Not an entirely fair comparison, since it's not CUDA, but the
difference is large enough that a performance boost from CUDA likely wouldn't
close the gap.)

I suspect Nvidia's advantage is their proprietary software like PhysX and
other software used in high-end or scientific computing, and possibly that
they scale better (or simply that there are Nvidia-supported solutions) when
you want to add dozens of cards together. Is this the case? Because I don't
see how one can claim AMD needs to "catch up" in performance if you're
looking solely at a card-by-card comparison.

Edit: Comparing
<http://clbenchmark.com/device-info.jsp?config=14470292&test=CLB40200> and
<http://clbenchmark.com/device-info.jsp?config=11905561&test=CLB40200>
(easier comparison:
<http://clbenchmark.com/compare.jsp?config_0=11905561&config_1=14470292>),
the only tests where Nvidia trounces AMD are mergesort and the memory usage
of the Gaussian blur; mergesort is mentioned in the submission. So what about
the rest? And what about factoring in being able to buy two 7970s for the
price of one Titan?

~~~
binarycrusader
Drivers, ISV certification and SDKs. Ask any user of AMD on Linux workstations
what it's like. Then compare the experience of nVidia users.

AMD has the superior hardware in many ways, but their software just isn't
quite there yet in my personal opinion.

------
rdtsc
So how do I use this with the new and modern Radeon HD7990 card?

Hmm, looks like there is no CUDA library for it...

(flagging post for misleading title)

~~~
amouat
To the best of my knowledge, CUDA is Nvidia only. You would need to look at
OpenCL or something similar. I guess the techniques presented in the article
could be ported to OpenCL.

~~~
alexchamberlain
I think that's the point... The title suggests a more general article.

------
quackerhacker
The guide looks very well written, and I'm actually studying parallel
programming right now... bookmarked.

Side note: Maybe this is Nvidia's way of promoting sales, since AMD cards
sold like crazy because of the mining craze.

~~~
jjoonathan
It's not reactionary: nvidia invested heavily in getting to market first and
in advertising to the HPC crowd, and they have been reaping the rewards of a
monopoly for a few years now. Their latest high-end cards are dramatically
more expensive (vs AMD at the same raw computational power) while offering
completely crippled GPGPU capabilities. I recall the incredulity on IRC when
people found that their new 6xx cards were a step _backwards_ from the
corresponding 5xx cards for their GPGPU apps. Pro media & HPC users fork over
the cash for Tesla because they can't jump to AMD due to their legacy CUDA
code. Gamers don't care, since they don't use double-precision arithmetic.

The bitcoin community was the only one nimble enough to switch to AMD. For
those of us in academia, we'll be paying the piper for the foreseeable future
:/

~~~
hendzen
Bitcoin miners flocked to AMD because the SHA-256 algorithm utilizes a 32-bit
right rotate operation that AMD cards could execute in one clock cycle, while
NVidia cards took ~3 cycles to do so.
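
For concreteness, here's roughly what that operation looks like in CUDA C (a
minimal sketch with illustrative names, not code from any actual miner):

    // 32-bit right rotate, the building block of SHA-256's sigma functions.
    // Valid for n in 1..31.
    __device__ __forceinline__ unsigned int rotr32(unsigned int x,
                                                   unsigned int n) {
        // Without a native rotate instruction this lowers to two shifts
        // plus an OR (the ~3 cycles mentioned above); hardware with a
        // single-cycle rotate retires it as one instruction.
        return (x >> n) | (x << (32u - n));
    }

    // e.g. SHA-256's Sigma0(a) == rotr32(a, 2) ^ rotr32(a, 13) ^ rotr32(a, 22)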

At this point it doesn't really matter since GPU mining is obsolete.

Also, what's wrong with NVidia's GPGPU capabilities? The last time I had to
write GPU code, I found CUDA much more mature - and much more pleasant to
write - than the equivalent OpenCL. And OpenCL's main benefit of running on
multiple compute platforms was being matched by CUDA with the development of
a PTX-to-x86 compiler.

~~~
jjoonathan
>Bitcoin miners flocked to AMD because the SHA-256 algorithm utilizes a 32-bit
right rotate operation that AMD cards could execute in one clock cycle, while
NVidia cards took ~3 cycles to do so.

That would have made the difference even larger, but even at 1:1 instruction
timing AMD would have had a large price advantage so long as we're talking
about the last 2 generations of cards or so. If the price/performance optimum
lies further back than that, it may well have fallen at a point in time when
AMD / NVidia were closer to price/performance parity.

To clarify, by "performance" I mean "performance for my needs" which means
"double precision float performance."

> Also, what's wrong with NVidia's GPGPU capabilities?

The price per double-precision FLOP was off by a factor of four at the
consumer level the last time I went shopping, and the number of
double-precision FLOPS available at the consumer level was capped lower as
well. It seemed like a move to force penny-pinching, CUDA-dependent academics
to upgrade to Teslas.

> The last time I had to write GPU code, I found that CUDA was much more
> mature - and much more pleasant to write than the equivalent OpenCL.

It still is, but the gap has closed for many use cases (including mine). I'm
not saying that CUDA isn't/wasn't a reasonable choice, especially a few years
back, just that we are now paying the hidden price that comes from allowing a
monopoly to develop.

------
profquail
Previous HN discussion:

<https://news.ycombinator.com/item?id=3004446>

The url linked there redirects to the new site, but the old site is still
available through the Wayback Machine:

<http://web.archive.org/web/*/http://www.moderngpu.com/>

