

Nvidia announces CUDA 4.0 - m0nastic
http://pressroom.nvidia.com/easyir/customrel.do?easyirid=A0D622CE9F579F09&version=live&prid=726171&releasejsp=release_157&xhtml=true

======
tmurray
Been working on the API improvements for this for quite a while now. If there
are any questions I can answer, I'll tell you whatever I can.

(edit: in case it's not obvious, I work for NVIDIA on CUDA)

~~~
lrm242
I haven't done any CUDA programming, though I've wanted to dive in. At some
conferences I've attended, the break-even point for whether CUDA gives enough
benefit to a computation was largely driven by the time to load a dataset into
memory on the GPU. From what I've been told, this has limited the use of CUDA
to batch processing on large-scale datasets. Is this still a limitation?

~~~
tmurray
PCIe can be a limitation, but there are a lot of ways to amortize the latency
of copying data to/from the GPU: overlapping transfers with kernel execution,
direct loads/stores from system memory inside a kernel, multiple kernels
running on the chip at the same time--there are a lot of things you can do.
But in general, you're looking at a runtime of at least a few hundred
microseconds before you're going to be able to get a benefit from the GPU.
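The copy/compute overlap tmurray mentions can be sketched with CUDA streams and pinned host memory. This is a minimal illustration, not anything specific to 4.0; the kernel name `process` and the chunking scheme are made up for the example:

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Hypothetical kernel: doubles each element of its chunk.
__global__ void process(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main(void) {
    const int N = 1 << 20, CHUNK = N / 4;
    float *h_data, *d_data;
    // Pinned host memory is required for async copies to actually overlap.
    cudaMallocHost(&h_data, N * sizeof(float));
    cudaMalloc(&d_data, N * sizeof(float));
    for (int i = 0; i < N; i++) h_data[i] = 1.0f;

    cudaStream_t streams[4];
    for (int s = 0; s < 4; s++) cudaStreamCreate(&streams[s]);

    // Each chunk's H2D copy, kernel, and D2H copy are queued in their own
    // stream, so one chunk's transfer can overlap another chunk's kernel.
    for (int s = 0; s < 4; s++) {
        int off = s * CHUNK;
        cudaMemcpyAsync(d_data + off, h_data + off, CHUNK * sizeof(float),
                        cudaMemcpyHostToDevice, streams[s]);
        process<<<(CHUNK + 255) / 256, 256, 0, streams[s]>>>(d_data + off, CHUNK);
        cudaMemcpyAsync(h_data + off, d_data + off, CHUNK * sizeof(float),
                        cudaMemcpyDeviceToHost, streams[s]);
    }
    cudaDeviceSynchronize();
    printf("h_data[0] = %f\n", h_data[0]);

    for (int s = 0; s < 4; s++) cudaStreamDestroy(streams[s]);
    cudaFreeHost(h_data);
    cudaFree(d_data);
    return 0;
}
```

Whether the overlap actually pays off depends on the ratio of kernel time to transfer time, which is why small workloads still tend to lose to the CPU.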

------
babs474
Is anybody who is interested in CUDA also interested in the idea of GPGPU
computation in the browser using WebGL? Here is a link to a proof of concept I
did a while ago: <http://learningwebgl.com/blog/?p=1828>

I think the concept is pretty compelling but I don't get that much traction
when I bring it up. Any thoughts?

------
bryanlarsen
anandtech has some good info, too: <http://www.anandtech.com/show/4198/nvidia-announces-cuda-40>

------
m0nastic
The listed link is PR, but here are also some more details about what 4.0
contains:

[http://developer.nvidia.com/object/cuda_4_0_RC_downloads.htm...](http://developer.nvidia.com/object/cuda_4_0_RC_downloads.html)

------
ciupicri
Seeing all the buzz about CUDA, I cannot stop wondering whether OpenCL has any
future. Does it?

~~~
vilya
It does, I think. OpenCL's trump card is that it works on non-Nvidia devices
as well. I believe that'll ultimately be enough for it to win out, despite the
technical merit being in favour of CUDA right now.

------
matclayton
If you are interested in GPU coding, check out www.tidepowerd.com for a
cross-vendor .NET system; it also falls back to the CPU, and they are a
startup :)

------
ericxtang
CUDA has come a long way since its birth as a student project.

~~~
iskander
Who wrote the first version? I always assumed it was nvidia's version of
Brook.

~~~
tmurray
CUDA and BrookGPU share the same creator, but there's not really anything else
in common. The programming models are fundamentally different (a multi-level
memory hierarchy with sharing of local data and fast synchronization versus
pure streaming).
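The "sharing of local data with fast synchronization" that distinguishes CUDA from a pure streaming model can be illustrated with a kernel that stages data in on-chip shared memory. A minimal sketch (the block-reversal kernel is just an example, not anything from either project):

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Each block stages its tile in fast on-chip shared memory, synchronizes,
// then writes the tile back reversed. Threads reading data that other
// threads wrote is exactly what a pure streaming model cannot express.
__global__ void reverse_block(int *d, int n) {
    __shared__ int tile[256];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) tile[threadIdx.x] = d[i];
    __syncthreads();                        // fast intra-block barrier
    if (i < n) d[i] = tile[blockDim.x - 1 - threadIdx.x];
}

int main(void) {
    const int N = 256;
    int h[N], *d;
    for (int i = 0; i < N; i++) h[i] = i;
    cudaMalloc(&d, sizeof(h));
    cudaMemcpy(d, h, sizeof(h), cudaMemcpyHostToDevice);
    reverse_block<<<1, 256>>>(d, N);
    cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost);
    printf("%d %d\n", h[0], h[N - 1]);      // reversed: 255 0
    cudaFree(d);
    return 0;
}
```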

------
phlux
So forgive my ignorance, but would the following be of value in a system
design/architecture.

(I dont know what type of cards you have which you would use in this, such as
a multi-GPU cored PCI-e card, so in my ignorance, I will assume these exist
and are separate from a traditional "video card")

I would be interested to see a system, as a brick, which has a mobo with
multiple PCI-e slots, as many as one can get then fill it with both GPU based
cards, as well as Fusion IO cards that provide extremely fast IOPs on the
storage side. The main CPU in the box, is effectively used to boot the OS -
all the other processing and IOPs occur on the GPUs and the Fusion IO cards as
data storage and caching...

Is this a realistic idea?

~~~
lwat
GPUs can't run arbitrary code. You have to code specifically for them. They
don't run operating systems or web servers or whatever. These are for
massively parallel mathematical computations, not for system calls and threads
and stuff like that.
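The kind of work GPUs are built for is the data-parallel arithmetic lwat describes, e.g. SAXPY, where every thread independently handles one array element. A sketch of what that looks like in CUDA:

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// y = a*x + y: one thread per element, no I/O, no system calls --
// the shape of computation GPUs are designed for.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main(void) {
    const int N = 1024;
    float hx[N], hy[N], *dx, *dy;
    for (int i = 0; i < N; i++) { hx[i] = 1.0f; hy[i] = 2.0f; }
    cudaMalloc(&dx, sizeof(hx));
    cudaMalloc(&dy, sizeof(hy));
    cudaMemcpy(dx, hx, sizeof(hx), cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, sizeof(hy), cudaMemcpyHostToDevice);
    saxpy<<<(N + 255) / 256, 256>>>(N, 3.0f, dx, dy);
    cudaMemcpy(hy, dy, sizeof(hy), cudaMemcpyDeviceToHost);
    printf("%f\n", hy[0]);   // 3*1 + 2 = 5
    cudaFree(dx); cudaFree(dy);
    return 0;
}
```

Anything that branches heavily, touches the filesystem, or needs an OS underneath still belongs on the CPU.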

~~~
phlux
Right, I do know that --

What I am saying is that a system that uses both GPUs and the Fusion-io cards
would seem to provide a kickass platform for mass computation on lots of data
in a small form factor.

~~~
lwat
CUDA rack servers are available from multiple vendors.

[http://www.geeks3d.com/20091009/cxt8000-nvidia-tesla-based-s...](http://www.geeks3d.com/20091009/cxt8000-nvidia-tesla-based-server-with-1920-cuda-cores-for-gpu-computing/)

~~~
phlux
haha. When I was new to IT in Silicon Valley in 1997, we bought a bunch of
clone servers from Colfax International - dual Pentium II something-or-others,
based on Super Micro mobos.

We were running Windows NT with SQL Server as well as Linux (the founders of
Linuxcare, Dave Sifry and Art Tide, set them up). We were doing EDI with Sun,
and were one of the first companies to use XML for the docs, which were
manufacturing orders for copies of SunOS -- we physically manufactured all
their software (CDs, manuals, boxes, etc.) and drop-shipped it to the
customer.

Anyway - after a few months we were having HORRID problems with the Colfax
boxes constantly locking up. They were about $12,000 or $14,000 each at the
time - which was insanely expensive for what we were getting. I think they had
like a gig of RAM or something small.

We also had AS/400s, which cost many times that, but those were for the JDE
ERP system...

We tried to work through the problems with Colfax, and finally called a
meeting with them.

They came in and my manager asks, "Colfax.... Colfax International. Where are
your offices?"

CI: "Sunnyvale."

US: "So, any other offices?"

CI: "Nope - we are all based in the US"

US: "I see, any international customers?"

CI: "Nope, all in the bay area"

US: "So why do you call yourselves 'International'?"

CI: "...."

--- Anyway, we ended up scrapping the boxes as high-end IT workstations and
bought name-brand systems.

That was the last time I bought clone HW. We even had some of the first VA
Linux and Penguin systems -- back when they were barely seen as anything but
clones...

