
Will OpenCL help displace GPGPU? (2012) - luu
http://yosefk.com/blog/will-opencl-help-displace-gpgpu-parallella-p2012.html
======
DiabloD3
That article is largely a trainwreck.

Yes, OpenCL is an API and shader language for GPGPU usage.

Yes, it is a competitor to CUDA.

No, CUDA does not account for the majority of GPGPU usage (although in 2012,
3+ years ago, this may have been true depending on who you spoke to); OpenCL
now does.

No, CUDA and OpenCL competing is not an issue, as Nvidia is a founding member
of Khronos's OpenCL committee, and Nvidia uses a merged compiler backend for
all their shaders (including OpenCL and CUDA); neither API will produce better
optimized binaries than the other on Nvidia hardware.

Yes, there are OpenCL implementations for CPUs: AMD's and Nvidia's
implementations can both be executed on x86 CPUs (useful for lock-step,
full-program debugging without needing to deal with the async nature of GPUs),
Intel has one for x86, and IBM has one for PPC/PPC64/Power.

Yes, there are OpenCL implementations for DSPs. That was already a thing in
2012. Sony/IBM has one for Cell SPEs.

Yes, GPUs are in fact highly parallel DSPs.

No, neither Parallella nor P2012 has taken off since 2012.

No, OpenCL cannot displace GPGPU, because that is like saying electricity will
displace the car. Instead, we'll make cars that run on electricity rather than
gasoline; i.e., we'll be moving to more universally supported APIs.

Is GPGPU dying? Yes. What is killing it? The march of technology toward making
EVERYTHING general-purpose processing. Floating point used to be a specialized
application requiring an optional co-processor; it is now part of the CPU.
SIMD used to be the province of specialized DSPs; that, too, is now part of
the CPU, and that part of the CPU becomes more complex every other generation
or so. GPGPUs were merged on-die a few generations back and natively speak to
the CPU's memory controller and socket bus as if they were just another core.

And merging GPGPUs on-die isn't even new, if we view them as just specialized,
highly parallel DSPs: MIPS and ARM SoCs have done this for roughly the past
decade by including communications and audio processing DSPs in the package or
on-die.

So, just like the FPU and SIMD, GPGPUs as highly parallel DSPs are being
merged into the CPU.

And, as a side note, you know what this looks like from the other side, a CPU
becoming a highly parallel DSP? Intel's Xeon Phi series.

And, as another side note, since it is now a given that CPUs need to handle
embarrassingly parallel tasks efficiently, Intel is now tackling the fact that
QPI does not scale across a ccNUMA cluster the way AMD's HyperTransport does,
and is moving to embed their InfiniBand variant onto CPU and Phi dies, for
native 40/100 Gbit on-die. There are already Xeon E5/E7s slated to have this
on a post-LGA-2011-3 socket.

Disclaimer: I wrote DiabloMiner, that popular GPU Bitcoin miner everyone used
until ASICs took over.

~~~
klagermkii
Why do you think AMD would bother with something like
http://www.anandtech.com/show/9792/amd-sc15-boltzmann-initiative-announced-c-and-cuda-compilers-for-amd-gpus
so recently if OpenCL has become the dominant platform?

Nvidia still doesn't even support OpenCL 2.0 from two years ago. Surely if
OpenCL were making significant gains they would be forced to support it,
especially since Nvidia's Tesla is far more popular than FirePro in any kind
of HPC setup?

~~~
DiabloD3
Nvidia has been having problems implementing OpenCL 2.0 (DX12 support has been
a higher priority for them, especially given that only Maxwell rev 2 properly
supports things like Rasterizer Ordered Views and Tiled Resources).

Intel has had OpenCL 2.0 support since August 2014, and AMD since September
2014.

OpenCL 2.0's interesting features are shared virtual memory and the generic
address space, which Nvidia hardware before Maxwell rev 2 may not be able to
support; Nvidia may have to wait until Pascal to support them (via the NVLink
and unified memory feature set). Ergo, Nvidia may not have the hardware
support to add OpenCL 2.0 yet.

OpenCL 2.1 is the more interesting one because it adopts SPIR-V (the
intermediate language shared with Vulkan) along with a kernel language based
on a C++ subset, something Nvidia has been leading the charge on.

OpenCL 2.1 was finalized last month, so I don't expect drivers to support it
properly until this time next year.

