
Drop-in GPU Acceleration of GNU Octave - bprs
http://devblogs.nvidia.com/parallelforall/drop-in-acceleration-gnu-octave/
======
jordigh
Sigh, I wish people would address the problem here that nvidia's CUDA
runtime is non-free. There is no free OpenCL implementation that I know of
either, except Clover, but it currently merely translates the OpenCL code
to run on the CPU instead of the GPU. I think I've heard of some free CUDA
implementations that were at a similarly embryonic stage.

Octave is a GNU package. GNU's purpose is to ensure that you can use free
software. Rushing to tie Octave to flashy features like GPU acceleration
without first pausing to fix the underlying problem of non-free GPU
acceleration is putting the horse before the cart. This works against
Octave's goal: to provide a free alternative to Matlab, one that lets you
understand and control your computations down to the hardware level. If we
don't emphasise software freedom, then there is no need for Octave, since we
already have Matlab. Indeed, it is this very freedom that nvidia is abusing
here to accelerate Octave's BLAS libraries, a task that would be much more
difficult with Matlab, where they don't have the source code.
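
To make the mechanism concrete: the linked post swaps the BLAS underneath
Octave at load time rather than touching Octave at all. Roughly (a sketch
only; the libnvblas.so preload and nvblas.conf fallback configuration come
from the post, while bench.m and the script itself are illustrative):

    % bench.m -- illustrative single-precision matrix-multiply benchmark.
    % Run plainly, the product below goes through the system BLAS. Run as
    %   LD_PRELOAD=libnvblas.so octave bench.m
    % (with NVBLAS_CONFIG_FILE pointing at an nvblas.conf that names a CPU
    % BLAS fallback), the very same sgemm call is intercepted by nvidia's
    % drop-in NVBLAS library, with no changes to any Octave code.
    N = 4096;
    A = single (rand (N));   % single precision, per the article's GK104
    B = single (rand (N));
    tic;
    C = A * B;               % dispatches to whichever sgemm is loaded
    t = toc;
    printf ("%dx%d SGEMM: %.2f s (%.1f GFLOP/s)\n", N, N, t, 2*N^3/t/1e9);

That drop-in ease is itself a product of the freedom in question: Octave's
BLAS linkage is open and replaceable, while the library doing the replacing
is not.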

I know that nobody wants to even think that this problem exists and even fewer
people want to fix Clover because it's such a difficult task, but it's a task
that we can't ignore.

Note also that because the GPU libraries do not qualify as system libraries
as defined by the GPL (they are not shipped with the OS), you can't even
distribute GPU-accelerated Octave object code. We consider the Octave C++
API to fall squarely under the GPL's copyleft.

Personally, as a GNU Octave developer, I am very unhappy that nvidia is using
Octave to advertise its hardware and non-free drivers. I am also unhappy that
nvidia is luring users to use non-free software, acting against our goals. I
reiterate Linus Torvalds's well-known sentiments against nvidia.

~~~
CGudapati
Completely unrelated, and sorry for derailing, but isn't it "cart before the
horse"? I mean, the horse is always in front of the cart. I am not a native
English speaker. Am I missing anything?

~~~
graphene
You are correct.

------
zurn
The entire GPU programming landscape is so fragmented and unapproachable now.
If they'd only exposed the processors directly and worked to unify behind a
common compiler frontend... Imagine a world where a good GPGPU language was a
standard GCC/LLVM frontend and all the GPU vendors worked in harmony to
improve their back-ends in the upstream compiler. This would enable
development of new cross-vendor GPU languages to compete with the ailing
OpenCL.

(This is what Apple has done with Metal Shading Language btw, minus the open
source part. And what Intel's Larrabee would have enabled.)

Even if you disregard the licensing, the common subset of NV + AMD + Intel
capabilities (software stacks included) is so weak that it's almost never
worth the effort. The comically bad software from AMD and the lack of
hardware oomph from Intel mean the only option is to require NVidia and
build on their software support. This is passable for the narrow sector of
activities that can tolerate vendor lock-in (Cuda), like short-lived HPC
projects.

All this has resulted in a lack of open GPU programming languages. OpenCL is
better than nothing, but even if the implementations were of usable quality,
it's not a good compiler target for higher-level languages and it's not a
good language to write by hand.

The situation pretty much guarantees that GPU computing stays on the fringes
for the foreseeable future.

~~~
foxhill
> The entire GPU programming landscape is so fragmented and unapproachable
> now.

it's fragmented, true, but it's not that bad.

> If they'd only exposed the processors directly and worked to unify behind a
> common compiler frontend...

they don't need to, LLVM-SPIR is supposed to enable this - compile kernels
to IR and let the runtime JIT it into the binary the GPU requires.

> And what Intel's Larrabee would have enabled.

intel did release larrabee, sort of. it's called the Xeon Phi (and it isn't
exactly great...)

> The comically bad software from AMD

AMD have a well-deserved reputation for bad software, but things aren't that
bad now.

> OpenCL is better than nothing, but even if the implementations were of
> usable quality

i don't know what problems you ran into specifically, but most of the
runtimes are definitely of usable quality. ironically, it's apple that has
the worst runtime (but even that isn't _so_ bad)... think of that what you
will!

things have improved in the OpenCL space, and they will continue to improve
for the foreseeable future. AMD and intel are doing good here, and nvidia
actually _do_ support OpenCL. questions of performance portability aside, CL
is a pretty good option for people wanting to run on GPUs.

~~~
zurn
LLVM-SPIR doesn't have any production-quality implementations for the
popular hardware, nor are any announced. This notwithstanding, it's only
further proof of the fragmentation, as it's merely one of many vapourware
competitors in this sector.

Xeon Phi is not Larrabee the GPU; it's the product of a salvage-and-pivot
from that project, resulting in an HPC sidecar (the HPC sector I addressed
in my original comment).

------
osivertsson
_We choose GNU Octave, since it supports single precision which the GK104
we’ll be running performs best on._

Are single-precision computations useful in the contexts where GNU Octave is
generally used?

~~~
valarauca1
Single vs. double precision is fine.

Most Matlab calculations are homework or simulations. Generally speaking,
the real world will add more error than double-precision floats will remove.

Honestly, GNU Octave has been a surprisingly fast-developing product. Matlab
only got GPU acceleration 2 years ago (according to my buddy who uses
Matlab), so keeping pace like this is very fast for a GNU project. A lot of
academics are picking up on it, with even professors recommending it.

I think the most _glowing_ recommendation I heard was, "Well it's free, and
free goes a long way when you're living on a research stipend."

~~~
wtallis
Double-precision is a good default for a tool like Octave, but I agree that
single-precision can be plenty accurate for a lot of uses. It's often just not
worth the added time and code complexity to determine where single-precision
should be used.
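
To put rough numbers on that tradeoff, here is an illustrative Octave sketch
(the two epsilons are the standard machine epsilons for each type; the rest
is made up for demonstration):

    % Relative rounding error is bounded by the machine epsilon of the type:
    eps ("single")    % ans = 1.1921e-07  (~7 significant decimal digits)
    eps ("double")    % ans = 2.2204e-16  (~16 significant decimal digits)

    % Narrowing doubles to singles perturbs each entry by at most about
    % eps("single")/2 in relative terms, usually far below the measurement
    % noise already present in real-world data:
    A  = rand (1000);
    As = single (A);
    max (max (abs (double (As) - A) ./ abs (A)))   % on the order of 6e-08

If the input data is only good to three or four digits in the first place,
the extra digits that double precision carries along are noise, not signal.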

------
cieyo9938y88
I don't know much, but I REMEMBER the finger gesture from Linus, the Linux
founder, at nvidia, and the big hassles with nvidia drivers on BSD and
FreeBSD. Python, MATLAB and closed source?? Using Haskell on toy examples
can get you a FUNNY SURPRISE.

How about connecting the deeply secretive GPU to the non-functional Intel
TSX transactions and one safe thread, while observing the side channels and
side effects, for better GUESSING GAMES and for understanding PSEUDO (i.e.
fake) random numbers?

~~~
codygman
> Using Haskell on toy examples can get you a FUNNY SURPRISE.

Funny surprises such as?

