
Hacking GCN via OpenGL - Impossible
https://onedrive.live.com/view.aspx?resid=EBE7DEDA70D06DA0!107&app=PowerPoint&authkey=!AD-O3oq3Ung7pzk
======
striking
> GPU-based OS, anyone?

This makes me... uneasy.

Incredible work, though. I had no idea GPU shaders were run as Von Neumann
programs. I always thought they were really tiny sets of math operations with
minimal branching, since they had to scale across a bunch of little cores.
Apparently that's not entirely true!

~~~
RyanZAG
They can read and write memory (to access textures), do math (to calculate
output colors), and can jump (necessary for bounds checks, etc). That's pretty
much the definition of a Von Neumann program:

    
    
      program variables     ↔ computer storage cells
      control statements    ↔ computer test-and-jump instructions
      assignment statements ↔ fetching, storing instructions
      expressions           ↔ memory reference and arithmetic instructions
    

Maps pretty much exactly to what you need.
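As a concrete illustration (my own sketch, not from the article), a minimal GLSL compute shader exercises all four elements of that mapping: storage cells via an SSBO, test-and-jump via a bounds check, fetch/store via assignment, and arithmetic in the expression.

```glsl
#version 430

// Hypothetical minimal compute shader, just to illustrate the
// Von Neumann correspondence above.
layout(local_size_x = 64) in;

// "program variables <-> computer storage cells": a buffer the
// shader can both read and write.
layout(std430, binding = 0) buffer Data {
    float values[];
};

void main() {
    uint i = gl_GlobalInvocationID.x;

    // "control statements <-> test-and-jump": a bounds-check branch.
    if (i >= uint(values.length()))
        return;

    // "assignment <-> fetching, storing" and
    // "expressions <-> memory reference and arithmetic":
    // read, compute, write back.
    values[i] = values[i] * 2.0 + 1.0;
}
```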

~~~
greggman
This is all true but ... AFAIK GPUs are optimized for graphics and therefore
suck at general-purpose programs. Sure, it would be fun to get some general
code to run on them, just as it's fun to make a Game Boy or a toaster run
Linux.

There's also the issue that GPUs are not preemptable, which kind of makes
preemptive multitasking hard.

~~~
bjwbell
Modern GPUs (the last 5 years or so) are optimized for GPGPU in addition to
graphics. They also support preemptive multitasking on either a per-batch or
per-workgroup basis.

Intel's latest GPU architecture has an embedded OS running on the GPU for
scheduling command batches; I'm not sure what AMD and Nvidia do.

I still wouldn't write a general purpose OS for it.

~~~
greggman
I think we have different definitions of "preemptive multitasking". There is
no GPU I know of that can be preempted once it's been given a command to draw.
Once it starts, if that drawing command takes 30 seconds, there's no
preempting it. This is why Windows has a timeout that resets the GPU if it
doesn't respond (I believe other OSes have added that feature, but I'm not
100% sure). Anyway, I've yet to use a GPU or an OS that supports preempting
the GPU. I'd be happy to be proven wrong. I can also give you samples to
test. It doesn't require fancy shaders; all it requires is lots of large
polygons in one draw call.
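For reference, the kind of sample being described needs nothing beyond trivial shaders; the load comes entirely from overdraw in a single submitted draw call. A hypothetical sketch (the shaders and draw parameters below are my own illustration, not greggman's actual samples):

```glsl
#version 330

// Trivial pass-through vertex shader: the hang comes from the
// geometry volume, not from any "fancy" shading.
// Hypothetical sketch, not greggman's actual sample code.
layout(location = 0) in vec2 pos;   // e.g. a screen-covering triangle

void main() {
    gl_Position = vec4(pos, 0.0, 1.0);
}

// Fragment shader would be equally trivial:
//   out vec4 color;
//   void main() { color = vec4(1.0); }
//
// Host side (illustrative): one draw call that covers the screen many
// thousands of times, e.g.
//   glDrawArraysInstanced(GL_TRIANGLES, 0, 3, 200000);
// Once that single call is submitted, the driver cannot interrupt it
// mid-draw; it either finishes or trips the OS watchdog/GPU reset.
```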

~~~
bjwbell
That's what I know too for graphics draw calls. For GPGPU there's been hard
work toward finer-grained preemption; last I looked into it (~1 yr ago), on
Linux it was, to put it kindly, a work in progress.

If you're curious, look up the Intel Broadwell GPU specs; there are sections
devoted to the various levels of preemption. If you're really curious, look
up the workarounds needed for the finest-grained preemption (this would be
preempting a single GPGPU draw call).

Then decide that enabling fine-grained preemption should probably wait for
Skylake, unless you've taken too much Adderall and no challenge sounds
impossible. Do I speak from personal experience? I plead the Fifth.

I've no experience with how fine-grained Nvidia's preemption is.

------
bjwbell
Is there a good reason to disassemble the shader machine code instead of using
what's needed from the open source Linux/Mesa GCN driver code?

~~~
Sanddancer
Because Mesa is an OpenGL implementation and, as such, doesn't support some
of the instructions that are available on the card. Also, the Mesa GCN code
didn't support GCN 1.2 until a couple of months ago, even though the first
GCN 1.2 cards were released last year. Finally, as far as I can tell, Mesa
doesn't expose an assembler and linker for shader programs, so he'd probably
spend more time chopping Mesa up to get what he wanted.

~~~
nhaehnle
There is an assembler for GCN in the AMDGPU backend of LLVM, which is what
Mesa uses to compile shaders. I haven't actually tried to use the assembler
standalone, but for any serious work in that direction it makes sense to use
it.

~~~
h3r3tic
It will definitely be useful for something more advanced, thanks! One feature
I did _not_ want was register allocation, as I needed full control over it. I
figured it would be quicker to roll something custom than to bend a complex
piece of software to my needs.

------
sklogic
Nice! But I suspect it would have been much easier with the OpenCL driver and
the current LLVM GCN backend.

~~~
h3r3tic
I suspect AMD's OpenCL driver uses very similar or the same headers. I opted
for OpenGL because the subsequent experiments I'm doing interact with
graphics, and require indirect dispatch.

