
New Libre GPU Effort Based on RISC-V - xvilka
https://www.phoronix.com/scan.php?page=news_item&px=Libre-GPU-RISC-V-Vulkan
======
monocasa
I don't have particularly high hopes for this. "Start looking at HW" is their
last step in the roadmap. GPU microarchitectures are all about amortizing a
lot of traditional CPU components across as many threads as you can, to such
a degree that the hardware is reflected in the ISA itself.

EDIT: What would be cool would be a more traditional GPU architecture, married
to RISC-V "minion" cores like the ones that already exist today but aren't
open at all. There are a lot of very closed (even more so than the shader
units) RISC processors in GPUs. AMD has a RISC core reading and processing
the command lists, among other things:
[https://github.com/fail0verflow/radeon-tools/tree/master/f32](https://github.com/fail0verflow/radeon-tools/tree/master/f32)
PowerVR has one doing thread scheduling and argument marshaling (the
Programmable Data Sequencer). Nvidia has their cores that they just switched
to RISC-V.

Imagine having a custom command list tailored to your application, sort of
like the benefits that custom N64 RSP microcode offered.
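
Roughly, the inner loop of such a minion core might look like the sketch
below; the packet layout, opcodes, and names here are all invented for
illustration, not taken from any real command processor:

    #include <stdint.h>
    
    /* Hypothetical command packet; real formats (e.g. AMD's PM4) differ. */
    enum cmd_op { CMD_NOP, CMD_SET_REG, CMD_DRAW, CMD_FENCE };
    
    struct cmd_packet {
        uint32_t op;    /* one of cmd_op */
        uint32_t arg0;  /* e.g. register index or vertex count */
        uint64_t arg1;  /* e.g. value or buffer address */
    };
    
    /* The minion core spins on a ring buffer that the host fills.
     * A custom "microcode" build of this loop could add opcodes
     * tailored to one application, much like custom RSP microcode. */
    void command_processor(volatile struct cmd_packet *ring,
                           volatile uint32_t *rptr,
                           volatile const uint32_t *wptr,
                           uint32_t ring_mask)
    {
        for (;;) {
            while (*rptr == *wptr)
                ;  /* idle until the host advances the write pointer */
    
            struct cmd_packet pkt = ring[*rptr & ring_mask];
            switch (pkt.op) {
            case CMD_SET_REG: /* poke a fixed-function register */ break;
            case CMD_DRAW:    /* kick the shader front end */      break;
            case CMD_FENCE:   /* signal completion to the host */  break;
            default:          break;
            }
            (*rptr)++;
        }
    }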

~~~
abdullahkhalids
> GPU microarchitectures are all about amortizing a lot of traditional CPU
> components across as many threads as you can...

Is that so bad? I would think the point of the project is to demonstrate a
proof of principle, not to actually build something usable. So efficiency
would be pretty low on the list of priorities.

~~~
monocasa
The whole point of a GPU is efficiency. If you don't care about that, you can
already run OpenSWR on a Rocket core and call it good.

------
petermcneeley
I recently wrote a realtime tiled software rasterizer and shader in SSE.

Graphics APIs (OpenGL, DirectX) are leaky in that the graphics programmer is
knowingly targeting an architecture with GPU performance properties: raster
is cheap, texturing and filtering are cheap, and fixed-function blending and
saturation, clears, and compression come for free.

CPU rendering cannot hope to compete against GPU hardware, because those
GPU-style optimizations are already baked into how graphics programmers use
the GPU APIs to render 3D scenes.
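
For flavor, here is a minimal sketch of the kind of SSE inner step a tiled
rasterizer uses; the edge-equation setup is omitted and the names are mine,
not from any particular codebase:

    #include <emmintrin.h>  /* SSE2 */
    
    /* e0..e2 hold three edge-function values for four pixels;
     * step0..step2 are the per-edge increments for moving four
     * pixels to the right along the scanline. */
    static inline int coverage4(__m128 *e0, __m128 *e1, __m128 *e2,
                                __m128 step0, __m128 step1, __m128 step2)
    {
        /* A pixel is inside the triangle when all three edge values
         * are non-negative, i.e. when no sign bit is set. */
        __m128 any_neg = _mm_or_ps(_mm_or_ps(*e0, *e1), *e2);
        int outside = _mm_movemask_ps(any_neg);  /* one sign bit per pixel */
    
        /* advance all three edges to the next four-pixel span */
        *e0 = _mm_add_ps(*e0, step0);
        *e1 = _mm_add_ps(*e1, step1);
        *e2 = _mm_add_ps(*e2, step2);
    
        return ~outside & 0xF;  /* set bit = covered pixel */
    }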

------
snaky
EOMA68 September update:
[http://lists.phcomp.co.uk/pipermail/arm-netbook/2018-September/015830.html](http://lists.phcomp.co.uk/pipermail/arm-netbook/2018-September/015830.html)

------
nickik
For those interested in RISC-V and the possibility of GPU-type things, I can
mention some stuff that might not be well known.

RISC-V is developing a Vector extension (V) that will allow SIMD-style
programming with a variable vector length.
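
In plain C, the vector-length-agnostic style that V enables looks roughly
like the strip-mined loop below; hw_set_vl() is just my stand-in for the
vsetvli instruction, which reports how many elements the hardware will handle
per trip, so the same binary runs on any vector width:

    #include <stddef.h>
    
    /* Placeholder for RVV's vsetvli: the hardware picks some vl <= n. */
    extern size_t hw_set_vl(size_t n);
    
    void saxpy(size_t n, float a, const float *x, float *y)
    {
        while (n > 0) {
            size_t vl = hw_set_vl(n);
            for (size_t i = 0; i < vl; i++)  /* one vector op under V */
                y[i] += a * x[i];
            x += vl; y += vl; n -= vl;
        }
    }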

See from the last workshop:

Intro:
[https://www.youtube.com/watch?v=S4fxBZD79gc](https://www.youtube.com/watch?v=S4fxBZD79gc)

Project Update:
[https://www.youtube.com/watch?v=ESu9NI3h1Y4](https://www.youtube.com/watch?v=ESu9NI3h1Y4)

Initially the standard also included a vector type field for each register,
which would have allowed different element types, such as tensors, matrices,
and so on.

This has been removed from the initial Vector extension proposal, but work on
it will continue. At least one company is already actively developing
hardware for V with tensors (see below), sadly not open source.

This Libre GPU project is going a slightly different route, targeting
Simple-V, a slimmer version of the Vector extension. See:
[http://hands.com/~lkcl/simple_v_chennai_2018.pdf](http://hands.com/~lkcl/simple_v_chennai_2018.pdf)

Esperanto Technologies is developing a chip (and IP) that will do this, but
it will be closed source (as far as we know).

See this talk by David Ditzel:
[https://www.youtube.com/watch?v=f-b4QOzMyfU](https://www.youtube.com/watch?v=f-b4QOzMyfU)

------
rbanffy
I _love_ this idea. It's kind of like a Xeon Phi, but with RISC-V and an ISA
that could be tailored to different functions.

Imagine a processor with a couple of beefy RISC-V cores with lots of memory
bandwidth and deep pipelines, sharing the system with some more cores that
are more power-efficient (but slower), and some more cores that have very
wide SIMD pipelines but sacrifice branch prediction and speculative execution
for that.

I'd love to program such a beast.

~~~
zokier
Well, on the other hand, the article is very light on details; indeed, it
doesn't even say that the proposed GPU would be massively multicore in the
way Larrabee/Xeon Phi was.

~~~
thechao
Also, RISC-V completely defeats the benefits that Mike Abrash & co. designed
into the ISA. LRBni was x86-_based_, but certainly had a lot of highly
CISC-y features: at minimum, fused load-op and store-op, in addition to
replicated mask setting. For instance, one of the most important instruction
primitives was 'addsets'; for registers A, B, and K:

    
    
        A += B
        K = the sign bit of the scalar float in A
    

Which is an odd duck, except if you know this function's nickname: "the
rasterizer". LRBni was chock-full of these instructions; they were added
because Mike _knew_ what the hotspots were in building SW rasterizers, after
decades of experience.

There were a few other instructions, some implemented (MAD233) and some not
(full permute), that were needed to round out the performance profile.

In addition, LRB was designed so (almost) every instruction retired in 4
stages. Each 'core' was a 4-way barrel processor, so, except for a wart
dealing with RAW mask-register ops, all instructions (including fused
test-jmps) retired "the next cycle".

LRB died on its shit backbone (the triple ring). If they'd had a proper
message-passing cache, with a parallel scratch RAM next to the cache
hierarchy, it'd have knocked everyone's socks off, even in 2009, three years
late.

~~~
monocasa
For those curious but without enough context to google the right terms:
addsets is pretty obviously the inner loop of a barycentric-coordinate
rasterizer. With the right setup, K will be set based on whether or not a
pixel is inside a triangle.
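
In scalar C, the pattern that addsets fuses into one instruction looks
something like this (edge-equation setup omitted, names illustrative):

    #include <stdint.h>
    
    /* Step one edge function by its per-pixel delta and return its
     * sign bit: exactly the "A += B; K = sign(A)" pair addsets fuses. */
    static inline int step_edge(float *e, float de)
    {
        *e += de;
        union { float f; uint32_t u; } v = { .f = *e };
        return (int)(v.u >> 31);
    }
    
    /* A pixel is inside the triangle when all three edge functions
     * are non-negative, i.e. when no sign bit is set. */
    static inline int inside(float *e0, float *e1, float *e2,
                             float de0, float de1, float de2)
    {
        int k0 = step_edge(e0, de0);
        int k1 = step_edge(e1, de1);
        int k2 = step_edge(e2, de2);
        return !(k0 | k1 | k2);
    }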

------
MisterTea
This is something I have been interested in for a long time. I was hoping
that if RISC-V were to spin off a GPU variant, we would get a very similar
set of instructions for both the CPU and GPU, allowing us to use one
compiler.

------
agumonkey
Let's see if an open-source GPU would unleash user-friendly OSS machines.

