
Let's Fix OpenGL [pdf] - mad
http://www.cs.cornell.edu/~asampson/media/papers/opengl-snapl2017-preprint.pdf
======
geokon
I'm not an OpenGL expert or anything, but I get the impression the author
doesn't really know what he's talking about and sounds a bit amateurish (I'm a
bit hesitant to say that, given that the author is a professor at Cornell...)

He seems to just hand-wave and say "well, just use C, you idiots". He
criticizes Metal for using a version of C++14 that doesn't allow recursion,
but offers no alternative solutions.

The reason GLSL isn't C is that you can't do everything C does on a low-end
cellphone GPU - so obviously you have to restrict the language. The "cognitive
load" of knowing the restrictions is overblown, but it is also unavoidable.

SPIR-V isn't even mentioned. DSLs are dismissed.

I didn't really follow what he didn't like about the preset input/output
variables that are predefined in the different shaders. It's a bit ugly... but
that's also pretty much a non-issue.

~~~
withjive
> The reason GLSL isn't C is because you can't do everything C does on a low
> end cellphone GPU - so obviously you have to restrict the language.

GLSL's limitations have nothing to do with low-end devices and performance
(which, in any case, is becoming less of a gap compared to desktops).

Rather, GLSL is executed on the GPU, typically across many stream processors,
and this kind of parallelization makes using a general-purpose language like C
difficult.

I think you need to familiarize yourself with at least some shader
programming before you critique the author's domain.

~~~
jwatte
I think he is familiar. The "you can't do" comment really means "you can't do
it with good performance." Typical C code would not run fast on a GPU.
Indirect data access (pointer chasing) carries an /extreme/ latency penalty!

------
33a
This author seems pretty clueless about how things actually work, and a lot
of the assumptions are just plain wrong. For example, ubershaders are
preferred over many small shaders because the overhead of switching shaders
and recompiling them is expensive. They are not something that people build
because it is convenient (rather, it is quite the opposite!) and then
specialize with fixed parameters.
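
To make the trade-off concrete, here is a minimal GLSL sketch of the
ubershader pattern (the uniform and varying names are made up for
illustration): a single fragment shader branches on uniforms, so the
application can render several material variants without switching programs
or recompiling.

    #version 330 core
    // Ubershader sketch: one program covers several material variants,
    // selected by uniforms, instead of many small specialized programs.
    uniform bool u_useTexture;
    uniform bool u_useLighting;
    uniform sampler2D u_albedoTex;
    uniform vec3 u_albedoColor;
    uniform vec3 u_lightDir;   // assumed normalized, same space as v_normal

    in vec2 v_uv;
    in vec3 v_normal;
    out vec4 fragColor;

    void main() {
        vec3 albedo = u_useTexture ? texture(u_albedoTex, v_uv).rgb
                                   : u_albedoColor;
        float shade = u_useLighting
            ? max(dot(normalize(v_normal), u_lightDir), 0.0)
            : 1.0;
        fragColor = vec4(albedo * shade, 1.0);
    }

The point is that this pattern exists to avoid the cost of glUseProgram
switches and shader recompilation, not because it is pleasant to maintain.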

------
vvanders
As someone who's a fan of programming languages (and spends a fair amount of
time in the GPU space), I'm not sure I buy many of the arguments.

The reason that GPU drivers/APIs have few safety checks is that in graphics
code, performance is valued above all else. Even simple calls can introduce
overhead that's undesirable when you're making thousands of calls of the same
type.

His example of baked shaders doesn't really seem to hold much value, since
interactive shader builders (ShaderToy, UE3/4, etc.) are all content-driven
anyway, so the extra code generation isn't a limiting constraint.

Nice effort but I don't see it solving actual pain points in production.

~~~
fulafel
If performance were valued above all else in a narrow sense, OpenGL would not
be used and people would program GPUs natively.

In a wider sense that includes programmer productivity and end-product
robustness in the equation, well, safety checks sell themselves.

Bounds checking is very cheap on CPUs, but even more so on GPUs, because GPUs
are more rarely bottlenecked on simple local arithmetic.

~~~
pandaman
> If performance were valued above all else in a narrow sense, OpenGL would
> not be used and people would program GPUs natively.

People do program GPUs natively on game consoles. However, with so many GPUs
in use (at least three major manufacturers, each with multiple architectures
just for desktop), it's impractical to write native graphics code. It's
exactly the same concern people have for CPU programming: it would be best to
write ISA code, but given the variety of available CPUs, C/C++ is the best
compromise between performance and usability. All attempts to push a "better"
language with forced run-time costs have failed to displace it.

> Bounds checking is very cheap on CPUs, but even more so on GPUs, because
> GPUs are more rarely bottlenecked on simple local arithmetic.

Good news: GPUs do bounds checking in hardware already. The errors people want
OpenGL to find are not about out-of-bounds access but are mostly of the "I
want to draw this but instead it's drawing that, what to do?" kind, caused by
the complex state and its dynamic nature.

------
nighthawk454
John Carmack weighed in on Twitter:

> ...some interesting thoughts, but the shading language is the least broken
> part of OpenGL.

> Lots of people consider automating the computation rate determination
> between fragment and vertex shaders, but it is a terrible idea.

[https://twitter.com/ID_AA_Carmack/status/851258064909070336](https://twitter.com/ID_AA_Carmack/status/851258064909070336)

------
nhaehnle
A concrete problem that the author misses is the need for a better
understanding of SPMD semantics. GLSL has the notion of "dynamically uniform"
values, i.e. values that are the same across all shader threads that arise
from one draw call, but this notion isn't really properly defined anywhere. It
involves an unholy mixture of data flow and control flow that doesn't seem to
appear anywhere else in PL theory.
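
For a concrete example of where this matters, here is a small GLSL sketch
(names are illustrative): indexing an array of samplers is only defined when
the index is dynamically uniform, and nothing in the language distinguishes
the two cases.

    #version 400 core
    uniform sampler2D u_textures[8];
    uniform int u_materialId;      // same for every fragment in the draw call

    in vec2 v_uv;
    in float v_perFragmentValue;   // varies per fragment
    out vec4 fragColor;

    void main() {
        // Fine: u_materialId is dynamically uniform, so this indexing is
        // defined behavior in GLSL 4.00.
        vec4 a = texture(u_textures[u_materialId], v_uv);

        // Undefined: this index varies per fragment, so it is not dynamically
        // uniform. The type system gives no warning in either case.
        int i = int(v_perFragmentValue) & 7;
        vec4 b = texture(u_textures[i], v_uv);

        fragColor = a + b;
    }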

Stuff kind of just works because GLSL doesn't have unstructured control-flow
(i.e., there's no goto), and people have a mental model of what the hardware
actually does and use that for the semantics.

But a proper study of those semantics, and of how to carry them over to
unstructured control-flow, or to what extent that is possible, would be
awesome.

------
shurcooL
> Potential solutions. Shader languages’ needs are not distinct enough from
> ordinary imperative programming languages to warrant ground-up
> domain-specific designs. They should instead be implemented as extensions to
> general-purpose programming languages. There is a rich literature on
> language extensibility [27, 36, 39] that could let implementations add
> shader-specific functionality, such as vector operations, to ordinary
> languages.

I like this part.

~~~
flohofwoe
Apple's Metal shading language is C++14 with some restrictions and extensions
(such as native matrix and vector types), implemented with LLVM:

[https://developer.apple.com/metal/metal-shading-language-specification.pdf](https://developer.apple.com/metal/metal-shading-language-specification.pdf)

Microsoft's HLSL is now also based on LLVM:

[https://blogs.msdn.microsoft.com/directx/2017/01/23/new-directx-shader-compiler-based-on-clangllvm-now-available-as-open-source/](https://blogs.msdn.microsoft.com/directx/2017/01/23/new-directx-shader-compiler-based-on-clangllvm-now-available-as-open-source/)

Khronos has an LLVM-based translator between LLVM bitcode and SPIR-V:

[https://github.com/KhronosGroup/SPIRV-LLVM](https://github.com/KhronosGroup/SPIRV-LLVM)

So things are converging; maybe one day GPU extensions will end up in the C++
standard ;)

~~~
nicwilson
Sadly, Khronos' LLVM only supports OpenCL (i.e. compute) SPIR-V code
generation, is rather old (3.6.1), and is nowhere near in good enough shape to
be upstreamed. MS's is only slightly newer at 3.7, and the repo also contains
a custom clang, so it would require some git surgery to get it upstream once
stable. I don't think Apple's Metal compiler is even open source.

However, I have an up-to-date fork of LLVM 5
([https://github.com/thewilsonator/llvm](https://github.com/thewilsonator/llvm))
that has Khronos' changes cleaned up a bit, i.e. the SPIR-V triples are
actually targets. Once I make the SPIR-V OpenCL extension operations proper
LLVM intrinsics instead of mangled C++ (and delete all the associated mangling
code), there is no reason that can't go upstream.

I don't think SPIR-V graphics support will be all that difficult to add once
the intrinsics nonsense is fixed.

> So things are converging, may be one day GPU extensions will end up in the
> C++ standard ;)

Not before D gets them ;) (Shameless plug: I will be speaking at DConf about
this.)

------
anderspitman
As someone who recently started learning to program GPUs, I enjoyed this read.
I find the concept of a linear-algebra-aware type system particularly
compelling. I love the idea of the type system statically checking that I'm
performing operations in and between the correct vector spaces. Is the fact
that Vulkan uses SPIR-V sufficient to support the creation of languages that
allow this to be implemented?

~~~
johncolanduoni
If all you want is more static checking, it doesn't much matter whether the
graphics API consumes GLSL or SPIR-V. SPIR-V is mostly about factoring part of
the optimization phase out of the driver and into the compilation process.
This speeds up loading a shader and, more importantly, lets developers control
aspects of optimization they couldn't before.

------
d33
Flicking through the article made me wonder - would it be possible to have
something like ACID tests for OpenGL?

~~~
cmrx64
[https://people.freedesktop.org/~nh/piglit/](https://people.freedesktop.org/~nh/piglit/)
is an extensive open conformance test suite for OpenGL. Khronos has their own
([https://github.com/KhronosGroup/VK-GL-CTS](https://github.com/KhronosGroup/VK-GL-CTS),
which used to be proprietary), but judging by the number of bugs and piglit
failures in drivers that supposedly pass it, I think it's probably not very
useful.

~~~
nhaehnle
Be aware that the open-source GL tests in vk-gl-cts still contain quite a lot
of bugs from the port from the old framework that was used in proprietary test
suite releases.

Piglit isn't really a _conformance_ test suite. It is a test suite that is
usually extended when people add new features to Mesa, and it collects
regression tests over time. However, it actually started out in part as a way
to modify glean tests that drivers _couldn't_ pass because the hardware was
lacking; for example, hardware didn't have enough precision in the blender...
those were the days. The p in piglit stands for your choice of pragmatic or
practical. I'm not sure if that's actually documented anywhere, but as the
original author, I should know ;-)

I'm sure piglit has bugs, but it's also well known that certain closed-source
drivers have less-than-conformant GLSL compilers, for example - so the fact
that drivers passing the Khronos conformance tests fail piglit tests doesn't
in itself mean anything, other than that shit needs investigating and
occasionally spec clarifications.

~~~
cmrx64
Fair enough, and good to hear from you :) piglit made my life better when I
was doing mesa work.

------
nice_byte
I don't think I fully understand the point about metaprogramming facilities.
Sure, it would be nice to have compile-time ifs that get eliminated from the
generated code if the condition isn't met. But I don't think this necessarily
solves the problem of "combinatoric explosion" of different shader variants -
you still have to generate a separate chunk of code for each possible
combination of compile-time conditions. Unless a corresponding change is
proposed at the level of shader bytecode (SPIR-V), which probably leads to
opening a can of worms.

~~~
AlphaSite
I'm fairly certain shader compilers can and do eliminate compile-time if
statements; you can observe this by seeing which uniforms remain active (for
example in WebGL).

~~~
nice_byte
You mean if we write

    if (some_constant_expression) { /* ...code... */ }

and some_constant_expression evaluates to false, the entire code block will be
stripped from the result?

That may be true, but it's not standardized behavior, is it?

~~~
JoeAltmaier
True for many compilers for many years. It's a way to integrate 'conditional
compilation' into the normal program flow. It can make the code easier (or
harder) to read. Use judiciously.
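
As a hedged GLSL sketch (the macro and uniform names are invented for
illustration): when the condition folds to a compile-time constant, compilers
typically strip the dead branch, and a uniform referenced only inside it
usually drops out of the active uniform list. That is the effect mentioned
above; it is common in practice, but not guaranteed by the spec.

    #version 330 core
    // If USE_FOG expands to 0, most GLSL compilers fold the condition and
    // strip the branch; u_fogColor then typically becomes inactive and
    // glGetUniformLocation returns -1 for it. Common, but not mandated.
    #ifndef USE_FOG
    #define USE_FOG 0
    #endif

    uniform vec3 u_fogColor;
    in vec3 v_color;
    out vec4 fragColor;

    void main() {
        vec3 c = v_color;
        if (USE_FOG != 0) {   // compile-time constant condition
            c = mix(c, u_fogColor, 0.5);
        }
        fragColor = vec4(c, 1.0);
    }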

------
nice_byte
While I agree with the points about type safety with uniforms/attributes,
I've found that this class of bugs doesn't happen to me all that often in
practice. The bug that happens way more often is a calculation done in the
wrong coordinate system (or one that uses two values from different coordinate
systems), which the author also points out.
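
A minimal GLSL sketch of that class of bug (names invented for illustration):
mixing a view-space vector with a world-space one type-checks and renders,
just incorrectly, which is exactly what a vector-space-aware type system could
catch.

    #version 330 core
    uniform mat4 u_view;
    uniform vec3 u_lightDirWorld;   // light direction, world space

    in vec3 v_normalWorld;          // interpolated world-space normal
    out vec4 fragColor;

    void main() {
        // Bug: the normal is transformed into *view* space...
        vec3 n = normalize(mat3(u_view) * v_normalWorld);
        // ...but dotted against a *world*-space direction. Both operands are
        // plain vec3, so this compiles and runs, just with wrong lighting.
        float diffuse = max(dot(n, u_lightDirWorld), 0.0);
        fragColor = vec4(vec3(diffuse), 1.0);
    }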

------
HugoDaniel
I find it hard to test OpenGL. That would be my first approach to fixing it:
provide a good way to debug and test the compiled programs.

~~~
dom0
Eh, you can already debug kernels at a line-stepping level. What more exactly
do you need?

~~~
slavik81
That sounds pretty awesome to me. What tool should I be using to do that for
Intel Haswell on Linux?

~~~
dom0
CodeXL can do that (at least on Windows).

I think VTune can do OpenCL analyses; I don't know whether it has any graphics
debugging capability (quite a lot of features are Windows-only or annoying to
set up on Linux [because who needs a kernel with a stable ABI?], including
managed analyses [1]) :(

The simple truth here is that most development happens on Windows and thus
most development tools are developed for Windows. This is almost universally
true, with some notable exceptions (e.g. Valgrind can be a killer app; nodejs
is so unportable that it requires a new Windows subsystem to run).

[1] That is, VTune detects whether and which invasive runtime you are using
and is able to descend into and analyse what the runtime is doing. E.g. with a
Python managed analysis, VTune replaces all those recursive gibberish calls
like PyEval_EvalFrameEx with the Python traces it's executing. When I first
used that, mind = blown. Sadly, Windows only. But even then VTune is,
without consciously trying to make this comment an advert, one of the best (if
not the best) performance analysis tools I've ever used.

