For text, I think bitmaps are better than splines. I can see how splines are cool from a naïve programmer’s perspective, but practically speaking they are not good enough for the job.
Vector fonts are not resolution independent, because of hinting. Fonts include bytecode of compiled programs that do the hinting. GPUs are massively parallel vector chips, not a good fit for interpreting the bytecode of a traditional programming language. This means whatever splines you upload to the GPU will only be valid for a single font size; trying to reuse them at a different resolution will cause artifacts.
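To make the size dependence concrete, here's a minimal sketch using FreeType (the font path is a placeholder): the hinting bytecode runs inside the load call, and the outline it produces is grid-fitted to the pixel size set beforehand, so the same glyph loaded at two sizes gives genuinely different spline control points, not scaled copies.

    // Minimal sketch: hinted outlines are size-specific. Links against
    // FreeType; the font path below is a placeholder.
    #include <ft2build.h>
    #include FT_FREETYPE_H
    #include <cstdio>

    int main() {
        FT_Library lib;
        FT_Face face;
        if (FT_Init_FreeType(&lib)) return 1;
        if (FT_New_Face(lib, "/usr/share/fonts/DejaVuSans.ttf", 0, &face)) return 1;

        for (int px : {12, 13}) {
            // Hinting runs as part of FT_Load_Char; the bytecode interpreter
            // grid-fits the outline to this specific pixel size.
            FT_Set_Pixel_Sizes(face, 0, px);
            FT_Load_Char(face, 'g', FT_LOAD_DEFAULT);
            FT_Outline& o = face->glyph->outline;
            // The control points are in 26.6 fixed point, already moved by
            // the hinter; they are NOT just a scaled copy of the 12px version.
            printf("%dpx: %d points, first point = (%ld, %ld)\n",
                   px, o.n_points, o.points[0].x, o.points[0].y);
        }
        FT_Done_Face(face);
        FT_Done_FreeType(lib);
        return 0;
    }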
Glyphs are small and contain lots of curves. That's lots of data to store, and lots of math to render, for a comparatively small number of output pixels. Copying bitmaps is very fast: modern GPUs, even low-power mobile and embedded ones, are designed to output a ridiculous volume of textured triangles per second. Font face and size are more or less consistent within a given document/page/screen. Apart from synthetic tests, glyphs are reused a lot, and there aren't too many of them.
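This is why a simple cache keyed on (font, size, glyph) pays off: a typical page hits the same few dozen entries over and over, and the GPU only ever sees quads sampling an atlas texture. A hypothetical sketch of such a cache (the names are mine, not Vrmac's actual code):

    #include <cstdint>
    #include <unordered_map>

    // Hypothetical glyph-cache sketch, not Vrmac's actual implementation.
    struct GlyphKey {
        uint32_t fontId;
        uint32_t glyphIndex;
        uint16_t pixelSize;
        bool operator==(const GlyphKey& o) const {
            return fontId == o.fontId && glyphIndex == o.glyphIndex
                && pixelSize == o.pixelSize;
        }
    };

    struct GlyphKeyHash {
        size_t operator()(const GlyphKey& k) const {
            uint64_t h = ((uint64_t)k.fontId << 48)
                       ^ ((uint64_t)k.glyphIndex << 16) ^ k.pixelSize;
            return std::hash<uint64_t>{}(h);
        }
    };

    // Where the rasterized bitmap lives inside the atlas texture.
    struct AtlasSlot { uint16_t x, y, width, height; };

    class GlyphCache {
        std::unordered_map<GlyphKey, AtlasSlot, GlyphKeyHash> slots;
    public:
        // Rasterize only on a cache miss; after the first frame a typical
        // page is nearly 100% hits and the CPU cost is a map lookup.
        const AtlasSlot& get(const GlyphKey& key) {
            auto it = slots.find(key);
            if (it == slots.end())
                it = slots.emplace(key, rasterizeIntoAtlas(key)).first;
            return it->second;
        }
    private:
        // Stub; the real thing would rasterize with FreeType and copy the
        // bitmap into a texture atlas, returning where it landed.
        AtlasSlot rasterizeIntoAtlas(const GlyphKey&) { return {}; }
    };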
When I started the project, the very first support for compute shaders on the Pi 4 had just been introduced in the upstream Mesa repo. It was not yet in the official OS images. Bugs are very likely in version 1.0 of anything at all.
Finally, even if the Pi 4 had awesome support for compute shaders back then, the raw compute power of its GPU is not that impressive. Here on my Windows PC, the GPU is 30 times faster than the CPU in terms of raw FP32 performance. With that kind of performance gap, you can probably make GPU splines work fast enough, after spending enough time on development. Meanwhile, on the Pi 4 there's no such difference: the quad-core CPU has raw performance pretty close to that of the GPU. To a lesser extent the same applies to low-end PCs: I only have a fast GPU because I'm a graphics programmer; many people are happy with their Intel UHD graphics, and these are not necessarily faster than their CPUs.
> > This makes it clear you didn't even perform a cursory investigation of the project
> I did, and mentioned in the docs, here’s a quote: “I didn’t want to experiment with GPU-based splines. AFAIK the research is not there just yet.”
Not what I said. I said that you didn't investigate the project discussed in the original blog post before declaring, in your words, that "the quality is not good" and comparing it to your own library.
Vrmac and piet-gpu are two totally different types of renderer. Vrmac draws paths by decomposing them into triangles, rendering them with the GPU rasterizer, and antialiasing edges using screen-space derivatives in the fragment shader. This approach works great for large paths, or paths without too much detail per pixel, but it isn't really able to render small text, or paths with a lot of detail per pixel, with the necessary quality. (Given this and the other factors you mentioned in your reply, rendering text on the CPU with Freetype is a perfectly reasonable engineering choice and I am not criticizing it in the slightest.)
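For readers unfamiliar with the technique: the fragment shader typically receives a signed distance-like value interpolated across the triangle, and uses its screen-space derivatives (what fwidth() returns in GLSL/HLSL) to turn that into coverage. A CPU-side model of the math, with made-up names, not Vrmac's actual shader:

    #include <algorithm>

    // CPU model of fwidth()-style edge antialiasing as done in a fragment
    // shader. `d` is a signed distance-like value interpolated across the
    // triangle (negative inside the path, positive outside); `fw` stands in
    // for fwidth(d), i.e. how much `d` changes across one pixel on screen.
    float edgeCoverage(float d, float fw) {
        // Remap the ~1px transition band around the edge to [0, 1] coverage:
        // fully opaque half a pixel inside, fully transparent half a pixel out.
        float t = 0.5f - d / std::max(fw, 1e-6f);
        return std::clamp(t, 0.0f, 1.0f);
    }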
In comparison, piet-gpu decomposes paths into line segments, clips them to pixel boundaries, and analytically computes pixel coverage values using the shoelace formula/Green's theorem, all in compute shaders. This is more similar to what Freetype itself does, and it is perfectly capable of rendering high-quality small text on the GPU, in a way that Vrmac isn't without shelling out to Freetype.
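As a toy illustration of that coverage computation (not piet-gpu's actual code, which does this incrementally in compute shaders): clip the path's polygon against a pixel's square with Sutherland-Hodgman, then take the clipped polygon's signed area with the shoelace formula. Since the pixel has unit area, that area is the exact box-filtered coverage.

    #include <vector>
    #include <cmath>

    struct Pt { double x, y; };

    // Clip a polygon against one half-plane (Sutherland-Hodgman step).
    // `keep(p)` says whether a point is on the kept side; `cross(a, b)`
    // returns the intersection of segment ab with the boundary.
    template <class Keep, class Cross>
    static std::vector<Pt> clipEdge(const std::vector<Pt>& poly, Keep keep, Cross cross) {
        std::vector<Pt> out;
        for (size_t i = 0; i < poly.size(); i++) {
            const Pt& a = poly[i];
            const Pt& b = poly[(i + 1) % poly.size()];
            if (keep(a)) out.push_back(a);
            if (keep(a) != keep(b)) out.push_back(cross(a, b));
        }
        return out;
    }

    // Exact box-filtered coverage of `poly` over the pixel [x0,x0+1]x[y0,y0+1]:
    // clip to the pixel square, then apply the shoelace formula.
    double pixelCoverage(std::vector<Pt> poly, double x0, double y0) {
        auto lerpAt = [](const Pt& a, const Pt& b, double t) {
            return Pt{a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t};
        };
        // Four half-plane clips, one per side of the pixel.
        poly = clipEdge(poly, [&](const Pt& p){ return p.x >= x0; },
            [&](const Pt& a, const Pt& b){ return lerpAt(a, b, (x0 - a.x) / (b.x - a.x)); });
        poly = clipEdge(poly, [&](const Pt& p){ return p.x <= x0 + 1; },
            [&](const Pt& a, const Pt& b){ return lerpAt(a, b, (x0 + 1 - a.x) / (b.x - a.x)); });
        poly = clipEdge(poly, [&](const Pt& p){ return p.y >= y0; },
            [&](const Pt& a, const Pt& b){ return lerpAt(a, b, (y0 - a.y) / (b.y - a.y)); });
        poly = clipEdge(poly, [&](const Pt& p){ return p.y <= y0 + 1; },
            [&](const Pt& a, const Pt& b){ return lerpAt(a, b, (y0 + 1 - a.y) / (b.y - a.y)); });
        if (poly.size() < 3) return 0.0;
        double area2 = 0; // twice the signed area (shoelace / Green's theorem)
        for (size_t i = 0; i < poly.size(); i++) {
            const Pt& a = poly[i];
            const Pt& b = poly[(i + 1) % poly.size()];
            area2 += a.x * b.y - b.x * a.y;
        }
        return std::fabs(area2) * 0.5; // pixel area is 1, so this is the coverage
    }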
Again, to be clear, I'm not criticizing any of the design choices that went into Vrmac; it looks like it occupies a sweet spot similar to NanoVG or Dear ImGui, where it can take good advantage of the GPU for performance while still being simple and portable. My only point here is that you performed insufficient investigation of piet-gpu before confidently making an uninformed claim about it and putting it in a somewhat nonsensical comparison with your own project.
Oh, you were asking why I said so? Because I clicked the “notes document” link in the article: the OP used the same tiger test image as me, and that document has a couple of screenshots. Those were the only screenshots I found. Compare them to screenshots of the same vector image rendered by my library, and you'll see why I commented on the quality.
> Vrmacs draws paths by decomposing them into triangles, rendering them with the GPU rasterizer, and antialiasing edges using screen-space derivatives in the fragment shader.
More or less, but (a) not always: thin lines are handled differently; and (b) that's a high-level overview, and there are many important details at the lower levels. For instance, “screen-space derivatives of what?” is an interesting question, critically important for correct and uniform stroke widths. The meshes I'm building are rotation-agnostic, and to some extent (but not completely) resolution-agnostic too.
> and it is perfectly capable of rendering high-quality small text on the GPU
It is, but the performance overhead is massive compared to the GPU rasterizer rendering these triangles. For real-world vector graphics that doesn't have too much detail per pixel, that complexity is not needed: triangle meshes are already good enough.
> it looks like it occupies a sweet spot similar to NanoVG
There are similarities; I copy-pasted a few text-related things from my fork of NanoVG: https://github.com/Const-me/nanovg/ However, Vrmac delivers much higher quality 2D vector graphics (VAA, circular arcs, thin strokes, etc.), is much faster (meshes are typically reused across frames, I use more than one CPU core, and the performance-critical pieces are C++ manually vectorized with NEON or SSE), and is more compatible (GL support on Windows or OSX is not good; you want D3D or Metal respectively).
The document explains above the tiger image (like, directly above it) that it is a test image meant to evaluate a hypothesis about fragment shader scheduling:
> Update (7 May): I did a test to see which threads in the fragment shader get scheduled to the same SIMD group, and there’s not enough coherence to make this workable. In the image below, all pixels are replaced by their mean in the SIMD group (active thread mask + simd_sum)
I cloned the piet-gpu repository and was able to render a very nice image of the Ghostscript tiger: https://imgur.com/a/swyW0gl
Way better than in the article, but still, I like my results better.
The problematic elements are thin black lines. In your image the lines are aliased; this is visible for lines that are close to horizontal but not quite. And for curved thin lines, it results in visually non-uniform thickness along the line.
The original piet-metal codebase has a tweak where very thin lines are adjusted to thicker lines with a smaller alpha value, which improves quality there. This has not yet been applied to piet-gpu.
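In pseudocode terms, the tweak looks roughly like this (a sketch of the idea, not the actual piet-metal code): a stroke thinner than one pixel is widened to one pixel, and its alpha is scaled down by the same ratio, so the total ink stays constant while the minimum feature size stays above the sampling limit.

    #include <algorithm>

    // Sketch of the thin-stroke tweak described above, not the actual
    // piet-metal code. Strokes narrower than one pixel alias badly, so
    // widen them to 1px and scale alpha down to keep total coverage equal.
    struct StrokeParams { float width; float alpha; };

    StrokeParams adjustThinStroke(float width, float alpha, float minWidth = 1.0f) {
        if (width >= minWidth)
            return {width, alpha};
        // e.g. a 0.25px stroke becomes a 1px stroke at 25% of its alpha.
        return {minWidth, alpha * (width / minWidth)};
    }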
One of the stated research goals [1] of piet-gpu is to push quality beyond what is expected of renderers today: fixing conflation artifacts, better resampling filters, careful adjustment of gamma, and other things. I admit the current codebase is not there yet, but I am excited about the possibilities in the approach, much more so than in pushing the rasterization pipeline as you are doing.
I have doubts. The reason rasterizers are so good by now is that games have been pushing fill rate, triangle counts, and texture sampler performance and quality for more than a decade.
Looking forward, I'd rather expect practical 2D renderers to use the tech made for modern games: mesh shaders, raytracing, deep learning, and even smaller features like sparse textures. These are the areas where hardware vendors are putting their transistors and research budgets.
None of the features you mentioned is impossible with rasterizers. Hardware MSAA mostly takes care of conflation artifacts, and gamma is doable with higher-precision render targets (e.g. Windows requires FP32 support since D3D 11.0).
This isn't really accurate; the immediate-mode approach you're talking about was deprecated in OpenGL 3.0 and was never supported in OpenGL ES, and in fact, Direct3D started out with an immediate-mode API just like OpenGL before moving to retained vertex buffers.
As a simple demonstration, here's an image of a 1px white-on-black line and a 1px black-on-white line: https://imgur.com/a/9d9Tu3x
It's axis-aligned and box-filtered so there are no shades of gray, which means gamma is irrelevant. The white-on-black line appears to have a greater thickness than the black-on-white line.
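For anyone who wants to reproduce the image rather than trust the link, here's a sketch that writes an equivalent PGM: a 1px white line on a black field next to a 1px black line on a white field, axis-aligned so every pixel is pure 0 or 255 and gamma handling cannot influence the result.

    #include <cstdio>
    #include <vector>

    // Reproduce the demo: 1px white-on-black and 1px black-on-white lines,
    // axis-aligned and box-filtered, so every pixel is exactly 0 or 255.
    int main() {
        const int w = 200, h = 100;
        std::vector<unsigned char> img(w * h);
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++) {
                bool leftHalf = x < w / 2;
                unsigned char bg = leftHalf ? 0 : 255; // black vs white field
                bool onLine = (y == h / 2);            // horizontal 1px line
                img[y * w + x] = onLine ? (255 - bg) : bg;
            }
        FILE* f = fopen("lines.pgm", "wb");
        if (!f) return 1;
        fprintf(f, "P5\n%d %d\n255\n", w, h);
        fwrite(img.data(), 1, img.size(), f);
        fclose(f);
        return 0;
    }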
There's a lot of empirical research showing that reading performance is better with positive-polarity (black-on-white) text than with negative-polarity text [1, 2], probably because the higher overall luminance results in a smaller pupil size and thus a sharper retinal image [3]. So, white-on-black lines appear thicker than black-on-white lines because the eye doesn't focus as sharply on them. This is true regardless of which color space blending is performed in.
Given this fact, if one wants to achieve uniform perceptual line thickness for black-on-white and white-on-black text, a more principled approach than messing with color blending would be to vary line width or the footprint of the antialiasing filter based on the luminance of the text (and possibly the background). This is the approach Apple and Adobe have taken for years with stem darkening/font dilation.
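A sketch of what that could look like (entirely hypothetical; this is not Apple's or Adobe's actual formula): widen stems the most for dark text on a light background, where strokes read as too thin, and taper the extra width off as the polarity reverses.

    #include <algorithm>

    // Hypothetical luminance-aware stem darkening, not any shipping formula.
    // Widens stems for dark-on-light text and leaves light-on-dark alone.
    // Luminances are linear, in [0, 1].
    float stemWidthAdjust(float textLuma, float bgLuma, float maxExtraPx = 0.3f) {
        // Positive polarity (dark on light) gets the most darkening;
        // negative polarity (light on dark) gets none.
        float polarity = std::clamp(bgLuma - textLuma, 0.0f, 1.0f);
        return maxExtraPx * polarity;
    }

    // Usage: strokeWidth += stemWidthAdjust(0.0f, 1.0f); // black-on-white: +0.3px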
One caveat: if viewing this image on a high-DPI display, there will be blurring from image upsampling by the browser. My eyes give the same result as yours: the white line appears thicker than the black one.
Here's the interesting thing about your observation: doing alpha compositing of text in a perceptual space (as opposed to a linear space) results in a thickening of black-on-white and a thinning of white-on-black. So doing gamma "wrong" actually results in better perceptual matching.
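Concretely, for a pixel that is 50% covered by black text on a white background: blending in sRGB space gives an encoded value of 0.5, which is much darker than the ~0.735 encoded value that linear-light blending produces, so sRGB-space blending visibly thickens black-on-white edges. A quick sketch of the arithmetic:

    #include <cmath>
    #include <cstdio>

    // Standard sRGB encoding transfer function.
    static double srgbEncode(double linear) {
        return linear <= 0.0031308 ? 12.92 * linear
                                   : 1.055 * std::pow(linear, 1.0 / 2.4) - 0.055;
    }

    // Composite black text (value 0) over a white background (value 1)
    // at 50% coverage, in two different spaces.
    int main() {
        double coverage = 0.5;

        // "Wrong": blend the sRGB-encoded values directly.
        double perceptual = coverage * 0.0 + (1 - coverage) * 1.0; // = 0.5 encoded

        // "Right": blend in linear light, then encode for display.
        double linear = coverage * 0.0 + (1 - coverage) * 1.0;     // = 0.5 linear
        double linearEncoded = srgbEncode(linear);                 // ~0.735 encoded

        // 0.5 encoded is much darker than 0.735 encoded: sRGB-space blending
        // darkens black-on-white edges, i.e. the text looks thicker.
        printf("sRGB-space blend: %.3f, linear-space blend: %.3f\n",
               perceptual, linearEncoded);
        return 0;
    }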
Do you have evidence that either Apple or Adobe varies the amount of stem darkening based on text luminance? I've tested Apple (macOS 10.12 most carefully) and have not seen this.
I don't think they actually take luminance into account (although I personally like the idea); I just meant that they solve the problem of "black-on-white text looks too thin" by thickening the text, rather than messing up other parts of the pipeline.
Absolutely, and I think there's a very strong case to be made for that. One of the points I'm trying to make is that you have to solve it somewhere. People who focus narrowly on "correct gamma" often miss this, especially when "incorrect gamma" is also plausibly a workable solution.
I notice that your renderer doesn't even attempt to render text on the GPU and instead just blits glyphs from an atlas texture rendered with Freetype on the CPU: https://github.com/Const-me/Vrmac/blob/master/Vrmac/Draw/Sha...
In contrast, piet-gpu (the subject of the original blog post) has high enough path rendering quality (and performance) to render glyphs purely on the GPU. This makes it clear you didn't even perform a cursory investigation of the project before making a comment to dump on it and promote your own library.