
Show HN: GPU text rendering with vector textures - wjd
http://wdobbie.com/post/gpu-text-rendering-with-vector-textures/
======
pcwalton
This is really neat. Though in practice if you are using this kind of thing
I'm pretty sure you will want to combine it with atlasing (i.e. construct an
atlas of glyphs via render-to-texture). That's because it's a waste of time to
rerasterize glyphs every frame in a fragment shader that appears to be 200+
lines, instead of caching the results and reducing the per-frame work to a 5-line FS
that just blits. In most apps pans are more common than zooms and the same
glyphs are used over and over, so this usually ends up being a win. Using
atlases also reduces the number of raster operations because the same glyphs
tend to appear on the page repeatedly, and by using atlases you rasterize each
glyph only once. Even more importantly, though, atlasing gives you the ability
to reuse a blitting fragment shader to render other content (images, figures,
etc.) as well as glyphs from the atlas in order to reduce state changes.
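
For concreteness, the cached path's FS really is that trivial; something like
this (my sketch, assuming glyphs have already been rasterized into an RGBA
atlas and the vertex shader supplies atlas coordinates):

    precision mediump float;
    uniform sampler2D uAtlas;  // cached glyph rasters
    varying vec2 vUV;          // atlas coordinates from the vertex shader

    void main() {
        // Per-frame work is a single texture fetch per fragment.
        gl_FragColor = texture2D(uAtlas, vUV);
    }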

~~~
derefr
There's also the higher level—the thing web browser rendering engines do,
where you generate a texture for each "tile" (single-layer viewport rectangle)
of likely-to-be-static text, and carry those around until you have to zoom or
the text changes.

I'm not sure if you need the medium-level (font atlasing) if you have both
lower-level (bezier-curve atlasing), and higher-level (pre-composited
"tiles.")

This is one of the reasons I really like WebGL, actually: for GUI elements,
you can usually skip rendering them _within_ the game altogether, instead
using plain HTML+CSS controls positioned _above_ the <canvas>, where they'll
be treated as separate "tiles" that aren't dirtied by whatever your game is
doing. Imagine: relatively-static fully-RGBA-translucent stuff sitting on top
of your animating viewport, composited into a final image each frame, "for
free."

~~~
pcwalton
> There's also the higher level—the thing web browser rendering engines do,
> where you generate a texture for each "tile" (single-layer viewport
> rectangle) of likely-to-be-static text, and carry those around until you
> have to zoom or the text changes.

I think this is actually a harmful approach in most cases--all it does is
reduce the number of vertices and slightly reduce overdraw, in exchange for a
lot more memory consumption and a lot more draw calls. Overdraw can be reduced
in other ways, such as sorting front to back and taking advantage of early Z
for opaque content. Web pages and PDFs are so far from being bound on either
ROPs or the VS, and so sensitive to draw call count, that caching tiles to
minimize overdraw is almost never worth it.

But this is somewhat off topic.

> I'm not sure if you need the medium-level (font atlasing) if you have both
> lower-level (bezier-curve atlasing), and higher-level (pre-composited
> "tiles.")

You definitely do. You're sacrificing a lot of performance for the rendering
of each tile if you don't do that.

~~~
derefr
> Overdraw

I don't think we're imagining the same things here. Think less '3D landscape
with text floating in it for some reason' (e.g. SecondLife), and more '3D
landscape with _labels_ applied to it' (e.g. Google Earth). Nothing from the
"VR" layer obscures the "labels" layer; the labels float above everything, as
if part of the GUI, but following the same scale as the things they label.

> PDFs

I'm actually unsure why a (non-interactive) PDF page, displayed at 100% zoom,
can't be baked into a _single_ extremely-high-resolution tile—applied as a
texture to a single GL rectangle with plain-old Lanczos downsampling. That'd
work for the overwhelming majority of the frames.

You'd only need to re-render the tile when you zoom in enough that it'd look
bad; effectively, you could think of the PDF page as coming from a mipmapped
texture, but where you only have the current size and will just-in-time
rerender when the zoom-factor changes. And then, when the zoom factor makes
the PDF not fit on the screen, the "tile" would be the screen size—so you'd
re-render sibling tiles when the user moves.

Or, in other words: PDFs can be rendered exactly the way the image tiles on a
mapping website are rendered, can't they? Doesn't that approach win over re-
rendering _all_ the text from a font-atlas every frame? And if not, _why do
browser rendering engines use tiles_? (This isn't a rhetorical question to
make my point; I'm not a graphics dev and I honestly don't see why.)

~~~
pcwalton
> I'm actually unsure why a (non-interactive) PDF page, displayed at 100%
> zoom, can't be baked into a single extremely-high-resolution tile—applied as
> a texture to a single GL rectangle with plain-old Lanczos downsampling.
> That'd work for the overwhelming majority of the frames.

Sure, but that uses a lot of memory, and initial pageload is going to be
really slow. Especially when you consider you need to generate mipmaps if it's
really high resolution, or else blow out your GPU's L1/L2 cache in FS
execution.

> Doesn't that approach win over re-rendering all the text from a font-atlas
> every frame?

Not significantly, in my experience. Assume, in the simple case, that you're
just rendering a bunch of text with no overlap. In that case, the only thing
that tiles buy you over rerendering all the text from an atlas is decreased
vertex count. You're still touching the same number of pixels in the FS and
ROP units either way: you just do so with more vertices in the texture atlas
case and fewer vertices in the tiling case. Now consider that creating and
maintaining the tiles has costs over maintaining the glyph atlas (which you
have to do either way): you have the memory of the tiles, the overhead of
creating and switching FBOs, lots of little textures to keep around (which in
naive implementations results in tons of state changes), and extra draw calls
to render the tiles after rendering the content.

> why do browser rendering engines use tiles?

Mostly because it was an easy way to fit GPU-accelerated panning and zooming
into the existing, originally CPU-based, rendering architectures that browser
engines used (and still largely use). Tiling was popularized by Mobile Safari
on the iPhone in 2007 as an easy way to avoid repainting the entire page on
the CPU every frame when the user performed touch gestures. I don't think it's
necessarily a globally optimal decision.

------
mattdesl
Great work! Is there any WebGL source available?

The demo runs very poorly on my 15" MBP (late 2013, Intel Iris Pro). I'm not
sure if it's just the sheer number of glyphs being rendered at once, or
whether it's something specific that this GPU is having trouble with.

Some related links for those interested in WebGL text rendering:

1 - [https://github.com/Jam3/three-bmfont-text](https://github.com/Jam3/three-bmfont-text)

2 - [http://mattdesl.svbtle.com/material-design-on-the-gpu](http://mattdesl.svbtle.com/material-design-on-the-gpu)

3 - [https://github.com/mattdesl/text-modules](https://github.com/mattdesl/text-modules)

~~~
microcolonel
It's probably Apple's custom Intel HD Graphics driver. I'm getting 60 fps on
an Intel HD Graphics 4400 on Linux.

~~~
spoiler
Whole PC freezes (except the cursor) with GeForce GT 610 in Chrome (Linux) on
the demo. Can't even switch to another tty to kill the process. The PC is
average otherwise (AMD FX 8320 with 8GB of RAM).

------
jvuletich
I took a simpler, potentially faster approach that also does subpixel AA:
[https://dl.dropboxusercontent.com/u/13285702/M3.png](https://dl.dropboxusercontent.com/u/13285702/M3.png)
[https://dl.dropboxusercontent.com/u/13285702/M3-TTF8.png](https://dl.dropboxusercontent.com/u/13285702/M3-TTF8.png)
[https://dl.dropboxusercontent.com/u/13285702/morphic3-jenson...](https://dl.dropboxusercontent.com/u/13285702/morphic3-jenson.png)
[https://dl.dropboxusercontent.com/u/13285702/Morphic3-TimesN...](https://dl.dropboxusercontent.com/u/13285702/Morphic3-TimesNewRomanSample.png)
[http://www.jvuletich.org/Morphic3/Morphic3-201006.html](http://www.jvuletich.org/Morphic3/Morphic3-201006.html)
[http://www.defensivepublications.org/publications/prefilteri...](http://www.defensivepublications.org/publications/prefiltering-antialiasing-for-general-vector-graphics)

------
sho_hn
Another one:
[https://github.com/behdad/glyphy](https://github.com/behdad/glyphy) (SDF, but
full vector outlines, not texture sampling)

~~~
microcolonel
I'm actually quite surprised the author didn't mention GLyphy.

------
jacobolus
This paper from 2014 is the best recent work I’ve seen about rendering vector
graphics on the GPU:

[http://w3.impa.br/~diego/projects/GanEtAl14/](http://w3.impa.br/~diego/projects/GanEtAl14/)

------
orangeduck
I really like this idea, but bezier curves are computationally expensive to
work with. There are many representations of these kinds of continuous shapes
that are much easier for this kind of processing.

For example, I would try converting the bezier representation into an implicit
HRBF representation (here is a good explanation in 3D:
[http://rodolphe-vaillant.fr/?e=12](http://rodolphe-vaillant.fr/?e=12)). This
representation should be much easier to process on the GPU - checking how far
a point is inside/outside the glyph should take only one matrix
multiplication, which would make the computation in the actual shader really
fast.
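
Roughly, I'm imagining a per-fragment test like this (a sketch using a plain
RBF sum with a made-up phi(r) = r^3 kernel and uniform layout, not a full
HRBF with its gradient terms; the centers and weights would come from an
offline fit to the glyph outline):

    #extension GL_OES_standard_derivatives : enable
    precision highp float;

    const int N_CENTERS = 32;           // illustrative fit size
    uniform vec2  uCenters[N_CENTERS];  // centers from an offline fit
    uniform float uWeights[N_CENTERS];  // fitted weights
    varying vec2  vGlyphPos;            // fragment position in glyph space

    // Implicit field, negative inside the glyph and positive outside
    // (by convention of the fit).
    float field(vec2 p) {
        float f = 0.0;
        for (int i = 0; i < N_CENTERS; i++) {
            float r = distance(p, uCenters[i]);
            f += uWeights[i] * r * r * r;  // phi(r) = r^3
        }
        return f;
    }

    void main() {
        float f = field(vGlyphPos);
        // Smooth the sign test over roughly one pixel of field change.
        float w = max(fwidth(f), 1e-5);
        float alpha = clamp(0.5 - f / w, 0.0, 1.0);
        gl_FragColor = vec4(0.0, 0.0, 0.0, alpha);
    }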

------
watmough
Exactly what I needed, just as I was about to start screwing with distance
fields or the like. THANK-YOU!

Edit: I may have spoken slightly too soon: not because the technique isn't
awesome, but because this isn't yet usable without an atlas generation tool
that I don't know how to build.

As far as performance goes, this is rendering a 124-page PDF (I know, just
the glyphs...) utterly without lag using the GPU, on a 16 GB quad-i7 box that
can barely render a couple of pages in OS X Preview without significant
pauses. Impressive and incredible.

~~~
davej
It's _very_ laggy and slow on a quad-i7 16 GB MacBook Pro. Presumably this is
because of the Retina display.

~~~
watmough
Here's my experience on my local devices:

    
    
      * Quad i7, 16 GB, GTX 970, OS X Firefox - perfectly smooth
      * iPhone 6s+ - very smooth
      * Nexus 6p - slightly jumpy / laggy but still impressive
      * Lumia 640 Windows Phone 10 - dFdx undeclared identifier (120,33)
    

There may be issues with WebGL on the MBP if it's using Intel graphics, but
I'd be pretty disappointed to get worse performance than an iPhone ...

For the curious, the Lumia error appears to stem from this:
[https://groups.google.com/forum/embed/#!topic/angleproject/-...](https://groups.google.com/forum/embed/#!topic/angleproject/-CcKYvupr80)

~~~
davej
Yeah, my rMBP uses an integrated Intel graphics card. The performance is
horrible, ~2 fps; tried FF and Chrome.

~~~
microcolonel
Performance is pretty decent (around 48 fps for a full page with lots of
glyphs, 1080p) on my Intel HD 4400. This is on Linux though. I hear the OS X
Intel drivers have some shader compiler pitfalls.

------
rayiner
See also: [http://www.msr-waypoint.net/en-us/um/people/cloop/LoopBlinn0...](http://www.msr-waypoint.net/en-us/um/people/cloop/LoopBlinn05.pdf)
(work done on the subject at Microsoft Research, with more of the math
described).

~~~
watmough
Just had a quick read through that paper, and it doesn't really look like the
same thing.

This technique treats a pixel as a unit circle, coloring it according to its
length of intersection, along multiple axes, with a glyph simply described by
3 point beziers, oriented to keep 'inside' on the right.

A point fully inside a glyph has a trivial intersection, and is fully shaded.

Points on the edge of a glyph intersect a bezier, with shading of the pixel
being proportionate to the amount of intersection, giving anti-aliasing.

The product I work on in my day job has similar code to handle rendering
filled polygons, with similar constraints on widdershins/anti-widdershins to
turn off filling for 'holes'. The 'magic' of this technique seems to be
applying that technique in a shader, on a per-pixel basis.

There's no triangulation that I can see (compare with the elaborate schemes
in the Loop-Blinn paper above), just a decomposition of glyphs into these
ordered collections of 3-point bezier curves, and some fiddling to get it all
saved in a texture.
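
Here's roughly how I picture the core of it in a fragment shader. This is my
reconstruction, not the post's actual code: the texture layout (one control
point per texel), the per-cell curve count, and the single horizontal
sampling axis are all assumptions (the real shader averages several rotated
axes):

    precision highp float;

    const int MAX_CURVES = 64;        // assumed per-cell upper bound
    uniform sampler2D uCurveTex;      // assumed: one control point per texel
    uniform float uTexWidth;          // width of uCurveTex in texels
    uniform int   uNumCurves;         // curves affecting this cell
    uniform float uPixelsPerUnit;     // glyph units -> screen pixels
    varying vec2  vGlyphPos;          // pixel center in glyph space

    vec2 fetchPoint(float index) {
        return texture2D(uCurveTex, vec2((index + 0.5) / uTexWidth, 0.5)).xy;
    }

    // Signed, antialiased crossings of a horizontal ray (cast from the
    // pixel along +x) with one quadratic bezier, pixel at the origin.
    float rayCrossings(vec2 p0, vec2 p1, vec2 p2) {
        // y(t) = a*t^2 - 2*b*t + c
        float a = p0.y - 2.0 * p1.y + p2.y;
        float b = p0.y - p1.y;
        float c = p0.y;
        float d = sqrt(max(b * b - a * c, 0.0));
        // Degenerate (linear-in-y) case has one root; park the other at t=2.
        vec2 t = abs(a) < 1e-6 ? vec2(c / (2.0 * b), 2.0)
                               : vec2((b - d) / a, (b + d) / a);
        float cov = 0.0;
        for (int i = 0; i < 2; i++) {
            float ti = (i == 0) ? t.x : t.y;
            if (ti >= 0.0 && ti < 1.0) {
                // x of the crossing (de Casteljau), in screen pixels
                float x = mix(mix(p0.x, p1.x, ti),
                              mix(p1.x, p2.x, ti), ti) * uPixelsPerUnit;
                // Sign of dy/dt gives the winding direction of the crossing.
                float dir = sign(a * ti - b);
                // clamp gives fractional coverage when the crossing falls
                // inside the pixel footprint -> anti-aliasing for free.
                cov += dir * clamp(x + 0.5, 0.0, 1.0);
            }
        }
        return cov;
    }

    void main() {
        float total = 0.0;
        for (int i = 0; i < MAX_CURVES; i++) {
            if (i >= uNumCurves) break;
            float base = float(i) * 3.0;
            // Translate control points so this pixel sits at the origin.
            vec2 p0 = fetchPoint(base)       - vGlyphPos;
            vec2 p1 = fetchPoint(base + 1.0) - vGlyphPos;
            vec2 p2 = fetchPoint(base + 2.0) - vGlyphPos;
            total += rayCrossings(p0, p1, p2);
        }
        // Nonzero winding: |total| is ~1 inside the glyph, ~0 outside.
        gl_FragColor = vec4(0.0, 0.0, 0.0, clamp(abs(total), 0.0, 1.0));
    }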

~~~
rayiner
Yes, it's a different technique, one that creates a triangle mesh from the
curves first. Sorry for the lack of clarity in my original post.

------
a_e_k
Nice work. The subdivision into cells and encoding into GPU textures is
strongly reminiscent of the 2008 paper "Random-Access Rendering of General
Vector Graphics" by Nehab and Hoppe [1].

Rotating the line samples is a rather interesting idea. Seems like that would
converge to the equivalent of a convolution with a radial tent filter.

[1]
[http://w3.impa.br/~diego/publications/NehHop08.pdf](http://w3.impa.br/~diego/publications/NehHop08.pdf)

------
voltagex_
This looks great, but I don't understand much of the article. As a half-decent
dev, what do I need to do to write "Hello World" on the screen?

------
DonHopkins
Signed Distance Field fonts are wonderful! I've been using a Unity extension
called TextMesh Pro, which works very efficiently, and renders beautiful text
that you can configure and decorate in many ways.

Beyond the obvious benefits of SDF fonts, it also has excellent and extensive
layout, formatting and other useful features, like automatically scaling the
text to fit in a given area.

The source code for the shaders, the formatter, and the uGUI integration is
included, but not the atlas generator (although you might ask the developer if
you need it -- he's engaged and helpful, and provides good support and quick
fixes for Unity updates). It includes desktop shaders with many fancy features
(outlines, drop shadows, beveling, glow, bump mapping, a surface shader that
reacts to lighting and throws shadows into the world), and simplified
optimized shaders for mobile.

I've used it (and read the source code to see how it works), and can testify
that it works great on Unity's desktop, iOS, Android and WebGL backends. The
code itself and the way it offloads so much work to the GPU is very beautiful
and elegant.

TextMesh Pro - Unite 14 Demo:
[https://www.youtube.com/watch?v=q3ROZmdu65o](https://www.youtube.com/watch?v=q3ROZmdu65o)

Here's a benchmark that compares 5000 crisp TextMesh Pro objects rendering at
70 FPS, versus 5000 fuzzy Unity text objects rendering at 42 FPS:
[https://www.youtube.com/watch?v=rdc8UkxuSZc](https://www.youtube.com/watch?v=rdc8UkxuSZc)

Here's the same benchmark compiled with Unity's WebGL backend, with the same
mix of 2500 static and 2500 dynamic text objects. Nowhere near the 70 FPS
of native code of course, but not bad for WebGL and so many objects -- zoom in
with the mouse wheel to see the text up close:
[http://donhopkins.com/home/TextMeshProBenchmark/](http://donhopkins.com/home/TextMeshProBenchmark/)

TextMesh Pro on the Unity Asset Store:
[https://www.assetstore.unity3d.com/en/#!/content/17662](https://www.assetstore.unity3d.com/en/#!/content/17662)

Digital Native Studios home page:
[http://digitalnativestudios.com/](http://digitalnativestudios.com/)

~~~
thorp5555
They are good, but they do have their weaknesses: rounded-off corners, poor
handling of thin lines, and the time required to generate the SDF.

------
pshc
Very clever technique! It seems more true to letterforms than SDF. Seems like
it might need more texture bandwidth/dependent reads than SDF, but less
texture memory as a whole?

I've been looking at different text rendering methods for a VR engine. I
wonder if this technique works in 3D space? If so, I'll give it a shot and
benchmark it against SDF :)

~~~
DonHopkins
I'm using TextMesh Pro for a VR/AR application in Unity called Pantomime, and
it works great for text in 3D world space, as well as for 2D GUI dialogs. You
can get up really close to the text, and it looks perfect. I posted some links
to demos on the TextMesh Pro forum here:

[http://digitalnativestudios.com/forum/index.php?topic=470.ms...](http://digitalnativestudios.com/forum/index.php?topic=470.msg3493#msg3493)

Some discussion about text in VR:

[https://www.reddit.com/r/oculusdev/comments/34d8nm/reading_i...](https://www.reddit.com/r/oculusdev/comments/34d8nm/reading_in_the_dk2/)

More about Pantomime:

[http://pantomimecorp.com](http://pantomimecorp.com)

------
e98cuenc
It doesn't work here (Nexus 5). This is what I see when I zoom in:

[https://goo.gl/photos/3EXv2t4YZRxmyNPe8](https://goo.gl/photos/3EXv2t4YZRxmyNPe8)

------
thorp5555
The use of the grid reminds me of Qin 2006's "Real-time texture-mapped vector
glyphs":

[http://dl.acm.org/citation.cfm?id=1111433](http://dl.acm.org/citation.cfm?id=1111433)

[http://docdro.id/ynU32mg](http://docdro.id/ynU32mg) (Paper)

[http://docdro.id/t6XiE97](http://docdro.id/t6XiE97) (Slides)

------
kevingadd
This is a cool technique. The WebGL demonstration is pretty compelling!

It's a little odd to me that they describe distance fields as having an
unavoidable problem with sharp corners, though. The Valve paper on signed
distance fields for text demonstrates a simple workaround that fixes sharp
corners...
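
As I understand the paper, the trick is storing two distance fields and
logically ANDing them, so a sharp corner becomes the intersection of two
smooth edges. A hypothetical fragment shader for that (the two-channel SDF
texture layout is my assumption):

    #extension GL_OES_standard_derivatives : enable
    precision mediump float;

    uniform sampler2D uSDF;  // assumed: two distance fields in r and g
    varying vec2 vUV;

    void main() {
        vec2 d = texture2D(uSDF, vUV).rg;
        // AND the two fields: min() keeps the corner where the two
        // smooth edges intersect, instead of rounding it off.
        float dist = min(d.r, d.g);
        float w = fwidth(dist);
        float alpha = smoothstep(0.5 - w, 0.5 + w, dist);
        gl_FragColor = vec4(0.0, 0.0, 0.0, alpha);
    }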

~~~
thorp5555
While it seems simple in the paper, it is way more complicated to implement in
practice as it comes with all sorts of edge cases.

[https://lambdacube3d.wordpress.com/2014/11/12/playing-around...](https://lambdacube3d.wordpress.com/2014/11/12/playing-around-with-font-rendering/)

------
jepler
The method seems to assume the geometric complexity of a glyph is relatively
small. I wonder if it will work with CJK glyphs. Possibly the number of tiles
per glyph would need to be greater; I'm not sure how badly a couple of
doublings would impact performance.

------
Apanatshka
_Source: MS Paint._

haha, that's clever ^^ EDIT: seriously though, those are some Paint skills, I
didn't even notice until I saw the "source" text below it.

------
rasz_pl
No subpixel rendering or hinting. I'm confused - is there a problem with
CPU-rendered fonts that needs fixing? I didn't notice anything wrong the last
time I was reading schematics and datasheets on a 4K monitor.

~~~
piotrkaminski
Don't know why you're getting downvoted. More performance is certainly always
good, but without subpixel anti-aliasing the text will tend to look pretty
crappy absent a ridiculously high resolution display.

~~~
pcwalton
Ridiculously high resolution displays are rapidly becoming the norm.

~~~
hammerandtongs
I assume you are thinking through its implications for WebVR+Servo?

I can see questioning the utility for current 2D applications, but this kind
of speed is really critical when the expectation is a large amount of
translation and scaling of many pages in a 3D environment.

The CSS 3D "portal" solution for WebGL doesn't seem like it can get us very
far at all...

I'd use this today in my WebVR/GL project if it were source-complete.

------
monk_e_boy
Why don't GPUs have some hardware in them to make rendering text simpler? It
seems like a pretty common function that every game has. Have you ever seen a
game without any text in it?

~~~
ris
Fixed-function hardware on GPUs is really on the way out.

------
_ZeD_
Strange, I only see white rectangles in the demo (using FF 43.0.3).

------
dvt
Although cool, this is not a particularly novel idea.

I remember reading a 2007 SIGGRAPH paper about something similar Valve did in
HL2 (or maybe it was the Orange Box). Either way, it was a similar method of
rendering vector textures via distance fields. Here it is:
[http://www.valvesoftware.com/publications/2007/SIGGRAPH2007_...](http://www.valvesoftware.com/publications/2007/SIGGRAPH2007_AlphaTestedMagnification.pdf)

~~~
mattzito
This isn't my field of expertise, so maybe I'm missing something, but the
article cites the paper you describe in literally the second paragraph.

