
Inside font-rs, a font renderer written in Rust - beefsack
https://medium.com/@raphlinus/inside-the-fastest-font-renderer-in-the-world-75ae5270c445#.uttzi87qp
======
nathancahill
Neat. I can't help but wonder if this will fall victim to the "last 10% is the
hardest" rule? Will going from tech demo to production ready remove the
performance gain?

It sounds like you're claiming that FreeType is slower because the
parsing/accumulation implementations are slower. It's far from my area of
expertise, but wouldn't 20 y/o open source software as prevalent as FreeType
have optimized those code paths?

Edit: Author is a heavyweight in font rendering circles. Excuse my ignorance.
Just wary of "10x faster with 90% of functionality!" benchmarks.

~~~
zellyn
Raph really, really knows his stuff. His interview on the New Rustacean
podcast starts with a summary of his background: GIMP, Ghostview, Android font
rendering.

~~~
zellyn
Your 10x speed, 90% functionality caution is definitely warranted. I've lost
count of how many times I've seen "we run Python/Ruby/Perl/whatever 3x
faster, we just haven't implemented exceptions and monkey-patching yet." :-)

------
kibwen
The author is in the thread on /r/rust if anyone has any questions:
[https://www.reddit.com/r/rust/comments/4vqpxx/inside_the_fas...](https://www.reddit.com/r/rust/comments/4vqpxx/inside_the_fastest_font_renderer_in_the_world/)

(Please ignore today's deliberately garish background image, we just turned
MIR on and we're celebrating. :P )

~~~
ufo
That warning about the web design shows that you know the HN crowd very well.
Complaints about fonts, annoying Javascript and other things that make the
site hard to read always get voted to the top :)

------
wmil
This is actually an important use for Rust. Many font renderers were initially
written for speed on single user systems running trusted programs.

Vector rendering creates a lot of edge cases that C tends to ignore.

The initial Xbox soft mod hack was done by loading a font with negative
values in key fields. Microsoft had brought over the Windows font rendering
code, and that wasn't written with hostile fonts in mind.

~~~
moosingin3space
My belief is that Rust today, in production, is best used in networking code,
such as protocol parsers, but the interest in font rendering in Rust will help
to push the priority of stabilizing SIMD up, which should help quite a bit
with the applications where Rust lags in performance.

~~~
pcwalton
I haven't observed that the lack of stable SIMD makes Rust only suitable for
networking code. You can easily use SIMD by writing in assembly (inline
assembly, even), with unstable intrinsics, or even with autovectorization.

In Servo, for example, we have large speedups over existing C++ codebases that
have nothing to do with networking.

~~~
dikaiosune
Inline assembly is still unstable though, yes?

Also autovectorization doesn't support many uses and can be quite brittle
AFAIK.

I think Rust is absolutely amazing, but stabilized SIMD support is right up at
the top of my wishlist.

~~~
steveklabnik
Inline asm is still unstable, yes.

------
publicfig
There's also rusttype [1], which was written in Rust as an alternative to
FreeType. I know it's being used at this point in Redox [2], an interesting
project: a Unix-like operating system written entirely in Rust, to the best of
my knowledge. I'm curious to see how the two compare at this time.

[1]
[https://github.com/dylanede/rusttype](https://github.com/dylanede/rusttype)

[2] [https://www.redox-os.org/](https://www.redox-os.org/)

~~~
raphlinus
I did some measurements of rusttype and found it to be even slower than
FreeType. That said, this is all open source and so I have confidence that the
improvements will flow all the way, either through rusttype using the font-rs
rasterizer, or font-rs becoming robust enough for downstream like Redox and
Piston to adopt.

~~~
moosingin3space
Integrating parts of this into RustType would be awesome!

------
wscott
Very interesting post.

Note that the New Rustacean podcast did an interview with the author of this
post, Raph Levien. That was very interesting and he did touch on this program.
[http://www.newrustacean.com/show_notes/interview/_2/index.ht...](http://www.newrustacean.com/show_notes/interview/_2/index.html)

I believe it should be possible to have Rust compile to a library that could
be called from a normal C program.

~~~
untothebreach
Indeed, C <-> Rust interop is one of Rust's design goals. The Rust Book goes
over this a bit:
[https://doc.rust-lang.org/book/ffi.html#calling-rust-code-fr...](https://doc.rust-lang.org/book/ffi.html#calling-rust-code-from-c)
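
To make the interop point concrete, here is a minimal sketch of a Rust function exported with the C ABI. The name `coverage_sum` and its signature are made up for illustration and are not part of font-rs; the sketch also assumes the crate is built as a `staticlib` or `cdylib` so that a C program can link against it.

```rust
use std::slice;

/// Hypothetical example function (not from font-rs): sums `len` floats
/// starting at `data`. `extern "C"` selects the C calling convention and
/// `#[no_mangle]` keeps the symbol name readable for the C linker.
/// (Rust edition 2024 spells the attribute `#[unsafe(no_mangle)]`.)
///
/// Safety: `data` must be null or point to at least `len` valid floats.
#[no_mangle]
pub unsafe extern "C" fn coverage_sum(data: *const f32, len: usize) -> f32 {
    if data.is_null() {
        return 0.0;
    }
    // Rebuild a borrowed slice from the raw pointer handed over by C.
    let values = unsafe { slice::from_raw_parts(data, len) };
    values.iter().sum()
}
```

On the C side, the matching declaration would be `float coverage_sum(const float *data, size_t len);`.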

------
joosters
Are fonts always rendered 'completely' these days? I thought that the fonts
would be rendered once to a cache of bitmaps/textures, and then those bitmaps
can be copied to the screen/buffer pretty much instantaneously.

After all, there's no point re-doing all the calculations to draw all the
curves in a letter 'g' when the output is going to look just the same as the
last time you drew it...

~~~
microcolonel
There is a point. Subpixel positioning is important, and it's hard to know how
much memory will be taken up by the full set of characters. Because you cannot
know the advances (and hence the exact positions) before you render the
characters, you cannot render them ahead of time without quantizing the
position and storing multiple bitmaps.

This quickly becomes a non-optimization.

~~~
p0nce
What I do is round the glyph position to 1/4 of a pixel; in my tests it has
little visual impact. That still means that in the worst case the glyph cache
could contain 16 copies of the same glyph, and that's before hinting enters the
equation.
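
A rough sketch of the quarter-pixel cache described above, assuming a `HashMap` keyed by glyph ID plus the quantized sub-pixel offset. `GlyphKey`, `GlyphCache`, and the `render` callback are hypothetical names for illustration, not any library's API.

```rust
use std::collections::HashMap;

/// Sub-pixel steps per pixel (1/4 pixel, as in the comment above).
const STEPS: i32 = 4;

/// Cache key: glyph plus its quantized sub-pixel offset, so at most
/// STEPS * STEPS = 16 bitmaps can exist per glyph.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct GlyphKey {
    glyph_id: u16,
    sub_x: u8,
    sub_y: u8,
}

/// Snap a position to the nearest 1/4 pixel and split it into a whole-pixel
/// part and a sub-pixel bucket in 0..STEPS.
fn quantize(pos: f32) -> (i32, u8) {
    let q = (pos * STEPS as f32).round() as i32;
    (q.div_euclid(STEPS), q.rem_euclid(STEPS) as u8)
}

struct GlyphCache {
    bitmaps: HashMap<GlyphKey, Vec<u8>>,
}

impl GlyphCache {
    fn new() -> Self {
        GlyphCache { bitmaps: HashMap::new() }
    }

    /// Returns the whole-pixel origin and the cached bitmap, calling
    /// `render` only when this (glyph, sub-pixel offset) pair is new.
    fn get_or_render<F>(&mut self, glyph_id: u16, x: f32, y: f32, render: F) -> (i32, i32, &[u8])
    where
        F: FnOnce(GlyphKey) -> Vec<u8>,
    {
        let (px, sub_x) = quantize(x);
        let (py, sub_y) = quantize(y);
        let key = GlyphKey { glyph_id, sub_x, sub_y };
        let bitmap = self.bitmaps.entry(key).or_insert_with(|| render(key));
        (px, py, bitmap.as_slice())
    }
}
```

Hinting, font size, and subpixel (LCD) mode would each add further fields to the key, which is where the memory cost starts to grow.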

------
sievebrain
Surely the biggest win here is not so much the faster rendering, which is
nice, but rather that font rendering is so often a source of remote code
execution exploits ...

~~~
pcwalton
I think Rust is at its strongest when it achieves both improved performance
and better security. Not everyone cares about performance, and not everyone
cares about security, but most people care about one or the other.

------
Drdrdrq
I'm curious: is font rendering speed something that a regular user would notice
in day-to-day use? Maybe in battery life? Or is this important just for
some specialized (designers'?) use?

I do hope this project gains traction, though the advantage I see is in
replacing another piece of legacy code with a safe(r) one (Rust).

~~~
rspeer
Right now you can basically DoS gnome-terminal on Ubuntu by sending it a
screenful of Thai text (or any of many similar scripts where most adjacent
letters have ligatures to each other, but Thai is by far the most common). It
slows down to rendering like 40 characters per second.

I work with multilingual corpora and I live in fear of accidentally scrolling
into the Thai parts. I would report this bug but I don't know whose bug it is.

So maybe I'm not a regular user, but font rendering speed is something I
notice.

~~~
anthk
gnome-terminal is slow. Try st, or rxvt.

~~~
rspeer
You must mean something else besides rxvt, which doesn't support Unicode.

~~~
welterde
Probably meant rxvt-unicode and not plain rxvt, which doesn't appear to be
under development anymore.

~~~
rspeer
As a follow-up: I tried urxvt and st, despite the unfriendliness of the fact
that you configure urxvt by looking up .Xresources incantations and you
configure st by actually editing the code and recompiling it.

It seems what we have here is a tradeoff: the font rendering in your terminal
can be good, or it can be fast, but perhaps not both given the current
options. urxvt and st are doing some kind of very low-level font rendering
that doesn't match the fonts I see most of the time on Ubuntu. Whatever
antialiasing algorithm they use, if you let them antialias, is a smudgy mess,
and not something I would want to look at all day.

The result only makes me appreciate more the idea that good, fast, text
rendering is something that a new library could help us achieve.

~~~
anthk
Edit ~/.fonts.conf

[https://wiki.archlinux.org/index.php/Font_configuration#Hint...](https://wiki.archlinux.org/index.php/Font_configuration#Hinting)

I have the hinting style set to slight.

------
Mathnerd314
For a while now, I've been wondering if font rendering could be improved by
using a Lanczos filter instead of a box filter for anti-aliasing.
(Conceptually, when you compute the pixel coverage you are convolving the
image with a box filter and then sampling at each pixel's center.) The
performance impact might be too high, but I haven't seen anyone try the
experiment.
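
To illustrate the box-filter view in one dimension: the exact coverage of pixel `i` by a filled span `[a, b)` is the overlap of the span with `[i, i + 1)`, which is what convolving with a unit-width box and sampling at the pixel center `i + 0.5` yields. This is a simplified sketch with a made-up function name, not code from any renderer.

```rust
/// Exact 1D coverage of a filled span [a, b) for each of `width` pixels.
/// Pixel i covers [i, i + 1), so its coverage is the overlap length,
/// i.e. the span convolved with a unit box and sampled at i + 0.5.
fn span_coverage(a: f32, b: f32, width: usize) -> Vec<f32> {
    (0..width)
        .map(|i| {
            let lo = a.max(i as f32);
            let hi = b.min(i as f32 + 1.0);
            (hi - lo).max(0.0)
        })
        .collect()
}
```

A Lanczos filter would replace the overlap length with an integral against a wider kernel that has negative lobes, so each pixel would depend on geometry several pixels away.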

~~~
vardump
> if font rendering could be improved by using a Lanczos filter instead of a
> box filter

You don't want an ideal filter for rendering fonts -- you want sharp edges
instead. A lot of work [1][2] has been done to achieve this.

In font rendering, alignment to pixel boundaries is intentional, it's called
hinting. So you could say aliasing is used for an advantage.

[1]:
[https://en.wikipedia.org/wiki/Font_hinting](https://en.wikipedia.org/wiki/Font_hinting)

[2]:
[https://en.wikipedia.org/wiki/Subpixel_rendering](https://en.wikipedia.org/wiki/Subpixel_rendering)

~~~
Ericson2314
I think on today's high density mobile phone screens, these techniques are no
longer needed.

~~~
TheRealPomax
If those were the only screens in the world, I would agree wholeheartedly, but
there are millions, if not more, non-mobile screens in use today. This problem
is unlikely to go away with high-DPI technology in the next 20, maybe even 50,
years.

~~~
Ericson2314
Well, a different renderer can be written for those screens.

I assume 4K will come to predominate, but we'll see.

------
Const-me
Impressive!

> dense representations have a huge advantage when data-parallelism is an
> option

Sparse or dense isn’t a binary choice; it’s possible to combine the two to get
the best of both.

When I was working on a similar problem in 3D space with a much larger
dataset, I represented my voxels as a sparse collection of small dense blocks.
The blocks are small enough to fit in a single cache line, small enough to
save a lot of RAM space and bandwidth because many are empty, but inside they
are dense and large enough to benefit from SIMD parallelism.

However, for 2D images that only take a few hundred KB of RAM, a dense buffer
is probably better because it fits in L1 or at least L2 cache.
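
One way to picture the "sparse collection of small dense blocks" layout is the sketch below. The 4x4x4 block of one-byte voxels (64 bytes, a common cache-line size) and all names are assumptions chosen for illustration, not the layout actually used in that project.

```rust
use std::collections::HashMap;

/// Edge length of a dense block; 4 * 4 * 4 one-byte voxels = 64 bytes.
const B: i32 = 4;

/// Sparse map from block coordinates to small dense blocks.
/// Blocks that were never written are simply absent and read as zero.
struct SparseVoxels {
    blocks: HashMap<(i32, i32, i32), [u8; 64]>,
}

impl SparseVoxels {
    fn new() -> Self {
        SparseVoxels { blocks: HashMap::new() }
    }

    /// Split a coordinate into (block index, offset inside the block).
    fn split(c: i32) -> (i32, usize) {
        (c.div_euclid(B), c.rem_euclid(B) as usize)
    }

    fn set(&mut self, x: i32, y: i32, z: i32, value: u8) {
        let (bx, lx) = Self::split(x);
        let (by, ly) = Self::split(y);
        let (bz, lz) = Self::split(z);
        let block = self.blocks.entry((bx, by, bz)).or_insert([0u8; 64]);
        block[(lz * 4 + ly) * 4 + lx] = value;
    }

    fn get(&self, x: i32, y: i32, z: i32) -> u8 {
        let (bx, lx) = Self::split(x);
        let (by, ly) = Self::split(y);
        let (bz, lz) = Self::split(z);
        self.blocks
            .get(&(bx, by, bz))
            .map_or(0, |block| block[(lz * 4 + ly) * 4 + lx])
    }
}
```

Inside a block the data is contiguous, so a pass over one block can use the same SIMD-friendly loops as a fully dense buffer.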

------
amelius
I always wonder why a distinction is made between font rendering and rendering
arbitrary SVG shapes. In other words, couldn't this renderer be much more
useful when generalized?

~~~
rtpg
I would imagine that arbitrary SVG rendering is harder to get right?

With fonts you can probably cache a lot of the rendering (even considering
things like ligatures), and considering how much text your machine is showing,
the performance tricks could be extra useful.

~~~
valarauca1
The SVG standard includes JavaScript.

So SVG renderers are very non-trivial to implement.

~~~
posterboy
hilarious

~~~
kbenson
In a "laugh because the only other option is to break down and cry" sort of
way, yes.

------
nathell
Extremely impressive and exciting, kudos to the author!

It would be interesting to compare the memory footprint and code size of
font-rs vs FreeType; comparing just speed doesn't give the full picture.

------
c-smile
Those manual use/optimization of SIMD are done in C anyway.

What exactly is so Rust-y in the code, compared with C/C++? What is the
main benefit of using Rust? Or is it just a test like "you can do that in Rust
too"?

~~~
vardump
> Those manual use/optimization of SIMD are done in C anyway.

C seems to be used _only_ for SSE intrinsics support. ~16 LoC of "C".

This is _all_ C in the whole project, if you _really_ think you can call it as
such:

[https://github.com/google/font-rs/blob/master/src/accumulate...](https://github.com/google/font-rs/blob/master/src/accumulate.c)

    
    
      void accumulate_sse(const float *in, uint8_t *out, uint32_t n) {
        __m128 offset = _mm_setzero_ps();
        __m128i mask = _mm_set1_epi32(0x0c080400);
        __m128 sign_mask = _mm_set1_ps(-0.f);
        for (int i = 0; i < n; i += 4) {
          __m128 x = _mm_load_ps(&in[i]);
          x = _mm_add_ps(x, _mm_castsi128_ps(_mm_slli_si128(_mm_castps_si128(x), 4)));
          x = _mm_add_ps(x, _mm_shuffle_ps(_mm_setzero_ps(), x, 0x40));
          x = _mm_add_ps(x, offset);
          __m128 y = _mm_andnot_ps(sign_mask, x);  // fabs(x)
          y = _mm_min_ps(y, _mm_set1_ps(1.0f));
          y = _mm_mul_ps(y, _mm_set1_ps(255.0f));
          __m128i z = _mm_cvtps_epi32(y);
          z = _mm_shuffle_epi8(z, mask);
          _mm_store_ss((float *)&out[i], (__m128)z);
          offset = _mm_shuffle_ps(x, x, _MM_SHUFFLE(3, 3, 3, 3));
        }
      }
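
For readers who don't speak SSE: as far as I can tell from the intrinsics, the loop above computes a running prefix sum of the input, takes the absolute value, clamps it to 1.0, and scales it to a byte. A scalar Rust sketch of that same idea follows; the function name is made up and this is not the code in font-rs. Note that `_mm_cvtps_epi32` rounds halves to even by default while `f32::round` rounds them away from zero, so exact .5 cases can differ by one.

```rust
/// Scalar sketch of the accumulation step: running sum of signed area
/// deltas, converted to 8-bit coverage.
fn accumulate_scalar(deltas: &[f32]) -> Vec<u8> {
    let mut acc = 0.0f32;
    deltas
        .iter()
        .map(|&d| {
            acc += d; // prefix sum across the whole buffer
            let coverage = acc.abs().min(1.0); // fabs, then clamp to 1.0
            (coverage * 255.0).round() as u8 // scale to 0..=255
        })
        .collect()
}
```

The SIMD version does the same work four floats at a time, carrying the last lane forward in `offset`.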
    

Also consider the top level:
[https://github.com/google/font-rs/tree/master/src](https://github.com/google/font-rs/tree/master/src)

> What is the main benefit of using Rust?

Safety and performance.

From the article:

"With the SIMD speedup, font-rs is approximately 7.6x faster than FreeType in
larger sizes (keep in mind that 42 pixels/em is the default for xxhdpi Android
devices)."

~~~
dman
Any reason why the SSE code wouldn't be written in Rust itself?

~~~
phpnode
No SIMD intrinsics yet

~~~
kibwen
Or rather, they're there, but only on nightly builds of Rust, though there's
talk of stabilizing them with only minor changes.

