
Parallelizing PNG: choosing Rust for mtpng - fanf2
https://brionv.com/log/2018/09/09/parallelizing-png-part-5-choosing-rust-for-mtpng/
======
xiphias2
"I ended up not using the fancy iterator systems (yet?) but the ThreadPool is
perfect."

Please give Rayon iterators another try instead of using thread pool + message
passing, it's one of the best features of Rust.

~~~
amelius
> Please give Rayon iterators another try (...) it's one of the best features
> of Rust

Isn't it possible to implement these iterators in any language? Why are they a
feature of specifically Rust?

~~~
carlosdp
Sure, but when you use a parallel iterator like Rayon in Rust, memory and
data-race safety is guaranteed at compile time. And using Rayon is as simple
as changing “.iter()” to “.par_iter()” in your code thanks to the trait
system.

Not that you lose that safety doing the thread pools and such manually, it’s
just unnecessary.
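
To make the comparison concrete, here is a minimal std-only sketch of what the manual route looks like next to the sequential one. This is not Rayon's implementation; the function names and the `workers` parameter are made up for illustration. With Rayon the parallel function would collapse to roughly `data.par_iter().map(|x| x * x).sum()`.

```rust
use std::thread;

/// Sequential version: the plain `.iter()` chain.
fn sum_of_squares_seq(data: &[u64]) -> u64 {
    data.iter().map(|x| x * x).sum::<u64>()
}

/// Hand-rolled parallel version using only the standard library.
/// `workers` is a hypothetical tuning knob, not something from the article.
fn sum_of_squares_par(data: &[u64], workers: usize) -> u64 {
    let workers = workers.max(1);
    // Round up, and never let the chunk size be zero (`chunks(0)` panics).
    let chunk_len = ((data.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        // One scoped thread per chunk; each borrows its slice immutably,
        // which the borrow checker verifies at compile time.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || sum_of_squares_seq(chunk)))
            .collect();
        // Collect the partial sums once every worker has finished.
        handles.into_iter().map(|h| h.join().unwrap()).sum::<u64>()
    })
}
```

Calling `sum_of_squares_par(&data, 4)` should give the same result as `sum_of_squares_seq(&data)`; the extra chunking and joining is the boilerplate a parallel iterator hides.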

~~~
jandrese
Is there any reason to use .iter() over .par_iter()? Does the latter have
higher constant costs that only amortize on lists larger than some value? Or
could you just replace .iter() with .par_iter() under the hood and reap the
benefits for free?

I guess if you are iterating over something declared unsafe? But the compiler
should be able to recognize that and fall back to the linear version
automatically.

~~~
pornel
Rayon has relatively small overhead, but that doesn't mean you can
find'n'replace iter() with par_iter():

- not every collection/combination of iterators supports parallel work. For
some it doesn't make sense, for some it's just not implemented.

- while Rust prevents data races, it can't prevent deadlocks, so you need to
be mindful of locks.

- .iter() is allowed to mutate state outside of the iterator and use
non-thread-safe objects, but .par_iter() isn't (without explicit
synchronization primitives). So par_iter() will force you to use more
expensive atomic and mutex-locked objects.

For data-bound problems most of the time it just works great. I've run into
problems with some networking libraries having global state, which prevented
use of par_iter() (but no crashes - caught at compile time!)
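
A rough sketch of the shared-state point, using only the standard library rather than Rayon (the function names are hypothetical): the sequential closure can bump a plain local counter, while the threaded version only compiles once the counter is something thread-safe such as an atomic.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Sequential: the closure may freely mutate a plain local counter.
fn count_even_seq(data: &[u32]) -> usize {
    let mut count = 0;
    data.iter().for_each(|x| {
        if x % 2 == 0 {
            count += 1;
        }
    });
    count
}

/// Parallel: two scoped threads share the counter, so it has to be atomic.
/// Swapping in a plain `usize` here is rejected by the compiler.
fn count_even_par(data: &[u32]) -> usize {
    let count = AtomicUsize::new(0);
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        for half in [left, right] {
            let count = &count;
            s.spawn(move || {
                for x in half {
                    if x % 2 == 0 {
                        count.fetch_add(1, Ordering::Relaxed);
                    }
                }
            });
        }
        // All spawned threads are joined when the scope ends.
    });
    count.load(Ordering::Relaxed)
}
```

In practice a per-thread partial count that gets summed at the end would avoid the contended atomic; the version above is only meant to show where the synchronization cost comes from.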

------
Klover
I really enjoy posts like these. They're pleasant to read and give you some
insight into what people do to get their ideas materialised.

------
jhasse
> But it adds the C++ standard library as a dependency, which might or might
> not fly for GNOME.

You can also statically link C++'s standard library, just as you do now with
Rust's.

~~~
simcop2387
This might not always be an option. GCC's libstdc++ is actually under the
GPLv3[1]. There's an exception for code that's produced via an Eligible
Compilation Process, but I'm not sure whether that also covers statically
linking libstdc++ into it. That said, I'd be a little surprised if static
linking changed things here: I don't see any language about it like there is
in the LGPL, since that license also considers replacement of the LGPL code
without needing to touch the original inputs.

[1]
[https://gcc.gnu.org/onlinedocs/libstdc++/manual/license.html](https://gcc.gnu.org/onlinedocs/libstdc++/manual/license.html)

~~~
jhasse
It isn't under plain GPLv3, but a license which includes the following
exception:

> When you use GCC to compile a program, GCC may combine portions of certain
> GCC header files and runtime libraries with the compiled program. The
> purpose of this Exception is to allow compilation of non-GPL (including
> proprietary) programs to use, in this way, the header files and runtime
> libraries covered by this Exception.

See
[https://gcc.gnu.org/onlinedocs/libstdc++/manual/license.html](https://gcc.gnu.org/onlinedocs/libstdc++/manual/license.html)

~~~
uluyol
Does that include libstdc++ or just libgcc? In any case, I guess you could
always use libc++ instead.

~~~
tomjakubowski
As the page says, that license and exception apply to libstdc++. I don't know
if they apply to libgcc.

------
squiguy7
> Rust’s type system is powerful, with a generics system that’s in some ways
> more limited than C++ templates but much easier to grok.

I'm definitely not experienced enough in C++ to know the details around this
but I would love to hear about it from someone who is. I know some of the
limitations of Rust's current type system but there is a decent amount of work
underway to implement things like constant generics.

~~~
steveklabnik
There are two core differences.

The first is what happens when you get something wrong. Let's say you write a
templated function. This is Rust syntax, mapping it to C++ is left as an
exercise for the reader:

    
    
      fn foo<T>(x: T) {
          x.bar();
      }
    

Rust checks the types _before_ expansion, not after. So you get this error:

    
    
      error[E0599]: no method named `bar` found for type `T` in the current scope
       --> src/lib.rs:2:9
        |
      2 |       x.bar();
        |         ^^^
    

In C++, this stuff is checked _after_ expansion, so if you only pass things
that have bar to foo, you're all good! It will compile. But when you pass
something that doesn't, you'll get an error then.

This is a restriction, but one that leads to better error messages, and
stronger checks. You'd need to write

    
    
      fn foo<T: Bar>(x: T)
    

where Bar is a trait that provides a bar method.
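
Spelled out as a complete sketch (the `Widget` type and the `String` return value are invented here so there is something to check; only `foo` and `Bar` come from the snippets above):

```rust
/// Hypothetical trait standing in for the `Bar` mentioned above.
trait Bar {
    fn bar(&self) -> String;
}

/// Made-up type that implements the trait.
struct Widget;

impl Bar for Widget {
    fn bar(&self) -> String {
        String::from("widget")
    }
}

/// The bound `T: Bar` is what lets the body call `x.bar()`;
/// the body is checked against the bound, not against each caller.
fn foo<T: Bar>(x: T) -> String {
    x.bar()
}
```

`foo(Widget)` compiles, while something like `foo(42)` is rejected at the call site because `i32` does not implement `Bar`, without the compiler ever looking inside `foo`.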

The second difference is what is allowed in generics: Rust only lets you use
type parameters. We have accepted an RFC to allow constant expressions (the
most straightforward of which is 'integers'), but it hasn't been implemented
yet. C++ lets you do this today:
[https://stackoverflow.com/questions/499106/what-does-templat...](https://stackoverflow.com/questions/499106/what-does-template-unsigned-int-n-mean)
and
[https://en.cppreference.com/w/cpp/language/template_paramete...](https://en.cppreference.com/w/cpp/language/template_parameters#Non-type_template_parameter)
(they also have "template template parameters" aka higher kinded types:
[https://en.cppreference.com/w/cpp/language/template_paramete...](https://en.cppreference.com/w/cpp/language/template_parameters#Template_template_arguments))
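
For readers on a newer toolchain: const generics over integers were later stabilised (Rust 1.51), so the integer case can be sketched roughly as below. The `sum_array` function and `Buffer` type are hypothetical examples, not taken from the RFC.

```rust
/// Sums a fixed-size array; `N` is a compile-time integer parameter.
fn sum_array<const N: usize>(values: [i32; N]) -> i32 {
    values.iter().sum()
}

/// A made-up fixed-capacity buffer whose capacity is part of its type.
struct Buffer<const N: usize> {
    data: [u8; N],
}

impl<const N: usize> Buffer<N> {
    fn new() -> Self {
        Buffer { data: [0; N] }
    }

    /// The capacity is known statically; it equals `N`.
    fn capacity(&self) -> usize {
        self.data.len()
    }
}
```

`Buffer::<16>` and `Buffer::<32>` are distinct types here, which is the same idea as a C++ `template <unsigned N>` parameter.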

~~~
petters
It is worth adding that the C++ community has wanted to add something called
"concepts" for a very long time. Rust traits sound similar. So C++ will
likely move in the direction of Rust here.

~~~
steveklabnik
Yes. They're similar in ways, but also very different. I know that they have
been added to the C++20 draft, but haven't gotten a chance to really dig in
yet.

------
comesee
This is good work. Widespread parallel png codecs are a practical need that
we've had for a while, everyone benefits.

I'm not that interested in his implementation strategy; I would only
recommend that this get merged into libpng to make the largest impact.

------
kevin_b_er
Compression time is heavily mentioned, but how about the output file size?

Over the years there've been wildly different PNG file sizes, and a good
encoder can produce a file something like 50% the size of a naïve
implementation's output.

~~~
steveklabnik
[https://www.reddit.com/r/rust/comments/9evdyt/parallelizing_...](https://www.reddit.com/r/rust/comments/9evdyt/parallelizing_png_part_5_choosing_rust_for_mtpng/e5s0ydg/)

------
hyperman1
To be honest, I hadn't expected a speed problem here.

However,

* gzipping 7680×2160×3 bytes from /dev/urandom on my thinkpad x220t takes 1.6 seconds (not much difference between -1 and -9)

* gzipping /dev/zero takes 0.4 seconds

So it turns out this might be a problem after all!

~~~
Twirrim
I'm not sure what you expected?

Random data doesn't compress well, and requires a lot more effort by the
compressor.

Images aren't anywhere near that random.

~~~
hyperman1
I expected the throughput of the gzip process to be a lot higher, to be
honest. Both the random data and the stream of zeros took a noticeable time to
compress.

In a video game or real-time mpeg stream decoding job, the video card has to
generate this amount of data 30-60 times per second. So 0.4-1.6 seconds to
compress the result seems long in comparison.

I tested the 2 extreme situations I could think of: ultimate random and
ultimate order. Images will fall somewhere between these 2 limits, so on my
PC, the PNG compression has a lower bound of 0.4-1.6 seconds to write an image
with the current algorithm.
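
For what it's worth, the arithmetic behind those numbers can be sketched as follows (the helper names are made up; the inputs are the frame size and timings quoted above):

```rust
/// Bytes in one uncompressed frame.
fn frame_bytes(width: u64, height: u64, channels: u64) -> u64 {
    width * height * channels
}

/// Throughput in megabytes (10^6 bytes) per second.
fn throughput_mb_per_s(bytes: u64, seconds: f64) -> f64 {
    bytes as f64 / seconds / 1e6
}
```

`frame_bytes(7680, 2160, 3)` is 49,766,400 bytes, so 1.6 seconds works out to about 31 MB/s and 0.4 seconds to about 124 MB/s. Producing that frame 30 times per second is roughly 1,493 MB/s of raw data, which is the gap I was surprised by.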

------
faitswulff
I must admit I upvoted for the puns.

