
How I Wrote a Modern C++ Library in Rust - hsivonen
https://hsivonen.fi/modern-cpp-in-rust/
======
pedrocr
Is there some kind of target or plan for converting C/C++ to Rust in Firefox?
Or is it just happening as there are people interested in working on parts of
the stack?

These progress numbers are very interesting:

[https://twitter.com/eroc/status/1061049330574884864](https://twitter.com/eroc/status/1061049330574884864)

and I was wondering how directed the effort was.

~~~
hsivonen
Not a specific target of having to replace C++ for the sake of replacing C++.

The way Rust code gets added includes:

* A new feature needs an identifiable library, so the new library can be written in Rust to begin with. (Example: U2F token USB integration.)

* Old code needs a rewrite anyway, so the rewrite can be in Rust. (Example: Character encoding converters.)

* Servo has proven a component, so it makes sense to bring it over. (Examples: Stylo and WebRender)

* History of vulnerabilities in code that was replaced. (Example: MP4 metadata parser)

~~~
Already__Taken
Is anyone aware of secondary effects this has had? e.g. removed C++ code that
was later found to have bugs, or newly rewritten crates that are now more
useful to the wider community than the same code locked up in C++.

~~~
earenndil
> removed C++ code that was later found to have bugs

I doubt this would ever be discovered; who would analyze code that was
formerly a part of Firefox?

~~~
nickpsecurity
There's actually a lot of people who want actual, experimental data to back a
language's claims about safety. A subset of them use C and C++. I occasionally
argue with them about safety benefits of other languages. They demand more
proof than the design, esp field data. I do keep stuff like this as
experimental evidence that will add up for such empiricists over time.
Although, I prefer controlled experiments where you teach amateurs C, modern
C++, and Rust over a specific time followed by testing (esp fuzzing) of their
code to test the safety claims. Run it in a dozen different places to see if
results are consistent.

There's also folks that just study these things to identify patterns in
problems created, prevented, or detected (at what effectiveness) in various
languages and techniques in software development. In a similar vein, each bug
report also provides (in theory) a test case for automated tools that detect
bugs. It's very important to have a huge, diverse pile of code to test those
tools with. That's because each one's algorithms might have blind spots
missing bugs. The more code and bugs we have, the better we can assess those
algorithms' accuracy. And then build better algorithms. :)

~~~
umanwizard
Don't get me wrong, I love Rust, but I think any programming beginner starting
with Rust as a first language is pretty likely to fail.

Ownership is a hugely important part of designing programs, and it's something
people need to come to terms with eventually, but a language where you can't
do even hello world without understanding ownership adds a lot of mental
overhead to the learning process when someone is still not even comfortable
with for loops and function calls.

~~~
valarauca1
`cargo new $project_name` literally generates a hello world program so it is
misleading to say you need to understand ownership to write `hello world`.

That being said ownership is rather hard, but liberal usage of `.clone()` can
get you pretty far.
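
To make the `.clone()` point concrete, here is a minimal sketch. The `shout` helper is hypothetical and exists only to take ownership of its argument; it is not from the article.

```rust
// Hypothetical helper for illustration: it takes ownership of its argument.
fn shout(s: String) -> String {
    s.to_uppercase()
}

fn main() {
    let greeting = String::from("hello world");

    // Passing `greeting` directly would move it, and the `println!` below
    // would then fail to compile. Cloning gives `shout` its own copy, at the
    // cost of one extra allocation.
    let loud = shout(greeting.clone());

    println!("{} -> {}", greeting, loud);
}
```

The clone buys simplicity rather than performance, which is usually an acceptable trade while learning.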

------
the_mitsuhiko
I have started on it more than once and given up each time because of the
sheer complexity involved, but I would really love to have attributes that
auto-generate a C ABI from a Rust API, alongside headers and then high-level
Python and C++ bindings to it. With the recent improvements to wasm-bindgen I
looked into whether parts of it could be repurposed, but sadly it seems very
custom-crafted.

If someone is interested in that as well, maybe there are some ways to join
forces.

~~~
Sean1708
Are you aware of cbindgen[0]? It's not exactly what you want, but it's
certainly far closer than wasm-bindgen is.

[0]: [https://github.com/eqrion/cbindgen](https://github.com/eqrion/cbindgen)

~~~
detaro
He's published some tooling using it:
[https://blog.sentry.io/2017/11/14/evolving-our-rust-with-mil...](https://blog.sentry.io/2017/11/14/evolving-our-rust-with-milksnake)

------
cryptonector
C ABI to the rescue!

That may be the one thing left of C in 50 years' time.

~~~
MaxBarraclough
A dead lingua franca, like Latin. Could happen, but I suspect embedded
programmers will stick with C forever.

~~~
kccqzy
This is one reason I don't really like embedded programming. There seems to be
generally a fear of newer systems languages like Rust and a fear of newer
features in C++. An embedded programmer told me he doesn't use the C++ STL
because he doesn't want functions allocating stuff behind his back. Doesn't seem
very convincing as you can always redefine the global operator new.

~~~
rcxdude
There is a fair amount of interest in Rust from embedded programmers, though
at the moment you can only really use it for ARM cores (which are probably
among the most common embedded cores but there are a huge number of
alternatives). There is a dedicated working group within the Rust community
for embedded which is pushing towards having a good user experience for those
wanting to use Rust in such an environment.

But even within embedded Rust you won't find much appetite for dynamic memory
allocation, and a fair amount of the work has been stabilizing the mechanisms
by which you can build Rust code which does not use its standard library
(`no_std`). This has nothing to do with the fear of the new and lots to do
with predictability of code. Using a single heap for all allocations is not at
all ideal in an embedded context: your allocations become much harder to
predict (both in terms of failure and in terms of time taken), errors become
harder to recover from, and it becomes harder to reason about the amount of
memory your system will use (especially in edge cases). For embedded work you
will generally try very hard to allocate everything statically, and use some
kind of pool allocation if you cannot (and often you combine the allocation
and whatever data structure you are placing the objects in).
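
As a rough illustration of the pool approach described above, here is a minimal sketch of a fixed-capacity pool whose storage is an inline array, so it never touches a heap. The `Pool` type and its methods are hypothetical names chosen for this example; a real `no_std` build would use `core::array::from_fn` rather than the `std` path.

```rust
/// Fixed-capacity pool (hypothetical type): storage is an inline array, so
/// the worst-case memory use is known at compile time.
pub struct Pool<T, const N: usize> {
    slots: [Option<T>; N],
}

impl<T, const N: usize> Pool<T, N> {
    pub fn new() -> Self {
        Pool { slots: std::array::from_fn(|_| None) }
    }

    /// Stores `value` in the first free slot and returns its index.
    /// If the pool is full, the value is handed back to the caller.
    pub fn insert(&mut self, value: T) -> Result<usize, T> {
        match self.slots.iter().position(|slot| slot.is_none()) {
            Some(index) => {
                self.slots[index] = Some(value);
                Ok(index)
            }
            None => Err(value),
        }
    }

    /// Frees a slot and returns what it held, if anything.
    pub fn remove(&mut self, index: usize) -> Option<T> {
        self.slots.get_mut(index)?.take()
    }
}

fn main() {
    let mut pool: Pool<u32, 2> = Pool::new();
    let a = pool.insert(10).unwrap();
    let _b = pool.insert(20).unwrap();

    // Exhaustion is an ordinary, recoverable value rather than an abort.
    assert_eq!(pool.insert(30), Err(30));

    // Freeing a slot makes room again, in the same place.
    assert_eq!(pool.remove(a), Some(10));
    assert_eq!(pool.insert(30), Ok(a));
}
```

Because capacity is part of the type, running out of space shows up as an `Err` the caller has to handle, which is the predictability argument made above.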

~~~
josephg
> Using a single heap for all allocations is not at all ideal in an embedded
> context

The same is true in game development. Using a per-frame arena allocator for
short lived objects can make a big difference to performance. And Rust’s
lifetime rules can be used to make this sort of thing completely safe.

It’s a shame that Rust’s std relies completely on the hidden global allocator
instead of accepting an allocator as a parameter. It means that while you can
write your arena allocator, you can’t use it with any of the built-in
Box/Vec/HashSet/etc. types.

I actually really like Zig’s answer here of just passing in an allocator as an
argument in all data structures. I’d love to see that in Rust!
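
A minimal sketch of the per-frame idea, under simplifying assumptions: the `FrameArena` and `Handle` names are hypothetical, the arena holds a single type, and it is backed by an ordinary `Vec` (so it still uses the global allocator once, up front) instead of a true bump allocator. What it does show is the borrow checker preventing a reset while a reference into the arena is still alive.

```rust
/// Handle into a `FrameArena` (hypothetical type). It is only meaningful
/// until the next `reset`; this sketch does not detect stale handles that
/// happen to point at a slot reused in a later frame.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Handle(usize);

pub struct FrameArena<T> {
    items: Vec<T>,
}

impl<T> FrameArena<T> {
    pub fn with_capacity(capacity: usize) -> Self {
        FrameArena { items: Vec::with_capacity(capacity) }
    }

    pub fn alloc(&mut self, value: T) -> Handle {
        self.items.push(value);
        Handle(self.items.len() - 1)
    }

    pub fn get(&self, handle: Handle) -> Option<&T> {
        self.items.get(handle.0)
    }

    /// Drops every object at once but keeps the buffer for the next frame.
    pub fn reset(&mut self) {
        self.items.clear();
    }
}

fn main() {
    let mut arena = FrameArena::with_capacity(1024);

    for frame in 0..3u32 {
        let handle = arena.alloc(frame * 10);
        let value = arena.get(handle).copied();
        println!("frame {}: {:?}", frame, value);

        // let r = arena.get(handle);
        // arena.reset();          // rejected: `arena` is still borrowed by `r`
        // println!("{:?}", r);

        arena.reset();
    }
}
```

Since `reset` takes `&mut self`, no reference obtained from `get` can survive across a frame boundary, which is the safety property mentioned above.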

~~~
steveklabnik
We didn’t have an allocator API back then, or we would have liked to do that.

Global allocators have finally stabilized. Working on it!

------
SloopJon
If I'm reading this correctly, the "modern C++" part of the library is in fact
supplied by a C++ wrapper around a C-style API.

~~~
hsivonen
Correct. The Rust API is recreated in C++ using the corresponding C++
facilities with a C API in between.

~~~
lightedman
"with a C API in between."

I smell vulnerabilities miles away with that sort of implementation. Hope the
Rust programmers remember basic garbage collection in C, since C itself
doesn't have it automated.

~~~
kibwen
Just because it's using a C-style API doesn't mean there needs to be any C
code involved at all (I don't know if there is in this case, but in general
it's not necessary). You have one side say "hey, pretend this code I've got
here is C", and the other side says "hey, let me call this function that I
think is C". Every language has a way of calling C functions, so you don't
even necessarily need a C compiler.

~~~
hsivonen
> I don't know if there is in this case

In this case, there indeed is no .c compilation unit between the C++ and Rust
code that see each other via C linkage.
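
For readers unfamiliar with the mechanism, here is a minimal sketch of a Rust function exposed with the C calling convention and no C source anywhere. The function `count_ascii` is hypothetical and is not part of the library discussed in the article. To keep the sketch buildable on any edition it omits the attribute a real library would add so that the symbol is not mangled (`#[no_mangle]`, spelled `#[unsafe(no_mangle)]` in the 2024 edition).

```rust
use std::slice;

/// Hypothetical example function using the C calling convention.
///
/// A C++ caller would declare it roughly as:
///   extern "C" size_t count_ascii(const uint8_t* ptr, size_t len);
///
/// # Safety
/// `ptr` must point to `len` readable bytes, or be null.
pub unsafe extern "C" fn count_ascii(ptr: *const u8, len: usize) -> usize {
    if ptr.is_null() || len == 0 {
        return 0;
    }
    // Rebuild a slice from the pointer/length pair the caller passed in.
    let bytes = unsafe { slice::from_raw_parts(ptr, len) };
    bytes.iter().filter(|b| b.is_ascii()).count()
}

fn main() {
    let text = "naïve"; // 'ï' takes two non-ASCII bytes in UTF-8

    // Calling through a C-ABI function pointer, as foreign code would.
    let f: unsafe extern "C" fn(*const u8, usize) -> usize = count_ascii;
    let n = unsafe { f(text.as_ptr(), text.len()) };

    println!("{} ASCII bytes out of {}", n, text.len());
}
```

Only the signature crosses the boundary: a pointer, a length, and an integer return value. Neither side needs to know what language the other was written in.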

------
bluGill
I wish Rust, C++, Go, D... could get together and agree on a common ABI/module
format. It doesn't need to be complete (which is to say I'm fine with having
to use C for the rest), just enough that 90% of my program can be written/rewritten in
whatever language makes sense and used in the other without having to drop to
C.

For starters it needs to have classes (probably using PIMPL like things
internally by default). It needs to have some sort of error handling. It needs
to support some basic data like std::vector (but they can start from scratch).

Edit: my fingers typed API not ABI first...
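
For comparison, this is roughly what "classes" and "error handling" look like today when flattened onto the C ABI: an opaque pointer plus free functions that return status codes. It is a minimal sketch; `Counter`, the `counter_*` functions, and the status constants are all hypothetical names invented for this example.

```rust
/// Foreign callers only ever hold a `*mut Counter` and never look inside.
/// `#[repr(C)]` simply gives the struct a defined layout.
#[repr(C)]
pub struct Counter {
    total: u32,
}

// Hypothetical status codes standing in for exceptions / `Result`.
pub const COUNTER_OK: i32 = 0;
pub const COUNTER_OVERFLOW: i32 = 1;
pub const COUNTER_NULL: i32 = 2;

/// "Constructor": moves the object to the heap and leaks it to the caller.
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { total: 0 }))
}

/// "Method". Safety: `c` must be null or a live pointer from `counter_new`.
pub unsafe extern "C" fn counter_add(c: *mut Counter, n: u32) -> i32 {
    let counter = match unsafe { c.as_mut() } {
        Some(counter) => counter,
        None => return COUNTER_NULL,
    };
    match counter.total.checked_add(n) {
        Some(total) => {
            counter.total = total;
            COUNTER_OK
        }
        None => COUNTER_OVERFLOW,
    }
}

/// "Getter". Safety: same pointer requirements as `counter_add`.
pub unsafe extern "C" fn counter_total(c: *const Counter) -> u32 {
    unsafe { c.as_ref() }.map_or(0, |counter| counter.total)
}

/// "Destructor". Safety: `c` must not be used again after this call.
pub unsafe extern "C" fn counter_free(c: *mut Counter) {
    if !c.is_null() {
        drop(unsafe { Box::from_raw(c) });
    }
}

fn main() {
    let c = counter_new();
    unsafe {
        assert_eq!(counter_add(c, 5), COUNTER_OK);
        assert_eq!(counter_add(c, u32::MAX), COUNTER_OVERFLOW);
        println!("total = {}", counter_total(c));
        counter_free(c);
    }
}
```

Everything a richer shared ABI would express directly (ownership, methods, error types) has to be encoded here by convention, which is the boilerplate the wish above is aimed at.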

~~~
simias
That could actually somewhat work for C++ and Rust (although getting things
like generics/templates across would be a huge challenge and you'd lose Rust's
safety guarantees at the unsafe C++ interface. Also good luck dealing with
exceptions). For Go and D you have a bigger problem: garbage collection.
Having two different runtimes play well with each other is far from trivial
and could hurt performance.

The main reason the C ABI is the de facto lingua franca for language bindings
is that it's almost runtime-less. You basically only need to agree on
things like stack layout and where the parameters/return values go. It's a
super low bar.

Dealing with C++ objects, overloaded functions, namespacing, exceptions etc...
Now that's a whole different can of worms. How would you automagically map
something like std::cout and its operator<< in Rust or Go for instance?

~~~
masklinn
> Dealing with C++ objects, overloaded functions, namespacing, exceptions
> etc... Now that's a whole different can of worms. How would you
> automagically map something like std::cout and its operator<< in Rust or Go
> for instance?

You'd define std::cout and operator<< in terms of a sane underlying
reader/writer ABI?

~~~
simias
Right, but my point is that then you're generating a complex FFI (which
requires high level info about how "operator<<" on std::cout implements
streaming and not bitshifting as usual), not a simple binding. Doing that
automatically and "standardly" would be quite complex. Those aren't primitive
types, they're abstractions built on top of the language's type system, g++ is
not aware of what a "stream" is in C++, nor is rustc aware of what the Read
and Write traits mean.

------
rokob
The epilog alone was worth the price of admission. Having one compiler tell
the other what to do via code generation is a great way around the lack of ABI
compatibility.

------
kyberias
Sounds like a huge effort just to get Rust in. I predict huge problems for the
Firefox program if they continue this mixing. It's a huge added complexity.

~~~
kbd
> ... just to get Rust in

You make it sound like they're just cargo culting a shiny new language when in
fact they _invented_ the language in the first place to solve their complexity
problems with C++.

