
A Guide to Porting C/C++ to Rust - ingve
https://locka99.gitbooks.io/a-guide-to-porting-c-to-rust/content/
======
bluejekyll
I just started porting an application using corrode on a C library at work.
This was more of a POC, to show the team. Based on that, I think this book is
missing a section on incremental conversion.

You don't want to convert an entire application at once; you want to do it a
source file or module at a time. This means incorporating the Rust toolchain
into the C/C++ project's build system (Makefile or similar).

It is not difficult, and will increase confidence in the final product. This
can even be used to test incremental conversions in production, rather than
having any doubt about the entire conversion.

This is an awesome start for a great manual on the area. I'm very inspired by
the experience I had in doing this, and hope that we do more.

~~~
tom_mellior
> I just started porting an application using corrode on a C library at work.

Can you tell us a bit about your experience with Corrode? The Readme itself
states that it's not expected to work on "real" code, and there are open
issues like this:
[https://github.com/jameysharp/corrode/issues/52](https://github.com/jameysharp/corrode/issues/52)
titled "Arrays are broken!" in which Corrode's developer wrote half a year
ago: "The current hack gets arrays translated correctly in function arguments,
but gets them wrong everywhere else."

In your experience, does it work better than this makes it seem?

(To be somewhat fair, the issue does have detailed discussion, and partial
fixes seem to have been merged, but it's still open.)

~~~
bluejekyll
I didn't run into the arrays issue. The issue I filed has a list of all the
things I hit:

[https://github.com/jameysharp/corrode/issues/109](https://github.com/jameysharp/corrode/issues/109)

Most of the issues I ran into were not with the corrode tool itself, but with
compiler options, e.g. passing the DARWIN_C_LEVEL flag on macOS (my main dev
machine). The library I hacked on was only built for Linux, so I spent a lot
of my time making the C multi-platform, not because it was necessary, but
because I didn't feel like running a VM.

What I found is that corrode produces _mostly_ compilable code. I had played
around a bit with FFI from C to Rust before, so the task of integrating a Rust
library into a C codebase was already familiar to me. I would love to see a
guide for doing that in corrode though, like an example project, for others.
If I have time I might do that. One really cool thing is that the Rust
compiler detected some undefined behavior in our C code, which I filed a bug
for to our team (it was an uninitialized stack variable, passed as a pointer
to a function; in practice it would never actually rely on the undefined
behavior since all the fields of the struct were set in that function, but
it's still poor form).
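
To illustrate the out-parameter pattern described above: safe Rust refuses to read a variable that may be uninitialized, so "declare a struct, pass its address, let the callee fill it in" has to be spelled out explicitly. A minimal sketch, assuming a hypothetical `Config` struct and `init_config` function standing in for the real C code (which is not shown in this thread):

```rust
use std::mem::MaybeUninit;

// Hypothetical struct standing in for the one in the C library.
#[repr(C)]
#[derive(Debug, PartialEq)]
struct Config {
    retries: u32,
    timeout_ms: u32,
}

// Hypothetical stand-in for the C function: it writes every field of `out`.
extern "C" fn init_config(out: *mut Config) {
    // SAFETY: the caller passes a pointer to writable storage for a Config.
    unsafe {
        out.write(Config { retries: 3, timeout_ms: 500 });
    }
}

fn make_config() -> Config {
    // Explicitly uninitialized storage, instead of an implicitly
    // uninitialized stack variable as in the C version.
    let mut slot = MaybeUninit::<Config>::uninit();
    init_config(slot.as_mut_ptr());
    // SAFETY: init_config wrote the whole struct above.
    unsafe { slot.assume_init() }
}
```

The point of `MaybeUninit` here is that the "trust me, the callee fills it in" assumption becomes a visible `unsafe` step rather than something implicit.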

There are some things which you will need to massage, but I found in general
that it wasn't much. I had a problem with null function pointer creation that
I need to go back and look at.

I think my biggest issue is that every conversion produces _unsafe_ code in
Rust. It's left up to the developer to figure out how to convert it to safe
code. I think this is ok, but _unsafe_ is still dark arts for me in Rust
(meaning that I don't completely grok the best practices right now). I need to
go back and reread the Rustonomicon:
[https://doc.rust-lang.org/nightly/nomicon/](https://doc.rust-lang.org/nightly/nomicon/),
as there may be more details there than the last time I checked it out. I would
love it if corrode could try to produce more safe blocks of code, but I also
understand why it's this way.

Anyway, I think it's an awesome tool, works 95% of the time, and you can
probably easily come up with some standard techniques for going back to fix up
the commonly incorrect cases.

I'll reiterate this: integrate the Rust code into your C build right at the
beginning (I used Cargo + staticlib). I even made this optional in my setup,
so that you can flip between the two implementations easily for testing. I set
up a make recipe which converts the C source to Rust with corrode (to capture
all the CFLAGS needed), and which only runs when the .rs file doesn't already
exist. I found this to be very powerful for verification. I'd highly recommend
corrode at this point, but don't be misled into believing that you won't need
to make changes to the produced code.

Edit: I should mention that getting all of this set up, including the C
library fixes for macOS and the Makefile integration, took me 6 hours, with
about 10-15 minutes of that spent actually making the produced .rs file
compile.

------
Animats
The guide really isn't written yet. After the introductory material, it's
mostly a list of 'todo' items. [1]

The "corrode" translator apparently just compiles C to unmaintainable, unsafe
Rust. Raw pointers in C become unsafe raw pointers in Rust. That's probably
not too useful. See [2]. Now you have heavy technical debt. Also, losing all
the comments during translation is cruel to later maintenance programmers.

Good inter-language translation for programming languages is very hard.
Usually, you lose the idioms of both the source language and the target
language. There are few, if any, good systems for this.

To do this well, you need verification-like analysis of the source program.
You need to know if a pointer can ever be null. You need to know how big a
passed array is. You need to know if something has single ownership. If an
analyzer could look at a C program and answer "yes", "no", or "don't know" to
those questions, with a reasonably low percentage of "don't knows", then you
could do a translation to proper Rust.

You could brute-force some of this by running the Rust compiler from the
translator. Start out with everything being a single-owner reference, and
compile. Anything that generates a borrow checker error gets turned into a
reference counted item on the next try.
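
The two end states of that retry loop might look roughly like this when written by hand. This is only a sketch of the resulting types, not of the translator itself; `Node`, `Holder`, `consume` and `share` are made-up names:

```rust
use std::rc::Rc;

#[derive(Debug)]
struct Node {
    value: i32,
}

// First attempt: assume a single owner. The Box is moved into the
// function, so the caller cannot use it afterwards.
fn consume(node: Box<Node>) -> i32 {
    node.value
}

// Fallback after a borrow-checker error: the same data is reachable
// from two places, so it becomes reference counted.
struct Holder {
    node: Rc<Node>,
}

fn share(node: Rc<Node>) -> (Holder, Holder) {
    // Rc::clone bumps the reference count; it does not copy the Node.
    (Holder { node: Rc::clone(&node) }, Holder { node })
}
```

`Rc` only covers shared read access; shared mutation would additionally need something like `RefCell`, which is one reason a purely mechanical fallback is harder than it sounds.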

[1] [https://locka99.gitbooks.io/a-guide-to-porting-c-to-
rust/con...](https://locka99.gitbooks.io/a-guide-to-porting-c-to-
rust/content/porting-code.html)

[2]
[https://news.ycombinator.com/item?id=12056230](https://news.ycombinator.com/item?id=12056230)

~~~
Manishearth
> The "corrode" translator apparently just compiles C to unmaintainable,
> unsafe Rust.

The point is for it to be a first step in conversion, with large swathes of
code being easy to make safe.

The issue with idioms is a real one.

But still, it gives you a good path for incremental conversion which IMO is
pretty valuable.

~~~
Animats
It's too bad there aren't more Corrode examples online. But from what's
available, it just passes C raw pointers through function calls as unsafe Rust
raw pointers. The interfaces generated are no better than they were in C. You
can't rewrite one side or the other and get safety, because the interface
itself is unsafe. That's no good.

This isn't a syntax problem. It's a semantic problem.

~~~
bluejekyll
The interface is FFI, it has to be unsafe, because you can't guarantee
anything about the C code.

The Rust code, though, can be converted to pure Rust. A strategy I use, and
that most other FFI code I've seen uses, is to have a shim layer, which is
unsafe, that calls through to safe code, and vice versa for calling out to C.
The function definitions don't need to be unsafe, just the direct calls to C
and conversions to C types.
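
A rough sketch of that shim pattern, with made-up names (`sum_bytes` for the safe core, `sum_bytes_ffi` for the C-facing entry point); it assumes the C caller passes a pointer to `len` readable bytes:

```rust
use std::slice;

/// Safe core: ordinary Rust, no raw pointers anywhere.
pub fn sum_bytes(data: &[u8]) -> u32 {
    data.iter().map(|&b| u32::from(b)).sum()
}

/// Shim exposed to C. The function itself is not declared `unsafe`;
/// only the conversion from C types is.
/// When built as a staticlib you would also mark it `#[no_mangle]`
/// (spelled `#[unsafe(no_mangle)]` in the 2024 edition) so the C side
/// can link against the symbol name.
pub extern "C" fn sum_bytes_ffi(ptr: *const u8, len: usize) -> u32 {
    if ptr.is_null() {
        return 0;
    }
    // SAFETY: assumes the caller passed a pointer to `len` readable bytes.
    let data = unsafe { slice::from_raw_parts(ptr, len) };
    sum_bytes(data)
}
```

Everything below the shim can then be tested and refactored as normal safe Rust, while the `unsafe` surface stays limited to the pointer-to-slice conversion.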

------
ericpts
I find that most of the arguments given against C++ (such as memory leaks or
dangling pointers) can be avoided by using Modern C++ features, like never
allocating memory with new but using std::make_unique or avoiding null
pointers with not_null from GSL or std::optional.

~~~
bluejekyll
Is there a compiler option in C++ to only allow this safe variant of the
language to be used? I find this argument to be a false comparison to the
features of Rust, especially on dataraces, etc. I've never seen a 100% clean,
modern C++ codebase. If you know of one, could you point us to it?

Also, this is just me, given experience writing code in C++ and Rust, I find
Rust to be a more ergonomic language (YMMV).

~~~
ericpts
You can try to use some extra tools that clang provides, such as the static
analyzer ([https://clang-analyzer.llvm.org/](https://clang-analyzer.llvm.org/))
or tidy ([http://clang.llvm.org/extra/clang-tidy/](http://clang.llvm.org/extra/clang-tidy/))
with checks like the cppcoreguidelines.

~~~
pjmlp
And what do you do when clang isn't an option, due to customer, target
hardware, OS or even language extensions?

This is what I mean in another thread about outsourcing safety.

~~~
otabdeveloper
> And what do you do when clang isn't an option, due to customer, target
> hardware, OS or even language extensions?

Then you cannot use Rust and must settle for lack of safety. (A profoundly
silly question -- if modern C++ is not an option for whatever reason, then
Rust is doubly so.)

~~~
pjmlp
Only if you conflate languages with their implementations.

Just because the only existing Rust compiler uses LLVM, it doesn't mean
other implementations will not surface.

So, hypothetically, an OS not targeted by clang can still have a Rust compiler
in its SDK, offering all of Rust's safety guarantees.

For example, there are OSes and hardware architectures not supported by clang
that have Ada and even Java AOT compilers.

~~~
lossolo
You made a bad argument; put your ego in your pocket and admit it instead of
writing such nonsense. Both clang and the Rust compiler are based on LLVM, and
both can target only the platforms that LLVM targets. If you can't use clang
today, then you can't use Rust either. We are not talking about "hypothetical"
situations or alternative worlds, because in those situations I can say
whatever I want about clang too...

~~~
pjmlp
Nonsense only to those who don't understand compiler design.

If you want an option that works today without nonsense as you put it, then I
would pick Ada over C++ for secure critical projects in such scenarios.

Companies like LDRA do exist, because just using clang isn't enough.

[http://www.ldra.com/en/software-quality-test-
tools/group/by-...](http://www.ldra.com/en/software-quality-test-
tools/group/by-coding-standard)

Also a reason why HIC exists
[http://www.codingstandard.com/](http://www.codingstandard.com/)

~~~
lossolo
You didn't even reply to what I wrote. We are talking about the Rust and Clang
compilers and the platforms they target; that was the context. It started when
you tried to belittle Clang because it shows warnings, and then you tried to
make an argument about a hypothetical Rust compiler that can target an OS that
Clang can't, again trying to show Rust > C++. Even when you can't show the
superiority of Rust over C++, you invent hypothetical compilers that you think
will work as your argument. Anyone who reads your comments can see you are
obviously biased, and your ego just magnifies that effect. Think about it
objectively, get some air; there is a world besides HN.

------
jeffdavis
I'd really like to see some guides for using rust _incrementally_ in a large C
or C++ codebase.

* What are some good ways to deal with a lot of C macros and C/C++ header files in general?

* What is a good build strategy for integrating with a big, say, cmake build system?

* How to choose APIs, other than always using the lowest common denominator (C) for everything? Put another way, how do you choose which parts to write in rust first so that you don't end up with rust code that looks like it's translated from C?

* How to work with other memory allocation strategies, like a region allocator (like postgres memory contexts)?

------
catnaroek
> A language like Java _would have the reliability angle covered_ but is hard
> to make performant.

ConcurrentModificationException.

~~~
the8472
That is a runtime check, similar to how RefCell::borrow() panics if you try to
read from it while something else is mutating it.
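
A small sketch of that runtime check. It uses `try_borrow()`, the non-panicking sibling of `borrow()`, so the conflict shows up as an `Err` instead of a panic; the function name is made up:

```rust
use std::cell::RefCell;

// Returns true if a read attempt is rejected while a mutable borrow is live.
fn read_rejected_while_mutating(cell: &RefCell<Vec<i32>>) -> bool {
    // Hold a mutable borrow for the rest of this function.
    let _writer = cell.borrow_mut();
    // A plain `cell.borrow()` would panic here; try_borrow reports
    // the same conflict as an error value.
    cell.try_borrow().is_err()
}
```

Once the mutable borrow goes out of scope, `borrow()` succeeds again, so the check tracks live borrows rather than modification history.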

~~~
catnaroek
With two important differences:

(0) RefCell can't be used in a multithreaded context. You have to use an
actual Mutex.

(1) ConcurrentModificationException isn't reliable. It's thrown on a best-
effort basis.

In any case, in a high-level language, just memory-safety is a very cheap
guarantee. You also want to protect the internal invariants of your data
structures - especially the concurrent ones!
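
On point (0): `RefCell` is not `Sync`, so sharing one between threads is a compile error, and the multithreaded counterpart is a `Mutex`. A minimal sketch with a made-up `parallel_count` helper:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Increments a shared counter from several threads and returns the total.
fn parallel_count(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // lock() blocks until this thread has exclusive access.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```

Replacing the `Arc<Mutex<usize>>` with an `Rc<RefCell<usize>>` makes `thread::spawn` refuse to compile, which is the difference from a best-effort runtime exception.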

------
shmerl
This looks interesting, though I can't find much info about inheritance and
code reuse there, which should be a major topic I suppose? Who is the author
of this book?

~~~
Manishearth
The book is incomplete.

Porting an inheritance model to Rust is ... interesting. You can emulate it
with some macros, but usually you should redesign the idioms used based on the
exact use.

~~~
shmerl
Yes, I'm not talking about porting things 1:1. I'm talking about explaining
the OOP approach that Rust suggests, and maybe examples of how this methodology
can be applied to solve issues like code reuse, which C++ at times solves with
inheritance, etc.
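
As one hedged example of the kind of thing being asked for: a common way to get code reuse without inheritance is a trait with a default method. The `Shape`, `Square` and `Rect` names below are invented for illustration and are not taken from the book:

```rust
trait Shape {
    // Each implementor supplies these two.
    fn name(&self) -> String;
    fn area(&self) -> f64;

    // Default method: shared behaviour that a C++ design might put
    // in a base class. Implementors get it for free.
    fn describe(&self) -> String {
        format!("{} with area {:.1}", self.name(), self.area())
    }
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn name(&self) -> String {
        "square".to_string()
    }
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn name(&self) -> String {
        "rect".to_string()
    }
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}
```

Traits share behaviour but not fields, so designs that rely on inherited data members usually need composition instead.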

------
ar15saveslives
The main problem is what to do with tons of opensource/thirdparty C++
libraries.

~~~
quickben
Nothing. Unless you think rust will become the next c++, and the world will
move to it en masse, in which case people still won't rewrite most of the old
libraries to it.

At best people can hope that somebody comes up with a neat import system of
c++ namespaces, a rock solid debugger and a fluid ide.

Rust will be like Go. Both pushed by corporations, and probably follow the
fate of cobol, if history is to be asked.

~~~
moomin
Since COBOL is the longest actively used language I wouldn't say that was so
bad.

~~~
quickben
If anything it was good for the overall good of our ecosystem. People coded
it, used it, built with it. We all are in a better place knowledge and
experience wise. New languages generally are good.

------
geezerjay
The book appears to be promising, but it is still far from being complete and
doesn't touch the main points of interest, such as inheritance and problems
arising from any form of template specialization.

Perhaps the author could get the book up to a presentable point before
advertising it.

~~~
jhoechtl
There was a rant earlier this week in the same vein: instead of presenting
something premature, check back later once it's done.

I think it is still worthwhile to present something you have instead of waiting
until it's done. Also, the understanding of "once it's done" varies with the use
case. This book, unfinished in your opinion, might be of use to others.

It is also good to have something written and share it with others to get
feedback (as you did) at an early stage.

------
static_noise
TODO: write the rest of the comment.

~~~
Kenji
I bet most people just read the first page of this guide and thought "yeah, I
should upvote that, I'm gonna read it later" and don't realize that the
article is full of gaping holes and TODOs which really devalues the entire
work. I started reading it seriously and as the TODOs came in masses, I was
disappointed and put it away.

