
Implications of Rewriting a Browser Component in Rust - zwliew
https://hacks.mozilla.org/2019/02/rewriting-a-browser-component-in-rust/
======
atoav
This is in tune with my own experience using Rust in production: it can stop
you from doing certain classes of mistakes, but it won't stop you from doing
stupid things.

But the idea that I don't have to think about certain classes of problems
allows me to give these stupid things more focus, which is surprisingly
refreshing.

The predictable nature of Rust was so refreshing for me that I ended up using
it even for smaller reusable scripts where I would happily have used Python
before but soon got annoyed with obvious errors that would only show up once
you run a program.

If you e.g. have a `print foo` in some obscure branch that rarely happens,
that print will ruin your day if you use Python 3. If Python were a little
like Rust, you would get a hint or error on save (or at least on compile)
saying that the print should look like this: `print(foo)` for Python 3. You
can be incredibly careful and Rust will still catch things now and then that
would have gone unnoticed into production unless you have immense test
coverage.

I _like_ Rust for the experience I had with it. It definitely changed how I
approach certain problems in a very good and productive way, even when I don't
use it.

~~~
asdkhadsj
Agree completely. For a bit of my own story, a year+ ago I had the option to
write a project in Rust and evaluated it vs Go. Long story short, I tried
rust, and it was a massive headache and I failed. We used Go (as I had been
for ~4 years).

Fast forward to ~2 months ago, a work project dictated tight control over
memory which, while possible in Go, had me looking at alternatives. I decided
to give Rust another try. This time it wasn't just an evaluation, it needed
to work, so I bought a Rust book and spent some after-hours time
learning/etc.

This time, Rust has been an absolute joy. I have no understanding why last
time was so painful, and this time it's been so amazing. Maybe it was the
book[1]? Maybe it was just a 2nd round of learning based on my previous
experience? Regardless, it's been great.

There's just so many mental overheads like what is concurrent safe, what is
non-null, etc that are just great to not think about anymore. On top of that,
the formatter and LSP are just great. It highlights in my text editor
(Kakoune) what variable caused an error, where it gets moved incorrectly, etc.
So much just works, it's great.

My _only_ complaint these days is:

1. I find it odd that some things like slice reads can still panic by
default. Yes, I can use `foo.get(1)` to avoid panics, but still - it's a bit
odd to me.

2. I'm anxiously awaiting async/await. It's quite difficult to be patient.
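
To make the first complaint concrete, here is a minimal sketch of the two read styles. The helper names `read_or_zero` and `read_indexed` are made up for illustration and are not from the article.

```rust
/// Hypothetical helper: returns the element at `i`, or 0 when `i` is out of range.
fn read_or_zero(xs: &[i32], i: usize) -> i32 {
    // `get` returns an Option<&i32> instead of panicking.
    xs.get(i).copied().unwrap_or(0)
}

/// Hypothetical helper: plain indexing, which panics when `i` is out of range.
fn read_indexed(xs: &[i32], i: usize) -> i32 {
    xs[i]
}

fn main() {
    let foo = vec![10, 20, 30];
    println!("{}", read_or_zero(&foo, 1)); // in range
    println!("{}", read_or_zero(&foo, 7)); // out of range, falls back to 0
    println!("{}", read_indexed(&foo, 1)); // in range; index 7 would panic here
}
```

So the non-panicking read exists, it is just not the one you get from the shortest syntax.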

[1]: Programming Rust: Fast, Safe Systems Development

~~~
epage
> 1. I find it odd that some things like slice reads can still panic by
> default. Yes, I can use `foo.get(1)` to avoid panics, but still - it's a bit
> odd to me.

I wonder if this is similar to C++'s `[]` vs `at`. `at` does implicit bounds
checking but, as an optimization, if you are already doing an explicit bounds
check, you can elide the implicit check via `[]`.

~~~
masklinn
Except Rust's [] behaves like C++'s at (checks and aborts). The Rust
equivalent of C++'s [] is called `get_unchecked`.
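
A rough sketch of that pattern, assuming you do one explicit bounds check up front and then skip the per-element check. `sum_first` is a hypothetical helper, and `get_unchecked` has to be called inside `unsafe`.

```rust
/// Hypothetical helper: sums the first `n` elements, checking the bound once.
fn sum_first(xs: &[u32], n: usize) -> Option<u64> {
    // One explicit check covers every access in the loop below.
    if n > xs.len() {
        return None;
    }
    let mut total = 0u64;
    for i in 0..n {
        // SAFETY: i < n and n <= xs.len() was verified above.
        total += u64::from(unsafe { *xs.get_unchecked(i) });
    }
    Some(total)
}

fn main() {
    let data = [1, 2, 3, 4];
    println!("{:?}", sum_first(&data, 3));
    println!("{:?}", sum_first(&data, 9));
}
```

In practice the optimizer can often remove the checks from the safe version of a loop like this, so the unsafe call is only worth it where profiling says so.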

~~~
int_19h
C++ at() doesn't check and abort, it checks and throws a specific documented
exception that you can catch. So it is, in fact, closer to get(), just using a
very inefficient way of reporting.

~~~
masklinn
> C++ at() doesn't check and abort, it checks and throws a specific documented
> exception that you can catch. So it is, in fact, closer to get(), just using
> a very inefficient way of reporting.

You can catch a panic, and you can compile C++ with -fno-exceptions. at() is
not closer to get() than to [].
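
For illustration, a minimal sketch of catching the panic from `[]`. `index_or_none` is a hypothetical helper, and this only works with the default unwinding panic strategy, not with `panic = "abort"`.

```rust
use std::panic;

/// Hypothetical helper: turns the panic from `xs[i]` into `None`.
/// Shown only to illustrate that the panic can be caught; `xs.get(i)` is the
/// normal way to get an Option.
fn index_or_none(xs: &[i32], i: usize) -> Option<i32> {
    panic::catch_unwind(|| xs[i]).ok()
}

fn main() {
    let v = vec![1, 2, 3];
    println!("{:?}", index_or_none(&v, 1));
    // The default panic hook still prints a message to stderr for this call.
    println!("{:?}", index_or_none(&v, 9));
}
```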

~~~
pjmlp
If you do that, you will be invoking nasal daemons, as _at()_ is required by
ISO C++ to throw.

~~~
jcelerier
thankfully, ISO C++ is a language that no one actually programs in, everyone
uses `MSVC C++`, `g++ 8.2.1 -fwhatever`, etc

~~~
pjmlp
In the real world, where code portability actually matters, many do program
against ISO C++, and have to deal with workarounds for lack of compliance.

Not doing so means ending up with situations like the Linux kernel, Windows or
console games, which might be ok, when code portability doesn't matter to
start with.

------
rkangel
It's nice to see a balanced, real world, case study including 'these things
are fixed by Rust', 'these are problems that don't occur in idiomatic Rust',
and 'these are problems that Rust can't help you with'.

I'm a big fan of Rust, but the one sided 'Rust makes all the problems go away'
articles don't provide any value.

~~~
hedora
Overall, it did a decent job of being balanced, but I don’t buy the memory
overflow example at all.

For one thing, idiomatic C++ bounds checks by default. You need to use at().
If you don’t like typing at(), you can implement an array type that always
bounds checks fairly easily. On that note, the vulnerable c++ code should be
using accessors, not indexing to access the oddly packed and laid out array.
Even the fixed version wouldn’t pass a code review from me. You could write
equivalently bad code in any language that supports array types, and get
similarly broken results.

For another thing, there’s no evidence that you couldn’t achieve the same
improved data structures in C++ using its type system (which is turing
complete...)

The “thread safe by default” property sounds interesting; I’d be interested in
reading more about that.

~~~
masklinn
> For one thing, idiomatic C++ bounds checks by default. You need to use at().

Sounds like it doesn't check by default then. It checks if you remember to
check using the more verbose bounds-checking method. Not unlike the issues
with subscripting std::map.

~~~
TheAsprngHacker
I have some experience in C++, and I am familiar enough with the standard
library to remember that operator[] doesn't check bounds while the at member
function does. I would assume that the Firefox C++ programmers know this as
well. However, maybe I'm wrong or have too much faith?

Or, maybe, the programmer was aware that operator[] didn't perform bounds
checking, but opted to use it for some reason? A good way to dissuade people
from making unidiomatic choices is to make them more verbose. IMO calling the
at function isn't particularly verbose, but if the member function that didn't
check bounds were called something like "at_unchecked," perhaps people would
be less inclined to use it.

Also, from the snippet in the blog post, note that you can't tell whether the
Firefox code used std::vector, C-style arrays, or some non-STL container type.
Projects may use their own container types, but your criticism only applies if
the programmers were using the C++ standard library.

~~~
masklinn
> I have some experience in C++, and I am familiar enough with the standard
> library to remember that operator[] doesn't check bounds while the at member
> function does.

Everybody knows you're supposed to check pointers for being null, and yet time
and time again developers fail.

As long as you rely on human nature and provide one API which is simple,
convenient, obvious and dangerous and one which is complex, inconvenient, non-
obvious and safe, you will just drive users towards the former.

> I would assume that the Firefox C++ programmers know this as well. However,
> maybe I'm wrong or have too much faith?

Just because they know when quizzed doesn't mean they'll always remember when
actually doing. Even less so when subscripting is safe in pretty much every
other language which provides array subscripting, and ::at… only exists in
C++?

> IMO calling the at function isn't particularly verbose

No, but it's still more verbose and less intuitive than [], especially given
the above (that tons of languages use [], and very few have an at method)

> A good way to dissuade people from making unidiomatic choices is to make
> them more verbose.

Indeed.

~~~
svnpenn
this is so true

and has been my experience with unwrap

they dont want people to use it but the alternative is so verbose and clunky

~~~
FreeFull
The alternative tends to be to propagate the error upwards using the `?`
operator, up to some point where it makes sense to handle errors
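
A small sketch of what that looks like, with a made-up `parse_pair` function: each `?` returns early with the error, and only the caller decides what to do about it.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: parses two integers, propagating the first failure.
fn parse_pair(a: &str, b: &str) -> Result<(i32, i32), ParseIntError> {
    // `?` unwraps the Ok value or returns the Err to our caller.
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok((x, y))
}

fn main() {
    // The error is handled here, at the point where it makes sense.
    match parse_pair("4", "oops") {
        Ok((x, y)) => println!("sum = {}", x + y),
        Err(e) => eprintln!("bad input: {}", e),
    }
}
```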

~~~
barrkel
Aka exceptions.

Yes, chaps, that Result<T,E> type is all but isomorphic with checked
exceptions, Java-style.

~~~
masklinn
The "but" is where all the difference lies though, Result (or Either or
whatever you want to call it) is the reification of the sum of a return value
and an error, and as such manipulable without having to add dedicated tooling…
(which java didn't have either, and still does not).

Amongst other issues it's possible to pipe one through a generic wrapper
without that wrapper having to care about it.

e.g. let's say you have an input collection, you map() over it, and the map
callback can fail.

In Rust or Haskell you… just do that. And the caller deals with a collection
of results however it wants.

In Swift, you need map to be specifically annotated in `rethrow` so it can be
transparent to failure (aka can't fail if its callback can't, but can if its
callback can).

In Java, you're shit out of luck and jolly well fucked, your generic map can't
be generic over generic exceptions, so either its callback can't fail or you
need to wrap said callback to convert the checked exception into an unchecked
one, and possibly back again outside the map.

So… yeah, they're "all but isomorphic" because they're both implementations of
the concept of statically checked fallibility. It's just that java's checked
exceptions[0] are a bad implementation of the concept.

Put another way, a 2018 fiesta or yaris are "all but isomorphic with" a 1960
corvair or a pinto, but you couldn't pay me to take a road trip in a corvair
or a pinto.

[0] java's because someone might come up with better ones, though the well's
been pretty tainted at this point
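
The map example above, sketched in Rust with made-up function names: `map` itself knows nothing about failure, and the caller chooses whether to keep every result or stop at the first error.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: keeps one Result per input element.
fn parse_all(inputs: &[&str]) -> Vec<Result<i32, ParseIntError>> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}

/// Hypothetical helper: same map, but collected into a single Result that
/// holds either every parsed value or the first error encountered.
fn parse_all_or_first_error(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}

fn main() {
    let inputs = ["1", "x", "3"];
    println!("{:?}", parse_all(&inputs));
    println!("{:?}", parse_all_or_first_error(&inputs));
}
```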

~~~
barrkel
Lack of parameterisation of exception signature is not the reason why checked
exceptions are a bad idea, though. Failure is a function of implementation
details; runtime errors are fundamentally an abstraction violation. There's no
escape.

~~~
masklinn
> Lack of parameterisation of exception signature is not the reason why
> checked exceptions are a bad idea

It absolutely is one of the reasons why they are a bad idea, and why reified
results are so much more useable and useful.

> Failure is a function of implementation details

In the original intention, that's what unchecked exceptions were for, with
checked exceptions for the reporting of errors rather than failure. That's why
java didn't have _only_ checked exceptions.

~~~
barrkel
Unchecked exceptions were for programmer errors, errors that could be avoided
with a sufficiently careful programmer: null pointer check, a divisor not zero
check, a list bounds check, etc.

Checked exceptions were for unavoidable errors. Errors that are fundamental to
the operation being attempted. They usually occur because the operation
interacts with the world outside the program, which means the program is
subject to violations of expectations. Network errors. File not found when you
try to open it - you cannot test for file existence first without a race.

The reality is that the latter kind of errors are better off as overloaded
call signatures: one call variant when you care about the error and want to
catch it, another variant when you don't care about specifics of the error and
want the whole stack to unwind when the expectation is violated.

Neither of these approaches requires checked exception signatures throughout
the stack (or Result types for that matter - you can assume I also mean those,
due to the isomorphism).

There's a reason .net modeled number parsing with Parse() and TryParse()
instead of throwing NumberFormatException like parseInt(). It's because
sometimes you care - input came from user and you need to handle it - and
sometimes you don't - input came from configuration file and the stack needs
to be torn down if you can't parse it.
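
The same two-variant shape can be sketched in Rust, purely as an illustration. The names `try_parse_port` and `parse_port` are hypothetical, not an existing API: one variant hands the failure to the caller, the other unwinds the stack.

```rust
/// Hypothetical "TryParse" variant: the caller cares and handles the failure.
fn try_parse_port(s: &str) -> Option<u16> {
    s.trim().parse().ok()
}

/// Hypothetical "Parse" variant: the caller does not care about specifics,
/// so a bad value panics and unwinds.
fn parse_port(s: &str) -> u16 {
    s.trim().parse().expect("config: port must be a number in 0..=65535")
}

fn main() {
    // User input: handle the failure locally.
    match try_parse_port("http") {
        Some(p) => println!("port {}", p),
        None => println!("please enter a number"),
    }
    // Configuration: a bad value here would tear the stack down.
    println!("port {}", parse_port(" 443 "));
}
```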

Picture a stack that looks like this:

    
    
        0: <operation that may throw exception of the checked variety>
        1: <code that may be interested in handling error>
        2: <code that doesn't know about implementation details>
           .... could be 10, 100, 200+ methods in this stack dump
        N-1: <code that doesn't know about implementation details>
        N: <request handler or event loop that catches all exceptions>
    

The great problem with checked exceptions is the methods for stack frames 2 to
N-1. Either exceptions are handled at stack entry 1, or at stack entry N. The
only job of all the code between is to pass exceptions unmolested back to N.

Those calls may be dynamically bound (whether via vtables or function
pointers, objects or closures) and / or dynamically linked (so unavailable to
a type system at compile time). In large production programs, control flow
will be dynamically determined. It's a fact of life; if it's not for testing
purposes, it'll be for deployment flexibility.

I think there's value in the IO monad; in marking functions as the kind of
functions that may interact with the outside world. And checked exceptions can
work this way. But not unless the error type has a polymorphic storage
location, and unwinding the stack is syntactically weightless. I don't ever
want to have to change the signature on dozens of methods just because
there's a new implementation detail deep inside a dynamically linked
abstraction.

And exception / error wrapping isn't the answer either - it's almost always a
bad idea.

------
mrath
I primarily use Java for my job. Security and memory related features of Rust
are not an advantage compared to Java. But I like rust because it feels modern
and produces efficient standalone binaries. Most of my hobby projects are in
Rust now. But I would not rewrite any of my work projects in Rust even though
they require ultimate performance. That would be a maintenance burden.

It is great to see people rewriting in Rust where it makes sense.

~~~
ianlevesque
I use Java constantly in my job and recently tried rewriting a math & memory
heavy component in rust to see what performance gains there might be.
Surprisingly (to me) the naive rust version was ~15% slower than Java. There’s
probably room for more rust optimization but it was interesting that
“efficient standalone binaries” doesn’t automatically mean faster too when
competing with HotSpot.

~~~
alex_duf
When comparing speed on the JVM with speed with native languages, the only
positive side you get from native binaries is cold start nowadays.

For a webapp this doesn't matter, but as we're moving towards more cloud
functions it starts to make a lot of sense.

That needs to be ponderated by the fact any real life application will have to
access the network at bootstrap to load configuration and therefore your
bottleneck will most likely be I/O.

~~~
chrisweekly
> "ponderated"

Huh! I have a fairly strong vocabulary and thought this might have been a
made-up word -- but apparently it means "to weigh down or give substance to",
which aligns with your intended point here. TIL

~~~
alex_duf
Sorry that's directly brought from French.

English is my second language. I did a quick google search just to make sure
it was correct but it's hard to gauge if a word that happens to be in the
dictionary is also widely understood.

------
ilovecaching
Rust is often sold feature by feature; the borrow checker offers proof-like
safety over fuzzing, cargo provides real versioned package management over
makefiles or git commits...

I choose Rust because taken as a whole, Rust changed the way I approached
laying out my memory and how I composed my code. I think this more than
anything leads to fewer issues than the equivalent C++. The article points out
that the Rust and C++ solutions to any given problem are going to be
completely different.

My only desire for Rust is to see compile times speed up and the C++ interop
to improve.

~~~
mrath
Yes compilation times are a big pain point. I heard that there is work being
done in this area.

------
jupp0r
One of the major hurdles in rewriting parts of C++ projects in Rust is that
the interop surface between both languages is C. The necessary interface layer
has created more bugs and work than the conversion saved. I'd really like to
see more high-level interoperability between the two languages in the future,
although C++ is a pretty fast-moving target at this point, with all the
changes in C++20.

~~~
jcranmer
Honestly, I wish a few different major languages would get together and start
developing system ABIs that move beyond C as the interchange language.

~~~
wilsonthewhale
But what would you put in such an ABI beyond what's in the C ABI? Beyond basic
data types, struct definitions, and function definitions, languages begin to
wildly diverge almost immediately.

~~~
jcranmer
Multiple return values. Ownership annotations. SIMD type support. String
charset information.

More challenging would be to include stuff like allocation management (how to
free a pointer when it's done), GC integration, function boxing,
iterator/generator support. Vtables and cross-language inheritance is
interesting but difficult.

One point of clarity: this would be an FFI ABI, not necessarily expecting that
most datatypes would be laid out according to the ABI.

~~~
ChrisSD
> Vtables and cross-language inheritance is interesting but difficult.

Windows' COM "solved" that problem. For example see C++/WinRT[0] for a modern
C++ interface to COM. Or the C# .NET equivalents.

[0]
[https://github.com/Microsoft/cppwinrt](https://github.com/Microsoft/cppwinrt)

~~~
int_19h
Take a look at this.

[https://github.com/Microsoft/xlang](https://github.com/Microsoft/xlang)

------
rini17
Does Rust allow for taint analysis too, like Perl has had for a long time? If
not, I'd say it's a missed opportunity.

(It marks all untrusted input as tainted and programmer must explicitly parse
the data or mark them untainted to pass them further.)

~~~
palotasb
In Rust, or any statically typed language such as C++ or Java, the idiomatic
way to handle untrusted input is to treat it as a "bag of bytes" before you
access it. Then either parse it into a strongly typed object or bail out of
parsing. The strongly typed object is safe to use. Bailing out (throwing an
exception or returning an error type) does not allow the program to continue
assuming that the (malformed) input was correct.

~~~
chopin
More to the point, you should put untrusted input into a different type from
trusted input. As much as I admire the design of the servlet API I think the
biggest mistake is that everything is transmitted as Strings. The input
characters should have had a different type than the output characters.
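
As a sketch of that idea in Rust, with hypothetical type names and a made-up validation rule: the raw input lives in one type, and the only way to reach the type that the rest of the program accepts is through a parse step that can fail.

```rust
/// Hypothetical wrapper for input that has not been validated yet.
pub struct Untrusted(String);

/// Hypothetical validated type: 1 to 32 ASCII alphanumeric characters.
#[derive(Debug, PartialEq)]
pub struct Username(String);

impl Untrusted {
    pub fn new(raw: impl Into<String>) -> Self {
        Untrusted(raw.into())
    }

    /// Consumes the untrusted value: parse it into a Username or bail out.
    pub fn parse_username(self) -> Result<Username, String> {
        let s = self.0;
        let valid = !s.is_empty()
            && s.len() <= 32
            && s.chars().all(|c| c.is_ascii_alphanumeric());
        if valid {
            Ok(Username(s))
        } else {
            Err(format!("rejected input of {} bytes", s.len()))
        }
    }
}

impl Username {
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Only accepts the validated type, so raw input cannot be passed by accident.
fn greet(user: &Username) -> String {
    format!("hello, {}", user.as_str())
}

fn main() {
    match Untrusted::new("alice42").parse_username() {
        Ok(user) => println!("{}", greet(&user)),
        Err(e) => eprintln!("{}", e),
    }
}
```

This is not taint tracking in the Perl sense, since nothing follows the data at run time. The type checker simply refuses to pass an `Untrusted` where a `Username` is expected.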

------
guscost
Recently some colleagues started using the type annotations in the latest
python3. Really excited for this feature! It’s going to make a lot of our
production systems safer to work with.

And of course Rust is a great technology, etc.

------
herogreen
Can you build in some kind of "unsafe release" mode, so that every array
bounds check requested in the code is skipped? If not, would it be an
interesting feature?

~~~
steveklabnik
No. Such a thing could only remove some kinds of checks; for example, if you
see that code sample later in the thread with a manual check, it wouldn't know
that's what you're doing.

In general, we don't want to make it easy to turn checks off. They get removed
if the compiler can prove they're not needed; if they're there, they're almost
always for good reason.

------
tonetheman
Meh. The whole thing seems weird to me.

We totally tried to write this twice then we switched to language x and
everything is great. Feels like something a language zealot would say. I would
scoff if someone at my company rewrote a core section in a different language.
It is their language so maybe they just told them to do it that way ha.

------
gubbrora
> could have been caught by a run time bounds check

And here I thought rust was all about zero cost abstractions.

~~~
blub
It's not, that's C++.

Rust requires a lot of runtime checks, but that's the price one has to pay for
memory safety.

~~~
sciurus
"Abstraction without overhead" is explicitly a goal of Rust.

[https://blog.rust-lang.org/2015/05/11/traits.html](https://blog.rust-lang.org/2015/05/11/traits.html)

~~~
blub
From your link:

"C++ implementations obey the zero-overhead principle: What you don't use, you
don't pay for".

In Rust you pay for e.g. bounds checking or integer overflow handling or
optionals. But I don't understand why everyone's getting so defensive about
this, since it's the only way (static verification and proofs aside) to get
the desired safety characteristics...

~~~
carlmr
In Rust release builds the integer overflow checks are disabled. They're
not zero cost and there's no magical way to do it, so it's only done in debug
builds.

You can disable them in debug builds as well with wrapping integers. So this
is again a practice of making the safer option the default, but won't incur
runtime costs if it bothers you.

If there is a way to do it at compile time that will usually be used in rust.
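
A short sketch of the explicit options, using standard library methods plus two made-up wrapper functions. These behave the same way in debug and release builds.

```rust
use std::num::Wrapping;

/// Hypothetical helper: addition that wraps around in every build profile.
fn add_wrapping(a: u8, b: u8) -> u8 {
    (Wrapping(a) + Wrapping(b)).0
}

/// Hypothetical helper: addition that reports overflow as `None`.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

fn main() {
    println!("{}", add_wrapping(250, 10)); // 260 mod 256
    println!("{:?}", add_checked(250, 10)); // overflow is reported, not hidden
    println!("{}", 250u8.saturating_add(10)); // clamps at u8::MAX
}
```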

------
rujuladanh
The article is arguing that Rust somehow has better capabilities than C++ to
fight memory-related bugs, but the example vulnerability given is not
something Rust can solve nor is more powerful than C++ in its “bug catching”
capabilities regarding this kind of bug.

Concretely, the article claims that in Rust the vulnerability doesn’t become a
bigger problem because it simply crashes at run-time due to built-in bounds
checking. True, but that is also the case with C++ if you were using
the equivalent Vec type with mandatory bounds checking - which many projects
do (and, critically, enforce).

Personally, I like what Rust brought to the compiler/language world. However,
some people are definitely overstating the case. Most non-trivial memory-safety
errors and vulnerabilities are related to runtime problems like the example
shown. In these, no language can help in the general case - we are not solving
the Halting Problem. Therefore, saying Rust is immune to memory-related
problems is not true. It is true, however, that those bugs will not trigger
anything worse than a crash if there are no unsafe blocks. The same is true of
many other common languages out there (Java, C# and many others).

In the same way, I have seen people (and even the linked blog) claim Rust is
free of race conditions or thread-safety issues (even if it introduced great
ideas to write correct code).

Giving a false sense of security is the worst thing we can do.

~~~
scoutt
A system crash is a bug. Period. In many cases it could lead to Denial of
Service. An insulin pump can stop working.

I remember when C# came out almost 20 years ago. People said "I can forget
about managing memory so I can focus on the logic". Programs kept crashing,
memory problems were still there.

The article goes with "...remove the burden of memory safety from our
shoulders, allowing us to focus on logical correctness and soundness
instead...". More or less the same, and admitting that said problems won't go
away.

But here we are, it's 2019 and we're still using C/C++ as if nothing happened.

~~~
empath75
A crash is a bug but not a security problem.

~~~
scoutt
Such an assertion requires that a crash will never produce a security problem.
But for example...

"Families are LOCKED OUT of or INSIDE their homes as Yale 'smart' security app
crashes leaving dozens stranded"

[https://www.dailymail.co.uk/news/article-6268379/People-
lock...](https://www.dailymail.co.uk/news/article-6268379/People-locked-
houses-Yale-smart-security-crashes.html)

"Households up and down the UK were unable to lock or unlock their doors"

An unlocked door is a security problem too...

~~~
empath75
That's still a bug. I feel like you're being intentionally obtuse about what's
considered to be a security problem in code. Nobody ever suggested that rust
code can never crash or have bugs. It's just about memory safety, which
obviously has nothing to do with door locks.

~~~
scoutt
Sorry if I gave that impression. It was not my intention.

> A crash is a bug but not a security problem.

I think that all bugs, the ones that produce crashes and the security bugs,
should all be treated equally. A bug is a bug, whether it has security
implications or not.

To me, the article gives the impression that a system crash is not a security
problem, because a Rust program will _"terminate in a controlled fashion,
preventing any illegal access"_. But one can, for example, fingerprint a
system by forcing it to crash.

And of course, nobody expects that Rust will prevent bugs from happening, but
at the same time I don't get the fixation on drawing a distinction between
_security bugs_ and _bugs_.

"security problems are just bugs" \- Linus Torvalds.
([http://lkml.iu.edu/hypermail/linux/kernel/1711.2/01701.html](http://lkml.iu.edu/hypermail/linux/kernel/1711.2/01701.html))

edit: Linus reference.

~~~
empath75
Security problems are bugs but not all bugs are security problems.

