
Building on Rock, Not Sand - clouddrover
http://robert.ocallahan.org/2017/10/building-on-rock-not-sand.html
======
dsign
We run high-load web services in something-other-than-c with managed memory
and checked array access. It is native, 8/9 as fast as C, and it works. We
will never have the budget to pay for the additional 1/9th that C would afford
us, in terms of security concerns.

Yet, every day, I have to field questions about why our low-level,
network-facing system code is not written in C... old prejudices die hard.

~~~
rixed
How do you measure performance against C? Did you implement a rival unsafe C
version to compete with, or is this just a guess? If so, what makes you
believe C would only yield a 10% improvement?

I'm asking because my own gut estimate of the cost of automatic memory
management is significantly higher.

~~~
Quarrelsome
Isn't this exactly the sort of line of questioning that the OP is complaining
about?

In my experience of managed software 99.9% of performance problems are due to
stupids and 0.1% are due to marshalling and GC costs. An average unmanaged
developer seems to assume the opposite.

~~~
blub
Some managed languages can have pretty good performance, but at significant
memory usage cost. Performance is not a single dimension.

It's a very legitimate question to ask how it was measured.

Obviously they're happy with their performance and we can't comment on that,
but the other claims regarding speed vs. C were more general.

~~~
nine_k
Look at the cost equation. Whatever additional resources they need to spend is
a payment for improved safety / security.

It's like saying that a reinforced door with a lock is more expensive and
heavy than a regular cardboard-and-planks one, and is slower to open. It is!
But it has _other_ advantages, not attainable otherwise.

~~~
blub
By managed languages I understand memory managed, aka GC languages.

It's not correct to say that they use additional resources to provide memory
safety or security.

Python for instance uses additional resources because it simply doesn't have a
performant implementation and Java uses additional memory to gain execution
speed.

Swift and Rust are memory safe, but don't need a GC, so those memory safety
advantages are in fact attainable otherwise.

------
jeremysalwen
If you really want to build on "rock", check out Bedrock:
[http://plv.csail.mit.edu/bedrock/](http://plv.csail.mit.edu/bedrock/) . Don't
just prove your program is memory safe, prove it satisfies arbitrary
properties! (A bit tongue in cheek, but the tools are really cool). Check out
[http://adam.chlipala.net/cpdt/cpdt.pdf](http://adam.chlipala.net/cpdt/cpdt.pdf)
for a nice introduction.

~~~
bpicolo
Very formal sorts of languages are super interesting. They might tend towards
striking the wrong balance, though.

Rust is super interesting to me because it prevents the most common
vulnerabilities w/regards to memory safety that show up time and again, but
it's accessible enough that it's got a big community and a ton of high quality
libraries. I think that's a pretty awesomely potent combination.

~~~
bjz_
For the short term and mid term yes. However I do hope that in the long term
the ideas from ATS, Liquid Haskell, Idris, F*, etc. will be iterated on in
order to make them more accessible and usable. In the mean time we'll probably
start to see Rust graft on more and more correctness tools like dependent
types and SMT solvers, and it will begin to get a little ungainly, like
dependently typed Haskell is. But there's only so many ideas you can try out
in a new language, and Rust has made the right decisions on that front in the
interests of getting into production this decade. :)

------
sanxiyn
Note that you _can_ write safe C code. seL4 is written in C! (And verified in
Isabelle.) If that is too hard core, PolarSSL is normal-looking C code that
has nonetheless been proved to lack any buffer overflow.
[https://trust-in-soft.com/polarssl-verification-kit/](https://trust-in-soft.com/polarssl-verification-kit/)
has details.

On the other hand, it probably is less effort to rewrite PolarSSL in Rust than
doing that proof.

~~~
jabot
I think the seL4 kernel rather proves that you can write safe Haskell, verify
it, and then transpile that to C.

That's different from directly writing C...

~~~
agumonkey
In the end the idea is that if you are semantics-aware, then everything is
safe; it's just operations from exp -> env -> store to env -> store etc etc

~~~
nine_k
Indeed! You can write safe programs for e.g. the original Turing Machine.

But I _bet_ you'll end up writing tools to make creating a safe program of any
non-trivial size feasible. A large class of these tools is usually called
"programming languages".

------
xelxebar
I'm always confused by the "safe language" evangelism. Reducing cognitive load
is a Good Thing TM; I get that. But aren't we just trusting the Rust compiler
and JVM to not have subtle bugs that introduce memory management errors into
our programs?

Centralizing the memory management code to some well tested core---like the
Rust compiler or some C lib---sounds like the crux of writing memory safe
code, not the language.

~~~
pcwalton
> But aren't we just trusting the Rust compiler and JVM to not have subtle
> bugs that introduce memory management errors into our programs?

Empirically, programs written in memory-safe languages produce multiple orders
of magnitude fewer memory-related bugs than C and C++ codebases do.

> Centralizing the memory management code to some well tested core---like the
> Rust compiler or some C lib

That isn't practical in C (or in C++). "Memory management" here means
"pointers", and you can't program in C or C++ without pointers.

~~~
wahern
How many buffer overflows were involved in the Equifax leak?

The ultimate solution to the problem is formal verification. Rust solves this
for buffer overflows, but nothing else. It's a one trick pony. Someone posted
this interesting nugget on HN the other day:

    https://www.youtube.com/watch?v=zt0OQb1DBko

Aside from some of the awkward ergonomics, its mechanism for formal
specification looks brilliant. And you get to keep C's bag-of-bytes object
manipulation and pointer arithmetic (when you want it) without having to
resort to unsafe{}.

For projects where it's worth the effort to carefully declare precise typing
semantics (because safety, performance, whatever), I want a holistic
solution. Otherwise my time and money are better spent throwing JavaScript or
Python at the problem, which solve the buffer overflow problem just as well.

~~~
pcwalton
> Rust solves this for buffer overflows, but nothing else. It's a one trick
> pony.

Completely false. Rust's design prevents all memory safety problems (that's
what "memory safe" means). Buffer overflows aren't even the most pernicious
kinds of memory safety problems anymore. Use after free is worse, and Rust
spends most of its complexity budget on preventing that.
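
A minimal sketch of the use-after-free point, assuming nothing beyond the standard library; the function name `consume` is made up for illustration. Ownership turns "free" into a move, so a later use is rejected at compile time rather than discovered at run time.

```rust
// Hypothetical helper: takes ownership of the buffer, which is
// freed when this function returns.
fn consume(buf: Vec<u8>) -> usize {
    buf.len()
}

fn main() {
    let buf = vec![1u8, 2, 3];
    // Ownership of `buf` moves into `consume`; the allocation is released there.
    let n = consume(buf);
    println!("consumed {} bytes", n);

    // Uncommenting the next line fails to compile with error E0382
    // (use of a moved value). That is the use-after-free case, caught statically.
    // println!("{}", buf.len());
}
```

The offending line is left commented out so the sketch still builds.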

~~~
wahern
I spoke poorly. My point was that memory safety is but _one_ issue of a far
larger problem. And Rust indeed solves it, or at least Rust provides an
ergonomic environment for writing solutions that are memory safe. Which is
different than saying it's easy to do this in an absolute sense. But certainly
writing a program in Rust without using unsafe{} at all is still easier and
more ergonomic than using other systems (e.g. annotation based frameworks
which weren't baked into the language from inception) for equivalently safe
behavior.

But if you watch the video, the presenter makes a great point: Rust's borrow
checker is an amazing piece of technology, but it's inaccessible to the
programmer. It's an implementation detail used to provide proofs for a narrow
constraint--memory safety. Imagine if Rust provided syntax and semantics which
not only allowed you to effectively implement the borrow checker yourself
using a more general declaration system, but implement any other kind of
formal specification needed to prove the higher-level semantics of your code.

In other words, imagine if a more general formal specification system were as
first-class as the ownership- and mutability-oriented syntax are in Rust; a
language that unifies the annotation model of solutions like Ada SPARK and
Frama-C, but which is properly integrated into the language.

And that's what ATS is exploring. As the presenter says, ATS might be ugly as
a systems language (because misallocated complexity--some easy things are too
complex, some complex things are too easy--increases cognitive load and
reduces efficiency), but its mechanism for formal specification is brilliant.
Improve ATS, or apply its novel approach to a language designed as a daily
driver, and you'd finally have a realistic answer to the plague of buggy
infrastructure software.

I realize this sounds like I'm making perfect the enemy of the good here. But
I stand by my point: major failures like Equifax rarely involve memory safety,
per se. Arithmetic issues are far more common, and even those are on the long
tail of a much larger issue; namely, an inability to [efficiently] provide
verifiable specifications for higher-level semantics. We hyper focus on buffer
overflows, arithmetic overflows, etc, because we understand them and we know
(at least in principle) how to fix them. But those are psychological blinders
that cause us to miscalculate relative risks. We tend to overestimate the cost
of problems we can fix relative to the cost of problems we're unsure about how
to fix.

~~~
roca
Full verification is extremely expensive, not just to create the proofs, but
also to maintain them as the software evolves. Telling C programmers to go
directly to full verification is, indeed, making the perfect the enemy of the
good.

And you are missing an important point here. Once you have a reasonably sound
and rich type system, you can leverage the type system in library API design
to eliminate classes of higher-level bugs. For example, Rust crypto libraries
can leverage Rust's affine type system to ensure you don't use a nonce more
than once. The Apache Struts vulnerability was about failing to distinguish
trusted vs untrusted input; that distinction can be expressed and checked in
type systems.
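
A rough sketch of both ideas, not taken from any real crypto or web library: every name here (`Nonce`, `seal`, `Untrusted`, `Trusted`, `run_query`) is hypothetical, the "cipher" is a toy XOR, and the validation policy is an assumption chosen only to keep the example short.

```rust
/// Hypothetical nonce type: deliberately neither `Clone` nor `Copy`,
/// so a value can be moved but never duplicated.
pub struct Nonce([u8; 12]);

impl Nonce {
    pub fn new(bytes: [u8; 12]) -> Self {
        Nonce(bytes)
    }
}

/// Toy stand-in for encryption (XOR with the nonce, NOT real cryptography).
/// Taking `nonce` by value consumes it, so it cannot be passed twice.
pub fn seal(nonce: Nonce, msg: &[u8]) -> Vec<u8> {
    msg.iter()
        .zip(nonce.0.iter().cycle())
        .map(|(m, n)| m ^ n)
        .collect()
}

/// Data that arrived from outside. In a real library these types would live
/// in their own module so the private fields cannot be reached directly.
pub struct Untrusted(String);

/// Only obtainable through `Untrusted::validate`.
pub struct Trusted(String);

impl Untrusted {
    pub fn new(raw: &str) -> Self {
        Untrusted(raw.to_string())
    }

    /// Assumed policy for illustration: non-empty ASCII alphanumerics only.
    pub fn validate(self) -> Option<Trusted> {
        if !self.0.is_empty() && self.0.chars().all(|c| c.is_ascii_alphanumeric()) {
            Some(Trusted(self.0))
        } else {
            None
        }
    }
}

/// Sensitive operation: the signature accepts only validated input.
pub fn run_query(name: &Trusted) -> String {
    format!("lookup:{}", name.0)
}

fn main() {
    let nonce = Nonce::new([7u8; 12]);
    let ciphertext = seal(nonce, b"hi");
    println!("{:?}", ciphertext);
    // seal(nonce, b"again"); // rejected: `nonce` was moved by the first call (E0382)

    match Untrusted::new("alice").validate() {
        Some(name) => println!("{}", run_query(&name)),
        None => println!("rejected"),
    }
    // run_query(&Untrusted::new("x")); // rejected: mismatched types
}
```

Both checks happen at compile time; the two rejected calls are left as comments so the sketch still builds.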

------
blub
What if there is just not enough interest in improving C++'s safety? Or in
improving safety at all? That's the scary thought.

I also have to point out that these pesky C buffer overflows are trivial to
avoid in C++: just use std::vector::at.
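
As a point of comparison, a minimal Rust sketch of checked element access; `read_at` and `packet` are made-up names. Slice `get` is the closest analogue of `std::vector::at`, except that it returns `None` where `at` throws `std::out_of_range`.

```rust
// Hypothetical helper: checked read of one byte from a buffer.
// `get` returns None for an out-of-range index instead of reading past the end.
fn read_at(data: &[u8], index: usize) -> Option<u8> {
    data.get(index).copied()
}

fn main() {
    let packet = vec![10u8, 20, 30];
    println!("{:?}", read_at(&packet, 1));  // in range
    println!("{:?}", read_at(&packet, 99)); // out of range, reported as None

    // Plain indexing (`packet[99]`) is also checked: it panics at run time
    // rather than reading memory outside the buffer.
}
```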

------
Nokinside
If someone thinks that compiler for their favorite programming language
provides safety they have no idea what safe code is.

C/C++ is used to write safe code for medical and aerospace applications every
day. The compiler for the languages like C, C++, Ada, Rust or whatever, is not
enough.

You can get better static and dynamic code analysis and test coverage analysis
tools for C/C++/Ada than you can for Rust.

~~~
jlg23
> C/C++ is used to write safe code for medical and aerospace applications
> every day.

How come we still catch lots of errors in reviews there? How come the
best paying gigs for C/C++ coders are all code review? Best practices and an
excellent toolchain don't help if they are not used. A compiler/language that
enforces those is a giant leap forward.

> You can get better static and dynamic code analysis and test coverage
> analysis tools for C/C++/Ada than you can for Rust.

Of course, but comparing the toolchain of a relatively new language with those
of languages into which - literally - billions of dollars were put only makes
a temporary point. And with lessons learned from those billions
incorporated into the design of the new language, closing the gap will be
much, much less expensive and time consuming than the initial development for
the languages you mentioned.

~~~
blub
What makes you think that a team which doesn't follow practices or use their
excellent toolchain will use Rust properly (without unsafe at any step) or
even at all?

Not sure what you mean about code review. Security reviews? I guess that's
because C and C++ are easy to misuse and most programmers, teams and companies
aren't that good at writing correct or safe code.

But we already knew that and the solution is not as easy as switching to a
different programming language.

~~~
pcwalton
> What makes you think that a team which doesn't follow practices or uses
> their excellent toolchain will use Rust properly (without unsafe at any
> step) or even at all?

Rust tends to push you away from using unsafe all the time. Unsafe is a pain
to use, because you don't have all the nice pointer operators you do in C and
C++, so programmers naturally default toward working in the safe language.
Even if you use unsafe more than you should, Rust tends toward much safer code
than C and C++ in the aggregate. (This has been observed empirically.)

> I guess that's because C and C++ are easy to misuse and most programmers,
> teams and companies aren't that good at writing correct or safe code.

If you replace "most" with "virtually every" (i.e. everyone who isn't writing
avionics/defense/aerospace/etc. code), I agree.

> But we already knew that and the solution is not as easy as switching to a
> different programming language.

Programs written in C and C++ empirically have far more memory safety related
problems than programs written in memory safe languages do.

------
pmontra
Nothing comes for free. I expect to have to trade something for security. It
seems we have to trade speed and size. Ok, it shouldn't matter to us: this
should not be developers' call. Instead it should go into a cost-benefit
matrix with plenty of other technical and non-technical stuff. Developers will
advise on their part of the matrix, marketers on their own, etc. Managers will
make the decision and be held responsible for security breaches (or applauded
for the lack of them) as much as for every other feature of the product.

~~~
sanxiyn
> It seems we have to trade speed and size.

Why do you think so?

~~~
pmontra
I might be wrong but it seems that the new safer languages don't run as fast
as C and create larger binaries. But that's OK if safety is valued more than
speed and size.

~~~
sanxiyn
You might be wrong. Rust is as fast as C and creates binaries as small as C.

------
kosma
Talk is cheap; show me the code. There's a lot of talk about rewriting some
crucial pieces in Rust but no actual work to follow it.

~~~
briansmith
[https://github.com/bluejekyll/trust-dns](https://github.com/bluejekyll/trust-dns)

~~~
jasode
Fyi... that Rust project doesn't actually cover the functionality of dnsmasq.

Dnsmasq also includes a _DHCP server_ and the ability to read a blacklist to
act as an _ad blocker_. In contrast, the "trust-dns" project is more of a
replacement for the "bind" program instead of "dnsmasq".

If your intention was to only show that "non trivial Rust code exists", that's
fine. However, some others might get the wrong impression that it's a Rust
version of dnsmasq.

------
barrkel
Getting rid of C is like gun control. There are irrational, emotional
attachments.

~~~
_greim_
Foot-gun control. Ha!

------
ryansama
It would be nice to have more healthy competition in this space. Any
suggestions?

Would it not be great to have a language as powerful as Rust but with the ease
of Go?

Anyhow, must be like CAP. You can't have everything in a language.

~~~
chrismorgan
When Mozilla first sponsored Rust, it was with the goal of being the fastest
memory-safe language around; at the time, it was thought that that required
garbage collection. Only later (2011 or so), with the application of some
comparatively recent research, did it progressively become apparent that it
was in fact possible to have a practical memory-safe language without garbage
collection; Rust progressively lost its garbage collected types (the @ sigil),
and steadily settled down to its current model, which requires strong
ownership to make it memory-safe and references + lifetimes to make it useful
(otherwise you have a straightforward linear type system, which while
functional is not very useful for fast code—you need references for that!).

The hard part of Rust is strongly tied to ownership and lifetimes. You can’t
get rid of them and keep memory safety without introducing garbage collection
on at least _almost_ everything. And thus you’re roughly at Go.
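
A small sketch of what "references + lifetimes" buys, assuming only the standard library; `first_word` is a made-up helper. The returned slice borrows from the input instead of copying it, and the lifetime `'a` records that dependency so the compiler can reject any use that outlives the owner.

```rust
// Hypothetical helper: returns a slice that points into `text`.
// The lifetime 'a ties the result to the input; no allocation or copy occurs.
fn first_word<'a>(text: &'a str) -> &'a str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let owned = String::from("hello world");
    let word = first_word(&owned); // borrows from `owned`
    println!("{}", word);

    // Uncommenting BOTH lines below fails to compile with error E0505:
    // `owned` cannot be moved (and freed) while `word` still borrows from it.
    // drop(owned);
    // println!("{}", word);
}
```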

~~~
bpicolo
Happen to have a link to the research that relates to this? I've been trying
to up my CS-paper-reading game.

~~~
chrismorgan
A quick search based on certain keywords I remembered yields
[https://www.reddit.com/r/rust/comments/2d94tu/is_there_any_a...](https://www.reddit.com/r/rust/comments/2d94tu/is_there_any_academic_background_for_rusts_static/)
which may help. Regions is the key thing. But right now I’m going to bed.

~~~
bpicolo
Thanks :)

------
hguhghuff
Novice question: isn't rust fatter?

~~~
bjz_
Great question. There's an FAQ on that:
[https://www.rust-lang.org/en-US/faq.html#why-do-rust-program...](https://www.rust-lang.org/en-US/faq.html#why-do-rust-programs-have-larger-binary-sizes-than-C-programs)
- tl;dr is that it is by default but you can slim it down to the same size as a
C executable if you need to.

------
petiepooo
Another rust proponent lecturing the world on how they should rewrite
everything in rust. _yawn_

~~~
chrismorgan
Robert is an expert C/C++ programmer, but understands the problems with those
languages, which has led to articles like
[http://robert.ocallahan.org/2017/07/confession-of-cc-program...](http://robert.ocallahan.org/2017/07/confession-of-cc-programmer.html).

Robert is now a Rust proponent _because it works_.

From today’s article, I think this is the money quote:

> My sincere hope is that people will at least stop choosing C for new
> projects. At this point, doing so is professional negligence.

~~~
blub
I don't doubt they're a competent programmer, but I do have some doubts that
they're an expert in C++. Two reasons why:

1) They made incorrect claims about C++ in relation to Rust before:
[http://robert.ocallahan.org/2017/02/what-rust-can-do-that-ot...](http://robert.ocallahan.org/2017/02/what-rust-can-do-that-other-languages.html?m=1)
and
[http://robert.ocallahan.org/2017/04/rust-optimizations-that-...](http://robert.ocallahan.org/2017/04/rust-optimizations-that-c-cant-do.html?m=1)

2) The Mozilla C++ code base is old and very raw pointer/reference heavy and
it doesn't seem to be written with safety in mind.

Their wikis also don't have any particularly good security guidelines. Maybe
the good stuff is kept private, who knows.

Saying that someone was a distinguished engineer at Mozilla is not saying much
about their abilities of writing modern or safe C++.

~~~
roca
Hi!

Yeah I made a mistake once. It happens.

If having a PhD in computer science (programming languages), being reasonably
smart, and using the language for 20 years (up to and including most C++14
stuff) doesn't make you an expert in that language, then your language is far
too difficult.

In fact, C++ _is_ far too difficult and there are very few genuine experts in
it. For example, who can explain why using push_back on a
vector<map<T,unique_ptr>> is not conformant to the Standard, without looking
it up? (I'll save you some time:
[https://bugs.chromium.org/p/chromium/issues/detail?id=683729...](https://bugs.chromium.org/p/chromium/issues/detail?id=683729#c25))

There's also a definitional bait-and-switch going on here. C++ proponents use
"C++" to mean "the language that lots of projects have been using for 20 years
and lots of programmers know" when espousing its popularity. But when
necessary, the meaning changes to some "'modern', 'safe' subset of C++" ...
that few programmers know well and few projects stick to rigorously. The exact
definition of that subset changes depending on the situation, too.

~~~
blub
Hi. In the linked "confessions" blog you've taken the path of making some
claims about a topic and supported them through who you are, instead of facts.
People who disagree with your claims will question if who you are is relevant.

C++ is difficult, and there are few experts. My thesis is that one doesn't
need to be an expert to write safe C++ code, but they do need access to
quality libraries focused on safety, and good practices focused on safety.
Banning some unsafe C functions, saying "use smart pointers" or making a list
of UB is useful, but not enough.

C++ can be written much more safely than it normally is, but it seems that's
not happening. I'm not sure why, it could be that the performance loss of
additional runtime verification is not acceptable, that the adequate learning
resources are not available or that it's not an important topic for the C++
community.

P.S: I'll gladly have the kind of error you linked to. It's at compile time, I
will try to figure it out and worst case rewrite my code. UB is the problem.

------
anon-123
C++ is not C. They are two totally different languages. Boundaries and
overflows are not an issue in C++ like they are in C. I have no idea why
people can't understand this.

~~~
sanxiyn
Because it is not true. C++ is not memory safe. Especially, C++ is not very
good at handling use-after-free.

~~~
blub
It's not memory safe, but with some good practices and library support it's in
a different league of safety compared to C. Just the availability of smart
pointers, vector, array and string puts it waaay ahead of plain C.

The bugs in dnsmasq were buffer overflows. In this case, the good practice
would be to always use std::array and std::vector and index with "at".

Tackling UAF is more complex. It involves using smart pointers exclusively
with a runtime-check on dereferencing. This will result in some performance
loss.

P.S: I'm not saying it's trivial to secure C++ code, nor that every project out
there is using these techniques. It's a worthwhile task to try to make existing
C++ projects as safe as possible, and that's an effort parallel to Rust which
should have very beneficial results.
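
For comparison, a rough Rust sketch of a pointer whose dereference is checked at run time. The standard `Weak<T>` plays that role here; it is an analogue of the idea, not the C++ smart pointer the comment has in mind, and `read_through` is a made-up helper.

```rust
use std::rc::{Rc, Weak};

// Hypothetical helper: reads through a non-owning handle.
// `upgrade` is the run-time check: it yields None once the value has been freed.
fn read_through(handle: &Weak<String>) -> Option<usize> {
    handle.upgrade().map(|s| s.len())
}

fn main() {
    let owner = Rc::new(String::from("session"));
    let handle = Rc::downgrade(&owner);

    println!("{:?}", read_through(&handle)); // value still alive

    drop(owner); // last strong reference gone, the String is freed

    println!("{:?}", read_through(&handle)); // check fails cleanly, no dangling read
}
```

The cost is the one the comment predicts: each access pays for a reference-count check.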

------
Annatar
_Contrary to the quote, given a finite "amount of budget", dnsmasq could have
been Rewritten In Rust and these problems avoided._

Ah, I see the Rust evangelism strike force is back. n-gate.com is right, as
usual.

