We run high-load web services in something-other-than-C with managed memory and checked array access. It is native, 8/9 as fast as C, and it works. In terms of security exposure, we will never have the budget to pay for the additional 1/9th that C would afford us.
Yet, every day, I have to field questions about why our low-level, network-facing system code is not written in C... old prejudices die hard.
> why our low-level, network-facing system code is not written in C
Really I think these days, in fact since the Morris Worm, people should be asking the opposite question: why is network-facing code written in C given that almost any error can lead to an exploitable security compromise?
I do not buy the performance argument at all any more. Few systems are really performance-constrained by network processing (usually it's the network itself or the disk), and you have to write horizontally scalable code anyway if you want to make use of multiple cores. Almost every case where it's a bottleneck is more cheaply dealt with by scaling out than by tinkering with the software.
There are people doing HFT in Java. The managed languages are quite capable of great performance with a little effort and imagination.
Watching Heartbleed and Cloudbleed happen really drove it home for me. Given that routine programming errors can cause private data to be transmitted unintentionally, I have to question the sanity of writing networking software handling private data in C.
What counters do you offer though? Or maybe reverse the question - why do they think you should use C? That last 1/9th performance? I'm pretty sure that last factor also isn't due to the language and there's still room for performance improvements.
How do you measure performance against C? Did you implement a rival unsafe C version to compete with, or is this just a guess? If so, what makes you believe C would only yield a 10% improvement?
I'm asking because my own gut estimate of the cost of automatic memory management is significantly higher.
10% is fairly typical overhead for mark-and-sweep GC for an imperative program. It requires artificially high allocation rates – which well-written imperative code shouldn't have – to go much higher than that.
The cost of array bounds checks is generally negligible with a modern compiler/CPU, and the speed/safety tradeoff is not even worth haggling over for any code that's exposed to the internet.
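To illustrate with a minimal Rust sketch (the same reasoning applies to any bounds-checked language): the indexed loop pays a check per access in the worst case, while the iterator form lets the compiler prove every access is in range and drop the checks entirely.

    // Indexed form: each xs[i] is bounds-checked, though the
    // optimizer can often hoist or eliminate the check.
    fn sum_indexed(xs: &[i64]) -> i64 {
        let mut total = 0;
        for i in 0..xs.len() {
            total += xs[i];
        }
        total
    }

    // Iterator form: no indices, hence no bounds checks at all.
    fn sum_iter(xs: &[i64]) -> i64 {
        xs.iter().sum()
    }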
Isn't this exactly the sort of line of questioning that the OP is complaining about?
In my experience of managed software, 99.9% of performance problems are due to stupid mistakes and 0.1% are due to marshalling and GC costs. The average unmanaged developer seems to assume the opposite.
Then you may not be working in the same problem domain as the person who asked the question.
I mean, my experience is pretty much the same as yours, to the point where if someone blames the language for poor performance I'm immediately skeptical of the claim. But other cases do exist, especially for problems that are CPU intensive.
Look at the cost equation. Whatever additional resources they need to spend is a payment for improved safety / security.
It's like saying that a reinforced door with a lock is more expensive and heavier than a regular cardboard-and-planks one, and is slower to open. It is! But it has other advantages, not attainable otherwise.
By managed languages I understand memory-managed, a.k.a. GC, languages.
It's not correct to say that they use additional resources to provide memory safety or security.
Python, for instance, uses additional resources because it simply doesn't have a performant implementation, and Java uses additional memory to gain execution speed.
Swift and Rust are memory safe, but don't need a GC, so those memory safety advantages are in fact attainable otherwise.
The performance loss vs C is in that ballpark; arguably it's sometimes less, because of increased flexibility in refactoring. A bigger issue is memory overhead, which is often 100% larger and harder to avoid. But memory is cheap.
Seems like it would be hard to get Haskell to run 8/9 as fast as C. It's much harder to reason about performance in Haskell than in Go or C, and it's pretty easy to make a seemingly innocuous change that slows your program dramatically.
If you really want to build on "rock", check out Bedrock: http://plv.csail.mit.edu/bedrock/ . Don't just prove your program is memory safe, prove it satisfies arbitrary properties! (A bit tongue in cheek, but the tools are really cool). Check out http://adam.chlipala.net/cpdt/cpdt.pdf for a nice introduction.
Very formal sorts of languages are super interesting. They might tend towards striking the wrong balance, though.
Rust is super interesting to me because it prevents the most common vulnerabilities w/regards to memory safety that show up time and again, but it's accessible enough that it's got a big community and a ton of high quality libraries. I think that's a pretty awesomely potent combination.
For the short term and mid term yes. However I do hope that in the long term the ideas from ATS, Liquid Haskell, Idris, F*, etc. will be iterated on in order to make them more accessible and usable. In the mean time we'll probably start to see Rust graft on more and more correctness tools like dependent types and SMT solvers, and it will begin to get a little ungainly, like dependently typed Haskell is. But there's only so many ideas you can try out in a new language, and Rust has made the right decisions on that front in the interests of getting into production this decade. :)
Note that you can write safe C code. seL4 is written in C! (And verified in Isabelle.) If that is too hard core, PolarSSL is normal-looking C code which has nevertheless been proved free of buffer overflows. https://trust-in-soft.com/polarssl-verification-kit/ has details.
On the other hand, it probably is less effort to rewrite PolarSSL in Rust than doing that proof.
> On the other hand, it probably is less effort to rewrite PolarSSL in Rust than doing that proof.
It's probably even less effort to convert to SaferCPlusPlus[1] (essentially a memory-safe subset of C++). There's even a tool[2] (under construction, but functional) to do a lot of the conversion for you.
Does anyone actually want to program in "C++ with instance methods banned"?
Why not just use a language designed for memory safety (which is most of them)? It's a whole lot easier, plus you get an ecosystem of actually safe code.
> Does anyone actually want to program in "C++ with instance methods banned"?
Well, passing a "safe this" pointer as the first parameter to a (static) member function isn't that hard to get used to, is it?
But I don't disagree with your gist. If memory-safety is your only concern and you're not trying to salvage an existing codebase, then Rust might be a better choice.
But if you're trying to add memory safety to, say, an existing C implementation of SSL, the SaferCPlusPlus route would probably be less effort. Even when non-static member functions are banned.
> (which is most of them)
Personally, I consider RAII (deterministic destructors) an essential feature for safe, efficient programming at scale. That leaves only two choices, C++ and Rust, right?
And if there's a memory-safe C++ option available, there are some arguments for choosing it over Rust.
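For the sake of concreteness, here's the pattern I mean by RAII, sketched in Rust since it's the other language on that list (the Connection type is made up purely for illustration):

    // A made-up resource type, purely for illustration.
    struct Connection {
        name: String,
    }

    impl Drop for Connection {
        // Runs at a statically known point, when the value goes
        // out of scope: no GC pause, no finalizer queue.
        fn drop(&mut self) {
            println!("closing {}", self.name);
        }
    }

    fn main() {
        let conn = Connection { name: "db".to_string() };
        println!("using {}", conn.name);
    } // conn is dropped right here, deterministically.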
Indeed! You can write safe programs for e.g. the original Turing Machine.
But I bet you'll end up writing tools to make creating a safe program of any non-trivial size feasible. A large class of these tools is usually called "programming languages".
There are several safe dialects of C. The issue is that almost no one is using them, and they're not about to start, either.
A lot of infrastructure software has not been built with security in mind, but the game has changed, and this old C software is getting eviscerated.
If one looks at the fixes for these buffer overflows in dnsmasq, no other conclusion can be drawn except that holes are being plugged in a leaky sieve. That could work, eventually, for software that's finished. But for something under active maintenance, using plain C is security suicide.
I would really like to see a comparison of the efforts of a) writing a program in a safe language like Rust and b) writing the program in C and verifying it with a toolset like the one the PolarSSL guys used.
Given that the number of C projects that use similar toolsets is... low (?), I'd guess that said comparison would favor Rust, but I'm prepared to be surprised :-)
Easy, it was out of scope. Quoting the report, "This report states the immunity of the PolarSSL software component to widespread CWEs, provided that PolarSSL is deployed in a context where... The server is configured as to never ask for client-side certificates". Quoting the news, "Servers that don't ask for client certificates are not impacted".
TrustInSoft could prove that part of the code too. There's nothing special about it. It's just that they didn't bother.
It's an uninitialized read, and PolarSSL is actually also proven to lack uninitialized reads. It's just that asn1_get_sequence_of was not included in the verification (as it is unreachable in the verified configuration).
Which shows you the limit of verification toolsets vs. a verifying compiler. To verify that the code is 100% free of uninitialized reads or uses-after-free, the static analyzer would have to scan through all possible permutations.
Sure, you can get very close to 100% in practice, but even if you're fine with this level of guarantee, the amount of discipline and cognitive overhead required to do that in C makes Rust's learning curve seem like a piece of cake.
I'm always confused by the "safe language" evangelism. Reducing cognitive load is a Good Thing TM; I get that. But aren't we just trusting the Rust compiler and JVM to not have subtle bugs that introduce memory management errors into our programs?
Centralizing the memory management code in some well-tested core---like the Rust compiler or some C lib---sounds like the crux of writing memory-safe code, not the language.
Trouble is that in C there is no way to encapsulate the unsafe code. Rust has tools that let you do that - it's not perfect, but depending on the level you are working at you can even build operating systems like Redox with the majority of the code in safe Rust.
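A minimal sketch of what that encapsulation looks like in practice: the unsafe block is buried behind a function whose safe signature makes misuse impossible for callers.

    // The unsafe block is justified by the check directly above it,
    // and callers can only reach it through this safe interface.
    fn first_byte(bytes: &[u8]) -> Option<u8> {
        if bytes.is_empty() {
            None
        } else {
            // SAFETY: we just verified that index 0 is in bounds.
            Some(unsafe { *bytes.get_unchecked(0) })
        }
    }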
The language has not been formally specified, which is a concern, but this is an active area that is being worked on in collaboration with universities. This should make it easier to write unsafe code in combination with theorem provers like Lean or Coq for even greater confidence, and ensure that any soundness holes in the type system are found.
Concerning Rust specifically—Rust’s memory safety comes from its type system; specifically, its strong ownership model, which leads to its lifetime model. Without these, you simply can’t have a fast, memory-safe language: either you surrender speed, by boxing everything, or you surrender safety.
My point is that it’s not possible to just isolate the hazardous chunks into a well-tested core; it needs to be pervasive in the language’s memory model and type system in a way that C and C++ just don’t have.
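A minimal example of that pervasiveness; this deliberately doesn't compile, because the borrow checker rejects the dangling reference statically, with no runtime machinery involved:

    fn dangling() -> &'static i32 {
        let x = 42;
        // error[E0515]: cannot return reference to local variable `x`
        // The equivalent C compiles fine and reads freed stack memory.
        &x
    }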
> Without these, you simply can’t have a fast, memory-safe language: either you surrender speed, by boxing everything, or you surrender safety.
Not only can you have speed and safety without Rust's rigid enforcement of exclusive ownership, you can do even better--even faster and with even stronger security guarantees.
How many buffer overflows were involved in the Equifax leak?
The ultimate solution to the problem is formal verification. Rust solves this for buffer overflows, but nothing else. It's a one trick pony. Someone posted this interesting nugget on HN the other day:
https://www.youtube.com/watch?v=zt0OQb1DBko
Aside from some of the awkward ergonomics, its mechanism for formal specification looks brilliant. And you get to keep C's bag-of-bytes object manipulation and pointer arithmetic (when you want it) without having to resort to unsafe{}.
For projects where it's worth the effort to carefully declare precise typing semantics (because safety, performance, whatever), I want a holistic solution. Otherwise my time and money are better spent throwing Javascript or Python at the problem, which solve the buffer overflow problem just as well.
> Rust solves this for buffer overflows, but nothing else. It's a one trick pony.
Completely false. Rust's design prevents all memory safety problems (that's what "memory safe" means). Buffer overflows aren't even the most pernicious kinds of memory safety problems anymore. Use after free is worse, and Rust spends most of its complexity budget on preventing that.
I spoke poorly. My point was that memory safety is but _one_ issue of a far larger problem. And Rust indeed solves it, or at least Rust provides an ergonomic environment for writing solutions that are memory safe. Which is different than saying it's easy to do this in an absolute sense. But certainly writing a program in Rust without using unsafe{} at all is still easier and more ergonomic than using other systems (e.g. annotation based frameworks which weren't baked into the language from inception) for equivalently safe behavior.
But if you watch the video, the presenter makes a great point: Rust's borrow checker is an amazing piece of technology, but it's inaccessible to the programmer. It's an implementation detail used to provide proofs for a narrow constraint--memory safety. Imagine if Rust provided syntax and semantics which not only allowed you to effectively implement the borrow checker yourself using a more general declaration system, but to implement any other kind of formal specification needed to prove the higher-level semantics of your code.
In other words, imagine if a more general formal specification system were as first-class as the ownership- and mutability-oriented syntax are in Rust; a language that unifies the annotation model of solutions like Ada SPARK and Frama-C, but which is properly integrated into the language.
And that's what ATS is exploring. As the presenter says, ATS might be ugly as a systems language (because misallocated complexity--some easy things are too complex, some complex things are too easy--increases cognitive load and reduces efficiency), but its mechanism for formal specification is brilliant. Improve ATS, or apply its novel approach to a language designed as a daily driver, and you'd finally have a realistic answer to the plague of buggy infrastructure software.
I realize this sounds like I'm making perfect the enemy of the good here. But I stand by my point: major failures like Equifax rarely involve memory safety, per se. Arithmetic issues are far more common, and even those are on the long tail of a much larger issue; namely, an inability to [efficiently] provide verifiable specifications for higher-level semantics. We hyper-focus on buffer overflows, arithmetic overflows, etc., because we understand them and we know (at least in principle) how to fix them. But those are psychological blinders that cause us to miscalculate relative risks. We tend to overestimate the cost of problems we can fix relative to the cost of problems we're unsure about how to fix.
Full verification is extremely expensive, not just to create the proofs, but also to maintain them as the software evolves. Telling C programmers to go directly to full verification is, indeed, making the perfect the enemy of the good.
And you are missing an important point here. Once you have a reasonably sound and rich type system, you can leverage the type system in library API design to eliminate classes of higher-level bugs. For example, Rust crypto libraries can leverage Rust's affine type system to ensure you don't use a nonce more than once. The Apache Struts vulnerability was about failing to distinguish trusted vs untrusted input; that distinction can be expressed and checked in type systems.
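To make the nonce example concrete, here's a hypothetical sketch (the Nonce and encrypt names are invented, not any real crate's API): because encrypt takes the nonce by value, a second use is a compile error, not a code-review finding.

    // Hypothetical single-use nonce: no Copy, no Clone.
    struct Nonce([u8; 12]);

    // Taking Nonce by value moves it into the function, so the
    // caller cannot pass the same nonce twice.
    fn encrypt(nonce: Nonce, plaintext: &[u8]) -> Vec<u8> {
        let _ = nonce;         // real crypto would go here
        plaintext.to_vec()     // placeholder ciphertext
    }

    fn main() {
        let nonce = Nonce([0u8; 12]);
        let _c1 = encrypt(nonce, b"hello");
        // let _c2 = encrypt(nonce, b"again");
        // ^ error[E0382]: use of moved value: `nonce`
    }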
> Empirically, programs written in memory-safe languages produce multiple orders of magnitude fewer memory-related bugs than C and C++ codebases do.
This kind of empirically-driven motivation is great! Would you mind pointing me to some sources? My google-fu didn't immediately yield anything helpful.
You are right. On the other hand, in practice, implementing it in the language (really, the compiler) seems more effective at reducing the number of bugs than implementing it in a library. Why that is so is an interesting question, but it's not very relevant to the empirical basis of safe-language evangelism.
If someone thinks that the compiler for their favorite programming language provides safety, they have no idea what safe code is.
C/C++ is used to write safe code for medical and aerospace applications every day. The compiler for a language like C, C++, Ada, Rust or whatever is not enough.
You can get better static and dynamic code analysis and test coverage analysis tools for C/C++/Ada than you can for Rust.
The term 'safe' varies a lot based on context. In this context it is being used to mean 'memory safe' - i.e. that the compiler can eliminate a class of behaviour that are the root of a number of recent security issues.
"Safe" in the context of medical and aerospace means something very different, but is much closer to the meaning of "Secure" in this context. No compiler is ever going to prevent you writing insecure code - there can always be a logic problem, bad choice of crypto algorithm etc..
> C/C++ is used to write safe code for medical and aerospace applications every day.
How come we still catch lots of errors in reviews there? How come the best-paying gigs for C/C++ coders are all code review? Best practices and an excellent toolchain don't help if they are not used. A compiler/language that enforces them is a giant leap forward.
> You can get better static and dynamic code analysis and test coverage analysis tools for C/C++/Ada than you can for Rust.
Of course, but comparing the toolchain of a relatively new language with those of languages into which - literally - billions of dollars were put only makes a temporary point. And with the lessons learned from those billions incorporated into the design of the new language, closing the gap will be much, much less expensive and time-consuming than the initial development was for the languages you mentioned.
What makes you think that a team which doesn't follow practices or use their excellent toolchain will use Rust properly (without unsafe at any step) or even at all?
Not sure what you mean about code review. Security reviews? I guess that's because C and C++ are easy to misuse and most programmers, teams and companies aren't that good at writing correct or safe code.
But we already knew that and the solution is not as easy as switching to a different programming language.
> What makes you think that a team which doesn't follow practices or use their excellent toolchain will use Rust properly (without unsafe at any step) or even at all?
Rust tends to push you away from using unsafe all the time. Unsafe is a pain to use, because you don't have all the nice pointer operators you do in C and C++, so programmers naturally default toward working in the safe language. Even if you use unsafe more than you should, Rust tends toward much safer code than C and C++ in the aggregate. (This has been observed empirically.)
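To illustrate that friction with a minimal sketch: even trivial pointer work in unsafe Rust is spelled out explicitly, where C would give you terse operator syntax.

    // Raw-pointer code in unsafe Rust is deliberately clunky.
    // SAFETY: caller must guarantee p points to at least 3 u32s.
    unsafe fn third_element(p: *const u32) -> u32 {
        // In C this would just be p[2]; here it's an explicit
        // offset computation plus a dereference.
        *p.add(2)
    }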
> I guess that's because C and C++ are easy to misuse and most programmers, teams and companies aren't that good at writing correct or safe code.
If you replace "most" with "virtually every" (i.e. everyone who isn't writing avionics/defense/aerospace/etc. code), I agree.
> But we already knew that and the solution is not as easy as switching to a different programming language.
Programs written in C and C++ empirically have far more memory safety related problems than programs written in memory safe languages do.
That safety-critical C/C++ code tends to be brutally constrained. Typically dynamic memory allocation is not allowed. It's also very expensive to write and verify. If that were the only way people were allowed to use C/C++, most programmers would migrate to Rust or whatever en masse. I'm for it!
You write that the claim is (A) "there exists a compiler X that always produces safe code", but then argue against (B) "C++ always produces unsafe code". To argue against claim A, you have to show that some "compiler X" that always produces safe code cannot exist.
> what is "safe" code without a guarantee of memory safety?
Memory safety is the absolute minimum for safety critical code.
It seems to be a surprise to most people that you can write memory-safe code in C and check for that statically, and that includes static stack and heap exhaustion checks.
It's actually an appeal to authority. "Populum" appeals to the general populace, not specific industries. The people who smoke crack are not highly regarded.
Nothing comes for free. I expect to have to trade something for security. It seems we have to trade speed and size. OK, it shouldn't matter to us: this should not be the developers' call. Instead it should go into a cost-benefit matrix with plenty of other technical and non-technical stuff. Developers will advise on their part of the matrix, marketers on their own, etc. Managers will make the decision and be held responsible for security breaches (or applauded for the lack of them) as much as for every other feature of the product.
In principle things can come for free. Sometimes something new is strictly better. I am fairly confident that GCC outperforms early C compilers in every relevant metric (in theory GCC is of course vastly more expensive because of the decades of labor invested in it, but I don't pay for that investment).
Right now Rust is slower than C (though not always slower than C++), produces bigger binaries, and compiles slower. But more safety requires a more expressive language, and a more expressive language gives the code generator and optimizer more information to work with. It's certainly conceivable that in a few years the Rust compiler might generate faster code than C compilers.
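One small, concrete instance of that extra information (a minimal sketch): &mut guarantees freedom from aliasing, which C can only express through the unchecked restrict qualifier.

    // dst and src are guaranteed not to alias, so the compiler may
    // keep *src in a register across both writes. The equivalent C
    // needs `restrict`, an unverified promise, to permit that.
    fn accumulate(dst: &mut i32, src: &i32) {
        *dst += *src;
        *dst += *src;
    }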
Unfortunately, in the real world people are shortsighted and focus on immediate concerns like development velocity and performance at the expense of long-term concerns like latent vulnerabilities. The managers and developers who make these decisions today aren't going to still be accountable when a critical bug is discovered in the deployed code ten years later.
I might be wrong but it seems that the new safer languages don't run as fast as C and create larger binaries. But that's OK if safety is valued more than speed and size.
Servo, Firefox, Redox-OS, librsvg, Linkerd, GNOME, Maidsafe, pieces of Fuchsia... it's a huge effort, but it's beginning to pick up steam. I'm certainly trying to play my part as well (still private, but will be OSS).
Fyi... that Rust project doesn't actually cover the functionality of dnsmasq.
Dnsmasq also includes a DHCP server and the ability to read a blacklist to act as an ad blocker. In contrast, the "trust-dns" project is more of a replacement for the "bind" program instead of "dnsmasq".
If your intention was to only show that "non trivial Rust code exists", that's fine. However, some others might get the wrong impression that it's a Rust version of dnsmasq.
When Mozilla first sponsored Rust, it was with the goal of being the fastest memory-safe language around; at the time, it was thought that that required garbage collection. Only later (2011 or so), with the application of some comparatively recent research, did it become apparent that it was in fact possible to have a practical memory-safe language without garbage collection; Rust progressively lost its garbage collected types (the @ sigil) and steadily settled down to its current model, which requires strong ownership to make it memory-safe and references + lifetimes to make it useful (otherwise you have a straightforward linear type system, which while functional is not very useful for fast code—you need references for that!).
The hard part of Rust is strongly tied to ownership and lifetimes. You can’t get rid of them and keep memory safety without introducing garbage collection on at least almost everything. And thus you’re roughly at Go.
If having a PhD in computer science (programming languages), being reasonably smart, and using the language for 20 years (up to and including most C++14 stuff) doesn't make you an expert in that language, then your language is far too difficult.
In fact, C++ is far too difficult and there are very few genuine experts in it. For example, who can explain why using push_back on a vector<map<T,unique_ptr>> is not conformant to the Standard, without looking it up? (I'll save you some time: https://bugs.chromium.org/p/chromium/issues/detail?id=683729...)
There's also a definitional bait-and-switch going on here. C++ proponents use "C++" to mean "the language that lots of projects have been using for 20 years and lots of programmers know" when espousing its popularity. But when necessary, the meaning changes to some "'modern', 'safe' subset of C++" ... that few programmers know well and few projects stick to rigorously. The exact definition of that subset changes depending on the situation, too.
Hi. In the linked "confessions" blog you've taken the path of making some claims about a topic and supporting them with who you are, instead of facts.
People who disagree with your claims will question if who you are is relevant.
C++ is difficult, and there are few experts. My thesis is that one doesn't need to be an expert to write safe C++ code, but they do need access to quality libraries focused on safety, and good practices focused on safety.
Banning some unsafe C functions, saying "use smart pointers" or making a list of UB is useful, but not enough.
C++ can be written much more safely than it normally is, but it seems that's not happening. I'm not sure why, it could be that the performance loss of additional runtime verification is not acceptable, that the adequate learning resources are not available or that it's not an important topic for the C++ community.
P.S.: I'll gladly take the kind of error you linked to. It's at compile time; I'll try to figure it out and, worst case, rewrite my code. UB is the problem.
The first link has no relevance to the discussion, and the second I had indeed read before posting. Need I point out that C++ IS NOT C? Being a good C++ programmer does not make you a good C programmer, and vice versa. I would argue, in fact, that many of the capabilities a knowledgeable, current C++ coder depends on would make them more liable to encounter issues coding in C, std::auto_ptr being chief among them.
Someone who has something which is indeed better and knows it will believe everything should be rewritten thus; someone who has something which is middling or worse but believes it to be better will think the same thing.
The interesting question is not whether or not the author is a Rust proponent; it's whether or not Rust is an improvement on C. As neither a C nor a Rust programmer, it appears to me that the answer is unequivocally 'yes' — and this despite the fact that Rust is roughly as intelligible as Mandarin to me.
C++ is not C. They are two totally different languages. Boundaries and overflows are not an issue in C++ like they are in C. I have no idea why people can't understand this.
It's not memory safe, but with some good practices and library support it's in a different league of safety compared to C. Just the availability of smart pointers, vector, array and string puts it waaay ahead of plain C.
The bugs in dnsmasq were buffer overflows. In this case, the good practices would be always use std::array and std::vector and index with "at".
Tackling UAF is more complex. It involves using smart pointers exclusively with a runtime-check on dereferencing. This will result in some performance loss.
P.S.: I'm not saying it's trivial to secure C++ code, nor that every project out there is using these techniques. It's a worthwhile task to try to make existing C++ projects as safe as possible, and that's an effort parallel to Rust which should have very beneficial results.