
I understand why people want to rewrite C libraries and such in Rust.

Software like glibc is battle-tested---it is _widely_ used on hundreds of thousands of systems around the world, and has been used (though not at today's massive scale) for decades. I understand that glibc is under active development and there is a lot of new code, but let's keep this in perspective:

Writing a new implementation of a system is a huge opportunity to introduce bugs. There is a lot of focus on these specific problems, but in the broader scheme of things, glibc is remarkably stable, performant, and feature-rich. A new implementation will have bugs too, and those bugs might be less likely to be caught simply because the system will not be as widely used for quite a long time. Even formally proven systems don't address flaws in the actual program specification (see "The Limits of Correctness" by Brian Cantwell Smith for a good discussion). Also relevant (which I'm reminded of in part because of his recent death): Peter Naur's "Programming as Theory Building".

So even _new_ glibc code has the benefit of a huge community of both developers and users to eyeball it and test it out in production on a huge number of systems.

So rewriting glibc may solve certain problems, but it's bound to create a whole lot more, considering the narrow range of issues that are being focused on. New Rust code will have undiscovered issues too, even if they're not memory or stack related. I feel that this effort might be better spent fixing and finding problems with glibc---and continuing the development of tools to find those problems, to benefit _all_ of our old C libraries and programs---than rewriting for the sake of rewriting.




> New Rust code will have undiscovered issues too, even if they're not memory or stack related.

Memory issues have a tendency to be the most severe types of issues from a security perspective. (Not always, but with a high frequency.) Other issues that may occur with a rewrite may be security problems as well, but the security track records of network-facing software written in old-timey C speak for themselves: memory safety issues make up a huge fraction of their vulnerabilities.

> I feel that this effort might be better spent fixing and finding problems with glibc---and continuing the development of tools to find those problems, to benefit _all_ of our old C libraries and programs---than rewriting for the sake of rewriting.

So we've been trying that for roughly four decades now, ever since C spread in the 1970s, and we've completely failed to move beyond making the same basic memory management errors again and again and again. I think that if we couldn't do it in 40 years, we aren't likely to be able to by just trying a little harder this one time.


One problem with rewriting libc in a safer language is that the interfaces will still be memory-unsafe. No matter how many times you rewrite 'gets' it will still be broken, the str* functions will still be problematic, the heap can still be trivially corrupted by the application, etc.

As long as the public interfaces are still unsafe, it doesn't seem to me that there is significant latitude to qualitatively improve the security of a libc implementation.
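
To make that concrete, here's a hypothetical sketch (my names, not any real project's code) of what a strcpy-style export might look like if libc's internals were rewritten in Rust. The implementation language changes, but the exported C interface still takes raw pointers with no lengths, so the unsafety just moves to the callers:

    use std::os::raw::c_char;

    // Hypothetical Rust reimplementation of a classic libc export.
    // The C ABI promises nothing about dst's capacity, so the body
    // has to trust the caller's pointers, exactly as C does.
    #[no_mangle]
    pub unsafe extern "C" fn my_strcpy(dst: *mut c_char,
                                       src: *const c_char) -> *mut c_char {
        let mut i = 0;
        loop {
            let c = *src.offset(i);   // caller-supplied pointer, unchecked
            *dst.offset(i) = c;       // can overflow dst, same as in C
            if c == 0 {
                return dst;
            }
            i += 1;
        }
    }

Note that the whole function body is necessarily unsafe; the contract it implements cannot be expressed safely.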


> As long as the public interfaces are still unsafe, it doesn't seem to me that there is significant latitude to qualitatively improve the security of a libc implementation.

Of course there is. The glibc bug we're talking about would not have happened if libresolv were memory safe.
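
For reference, the bug in question (the getaddrinfo stack overflow, CVE-2015-7547) involved overrunning a 2048-byte stack buffer with an oversized DNS response. A loose Rust sketch of the same shape, not the actual libresolv logic, shows what memory safety buys here:

    // Loose sketch of the failure's shape; not the real libresolv code.
    fn handle_response(response: &[u8]) -> u8 {
        let mut buf = [0u8; 2048];
        // In C, the equivalent memcpy silently writes past the buffer
        // when the response is oversized. Here the slice bounds check
        // panics instead: a crash and a bug report, not code execution.
        buf[..response.len()].copy_from_slice(response);
        buf[0] // stand-in for the actual parsing
    }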


I was too negative in my last message.

To be more precise: I claim that, for a library with an unsafe interface, if the ratio of interface surface area to implementation volume is high enough, it is not worth re-implementing the library in a safe language. I also claim that a libc implementation is generally well above this threshold.

I agree that this might not be the case for selected subsets of the library like libresolv.


We can resolve this empirically, by looking at the list of glibc CVEs: https://www.cvedetails.com/vulnerability-list/vendor_id-72/p...

The vast majority of the memory safety issues here are not at the public interface boundaries.


> Software like glibc is battle-tested

This seems a bit counter-intuitive given that serious problems are still being found years or even decades later. Perhaps battle-tested is insufficient.

And while new code is likely to have bugs, code ported from the original C would have fewer of them than code rewritten from scratch against a spec. New code can take advantage of the battle-tested nature of the original code even if it's in a different language.


Yes, there's this whole attitude of "black-boxiness" to these types of arguments that I find troubling: that it is simply the nature of the machine to be unpredictable, that therefore the safest and best code is the code with the most mileage on it, and that every time you poke it you compromise the integrity of the system as a whole.

(It's a close cousin to the "don't rewrite it" argument we've all heard via Spolsky et al., even though they've ended up rewriting their shit several times over.)

There is a fundamental shift forward in the types of guarantees you can make under the premises of a language like Rust vs. those of C, and I have simply never heard one of these demagogues address it.


> This seems a bit counter-intuitive given that serious problems are still being found years or even decades later. Perhaps battle-tested is insufficient.

Bugs will always exist. But that statement will apply equally to any software, regardless of language.

> New code can take advantage of the battle-tested nature of the original code even if it's in a different language.

That's why I referenced the Peter Naur paper, "Programming as Theory Building"---in practice, that may very well not be the case.


> Bugs will always exist. But that statement will apply equally to any software, regardless of language.

I'd be curious to see some research suggesting that the prevalence and severity of those bugs is identical regardless of development tool/language/runtime, but I'd be surprised if that's demonstrable.

The question at hand is not "will safer languages eliminate all bugs?" It's "will safer languages reduce the prevalence and severity of important classes of bugs?" I'd wager it's probably yes, but even if you disagree, I don't think that it's reasonable to suggest that because there will always be bugs we should never improve.


> I'd be curious to see some research suggesting that the prevalence and severity of those bugs is identical regardless of development tool/language/runtime, but I'd be surprised if that's demonstrable.

I intended to convey that software written in any language will have bugs, not that all languages will produce the same types of bugs.


Yes, but all that gets you is that no language is a panacea; bugs always exist. It does not address the question of whether or not there will be a propensity for more bugs (or more severe bugs) when comparing two languages.


> It does not address the question of whether or not there will be a propensity for more bugs (or more severe bugs) when comparing two languages.

My argument is based on the act of rewriting it---regardless of language. Many languages provide excellent guarantees, but those don't protect against logic bugs in the implementation itself.


Yes; and the rewrite can take into account the logic used in the old code (especially in the security-critical areas) as well as all the vulnerabilities that have happened before. You're not starting from a complete blank slate; you can pick up the lessons learned.

Despite being "battle tested", all of these C programs continue to have both memory and logic errors. I think a rewrite would settle into the same rate of new logic issues after the initial code review and testing. "Bugs will always exist"---sure, so if we have something that eliminates a whole class of bugs, why not use it? The other classes of bugs will be there (and probably in the same force) whether you rewrite or not.

A lot of these bugs get introduced due to cruft in old code as well. So there are a bunch of tradeoffs here.


> Many languages provide excellent guarantees, but those don't protect against logic bugs in the implementation itself.

This is the black-and-white security fallacy. Memory safety problems are, statistically, a huge fraction of security bugs. By eliminating them you drastically reduce the total number of security bugs.


Battle tested in the past != fit for future wars. The nature of systems has evolved. Attackers are more sophisticated, have better tools and probably understand C code better than most who still write it. So our tools need to evolve too. Admittedly rewriting full libs into another language sounds scary, but hey - since when did fear of the future stop it from coming?


I'm unconvinced by the argument that C is unfit for future wars.


It's simple:

1. Memory safety issues are the cause of a very large number of security vulnerabilities (often most of them for projects written in C or C++, depending on the software).

2. Memory safety-related issues have a relatively high probability of being turned into remote code execution, which is one of the most severe outcomes, if not the most severe.

3. C and C++ projects have been empirically observed to have orders of magnitude more memory safety problems than projects written in other languages do.

4. The additional classes of security vulnerabilities that managed languages tend to foster do not have the combined prevalence and severity of memory safety problems.

So, we would be better served in security by moving to memory-safe languages.


> C and C++ projects have been empirically observed to have orders of magnitude more memory safety problems than projects written in other languages do.

Note that this includes projects in languages that are themselves written in C or C++, which shows that there's some value in confining the unsafe code to a small and well-tested core library (in this case, the language runtime). Honestly, it seems like 50% of the value just comes from not using C strings, since pretty much every other language has its own string library that does not use null-termination.
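
A small illustration of the string point (ordinary standard-library Rust, nothing exotic): the string carries its length explicitly, and a NUL-terminated view only ever exists at the C boundary:

    use std::ffi::CString;

    fn main() {
        // Rust strings carry an explicit length: no scanning for '\0',
        // no strcat-style overflow from a missing terminator.
        let mut s = String::from("hello");
        s.push_str(", world");              // grows the buffer as needed
        println!("{} ({} bytes)", s, s.len());

        // A NUL-terminated copy is built only at the C boundary, and
        // construction fails if the data contains an interior NUL.
        let c = CString::new(s).expect("no interior NUL bytes");
        let _ptr = c.as_ptr();              // what you'd hand to a C API
    }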


Memory-safe programs require one of:

1) Runtime overhead for some form of GC (D, Lisp, etc.)

2) Rephrasing the program to satisfy a memory constraint checker (Rust)

3) Disciplined memory usage (e.g., NASA's C coding guidelines)

We don't have enough experience with 2 to indicate whether it will create new classes of bugs. We also don't understand the knock-on effects of managing memory differently: will functionally identical programs require more or fewer resources, more or fewer programmer-hours, etc.?
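
To illustrate what option 2 looks like in practice, a toy example: the direct translation of a common C idiom is rejected by the borrow checker, and the code has to be rephrased. Whether such rephrasing ever introduces bugs of its own is exactly the open question:

    // Toy example of rephrasing a program for the borrow checker.
    fn double_and_report(v: &mut [i32]) {
        // The direct translation of the C idiom is rejected:
        //     let first = &v[0];                // immutable borrow...
        //     for x in v.iter_mut() { ... }     // ...conflicts with it
        // Rephrased: do the mutation first, take the reference after.
        for x in v.iter_mut() {
            *x *= 2;
        }
        if let Some(first) = v.first() {
            println!("first = {}", first);
        }
    }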

Rust may very well be the future, but we don't know for sure yet.

One thing we do know: options 1 and 3 have been available for years, but not widely utilized. What lessons can we learn from this fact to apply to Rust?


> We don't have enough experience with 2 to indicate whether it will create new classes of bugs.

What classes of security bugs could possibly arise from Rust's ownership discipline?


Logic bugs. Failure to correctly adapt imperative algorithms while still satisfying the constraint checkers.

Not all security bugs are related to memory. Many are related to improperly written algorithms (most crypto attacks), or improperly designed requirements (TLSv1).

Even Heartbleed was primarily due to a logic bug (trusting tainted data) instead of an outright memory ownership bug.
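
As a rough sketch of the Heartbleed shape (invented names, nothing like the real OpenSSL code): the logic bug of trusting the wire-supplied length can be written in Rust just as easily, but the out-of-bounds read it enabled becomes a bounds-check panic rather than a quiet leak of adjacent heap memory:

    // Rough sketch of the Heartbleed shape; invented names, not OpenSSL.
    fn heartbeat_reply(payload: &[u8], claimed_len: usize) -> &[u8] {
        // The logic bug (trusting claimed_len from the peer) is still
        // here. But where C's memcpy read past the buffer and echoed
        // back up to 64 KB of neighboring heap, this line panics.
        &payload[..claimed_len]
    }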

Does Rust automatically zero out newly allocated memory? Honest question, I don't know the answer.


> Logic bugs. Failure to correctly adapt imperative algorithms while still satisfying the constraint checkers.

Oh, also: If you're implying that Rust's ownership discipline can create security bugs where there were none before, I consider that a real stretch. I'd need to see an actual bug, or at least a bug concept, that Rust's borrowing/ownership rules create before accepting this.


> Not all security bugs are related to memory. Many are related to improperly written algorithms (most crypto attacks), or improperly designed requirements (TLSv1).

Nobody is saying that Rust eliminates all security bugs. Just a huge number of the most common ones.

> Does Rust automatically zero out newly allocated memory? Honest question, I don't know the answer.

Effectively, yes: it never needs to zero memory behind your back, because safe Rust statically rejects any read of memory that hasn't been initialized, so stale heap or stack contents can't be observed in the first place.
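
A tiny illustration (this deliberately fails to compile):

    fn main() {
        let x: i32;         // declared but never initialized
        println!("{}", x);  // error[E0381]: used binding `x` isn't initialized
    }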


> Not all security bugs are related to memory.

This is a problem that will be present equally in all languages.

Perhaps less so in languages with a better type system, but that doesn't affect Rust since there aren't any _systems_ languages with a better type system.


You most definitely know a lot more about code than me, so I'm not challenging you at all. But my contention is that robustness is a function of resilience. C has a class of errors which are hard for the foot-soldiers to spot, and sometimes even the generals, and that can leave deadly chinks unspotted for a long time. If Rust attempts to do away with that specific type of error altogether, what does it cost? Rewriting code. And if that seems like a challenge worth attempting to some people, I can't fault it. And in the process maybe we will discover more bugs, or maybe something better. It's evolution, no?


It's not a bad thing that Rust addresses these issues. That's good, and essential for newer languages---it wouldn't make sense to not try to solve these problems in a language that intends to be lower-level (like Rust), relatively speaking.

The paper on the Limits of Correctness that I mentioned above does a good job of arguing my point. Even if you rewrote glibc in a language like Coq and formally proved its correctness, that doesn't mean it's "correct" in the sense that its logic matches what glibc is meant to do---the proof only covers the specification, and the specification itself can encode logic errors.

So you might gain confidence (or guarantees) in rewriting glibc in Rust, but in rewriting it you potentially introduce a host of new issues.
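
A contrived sketch of that point: the function below is memory safe, passes every check the language can make, and is still wrong against its intended spec. No rewrite, in any language, catches this class by construction:

    // Contrived sketch: memory safe, type checked, and still wrong.
    // Intended spec: return true only when the passwords match.
    fn check_password(supplied: &str, stored: &str) -> bool {
        // Logic bug the language cannot see: comparing lengths
        // instead of contents. Only tests or a proof against the
        // *right* specification would catch it.
        supplied.len() == stored.len()
    }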


> Software like glibc is battle-tested

"The best time to plant a tree is 20 years ago; the second best time is now."



