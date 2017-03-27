>C is not the primary reason for our past vulnerabilities
>There. The simple fact is that most of our past vulnerabilities happened because of logical mistakes in the code. Logical mistakes that aren’t really language bound and they would not be fixed simply by changing language.
So I looked at https://curl.haxx.se/docs/security.html
#61 -> uninitialized random : libcurl's (new) internal function that returns a good 32-bit random value was implemented poorly and overwrote the pointer instead of writing the value into the buffer the pointer pointed to.
#60 -> printf floating point buffer overflow
#57 -> cookie injection for other servers : The issue pertains to the function that loads cookies into memory, which reads the specified file into a fixed-size buffer in a line-by-line manner using the fgets() function. If an invocation of fgets() cannot read the whole line into the destination buffer due to it being too small, it truncates the output.
This one is arguably not really a failure of C itself, but I'd argue that Rust encourages more robust error handling through its Options and Results, whereas C tends to abuse "-1" and NULL return values that need careful checking and usually can't be enforced by the compiler.
#55 -> OOB write via unchecked multiplication
Rust has checked multiplication enabled by default in debug builds, and regardless of that the OOB wouldn't be possible.
#54 -> Double free in curl_maprintf
#53 -> Double free in krb5 code
#52 -> glob parser write/read out of bound
And I'll stop here, so far 7 out of 11 vulnerabilities would probably have been avoided with a safer language. Looks like the vast majority of these issues wouldn't have been possible in safe Rust.
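For what it's worth, here's a rough sketch of the #57 and #55 points in Rust. None of this is curl's code: `LineError`, `check_line`, `load_cookies` and `table_size` are names made up for illustration, and the length limit is an arbitrary stand-in for a fixed-size buffer. The idea is just that an over-long line becomes an `Err` the caller has to deal with rather than a silently truncated string, and that an overflowing size computation becomes a `None` rather than a small wrapped-around number.

```rust
/// Error for input that does not fit (hypothetical type, not from curl).
#[derive(Debug, PartialEq)]
enum LineError {
    TooLong { len: usize, max: usize },
}

/// Reject an over-long line instead of silently truncating it.
fn check_line(line: &str, max: usize) -> Result<&str, LineError> {
    if line.len() > max {
        Err(LineError::TooLong { len: line.len(), max })
    } else {
        Ok(line)
    }
}

/// Collect every line, stopping at the first one that is too long.
fn load_cookies(input: &str, max: usize) -> Result<Vec<&str>, LineError> {
    input.lines().map(|line| check_line(line, max)).collect()
}

/// Size computation: `checked_mul` yields None on overflow in debug and
/// release builds alike.
fn table_size(count: usize, elem_size: usize) -> Option<usize> {
    count.checked_mul(elem_size)
}

fn main() {
    // The lines are only reachable by inspecting the Result.
    match load_cookies("a=1\nthis-line-is-far-too-long=1\n", 8) {
        Ok(lines) => println!("loaded {} lines", lines.len()),
        Err(e) => println!("rejected input: {:?}", e),
    }

    // What plain wrapping arithmetic would produce for a huge count.
    let half = usize::MAX / 2 + 1;
    println!("wrapped: {}", half.wrapping_mul(2));
    println!("checked: {:?}", table_size(half, 2));

    // Even with a wrong size, an out-of-range access is caught:
    // `get` returns None, and `buf[10]` would panic rather than write OOB.
    let buf = vec![0u8; 4];
    println!("{:?}", buf.get(10));
}
```

A careful C programmer can write the same checks by hand, of course. The difference is that here the compiler won't let the caller get at the lines or the size without going through the `Result` or `Option`.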
He addressed all of those points in the second short paragraph. None of those are C vulnerabilities; they were mistakes made on the part of the developers, not the language. That a safer language would avoid these problems doesn't mean the language is at fault when they happen.
The point of type system features, e.g. Option types instead of nulls, or linear/affine types avoiding use-after-free, is to make programmer mistakes turn into compiler errors. Nothing more. There is no point in talking about whether or not something is the "language's fault". We know C is like juggling knives. It's a tool. It has drawbacks and benefits. Being widespread and fast are the benefits. Not turning many forms of programmer errors into compiler errors is the drawback. That means the programmer can't afford to make mistakes, because they will be shipped. But programmers invariably make mistakes.
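A small sketch of what "mistakes turn into compiler errors" looks like in practice, assuming Rust as the example language. `Buffer`, `release`, `free_count_after_release` and `first_byte` are hypothetical names invented for this illustration. Ownership is moved into `release`, so a second call with the same value is rejected at compile time (that line is left commented out so the rest still builds), and the `Option` return forces the caller to handle the empty case.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical resource that counts how many times it is released.
struct Buffer {
    frees: Rc<Cell<u32>>,
}

impl Drop for Buffer {
    fn drop(&mut self) {
        self.frees.set(self.frees.get() + 1);
    }
}

/// Takes the buffer by value: ownership moves in, and the buffer is
/// dropped (freed) when `_buf` goes out of scope at the end of the call.
fn release(_buf: Buffer) {}

/// Releases a buffer once and reports how many frees were observed.
fn free_count_after_release() -> u32 {
    let frees = Rc::new(Cell::new(0));
    let buf = Buffer { frees: Rc::clone(&frees) };
    release(buf);
    // release(buf); // does not compile: error[E0382], use of moved value `buf`
    frees.get()
}

/// No null to forget about: the absence of a value is part of the type.
fn first_byte(data: &[u8]) -> Option<u8> {
    data.first().copied()
}

fn main() {
    println!("freed {} time(s)", free_count_after_release());
    match first_byte(&[]) {
        Some(b) => println!("first byte: {}", b),
        None => println!("empty input"),
    }
}
```

None of this stops logic errors, which is the blog post's point. It only removes the double-free and null-dereference classes from the set of mistakes that can ship.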
The grandparent argued that for the sample of issues he looked at, a lot would in fact be avoided by the type system of e.g. Rust - contradicting the argument in the blog post (could be because of the small sample though).
I think it's perhaps less important to focus on the number of issues of each kind, and instead look at the severity of them. If the kinds of issues avoided by better type systems are typically trivial issues, but the kind of issues coming from logic errors are severe security issues - then perhaps the case for stronger type systems isn't so strong after all. But I doubt that's the case.
They are absolutely the fault of the language, given other languages would have made these bugs impossible.
Errare humanum est is hardly a new concept, blaming humans for not being computers is inane, and in fact qualifies for in errare perseverare diabolicum as out of misplaced pride you persevere in the core original error of using C.
Actually that's exactly what it means.
That's the only way people mean "It's because of C": that a safer language would have prevented those classes of errors.
That's regardless of whether a more careful programmer would also have prevented them in C.
Would you ever declare a bug to be the language's fault, other than compiler bugs?
My definition of fault is "what should be changed to prevent such error". And you are not going to change developers.
On the one hand, Curl is a great piece of software with a better security record than most, the engineering choices it's made thus far have served it just fine, and its developers quite reasonably view rewriting it as risky and unnecessary.
On the other hand, the state of internet security is really terrible, and the only way it'll ever get fixed is if we somehow get to the point where writing networking code in a non-memory-safe language is considered professional malpractice. Because it should be; reliably not introducing memory corruption bugs without a compiler checking your work is a higher standard than programmers can realistically be held to, and in networking code such bugs often have immediate and dramatic security consequences. We need to somehow create a culture where serious programmers don't try to do this, the same way serious programmers don't write in BASIC or use tarball backups as version control. That so much existing high-profile networking software is written in C makes this a lot harder, because everyone thinks "well all those projects do it so it must be okay".
End users, even those compiling from source, will still only need a C compiler. Only developers need to install the safer language (even Curl developers must install valgrind to run the full tests).
Where can you use generated code?
- For non-C language bindings (this could apply to the Curl project, but libcurl is a bit unusual in that it doesn't include other bindings, they are supplied by third parties).
- To describe the API and generate header files, function prototypes, and wrappers.
- To enforce type checking on API parameters (eg. all the CURL_EASY_... options could be described in the generator and then that can be turned into some kind of type checking code).
- Any other time you want a single source of truth in your codebase.
We use a generator (written in OCaml, generating mostly C) successfully in two projects: https://github.com/libguestfs/libguestfs/tree/master/generat... https://github.com/libguestfs/hivex/tree/master/generator
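To give a flavour of the "single source of truth" idea, here is a toy generator, written in Rust purely for illustration. It is not the libguestfs or hivex generator, and `OptionDesc`, `OPTIONS`, `generate_header` and the `MYLIB_OPT` prefix are all invented names. One table describes the options, and both the constants and the typed setter prototypes in the emitted C header are derived from it.

```rust
/// One entry in the hypothetical API description.
struct OptionDesc {
    name: &'static str,
    c_type: &'static str,
    id: u32,
}

/// The single source of truth: everything below is derived from this table.
const OPTIONS: &[OptionDesc] = &[
    OptionDesc { name: "URL", c_type: "const char *", id: 1 },
    OptionDesc { name: "TIMEOUT", c_type: "long", id: 2 },
];

/// Emit a C header with one #define and one typed setter per option.
fn generate_header(prefix: &str, opts: &[OptionDesc]) -> String {
    let mut out = String::from("/* Generated file: do not edit. */\n");
    for o in opts {
        out.push_str(&format!("#define {}_{} {}\n", prefix, o.name, o.id));
    }
    for o in opts {
        // A setter per option gives the C compiler a concrete parameter
        // type to check, instead of a single variadic entry point.
        out.push_str(&format!(
            "int {}_set_{}(void *handle, {} value);\n",
            prefix.to_lowercase(),
            o.name.to_lowercase(),
            o.c_type
        ));
    }
    out
}

fn main() {
    print!("{}", generate_header("MYLIB_OPT", OPTIONS));
}
```

The generated header is what gets shipped in the tarball, so anyone building from source still only needs a C compiler.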
[0]http://www.fftw.org/
Programmatically generating C code is not without problems. How can you prove that the C you're generating is free from problems solved by the safer language? Cloudbleed came from computer generated C code: https://blog.cloudflare.com/incident-report-on-memory-leak-c....
See quote from the author of Ragel in the comments:
There is no mistake in ragel generated code. What happened was that you turned on EOF actions without appropriate testing. The original author most certainly never intended for that. He/She would have known it would require extensive testing. Legacy code needs to be tested heavily after changes. It should have been left alone.
PLEASE PLEASE PLEASE take some time to ensure the media doesn't print things like this. It's going to destroy me. You guys have most certainly benefitted from my hard work over the years. Please don't kill my reputation!
The mystical process of "programmatically generating code" is also known as compilation. The case you are describing is a compiler bug. The compiler wasn't able to generate target code (in this case C code) with the semantics and/or guarantees of the source language.
How is that different from just writing it in another language? End users who need to compile will be able to regardless of the generated C code, but the end users who need to do a _little_ modification will be given ugly generated C code! Seems strictly worse to me...
Didn't know that curl was stuck back on C89, that's really optimizing for portability.
If anyone is confused by the "curl sits in the boat" section header, that's basically a Swedish idiom being translated straight to English. That rarely works, of course, and I'm sure Daniel knows this. :)
The closest English analog would be "curl doesn't rock the boat", I think the two expressions are equivalent (if you sit, you don't rock the boat).
I know there's a sentiment here on HN against C (as evidenced by bitter comments whenever a new project dares to choose C), but I wish there'd be a more constructive approach, acknowledging that the issue isn't so much new software as the large collection of existing (mostly F/OSS) software that is not going to be rewritten in e.g. Rust or some (let's face it) esoteric/niche FP language. Even for new projects, the choice of programming language isn't clear at all if you value integration and maintainability aspects.
Everything else outside UNIX was using Assembly, Algol, PL/I, Modula or Pascal dialects.
C owes its success to UNIX's adoption by the market, as an operating system available almost free of charge to universities, with source code available.
I particularly like the mention of portability. No other language comes even remotely close to the portability of C. What other language runs on Linux, NT, BSD, Minix, Mach, VAX, Solaris, plan9, Hurd, eight dozen other platforms, freestanding kernels, and nearly every architecture ever made?
That's wrong. A lot of the C mistakes are indeed "logical mistakes in the code", but most of them would indeed be fixed by changing to a language that prevents those mistakes in the first place.
I don't know that you'd gain much in the real world though. Starting such a project now? Rust, for sure, for me anyway. But is it worth rewriting Curl? I agree with the author there, it most probably isn't.
Probably.
Safer code, so half of the vulnerabilities wouldn't have existed.
Rust's ecosystem (package manager and libraries).
Of course you lose portability and you probably appeal to fewer developers, at least for now. So there is a trade-off.
I wish Rust compiled to C. It would be my dream language. The only reason I can't choose Rust half the time is that it doesn't support the targets I need to support.
> Every once in a while someone suggests to me that curl and libcurl would do better if rewritten in a “safe language”.
And so he minimally addresses that and some other reasons for sticking with (and originally choosing) C89.
