Why CVE-2022-3602 was not detected by fuzz testing (allsoftwaresucks.blogspot.com)
219 points by pjmlp on Nov 21, 2022 | hide | past | favorite | 168 comments

As powerful as fuzzing is, this is a good reminder why it’s not a substitute for formal verification in high-integrity or critical systems.

.. or more pragmatically, safer languages where errors aren't exploitable to get remote code execution.

(I guess that semantics can also be seen as a formally verified property)

Given log4shell happened in one of the more aggressively sandboxed languages with mainstream adoption, the outlook isn't great.

Actually, Elasticsearch was unaffected by log4shell precisely because they implemented JVM sandboxing, which completely mitigated it: https://discuss.elastic.co/t/apache-log4j2-remote-code-execu...

Java Security Manager is deprecated in Java 17 and is likely to be removed in a future release.

So a better approach is to use container security or something like that.

The deprecation JEP has some discussion of why it was deprecated: https://openjdk.org/jeps/411

"The threat of accidental vulnerabilities in local code is almost impossible to address with the Security Manager. Many of the claims that the Security Manager is widely used to secure local code do not stand up to scrutiny; it is used far less in production than many people assume. There are many reasons for its lack of use: [...]"

Would be interesting to know if there were other cases besides ElasticSearch that were protected from log4j by JSM.

.NET also dropped CAS during the Core rewrite, with similar reasoning.

Which is a pity, but unfortunately capability-based systems still seem to be hard for the common developer to configure properly.

Sandboxing is mostly irrelevant to the log4j error. You'd have to tell the sandbox to turn off reflection, which isn't really feasible in Java. And that's because Java is so poorly designed that big libraries are all designed to use reflection to present an API they consider usable.

Compare that to a language designed well enough that reflection isn't necessary for good APIs, for instance.

> big libraries are all designed to use reflection to present an API they consider usable.

whistles in python

Python has first-class type objects. That's not the same thing as writing:

  pickle._getattribute(__import__(package), path)
everywhere, which is basically how Java reflection works half the time. In Python, you'd have something like copyreg.dispatch_table, and have plugin modules that register themselves in the table at load-time – limiting your attack surface to the modules you expect to be attack surface, rather than every single package accessible to the JVM.

Dunno if I agree that libraries need reflection. Some do, but primarily in the dependency injection and testing space.

That's not really where you'd expect RCE problems.

Yeah, I should say where developers don't think they need to use reflection.

Like, the log4j thing came from (among other design errors) choosing to use reflection to look up filters for processing data during logging. Why would log4j's developers possibly think reflection is an appropriate tool for making filters available? Because it's the easy option in Java. Because it's the easy option, people are already comfortable with it in other libraries. Because it's easy and comfortable, it's what gets done.

Some languages make reflection much more difficult (or nearly impossible) and other APIs much easier. It's far more difficult to make that class of error in languages like that.

That belongs to the 30% of exploits that we are left with, after removing the 70% others from C.

I think you are correct, but I do not think the average severity of these exploits is necessarily the same.

US agency for cyber security thinks otherwise.

>> US agency for cyber security thinks otherwise

NSA's Software Memory Safety recommendation:

https://media.defense.gov/2022/Nov/10/2003112742/-1/-1/1/CSI... (pdf)

Code executing in the JVM isn't sandboxed. Sandboxing could have indeed mitigated log4shell. Log4shell was a design where a too powerful embedded DSL was exposed to untrusted data in a daft way - the log("format here...", arg1, arg2) call would interpret DSL expressions in the args carrying logged data. One can even imagine it passing formal verification depending on the specification.

But more broadly the thing is that eliminating these low level language footguns would allow people to focus on the logic and design errors.

Upvote for log4shell. That's pretty funny.

Yes, a safer language is not enough, but it is a huge leap forward, so I'll take it.

I’d classify runtime reflection as an unsafe language feature, to be honest.

Safer languages cannot protect from bad design. Many libraries have implicit behaviour which is not always visible. It's a hard tradeoff to make: you want safety, but at the same time enough customisation and features. I recently worked with an HTTP client library that forbids sending special characters in headers. I understand that this is a safety feature, but I really wanted to send weird characters (I was building a fuzzing tool).

OK but we can mitigate these types of exploits (buffer overflow etc.) using memory-safe languages.

Bad design is a universal orthogonal problem.

That's true. It will definitely mitigate different category of exploits out of the box, but you still need people to acknowledge and be intentional about their decisions.

Seatbelts can not save you from bad driving, but they certainly help mitigate the effects.

Safer languages can protect from some kinds of bad design. Some kinds of incoherent design become simply impossible to express. (For example, using a structured language with function calls instead of goto means a function always returns to the same place where it was called from, so a whole class of possible designs - most of which were just mistakes, but a few of which were efficient implementations - becomes impossible)

One job of language designers and compiler writers is to recover the efficient implementations without allowing the bad designs.

See how eg optimized tail calls essentially give you back all the power of goto without the old downsides.

There's no replacement for intelligence. Turning the world into authoritarian dystopia in search of that replacement seems to be the popular thing to do, unfortunately.

I would argue the issue was not checking the coverage of new code vs what was being tested.

The OP is pointing out that what "the issue" is depends on whether you want high confidence that your code has few bugs, or you want certainty that your code contains no bugs.

> you want certainty that your code contains no bugs

Well, everyone would want that, but it's not possible. Formal verification comes nowhere close to promising that, especially not on a large project. I'm pretty sure OpenSSL is larger than any formally verified software to date (perhaps CompCert is larger?).

"Beware of bugs in the above code; I have only proved it correct, not tried it."

Donald Knuth

I find OpenSSL's function call interfaces really infuriating, all the more so considering that this is a security library.

I think interfaces in Botan, to give an example, are way easier to use.

The OpenSSL API looks like a minefield to me.

tl;dr: because ossl_a2ulabel had no unit tests until a few days ago, the fuzzer could not have reached it through any combination of other tests.

That fuzzing is tricky was not the problem here. The problem is the culture that allowed ossl_a2ulabel to exist without unit tests. And before some weird nerd jumps in to say that openssl is so old we can't apply modern standards of project health, please note that the vulnerable function was committed from scratch in August 2020. Without unit tests.

Seems OpenSSL as an organization is just irreparably broken if they haven't still learned the lesson

I think at this point we've established that it's C which is just irreparably broken.

Blaming the OpenSSL developers for writing bad C is just a "no true scotsman" at this point, since there is no large, popular C codebase in existence that I'm aware of that avoids running into vulnerabilities like this; vulnerabilities that just about every other language (mainly excluding C++) would have prevented from becoming an RCE, and likely prevented from even being a DoS. Memory safe languages obviously can't prevent all vulnerabilities, since the developer can still intentionally or unintentionally write code that simply does the wrong thing, but memory safe languages can prevent a lot of dumb vulnerabilities, including this one.

No feasible amount of funding would have prevented this, since it continues to happen to much better funded projects also written in C.

On the other hand, I guess we could blame the OpenSSL developers for writing C at all, being unwilling to start writing new code in a memory safe language of some kind, and ideally rewriting particularly risky code paths like parsers as well. We've learned this lesson the hard way a thousand times. C isn't going away any time soon (unfortunately), but that doesn't mean we have to continue writing new vulnerabilities like this one, which was written in the last two years.

> Blaming the OpenSSL developers for writing bad C is just a "no true scotsman" at this point, since there is no large, popular C codebase in existence that I'm aware of that avoids running into vulnerabilities like this; vulnerabilities that just about every other language (mainly excluding C++) would have prevented from becoming an RCE

No, this whole thing is about the lack of testing. Adding a parser without matching tests is just absurd regardless of the language it's implemented with. If only for basic correctness check, you want a test.

Not all vulnerabilities or bugs are memory-related, vulnerabilities are bound to surface in any language with that kind of organizational culture.

Keep in mind that Ubuntu compiled OpenSSL using a gcc flag that turns this one-byte overflow into a crash instead of a memory leak/corruption, because gcc already has a way to do that. It's very risky, and a very long-term project, to rewrite something with this level of history into a completely new language.

> It's very risky, and a very long term project to rewrite something with this level of history into a completely new language.

I didn't suggest a complete rewrite of the project. However, they could choose to only write new code in something else, and they could rewrite certain critical paths too. The bulk of the code would continue to be a liability written in C.

I agree that it would be nearly impossible to rewrite OpenSSL as-is. It would take huge amounts of funding and time. In general, people with that much funding are probably better off starting from scratch and focusing on only the most commonly used functionality, as well as designing the public interface to be more ergonomic / harder to misuse.

qmail in 32-bit mode (which was the default when it was written) had no issues. One issue was found in 64-bit mode without memory limits (against the explicit recommendation).

Most "safe" languages like Python are written in C and are full of segfaults, mostly due to the high churn rate and the attitude of the developers.

I haven't tried Rust yet, so I won't comment on that.

OpenSSL still receives minimum funding. Until Heartbleed they had nearly no funding, and now it is two full time people.


I'm of the opinion that in cases like this, it'd be better for the organization to close, and allow the gap to be filled naturally.

If the current OpenSSL maintainers closed the project, given its importance, there would be a rush to take over maintenance. Chances are it'd be better funded; even in the worst case, it would hardly be assigned fewer than two devs.

This is a case of the general dynamic where a barely-sufficient-but-arguably-insufficient solution prevents actors from finding and executing a proper one.

You don't even have to close. You can just refuse to merge code which does not include 100% test coverage. If someone wants the feature badly enough, they will figure out a way to fill the gap. Alternatively, someone can always fork the code and release "OpenSSL-but-with-lots-of-untested-code" variant.

For a project like OpenSSL, it's not just having "enough" developers (whatever that is) it's having qualified developers. Writing good crypto code requires deep expertise. There aren't a lot of people with such expertise whose time is not already fully committed.

The OpenSSL team is actually very good and they do very good work, and they even have funding. The problem is that legacy never goes away, and OpenSSL is a huge pile of legacy code, and it will take a long long time to a) fix all the issues (e.g., code coverage), b) migrate OpenSSL to not-C or the industry to not-OpenSSL.

This vulnerable parser of attacker-controlled remote input was written from scratch in C in 2020, without a fuzz harness even though OpenSSL is critical infrastructure and is already hooked up to oss-fuzz.

It is simply difficult to reconcile these facts with the idea that it is a very good team doing very good work.

> because ossl_a2ulabel had no unit tests until a few days ago

it's not realistic to enforce unit test coverage % with a project at the scale of OpenSSL, right?

> it's not realistic to enforce unit test coverage % with a project at the scale of OpenSSL, right?

Why not?

You can enforce that all new files be covered (at the very least line-covered). It requires some setup effort (collecting code coverage and either sending it to a tool which performs the correlation, or correlating it yourself), but once that's done... it does its thing.

Then you can work on increasing coverage for existing files, and ratcheting requirements.

No one is asking for that. But this is code that did one thing: punycode decoding, with millions of well documented test vectors. The code had zero dependencies on anything OpenSSL related. It is a very simple "text in, text out" problem, the most trivial thing to write unit tests for. At the same time, it's code that parses externally provided buffers and has to deal with things like unicode in C - there should be a massive red flashing warning light in every developers head here.
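To make the "trivial to unit test" point concrete, here's a sketch of a table-driven vector test in C. The `decode` function is a trivial stand-in (it just lowercases ASCII), since the real punycode decoder isn't reproduced here; in the real case the table would hold the published RFC 3492 test vectors and the function under test would be the ossl_a2ulabel-style decoder.

```c
#include <string.h>

/* Stand-in for the decoder under test (hypothetical: lowercases ASCII).
 * A real punycode test would call the actual decoder and use the
 * well-documented RFC 3492 vectors as the table entries. */
static void decode(const char *in, char *out, size_t outsz) {
    size_t i = 0;
    for (; in[i] != '\0' && i + 1 < outsz; i++)
        out[i] = (in[i] >= 'A' && in[i] <= 'Z') ? in[i] + 32 : in[i];
    out[i] = '\0';
}

/* Table of (input, expected output) vectors; returns number of failures. */
int run_vectors(void) {
    static const struct { const char *in, *want; } vec[] = {
        {"ABC", "abc"}, {"already-lower", "already-lower"}, {"", ""},
    };
    int failures = 0;
    for (size_t i = 0; i < sizeof vec / sizeof vec[0]; i++) {
        char got[64];
        decode(vec[i].in, got, sizeof got);
        if (strcmp(got, vec[i].want) != 0)
            failures++;
    }
    return failures;
}
```

The whole harness is a table and a loop; adding a vector is one line, which is exactly why "text in, text out" code has no excuse for shipping untested.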

It's realistic to reject a CR for a new parsing function without proof that it works, which usually comes in the form of a unit test

Nit: a unit test never proves that it works. At best it can prove that it doesn't work.

Otherwise, I agree.

Nit: "never"? Even if you unit test all cases?

How do you know you covered all cases? You can verify easily enough that you cover all the cases the code handles. But does the code actually handle all the cases that could exist? A formal proof can bring to your attention a case that you didn't handle at all.

Formal proofs also have limits. Donald Knuth once famously wrote "beware of bugs in the above code, I proved it correct but never ran it". Which is why I think we should write tests for code as well as formally prove it. (On the latter, I've never figured out how to prove my code correct - writing C++, I'm not sure it is even possible, but I'd like to.)

There are limited cases where this is possible. Like a pure function that takes a single 32bit number as input. Just go over all the possible inputs.

You can't as easily verify all the outputs though, only that it doesn't crash, which isn't useful for detecting edge cases where the results are wrong but don't crash. And that is a single pure function; if that pure function with a wrong output is then fed into/used by some other function (non-pure data storage in particular - a large part of what computers are used for is storage of some form, not pure functions), you can read off the end of the buffer or other bad things.

Even running through all 4 billion some cases a single 32 bit number can result in your test taking a significant amount of time - enough that you wouldn't want to run it very often. One value of real world tests is often that they can detect that you broke what you thought was a completely unrelated area of code.

Seriously? It is pretty easy to imagine a function with a limited, low number of input combinations.

Enums, bools, etc.

>can result in your test taking a significant amount of time - enough that you wouldn't want to run it very often.

It is irrelevant in theoretical discussions like this
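For a sense of scale, a sketch of what an exhaustive test looks like over a small input space. The function here is a made-up 16-bit example (65,536 cases, checked in milliseconds), not anything from OpenSSL; at 32 bits the same loop becomes 4 billion iterations, which is where the runtime objection above starts to bite.

```c
#include <stdint.h>

/* Hypothetical pure function under test: saturating doubling of a
 * 16-bit value. */
static uint16_t sat_double(uint16_t x) {
    return (x > UINT16_MAX / 2) ? UINT16_MAX : (uint16_t)(x * 2);
}

/* Exhaustively check every one of the 65536 possible inputs against
 * an independently stated expectation. Returns 1 if all pass. */
int check_all(void) {
    for (uint32_t x = 0; x <= UINT16_MAX; x++) {
        uint32_t expect = (x * 2 > UINT16_MAX) ? UINT16_MAX : x * 2;
        if (sat_double((uint16_t)x) != expect)
            return 0;
    }
    return 1;
}
```

Note that the expectation is computed in wider arithmetic, independently of the implementation - an exhaustive loop that just re-runs the same formula proves nothing.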

Yes, never. You are assuming that the implementation of all unit test cases are themselves correct (that they would fail if there was any error in the case they cover). In fact unit tests are often wrong. In that context a unit test can't even prove code incorrect, unless we know that the unit test is correct.

IMO to prove that code is correct requires a proof; a unit test can only provide evidence suggestive of correctness.

Mistakes in proofs are just as probable as mistakes in exhaustive tests.

An exhaustive test is just one type of a machine verified proof.

> An exhaustive test is just one type of a machine verified proof.

Not entirely sure I agree with this. A proof by construction is a very different beast to empirical unit tests that only cover a subset of inputs. The equivalent would be units tests that cover every single possible input.

> The equivalent would be units tests that cover every single possible input.

That's what "exhaustive" means.


It's worth noting that coverage can be a deceptive metric sometimes.

You can have coverage on code that divides - it won't tell you if you ever divide by zero.

You can have coverage on code that follows a pointer - it won't tell you if you ever pass a bad pointer.
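To make the divide-by-zero point concrete: a hypothetical two-line function where a single test yields 100% line coverage while the crash case goes completely untested.

```c
/* Hypothetical function. A single call like ratio(6, 3) executes every
 * line, so a coverage tool reports 100% - yet ratio(1, 0) is undefined
 * behaviour, and nothing about the coverage number warns you. */
int ratio(int a, int b) {
    return a / b;
}
```

Coverage tells you a line ran, not that it ran with the inputs that break it.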

yeah but there isn't even a try here

Not saying it isn't worth an attempt, just that the real meaning shouldn't be lost.

It is trivial to enforce that new functions have new unit tests and fuzz tests. You are the reviewer of https://github.com/openssl/openssl/pull/9654 and you just say "Please add unit tests and fuzz tests for foo and bar" and you don't approve it.

I don't know what the deal is with their testing culture but in year 27 of the project they demonstrably haven't learned this lesson. It's nice that they added integration tests (testing given encoded certs) but as the article points out that was insufficient.

IMO, one of the biggest benefits of "modern" systems languages like Rust, D, Zig is how much easier they make writing and running tests compared to C and C++. Yes, you can write tests for those languages, but it's nowhere near as trivial. And that makes a difference.

I was writing unit tests for C in 1996; naturally, the term hadn't been coined yet, so we just called them automatic tests.

It was part of our data structures and algorithms project, failure to execute the automatic tests meant no admission to the final exam.

We had three sets of tests: those provided at the beginning of the term, those that we were expected to write ourselves, and a surprise set during the integration week at the end of the semester.

I am not buying it.

Writing unit tests for c/c++ is trivial. There are perfectly fine test frameworks, used by developers every day, integrated in any major IDE or runnable as one-liner from the command line.

This is absolutely a cultural problem.

In modern languages you either have built in testing or it’s a single package install away. It is nowhere near this simple with C or C++

Last week in a code review I got a "please add unit tests for this code" comment. The person who wrote that comment wasn't aware that this was a refactoring where the functionality was well tested.

There is no substitute for reviewers who really understand the code in question. The problem is they are the ones writing the code and so are biased and not able to give a good review.

It should be realistic, line coverage isn't really that hard. The hard thing is that high line coverage alone is usually not enough for numerical stuff...

I don't get it. Why doesn't everyone just use the battle-hardened, fully-compliant Rust implementation of OpenSSL?

Do you have a link? Do you mean rustls?

Would it be reasonable to have fuzz testing around reasonably sized units in addition to e2e?

Yes, but see also "property testing" like QuickCheck and Hypothesis, the line is blurry.

I'm not familiar enough with C to know the answer, but I'm trying to think how anything goes from untrusted input -> trusted input safely. To sanitize the data, you're putting the input into memory to perform logic on it; isn't that itself an attack vector? I would think that any language would need to do this.

Is anyone able to explain this to me?

There are a lot of different issues that can come up, but in practice ~80% of those (my made up number) are out-of-bounds issues. So for example, say you're parsing a JSON string literal. What happens if the close-quote is missing from the end of the string? You might have a loop that iterates forward looking for the close-quote until it reaches the end of the input. What that code should do is then return an error like "unclosed string". If you write that check, your code will be fine in any language. What if you forget that check? In most languages you'll get an exception like "tried to read element X+1 in an array of length X". That's not a great error message, but it's invalid JSON anyway, so maybe we don't care super much. However in C, array accesses aren't bounds-checked, so your loop plows forward into random memory, and you get a CVE roughly like this one.

In short, the issue is that you forgot a check, and your code effectively "trusted" that the input would close all its strings. If you never make mistakes like that, you can validate input in C just like in any other language. But the consequences of making that mistake in C are really nasty.
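A sketch of the loop described above (a hypothetical parser fragment, not real JSON-parser code). The `i < len` bound is exactly the check that's easy to forget, and the one C won't enforce for you.

```c
#include <stddef.h>

/* Scan a JSON string literal starting just after the opening quote.
 * Returns the index of the closing quote, or -1 for "unclosed string".
 * Hypothetical fragment; a real parser also handles escapes like \" . */
long scan_string(const char *input, size_t len, size_t start) {
    for (size_t i = start; i < len; i++) {  /* drop `i < len` and C will
                                               happily walk into whatever
                                               memory follows the input */
        if (input[i] == '"')
            return (long)i;
    }
    return -1;  /* end of input with no close-quote: report the error */
}
```

In a bounds-checked language, forgetting the length check turns into an exception; here it turns into a read of adjacent memory.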

The error you're describing is more likely to happen with an array of ints (or really any other type without a sentinel value).

Strings specifically are often enclosed in a `while(c != '\0')` loop (assume c is the character being examined) or something to that effect, which means you'll exit at the end of the string (non-string arrays don't have this).

The CVE in question seems to be the exact opposite of this. It's that someone didn't check the bounds on a write instead of a read.

`while(c != '\0')` is the same as `while(c != '"')`. An attacker controlled string may very well be missing the 0 byte, which has been an extremely common attack vector (though is probably not a realistic attack vector for JSON parsers, to be fair).

> An attacker controlled string may very will be missing the 0 byte

Entirely possible, especially if the attacker is local. But when we're dealing with something coming in over the network, I think even the old arpa headers get you a null byte at the end, regardless of whether one was sent.

Unless we aren't dealing with tcp/ip, in which case I'm way out of my depth.

Just because something is in memory doesn’t mean that it is realistically executable. That’s why you can download a virus to look at the code without it installing itself.

You aren’t wrong that even downloading untrusted data is less secure than not downloading it. But to actually exploit a machine that is actively sanitizing unsafe data, you need either (A) an attack vector for executing code at an arbitrary location in memory, or (B) a known OOB bug in the code that you can exploit to read your malicious data, by ensuring your data is right after the data affected by the OOB bug.

>To sanitize the data, you're putting the input into memory to perform logic on it

Sure, but memory isn't normally executed.

One of the more common problems was not checking length. Many C functions assume sanitized data, so they don't check. There are functions that read data without checking length - thus if someone supplies more data than you have room for (gets is the most famous, but there are others), the rest of the data will just keep going off the end. It turns out that in many cases you can predict where that off-the-end is, and then craft the data to be something the computer will run.

One common variation: C assumes that many strings end with a null character. There are a number of ways to get a string to not end with that null, and if the user can force that, those functions will read/write past the end of the data, which is sometimes exploitable.

So long as your C code carefully checks the length of everything, you are fine. One common variation of this is checking length but miscounting by one character. It is very hard to get this right every single time, and if you mess it up just once you are open to something unknown in the future.

(Note, there are also memory issues with malloc that I didn't cover, but that is something else C makes hard to get right).

Don't forget the hilariously dangerous strcpy that they "fixed" with strncpy, which would happily create unterminated strings, so it was fixed again with strlcpy. At least std::string doesn't have these problems (it has its own issues, because the anemic API surface means you keep needing C APIs that require null termination).

Slapping strlcpy on everything, as some codebases/companies have taken to doing, is a poor fix. The proper fix is not quite shipping yet, but you can build your own out of memccpy if you'd like. (Of course, at the risk of doing it wrong…)
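A sketch of a memccpy-based bounded copy along the lines the comment suggests (`copy_str` is my own hypothetical name, not a standard function):

```c
#include <string.h>

/* Bounded string copy built on memccpy. Copies at most size bytes,
 * always null-terminates the destination, and reports truncation so
 * the caller can't silently lose data the way strncpy allows. */
int copy_str(char *dst, const char *src, size_t size) {
    if (size == 0)
        return -1;
    char *end = memccpy(dst, src, '\0', size);
    if (end == NULL) {            /* no '\0' within size bytes: truncated */
        dst[size - 1] = '\0';
        return -1;
    }
    return 0;                     /* whole string, terminator included */
}
```

Unlike strlcpy, memccpy is in POSIX, and forcing the caller to look at the return value makes silent truncation at least a conscious choice.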

Counterpoint: This was actually demonstrated to be detectable with very basic fuzzing, if done at a lower level.

So, are there now people looking for projects that fuzz incorrectly to scope out prospective targets?

Adding more and better fuzzing instead of trying to fix the issue (potentially malicious user input inside a C library) seems like the wrong way to address the problem. Buffer overruns just shouldn’t be a concern of the developer or test suite but of the compiler or language runtime.

There are two problems: the CVE, and the fact that the current fuzzing harness did not find it. The CVE is getting fixed, but obviously the fuzzer needs work too, because it exists to find these kinds of issues before they get used in the wild.

It's being handled how it should be. This happened, let's handle it, and how can we work to better address future problems.

Trusting the fuzzer and not examining its coverage seems to be the main problem here.

I fail to see what is problematic about giving the developer control over the entire flow of the program. Quite the contrary: I am more concerned about the paradigm shift towards higher-level systems programming languages that hide more and more control from the developer while putting more of the burden on the perfection of the optimizer.

Absent a high-performing systems language that still offers some safety guarantees, the right call should be to use whatever the second best is. It could be a higher-level language with runtime overhead, sandboxing, formal verification, etc. In some cases constraints won't allow this, and obviously replacing even parts of infrastructure code is never easy. Nor should the perfect be the enemy of the good - adding better testing doesn't sound like a bad idea even for a piece of code being sunset. What I'm objecting to is the (apparent, or my perceived!) idea that "if only the fuzzing was good enough, this code would be acceptable for use forever."

The average developer is typically more error-prone than their compiler is.

Open-loop fuzz testing catches only the most shallow of bugs. It is like genetic optimization with no fitness function.

Why are people still using parsers for untrusted input in C? That is the real flaw here, not how the fuzzing was done.

But modern fuzzers aren't open-loop, they are coverage directed, adjusting their inputs to increase coverage. As the article points out, this works best if leaf functions are fuzzed; difficult to reach corners still might not be found.

The problem here was that the coverage of the fuzz testing was not being examined.

Using parsers for untrusted input in C is a legacy of when this was written. Requiring the parsing portion (or any version of OpenSSL) to be rewritten in Rust or whatever new language is a massive change given the length of time the OpenSSL project has been around.

Parser generators are one of the oldest ideas in computer science. YACC was written in the 70s, and had to be ported to C because its original implementation was written in B.

The idea of not writing parsers directly was well established by the time OpenSSL started in the late 90s.

Industry prefers hand-written parsers for a reason.

Parser generators feel like an academic dream

Why is this? I come from academia and I have yet to encounter a good argument for not using parser combinators, in new applications. Can you please point to some reason?

Compilers have moved away from parser generators in favor of hand-written recursive descent parsers in many cases, and the main reason has been to produce high-quality error messages. In the special case of C++, another problem has been that the language is not LR(n) for any n; unbounded lookahead can be required in some cases.
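A toy illustration of the error-message advantage of hand-written recursive descent (my own made-up grammar `digit ('+' digit)*`, not from any real compiler): at the failure point, the parser knows exactly what it expected and where.

```c
#include <ctype.h>
#include <stdio.h>

/* Toy recursive-descent parser for:  digit ('+' digit)*
 * On failure it states precisely what was expected at which position -
 * the kind of message table-driven generated parsers struggle with. */
typedef struct { const char *src; size_t pos; char err[64]; } Parser;

static int digit(Parser *p) {
    if (!isdigit((unsigned char)p->src[p->pos])) {
        snprintf(p->err, sizeof p->err,
                 "expected digit at position %zu", p->pos);
        return 0;
    }
    p->pos++;
    return 1;
}

int parse_sum(Parser *p) {
    if (!digit(p)) return 0;
    while (p->src[p->pos] == '+') {
        p->pos++;
        if (!digit(p)) return 0;   /* err already says which digit, where */
    }
    return p->src[p->pos] == '\0'; /* must consume the whole input */
}
```

Because each grammar rule is an ordinary function, the "expected X here" context is just local variables, which is what makes diagnostics (and C++-style unbounded lookahead) straightforward to bolt on.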

Given that so many of these bugs are parsing bugs, maybe we should have a new emphasis on compiler compilers which generate fast, provably correct code? (Fast, because most of the bugs and exploits are accompanied by some form of optimization.)

Fast and provably correct are more or less solved problems (at least for a large class of languages).

The main drawback is that it is difficult to get good error messages when a parse fails.

I think you may run into the halting problem here.

You don't have to solve the halting problem every time it presents itself. Otherwise, things like Valgrind and fuzzing wouldn't be valid at all. You just have to improve your odds.

EDIT: An important note to newbs: The Halting Problem is correct. However, a problem which maps to the halting problem can still be solved often enough in practice to make it worthwhile. In fact, entire industries have been born of heuristic solutions to such problems.

Note that I was responding to a post mentioning 'provably correct code', which is very different from opportunistic efforts to improve code.

Valgrind and fuzzing are useful tools, but there is no general method: the problem is semi-decidable, with enumerable inputs. This means that fuzzing can be useful both with random walks and constrained random walks, but it doesn't come close to generating 'provably correct code'.

The 'Post correspondence problem' is maybe an easier way to see how this applies at the compiler level.

Tools that help are opportunistic, but that doesn't change the undecidability of the generalizations.

This crosses domains a bit, but since it is commonly used: considering the VC dimensionality of the problem also helps, as do pure math problems like the Collatz conjecture that generalize to the same halting problem.

Generating 'provably correct code' in the general case is simply not possible without major advances in math. There is room to improve with many paths to do so, just not going down this particular path.

> Note that I was responding to post mentioning 'provably correct code' which is very different opportunistic efforts to improve code.

Note that parsing a context free grammar maps to a stack machine. The Halting Problem uses a Turing Machine. I don't think it applies! (But if you do still want to show off your knowledge, please elucidate. I've forgotten about half of the automata theory I covered as an undergrad and in grad school.) For parsing tasks corresponding to computational models which are less complex, like regular expressions, the problem is well under the threshold of "semi-decidable, with enumerable inputs" in your words.

> Generating 'provably correct code' in the general case is simply not possible without major advances in math.

Note that I was responding specifically to the problem domain of parsing and compiler compilers, and that most parsing problems involve models of computation which are akin to a stack machine or less powerful.

People generally throwing out The Halting Problem as an objection without carefully considering the particulars/context is one of my chief pet peeves.

The Halting Problem is typically used because it was one of the first problems proven undecidable, for the same reason that P vs. NP questions often use SAT.

C++ templates, Template Haskell, Lisp macros, etc. are all examples of metaprogramming facilities that would be considered Turing-complete, and thus subject to Rice's Theorem (to avoid the Halting Problem objection).

But last I saw, the code that was generated was the problem, not the primitives. As the languages are Turing-complete, a parser not being so doesn't remove the issue with this CVE.

> But last I saw, the code that was generated was the problem, not the primitives.

Good! So, if what we want the compiler compiler to do is, say, just to produce an output isomorphic to the input, then this also reduces the complexity of what we're asking to do. I think this falls well inside what we could automate with some kind of guarantee of correctness.

A good mental habit for programmers is to constantly ask, "Can the stated problem be reduced in scope, such that we satisfy the goal?"

It's not that big of a change, in the grand scheme of things. But it's also not the only thing you can do. The memory safe subset of C++ is also an option.

Shipping a CVE in critical infrastructure because of a trivial memory safety bug is borderline negligence in 2022. This is why people get upset over new code being written in C. The cost of shipping bugs like this dwarfs the cost of writing new portions of the software with memory safety in mind; C gets chosen mostly because it's more convenient for the build tooling.

The bigger question is why OpenSSL hasn't bitten the bullet and adopted some memory safety guarantees in their tooling, given the known sources of these bugs and the prevalent literature and tools for avoiding them!

> The memory safe subset of C++ is also an option.

This does not actually exist, as far as I'm aware. There are certain things people propose doing in C++ that eliminate a small number of issues, but I haven't seen anyone clearly define and propose a subset of C++ that is reasonably described as memory safe. Even if such a subset existed, you would still need some way to statically enforce that people only use that.

Even just writing the parsers in Lua should be a safer choice than writing them in C, but I think now is as good a time as any to start writing critical code paths in Rust. If the Linux kernel is beginning to allow Rust for kernel modules, then it is high time that OpenSSL looked more seriously at Rust too.

As others have pointed out, parser generators could be a useful intermediate option for some of this.

You can write memory-safe C++ more easily than memory-safe C. Statically verifying that the C++ actually is memory safe, given the numerous ways to write unsafe code in C++, is a different goalpost.

My point is that there isn't a compelling reason to write new code in C for something where safety is critical.

It is called C++ Core Guidelines.

Some interesting holes in that spec:

> What to do with leaks out of temporaries? : p = (s1 + s2).c_str();

> pointer/iterator invalidation leading to dangling pointers

I feel like those are relatively common memory safety pitfalls, not obscure corner cases. The Core Guidelines have been in the works for about 7 years, I think? It's not clear if/when these will ever be addressed if they haven't been addressed by now.


There are other things mentioned in the list that look suspicious, but they're less clear. So, unless I'm misreading this, then I stand behind my original assertion that there is no safe subset of C++, but the Core Guidelines are certainly better than nothing... assuming they're actually used/enforced in real world applications. Other people are welcome to have their own opinions.

No, it's not. There is no set of guidelines that exists today that creates a memory safe subset of C++. If you claim to have one, and have a way of verifying that a program satisfies those guidelines, I'll find you a program that has unsafe behavior.

That is your opinion, many folks at ISO C++ see it differently.

Hence why the ongoing efforts to improve static analysis tooling in regards to mechanical enforcement of C++ Core Guidelines across all major C++ compilers, IDEs and commercial static analysers.

Is it perfect? No, but don't let perfect be the enemy of good.

In any case, anyone that cares about secure code shouldn't be touching any language that is copy-paste compatible with C, unless they can't avoid it.

I think you fundamentally misunderstand what memory safety means. "Kinda sorta better" is not memory safety. It's when you can pick out a safe subset of the language and guarantee, barring errors in the language implementation itself, that certain kinds of errors are not possible. C++ cannot do this today and implementations I have seen for C have required semi-invasive changes to the language itself.

I think you fundamentally misunderstand how static analysers and check-in hooks can be used to force everyone to play by the rules, assuming management plays ball with SecDevOps best practices.

As for the rest, I was quite clear where I stand on my last sentence.

I don't get it. You responded to a post that talks about a memory safe subset of C++ by mentioning the Core Guidelines. You are aware of what memory safety is and how this doesn't actually solve that problem (I know you are, you've been around here long enough). Then why bring it up? The answer to "is there a memory safe subset of C++" is "no". The answer to "should you write security critical software in C++" is "generally no because there is no memory safe subset of C++". The Core Guidelines is a good way to not run into the trivial pitfalls but it's not an answer. Why are you presenting it like one?

Rewriting the parser portion of anything is not 'massive'. Boring as anything, but not difficult and not that time-consuming.

You can easily write a robust parser in C. Just don't write a clump of code that interleaves pointer manipulation for scanning the input, writing the output and doing the parsing per se.

* Have a stream-like abstraction for getting or peeking at the next symbol (and pushing back, if necessary). Make it impervious to abuse; under no circumstances will it access memory beyond the end of a string or whatever.

* Have some safe primitives for producing whatever output the parser produces.

* Work only with the primitives, and check all the cases of their return values.
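A minimal sketch of such a stream primitive (the `instream` type and function names are illustrative, not from any real codebase):

```c
#include <stddef.h>

/* Bounds-checked input stream: parsing code works only through these
   primitives and can never read past the end of the buffer. */
typedef struct {
    const unsigned char *data;
    size_t len;
    size_t pos;
} instream;

/* Peek at the next byte, or return -1 at end of input. */
int in_peek(const instream *s) {
    return (s->pos < s->len) ? s->data[s->pos] : -1;
}

/* Consume and return the next byte, or -1 at end of input. */
int in_next(instream *s) {
    return (s->pos < s->len) ? s->data[s->pos++] : -1;
}

/* Push back one byte (only meaningful after a successful in_next). */
void in_unget(instream *s) {
    if (s->pos > 0) s->pos--;
}
```

Parsing logic written against these three functions can get a termination condition wrong and still only ever see a clean -1, never an out-of-bounds read.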

Because there isn't a good way of distributing pre-compiled cross-platform C libraries. So if you want to use a parsing library written in Rust, for example, you'd need to add Rust to your toolchain, which is a pain.

One solution to this problem would be to write an LLVM backend that outputs C. Maybe such a thing already exists.

I'm confused and very far from an expert here. What is wrong with parsers, and what is the alternative?

A specific class of parsers

>parsers for untrusted input in C

Pretty much all input is untrusted unless it originated (exclusively!) from something with more permissions that is trustworthy.

The kernel is written in C.

So that pretty much means all parsers written in C and every other language should consider all input untrustworthy, no?

Linux is probably the most carefully constructed C codebase in existence and still falls into C pitfalls semi-regularly. Every other project has no hope of safely using C. It's looking more and more like Linux should be carefully rewritten in Rust. It's a monstrous task, but I can see it happening over the next decade.

I agree with the spirit of your comment.

> Linux is probably the most carefully constructed C codebase in existence and still falls into C pitfalls semi-regularly.

My guess is that it would actually be OpenBSD, but I'm not sure either way.

That didn't answer anything. If you want to do anything with your input, you have to run it through a parser. Doesn't matter if it's untrusted or not. Your only options are ignoring the input, echoing it somewhere, or parsing it.

They're saying don't write such a parser in C. Use something else (memory safe language, parser generator, whatever).

And then do what with it? Throw it away?

If it hands it to a C program, that C program needs to parse (in some form!) those values!

How is a C program expected to ever do anything if it can’t safely handle input?

Untrusted input -> memory-safe parser -> trusted input -> C program.

Probably not that important for `ls`, probably worth it for OpenSSL.

The challenge, of course, is the link to the 'memory safe parser': how the untrusted input gets to it while mediated by C, correct?

Right, but you do have the option of writing that parser in a language other than C. And given how often severe security issues are caused by such parsers written in C, one probably ought to choose a different language, or at least use C functions and string types that store a length rather than relying on null termination.

>>>>> Why are people still using parsers for untrusted input in C?

No matter what the parser itself is written in, if you're writing in C you'll be using the parser in C.

If you have the input in a buffer of known length in C, hand it off to a (dynamic or static) library written in a safe language, and get back trusted parsed output, then there's much less attack surface in your C code.

The issue in many of these cases is there appears to be no canonical safe way to know the length of the input in C, and people apparently screw up keeping track of the lengths of the buffers all the time.

This is why you reduce the amount of C code that has to keep track of it to as little as possible.

1. Well don’t write in C then if your program is security critical or going to be exposed over a network. Sure, there are some targets that require C, but that’s not the case for the vast majority of platforms running OpenSSL.

2. That’s still less of a problem, as the C will then be handling trusted data validated by the safe language.

If you make argument 2) could you explain how writing a parser is more security critical than any other code that has a (direct or indirect) interaction with the network? At least recursive descent parsers are close to trivial. I usually start by writing a "next_byte" function and then "next_token". You'll have to look very hard to find any pointer code there. It's close to impossible to get this wrong and I don't see how the fact that it's a parser would make it any more dangerous.
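As a hedged illustration of the next_byte/next_token structure described above (the `lexer` type and token kinds are invented for this sketch, not taken from any real parser):

```c
#include <ctype.h>
#include <stddef.h>

/* Sketch of a recursive-descent front end: one function owns the buffer,
   everything above it works with int byte values rather than pointers. */
typedef struct {
    const unsigned char *buf;
    size_t len;
    size_t pos;
} lexer;

enum token { TOK_EOF, TOK_NUMBER, TOK_WORD, TOK_OTHER };

/* The only function that touches the buffer: -1 signals end of input. */
static int next_byte(lexer *lx) {
    return (lx->pos < lx->len) ? lx->buf[lx->pos++] : -1;
}

enum token next_token(lexer *lx) {
    int c = next_byte(lx), d;
    while (c == ' ')
        c = next_byte(lx);           /* skip spaces */
    if (c < 0)
        return TOK_EOF;
    if (isdigit(c)) {
        while ((d = next_byte(lx)) >= 0 && isdigit(d))
            ;
        if (d >= 0) lx->pos--;       /* push back the non-digit */
        return TOK_NUMBER;
    }
    if (isalpha(c)) {
        while ((d = next_byte(lx)) >= 0 && isalpha(d))
            ;
        if (d >= 0) lx->pos--;       /* push back the non-letter */
        return TOK_WORD;
    }
    return TOK_OTHER;
}
```

Note there is no raw pointer arithmetic anywhere above `next_byte`; the grammar logic can be wrong without ever reading out of bounds.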

Well, if you're dealing with a struct then the compiler will provide type safety if, say, you try to access a field that doesn't exist. You don't get the same safeguards when dealing with raw bytes. Admittedly in C you can also run into these hazards with arrays and strings, which is why I suggest using non-standard array and string types which actually store the length if you insist on using C.
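A minimal sketch of such a length-carrying type (the `span` name and helpers are hypothetical, not a standard C API):

```c
#include <stddef.h>

/* A pointer paired with its length, so the length can't drift out of
   sync with the data the way separate variables tend to. */
typedef struct {
    const char *ptr;
    size_t len;
} span;

/* Bounds-checked indexing: returns -1 instead of reading out of range. */
int span_at(span s, size_t i) {
    return (i < s.len) ? (unsigned char)s.ptr[i] : -1;
}

/* Sub-span that can never extend past the parent, even with bad offsets. */
span span_slice(span s, size_t off, size_t n) {
    if (off > s.len) off = s.len;
    if (n > s.len - off) n = s.len - off;
    span r = { s.ptr + off, n };
    return r;
}
```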

When a C program is factored well there needn't be all that much access by pointer + index. I'm not saying it can't be frequent in certain kinds of code, but for many things it's easy to just put a simple abstraction (API consisting of a few functions) that you have to get right once, then can reuse dozens of times.

Plain pointer access in high-level code (say when parsing a particular syntactic element by hand in a recursive descent parser) is a violation of the principle of separation of concerns IMO.

In any case I still don't see what's special about parsers. Most vulnerabilities I suspect to be in the higher levels, like validating parsed numbers and references, for a trivial example. In general, those are checks that are likely to be implemented much closer at the core of the application.

> Most vulnerabilities I suspect to be in the higher levels, like validating parsed numbers and references, for a trivial example. In general, those are checks that are likely to be implemented much closer at the core of the application.

What I see (especially in libraries like OpenSSL) is the core logic often receives a lot of scrutiny and testing, and thus it is silly mistakes with offsets and bounds checks that make up the majority of bugs.

It’s also worth considering the severity of different kinds of bug. A bug in high level logic might allow an attacker to do something they shouldn’t be able to do, but it doesn’t give them code execution.

The worst bit is, an attacker can often gain code execution through a part of the code that otherwise wouldn’t be security critical (where a logic mistake would be low impact). So writing code in a language that allows for these vulnerabilities greatly increases your attack surface.

> It's close to impossible to get this wrong and I don't see how the fact that it's a parser would make it any more dangerous.

I can answer that one. The parser is more dangerous because a parser, essentially by definition, takes untrusted input.

Nothing the parser does is any more dangerous than the rest of the code; it's all about the parser's position in the data flow.

I agree that the original statement encourages that interpretation, but I think it admits the interpretation that the parser itself is in C and I think that is what was intended.

Even if application constraints mean you can't write a parser in another language that's linkable to C, why couldn't you use a parser generator that outputs C?

What alternatives are there to parsers? Genuine question from the ignorant.

There are no alternatives to parsers. There are alternatives to "in C".

Loads of bugs aren't detected by fuzz testing, as this technique exhibits stochastic behaviour, where you'll most likely find bugs overall, but have varying chances (including none at all) of uncovering specific bugs.

Which is great news for those of us who approach such research by gaining a deep understanding of the code and the systems it exists in, and figuring out vulnerabilities from that perspective. An overreliance on fuzzing keeps us employed.

Or seen the other way around. By applying fuzzing to find the “silly” type of bugs, you can spend your artistic efforts on finding the other bugs.

I think this is the main reason fuzzing exists: leave the boring part to the tool and focus on the more creative work.

Fuzz testing has a very high chance of detecting bugs, especially this kind, but you do need to at least check that the fuzzer is reaching the relevant code!

This is reasoning backwards in a misleading way. The point is not changing the fuzzing setup to find this specific bug that we now know with hindsight was there. There are a zillion paths and you would need to be ensuring that fuzzing reaches all vulnerable code with values that trigger all vulnerable dynamic behaviours.

It's not backwards: you run the fuzzer, you look at the code coverage, and you compare that against what you expect to be tested. Then you update the fuzzing harness to allow it to find missing code paths.

It's far more doable than you are suggesting: fuzzing automatically covers most branches anyway, so you just need to manually deal with the exceptions (which are easy to locate from the code coverage).

I used fuzzing to test an implementation of Raft, and with only a little help, the fuzzer was able to execute every major code path, including dynamic cluster membership changes, network failure and delays. The Raft safety invariants are checked after each step. Does this guarantee that there are no bugs? Of course not. It did however find some very difficult to reproduce issues that would never have been caught during manual testing. And this is with a project not even particularly well suited to fuzzing! A parser is the dream scenario for a fuzzer, you just have to actually run it...

Yep, code coverage can tell you code is definitely entirely untested, but doesn't tell you that you are covering the input space to have high assurance that there aren't vulnerabilities.

Coverage might have helped here (or not), but it doesn't fix the general problem of fuzzing being stochastic and only testing some behaviours of the covered code.

I wonder if one possible solution is making things more "the Unix way" or like microservices. Then instead of depending on some super specific inputs to reach deep into some code branch, you can just send input directly to that piece and fuzz it. Even if fuzzers only catch shallow bugs, if everything is spread out enough then each part will be simple and shallow.

Fuzzers can already do this. When you set up a fuzzer you set up what functions it's going to call and how it should generate inputs to the function. So you can fuzz the X.509 parsing code and hope it hits punycode parsing paths, but you can also fuzz the punycode parsing routines directly.
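For example, a libFuzzer-style harness can target a single routine directly. Here `parse_label` is a toy stand-in for something like the punycode decoder; only `LLVMFuzzerTestOneInput` is the real libFuzzer entry point:

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the routine under test: accepts only ASCII letters. */
int parse_label(const uint8_t *data, size_t size) {
    for (size_t i = 0; i < size; i++) {
        uint8_t c = data[i];
        if (!((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')))
            return -1;
    }
    return 0;
}

/* libFuzzer calls this with generated inputs; crashes and sanitizer
   reports inside parse_label count as findings. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    parse_label(data, size);
    return 0;
}
```

Built with something like `clang -fsanitize=fuzzer,address harness.c`, the fuzzer exercises the routine directly, without needing inputs that first thread through the whole X.509 parser.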

This is the flip side of fuzzing, the approach called property testing. It's legit, but it involves unit-test-style manual creation of lots of tests for various components of the system, plus a lot of specification of the contracts between components and aligning the property tests to those contracts.

isn't that what code coverage does?

Fuzz tests can take a seed corpus of test vectors. If the test framework tries them first, it can guarantee that it will find those bugs in any test run. For anything beyond that, it depends on chance.

> I think we should give the developers the benefit of doubt and assume they were acting in good faith and try to see what could be improved.

I feel like there is this trend of assuming any harsh criticism is bad faith. Asking why industry standard $SECURITY_CONTROL didn't work immediately after an issue happened that should have been caught by $SECURITY_CONTROL is hardly a bad faith question.

Questions themselves are not good-faith or bad-faith. People asking questions are doing so in either good-faith or bad-faith.

Someone pushing hard on legitimate criticisms with the intent of attacking a project or members thereof is acting in bad-faith, while someone ignorant with a totally bogus criticism could be acting in good-faith. Many bad-faith actors hide behind a veneer of legitimacy by disguising or shifting the gaze away from their motivations.

Umm, i disagree.

Bad/good faith is about whether you are being misleading or dishonest in asking the question.

You can intend to go attack a project in good faith as long as you are not being misleading in your intentions.

For example, a movie critic who pans a film is not acting in bad faith since they aren't being misleading in their intentions.

A movie critic who pans a film they think sucked is acting in good faith; a movie critic who pans a film specifically with the intent of attacking the film (whether as clickbait or because they don't like someone involved with the film or whatever) is acting in bad faith.

We might actually be in agreement with each other because a critic who leads with "The director slept with my wife, so I'm only going to say all the bad things about the film and you should probably ignore this review" would have significantly blunted their attack by leading with it, and are arguably not acting in bad-faith.

OpenSSL is known to be broken.

All this bickering over language misses the real problem.

The actual solution is that open source, widely used code is a target for hackers.

By using one library used everywhere for everything, you’re painting a target on your own back.

The real solution is we need the software ecosystem to have more competition and decentralization.

Use alternative crypto libraries.

If you want a drop-in replacement, use LibreSSL, which was forked and cleaned up by the OpenBSD guys after Heartbleed.

But the long-term solution is more competition: using smaller, more specialized libraries, or even writing your own.

> The actual solution is that open source, widely used code is a target for hackers.

The long term solution is likely using languages which are A. Memory safe, and B. make formal verification viable. Being widely used and open source isn't an issue if there are no exploitable bugs in the code.

I see a lot of people pushing memory safe languages, far more people than actually write systems code.

What is your primary language?

Many of those people have experience writing systems programming code before UNIX and C got widespread outside Bell Labs.

Mac OS was originally written in Object Pascal + Assembly, just to cite one example from several.


We've banned this account for breaking the site guidelines. Please don't create accounts to do that with. If you've decided you want to use HN as intended, that's fine, but in that case please review https://news.ycombinator.com/newsguidelines.html and be sure you're sticking to the rules.


Please don't break the site guidelines just because another comment is bad. That only makes things worse.


My primary language is JavaScript, but I use Rust when I need to write more low-level (or higher performance) code. I certainly haven't been writing kernel code or anything like that, but I think it's fair to say that I practice what I preach when it comes to using memory safe languages.

There are a limited number of people willing to spend a limited amount of time fuzzing, reviewing, and scrutinizing crypto libraries. The more libraries exist, the more their efforts are divided, and the total scrutiny each library receives decreases. How would this help the problem?

I find it quite brave to trust the OpenBSD guys by default. Historically, they have way too many forked huge projects (Apache, gcc, patched clang, ...) to understand them in depth.

OpenSMTPD had its fair share of exploits. sudo had its fair share of exploits.
