"My qbe C frontend is actually written in myrddin which is a lot like rust/ocaml. To be honest, C is not well suited to places where there is adversarial input such as servers, but when it comes to logical correctness of code I do not see amazing benefits from languages like ocaml."
Did you mean two? Or did you come to new conclusions after that experience? Or did I misunderstand that entirely? ;)
There ARE benefits to correctness from better languages, just not as amazing as people think.
I'm just saying that a vulnerability in C is going to be an out-of-bounds exception in another language. In both cases the code is incorrect; one is just safer than the other when it comes to exploits.
Keep making more languages, we can't know for sure if we don't try.
I can see where you're going, but I don't think it's correct. There are clearly cases where an error will mess you up whether it goes as far as in C or not. Other times, the language or tooling prevents the error before runtime to ensure correctness. Are you aware of the benefits of Design-by-Contract (Ada/Eiffel), static proving (SPARK), or dependent types (ATS), though? The correctness criteria you can encode in them can straight-up prevent errors at the interface or algorithmic-expression level. Three of these have been used for low-level code, two of them often in real-time systems and one even on an 8-bit microcontroller. Depending on the automation or interactive use involved, the errors caught at compile time can go pretty far past what a basic, low-level type system can do.
So, we already know we can knock out extra classes of errors with such languages. It was proven in theory and in the field. Using or improving them is just good engineering. We can also make more languages in a trial-and-error discovery process to see if we find more benefits. Exceeding C's benefits, though, is already empirically proven to be worthwhile, whether it's a Myrddin, a SPARK, or an ATS.
C code will continue to dereference pointers after object deallocation, access arrays out of bounds, and not complain about integer overflow. Even if an input doesn't lead to remote code execution, you will silently get an incorrect computation result. A safer language helps in cases other than security.
And there are newer languages that have this problem. I was shocked to see that I could produce a SIGSEGV in Nim within the first 15 lines of code that I wrote.
Why were you shocked? Is it because you're used to languages that prevent null values by default or do you have concerns about the safety of Nim because of this error?
Whatever the case, as far as I am aware, sooner or later Nim will prevent nil values by default.
A segfault on null is fine if the language defines it to be OK (i.e. dereferencing null is reliably a crash), but most environments that allow interacting with null (C, C++, LLVM IR) make dereferencing it undefined behaviour. That means the compiler may optimise in ways such that code that naively results in a null dereference/segfault does not do that, and instead results in random memory corruption.
Note that Nim uses C as an intermediate language for compilation, so unchecked dereferences of possibly-null pointers are not memory safe.
Modifying Nim to perform a check before each dereference would be trivial. The reason this isn't done yet is that, in practice, unchecked dereferences do not cause random memory corruption. I have been using Nim for a while now and have never seen an instance where this was a problem.
Sure it is. It means your language and tooling allowed an unauthorized access that resulted in a segfault. Many don't. There are even systems languages like Clay and Rust that can catch those things. Also, tools like SoftBound+CETS neutralize them in C code. The category is called "temporal memory safety."
Definitely a memory safety issue. Definitely worth preventing if possible. If not preventable, definitely worth handling better than mere segfaults.
No, it's the same as a .unwrap() in Rust, or malloc exiting when it fails. It means the language doesn't force you to handle all cases, not that it's "unsafe."
The point about it being like .unwrap is that the bug is a completeness bug.
My theory is that people have converted "memory unsafety can cause segfaults" into "segfaults are memory unsafe." Segfaults are actually the desired outcome, and initializing a pointer with null, instead of leaving it uninitialized, is how that outcome is achieved.
The desired outcome is a program that either doesn't have the error or crashes safely with a report of exactly how it happened, not segfaults or undefined behavior in general. The latter are simply a consequence of C's design and programming style rather than anything inevitable or ideal.
It's tradition that failures of a language's safety system to handle memory are "memory safety issues." However, the fact that they can be used for code injection in some scenarios is even more reason to think of them that way and make our languages prevent them where possible. Quick example:
Yeah, in C you could have a problem if there's a large object, but in that language you already have undefined behavior. Nim is not C, and bounds checking on arrays prevents that. (I am only skimming the docs, so tell me if that is incorrect.)
Segfaults can often be turned into an actual exploit; unwinding the stack cannot. It would be better to explicitly abort than to purposefully cause a segfault.
You are committing the exact generalization of segfaults=bad that I mentioned in the parent comment. Memory safety violations which cause segfaults can often be turned into an actual exploit. That does not mean the same thing as, "If A is a segfault, A can be turned into an exploit."
And it's not better to "explicitly" abort, because you get zero benefit from that, and because testing for null on every dereference would be slow. (This is why JVM implementations often let the dereference segfault and translate the signal into a NullPointerException.)
I didn't say that all segfaults are always security problems, just that they often are.
I think we'll have to agree to disagree here, though. Segfaults are memory safety problems, and unwrap is significantly different because it's not a memory safety problem.
You can "agree to disagree" about a definition, but we'll also have to agree to disagree on what properties the definition should have.
I'd say that if a language L is memory-safe, then a language M which is like L, except that some programs terminate earlier, should also be memory-safe. We'll have to "agree to disagree" about that.
I'd say a compiler's choice of behavior-maintaining implementation technique should not determine whether a language is memory-safe. We'll have to "agree to disagree" about that.
I'd say a definition of memory safety should correspond to some intuitively relevant notion of being safe (with consequences for security and ease of debugging), and that it shouldn't include some kinds of unrelated notions about termination safety while omitting other kinds of termination safety. We'll have to "agree to disagree" about that.
> I didn't say that all segfaults are always security problems, just that they often are.
My mistake, I assumed you were trying to make a point that had some bearing on the discussion.