Hacker News new | past | comments | ask | show | jobs | submit login
Memory Safety in Rust: A Case Study with C (willcrichton.net)
65 points by wcrichton on Feb 3, 2018 | hide | past | web | favorite | 89 comments

A lot of point-missing here about the example C program. Maybe everyone else in the world is a perfect C coder who never makes a mistake, but I've found every single one of those mistakes in my own code at one point or another (though of course, not at quite the same density). The point is that they're easy to make, and though there is tooling, it's extremely cumbersome and imperfect.

Worse, over time as a codebase gets more complicated and older - and only if you're one of the few lucky enough to start with C code that isn't an unholy, inconsistent no-warnings disaster-mess slop bucket of horror - some of those warnings have a nasty habit of getting disabled so some middle manager can expedite some unreasonable commitment out a door to tick a box.

Call me a masochist but I enjoy writing C, particularly with valgrind, *san, afl, et al at my back. But I felt the point was well made by the author and I find it hard not to feel like a world where these problems simply didn't exist at all might be a bit nicer

> The point is that they're easy to make, and though there is tooling, it's extremely cumbersome and imperfect.

What's hard about adding -fsanitize=address or -Weverything to your Makefile? Or running a program under Valgrind, or using the clang static analyzer or coverity? Using AFL can be a little cumbersome but still not that hard.

Depending on what you're doing, AddressSanitizer might not be available for you to use, as there are plenty of ASAN-incompatible constructs that may be used by libraries your application depends on.

(it's worth looking at other sanitizers anyway, it seems not everyone is aware of their existence after all; if you can't put C away they're really great tools for retaining your sanity :))

You can't use -Weverything on a real project.

It's full of noise.

You can enable many others though, but -Weverything is virtually useless.

In contrast to Rust's language design, -fsanitize and -Weverything and static source code analyzer can't give guarantees over memory safety. Considering that your main activity in your Github profile is concentrated on JavaScript, I wouldn't be so cheeky.

Feel free to post your self-written applications (no fuzzers or similar with uninteresting attack surfaces) written in C for open review.

Not exactly the best example. If I take this code (test.c) and try to compile it:

  > gcc test.c
I get an error:

  test.c:21:3: warning: function returns address of local 
  variable [-Wreturn-local-addr]
     return &vec;
After I fix that error, the program segfaults when running. Compile with asan:

  > gcc test.c -Fsanitize=address -lasan -g
Then we can start debugging these problems:

  ==3261== ERROR: AddressSanitizer: attempting double-free on 0x60040000dfd0:
      #2 0x4008b5 in vec_free ./test.c:46
Not trying to say that this is the best workflow for debugging C, but the tooling does exist for these kinds of programming errors.

As another poster pointed out, if I run this via valgrind (after fixing the first error):

  > valgrind ./a.out

  ==4871== Invalid write of size 4
  ==4871==    at 0x40073A: vec_push (test.c:40)

  ==4871== Invalid write of size 4
  ==4871==    at 0x40073A: vec_push (test.c:40)

  ==4871== Invalid read of size 4
  ==4871==    at 0x4007CD: main (test.c:55)

  ==4871== Invalid read of size 8
  ==4871==    at 0x400769: vec_free (test.c:46)

  ==4871== Invalid free() / delete / delete[] / realloc()
  ==4871==    at 0x4C2BDEC: free (...)
  ==4871==    by 0x400770: vec_free (test.c:46)

None of which will help detect the integer overflow unless you're quite serious about your testing.

Very valid point. The code for that:

  int new_capacity = vec->capacity * 2;
  assert(new_capacity > vec->capacity);

Not good enough. The compiler is entirely within its rights to optimize that out, since signed integer overflow is UB in C.

Overflow is a runtime behavior, not a compile time behavior. How a type overflows is dependent on the CPU, not software.

You are either trolling or dangerously wrong. A modern optimizing compiler will simply remove your assert() from the generated assembly. If you are trolling, stop it.

But if you are a C programmer and speaking honestly, you need to have a better understanding of undefined behavior in C before you write any more code. Checking for signed overflow in C is extremely difficult to do correctly. This is a good starting point: https://blog.regehr.org/archives/213.

Nothing in the snippet is undefined. The assert simply enforces the size always increases.

Not sure I understand why you think the assert() gets removed. Is this documented somewhere?

Hint: vec->capacity could be 0

AFAIK you are wrong, and the other poster is right.

Signed overflow (in int * 2) is an UB. If it happens, compiler is free to do anything. Eg. assume your assert can never happen and remove it.

Let me explain: Because signed overflow is an UB, compiler assumes that you would never write a code that overflows like that (because you write C, the assupmtion is that you are omnipotent), and knowing that, remove the assert that could possibly never ever happen, since it's a dead code anyway.

I've spend more than 6 years writting embedded software in C. God I love Rust.

How does a compiler know that signed overflow will happen in the above snippet? Couldn't all integer math result in overflow? If so, how does the compiler ever know what to do?

I do not think its plausible that the compiler is emitting code which checks for an overflow and then in a deterministic fashion skips all dependent logic there after. Im sorry, but I think this line of argument has been taken to such an extreme it has detached from reality.

The compiler will see that you calculated capacity * 2. Since overflow is UB, the compiler may under that any code path that does this calculation does not cause signed overflow and can deduce that capacity * 2 does not overflow. Therefore, for example, the condition that capacity * 2 > capacity is equivalent to capacity > 0, and there goes your assertion. A modern compiler will likely also deduce that capacity > 0 and remove the assertion entirely. You can read up on "value range propagation" if you're interested.

But all this is beside the point. In C, you may not invoke undefined behavior and then expect anything at all about the result. With malicious input, UB can and will result in RCE.

I can see the point you are trying to make, but in this context, it doesn't work. Lets start at line 20:

  vec.capacity = 0;
Next, line 26:

  int new_capacity = vec->capacity * 2;
So we get new_capacity = 0 * 2, so new_capacity is 0. The assert:

  assert(new_capacity > vec->capacity);
Which translates to 0 > 0, it triggers a fail and we are done. If the compiler were to 'optimize' that out, then the compiler just introduced a very serious and fatal bug in our program, and we didn't even overflow or UB.

So that assert works as intended, it forces new_capacity to grow until it hits INT_MAX or UB, at which point it will assert.

Oh. In case you wanted to assert the capacity = 0 isn't happening, it's OK. I was talking about overflow protection. I'm sorry if I've missed something you said that was implying that.

If you wanted to check overflow condition, this assert doesn't help. If the capacity was almost MAX_INT, and you multiply it be 2, then compiler has freedom to do whatever, and not catch it because it must never be happening in the first place.

> I can see the point you are trying to make, but in this context, it doesn't work.

No, you don't see the point. Writing code in C that invokes undefined behavior is wrong, full stop. You seem to think that it's only wrong if you, the programmer, can imagine how it would go wrong.

> If the compiler were to 'optimize' that out, then the compiler just introduced a very serious and fatal bug in our program, and we didn't even overflow or UB.

Sure, in this case, the compiler can't optimize the assertion out. But it certainly can change the condition in the assertion so that it won't fire on overflow.

> But it certainly can change the condition in the assertion so that it won't fire on overflow.

Once again, why would a compiler do that? To satisfy the Rust community's desire to spread FUD throughout the C community? (sorry to be so blunt, but im really scratching my head here)

A more logical thought process here is that signed operations aren't uniform across architectures and are sometimes emulated, leading to UB for certain edge cases, like overflow. Its more practical to believe that a future C standard will fix this loophole rather than think compilers will exploit this and punish their own community.

Very not impressed with this thread.

why would a compiler do that?

The theory is that constraint propagation enables optimizations that aren't possible otherwise. In particular, it lets the compiler avoid emitting code for conditions it judges are impossible to occur. This usually doesn't help much with C, but can be useful with C++, and many of the optimization decisions are driven by C++ performance.

To satisfy the Rust community's desire to spread FUD throughout the C community?

I'm a big fan of C, and consider it my primary language. I have barely used Rust and am not part of the Rust community.

A more logical thought process here is that signed operations aren't uniform across architectures and are sometimes emulated, leading to UB for certain edge cases, like overflow.

Personally, I would love if this was the vision for C that compiler writers held. But it is not, and has not been for at least a decade.

Its more practical to believe that a future C standard will fix this loophole rather than think compilers will exploit this and punish their own community.

Unfortunately there is no chance that this will happen. That ship has sailed. Compilers are taking ever greater advantage of undefined behavior for purposes of optimization. Overflowing a signed integer is undefined behavior, therefor real-world compilers can and do strip out checks that they determine would require undefined behavior to be occur.

I realize this is hard to believe, but try this article as an intro: https://blogs.msdn.microsoft.com/oldnewthing/20140627-00/?p=...

Or even better, just look at the assembly that is generated here:

  #include <assert.h>
  int add_one(int num) {
    int increased = num + 1;
    assert(increased > num);
    return increased;
Both GCC and Clang completely remove the assert from the generated code: https://godbolt.org/g/5k43pV. To confirm, try adding "unsigned" to the "int" declarations and notice that it then keeps the check.

I don't think either removes the check yet in the exact multiply by two case that you offer, but it is not safe to depend on this behavior.

Very not impressed with this thread.

I can see why you feel that way, but I think if you research this further you'll come to appreciate the stern warnings that people are giving you in this thread. C does not work the you want it to. Your choices are to adapt and avoid undefined behavior, or switch to a better language. I'm sticking with C because I think it's the best choice for what I do, but you need to be aware of the dangers if you are going to use it.

> Your choices are to adapt and avoid undefined behavior, or switch to a better language.

Sadly, a lot of people choose the third choice: write code in what they imagine C to be, fiddle with it until it compiles without warnings and, if they're trying extra hard, passes limited testing with Valgrind. Then they ship it.

The C standard explicitly lists signed integer overflow as an undefined behavior. Any undefined behavior in your program essentially makes it semantically meaningless. There is no guarantee that the compiler will even emit the assembly instructions that would lead to signed overflow behavior. It could just as well emit a program that deletes your files, and still be standards-compliant.

asserts are for diagnostic-enabled code, don't use them for security checks.

This is one area where Rust also differs from C; assert!s are left on in release mode; you use debug_assert! if you want something only in development.

I agree, it's a hella questionable language choice, and not limited to C.

I found out about it through https://github.com/rohe/pysaml2/issues/451

I would argue the opposite, assert() statements are the best way to write defensive and secure C. There might have been a time when people commonly compiled out assert() statements from binaries, but that is only OK if the software was designed for that. Otherwise, that would be like me saying I am going to compile out all strlen() statements from a given binary and then expect it to behave the same.

Secure code should be correct and robust. To assure correctness assert()'s should be used, to assure robustness you should check return values and buffer sizes.

This is more like "a comparison of terrible C to middling Rust." Rust is generally a superior language, but this sort of comparing the worst kind of C to middling (or, worse, well-written) Rust doesn't seem useful to me.

I suppose it could be worse: these sorts of examples could always use obfuscated C as their source for comparison.

Yeah, sure. Personally, as much as I dislike the way the Rust community evangelizes the way Rust works (like the Haskell community writing another monad tutorial), there are some pretty common categories of errors demonstrated in the C code:

- Return pointer to stack object (more common than you might think!)

- Incorrect size passed to malloc()

- Pointer to array outliving invalidation of array

Some of the remaining C code appears to be garbage intended to distract you from the errors people actually make, but there is some good stuff in there and I pull out similar examples when I'm trying to convince people not to learn C or C++ just to improve the quality of their greenfield projects.

> - Pointer to array outliving invalidation of array

Another problem that can be largely avoided with a clear architecture.

1. The vast majority of buffers should be allocated at program startup, stored in a global variable, and freed at the end.

2. In almost all other cases, pointer ownership should just not be moved across functions (mostly "initializer functions" which return allocated memory). I think the idea that a function takes a data pointer without any idea how it was created, and then stores the pointer somewhere in its own structures, comes from garbage-collected languages and OOP culture which has an (I think) irrational aversion to globally architected dataflow. The messy object graphs that are created in this culture are just not manageable without a GC or another mechanically enforced discipline.

I'm not a fan of neither C++, nor Rust, nor garbage collected languages, for complex tasks. One can get away with less expertly architected programs in these languages. But, IMHO, they don't support, or do even impede, clear architecture, which ultimately leads to complexity. And this does mean hard to maintain and buggy software - it's just more memory-safe.

Now the only problem is in communicating and enforcing that architecture, hiring people who understand how important this architecture is or training them, adding these architectural checks to your code review process, having discussions about which cases are exceptions, and figuring out what to do with legacy code.

Legacy code is reduced to a fraction with good architecture, though. Well architected programs are a lot shorter and a lot more straightforward. And I don't really think you can make clear rules or exceptions - so no general discussions needed. Just don't admit data flow designs that aren't completely thought through (and really simple). Yes, it takes experienced developers.

You can point out that the ancient Egyptians built the pyramids just fine, but I'm still going to argue in favor of mechanization.

In my experience, good architecture does not materialize often enough to make this a viable strategy except for certain specialized teams. Consider writing code in C when you don't completely understand the problem and haven't devised a solution yet, or when you know it's going to get assigned to a junior developer and you don't have spare time for extra oversight. Or consider that you might ship poorly architected code because it's functional and shipping it will earn you money, or you might inherit large tracts of legacy code that are too expensive to refactor.

We all love ourselves some good architecture in our programs, but I'd rather have the penalty for bad architecture be "fails to compile" or "assertion error at runtime" rather than "our product has a security vulnerability". Just because the problem can be solved by throwing better, more experienced developers at it doesn't mean that we should. Let's throw the experienced developers at the more exciting problems and let our tools figure out memory safety for us... at least, most of the time.

> Legacy code is reduced to a fraction with good architecture, though.

I honestly don't understand this claim at all. I've only worked at one company where I didn't have to work with legacy code. The amount of legacy code I've been personally responsible for has varied from ~20 kloc (fairly manageable) to ~1 Mloc. I don't know what kind of good architecture could reduce that 1 Mloc. I ended up just throwing the address sanitizer at it, working on the test suite, and prioritizing refactors based on expected cost and payoff.

That was meant as "the amount of legacy produced is much smaller".

Are you seriously calling global mutable state "expert architecture"?

The thousands of C-based security vulnerabilities indicate that many experienced developers haven't figured out a "clear architecture" that prevents such issues.

> Are you seriously calling global mutable state "expert architecture"?

Yes. Putting global state in a class and instantiating it once is pretty pointless. It's a lot of boilerplate and in the end just hides the intention. To the point that programmers confuse themselves about their own assumptions. Leads only to more bugs.

Furthermore, global state (as well as global constant data) enjoys the best support for memory management in existence: the linker / process loader. (well-tested, no maintenance overhead)

> Return pointer to stack object (more common than you might think!)

Maybe when you're still learning how to program.

A combination of valgrind/sanitizers will catch all of these mistakes. The same class of mistakes can also be made in "memory-safe" languages, just replace pointer with index and memory with array.

> Maybe when you're still learning how to program.

I used to think so, too. From John Carmack (https://twitter.com/ID_AA_Carmack/status/587077680652230656)

> Found two pointer-to-out-of-scope-stack bugs today. I like tight native code, but C/C++ still makes me worry a lot.

And then there's this dubious claim,

> A combination of valgrind/sanitizers will catch all of these mistakes.

Nope! 1) You have to execute the right code paths before Valgrind or any of the sanitizers will catch your use of a pointer to stack. In relatively simple cases, you might not catch the error even if you have 100% code coverage. 2) Not all platforms have Valgrind or sanitizers working on them, in fact, most don't.

> The same class of mistakes can also be made in "memory-safe" languages, just replace pointer with index and memory with array.

Sure, you could write an x86 interpreter in Java, and I'm sure that somebody has already done this. But these errors have much more severe consequences in C.

Pointers to stack objects escaping the function (not necessarily returning) is actually surprisingly easy to do. Sure, the example "Foo f; return &f;" is the sort of situation that isn't going to happen if you're at all experienced, but it's not hard to build cases. I recently had to debug a crash which turned out to be a stack object unexpectedly escaping. Effectively, the flow is this:

    void foo(Foo *v) {
      Operand blah;
      /* Turns out that there's a use-list on some operands, and this adds &blah to that list in that case. */
      copy_operand(&blah, &v->operands[1]);
> A combination of valgrind/sanitizers will catch all of these mistakes.

They will catch only those mistakes that occur when you run them in tests. Plenty of memory safety CVEs still show up in programs that do aggressive fuzzing under valgrind/sanitizer testing. Or, as one aphorism has it, "testing cannot prove the absence of bugs, only their presence."

Valgrind will only catch such indexing errors when you read/write out of bounds when running Valgrind.

There is a huge difference between out-of-bounds indexing leading to undefined behavior or an exception/panic.

Valgrind won't catch certain indexing errors even if you exercise them. Valgrind only checks that the address is valid, not that the address was derived from a pointer to the object you are accessing.

    int main() {
        int x[16];
        int y[16];
        int z[16];
        x[0] = 0;
        y[0] = 0;
        z[0] = 0;
        y[18] = 3; // Valgrind thinks this is OK.
        return 0;

I think you may have missed the part of the conversation where we were talking about Valgrind, specifically, and not talking about the address sanitizer.

But clang complains, it's not about just one tool.

test.c:8:9: warning: array index 18 is past the end of the array (which contains 16 elements) [-Warray-bounds] y[18] = 3; // Valgrind thinks this is OK. ^ ~~

The snippet is a demonstration of the limitations of Valgrind, specifically. It would be trivial to change the code so it has the same behavior but doesn't trigger the Clang warning.

Don't use string_view then. Seriously, if people actually programmed C and C++ the way people claim they do, it wouldn't be any more efficient than Java most of the time.

That C code is bad and doesn't compile on macOS using clang without adding #include <assert.h> . Just looking at it you can see the bugs. But the main problem I'm having with this article is comparing a contrived C program with no tooling to Rust. As the author notes returning a stack address is caught by clang. Also simply compiling with address sanitizer or running under Valgrind would have caught the rest of the bugs.

I'm not saying Rust isn't safer than C or not a sweet and useful language it is those things. What I am saying is comparing Rust to C without all the tooling that Modern C developers use is kind of disingenuous.

I don't even use any tooling except occasionally a debugger (gdb on Linux / Visual studio on Windows). I have spent more than 500 hours of the last 8 months in C (no complicated data structures), and I'm pretty sure I've spent less than 2 hours total tracing memory bugs. (not saying there aren't more hidden).

Use as many tools as your platform supports. I've had "stable" C code run just fine for two years but come crashing to a halt immediately under address sanitizer. So if code is working just fine now it could still be buggy and not work on other platforms or when built by other compilers.

Yep - might do. Unfortunately, to be honest, the OpenGL implementation I use printed quite a bit of a mess when run under valgrind, so I haven't tried again. (Could improve the architecture to bench separately). Update: tried and valgrind doesn't report problems in my own code under normal operation, except for me being sloppy at shutdown :-)

Check out the docs for suppressing errors in valgrind[1]. Also try the various compiler sanitizers, as well as the clang static analyzer and if your project is open source, coverity.

[1]: http://valgrind.org/docs/manual/manual-core.html#manual-core...

Having learned Rust and used it for nearly 2 years, I am now happy using C. I think most issues in C are easily catchable. Here me out:

1. If using Int for indexing or any sort of len or count, make sure it's positive when needed, and within bounds of what's allocated. As in if you plan on allocating huge data, plan it out and use the right data type.

2. If you alloc, then free when done. If you free, set to Null; and before you free, check for Null.

3. If you realloc, in particular, check that it actually worked and prepare for basic error handling.

Rust requires all of these steps by default.

Finally, just test some of your code. Rust makes this easy, and encourages it.

I still really like C.

Although I completely disagree with you, I really appreciate that you posted.

> before you free, check for Null.

You can free(NULL), no problem. It will not do anything.

It's usually a problem with pointers to structs that have pointer members. In a typical destruction sequence, you usually free the members before the struct itself.

Would anyone realistically write this C code though? The second point (the initial total amount being 0) is something that anyone paying even a cursory glance at the source code could pick up on, as were many of the others (who the hell gets the address of stack memory and returns it? This is C 101).

If this was found-in-the-wild C then I wouldn't be bothered with it, but this is completely contrived.

No experienced C programmer will write this code, but years of buffer overflow/double free CVEs show that even experienced programmers make such errors occasionally. And one error can be enough for a system compromise.

Of course, this is not only an argument in favor of in Rust, but any memory-safe language. Rust just happens to address some of the same problem domains that C and C++ have traditionally been dominant in.

Why is double free so commonly an issue? Isn't setting to Null after free and checking for Null before hand a common practice?

It's usually not calling free() on the same variable within a single function, although that can sometimes happen. The most common case, I expect, is when two separate data structures both have pointers to the same object and believe they own it, then later end up calling free on the same address at different times.

It isn't that common. On HN, however, you'd think there isn't a piece of software written in C that isn't responsible for mayhem and death. The sort of FUD around the language, driven helpfully by the Rust Evangelism Strikeforce, isn't helpful.

> who the hell gets the address of stack memory and returns it? This is C 101

John Carmack, apparently: https://twitter.com/ID_AA_Carmack/status/587077680652230656

That's... unsettling.

> who the hell gets the address of stack memory and returns it?

Most of beginners out there, who then cannot figure out the reason for their crashes, especially when it's not explicitly returned as the value of the function ;)

Neither of these two languages are exceptionally friendly towards beginners. Python, Ruby, Java/C#, Go are IMO much better for them.

But eventually if someone turns to C no amount of Python experience will prepare them for this category of memory problems, so they still qualify as a “beginner” for this discussion.

> no amount of Python experience will prepare them for this category of memory problems

General programming experience helps a lot. Someone coming to C from a higher level language likely knows to read compiler output and warnings, use debugger, read diagnostic output. With modern C toolset, that experience will help solve 99% of such errors.

OTOH, someone coming to Rust encounters set of problems no amount of programming experience help to solve. IMO for beginners, Rust’s learning curve is dangerously close to that of pure functional languages.

There's an extra bug in both the C and the rust code. The code assumes that, if length == capacity and capacity >= 1, then length < capacity *2. This is false for finite-precision integers. In C, this will manifest as an out of bound write when the array gets too big. In Rust, it'll panic when array bounds checking or integer overflow checking catches it.

A related issue is that overflow on signed integers in C is undefined. Vec.length, Vec.capacity, and new_capacity should all be changed to size_t to avoid a compiler optimizing out an overflow check.

Edited to be explicit about programming language.

(Note that they are both well-defined in Rust)

Another problem with the C code, which is perhaps even more subtle, is the usage of `int` for the length, capacity, and loop index. On 64-bit platforms these variables will overflow for very large arrays, which will likely break the entire implementation. The call to `malloc()` should give a signed-to-unsigned conversion warning which will hint about this, but unfortunately many people ignore integer warnings. Incidentally, this could also be caught by compiling with `-ftrapv` or `-fsanitize=undefined`, but this problem is only triggered in a rather extreme corner case that is unlikely to be exposed during testing. The correct integer type to use is `size_t`, which is guaranteed to be large enough to hold the size of any object in memory.

Personally I'm going the other direction. I use int for anything unless there's a serious need to make an exception. The amount of complexity introduced by using a zoo of integer sizes and signedness is unmanageable to me.

There is no point in measuring array sizes in size_t. I don't make bigger allocations >2G. (At some point this assumption will probably break and I will have to re-think my approach. Shouldn't we all move to 64-bit integers by default already?)

> There is no point to measure array sizes in size_t. I don't make bigger allocations >2G.

I think almost every Rust program I've ever published regularly encounters files greater than 2GB (even 4GB), and if I had used C and its `int` type everywhere, I'd be in for a very very bad time.

This of course doesn't mean I am allocating >2G on the heap, but I might memory map the file. Or there might be some other counter (line counter? byte counter?) where using `int` would just fail.

There are even some alternatives to my tools written in C, and either they or their dependencies use `int` instead of `size_t`, and that leads to actual bugs their end users hit in exactly the cases where files are greater than 2GB.

Getting integer sizes right is important, and it's not just in cases where you're putting >2G on the heap.

> Getting integer sizes right is important, and it's not just in cases where you're putting >2G on the heap.

I prioritize on getting their values right :-). So far I have not encountered bugs due to my pretty uniform usage of int. But if they had to deal with allocations >2G, my programs would just die.

Yup, I make some exceptions as well, for example to measure time in microseconds, or to measure the size of very large streams. And of course I try to assert that all downcasts to int are valid, and that my integer operations don't overflow, etc. (why do CPUs still not support overflow exceptions?)

I'm pretty sure we could get rid of quite some historical baggage in terms of integer types. For example, I'm currently working on a network module, and there is a type socklen_t which is to indicate the size of a socket structure. I might be missing something, but to me there is no good reason not to use simply int.

Lots of comments seem to say “these aren’t mistakes any experienced C programmer would make”.

Does that really make the protection against those mistakes less important?

> Does that really make the protection against those mistakes less important?

No it doesn't. But from a C programmers point of view it's kinda like can you show examples of bugs in C that wouldn't immediately be found in code review or by tools like ASAN, Valgrind, Coverity or even just the C compiler, that Rust can solve. Those are the examples that would interest the C community.

Bugs that sneak through code review are, almost by definition, the hardest to demonstrate. If a qualified person familiar with the code couldn't see it, it will take a lot to explain it to an average reader. Hidden bugs are generally in large and complex codebases, but such codebases are also the worst for example code in a blog post. So the next best thing is to show it in an obvious case and leave it to you to extrapolate to your complex cases.

Also, the more complex the problem, the more likely it is incomparable between C and Rust. The prime example is the controversy over whether Heartbleed would have happened in Rust. On one hand, a direct C-to-Rust translation of the whole system with a custom memory-reusing allocator wouldn't have prevented the bug. On the other hand, such an approach is so contrived and unusable from a Rust perspective that one can argue nobody would have structured the code like that in the first place.

So rust isn't memory safe? You can still access invalid pointers?

I'm just trying to clarify...it's lifetime safe, but not access safe, or leak safe?

You have two dialects: Safe Rust and Unsafe Rust. Safe Rust is regular Rust code, the kind of code that you should write 99.9% of the time. Unsafe Rust is a superset of Safe Rust; all of the Safe Rust checks still apply. Within an unsafe block, you gain access to a few extra tools that are not available in Safe Rust: you can dereference raw pointers, you can modify global variables, you can invoke unsafe functions, and a few more.

Unsafe code allows programmers to write code the compiler can’t check. Unsafe code is expected to uphold the safety invariants, but humans can make mistakes.

Most people mean “safe Rust” when they say “Rust.”

Leaks are memory safe though. They’re hard, but not impossible, to get.

> Most people mean “safe Rust” when they say “Rust.”

I disagree. I think they mean Rust code they see in use that has no explicit "unsafe" blocks. For example, most people are going to say "let mut v = Vec::<i32>::new(); v.push(0)" is "safe Rust." It is, in the sense that all the rules of Rust safety apply to the specific code quoted as-is. It also isn't, because 'Vec' is littered with unsafe.

If you're wearing x-ray glasses to look at the implementation details, no language is safe (e.g. CPython's vectors/lists are written in C). It isn't a useful contention.

That’s what I mean by “safe Rust”, so we agree.

We don't, though. The quoted code isn't safe. It has implicit unsafe blocks in it. You can't claim there are two dialects of Rust if you include the "unsafe" dialect in your definition of "safe" whenever there isn't an explicit "unsafe" block present. It's misleading.

That's not a useful definition, though, as that means that all code everywhere is unsafe, since the hardware is.

Although you can access invalid pointers, there are plenty of ergonomic ways not to.

If Rust never let you write code that could "go wrong", then it would not be useful for its target domain. Sometimes you need a "trust me" block where you call some internal allocator or trusted library function. It's more about the balance of affordances: is it easy to do the right thing in Rust, whereas in C it is easy to do whatever.

The 382nd episode in which a rustacean blames C while having only practised C++.

1. The `(int *)` cast of malloc() gives you away immediately. No cast in C.

2. > missing free on resize. When the resize occurs, we reassign vec->data without freeing the old data pointer, resulting in a memory leak.

Er... C has realloc() for that. Once again, the delete + new dance is something you do only in C++, not in C.

BTW, you forgot other errors in your "C" program.

Please don't post snarky dismissals here. Not cool, and you can make your substantive points without casting them that way.

Please don't be so arrogant. Post your github profile, and then we can see if you write code without bugs.
