Segfaults are the ideal case for memory errors: they're the most easily caught and fixed, so you're the least likely to see them. Often, though, memory errors result in silent corruption that can be exploited, and that's much harder to detect, especially if it relies on very specific corner cases. `curl` has had a number of these vulnerabilities over the last several years, for example.
Something as simple as `ls` is probably so small and battle-tested that it's not an issue, but if you're writing an all-new, not-battle-tested tool, why wouldn't you want stronger guarantees? The new tool is being written for the features, but it's not fun to write vulnerabilities into something that should be simple and "just work."
Languages like Go and Ruby are also mostly memory safe, so those are generally fine picks here too, but every language has trade-offs. In this case, the author clearly cares about performance, which is not a priority for Ruby.
Rust also has a built-in testing harness, which is a lot more convenient IMHO than using whatever C testing framework you might have a predisposition towards.
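As a rough illustration of that built-in harness: unit tests live next to the code in a `#[cfg(test)]` module and run with `cargo test`, with no extra framework to set up. The `word_count` function here is a made-up example, not taken from any of the tools under discussion.

```rust
/// Hypothetical helper: counts whitespace-separated words in a line.
pub fn word_count(line: &str) -> usize {
    line.split_whitespace().count()
}

fn main() {
    println!("{}", word_count("battle tested tools"));
}

// This module is only compiled when running `cargo test`.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn counts_words() {
        assert_eq!(word_count("a b  c"), 3);
    }

    #[test]
    fn blank_line_has_no_words() {
        assert_eq!(word_count("   "), 0);
    }
}
```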
Regarding segfaults, when I was interviewing devs for C++ roles, I'd ask questions about a simple function like this:
    std::string foo(bool flag)
    {
        if (flag)
            return "true";
    }
Questions I'd ask:
* Is the function well formed? (Yes. Functions need not return a value on all paths due to C ancestry, even if the return type has a non-trivial ctor. I'm not sure if this is still considered well formed, but I think it was at least up to C++11.)
* What happens if 'foo' is called with true? (Returns "true" as one would expect.)
* What happens if 'foo' is called with false? (Undefined behavior, so generally nothing nice: segfault, access violation, etc.)
* If it crashes, where, when and why does it crash? (Technically, since it's undefined, nearly any answer here suffices if it can be backed up. In practice, most optimizing compilers assume UB can never happen, so when you return nothing from a non-void function, the generated code will attempt to invoke the destructor of a nonexistent object instance (assuming non-POD) and boom.)
I asked this because it was a distilled example of a real-world rare crash that was an extremely difficult bug to track down, because the crash location is often nowhere near the offending function.
I remember getting into a heated argument with a coworker when I claimed it should have been a compilation error. IIRC, he claimed it to be a halting problem and that the compiler couldn't determine that all paths didn't return a value. I called BS, citing at the time (circa 2004) that the new compiler on the block for C# could reliably emit errors when not all return paths returned a value.
I also like that, in C++, it's a rare example that is very terse yet touches on a number of topics, such as undefined/implementation-defined behavior, debugging, and compilation settings (warning levels, etc.), all in a mere 4 lines of code. With 4 LOC that are straightforward for the candidate to mentally parse, I can glean a lot about their understanding of the language (and its potential pitfalls).
Sorry if this got a little long-winded and ranting.
> he claimed it to be a halting problem and that the compiler couldn't determine that all paths didn't return a value.
Theoretically we can't determine whether a function will return a value or not. In practice, heuristics get 99% of the way and the last 1% you can make the programmer put in a possibly redundant return statement.
There are two different questions: "do all paths lead to returning a value?" and "will the function return a value?"
Answering the second in the general case is equivalent to solving the halting problem. But answering the first question is much simpler. Static analyzers aren't using a heuristic approach to the second question; they're solving a completely different problem.
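Rust takes the "do all paths lead to returning a value?" route, which makes for a quick illustration. This is only a sketch with made-up function names: the compiler checks the shape of the control flow, not the values flowing through it, so `foo` without its final expression is rejected at compile time, and `sign` needs an explicit `unreachable!()` even though a human can see that the last line never runs.

```rust
// Rust analogue of the interview function. Removing the final
// "false" expression makes this fail to compile, because one path
// would reach the end of the body without producing a String.
fn foo(flag: bool) -> String {
    if flag {
        return "true".to_string();
    }
    "false".to_string()
}

// The check is conservative: it does not reason about values, so it
// cannot tell that the two conditions below cover every i32.
fn sign(n: i32) -> &'static str {
    if n >= 0 {
        return "non-negative";
    }
    if n < 0 {
        return "negative";
    }
    // The "possibly redundant" statement the compiler insists on.
    unreachable!()
}

fn main() {
    println!("{} {} {}", foo(true), foo(false), sign(-3));
}
```

The cost of the simpler question is the occasional redundant line like the one in `sign`; the benefit is that the check always terminates and never misses a path.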
Of course you can decide the type a function returns. To overkill this problem, just run any type inference algorithm powered by unification (e.g. Hindley-Milner). Give undeclared variables the type NULL and you're all set, since you can then decide whether the function returns type NULL.
Your friend was confused because you cannot decide whether a certain line in your program will be executed, since this is equivalent to the halting problem. Let's prove this! Assume we have a blackbox B(P,x,N) that decides whether line N of program P will be executed given input x. Then we can solve the halting problem:
```pseudocode
def Helper(sourcecode P, input x):
    [0] P(x)    # only returns if P halts on x
    [1] print "What Do We Say to the God of Death?"

def Halts(sourcecode P, input x):
    # Line 1 of Helper is reached exactly when P(x) halts.
    return B(Helper, (P, x), 1)
```
Since the halting problem is undecidable, no such B can exist. This means that, given programs like your interview question, the compiler cannot decide, in general, whether the program will crash or work. Of course, it's not too hard to find "good enough" heuristics that'll catch some cases, and/or to restrict your language to make it "harder" to encode such programs.
> Is the function well formed? (Yes. Functions need not return a value on all paths due to C ancestry, even if the return type has a non-trivial ctor. I'm not sure if this is still considered well formed, but I think it was at least up to C++11.)
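In C, falling off the end of a function without returning is legal, as long as the caller doesn't use the return value. In C++, flowing off the end of a non-void function is undefined behavior whether or not the value is used, although the program is still well formed, so compilers at most warn about it.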
That's due to no shortage of effort by the maintainers, however: one small bug can easily lead to memory safety problems.
Rust is great in this sense because you don't have to think about those issues.
That said, I'd pick Rust over C for most projects simply because it's a nicer language to write IMHO. It feels like a modern high-level language with things like iterators, traits, etc., and it has a great package manager too. In comparison, I find writing C to be a huge chore, especially if you're doing much string processing.
Very importantly, Rust's safety guarantees help with "fearless concurrency", and better multithreading accounts for most of the performance gain we see in these new tools.
You can have fearless concurrency in C too if you use async queues and hand over data completely, without sharing references to data between threads (other than via the queues), including using thread-safe functions.
Rust will not help you with other multithreading issues that arise from how the OS implements processes/threads/signals/syscalls etc.
So I wouldn't call it fearless. Concurrent data access is just one issue you have to take care of.
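For comparison, here is a minimal Rust sketch of the same queue-and-handover pattern, using only the standard library's `std::sync::mpsc` channel. The function name and workload are invented for illustration. The difference from C is that the handover is enforced rather than a convention: once a buffer is sent, the sending thread can no longer touch it, and the compiler rejects code that tries.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical workload: a worker thread sums each buffer it receives.
fn sum_on_worker(buffers: Vec<Vec<u64>>) -> u64 {
    let (tx, rx) = mpsc::channel::<Vec<u64>>();

    let worker = thread::spawn(move || {
        let mut total = 0;
        // The loop ends once every sender has been dropped.
        for buf in rx {
            total += buf.iter().sum::<u64>();
        }
        total
    });

    for buf in buffers {
        // `send` moves `buf` into the channel; using `buf` after this
        // line would be a compile error, not a data race.
        tx.send(buf).expect("worker hung up");
    }
    drop(tx); // close the channel so the worker can finish

    worker.join().expect("worker panicked")
}

fn main() {
    let total = sum_on_worker(vec![vec![1, 2, 3], vec![10, 20]]);
    println!("{}", total);
}
```

None of this addresses the signal, process, or syscall issues mentioned above; it only covers the shared-data part.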
Well, they certainly have for me. As an example, GNU awk has some rather easy-to-trigger segfaults which do not seem like they are going to be fixed soon.
I think it's worth mentioning that Rust has the gold standard argument parsing library now:
https://clap.rs
That library is used by most(?) Rust CLI tools, so they all have a similar feel. In addition, there is a library built on top of clap called structopt that makes bringing up a new CLI exceedingly easy:
https://docs.rs/structopt/0.2.15/structopt/
Seriously, check out structopt's page even if you don't know any Rust. I think the demonstration on its front page summarizes a lot of things I like about idiomatic Rust.
I haven't needed it in a long time, but I still think Trollop for Ruby is magical. It seems it's been renamed to Optimist now: http://manageiq.github.io/optimist/
I'd say Rust is a productive language for providing competitive CLIs.
- Compiled so easy to distribute compared to Python, JS, etc
- Cargo makes it easy to pull in libraries compared to C/C++
- Smart people made some great libraries that make it trivial to get performance and/or a nice UX, e.g. everything in ripgrep is available for reuse, down to how it walks the filesystem.
Because it's a low-level systems programming language with Haskell influences (which makes it good for modelling a well-defined problem domain) and with the "rewrite it in Rust" meme behind it. It's meant to be a replacement for C, so it makes sense that rustaceans would want to write replacements for programs written in C.