I've noticed a lot of command line utilities are being rewritten in Rust. Is the...

gnulinux · on May 21, 2019

Possibly because

1. People predict Rust will be more popular in the future, so they're trying to learn/master the language by practicing it.

2. People think speed and memory safety are important guarantees for unix programs.

3. People are testing whether Rust deserves the fame it has

megous · on May 21, 2019

> 2. People think speed and memory safety are important guarantees for unix programs.

I don't think any of the typical unix tools has ever segfaulted on me in the 15 years I've been using Linux.

coder543 · on May 21, 2019

Segfaults are the ideal case for memory errors, and those are the most easily caught and fixed, so you're least likely to see them. But, often those memory errors result in silent corruption which can be exploited, and that's harder to detect, especially if it relies on very specific corner cases. `curl` has had a number of these vulnerabilities over the last several years, for example.

Something as simple as `ls` is probably so small and battle tested that it's not an issue, but if you're writing an all-new, not-battle-tested tool, why wouldn't you want stronger guarantees? The new tool is being written for the features, but it's not fun to write vulnerabilities into something that should be simple and "just work."

Languages like Go and Ruby are also mostly memory safe, so those are generally fine picks here too, but every language has trade-offs. In this case, the author clearly cares about performance, which Ruby does not prioritize.

Rust also has a built-in testing harness, which is a lot more convenient IMHO than using whatever C testing framework you might have a predisposition towards.
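For readers who haven't seen it, here is a minimal sketch of what the built-in harness looks like. The function `count_nonempty_lines` is a made-up example, not taken from any tool discussed here; the `#[test]` attribute and `cargo test` are the standard mechanism.

```rust
/// Hypothetical helper: counts the lines that contain something other
/// than whitespace.
pub fn count_nonempty_lines(text: &str) -> usize {
    text.lines().filter(|line| !line.trim().is_empty()).count()
}

#[cfg(test)]
mod tests {
    use super::*;

    // `cargo test` discovers and runs every function marked #[test].
    #[test]
    fn counts_only_nonempty_lines() {
        assert_eq!(count_nonempty_lines("a\n\n  \nb\n"), 2);
    }

    #[test]
    fn empty_input_has_no_lines() {
        assert_eq!(count_nonempty_lines(""), 0);
    }
}
```

The tests live next to the code they exercise and are compiled only for test builds, so no separate framework or build setup is needed.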

hermitdev · on May 21, 2019

Regarding segfaults, when I was interviewing devs for C++ roles, I'd ask questions about a simple function like this:

  std::string foo(bool flag)
  {
    if (flag)
      return "true";
  }

Questions I'd ask:

* Is the function well formed? (Yes - functions need not return a value on all paths due to C ancestry, even if the return type has a non-trivial ctor - not sure if this is still considered well formed, but I think it was at least up to C++11.)

* What happens if 'foo' is called with true? (Returns "true", as one would expect.)

* What happens if 'foo' is called with false? (Undefined or implementation-defined behavior, but generally nothing nice - segfault, access violation, etc.)

* If it crashes, where, when and why does it crash? (Technically, since it's undefined, nearly any answer here suffices if it can be backed up. In practice, since most optimizing compilers assume UB can never happen, when you return nothing from a non-void function the generated code will attempt to invoke the destructor of a nonexistent object instance (assuming non-POD), and boom.)

I asked this because it was a distilled example of a rare real-world crash that was an extremely difficult bug to track down, because the crash location is often nowhere near the offending function.

I remember getting into a heated argument with a coworker when I claimed it should have been a compilation error. IIRC, he claimed it to be a halting problem and that the compiler couldn't determine that all paths didn't return a value. I called BS, citing that at the time (circa 2004) the new compiler on the block, for C#, could reliably emit errors when not all return paths returned a value.
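For comparison with the language this thread is about, here is a sketch of a Rust counterpart to `foo`. Returning `"false"` on the other branch is an assumption made for illustration; the original function does not say what it should return there.

```rust
// Rust counterpart of the C++ `foo` above (a sketch, not the original code).
// If the `else` branch is removed, rustc rejects the function with a type
// error, because the `if` would then evaluate to `()` instead of a `String`.
fn foo(flag: bool) -> String {
    if flag {
        "true".to_string()
    } else {
        // Assumed value: the C++ version leaves this path without a return.
        "false".to_string()
    }
}
```

So the "missing return on some path" mistake is caught at compile time here, which is the behavior argued for above.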

I also like that, in C++, it's a rare case of a very terse example that touches on a number of topics, such as undefined/implementation-defined behavior, debugging, and compilation settings (warning levels, etc.), all in a mere 4 lines of code. With 4 LOC, which is straightforward and simple for the candidate to mentally parse, I can glean a lot about their understanding of the language (and its potential pitfalls).

Sorry if this got a little long winded and ranting.

im3w1l · on May 21, 2019

> he claimed it to be a halting problem and that the compiler couldn't determine that all paths didn't return a value.

Theoretically we can't determine whether a function will return a value or not. In practice, heuristics get 99% of the way and the last 1% you can make the programmer put in a possibly redundant return statement.

adrusi · on May 21, 2019

There are two different questions: "do all paths lead to returning a value?" and "will the function return a value?"

Answering the second in the general case is equivalent to solving the halting problem. But answering the first question is much simpler. Static analyzers aren't using a heuristic approach to the second question, they're solving a completely different problem.
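To make the first question concrete, here is a minimal sketch of a purely syntactic "do all paths return?" check over a toy statement tree. The `Stmt` type and `all_paths_return` are invented for this illustration and ignore loops, exceptions, and everything else a real compiler has to handle.

```rust
// Toy statement tree; the names are made up for this sketch.
enum Stmt {
    Return,
    Expr, // any statement that does not return
    If {
        then_branch: Vec<Stmt>,
        else_branch: Option<Vec<Stmt>>,
    },
}

/// True if every syntactic path through `body` ends in a `Return`.
/// Conditions are never evaluated, so the check always terminates.
fn all_paths_return(body: &[Stmt]) -> bool {
    // A sequence returns on all paths if any one statement in it does;
    // whatever follows that statement is unreachable.
    body.iter().any(|stmt| match stmt {
        Stmt::Return => true,
        Stmt::Expr => false,
        // An `if` guarantees a return only when both branches do, and a
        // missing `else` counts as a path that falls through.
        Stmt::If { then_branch, else_branch } => {
            all_paths_return(then_branch)
                && else_branch.as_deref().map_or(false, all_paths_return)
        }
    })
}
```

The check is conservative: it can reject a function whose fall-through path is unreachable at run time, which is where the "possibly redundant return statement" comes in.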

mortb · on May 21, 2019

Don't get me wrong, but if whole classes of errors can be avoided by not choosing C++, why should I choose to use it?

monsieurbanana · on May 21, 2019

Because you already know C++ and don't want to, or can't justify, learning another language. That's a very valid reason.

gnulinux · on May 21, 2019

Of course you can decide the type a function returns. To overkill this problem, just run any type inference algorithm powered by unification (e.g. Hindley-Milner). Give undeclared variables the type NULL and you're all set, since you can decide whether the function returns type NULL.

Your friend was confused because you cannot decide whether a certain line in your program will be executed, since this is equivalent to the halting problem. Let's prove this! Assume we have a black box B(P,x,N) that decides whether line N of program P will be executed given input x. Then we can solve the halting problem:

```pseudocode
def Helper(sourcecode P, input x):
    [0] P(x)    # runs forever if P does not halt on x
    [1] print "What Do We Say to the God of Death?"

def Halts(sourcecode P, input x):
    # Line 1 of Helper is reached exactly when P(x) halts.
    if B(Helper,(P,x),1):
        return true
    else:
        return false
```

This means that, given programs like your interview question, the compiler cannot decide -- in general -- whether the program will crash or work. Of course, it's not too hard to find "good enough" heuristics that'll catch some cases, and/or to restrict your language to make it "harder" to encode such programs.

saagarjha · on May 21, 2019

> Is the function well formed (Yes - functions need not return a value on all paths due to C ancestry, even if the return type has a non-trivial ctor - not sure if this is still considered well formed, but I think it was at least up to C++11)

In C, falling off the end of a function without returning is legal, as long as you don't use the return value. In C++, doing so in a function with a non-void return type is undefined behavior.

paulfurtado · on May 21, 2019

That's due to no shortage of effort by the maintainers, however: one small bug can easily lead to memory safety problems.

Rust is great in this sense because you don't have to think about those issues.

That said, I'd pick Rust over C for most projects simply because it's a nicer language to write IMHO. It feels like a modern high-level language with things like iterators, traits, etc., and it has a great package manager too. In comparison, I find writing C to be a huge chore, especially if you're doing much string processing.

ajdlinux · on May 21, 2019

Very importantly, Rust's safety guarantees help with "fearless concurrency", and better multithreading accounts for most of the performance gain we see in these new tools.

megous · on May 21, 2019

You can have fearless concurrency in C too if you use async queues and hand over data completely, without sharing references to data between threads (other than via the queues), and stick to thread-safe functions.

Rust will not help you with other multithreading issues that arise from how the OS implements processes/threads/signals/syscalls etc.

So I wouldn't call it fearless. Concurrent data access is just one of the issues you have to take care of.
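For reference, the queue-based handover described here looks roughly like this in Rust, using the standard library's `std::sync::mpsc` channel. The function `total_len_via_queue` is a made-up example; the difference from C is only that the compiler enforces the "don't touch it after handing it over" rule.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends each buffer to a worker thread over a channel and returns the
/// total number of bytes the worker saw.
fn total_len_via_queue(buffers: Vec<Vec<u8>>) -> usize {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();

    // The worker owns the receiving end and every buffer it receives.
    let worker = thread::spawn(move || rx.iter().map(|buf| buf.len()).sum::<usize>());

    for buf in buffers {
        tx.send(buf).expect("worker hung up unexpectedly");
        // `buf` was moved into the channel; using it here would not compile.
    }
    drop(tx); // closing the channel ends the worker's iteration

    worker.join().expect("worker panicked")
}
```

This covers only the data-sharing part of the problem; it says nothing about signals, syscalls, or the other OS-level issues mentioned above.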

feanaro · on May 21, 2019

Well, they certainly have for me. As an example, GNU awk has some rather easy-to-trigger segfaults which do not seem like they are going to be fixed soon.

mruts · on May 21, 2019

Have you tried piping /dev/random to them? I bet you could get some of coreutils and a lot more BSD tools to crash by doing that.

moomin · on May 21, 2019

The time window is very important here. csh, for instance, famously crashed all the time.

ThemalSpan · on May 21, 2019

I think it's worth mentioning that Rust has the gold standard argument parsing library now: https://clap.rs

That library is used by most(?) Rust CLI tools, so they all have a similar feel. In addition, there is a library built on top of clap called structopt that makes bringing up a new CLI exceedingly easy: https://docs.rs/structopt/0.2.15/structopt/

Seriously, check out structopt's page even if you don't know any Rust. I think the demonstration on its front page summarizes a lot of things I like about idiomatic Rust.

rakoo · on May 21, 2019

I haven't needed it in a long time, but I still think trollop for Ruby is magical. It seems it's been renamed to optimist now: http://manageiq.github.io/optimist/

epage · on May 21, 2019

I'd say Rust is a productive language for providing competitive CLIs.

- Compiled, so easy to distribute compared to Python, JS, etc.

- Cargo makes it easy to pull in libraries compared to C/C++.

- Smart people made some great libraries that make it trivial to get performance and/or a nice UX, e.g. everything in ripgrep is available for reuse, down to how it walks the filesystem.

empath75 · on May 21, 2019

I’d say it’s about 80% projects for people wanting to learn the language and 20% about safety, speed and concurrency.

c3534l · on May 21, 2019

Because it's a low-level systems programming language with Haskell influences (which makes it good for modelling a well-defined problem domain), with the "rewrite it in Rust" meme behind it. It's meant to be a replacement for C, so it makes sense that Rustaceans would want to write replacements for programs written in C.

addicted · on May 21, 2019

In this case it appears they are getting better performance than colorls, whose features the author presumably likes.

mhh__ · on May 21, 2019

1. Hype

2. Rust is, despite the reasons for which I don't use it, a good language which seems to achieve its goals

3. Why not