
Cross-platform Rust Rewrite of the GNU Coreutils - doppp
https://github.com/uutils/coreutils
======
pixelbeat
Note GNU coreutils has a good test suite which just calls out to the various
tools from shell and perl scripts. It should be easy enough to run this
implementation through it. More effort goes into the coreutils tests than the
code.

One of the hardest parts of coreutils is keeping it working everywhere,
including handling that buggy version of function X in libc Y on distro Z.
That's handled for GNU coreutils by gnulib, which currently has nearly 10K
files, and so is a significant project in itself.

Some stats:

coreutils files, lines, commits: 1072, 239474, 27924

gnulib files, lines, commits: 9274, 302513, 17476

~~~
3JPLW
Would using a GPL test suite with an MIT implementation make the whole GPL?
You're not "linking" to it, but it'd worry me somewhat.

~~~
jnbiche
>Would using a GPL test suite with an MIT implementation make the whole GPL?

Absolutely not. MIT is a FSF-approved GPL-compatible license and you're free
to use it with whatever GPL-licensed packages you wish[0].

However, if you _package_ and _distribute_ the Rust coreutils with the GPL
test suite, then the _users_ of the package are obliged to either use GPL,
MIT, or some other FSF-approved OS license.

But it's simple enough to package the Rust coreutils _without_ the test suite,
which puts the Rust coreutils users under no GPL obligations.

GPL2 is all about the _distributing_ of software, not how you use it.

I do agree that any use of a GPL package will make some folks using this
package in commercial software nervous, given the very few (none?) actual
court cases that have decided these issues.

0\. [https://www.gnu.org/licenses/](https://www.gnu.org/licenses/)

~~~
__david__
> However, if you package and distribute the Rust coreutils with the GPL test
> suite, then the users of the package are obliged to either use GPL, MIT, or
> some other FSF-approved OS license.

I don't think this is right. The programs in question are the test suite, and
one can certainly distribute complete programs licensed under the GPL
alongside programs under even non-free licenses (otherwise Mac OS X
couldn't ship).

Since the test suite doesn't link against the individual utilities, it won't
affect them, license-wise.

It is well understood that the GPL doesn't cross executable boundaries (again,
otherwise Mac OS X is in trouble).

~~~
jnbiche
You're absolutely right. I was thinking of AGPL, which does cross network
boundaries.

~~~
angersock
And that's why it's to be avoided like the plague. :(

~~~
steveklabnik
Or, that's why to use it instead of the GPL.

------
DCKing
I was already secretly imagining a future where Rust will completely replace C
as the low level language for everything. Even the Linux kernel rewritten in
it eventually. It'd be a much better world with no #ifdefs, strong type safety
and memory safety depending on compilers instead of fallible programmers.

But I was expecting it would take a while before people had the ambition to
start doing these things, even when it comes to the smaller ambition of
rewriting the GNU coreutils. Rust is a great language already, but it's not a
_stable_ language yet.

I wonder if my imagination will become reality eventually. There's really good
buzz around Rust now, and it's not even 'production ready'.

~~~
userbinator
I hope that never happens, because I think such a world would be extremely
dystopian. Strong safety guarantees sound very beneficial at first ("safe"
_anything_ in general tends to evoke that response), but as shocking as it may
seem, a lot of the freedom we have today is a result of using "unsafe"
languages like C: iOS jailbreaks, Android rooting, homebrew on game consoles,
and many other examples of freeing one's devices from restrictions imposed by
the original manufacturer and giving control back to the user depend on this.
These bugs and vulnerabilities, although they can be exploited maliciously,
are in some ways like an escape hatch from the tyranny of corporate control.
If strongly safe languages become the norm, all these would disappear and our
computing experience would become even more locked-down and restricted. The
thought of them being used for DRM is scary (I remember reading an academic
paper that suggested this as one of their applications, and that's one of the
things that provoked me into really starting to think about this.) Even open-source
software where safer languages appear most beneficial can be used against the
user.

Ultimately it's a security vs. freedom tradeoff, and I think we've sacrificed
too much of the latter already.

~~~
DCKing
I'm baffled by that opinion. The fact that memory-unsafe software keeps on
being produced in C/C++ is a _good_ thing? The capabilities of criminals to
run local (or remote!) root exploits on many devices is offset by the fact
that some hackers can tinker with them?

Freedom should be achieved in different ways. Please don't make freedom and
security contradictions.

~~~
weland
Practical experience shows that once you run into even routine systems
programming elements, like writing device drivers and whatnot, you eventually
have to concede some safety, or end up with an unprogrammable behemoth of such
baroque complexity that you likely have more security holes, just of some
other kind. There was a lot of research into this back in the 1990s; most
of the usable results produced security by architecture rather than tools.

~~~
pohl
It's almost as if what one would want is a language where you could shrink
unsafe operations to as small a footprint as possible by confining them within
a block with a special keyword, like this:

    
    
         unsafe { … }
    

Someone should invent such a thing.

 _Edit: …yeah, I was being tongue-in-cheek. This is exactly what Rust
provides._

[http://doc.rust-lang.org/rust.html#unsafety](http://doc.rust-lang.org/rust.html#unsafety)

 _"...When a programmer has sufficient conviction that a sequence of
potentially unsafe operations is actually safe, they can encapsulate that
sequence (taken as a whole) within an unsafe block. The compiler will consider
uses of such code safe, in the surrounding context…"_

~~~
Touche
If I call a function that is running unsafe { } inside of it, do I know?
Because I really want to know. And I want my function to be marked as unsafe
as well (because it is) as well as any function calling my function, etc.

~~~
kibwen
The way that you mark a function as unsafe is to stick a keyword in front of
it:

    
    
      unsafe fn foo() { ... }
    

Any function marked as such is then allowed to call other unsafe functions:

    
    
      unsafe fn bar() { foo(); }
    

But there is a way to break the chain, which is to use an `unsafe` block
without marking your function as unsafe:

    
    
      fn qux() {
          unsafe {
              bar();
          }
      }
    

That said, it's incorrect to think that Rust is any more unsafe than any other
language because of this; most languages simply defer this behavior to their
FFI. By pulling it into the language itself, Rust is actually _safer_ than
e.g. calling C from Python, because Rust can do the low-level fiddling while
still retaining at least some of the safety checks of normal Rust code. Even
unsafe Rust is safer than C.
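
To make the "break the chain" case concrete, here is a minimal sketch of a safe function wrapping an unsafe one. The names `read_raw` and `first_or_zero` are hypothetical, made up for illustration. The wrapper is only sound because it checks the precondition (the slice is non-empty) before entering the `unsafe` block:

```rust
/// Hypothetical unsafe function: the caller must guarantee that `ptr`
/// points to a valid, initialised `i32`.
unsafe fn read_raw(ptr: *const i32) -> i32 {
    // The dereference is the operation that needs `unsafe`.
    unsafe { *ptr }
}

/// Safe wrapper: callers need no `unsafe`, because the precondition of
/// `read_raw` is established here before the call.
fn first_or_zero(values: &[i32]) -> i32 {
    match values.first() {
        // `first` is a valid reference, so the pointer derived from it is valid.
        Some(first) => unsafe { read_raw(first) },
        None => 0,
    }
}
```

Calling `first_or_zero(&[4, 5])` needs no `unsafe` at the call site. The obligation to get the pointer right stays inside the wrapper.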

~~~
steveklabnik
> Even unsafe Rust is safer than C.

This is an important point. `unsafe` blocks only let you do a few extra
operations[1], not anything you want. A lot of safety checks still happen
inside of unsafe blocks.

1: [http://static.rust-lang.org/doc/master/rust.html#behavior-co...](http://static.rust-lang.org/doc/master/rust.html#behavior-considered-unsafe)
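
A small sketch of what "checks still happen" can look like in practice. The function name `checked_then_raw` is hypothetical. Inside the `unsafe` block, ordinary slice access is still bounds-checked and the type checker still applies; only the raw-pointer read is one of the extra operations:

```rust
/// Reads `values[index]` twice: once through the normal checked API and once
/// through a raw pointer. Returns `None` if `index` is out of range.
fn checked_then_raw(values: &[u8], index: usize) -> Option<(u8, u8)> {
    unsafe {
        // Still bounds-checked inside `unsafe`: `get` yields `None` when
        // `index` is out of range, and `?` returns early.
        let checked = *values.get(index)?;
        // The extra power: raw pointer arithmetic and dereference. This is
        // valid only because the check above already succeeded.
        let raw = *values.as_ptr().add(index);
        Some((checked, raw))
    }
}
```

With `&[7, 8, 9]`, index 1 is read both ways, while index 5 is rejected by the ordinary check before any raw pointer is touched.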

~~~
kibwen
Well, no, you can still theoretically do anything you want, you just need to
be very, very explicit about it. :)

~~~
dbaupp
Some things are undefined behaviour[1]... so you _really_ don't want to do
them (i.e. you _can_ do them inside `unsafe`, but the compiler
optimises/reasons assuming they never happen: if they occur at all, you could
have an arbitrarily broken program).

[1]: [http://doc.rust-lang.org/master/rust.html#behavior-considere...](http://doc.rust-lang.org/master/rust.html#behavior-considered-unsafe)

~~~
kibwen
The point that I'm trying to make here is that you cannot make any assumptions
about an unsafe block. Anything can happen, including really terrible
undefined behavior. But the fact that anything can happen is why Rust is as
powerful as C in this area.

~~~
steveklabnik
My point is that while anything _can_ happen, it's not like Rust just turns
off every single check. Yes, they can be gotten around, but it's not like the
type system suddenly goes away.

------
reirob
"uutils is licensed under the MIT License - see the LICENSE file for details"
\- funny to see GNU software reimplemented in another open source license.

~~~
reirob
I do not understand the down votes. Can somebody explain?

I did not mean to hurt anyone's feelings. I was just surprised to see the
license of software with "GNU" in the title. I naturally assumed it would be
under a GNU license. So it was a surprise to discover MIT.

Now I understand. I should have written "surprising" instead of "funny".
Sometimes writing about a subject gives you the answer.

Sorry for those that I hurt.

~~~
adestefan
Because there is a large anti-GPL contingent on HN.

~~~
icefox
Is it anti-gpl or anti-gpl3? It seems from a public relations perspective gpl3
really lost a lot of mindshare in the open source world, including me.

~~~
steveklabnik
It's anti-GPL. Remember, a large target audience of HN are entrepreneurs, who
want to use open source components in their closed-source applications that
they're selling. (A)GPL prevents them from doing this, (not selling, but
keeping it closed source, which many see as being important to selling) so
they get pissed.

~~~
stephenr
A large part of HN is developers who like to release good code and let anyone
use it as they see fit.

GPL has neither of those goals and actively works against the latter.

~~~
steveklabnik
> actively works against the latter.

For your definition of 'anyone.' BSD software often does not let the end user
use it as they see fit, as they no longer have access to the source code.

~~~
bronson
How does BSD remove access to source code?

Someone else may add some modifications to my code and I may not be able to
see those modifications, true. But nothing's been removed.

------
cageface
How stable and usable is Rust's ABI? Can I easily call Rust code from C or
other languages that can easily call C functions? I ask because I think this
is one area in which C++ has really failed. C++ is a bad language for building
frameworks and libraries for consumption from other languages because
interfacing with C++ code is such a nightmare. I'd hope any language that
aspires to displace C++ has a better story here.

~~~
dbaupp
Rust's internal ABI (i.e. used for a default `fn foo() { ... }`) is not at all
stable/usable/specified.

However, you can easily define a function with a C ABI (and a non-mangled
symbol name):

    
    
      #[no_mangle]
      pub extern "C" fn foo() { ... }
    

I'm not sure if you regard this as better than C++'s situation.
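
A slightly fuller sketch of the C-ABI route, under stated assumptions: `Point`, `point_sum`, `PointCallback` and `call_through_c_abi` are hypothetical names, not anything from uutils. It shows a `#[repr(C)]` struct passed by value to an `extern "C"` function, called through the kind of function pointer a C library would accept as a callback. The no-mangle attribute from the snippet above is left out so the example does not depend on how a particular toolchain spells it; add it when the symbol has to be visible to a C linker.

```rust
use std::os::raw::c_int;

/// Hypothetical struct. `#[repr(C)]` gives it the field layout a C compiler
/// would use for `struct { int x; int y; }`.
#[repr(C)]
pub struct Point {
    pub x: c_int,
    pub y: c_int,
}

/// Function using the C calling convention instead of Rust's unspecified one.
pub extern "C" fn point_sum(p: Point) -> c_int {
    // Wrapping add: a panic must not unwind across a C ABI boundary.
    p.x.wrapping_add(p.y)
}

/// The callback type as C-facing code would see it.
pub type PointCallback = extern "C" fn(Point) -> c_int;

/// Calls the callback through the C ABI, as foreign code would.
pub fn call_through_c_abi(callback: PointCallback, p: Point) -> c_int {
    callback(p)
}
```

On the C side the matching declaration would be roughly `int point_sum(struct Point p);`, assuming the symbol is exported unmangled.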

------
loudmax
While they're at it, I hope they write complete man pages rather than pointers
to info pages.

~~~
malandrew
While I know the basic differences between info and man pages, can you shed
some more light on your preference here for the less well informed?

~~~
asher
Man pages are more convenient for the typical unix hacker - command-line
oriented, vi-using.

They are rendered in less, which supports vi-like searching.

Info, on the other hand, is one of those odd pre-www hypertext systems.
Advanced for its time, it is now clumsy and nonstandard.

------
gkya
> However, [other ports of coreutils to Windows] are either old, abandoned,
> hosted on CVS, written in platform-specific C, etc.

I don't get how on earth using CVS for version control can be an
appropriate reason to consider a software project bad. Yes, CVS is old and
centralised, but is it that big of a deal that its usage by a project per se
marks the project as old and inactive?

~~~
thegeomaster
I know from my own experience—when you want to start a project that has been
done before you, it's easy to fall into the trap of faultily rationalizing why
none of the other attempts have worked and why you're going to do it better.

However, I don't think you should ever avoid writing something you'd enjoy for
the sole reason that there are prior attempts (successful or unsuccessful). If
you aren't motivated by money, you won't lose anything if your project doesn't
get a single user, but you will always gain enlightenment and great
experience.

~~~
gkya
I understand your comment and support it -- I have, many times, thought of
implementing a POSIX userland in Go, but the task has daunted me, so that has
not been realised till now. What bothered me was the fact that they show usage
of CVS as an alert for deprecation. I concur with your argument though: no one
can judge anyone else for what they choose to create in their own time. But CVS
is perfectly usable software (and I am telling this as someone who has spent a
month trying to CVS-checkout OpenBSD source tree, only because $CVS_RSH=rsh
rather than ssh).

~~~
__david__
> What bothered me was the fact that they show usage of CVS as an alert for
> deprecation.

CVS still works just as well as it ever did, but it _is_ super crusty at this
point. If an open source project hasn't transitioned off of it, that's a sign
to me that the maintainer either doesn't know any modern DVCSes, or doesn't
care about the project enough to transition to them.

It's not a guarantee that the project is dead or outdated, but it's a smell.

Of course, I consider still being hosted on Sourceforge to _also_ be a smell.

~~~
gkya
> CVS still works just as well as it ever did, but it is super crusty at this
> point. If an open source project hasn't transitioned off of it, that's a
> sign to me that the maintainer either doesn't know any modern DVCSes, or
> doesn't care about the project enough to transition to them.

[http://www.openbsd.org](http://www.openbsd.org)

~~~
__david__
Yes, I know. The *BSDs are the exception that proves the rule. I also said it
was "a smell" and not an outright dismissal.

------
mhd
This looks like it's handy for learning purposes, but is there actually any
practical use for this? I don't think that the code size and the pace of
development really require changing the core language.

Past re-implementations have focused either on the learning experience, code
size (for embedding or Unix "purity" reasons) or the license (i.e. not GPL).

~~~
Nelson69
Most of these tools are small enough and simple enough that they sort of "get
finished." Probably not interesting except to some purists.

There is something to be said for the nice clean implementations of some of
these tools. There is a little to be said for the language itself.

Really, something bigger and better is needed. A new HTTP server, a new
sendmail or something, a new DNS server. I'd love to see Go or Rust take on
Bind and produce a safe, secure, high performance implementation. The Ada guys
never stepped up and produced anything interesting to show their tools'
superiority, there is a lot more interest and community in Rust and Go. What's
the leakiest, buggiest part of the equation right now? Go make a better one.

~~~
v21
Well, they're building a browser engine...

------
krick
Nice. However, I feel there're two things really needed to be done here: move
all that cat|du|... to some src|source|whatever directory and add 'tests'
directory with somewhat more robust tests layout than the current one. It
would be really cool to replace (seriously, just for the arts sake) coreutils
with this one, but I wouldn't dare to do it if I'm unable to easily verify
that they both _work_ and _work at least almost as fast as coreutils_ on my
platform. It's important stuff, you know, otherwise it wouldn't be called
"coreutils", really…

------
fernly
It will be interesting when they get down the to-do list to fmt, which
contains the Knuth-Plass paragraph reflow algorithm -- the only C-language
version I've been able to find, and about the only readable one, after Knuth's
"literate" one.

[1]
[http://onlinelibrary.wiley.com/doi/10.1002/spe.4380111102/ab...](http://onlinelibrary.wiley.com/doi/10.1002/spe.4380111102/abstract)
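
For readers who haven't seen the idea: below is a much-simplified sketch in the same spirit as Knuth-Plass, i.e. choosing line breaks by minimising a cost over the whole paragraph instead of filling each line greedily. It is not the code in GNU fmt and not the full algorithm (no stretch/shrink, no penalties). The function name `reflow` and the cost (squared trailing slack, last line free) are assumptions made for illustration.

```rust
/// Break `words` into lines of at most `width` columns, minimising the sum of
/// squared unused columns over every line except the last.
/// Simplification: a word longer than `width` gets a line of its own.
pub fn reflow(words: &[&str], width: usize) -> Vec<String> {
    let n = words.len();
    // best[i]: minimal cost of laying out words[i..].
    // next[i]: index of the first word of the line after the one starting at i.
    let mut best = vec![usize::MAX; n + 1];
    let mut next = vec![n; n + 1];
    best[n] = 0;

    for i in (0..n).rev() {
        let mut len = 0;
        for j in i..n {
            // Length of words[i..=j] joined by single spaces.
            len += words[j].chars().count() + usize::from(j > i);
            if len > width && j > i {
                break; // adding word j would overflow the line
            }
            let slack = width.saturating_sub(len);
            // The last line of the paragraph is allowed to be short for free.
            let line_cost = if j + 1 == n { 0 } else { slack * slack };
            let total = line_cost.saturating_add(best[j + 1]);
            if total < best[i] {
                best[i] = total;
                next[i] = j + 1;
            }
        }
    }

    // Follow the recorded break points from the start of the paragraph.
    let mut lines = Vec::new();
    let mut i = 0;
    while i < n {
        let j = next[i];
        lines.push(words[i..j].join(" "));
        i = j;
    }
    lines
}
```

For the words `aaa bb cc ddddd` at width 6, a greedy filler produces `aaa bb` / `cc` / `ddddd`, leaving a very ragged middle line. The sketch instead picks `aaa` / `bb cc` / `ddddd`, which has a lower total cost.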

------
lukasm
Why Rust? Is it better at this than, say, Go?

~~~
octo_t
Well, Rust doesn't have garbage collection, and it's designed to replace* the
language that coreutils are currently written in - C (I don't think any GNU
coreutils are written in C++). Go is not designed to replace these languages.

*by replace I mean occupy some of their current marketshare.
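
To illustrate the "no garbage collector" point, here is a minimal sketch of the core of a `cat`-like tool. It is a hypothetical helper, not the uutils implementation. Everything it allocates is freed deterministically when it goes out of scope, so there is no collector that could pause the copy loop:

```rust
use std::io::{self, Read, Write};

/// Copies everything from `input` to `output` and returns the byte count.
/// Generic over `Read`/`Write` so it works for files, stdin/stdout, or
/// in-memory buffers alike.
pub fn cat<R: Read, W: Write>(mut input: R, mut output: W) -> io::Result<u64> {
    // `io::copy` streams the data through an internal buffer until EOF.
    let copied = io::copy(&mut input, &mut output)?;
    output.flush()?;
    Ok(copied)
}
```

A real tool would call something like `cat(io::stdin().lock(), io::stdout().lock())` for each input and handle flags and errors around it.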

~~~
retroencabulato
Simply comparing the two based on GC is very superficial.

~~~
__david__
Not really. The day my `cat` pauses to GC is the day I leave this field...

~~~
lomnakkus
Why? Do you think you would notice?

(I realize your post may have been tongue in cheek, but it can be hard to tell
on the interwebs.)

------
elktea
It's not really cross platform - whoami for example depends on libc which
afaik isn't native on Windows.

I wrote one of the first utilities for this when it was first opened up for
collaboration, so I hope it succeeds :) I need to go back and write tests for
my util.

~~~
icebraining
There are many "libc"s; according to the docs, Rust's libc is a module that
binds to the platform-specific libc implementation:
[http://static.rust-lang.org/doc/0.10/std/libc/index.html](http://static.rust-lang.org/doc/0.10/std/libc/index.html)
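
As a rough illustration of how platform differences can be handled without touching libc directly, here is a sketch that picks a platform-specific value at compile time with `cfg!`. The names `user_env_var` and `current_user` are hypothetical. Note the assumption: a real `whoami` asks the OS for the effective user, whereas environment variables can be unset or spoofed, so this shows only the selection mechanism.

```rust
use std::env;

/// Name of the environment variable that conventionally holds the login
/// name on the platform being compiled for.
pub fn user_env_var() -> &'static str {
    // `cfg!` is evaluated at compile time; only one branch is ever taken.
    if cfg!(windows) {
        "USERNAME"
    } else {
        "USER"
    }
}

/// Best-effort lookup of the current user's name; `None` if the variable is
/// missing or not valid Unicode.
pub fn current_user() -> Option<String> {
    env::var(user_env_var()).ok()
}
```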

------
bronson
I love this idea! When I get a few hours I'll have to try porting a small
coreutil... Seems like a great way to learn Rust.

Any chance of Rust taking over embedded systems programming? That's still
mostly done in C and quickly devolves into horror.

~~~
steveklabnik
> Any chance of Rust taking over embedded systems programming?

I don't know about 'taking over' but there are some people who are using Rust
for this use case.

~~~
bronson
That would be super cool. Anything publicly available to look at?

Freescale's Freedom boards are so cheap, so capable, and have such miserable
tooling (mbed, DIY usb, Processor Expert, CodeWarrior, ugh)... I wonder if
Rust could make them attractive.

~~~
steveklabnik
I just know that people come into IRC, and there was some discussion about
bitfields to help support embedded stuff. Not sure there's a lot written about
it yet.

------
SeanDav
Interesting project. Why is there no grep in this, or on the to-do list?

~~~
delroth
grep is not part of the coreutils.

