
Show HN: A simple, fast and user-friendly alternative to find, written in Rust - sharkdp
https://github.com/sharkdp/fd
======
coldtea
It would be nice to have a collection of modern/faster/saner alternatives to
common unix tools and such utilities (made in Rust or not) that becomes
standard -- or at least easily installable in Linux distros and OS X. Not to
replace those tools, but to supplement them.

Thinking of stuff like fd and:

ripgrep:
[https://github.com/BurntSushi/ripgrep](https://github.com/BurntSushi/ripgrep)
(grep/ag alt)

xsv: [https://github.com/BurntSushi/xsv](https://github.com/BurntSushi/xsv)
(csv tool)

exa: [https://the.exa.website/](https://the.exa.website/) (ls)

una: [https://github.com/jwiegley/una](https://github.com/jwiegley/una)
(multi-compression utils wrapper)

tokei:
[https://github.com/Aaronepower/tokei](https://github.com/Aaronepower/tokei)
(loc stats)

And of course this:
[https://github.com/uutils/coreutils](https://github.com/uutils/coreutils)

~~~
bm1362
I appreciate all these new incarnations of old-school tools, but I end up
sticking with the basics regardless. When I'm sshing into a box to figure out
what's going on, my toolset is mostly limited to top, find, xargs, awk, grep,
df, du, etc. Even in local development now, I'm mostly debugging on a docker
container running alpine or ubuntu.

Knowing the right incantations is useful in that context and keeps me from
installing better tools until they become part of the distro we deploy with.

~~~
foo101
What surprises me is that even after 25 years of the Internet, we cannot
bring new tools into our environment (be it a remote box I ssh into or a
trimmed-down docker image) on the fly.

The popular package management tools rely heavily on FHS and like to install
stuff into directories that require root permission.

Imagine if there was a tool that could download binaries of modern tools from
a certain repo and install them to our ~/bin. Imagine if we could use new
command-line tools as easily as we can securely download a fully functional,
complex application into a browser tab just by typing its URL!

~~~
laumars
You can. There's nothing stopping you from adding ~/bin to $PATH. Many
compilers allow you to specify the destination either via a ./configure flag
(for example) or an environment variable. Or there is always the option of
doing a `make build` and then manually copying the binaries into your ~/bin
directory without doing a `make install`.

You can also add ~/lib (for example) to your LD_LIBRARY_PATH if you want to
install custom libraries, and you can even have your own man page path too.
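A minimal sketch of that setup (the directory layout and the `hello-demo`
script are illustrative choices, not required names):

```shell
# Create a per-user prefix and make it visible to the shell,
# the dynamic linker, and man(1).
mkdir -p "$HOME/bin" "$HOME/lib" "$HOME/share/man"
export PATH="$HOME/bin:$PATH"
export LD_LIBRARY_PATH="$HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# A trailing colon keeps the system default man path (man-db behavior).
export MANPATH="$HOME/share/man:"

# Any executable dropped into ~/bin is now found like a system tool.
printf '#!/bin/sh\necho hello from ~/bin\n' > "$HOME/bin/hello-demo"
chmod +x "$HOME/bin/hello-demo"
hello-demo

# For autotools projects, the same idea via a configure flag:
#   ./configure --prefix="$HOME" && make && make install
```

The exported variables would normally go in ~/.profile or similar so they
persist across logins.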

All of this is already possible on Linux and Unix. However, I'm not sure it's
something you actively want to encourage, as allowing users to install
whatever they want on servers lowers the security to the level of the worst
user on that server. If it was really deemed necessary that users should be
able to install whatever they want, then I'd sooner roll out a dedicated VM
per user so at least their damage is self-contained (barring any visible
networked infrastructure).

~~~
coldtea
> _You can. There's nothing stopping you from adding ~/bin to $PATH. Many
> compilers allow you to specify the destination either via a ./configure
> flag (for example) or an environment variable._

Of course you can, technically. The parent laments that it's not easier to
achieve. Already you're talking about manually building, for example. Where's
the package manager that allows for that too, not just the central repo? (I'm
not just asking whether such a manager exists in some form, but where it is
in modern popular distros.)

> _All of this is already possible on Linux and Unix, however I'm not sure
> it's something you actively want to encourage, as allowing users to install
> whatever they want on servers lowers the security to the level of the worst
> user on that server._

Which also touches on the parent's question. Why is it not easier AND safer?
It's not like we don't have security models that allow for such things...

~~~
hnlmorg
The GP gave an example of one easier and safer way of doing this: sandbox
each user in their own Linux instance.

Anything short of that would be sacrificing security for the sake of
convenience.

------
falcolas
I'm sorry to be something of a negative Nancy - and I'm sure I'm a corner case
- but this is not really an alternative to find. It's an alternative for your
editor's fuzzy find and a better version of shell globs.

The absence of -delete, -execdir, -mtime, and the ability to chain multiple
patterns together for non-trivial use cases means this is practically useless
in most places where `find` is used in day-to-day work. Not to mention the
"opinionated" choice to ignore a dynamic set of files and directories (what
files were skipped? In any directory, or just git repos? Does it backtrack to
find a git directory and .gitignore? Does it query out to git? Does it
respect user/global .gitignore settings? Does it skip Windows hidden files
from a unix prompt, or a dotted file on Windows?), with the options hidden
behind two separate flags.

Perhaps it's just because I'm used to using 'find', but when I reach for it,
it's because I need -delete or -execdir, or I'm writing automation that really
needs -mtime and chained patterns.

So, I would suggest that you don't call this an alternative to find; it's not.
A replacement for shell globs, sure. A replacement for poor file finding in
text editors, OK. Just... not find.

EDIT: Oh, 'fd' also already exists.
[https://github.com/knu/FDclone](https://github.com/knu/FDclone)

~~~
sharkdp
Thank you for the feedback.

> So, I would suggest that you don't call this an alternative to find; it's
> not. A replacement for shell globs, sure. A replacement for poor file
> finding in text editors, OK. Just... not find.

It's literally called "a simple [..] alternative to find" and the very first
sentence in the README says that "it does not seek to mirror all of find's
powerful functionality". I'm not sure anymore how far I have to back off until
everyone will be okay with my "advertising" :-)

~~~
tangus
But in fact it does not seek to mirror _any_ of find's powerful functionality.

Maybe advertise it as "a better ls -R | grep -i"?
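For reference, that pipeline looks roughly like this (the file names are made
up for the demo); note that unlike fd, ls -R prints bare names under
per-directory headers rather than full paths:

```shell
# Set up a throwaway directory with a nested file to search for.
demo=$(mktemp -d) && cd "$demo"
mkdir -p src && touch src/Main.java notes.txt

# Recursive listing, case-insensitively filtered.
ls -R . | grep -i main
```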

------
fuzzygroup
Yes, I'll agree with the people who say that this isn't a drop-in replacement
(fewer features and different syntax), but I also just don't care. fd's
defaults are brilliantly simple and just plain make sense. Yes, I can use
find, but I always, always end up looking up an example. fd was simple enough
that I very quickly figured it out by trial and error. And it really is fast.

Thank you sharkdp -- really nicely done. Appreciated.

~~~
sharkdp
Thank you for the feedback, I'm glad you like it!

------
josteink
My main use case for find is not only finding files, but also acting on them
through -exec.

Unless I’m missing something, that’s not supported with this tool.

~~~
derriz
I'm curious why you (or anyone) uses -exec? It's often painfully slower than
piping through xargs and I find the syntax (the semicolon and braces and the
required quoting/escaping) uncomfortable. I haven't used -exec in 20 years.

~~~
ekns
It's not even slower (and it's safe beyond the argv limit) if you use "+"
instead of the semicolon.

    
    
      find -type f -exec sed -i s/foo/bar/ {} +
    

This only invokes the command twice for 100k files (assuming we can pass 64k
arguments to sed per invocation).
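A quick way to see the batching difference (the wrapper script and file names
here are invented for the demo):

```shell
# Count how often the -exec'd command actually runs, via a tiny
# logging wrapper.
demo=$(mktemp -d) && cd "$demo"
for i in 1 2 3 4 5; do touch "file$i.txt"; done

cat > counter.sh <<'EOF'
#!/bin/sh
echo "invoked with $# args" >> invocations.log
EOF
chmod +x counter.sh

find . -name '*.txt' -exec ./counter.sh {} +   # batches args: 1 invocation
find . -name '*.txt' -exec ./counter.sh {} \;  # per file: 5 invocations
wc -l < invocations.log
```

With `+`, find fills the argument list up to the system limit, so five files
fit in a single invocation; with `\;` the command runs once per file.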

~~~
derriz
Thanks! I never knew about the + terminator.

------
wruza
Any plan to provide find compatibility for a drop-in replacement? My
find/locate usage patterns are not worth learning a new sophisticated tool
for a 5-second speedup once a week, and I suspect that I'm not alone. This is
the main problem with new tools: not repeating the old, familiar style makes
them unattractive for less adventurous users.

Side question: why is speed important? Which tools use fd to gain required
performance?

~~~
sharkdp
Thank you for the feedback.

No, fd does not aim to be a drop-in replacement for find. It was actually
designed as a "user-friendly" (I know, I know...) alternative to find for
most use cases: "fd pattern" vs. "find -iname '*pattern*'", smart case, etc.
Also, in my view, the colored output is not just a way to make it look fancy;
it actually helps a lot when scanning a large output in the terminal.

The speed is not _hugely_ important to me, but I wanted it to be at least
comparable in speed to find (that's why I decided to try out Rust for this
project). Otherwise, nobody would even consider using it.

I know that it will not be for everyone and I also know that it's a lot to ask
someone to learn a new tool, but I'm happy if there are some people that enjoy
using it. I certainly do ;-)

~~~
weaksauce
any plans on implementing the exec command from find in fd?

~~~
nine_k
Why would it be significantly better than using a pipe?

~~~
jimktrains2
Because you'd have to pipe into xargs, and then you have issues with spaces
in file names. Sometimes -exec is simpler for a single command.
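The whitespace problem is easy to reproduce (the file name is made up for the
demo):

```shell
demo=$(mktemp -d) && cd "$demo"
touch "my file.txt"

# Naive pipe: xargs splits on whitespace, so ls is handed two bogus
# names ("./my" and "file.txt") and the pipeline fails.
find . -name '*.txt' | xargs ls 2>/dev/null || echo "naive pipe broke"

# A NUL-separated handoff keeps the name intact.
find . -name '*.txt' -print0 | xargs -0 ls
```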

~~~
RHSeeger
You can use -print0 and -0 to handle spaces.

    
    
        find . -name \*.java -print0 |xargs -0 grep something

~~~
IshKebab
Much more error prone.

------
reacweb
IMHO, there is no demand for this kind of "improved tools collection". In the
programming world, the programmer creates a sweet working environment to fit
his needs by creating many aliases and functions and by customizing his
desktop. He complains that the tools collection is complex, but has no
opportunity to progress because he rarely uses the tools directly. In the
"sysop" world, the user is always logged in remotely on varying machines.
There is no incentive to install and customize a machine, because he will
quickly have to work on another machine without this customization.

I think we need a saner alternative to the man pages of the classic unix
tools collection: something less focused on the seldom-used options and more
focused on the traps to avoid and the useful tricks. The bad quality of the
documentation is more annoying than the incongruous syntax of the tools.

------
hzhou321
For me, the speed of find and grep is never a concern, so I do not find these
speedy alternatives sufficient to compensate for their non-universality. And
when the speed of grep and find does become a concern, I would think the
issues are somewhere else -- like having tens of thousands of jpg files
scattered everywhere that you need to find in the first place. When the
issues are somewhere else, replacing find with a _better_ tool only hides the
problem and makes it worse.

------
majewsky
It's strange to advertise this as a find(1) replacement when it covers maybe
1% of find(1) use cases.

    
    
      find -type d -empty -delete
      find -type l -exec readlink -f {} +
    

etc. are not covered at all.

~~~
raverbashing
1% of possible cases but 90% of actual usage I'd say

~~~
majewsky
Then you have a different usage pattern than I do. For simple name globbing, I
usually just use shell globs, e.g.

    
    
      ls src/**/*_test.go
    

It might be marginally slower than find or fd, but that usually doesn't
matter because I'm dealing with a small number of files.
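For the curious: zsh supports `**` out of the box, while bash needs
`shopt -s globstar` (bash 4 or later). A quick demo with made-up paths:

```shell
demo=$(mktemp -d) && cd "$demo"
mkdir -p src/pkg/deep
touch src/pkg/deep/util_test.go src/main.go

# In bash, ** must be enabled first; zsh has it by default.
shopt -s globstar
ls src/**/*_test.go
```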

~~~
lkbm
Oh, `**`! What I usually end up doing is:

1. ls */foo

2. ls */*/foo

3. ls */*/*/foo

4. "I guess it doesn't exist."

`**` would have been useful to know. But now I have fd, so that's nice. :-)

------
benmarten
It is incredibly fast! Thanks a lot for this great tool!

~~~
sharkdp
Thank you for the feedback! Most of the credit for the speed goes to the
amazing Rust modules 'ignore' and 'regex', which are also used by ripgrep
([https://github.com/BurntSushi/ripgrep](https://github.com/BurntSushi/ripgrep)).

~~~
nickcw
Did you do a benchmark with `--threads 1`? I suspect most of the speedup is
not coming from the threading, as the listing of the files is likely to be IO
bound (or at least bound by calling into the kernel to read the directory
listings).

That has certainly been my experience in the past when experimenting with
this sort of thing: more threads don't make much of a difference.

~~~
sharkdp
> Did you do a benchmark with `--threads 1`?

I did. You are right, multi-threading does not give a linear speedup, but it
makes fd about a factor of three faster on my machine (8 virtual cores). With
`--threads 1`, fd is on par with 'find -iname', but still faster than 'find
-iregex'.

------
jnwatson
How does it compare in speed to 'ag' (i.e. the silver searcher) with the -g
option?

~~~
coldtea
rg (ripgrep -- the Rust-made grep/ag-style tool) is already quite a bit
faster than ag.

------
z1mm32m4n
I've found that fzf has replaced all uses of find for me except in scripts.
fzf has the benefits of being bound to a single key binding and showing me
results as I type, rather than after the command runs.

~~~
pmoriarty
fzf is fine for searching just for filenames, but it does nothing else that
find does.

It can't even search specifically for directories, as far as I can see, never
mind searching for files/dirs with certain permissions, ages, etc.

It's also useless for non-interactive uses such as scripting.

I like fzf, but it's really not anywhere close to a complete find replacement.

~~~
sa46
You're right that fzf doesn't replace find. fzf is just an interactive
filtering tool; you can pipe anything into it. find is a common way of
populating it with data.

I suspect the GP meant that fzf replaced the pattern 'find | xargs $binary'
for small interactive use cases. It's much nicer to do '$binary <invoke
fzf>'. I use fzf the same way, with key bindings to select directories and
files powered by find.

------
tayo42
Why is this faster than find?

~~~
tyingq
_"Concerning fd's speed, the main credit goes to the regex and ignore crates
that are also used in ripgrep."_

~~~
sharkdp
Yes. For simple searches, the main reason is that 'fd' walks the directory
tree in a multithreaded fashion (thanks to the 'ignore' crate).

~~~
zestyping
Interesting. Why does multithreading make a big difference? I would have
assumed that disk seek latency was the speed-limiting factor; what am I
missing?

~~~
tyingq
I believe the comparative benchmarks are done with the OS file caches already
primed.

------
Dowwie
It's exciting to see so many problem solvers use Rust to improve core Linux
utilities. I've been using exa as an 'ls' replacement for some time now. 'fd'
looks promising but still has room to grow. Finding a file and then acting
upon it is an important feature that ought to be addressed.

------
hehno
I'm honestly wondering when we will have the whole coreutils rewritten in
Rust.

~~~
dbaupp
[https://github.com/uutils/coreutils](https://github.com/uutils/coreutils) is
a long way along.

~~~
oblio
The docs don't say: is this meant to replace Busybox or the actual GNU
coreutils? As in, all GNU features and flags (which are generally much more
comprehensive and IMO nicer than plain old POSIX features).

~~~
dbaupp
I do think it (optionally) provides a busybox-esque single-binary interface,
but I do not know whether this also translates into a busybox-esque command-
line interface.

------
adwhit
ripgrep, exa and now fd... any other CLI tools I should cargo-install?

~~~
seagreen
una ([https://github.com/jwiegley/una](https://github.com/jwiegley/una)), a
universal unarchiver which handles choosing between tar, unzip, etc. for you.

~~~
rahiel
You can't install this with cargo. Una is a Haskell program; you can get it
with `stack install una`.

~~~
seagreen
Whoops, good point. In the long run I think Una is a good candidate for being
installed at the system level, because you really want to install unzip, etc.
along with it.

EDIT: The idea being that if you install it with your system package manager
it can have unzip etc. as dependencies so that's taken care of automatically.

------
pmoriarty
I wonder how this compares to the speed of zsh's globbing, which is what I've
mostly switched to using instead of find.

~~~
sharkdp
zsh globs are about a factor of 5 slower (for this example):

    
    
        > time fd -sIe jpg > /dev/null
        1,24s user 0,77s system 758% cpu 0,265 total
        
        > time ls ~/**/*.jpg > /dev/null
        0,53s user 0,97s system 98% cpu 1,518 total

------
xfactor973
For the benchmarks, did you clear the page cache in between runs? It's crazy
fast and seems a little suspect.

~~~
sharkdp
Thank you for the feedback!

The benchmarks that are mentioned in the README are performed for a "warm
cache", i.e. I'm running one of the tools first to fill the caches. Then, I'm
performing multiple runs of each tool (using bench for some nice statistics)
such that both tools profit from the warmed-up cache.

I also perform other benchmarks where I clear the caches. In this case, 'fd'
is usually even faster.

The scripts for both warm and cold cache are here:
[https://gist.github.com/sharkdp/4bc3e5f5ea9df2f29c02ede50634...](https://gist.github.com/sharkdp/4bc3e5f5ea9df2f29c02ede50634b16a)

~~~
dbdr
> using bench for some nice statistics

'bench' turns out to be this tool:
[https://github.com/Gabriel439/bench](https://github.com/Gabriel439/bench)

It seems to be quite useful, and I was not aware of it, thanks! Would probably
be nice to have it packaged in distributions...

------
wheresmyusern
hey david, you can omit the 'static lifetime in the root directory string
declarations if you want!

[https://github.com/rust-lang/rfcs/pull/1623](https://github.com/rust-
lang/rfcs/pull/1623)

~~~
sharkdp
Thank you for the hint! Unfortunately, this is not in Rust 1.16, which I
currently still want to support.

------
smegel
> Ignores patterns from your .gitignore, by default.

Do you need to be in the current git directory for this to happen, or all git
dirs that happen to be traversed?

And are you using an NFA-based regex engine?

~~~
sharkdp
It will work for all git directories that are encountered. This behavior can
be disabled with the '-I' flag, if needed.

fd uses Rust's regex engine
([https://github.com/rust-lang/regex](https://github.com/rust-lang/regex)),
which is based on finite automata.

~~~
JepZ
I am not sure that's a smart default. I frequently exclude generated files
from git repositories, and when I want to find something, I do not see why I
would not want to find such a file if my pattern matches it.

I would prefer to have a switch that enables the .gitignore logic instead,
but as I don't know the author's use case, theirs might be valid too.

~~~
Sean1708
Surely the common case is that you _don't_ want to search generated files? Or
at least I personally am almost never interested in generated files, because
they're not the ones I'm going to be changing/doing stuff with.

------
Karrot_Kream
Will we ever be able to promote a tool without talking about the programming
language it is implemented in?

~~~
dysoco
Yeah when it's written in Python or C and not Go/Rust.

~~~
mmirate
Aye, that's because being written in Python, C or Go is - in my eyes - a
strike _against_ the project in an absolute sense, because it will be more
buggy and difficult to maintain due to lacking the functional-programming
abstractions and powerful static-type system of a good ML descendant such as
Rust or OCaml. (Or of Haskell, but it tends to lack the practicality of the ML
family.)

So of _course_ one does not trumpet that one's project is written in Python, C
or Go.

n.b. I said "absolute sense" because all of this is, of course, inapplicable
when searching for libraries specifically for a language such as Python, C,
or Go.

~~~
burntsushi
I don't buy it. People say, "X (written in Foo)" because there is
fundamentally an interest in the fact that something is written in Foo (and
this may indeed depend on the nature of X). Back when Go was new and shiny, it
was the same situation.[1] :-)

[1] -
[https://news.ycombinator.com/item?id=4685594](https://news.ycombinator.com/item?id=4685594)

~~~
mmirate
> there is fundamentally an interest in the fact that something is written in
> Foo (and this may indeed depend on the nature of X).

That speaks to a deficit in human cognition, that slapping Foo's name on
something makes it interesting only based on Foo's "shininess" rather than
Foo's technical merits. Neophilia, perhaps?

Anyway. Go has been in roughly the same state of "another purposefully boring
Java-like, but instead of generics it has a different shade of OO and a full
gamut of integer machine types" for its entire life thus far, which is why
I've never understood most of its hype.

Rust, on the other hand, actually has some serious motivation to its learning
curve and its developing ecosystem.

~~~
setr
I'd say there are two (positive) reasons to it, since it only really appears
for newer languages that are just gaining popularity

1\. An improved tool was made, and presumably was made easier by the language
somehow. 2\. The tool serves as example of larger product written in that
language

Both facilitate further usage of the language (by evangelism), show the
language as being up-to-snuff (production-viable), and the existence of this
codebase acts as an example for others within that language community.

It's not so much the language makes this tool better, but for that particular
community, that the language was used is important, possibly more so than the
tool itself.

~~~
mmirate
> and presumably was made easier by the language somehow.

Right, and outside a few very narrow domains that are helped by the bolted-on
concurrency, Go generally fails to deliver on those presumptions, for the same
reasons that Java wouldn't if it were invented when Go was.

------
alvil
Pet: [https://github.com/knqyf263/pet](https://github.com/knqyf263/pet) and
Peco: [https://github.com/peco/peco](https://github.com/peco/peco)

