
Ack is a grep-like tool, optimised for programmers - quasque
http://www.betterthangrep.com/
======
pie
It's also worth checking out The Silver Searcher:
[https://github.com/ggreer/the_silver_searcher](https://github.com/ggreer/the_silver_searcher)

~~~
sillysaurus3
_How is it so fast? Files are mmap()ed instead of read into a buffer._

It's hard to believe this would give a significant performance boost. Is there
evidence of this?

~~~
tobinfricke
There is a nice (and often posted) mailing list post explaining some of the
reasons GNU Grep is so fast:

[http://lists.freebsd.org/pipermail/freebsd-current/2010-Augu...](http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html)

He mentions: "So even nowadays, using --mmap can be worth a >20% speedup."

~~~
sillysaurus3
Now I'm burning with curiosity. I have to know why! My plan:

- replicate the experiment, confirm --mmap shaves off a non-negligible amount
of time. It could be that his computer happened to be running something in the
background that was using his hard drive, for example, which would skew the
results.

- look at the code, figure out the exact difference between what --mmap is
doing and what it does by default. Confirm that the problem isn't in grep
itself (it's probably not, but it's important to check).

- dig into the kernel source to figure out the difference under the hood and
why it might be faster.

~~~
makmanalp
I wonder if it has to do with not having to copy data back and forth between
kernel and userspace. My mildly uneducated thought is that you could do this
with splice() or whatever, but mmap is an easy drop-in replacement.

edit: I've been reading your posts for a while and I like them, but I keep
wondering, why do you have sillysaurus1-2-3?

~~~
sillysaurus3
That's what has me so curious, because it doesn't seem like copying between
kernel/userspace should account for a 20% speed drop. Once data is in the L3
CPU cache, it should be inexpensive to move it around.

Regarding my ancestry, I'm sillysaurus3 because I've (rightfully) been in
trouble twice with the mods for getting too personal on HN. I apologized and
changed my behavior accordingly, and additionally created a new account both
times to serve as a constant reminder to be objective and emotionless. There's
rarely a reason to argue with a person rather than with an idea. Debating
ideas, not people, has a bunch of nice benefits: it's easier to learn from
your mistakes, it makes for better reading, etc. It's pretty important,
because forgetting that principle leads to exchanges like
[https://news.ycombinator.com/item?id=7700145](https://news.ycombinator.com/item?id=7700145)

Another nice benefit of creating a new account is that you lose your
downvoting privilege for a time, which made me more thoughtful about whether a
downvote is actually justified.

~~~
kbenson
Possibly the OS is doing interesting things with file access and caching and
opting out of that has benefits for this particular workload?

...

I just skimmed the bsd mailing list email on why grep is fast which was linked
up-thread, and it seems that's somewhat the case. It sounds like since they
are doing advanced search techniques on what matches or can match, they use
mmap to avoid requiring the kernel copy every byte into memory, when they know
they only need to look at specific ranges of bytes in some instances. At least
that was the case at some point in the past.

 _Finally, when I was last the maintainer of GNU grep (15+ years ago...), GNU
grep also tried very hard to set things up so that the _kernel_ could ALSO
avoid handling every byte of the input, by using mmap() instead of read() for
file input. At the time, using read() caused most Unix versions to do extra
copying._

P.S. Nice attitude, it earned an upvote from me. Which is probably one reason
why your third account has more karma than my first.

~~~
makmanalp
Right, I think the point of Boyer-Moore is that it lets the search skip
large chunks of the text.

So the assumption is that those pages never even get paged in, but I
think that'd only be the case when the pattern is at least as long as
the page size (usually 4KB!), which is not the case in the example in the
mailing list. So the mystery continues!

------
djeikyb
I use ag[2], which is pretty much the same as ack, but even faster. The other
day I was using it to find all instances in all projects of a list of
problematic method names[1], in case anyone wants to see a real world use
case.

[1]: [http://unix.stackexchange.com/questions/108471/no-output-usi...](http://unix.stackexchange.com/questions/108471/no-output-using-parallel-in-tandem-with-ag-or-ack/108472#108472)

[2]:
[https://github.com/ggreer/the_silver_searcher](https://github.com/ggreer/the_silver_searcher)

~~~
masklinn
The only annoyance with ag is that it does not have ack's quick filters, e.g.
ack --py versus ag -G '\.py$' (and ack's type flags can include multiple file
extensions).

~~~
wahnfrieden
It does have them now, update to the latest release :)

------
revscat
For Java programmers who use Silver Searcher or ack, this lets you search all
jars in a directory tree for a given string. Requires GNU Parallel:

    
    
        function ffjar() { 
          jars=(./**/*.jar)
          print "Searching ${#jars[*]} jars for '${*}'..."
          parallel --no-notice --tag unzip -l ::: ${jars} | ag ${*} | awk '{print $1, ":", $5}'
        }
    

Because it uses parallel it spreads the workload across CPUs. I use this
frequently when I have to update/rewrite/create build scripts, and I know a
class exists but not which jar file it lives in.

~~~
garblegarble
That's pretty neat! I use the following (slower, but short enough that it
sticks in my memory so I can do it at a colleague's machine):

    
    
      find . -name "*.jar" | xargs -tn1 unzip -l | grep SomeClass

~~~
burntsushi
`xargs` also has a `-P` flag which instructs it to spread work over
multiple processes. Given that you already have `-n1`, adding `-P 0` (in GNU
xargs) will have it run as many processes at a time as it can, which keeps all
your CPUs busy.
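A minimal sketch of `-P` in use, with `grep -l` standing in for `unzip -l` so it runs without any jars around. The function name `parallel_grep_l` and the job count of 4 are made up for illustration, and `-print0`/`-0` are an addition so file names with spaces survive:

```shell
# Hypothetical helper: list files under a directory that contain a fixed string,
# running up to 4 grep processes at a time (one file per process, as with -n1).
# Note: xargs exits non-zero when any single grep finds nothing, so the exit
# status says little; look at the output instead. Output order is not stable.
parallel_grep_l() {
  pattern=$1
  dir=${2:-.}
  find "$dir" -type f -print0 | xargs -0 -n 1 -P 4 grep -lF -- "$pattern"
}
```

Usage would be something like `parallel_grep_l SomeClass src/ | sort`.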

~~~
ole_tange
Be aware that xargs does not deal nicely with race conditions: (Parallel grep)
[http://www.gnu.org/software/parallel/man.html#differences_be...](http://www.gnu.org/software/parallel/man.html#differences_between_xargs_and_gnu_parallel)

------
gegtik
It should be noted that "faster" is due to the fact that it limits itself to
searching the subset of files that have code extensions.

I've tested grep against ack and ag for large text files and grep won handily,
especially the latest version of grep.

Also note you can use GNU Parallel to run multiple greps

~~~
djeikyb
Yeah, I think it's important not to throw grep away. Ack is really for when
you don't know or can't be bothered to explicitly mention the (several)
specific files to search.

~~~
petdance
> I think it's important not to throw grep away

Exactly. There's no reason that you can't have grep AND ack AND ag in your
toolbox to choose from.

~~~
pekk
Sure, there's no reason, but it still sucks to have 3 tools for essentially
the same task due to defects in each.

~~~
gegtik
Can you elaborate on these defects?

------
616c
For some reason, it took me one or two minutes of rereading to realize it was
ack, not awk. I think this website was going to be some ironic trash-talking
about grep. Then I saw "written in Perl" and I got so confused my head almost
exploded.

Anyway, neat tool. Will check it out soon.

~~~
prakashk
> I think this website was going to be some ironic trash-talking about grep.

Andy Lester, the primary author of ack, is one of the nicest guys I know of.
You wouldn't see any trash-talking on that site. He even changed the name of
the site from "better than grep" to "beyond grep" [1].

In fact, he gives props to similar tools like ag and others [2].

[1]
[https://news.ycombinator.com/item?id=5578304](https://news.ycombinator.com/item?id=5578304)
[2] [http://beyondgrep.com/more-tools/](http://beyondgrep.com/more-tools/)

------
jaredmcateer
I'm assuming the title got rewritten? It is probably important to note:

"ack versions 2.00 to 2.11_02 are susceptible to a code execution exploit.
Please upgrade to 2.12 or higher ASAP."

~~~
tyilo
The 2.12 version was posted December 3 2013, so no.

------
cake
I couldn't live without this tool today, very useful to quickly find where
_that_ method is used for example.

~~~
Kiro
Is it better than just searching the project in your IDE?

~~~
base698
I seem to find things faster than my coworkers. The ability to quickly filter
out non relevant files and do nested searches of the searches is the strong
point. Unix as an IDE and all.

~~~
sillysaurus3
Examples, please? If you have the time.

~~~
base698
These are just a few examples I do pretty frequently:

Nested Search:

    
    
      ag functionName | ag moreSpecificContextLikeArgs
    

Find variable changed yesterday:

    
    
      git log -p --since yesterday | ag varName
    

Find controllers changed yesterday:

    
    
      git log --oneline --name-only --since yesterday | ag controllers
    

What files did I work on last week:

    
    
      git log --name-only --oneline --author me --since 1.weeks
    

How many JS commits did I do last month?

    
    
      git log --since 1.months --author me  --name-only | ag -i '\.js$' | wc -l
    

How many JS commits did I do on each file last month?

    
    
       git log --since 1.months --author me  --name-only | ag -i '\.js$' | awk '{arr[$1]++} END {for(i in arr) print arr[i]," - ",i}' | sort -r -n
    

Change a "classname" from MyClass to BetterName:

    
    
      ag MyClass # verify it only finds what you think it will
      # -l lists each matching file once; -i.bak keeps a backup and works with GNU and BSD sed
      ag -l MyClass | while IFS= read -r file
      do
        sed -i.bak 's/MyClass/BetterName/g' "$file"
      done

~~~
a_e_k
Nested searches are handy. They can also be used to find cases of one thing
within a few lines of proximity to another:

    
    
        ack -C5 firstThing | less
        /secondThing
    

One bonus ack trick that I also like is bulk loading into Emacs for
further manipulation (e.g., multi-occur and then occur-edit-mode):

    
    
        emacsclient -n `ack -l functionName`

------
senthilnayagam
[https://github.com/monochromegane/the_platinum_searcher](https://github.com/monochromegane/the_platinum_searcher)
I use pt for stuff that is not in git. It's written in Go and it is fast.

~~~
Flenser
Pt also has first party builds for Windows and Mac

------
pretz
It's worth noting that if you're always searching in a git repo, git-grep is
pretty comparable to ack and you already have it installed.
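For anyone who hasn't tried it, a rough sketch of what that looks like. The wrapper name `repo_grep` is made up here; the underlying calls are just `git -C <dir> grep -n -e <pattern> -- <pathspec>`:

```shell
# Hypothetical wrapper: search the tracked files of the repo in directory $1
# for pattern $2, optionally limited to a pathspec $3 (defaults to everything).
# Untracked files are not searched, which is part of why it feels like ack.
repo_grep() {
  repo=$1
  pattern=$2
  pathspec=${3:-.}
  git -C "$repo" grep -n -e "$pattern" -- "$pathspec"
}
```

So `repo_grep . functionName '*.js'` would print `file:line:text` for matches in tracked JavaScript files only.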

~~~
badman_ting
Totally, I use it all the time now. Here's a pretty great writeup on git grep
that was on HN recently:
[http://travisjeffery.com/b/2012/02/search-a-git-repo-like-a-...](http://travisjeffery.com/b/2012/02/search-a-git-repo-like-a-ninja/)

------
antisocial
My ex-colleague introduced this to me and I thank him every time I use ack. It
is really so much better than grep. I have set up a bunch of aliases to search
by file type and it makes me so productive.

------
foz
I used Ack a lot when I was coding Perl (it's been a while). After I switched
to Ruby, I used rak [1], which seemed easier to use most of the time, and
nearly identical.

However, when you just want to find stuff fast, it's annoying to have to deal
with Perl/CPAN or RVM/Rubygems, especially when the dependencies are not
installed on your server/workstation.

That's why I've switched to silver searcher (ag) [2], as it can be installed
with any OS package manager (brew, apt, yum).

[1] [http://rak.rubyforge.org](http://rak.rubyforge.org)

[2]
[https://github.com/ggreer/the_silver_searcher](https://github.com/ggreer/the_silver_searcher)

~~~
ediblenergy
ag is not available as a package on Debian stable. ack is, though, as ack-grep.
So if you don't want to mess with CPAN that's fine. The non-CPAN instructions
are right on the website.

------
gejjaxxita
The problem with such tools is often their lack of ubiquity. I don't want to
start using ack, forget a lot of my grep knowledge, only to ssh into a server
and need grep.

The benefit of grep's ubiquity outweighs any small advantage ack has in
usability.

~~~
davorb
ack is a tiny perl script that you can simply wget and add to your path. I
hear what you are saying, and I think that it applies to a lot of utilities,
but not ack. Imho ack is so much better than grep that it is worth the hassle
of having to install it every now and then.

------
andrelaszlo
Old thread here:
[https://news.ycombinator.com/item?id=6075083](https://news.ycombinator.com/item?id=6075083)

------
eliben
Similar tool in pure Python:
[https://github.com/eliben/pss](https://github.com/eliben/pss)

~~~
abind
Thank you for this! `pip install pss` is one of the first things I do on a new
computer :)

------
petdance
There's a list of other tools for searching source code besides ack at
[http://beyondgrep.com/more-tools/](http://beyondgrep.com/more-tools/),
including other grepalikes and indexing tools like ctags and cscope.

I suggest that you need not limit yourself to only one tool for your code
searching. Toolboxes FTW.

------
sitaramc
Grep is just as good, and with the recent order of magnitude speed improvement
on non-C locales that they made -- see
[https://lwn.net/Articles/586899/](https://lwn.net/Articles/586899/) -- which
may not have made its way into distros yet, it's easily the best option.

I have a simple wrapper over egrep (see
[https://github.com/sitaramc/ew](https://github.com/sitaramc/ew) ) that adds
those little extras (ignoring binary files, ignoring VCS directories...).

I'm sure it's improved since the days I tried it, but I tend to be permanently
prejudiced against tools where the author can't/won't document the file
selection logic and says "there's really no English that explains how it
works" when someone asks.

------
TorKlingberg
Ack is great, but watch out if you have any source files with unusual file
name extensions. Ack will only search file types it knows about. Also if you
have your whole source tree in your editor or IDE, then you may as well search
there instead.

~~~
npongratz
Addressed in ack's FAQ [0], and in its own section of the manual [1].

The manual explains: "This is done with command line options that are best put
into an .ackrc file - then you do not have to define your types over and over
again." Then comprehensively describes options for both command line and
.ackrc.

[0]
[http://beyondgrep.com/documentation/ack-2.12-man.html#faq](http://beyondgrep.com/documentation/ack-2.12-man.html#faq)

[1]
[http://beyondgrep.com/documentation/ack-2.12-man.html#defini...](http://beyondgrep.com/documentation/ack-2.12-man.html#defining_your_own_types)

~~~
TorKlingberg
Yes, I should have added "by default".

------
raverbashing
Yeah, I think grep sucks as well, that's why I created my own little ack thing

(since it's in a sorry state I won't post it here, and it will attract the
rage of people for not being compatible with grep/in python/no docs/etc)

------
bch
How does this compare to cscope[0] ?

edit: or ctags[1] ?

---

[0] [http://en.wikipedia.org/wiki/Cscope](http://en.wikipedia.org/wiki/Cscope)

[1] [http://en.wikipedia.org/wiki/Ctags](http://en.wikipedia.org/wiki/Ctags)

~~~
astine
cscope and ctags are language-syntax-dependent search tools for C-like
programming languages. They let you search specifically for all instances of a
function named 'foo', for example. Ack is instead just a normal pattern matcher
like grep, except that it has some cleverness by which it knows not to search
certain file types and directories. It will return all lines which match a
string rather than just variable names or functions.

~~~
bch
cscope lets you search for arbitrary text strings and egrep patterns as well.

    
    
      "The fuzzy parser supports C, but is flexible enough to be useful for C++ and Java, and for use as a generalized 'grep database' (use it to browse large text documents!)"[0]
    

Exuberant Ctags supports 41 languages[1], incl. JavaScript, Tcl, Ruby, TeX,
awk, etc.

I was wondering if there are obvious ack killer features or areas where it's
remarkably superior.

[0] [http://cscope.sourceforge.net/](http://cscope.sourceforge.net/)

[1]
[http://ctags.sourceforge.net/languages.html](http://ctags.sourceforge.net/languages.html)

~~~
petdance
They solve different problems. Ctags and cscope index a corpus of source code,
usually tied into another tool, like Ctrl-] in vim. ack searches the files
every time.

------
VeejayRampay
I recently found out about git-grep. It's good and quite fast.

------
herokusaki
Is there a tool that is to find what ack is to grep?

~~~
e12e
It depends what you mean... as others have mentioned[1], neither ack nor ag is
particularly fast compared to grep; they just give you a lot of specialized
context (search the right files). As such, what would be to find as ack is to
grep? A find that automatically filters out files that are not source code
files?

[1] Things might have changed since the last time I personally tried this, at
the time grep was significantly faster, especially for fixed string searches
-- but then again, I never tried to coerce up a command line that gave the
same kind of output that ack/ag does (which could probably be hammered out with
help of awk). So don't take my comment to suggest that these tools aren't
valuable, just maybe not for the reason some people (notably not the authors
of said tools) claim.

~~~
herokusaki
> find that automatically filters out files that are not source code

Not just that but an extensible set of file type filters that are simple to
invoke is what I had in mind. E.g., the tool would let you perform searches
like

    
    
      find++ --Python  projects/archive/200?
    

or

    
    
      find++ --video trailer
    

where in the latter case the hypothetical find++ would refer to my config to
get a list of video file extensions and then print a list of all files in the
current directory and its subdirectories with the word "trailer" in their
name. For better effect it would ship with useful filters like "--video" by
default.
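A rough sketch of that idea as a shell function, going by extension only. Everything here is hypothetical: the name `findpp`, the argument order, and the extension lists are made up for illustration, and a real tool would read the type table from a config file rather than hard-coding it:

```shell
# Hypothetical "find++": map a type name to a list of extensions, then let
# find match file names containing an optional word and ending in one of them.
# Usage: findpp KIND [WORD] [DIR]
findpp() {
  kind=$1
  word=${2:-}
  dir=${3:-.}
  case $kind in
    python) exts='py pyw' ;;
    video)  exts='mp4 mkv avi mov' ;;
    *) echo "unknown type: $kind" >&2; return 2 ;;
  esac
  for ext in $exts; do
    # -iname matches case-insensitively, so TRAILER.MP4 is found too
    find "$dir" -type f -iname "*${word}*.${ext}"
  done
}
```

With this, `findpp video trailer` plays the role of `find++ --video trailer`.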

~~~
e12e
Right. It's not entirely straightforward to link up the MIME database (e.g.
via file) and generate filters for use by find. Basing filters off of
filenames isn't a very good idea -- and actually a little regressive in my
opinion -- after all, project/bin/foo (executable) might be a Python or Perl or
whatever script -- not just a binary file.

But first getting all files via find, then testing with file, and finally
matching against mime-type doesn't sound like something that's going to be as
fast as possible...

I tried to see if maybe gvfs (gio - gnome io) could help, but couldn't really
find anything directly applicable (although there is a set of gvfs command
line tools, like gvfs-ls, gvfs-info, gvfs-mime).

~~~
petdance
> after all project/bin/foo (executable) might be a python or perl or whatever
> script -- not just a binary file.

One of the big features of ack that the find/grep combo can't replicate
is checking the shebang of the file to detect its type. In ack's case, Perl and
shell programs are detected both by extension:

    
    
      --type-add=perl:ext:pl,pm,pod,t,psgi
      --type-add=shell:ext:sh,bash,csh,tcsh,ksh,zsh,fish
    

And by checking the shebang:

    
    
      --type-add=perl:firstlinematch:/^#!.*\bperl/
      --type-add=shell:firstlinematch:/^#!.*\b(?:ba|t?c|k|z|fi)?sh\b/
    

Run `ack --dump` to see a list of all the definitions.
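To illustrate the first-line check outside of ack, here is a rough shell sketch. The function name `script_type` is made up, and the patterns are simplified POSIX ERE approximations of the two definitions above (no `\b` or `(?:...)`), not what ack itself runs:

```shell
# Hypothetical helper: print "perl", "shell", or "unknown" for a file,
# looking only at its first line, and only if that line is a shebang.
script_type() {
  first_line=$(head -n 1 "$1")
  case $first_line in
    '#!'*) ;;                       # has a shebang, keep going
    *) echo unknown; return ;;
  esac
  if printf '%s\n' "$first_line" | grep -Eq '(^|[^[:alnum:]_])perl'; then
    echo perl
  elif printf '%s\n' "$first_line" |
       grep -Eq '(^|[^[:alnum:]_])(ba|t?c|k|z|fi)?sh([^[:alnum:]_]|$)'; then
    echo shell
  else
    echo unknown
  fi
}
```

Running `script_type project/bin/foo` on an extensionless script is the case that an extension-only filter would miss.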

~~~
e12e
I'd prefer checking the magic numbers in general (or resource forks) -- and
list based on mime-types -- rather than just shebang/extension. I'm sure
there are frameworks ready for doing this -- both gnome and kde (among others)
have been working on this for a while. You need it to be able to display
(correct) file icons, for example. And once one goes down that route, it might
be beneficial to leverage one of the frameworks for file-search (from locate
db to something based on xapian or what-not) -- rather than find-style
traversal.

------
0x0
I used this for a while, but got bitten by the fact that, by default, it does
not search all file types. :-/

------
omegote
Been using ack-grep for years. In systems where it's not installed I just use
grep -Hnir

------
olalonde
Is there a code repository somewhere? Can't seem to find one...

~~~
theOnliest
[https://github.com/petdance/ack2](https://github.com/petdance/ack2)

I'll send Andy a PR about putting the Github link on the site somewhere.

~~~
petdance
[https://github.com/petdance/beyondgrep](https://github.com/petdance/beyondgrep)
is the repo for the site.

------
davexunit
I'll stick with grep, thanks. M-x rgrep in Emacs works great.

~~~
michaelsbradley
'M-x ag', is just a package install away:

[https://github.com/Wilfred/ag.el](https://github.com/Wilfred/ag.el)

------
NicoJuicy
This found its way to HN a long, long time ago:

[https://news.ycombinator.com/item?id=975511](https://news.ycombinator.com/item?id=975511)

------
username42
Why did they choose the same name as minix compiler ack
([http://tack.sourceforge.net/](http://tack.sourceforge.net/)) ?

------
anotherevan
And just to warp things even further, use ack to find the files you're after,
then run through grep:

    
    
        ack -f --css | parallel grep search_term

------
fdsary
I thought we were done with this already?
[https://github.com/ggreer/the_silver_searcher](https://github.com/ggreer/the_silver_searcher)

------
Antwan
Seen [http://beyondgrep.com/security/](http://beyondgrep.com/security/)

Didn't read further.

