You can do a lot of things with `grep -E`, fwiw - there's not much here to really sell ack.
Things that do sell ack, for me:
ack css_class --sass # search .sass and .scss
ack some_method --no-flash # ignore .as and .mxml
# ignore compiled css in every Rails project on
# my system (as long as I `ack` from the root)
--ignore-dir=public/stylesheets/compiled
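ack also reads defaults from an ackrc file (one option per line; `~/.ackrc` is the usual location, and newer versions look for a project-level `.ackrc` too), so that ignore rule can be set once — a sketch using this commenter's path:

```
# ~/.ackrc -- one command-line option per line
--ignore-dir=public/stylesheets/compiled
```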
And the fact that it prints out like this:
path/to/file.ext
123: some text matching
234: more text matching
path/to/other/file.ext
480: a match
instead of like this (with `-n`):
path/to/file.ext:123: a match
path/to/other/file.ext:567: another match
path/to/that/file/you/didnt/know/you_had.ext:32: yet another match
makes it massively more useful for human-viewing of the results than the normal behavior of grep. And it reverts to grep-like output when you pipe it into something, so you can go from exploration to composition with no effort.
Entirely agreed. Other users' points about the tone of the article ring true to me as well, and they are only hurting people's impression of the tool, which is unfortunate; I use both ack and grep (my regex comfort level is not extremely high, so grep -v is still a common fallback).
For the curious: It's "ack-grep" in Ubuntu's package manager (and presumably Debian, though I can't say for sure); I stick it on every machine/server I set up just to have it handy. Queries to the effect of ack-grep --python ClassName yield fast, readable, extremely useful output, as you mention. That's why I use it in addition to grep.
Because with the double grep method, a match will be made whether "silver" is before or after "needle"; while with the single ack command shown above, a match will only be made when "silver" comes before "needle".
That assumes that "silver" will appear before "needle", which may not always be the case. `grep needle file | grep silver` gets you lines containing both "silver" and "needle" but not necessarily in that order.
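A toy haystack (file name illustrative) makes the difference concrete:

```shell
# Both orderings of the two terms, one per line
printf 'one silver needle\nneedle made of silver\n' > haystack.txt

grep needle haystack.txt | grep silver    # matches both lines
grep 'silver.*needle' haystack.txt        # matches only the first line
```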
While ack is a great tool, I don't think the author pointed out its strengths in this article. From my perspective, the strengths are using it recursively and its ability to 'recognise' files containing source code (and yes, I know that grep has a recursive option - it's more innate, though, in ack).
Agreed. He says he can't remember the syntax for such and such in grep, but the regexen he follows up with seem complicated enough to me.
Now, there's stuff that never sticks in my brain (tests in shell, sigh). But generally there's less syntax, and therefore less to remember, in a chain of greps. A composition of simple pieces is easier to understand than one equivalent, and therefore more complex, piece. Heck, the power of the shell is predicated on this idea.
Perhaps the best part about ack is that it's simple to restrict your search of files to a given pattern with a command line flag rather than using shell globbing. You could wrap invocations of grep with a shell function or another script, but that's still not great.
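As a sketch of the difference (directory and file names made up; the grep flags are GNU grep's):

```shell
# With ack, a built-in type flag does the filtering:
#   ack --python ClassName
# With GNU grep, recursion plus a glob filter approximates it:
mkdir -p demo
printf 'class ClassName: pass\n' > demo/a.py
printf 'ClassName mentioned in prose\n' > demo/notes.txt

grep -rn --include='*.py' ClassName demo   # only demo/a.py matches
```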
But that too seems like a demonstration of something. The more "simple" methods with obscure names that populate the Unix toolbox, the more confusing it gets. I've gone from find to locate recently, for example, but their functionality overlaps, so when I go back to find, I'm rusty with it.
A little feedback on voice. When someone says "You should stop using them. Now.", I expect the article to be about some system killing security problem, not an argument for the elegance of one tool over another. And if the argument is going to be about elegance, it better be absolutely compelling. The benefit of piped grep expressions is that you don't have to know anything beyond the principles of Unix to intuit their usage. For many uses (most uses for me), no thought is required -- grep fades into the background and becomes part of the programmatic brain stem. Commanding the reader to no longer use it is as effective as telling them to stop breathing.
>The primary virtue of these commands is that they use the Perl regular expression engine.
You mean the engine that lets you write pathological regular expressions[1] and accidentally ReDoS[2] yourself? To be fair, it's fine if you understand how the engine works well enough to avoid these cases. But how many people can actually say this?
I was looking into breaking a Perl IRC bot the other day and couldn't get any of the examples to work (that is, take more than a split second to execute). Does perl now detect these pathological cases and work around them or was I just not trying the examples correctly?
I believe I read somewhere that recent versions of Perl detect certain obviously pathological cases and abort them, but I don't know if they fail silently or display an error. Whether a particular needle is pathological depends on the haystack, too, so it could just be that there was a mismatch between the two in your case.
Well, with a simple haystack like the one used in the example, there really would be no reason not to grep for "silver needle" in the first place. So it's really not the best or most realistic example of the usefulness of the double grep method.
When I use double grep in real life, I often tend to do so on a relatively large haystack, where I don't necessarily know what the second search term will be. In that situation, I'll usually do the first grep, look through its output, and add on the second grep once I see something in the first grep's output that I want to narrow the results down to.
Of course, instead of adding on a second grep, I could modify the original regex (and sometimes I do); but if the original regex is complicated, then modifying it is error prone. And, anyway, using a shell abbreviation, it's very easy to type " G " and have that expand to " | grep " to simply add on another grep, without touching the first regex.
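The abbreviation mechanism isn't named here; in zsh, a global alias is one way to get exactly that expansion — a sketch, assuming zsh:

```shell
# ~/.zshrc -- "global" aliases (-g) expand anywhere on the command line,
# not just in command position
alias -g G='| grep'

# typed:    grep needle haystack G silver
# expands:  grep needle haystack | grep silver
```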
A second, quite common use case for a double grep is when I want the second search term to match whether it's before or after the first term. There's probably some convoluted way to get the same effect using a single regex, but it probably won't be nearly as easy or intuitive as a double grep.
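For reference, the single-regex version is an alternation of the two orderings — workable, but it grows with every extra term, which is why the double grep stays attractive:

```shell
printf 'one silver needle\nneedle made of silver\nplain needle\n' > haystack.txt

# either order in one ERE: spell out both orderings
grep -E 'silver.*needle|needle.*silver' haystack.txt   # first two lines match
```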
That last one could be tricky if there are other types of needles with names like "ead needle" or "mead needle". But using the haystack he gives us, BRE can do the job easily.
Perl regexes may be easy to use, but they are inferior from a performance perspective. As someone else said, they're slower than BRE or ERE. Moreover, even if speed is not an issue, you pay a price in the amount of memory you will need compared with line-based utilities like sed and awk.
Find the needles
sed '/needle/!d;/needle/q' haystack
Find the silver needles
sed '/silver needle/!d;/silver needle/q' haystack
Find all needles except lead ones
sed '/lead needle/d;/needle/!d;/needle/q' haystack
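Checked against a toy haystack (note the trailing `q`: each one-liner prints only the first surviving match and then quits):

```shell
printf 'hay\nlead needle\nsilver needle\nmore hay\n' > haystack.txt

# drop lead needles, drop non-needle lines, print-and-quit at the first hit
sed '/lead needle/d;/needle/!d;/needle/q' haystack.txt   # -> silver needle
```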
My preference is to use (f)lex if I want a fast "parser" (scanner). Its regex is more than adequate.
That'd be my other concern. ack is perl, as far as I can tell. I have no idea how perl performs at these tasks. But grep is written in C, and there are fun examples of how exactly it gets to be so fast (http://ridiculousfish.com/blog/posts/old-age-and-treachery.h... comes to mind).
Is anyone else turned off by the tone of the blog post and the attitude of the poster? There's gotta be a better way to showcase the usefulness of a tool.
I'm more turned off by hard-to-read regular expressions, especially ones that look like they may break depending on what terminal emulator I'm using, how quotes are escaped, etc. The "you've been doing it wrong for years" tone I could do without, but see people using it with good enough intentions so frequently that I'm no longer bothered by it.
Also, `ack` is not installed by default, which is reason enough to not get too used to it. Some people will say "optimize for being on your own machine, since you are 99% of the time", but I'm not. Installing additional utilities on multiple production servers is annoying enough, and can actually become problematic in a PCI-compliant environment as mine is. I'm also frequently helping out other members of my team, and having a magic one-liner that often results in "-bash: ack: command not found" is not terribly useful to me. YMMV.
The first one gets you five lines around all of the scopes without lambdas. The second one gets you five lines around all the scopes (including those with lambdas) and then omits all of the lines with lambdas, some of which will have caused a match in the first grep, and some of which may be part of the context of matches in the first grep. You will get both contexts with no match in the middle and matches with incomplete contexts.
Is it worth learning grep or ack or a similar tool?
When I need to do these sort of tasks, I do them in a scripting language with some combination of split() and regex instead of using command line tools. But, I'm just doing that because it's what I know.
Would I end up saving a significant amount of time if I learned to use grep instead?
In my opinion, yes, it is. grep is fast and versatile.
Shell tools in general are relatively simple or at least specialized, and they're built to be composed in novel/useful ways. The interface between all of these is text, aka data, aka what is arguably the simplest interface.
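A quick illustration of that composition, with made-up input — plain text flows through every pipe:

```shell
printf 'apple\nbanana\napple\ncherry\n' > fruits.txt

# filter out one term, then count the distinct survivors
grep -v banana fruits.txt | sort | uniq -c | sort -rn
```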
I'd say start with ack for its ease of use, but knowing grep is important. I prefer ack, but most of the boxes I work on don't have ack installed (and won't).
Wasn't mentioned in the article but you can install ack (on OSX) through the homebrew package manager with: `brew install ack`. Check out the docs at http://betterthangrep.com/
grep has the great strength of near-ubiquitous installation. If you find yourself on a Unix-like system, it will have grep. Perhaps if you only ever really work on a very small set of systems, then this doesn't matter, but for those of us who frequently need to work on different systems and don't always have the ability to install new packages, grep does a fine job. In fact, it has a lot in common with vi here: you can (nearly) always count on having it available.