Why GNU grep is fast (2010) (freebsd.org)
495 points by kurren on Nov 28, 2013 | 129 comments



In case you're interested to see what older HN contributors had to say about this, here are some of the previous submissions, all of which have significant discussion:

https://news.ycombinator.com/item?id=1626305, 1193 days ago, 115 comments

https://news.ycombinator.com/item?id=2393587, 972 days ago, 68 comments

https://news.ycombinator.com/item?id=2860759, 842 days ago, 43 comments

( ionelm also pointed out a previous submission: https://news.ycombinator.com/item?id=6814153 )

Also worth mentioning is "The Treacherous Optimization" ( http://ridiculousfish.com/blog/posts/old-age-and-treachery.h... ), although previous submissions of that provoked no discussion at all.

https://news.ycombinator.com/item?id=1624402

https://news.ycombinator.com/item?id=5257874

ADDED IN EDIT: More rigorous searching has turned up substantial discussion of the Treacherous Optimization: https://news.ycombinator.com/item?id=1627367


It's great that you posted this, but it would be even better if HN were capable of doing it automatically, since it's mostly trivial but provides a lot of value.


In the past, there were semi-automated accounts that did this, but they made more people upset than happy.

I would also prefer to see this happen on every story.


hey hey hey, if you want Hacker News to stay fast you shouldn't make it do too much!


The systems I wrote were independent of HN and didn't slow it down. I got significant grief for them and pulled them.

Standard "wisdom" for start-ups includes "If you're not embarrassed by your first product then you didn't launch early enough," and "Try the minimal viable product before investing too much time." I both launched early, and made sure the "product" was absolutely minimal. It got slated by the people it was trying to help, so I didn't bother refining it.

The experience was educational and instructive.

(minor edit to remove some inappropriate snark - apologies)


I wouldn't be surprised if that was because people misinterpreted it as complaining about reposting, which is a contentious topic on HN.


The complainers were probably just a vocal minority.


Stay? When was HN ever fast? It's a tech demo for a language that's been more-or-less abandoned, that accidentally grew into a popular website.


See also:

http://swtch.com/~rsc/regexp/regexp1.html

http://swtch.com/~rsc/regexp/regexp2.html

http://swtch.com/~rsc/regexp/regexp3.html

http://en.wikipedia.org/wiki/Thompson%27s_construction_algor...

I implemented a sort of micro-grep in the past using Thompson's algorithm. It was a great exercise in practical applications of heavily theoretical CS stuff: recursive-descent parsing of the regular expression, then using Thompson's construction to create an NFA out of the parse tree, then a simulation of the NFA with a DFA using two stacks. I used the Dragon Book for reference on this; all pretty awesome stuff.
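For anyone curious what that simulation looks like in practice, here is a minimal C sketch (not the code from the exercise above): it hardcodes the NFA for (a|b)*abb and advances a bit-set of active states one character at a time, which is the same idea as the Dragon Book's two-list simulation, just with a bitmask instead of stacks.

    /* Minimal sketch: simulate a hand-built NFA for (a|b)*abb by tracking a
       bit-set of active states.  States: 0 = start, 1 = saw "a", 2 = saw "ab",
       3 = saw "abb" (accepting).  Illustration only. */
    #include <stdio.h>

    #define NSTATES 4
    #define ACCEPT  (1u << 3)

    /* trans[s][0] = states reachable from s on 'a', trans[s][1] = on 'b' */
    static const unsigned trans[NSTATES][2] = {
        { (1u << 0) | (1u << 1), 1u << 0 },  /* state 0: loop, 'a' may start "abb" */
        { 0,                     1u << 2 },  /* state 1: need "bb"                 */
        { 0,                     1u << 3 },  /* state 2: need "b"                  */
        { 0,                     0       },  /* state 3: accepting, dead end       */
    };

    static int match(const char *s)
    {
        unsigned active = 1u << 0;                 /* start in state 0 */
        for (; *s; s++) {
            int col = (*s == 'a') ? 0 : (*s == 'b') ? 1 : -1;
            if (col < 0)
                return 0;                          /* character outside alphabet */
            unsigned next = 0;
            for (int i = 0; i < NSTATES; i++)      /* advance every active state */
                if (active & (1u << i))
                    next |= trans[i][col];
            active = next;
        }
        return (active & ACCEPT) != 0;
    }

    int main(void)
    {
        printf("%d %d %d\n", match("abb"), match("babb"), match("abab"));  /* 1 1 0 */
        return 0;
    }

The real thing adds epsilon transitions and builds the transition table from the parse tree, but the per-character state-set step is the same.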


The author of The Silver Searcher[1] has some blog posts about his efforts at optimizing his own grep-like tool.

There's nothing wrong with good ole grep, but on boxen I spend any significant time on, I always install ag. The integration with Emacs provided by ag.el[2] is awesome too!

[1] https://github.com/ggreer/the_silver_searcher#how-is-it-so-f...

[2] https://github.com/Wilfred/ag.el


I'm actually more familiar with the KMP algorithm than BM. So I looked it up to see what the difference was:

The classic Boyer-Moore algorithm suffers from the phenomenon that it tends not to work so efficiently on small alphabets like DNA.

The skip distance tends to stop growing with the pattern length because substrings re-occur frequently.

By remembering more of what has already been matched, one can get larger skips through the text.

One can even arrange 'perfect memory' and thus look at each character at most once, whereas the Boyer-Moore algorithm, while linear, may inspect a character from the text multiple times.

1: http://stackoverflow.com/questions/12656160/what-are-the-mai...
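To make the "skip distance" concrete, here is a small C sketch of the Boyer-Moore-Horspool variant (bad-character rule only; the names are mine, and this is not GNU grep's code). With a large alphabet most text characters keep the default skip of the full pattern length; with a 4-letter DNA alphabet nearly every character occurs in the pattern, so the skips collapse, which is the effect described in the quote.

    /* Horspool search: slide the pattern by a precomputed skip based on the
       last character of the current window.  Sketch for illustration only. */
    #include <stdio.h>
    #include <string.h>

    static const char *search(const char *text, size_t n,
                              const char *pat,  size_t m)
    {
        size_t skip[256];
        if (m == 0 || m > n)
            return NULL;

        /* Characters not in the pattern allow sliding the whole pattern... */
        for (size_t i = 0; i < 256; i++)
            skip[i] = m;
        /* ...characters in the pattern allow only a shorter slide. */
        for (size_t i = 0; i + 1 < m; i++)
            skip[(unsigned char)pat[i]] = m - 1 - i;

        for (size_t pos = 0; pos + m <= n;
             pos += skip[(unsigned char)text[pos + m - 1]])
            if (memcmp(text + pos, pat, m) == 0)
                return text + pos;
        return NULL;
    }

    int main(void)
    {
        const char *text = "the quick brown fox jumps over the lazy dog";
        const char *hit  = search(text, strlen(text), "lazy", 4);
        printf("%s\n", hit ? hit : "not found");   /* prints: lazy dog */
        return 0;
    }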


You would inspect characters multiple times only in a very naive implementation of Boyer-Moore. The Galil Rule[0] fixes this and is also required for proving linear worst-case runtime.

[0] https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_sea...


This page [1] nicely demonstrates differences between KMP and BM.

[1]: http://www.cs.utexas.edu/users/moore/best-ideas/string-searc...


I have a book called Flexible Pattern Matching in Strings[1], which covers all the algorithms that might go into a modern, state-of-the-art grep, with practical advice on performance (N.B.: Boyer-Moore is no longer state-of-the-art). The publication year is 2002.

[1] http://www.dcc.uchile.cl/~gnavarro/FPMbook/


Couldn't you speed up a small-alphabet search by considering pairs or triplets of characters, so you have N^2 or N^3 glyphs instead of N characters (so, going from 4 characters to 16 characters to 64 characters)?


I'm not sure exactly how your proposal might work, but, generally, the problem you would run into is that the group of characters can be shifted. So you'd end up having to compare every possible shift.


That's also a Forth way of looking at optimization.

Also reminds me of Kent Beck's quip when he was asked to optimize Chrysler's C3 system. He asked for validated sets of input and output. The programmers on site said the system wasn't producing correct results yet. His response: In that case, I can make this real fast!


I recall nearly the same story about Jerry Weinberg:

Engineers: Oh your system can only do a thousand cards a minute? Ours can do ten thousand.

Weinberg: Yeah but your system doesn't work! If mine doesn't have to work I can do a million cards a minute.

I thought I got it from one of Weinberg's books.


[deleted]


I think Kent was being sarcastic - his point is that it's rather pointless to optimize until the system is producing correct results. He's also famous for the "Make it work, make it right, make it fast" quote.


Are you sure he's the source of the quote?

A quick scholar.google.com search found a match in Byte magazine from 1983:

> Furthermore, many C environments contain measurement tools that enable the programmer to identify these critical sections easily. But the strategy is definitely: first make it work, then make it right, and, finally, make it fast.

The Google URL is http://books.google.com/books?ei=dtOXUoCVHoegkAef9YGQBg&id=A... but that's not enough to tell who wrote it. The full text is at http://archive.org/stream/byte-magazine-1983-08/1983_08_BYTE... . Unfortunately, it's OCR'ed, with advertisements and articles intermixed.

I believe it was written by James Joyce.

In any case, it looks like Beck was a ~22 year old undergraduate at the University of Oregon when that was published.


Looks like I'm wrong about the attribution. It's in the article "The C Language and Models for Systems Programming" by Johnson and Kernighan.


'Compile, conform, perform'


Where is this from?


Conversely, Why GNU grep is slow (and how to make it not-slow): http://dtrace.org/blogs/brendan/2011/12/08/2000x-performance...



By this logic you can write really fast programs in Haskell, because it avoids computing unused values (and therefore complete calltrees). (Unfortunately, the management of such calltrees -- thunks -- often has higher cost than outright computing them in the first place.)

I try hard, but I'm not smart enough to write programs that do nothing. :)


You can do that in any programming language by, well, not computing unused values.


That's like saying anyone can make money by just making money. It's a null statement!


In the context of the comment I replied to, no, it's not.

Haskell gives you some constructs that make this easier, but there exists no program whose Haskell implementation will run faster because of this, only programs that are easier to make fast.


The big advantage lazy evaluation has is composability. In an eager language, you have to explicitly code operations in a lazy fashion, which means that if you use a library, it will probably be eager. With Haskell you can be pretty sure that the runtime will evaluate only as much of the program as it needs to.

That said, the downside is that it then becomes very difficult to reason about the performance characteristics of your library, because its performance depends on how it's used. Simon Peyton-Jones is on record as saying that laziness is probably the wrong default for a language - the big benefit for Haskell was that it "kept them honest" wrt purity, but in a production system you probably want strictness.


The term you're looking for is tautology, not 'null statement'.


You can. A lazy algorithm will do no more reductions than an eager one, and it will usually do fewer. You just have to specify strictness. The compiler is pretty conservative about adding strictness because it has to preserve behaviour. But if you know something will be computed, e.g., a loop accumulator, then you can just tell the compiler so.


The inverse is this: what is modern software doing that makes it so slow?


Figuratively speaking, a lot of modern software is busy peeling layers of various onions, going to places to check if it needs to go there, and writing and organizing detailed reports of what it's doing, every step along the way. In other words, bureaucracy and logistics.


> GNU grep also tried very hard to set things up so that the kernel could ALSO avoid handling every byte of the input, by using mmap() instead of read() for file input. At the time, using read() caused most Unix versions to do extra copying.

Good luck pulling this off in Chrome.
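For reference, a rough C sketch of the mmap() approach quoted above, assuming Linux/glibc (memmem is a GNU extension and stands in for grep's real search loop; error handling is minimal and this is not grep's actual code):

    #define _GNU_SOURCE            /* for memmem() on glibc */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc != 3) {
            fprintf(stderr, "usage: %s pattern file\n", argv[0]);
            return 2;
        }

        int fd = open(argv[2], O_RDONLY);
        struct stat st;
        if (fd < 0 || fstat(fd, &st) < 0) { perror(argv[2]); return 2; }
        if (st.st_size == 0) return 1;             /* empty file: no match */

        /* Map the file: the search walks the kernel's page cache directly,
           with no extra copy into a user-space buffer as read() would need. */
        char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (data == MAP_FAILED) { perror("mmap"); return 2; }

        int found = memmem(data, st.st_size, argv[1], strlen(argv[1])) != NULL;

        munmap(data, st.st_size);
        close(fd);
        return found ? 0 : 1;                      /* grep-style exit status */
    }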


With nacl, you could. But I think you're talking about js, and yeah, you are right.


Development time. Why invest time in such optimizations instead of other features when the app is fast enough? Time is not infinite and we need to prioritize what takes our development time.


I'd say time spent optimizing grep (GNU or otherwise) is always well spent. Saving a clock cycle or two per execution in my work's enterprise would translate into needing to lease less hardware, based just on how much grep is used. I imagine the same is true of all large Unix-y enterprises.


Portability and maintainability, if you want to boil it down to the core. The speed of gnu grep probably isn't portable since it relies on the kernel to do certain things in a cooperative manner, and it certainly won't be easy to maintain while staying fast.

I see the same in some of our software. One of our senior developers is rolling out a number of very, very fast collections based on compare-and-swap operations, but you always end up thinking an hour or two about five lines of code. Most problem domains don't need this kind of performance, so most software teams don't pay the maintenance price, and I think they are correct about that judgement call.


Part of the performance is algorithmic. Mixing Boyer-Moore and regexes isn't elementary. However, the knowledge is available for anyone keen to apply it.


The fastest programs are the ones that don't prepare themselves to do something and then never do it. Another way of saying it: just do one thing well.


The fastest programs return 0 immediately and don't do shit.


Returning 0 is technically doing something. It might not be something that you find useful but it is something.


On unix you'll find two programs which do this exact thing: `false` and `true`. False returns 1, and true returns 0.

You might not believe that these are real programs but they are. You can find the binaries using `whereis`. For example on my linux install true is /bin/true.

The command line is pretty awesome.


And here is the actual source code for them:

http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob_p...

The /bin/true source is a bit longer than expected, but the /bin/false implementation is really interesting: http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob_p...


That is both hilarious and brilliant. The bin/false source code made me laugh out loud.


true.c could be useful as a kind of helloworld -- a minimal example of the conventions followed by gnu command line programs.


GNU Hello exists for this purpose.


> Returning 0 is technically doing something.

Yeah, but try telling your boss that :)


Yep, boss will turn right around and "do" paying your salary


I'd imagine the fastest programs don't always return 0, since that's an extra MOV instruction.


    $ touch foo
    $ chmod +x foo
    $ ./foo
    $ echo $?
    0


For me, the trivial C program appears to run faster than the empty file:

    $ touch empty
    $ chmod +x empty
    $ time ./empty

    real    0m0.002s
    user    0m0.000s
    sys     0m0.000s

    $ echo "int main(){return 0;}" > trivial.c
    $ gcc trivial.c -o trivial
    $ time ./trivial

    real    0m0.001s
    user    0m0.000s
    sys     0m0.000s
Timing results are consistent over several repetitions (provided everything's in cache from disk). Linux x86_64. A `mov` takes somewhere between a millionth and a ten-thousandth of a millisecond ( https://gist.github.com/jboner/2841832 ), so I can't find out this way whether removing 'return 0' changes anything.

(If I use my default zsh shell to execute ./empty, it gives me

    zsh: exec format error: ./empty
    ./empty  0.00s user 0.00s system 0% cpu 0.008 total
So I used bash for this.)


So here's what happens with empty. If you run it from the shell, first the shell will fork, then try to exec empty. But the exec fails, since empty doesn't begin with a magic value. Therefore the exec call returns an error. Now the shell picks up this error, and then tries to run the program again, this time as an argument to an invocation of the shell (i.e., it does an exec of /bin/bash, passing it "empty" as a parameter). This is why empty ends up taking longer to run.

This is the normal pattern, just in case you forget to put "#!/bin/bash" at the top of the script, so that the script can be run anyway. This is also a source of confusion for some sysadmins, when a script works from the command line but not from something like a cron script.
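A small C sketch of the fallback described above (roughly the pattern shells follow, not bash's actual source): try execve() directly, and if the kernel reports ENOEXEC because the file has no #! line or ELF magic, re-run the file as an argument to the shell.

    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    extern char **environ;

    int main(int argc, char **argv)
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s program\n", argv[0]);
            return 2;
        }

        char *args[] = { argv[1], NULL };
        execve(argv[1], args, environ);            /* only returns on failure */

        if (errno == ENOEXEC) {
            /* No recognizable header: hand the file to the shell instead. */
            char *shargs[] = { "/bin/sh", argv[1], NULL };
            execve("/bin/sh", shargs, environ);
        }
        perror(argv[1]);
        return 127;
    }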


Before the system can execute any one of these programs, it has to fork the bash process, call exec to load the new binary into the process memory, load the standard library and map it into the address space, open stdin/stdout, run the program, close stdin/stdout, wait for the process termination...

Compared to all this, a mov instruction in userland won't really make a difference.

Edit: List of system calls required to execute trivial.c (on my Linux):

execve, brk, access, mmap, access, open, open, open, open, open, stat, open, stat, open, stat, open, stat, open, fstat, mmap, close, access, open, read, fstat, mmap, mprotect, mmap, mmap, close, mmap, mmap, mmap, arch_prctl, mprotect, mprotect, munmap, exit_group


I get the same results:

    $ touch empty

    $ chmod +x empty 

    $ time ./empty 

        real    0m0.005s
        user    0m0.001s
        sys     0m0.001s

    $ echo "int main(){return 0;}" > trivial.c

    $ gcc trivial.c -o trivial

    $ time ./trivial

        real    0m0.002s
        user    0m0.001s
        sys     0m0.002s


Isn't that just a default in the bash implementation?


Technically the fastest non-portable program returns whatever was in eax upon call if I remember right.


Well, not really, because "do one thing well" leads to shell scripting, where you join together dozens of separate tiny programs. Shell scripting is not exactly fast, nor is it robust.


Please add [2010] to the title


Why? Because it doesn't apply anymore?


Because it can help people realize they might have read this before (I had), which in turn can affect how one prioritizes which articles to read.

Also because "it's how it's done around here": the site is about news so it makes sense to at least tag possible re-posts to flag them as "not quite news yet still interesting and recommended reading".


Yes, because avoiding lots of LOC by relying on tons of external libraries is the new 'doing practically nothing'. /s


The article references an article published in 1991. Does that need stating too?


If this story was just a blogspam saying "Hey look I found this article from 1991 which is still relevant" then yes, because the story would be worth nothing by itself.

But this story is actually an email (from 2010) written by the guy WHO ORIGINALLY WROTE GREP, making it a pretty damn interesting source in its own right. In that sense, the 1991 article is very much an aside.

In conclusion, no.


> by the guy WHO ORIGINALLY WROTE GREP

Just a pedantic remark: He (Mike Haertel) originally wrote GNU grep. Grep, on the other hand, was originally written by Ken Thompson.


Good point, well made :)


Does this still work in Unicode? It seems like it'd cause the Boyer-Moore lookup tables to blow up in size.


You can use Unicode code points for the shift table(s -- the Horspool variant has only one table), but make them sparse, so that the tables only contain characters that exist in the pattern. Hash tables make for easy lookups, but with a bit blob of a few hundred kB you can also use a bitmask.

Of course, you can also just reduce the Unicode pattern to bytes, so your alphabet is never larger than 256. This will run slower, but not as much as you'd think: Boyer-Moore does benefit from larger alphabets, but only to the extent that the alphabet is actually used.
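A tiny C sketch of the sparse-table idea (the names and the sample pattern are mine, and a linear scan of a small array stands in for the hash table mentioned above): only code points that occur in the pattern get an entry, everything else falls back to the default shift of the whole pattern length.

    #include <stdio.h>
    #include <stddef.h>

    struct sparse_shift {
        unsigned cp;      /* Unicode code point occurring in the pattern */
        size_t   shift;   /* Horspool-style shift for that code point    */
    };

    /* Default shift is the pattern length m; pattern code points override it. */
    static size_t shift_for(unsigned cp, const struct sparse_shift *tab,
                            size_t ntab, size_t m)
    {
        for (size_t i = 0; i < ntab; i++)
            if (tab[i].cp == cp)
                return tab[i].shift;
        return m;
    }

    int main(void)
    {
        /* Hypothetical 3-code-point pattern: U+3053 U+3093 U+306B */
        const size_t m = 3;
        const struct sparse_shift tab[] = {
            { 0x3053, m - 1 - 0 },   /* first pattern position: shift 2  */
            { 0x3093, m - 1 - 1 },   /* second pattern position: shift 1 */
            /* last position deliberately gets no entry (Horspool rule)  */
        };

        printf("%zu %zu %zu\n",
               shift_for(0x3053, tab, 2, m),    /* 2 */
               shift_for(0x3093, tab, 2, m),    /* 1 */
               shift_for(0x0041, tab, 2, m));   /* 3: 'A' is not in the pattern */
        return 0;
    }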


> Of course, you can also just reduce the Unicode pattern to bytes

Ah, right. I was under the impression this was unsafe, since you could end up with spurious byte matches that are not on character boundaries. But it seems the keyword is "self synchronizing", and UTF-8 (but not UTF-16) is safe to do byte-oriented searching on.


Link to a no-registration-required PDF version of "Fast String Searching" mentioned in that post

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.13....


Original title is better. I'm gonna call the cops.


In case it helps anyone else, to install GNU grep on OSX using homebrew I did

  brew tap homebrew/dupes
  brew install --default-names homebrew/dupes/grep
following http://www.heystephenwood.com/2013/09/install-gnu-grep-on-ma...


Be careful with:

  --default-names do not prepend 'g' to the binary
On OS X (a BSD derivative with BSD grep), --default-names will put GNU grep in your path. A lot of things in OS X depend on BSD sed/awk/grep. Omitting --default-names instead links the GNU variants as ggrep/gsed/gawk.


Quote of the day:

  The key to making programs fast is to make them do practically nothing. 
         -- Mike Haertel


I know all this, but it still feels like magic when you run it against a fat log file and it's done in a couple of seconds, while awk and tail were previously struggling with it for like 15 minutes...


That is similar to how I explain to some students how to have a fast database system: the key to a fast query is one that does little, and the way to make it do little is to have the data laid out exactly as it is needed.


This applies very well to optimization, too. There are only really two ways to optimize code:

1) make it do less

2) make it do more at a time.

The first corresponds to using more efficient algorithms and data structures. The second is parallelism.


I've found that taking a considerable amount of time (days even) to determine the very best solution to a problem always results in less effort and less time spent coding overall. It's also no coincidence that the code becomes incredibly simple, more readable, more efficient, more flexible, easier to build on top of, and fewer issues down the road. Plus, in the end, depending on the magnitude of the project, less time is spent on overall development.

It is never a good idea to rush any large coding project. By rush, I mean just sit down and start churning out code just so you can have something tangible within a day or two. And by large, I mean anything requiring at least a few thousand LOC. Anything less can be okay to rush though; i.e., small projects where maintainability and scalability aren't as important; e.g., some MVP to test some market.


I wholly agree that it's worth investing in finding the best solution. However, in my experience I found that having concrete code to work with is invaluable. The magic bullet for me is to slap something concrete together, then aggressively refactor / cut the crap out of it. In some sense, building software is like sculpture. You've got this amorphous block of half baked ideas, and you need to turn it somehow into a svelte shape.

* We end up finding more crap to remove than we'd thought at the start.

* We can cut down misguided arguments from wannabe architects of "what if we'll need x" by saying "we'll add this back when someone asks for x". It's much easier to point out that x is currently pointless with concrete code than at the demiurge phase where everything is possible and timelines are ignored.

* We can chop at the problem as a team, without having a long sequential and solitary step of "designing the best system". Amdahl's law applies to dev teams as well.


I call that "sketching in code." I've gotten in some trouble in interviews for "starting to write code right away," so it's a habit I'm having to fight at the moment, but I find having something in code (that I'm fully willing to throw away) really helps.

It's like Fred Brooks said:

    ...plan to throw one away; you will, anyhow.


I agree. It helps to have something tangible and to refactor later. I think the main issue that some people (myself included of course) have with that is refactoring too late. You never want to build something big on top of sketchy code. I've done this a few times myself to try to meet a deadline, and it's really bit me in the ass. We're talking weeks maybe months down the drain.

"Sketching in code" is a great analogy. Similar to how you'd trim off the rough edges in a sketched drawing, I've found myself cutting a lot of cruft when refactoring.


> Make it do less.

Yes, but less in a particular way. It's important to skip processing stuff that can't matter in the end. In this example, skipping over bytes that can't matter because the end doesn't match. In DBMS, it's important to skip over records/columns/values that can't contribute to the answer.

The surprising realization in optimization is that procrastination is a virtue. Lazy beats eager.


There is a third one, at low level: make it do something else.

For example, programmer requests a multiplication, compiler generates a shift; programmer requests a division, compiler generates a multiplication; programmer specifies a switch, compiler generates a jump table.
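A few C one-liners for those substitutions; what actually gets emitted depends on the compiler, target, and optimization level, so this is only a sketch worth verifying with -O2 and a disassembler:

    #include <stdint.h>
    #include <stdio.h>

    /* typically compiled to a shift: x << 3 */
    static uint32_t times8(uint32_t x) { return x * 8; }

    /* typically compiled to a multiply by a "magic" reciprocal constant */
    static uint32_t div10(uint32_t x) { return x / 10; }

    /* a dense switch is often compiled to an indexed jump table */
    static int classify(int c)
    {
        switch (c) {
        case 0: return 10;
        case 1: return 20;
        case 2: return 30;
        case 3: return 40;
        default: return -1;
        }
    }

    int main(void)
    {
        printf("%u %u %d\n", times8(5), div10(42), classify(2));  /* 40 4 30 */
        return 0;
    }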


I'd argue that's just a variation of doing less, in the same way that e.g. vectorization is a variation of parallelism.


The same concept applies to transportation. How to get around faster: take shorter routes, and take all the stuff you're going to need on the first trip.


> The second is parallelism.

Or vectorization, or avoiding stalling, or filling all execution units, or...


I wonder how much these tricks matter nowadays. Even if we ignore the time to read from disk, for simple algorithms that scan data linearly, it is often the case that the CPU is spending most of its time waiting for the data to be read from RAM.

Also, in modern x86 processors, an entire cache line (64 bytes) is read whenever a single byte is needed. So for strings smaller than 64, even algorithms that don't check every byte will read every byte from disk to RAM and from RAM to the processor cache.


> Even if we ignore the time to read from disk, for simple algorithms that scan data linearly, it is often the case that the CPU is spending most of its time waiting for the data to be read from RAM.

Nope. Even DDR-3 can perform sequential transfers at speeds well exceeding one byte per CPU clock cycle.

And yes, you can ignore the time to read from disk. Think a SAN over a 10-gig link. That gets you close to byte-per-cycle territory as well. It takes on the order of a dozen cycles (more or less depending on architecture and algorithm) to perform one "step" of a search. So yes, these algorithms very much matter.

> Also, in modern x86 processors, an entire cache line (64 bytes) is read whenever a single byte is needed. So for strings smaller than 64, even algorithms that don't check every byte will read every byte from disk to RAM and from RAM to the processor cache.

Yep. But this is not necessarily the bottleneck; see my above comments about the CPU.


The CPU can also execute instructions exceeding one per clock cycle. For example instructions such as MOV, CMP and even conditional jumps run at a typical throughput of 3 instructions per clock cycle on the intel Core line of CPUs. (source: http://www.agner.org/optimize/instruction_tables.pdf)

This means that the simplest handcrafted loop that loads a byte, compares its value to the first byte of the searched string and then does a conditional jump should (if unrolled) take about 1 clock cycle per byte examined!

This means that if our DDR3 memory can read 6GB of data per second and our CPU core is clocked at 3GHz, this completely naive algorithm will run at half of the theoretical maximum speed.

Using XMM instructions (which work on 16 bytes at a time) should probably get us to the limit of the RAM speed.

Regarding SAN-over-10-gig link. I don't know about you, but the computer I'm typing this on has an SSD that can read only a few hundred megabytes per second.


SANs typically have dozens of drives configured as a RAID array to achieve 10 Gbps performance.
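A sketch of the two loops discussed a few comments up: a byte-at-a-time scan for a single character versus an SSE2 version that examines 16 bytes per iteration. This assumes x86-64 with GCC or Clang (for __builtin_ctz), and it is only a first-byte scan, not a full search:

    #include <emmintrin.h>   /* SSE2 intrinsics */
    #include <stddef.h>
    #include <stdio.h>
    #include <string.h>

    /* Naive loop: load a byte, compare, conditionally branch. */
    static const char *scan_byte(const char *p, size_t n, char c)
    {
        for (size_t i = 0; i < n; i++)
            if (p[i] == c)
                return p + i;
        return NULL;
    }

    /* SSE2 loop: compare 16 bytes at once, then pick the first hit from a mask. */
    static const char *scan_sse2(const char *p, size_t n, char c)
    {
        const __m128i needle = _mm_set1_epi8(c);
        size_t i = 0;
        for (; i + 16 <= n; i += 16) {
            __m128i chunk = _mm_loadu_si128((const __m128i *)(p + i));
            int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, needle));
            if (mask)
                return p + i + __builtin_ctz(mask);   /* lowest set bit = first hit */
        }
        return scan_byte(p + i, n - i, c);             /* handle the tail */
    }

    int main(void)
    {
        const char *text = "searching for the letter q in this string";
        printf("%s\n", scan_byte(text, strlen(text), 'q'));
        printf("%s\n", scan_sse2(text, strlen(text), 'q'));   /* both print: q in this string */
        return 0;
    }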


Corollary (from my fortunes file):

Deleted code is debugged code. - Jeff Sickel


The phrase should be attributed to Ray Ontko, I first heard it while working on database projects for him in the early nineties. -- Jeff Sickel


Impressive, but if the goal is for BSD to have a faster grep then clearly the solution is to use GNU grep instead of rolling their own redundant and inferior clone.


presumably GNU grep is GPL and incompatible with their BSD license


The GNU GPL is perfectly compatible with their BSD license... :]

It's just that the end result is then restricted by the GPL, and they don't want that.


That is what "not compatible" means.


You can release some software that contains both GNU GPL and BSD-licensed code, and it will be legal to distribute as long as you follow the rules of both licenses; as the rules of these licenses do not conflict, there's no legal problem.

This is what most people mean when they say two FOSS licenses are "compatible."


Is grep part of the BSD base system?


In other words the correct choice of algorithm and data structure can dramatically simplify a problem and the amount of code and time needed to solve it.

It also means that with tools that let you quickly apply different techniques, ahem, composable functions, you can search for more efficient solutions with a lot less effort. That doesn't solve the smartness problem, but it makes it a lot more tractable.


This is not just because of the algorithms and data structures, though; it's also very well implemented.


>It also means that having tools where you can quickly apply different techniques, ahem, composable functions

The key word there is composable, and not functions.


The points of discarding data you don't need, not looking at stuff you don't need to look at, and minimizing the processing of the stuff you do look at are also highly valid.

I've seen 90% performance boosts in code where access to low-level data formats really wasn't possible. Though at times, using appropriate data structures / methods also helped markedly.


I don't know who to attribute the quote to, but there is one that goes something along the lines of:

"The fastest method to execute is an empty method."


> "The fastest method to execute is an empty method."

The fastest method to execute is an empty method that was never called.

The fastest method to execute is an empty method that was never called and never written.

The fastest method to execute is an empty method that was never called and never written and never planned.


This is why I say I'm at my most efficient when I just don't do any work at all.


Funny, but I think you finally went too far! Speed is one thing, and not executing a function sure can happen fast, but efficiency is productivity per unit time, and if there is no productivity, it can't be "most efficient". (Yes, I'm a pedant sometimes.)


Wally? Is that you?


One of the things I like best about Lean-derived software development methods is that they maximize the amount of work not done.


In case anyone is interested in checking it out, you can download the source here: ftp://mirrors.kernel.org/gnu/grep/



You will need some gnu common libraries. I tried building it from scratch recently, as a first step for implementing an extension idea, but didn't succeed.

(If anyone is interested, I want to add more operators, like intersection or difference of regular languages.)


The shell, combined with grep, already implements these intersection and difference operators. For intersection, all you have to do is pipe into a second grep [1]; for difference, you pipe into a second grep with -v [2].

[1] Intersection: grep regex1 file1 | grep regex2

[2] Difference: grep regex1 file1 | grep -v regex2


Yes, you can do that. But you can only do that at the top level of your expression (i.e., outside the expression), not within a regular expression.


The article mentions the difference in times from forcing mmap.

  $ grep --mmap "fleagle" *
  grep: the --mmap option has been a no-op since 2010
  ...
Coincidentally, the article is from 2010.


The art of solving the problem in question and ONLY the problem in question.


In some cases, Perl can actually be faster than grep and sed at basic text processing -- IIRC particularly with larger files -- but I'm not sure about the mechanics behind it.


Original submission title was: "The key to making programs fast is to make them do practically nothing."


If you don't need backtracking, re2 [1] should be faster.

[1] http://code.google.com/p/re2/


I've always loved grep. Now I know why.


Thanks to grep, it almost paid my rent.


Leaving a comment here and upvoting simply to have access to the resources. I really need to update my way of bookmarking pages...


Doing practically nothing is not the same as doing things efficiently and intelligently. It's a joke, but it may be misleading.


> It is a joke but it may be misleading

It's something along the lines of a mantra, not a joke.


This is a classic post.


what is grep?


A text search utility on Unix. The name comes from a command in the text editor ed and stands for "global regular expression print".


Thank you, guys, but I was kidding. I've been using Linux and other Unixes since 1998. I was just struck by how busy this post got with everything but Linux programming.





