

Grab – simple but very fast grep - pmoriarty
https://github.com/stealth/grab

======
ggreer
I'm the author of ag[1]. File searching stuff interests me, so I took a look
at grab. Sadly, grab only compiles on Linux right now. Some of the mmap flags
and cpu_set_t aren't defined on OS X. I was curious about the performance
claims, so I benchmarked grab and ag on my code directory. Times are medians
of five runs. I used a Lenovo X140e running Ubuntu 14.10. It has a 160GB Intel
SSD, 8GB of RAM, and an AMD A4-5000 (4 x 1.5 GHz Jaguar).

    
    
        ggreer@boron:~/code% du -sh .
        8.3G	.
    
        ggreer@boron:~/code% time ag cpu_set_t
        ag cpu_set_t  4.45s user 5.25s system 295% cpu 3.285 total
    
        ggreer@boron:~/code% time grab -R cpu_set_t .
        grab -R cpu_set_t .  13.31s user 21.67s system 35% cpu 1:38.28 total
    
    

Ag is about 30x faster here, but these benchmarks aren't a fair fight. Ag ignores binary and
hidden files by default. If I tell ag to do an unrestricted search, it's still
2x faster (43 seconds vs 98 seconds). Even with cold caches (echo 3 | sudo tee
/proc/sys/vm/drop_caches between each run), ag beats grab handily:

    
    
        ag -u cpu_set_t  19.62s user 32.56s system 90% cpu 57.433 total
    
        grab -R cpu_set_t .  15.48s user 37.89s system 37% cpu 2:22.67 total
    

I haven't profiled grab yet, but there's definitely some low-hanging fruit.
For example, it looks like grab could get a big speedup by detecting literal
patterns and using strstr() instead of a whole PCRE engine. Also,
FileGrep::find is calling pthread_mutex_lock/unlock even if there are no
matches to print. Adding a condition around that makes grab 1.5x faster.
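
To illustrate the literal-pattern fast path, here is a minimal C++ sketch. It is not grab's or ag's actual code: `is_literal_pattern` and `find_literal` are hypothetical names, and the metacharacter list is a deliberately conservative assumption about what PCRE treats as special. It also assumes a case-sensitive search over a NUL-terminated buffer, which strstr() requires.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: true when `pattern` contains no character that a
// PCRE-style engine could treat as special. Conservative on purpose: a few
// patterns that are really literal (e.g. a lone ']') are rejected too.
bool is_literal_pattern(const std::string& pattern) {
    return pattern.find_first_of("\\^$.[]|()?*+{}") == std::string::npos;
}

// Hypothetical fast path: plain substring search for a literal pattern.
// `haystack` must be NUL-terminated. Returns a pointer to the first match,
// or nullptr when there is none.
const char* find_literal(const char* haystack, const std::string& pattern) {
    assert(haystack != nullptr);
    return std::strstr(haystack, pattern.c_str());
}
```

A caller would check `is_literal_pattern()` once per search and fall back to the regex engine whenever it returns false.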

I'm glad I took a look at grab. Despite its shortcomings, I learned from it.
I'll definitely try out a few tricks grab uses that ag doesn't, such as thread
affinity.
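
For reference, thread affinity on Linux can look roughly like the sketch below. This is written under assumptions and is not what grab does: the function name is hypothetical, it relies on glibc's sched_getaffinity()/sched_setaffinity() (where a pid of 0 means the calling thread), and it compiles to a stub on other platforms because cpu_set_t is Linux-specific.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // exposes cpu_set_t and the CPU_* macros on glibc
#endif
#include <sched.h>

// Pins the calling thread to the first CPU it is currently allowed to run on.
// Returns that CPU's index, or -1 on failure or on non-Linux platforms.
int pin_current_thread_to_first_allowed_cpu() {
#ifdef __linux__
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (!CPU_ISSET(cpu, &allowed)) continue;
        cpu_set_t only_this_cpu;
        CPU_ZERO(&only_this_cpu);
        CPU_SET(cpu, &only_this_cpu);
        if (sched_setaffinity(0, sizeof(only_this_cpu), &only_this_cpu) != 0) return -1;
        return cpu;
    }
    return -1;  // no allowed CPU found
#else
    return -1;  // cpu_set_t is not available here
#endif
}
```

A multi-threaded searcher would presumably give each worker a different CPU instead of the first allowed one; the single-CPU version just keeps the sketch short.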

One more thing: The author of grab is right about counting newlines. It does
hurt performance. Still, I enable it by default in ag. I think the tradeoff is
worthwhile.
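
Why counting newlines costs time: to report a line number, every byte before the match has to be scanned for '\n'. A minimal sketch, using a hypothetical `line_number_at` helper (not ag's or grab's implementation):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns the 1-based line number of byte `offset` in `buf` by counting
// every '\n' in buf[0 .. offset). The cost grows with `offset`.
std::size_t line_number_at(const char* buf, std::size_t offset) {
    assert(buf != nullptr);
    std::size_t line = 1;
    const char* p = buf;
    const char* end = buf + offset;
    while (p < end) {
        const void* nl = std::memchr(p, '\n', static_cast<std::size_t>(end - p));
        if (nl == nullptr) break;  // no more newlines before the match
        ++line;
        p = static_cast<const char*>(nl) + 1;
    }
    return line;
}
```

A real tool would likely remember the last position it counted up to, so that several matches in one file do not rescan the same bytes.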

1\.
[https://github.com/ggreer/the_silver_searcher](https://github.com/ggreer/the_silver_searcher)

Edit: The mutex locking change was so straightforward that I submitted a PR:
[https://github.com/stealth/grab/pull/2](https://github.com/stealth/grab/pull/2)
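
The idea behind that change, as a rough sketch rather than the contents of the PR: check for matches first, and only then take the lock. The names `report_matches`, `output_mutex`, `output`, and `lock_acquisitions` are invented for illustration; the vector stands in for stdout, and the counter exists only to make the behavior observable.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>
#include <string>
#include <vector>

static pthread_mutex_t output_mutex = PTHREAD_MUTEX_INITIALIZER;  // guards `output`
static std::vector<std::string> output;     // stand-in for stdout in this sketch
static std::size_t lock_acquisitions = 0;   // instrumentation for the sketch only

// Appends one file's matches to the shared output. The lock is taken only
// when there is something to write, so files without matches cost nothing.
void report_matches(const std::vector<std::string>& matches) {
    if (matches.empty()) return;  // nothing to print: skip the lock entirely

    int rc = pthread_mutex_lock(&output_mutex);
    assert(rc == 0);
    (void)rc;  // silence the unused-variable warning when NDEBUG is set
    ++lock_acquisitions;
    output.insert(output.end(), matches.begin(), matches.end());
    pthread_mutex_unlock(&output_mutex);
}
```

Since most files in a large tree contain no match, skipping the lock for them removes the bulk of the lock/unlock calls.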

~~~
stefantalpalaru
> Sadly, grab only compiles on Linux right now.

the_silver_searcher is a great piece of software and I use it every day, but
why would your main development platform be anything but Linux in this day and
age? OS X has inferior performance, no official package manager and it's being
produced by people hostile to open source developers. Why in the name of Jah
would you subject yourself to that?

~~~
ggreer
I have a MacBook Air. I also have a ThinkPad dual-booting Ubuntu and Windows
8.1. When you see them side-by-side, you'll understand why the Mac goes in my
travel bag[1]. Others might not mind the difference in weight or size, but I
do.

Also, even though the ThinkPad X140e is certified by Ubuntu to be
compatible[2], it took me 4 months of messing around before I could change the
screen brightness. That issue ruined battery life and made my laptop unusable
at night. I still can't get Bluetooth to work. Others have been luckier than
me, but hardware support on Linux can still be spotty. With a Mac, I don't
have to worry about that.

But that's just my experience. Others (such as yourself) love using Linux on
their laptops. It's entirely possible that people simply have different
preferences or workflows. Instead of feigning shock or getting upset, just use
what you like.

1\.
[http://abughrai.be/pics/DSC_8737.JPG](http://abughrai.be/pics/DSC_8737.JPG)

2\.
[http://www.ubuntu.com/certification/hardware/201309-14195/](http://www.ubuntu.com/certification/hardware/201309-14195/)

~~~
stefantalpalaru
>Instead of feigning shock or getting upset

It's not shock, it's disappointment. You put up with this shit[1] because it's
shiny?

[1]: [http://openbenchmarking.org/prospect/1304096-FO-RARINGOSX81/...](http://openbenchmarking.org/prospect/1304096-FO-RARINGOSX81/259ce31561ef83ab43b09765a1351dd472da82ce)

~~~
ggreer
Please stop.

------
imslavko
It might not be a replacement for ag today but it is 500 lines of C++ code w/o
dependencies. Sounds like a great resource to learn from for people like me :P

------
Scaevolus
Are there any benchmarks of this vs ag
([https://github.com/ggreer/the_silver_searcher](https://github.com/ggreer/the_silver_searcher))?

Some of the optimizations are similar (mmap and pcre_study), while others are
opposed (ag uses pthreads, grab claims disk I/O is the bottleneck and threads
slow things down).

~~~
JadeNB
> Are there any benchmarks of this vs ag
> ([https://github.com/ggreer/the_silver_searcher](https://github.com/ggreer/the_silver_searcher))?

Ask, and ggreer
([https://news.ycombinator.com/item?id=8781374](https://news.ycombinator.com/item?id=8781374))
shall give (within 10 minutes). :-)

------
seanp2k2
Surprised to find that no one has yet mentioned ack, which is another
developer-focused grep-like thing:
[http://beyondgrep.com](http://beyondgrep.com)

~~~
chingjun
Because we now have
[ag](https://github.com/ggreer/the_silver_searcher),
which is similar to ack, but much faster

------
_ZeD_
what's wrong with GNU grep???

~~~
zobzu
Not all that much, to be honest. It has many more features than the clones and
is slower in some conditions.

I suspect improving grep would have been a better idea than making tools that
aren't as versatile as grep, but...

