
git grep feels pretty much instant to me. Can't really understand the need for something faster unless it's for searching through data files...



git grep can be fast, depending on what you're searching for and how much you need to search. If you commonly search literals, then git grep's literal optimizations probably make it good enough for you. But increasing the pattern complexity just a bit can result in large performance cliffs. (Times below were taken after running each command a few times, so that the I/O cache is warm.) For example:

    $ git clone --depth 1 https://github.com/BurntSushi/linux
    $ cd linux
    
    $ time LC_ALL=en_US.UTF-8 git grep -E '\w+_RESUME' | wc -l
    1998
    
    real    20.616
    user    2:02.49
    sys     0.363
    maxmem  64 MB
    
    $ time rg '\w+_RESUME' | wc -l
    1998
    
    real    0.127
    user    0.673
    sys     0.617
    maxmem  26 MB
Both of these invocations are doing roughly equivalent work, including respecting gitignores. Both of them are using a Unicode-aware `\w` character class. OK, so you might say you don't care about Unicode. That's fine; ripgrep is still faster by an order of magnitude:

    $ time LC_ALL=C git grep -E '\w+_RESUME' | wc -l
    1998
    
    real    4.546
    user    27.741
    sys     0.420
    maxmem  63 MB
With that said, `git grep` can now be made to use PCRE2, which gives it a significant speed boost on this workload:

    $ time LC_ALL=en_US.UTF-8 git grep -P '(*UCP)\w+_RESUME' | wc -l
    1998
    
    real    0.894
    user    5.821
    sys     0.493
    maxmem  63 MB
    
    $ time LC_ALL=C git grep -P '\w+_RESUME' | wc -l
    1998
    
    real    0.517
    user    2.962
    sys     0.596
    maxmem  59 MB
ripgrep can do the same as of this release, but faster:

    $ time rg -P '\w+_RESUME' | wc -l
    1998
    
    real    0.511
    user    4.795
    sys     0.544
    maxmem  24 MB
    
    $ time rg -P --no-pcre2-unicode '\w+_RESUME' | wc -l
    1998
    
    real    0.422
    user    4.119
    sys     0.479
    maxmem  24 MB
Do these performance differences matter to you? Maybe not. But you said you couldn't understand; hopefully the numbers above add some clarity. :-) On top of that, ripgrep works just as well outside of git repos, on huge log files, binary data or even in shell pipelines.
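
If you want to try this on your own checkout, the "run each command a few times" step can be scripted. What follows is only a rough sketch: `warm_up` is a hypothetical helper name I made up (it is not part of git or ripgrep), and it assumes a POSIX shell. It pipes output through `wc -l`, like the timings above, and leaves the timing itself to your shell's `time`.

```shell
# Hypothetical helper (not part of git or ripgrep): run a command a few
# times so the files it reads are likely in the OS page cache before the
# run that actually gets timed.
warm_up() {
    runs=$1    # number of warm-up runs
    shift      # the remaining arguments are the command to run
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Mirror the pipelines above: send matches through wc -l and
        # discard the count; errors from the command are ignored.
        "$@" 2>/dev/null | wc -l > /dev/null
        i=$((i + 1))
    done
}

# Example, from inside the linux checkout:
#   warm_up 3 rg '\w+_RESUME'
#   time rg '\w+_RESUME' | wc -l
```

Three warm-up runs is an arbitrary choice here; if the timed runs still vary a lot from one to the next, bump it up.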



