
Ripgrep is awesome. Thank you for making it!

In addition to being really fast, it "just works". By that I mean it automatically excludes the files I want to be excluded, like `.gitignore`d files and binary files. I know I can configure ack-grep (ag) and other tools to do that, but not needing to configure it is nice.

BTW, if anyone hasn't read https://blog.burntsushi.net/ripgrep/, highly recommended. It's about how ripgrep is so fast. (Edit: Just saw another comment mentioned this, too. Goes to show that making a single high-quality blog post has a big impact.)




It is a very good blog post. Recommended!

Now, I will do my obligatory "burntsushi needs to clag the rest of the Teddy code" post which I must do on every discussion of ripgrep. :-)

The "subtitles_alternate_casei" examples would be a good benchmark - these 48 strings should not overwhelm Teddy as they could be sensibly merged into Teddy's 8 "buckets" (in fact, they could be merged into 5 buckets) using the simple greedy merging strategy in the original Teddy implementation.

This would probably be a quite good project for someone who wants to contribute to ripgrep and could likely get some nice performance wins...


I agree, this could be a nice project! If anyone wants to work on it, this is the place to start: https://github.com/rust-lang/regex/blob/master/src/literal/t...

I'll get to it myself eventually if someone else doesn't, but it will likely be a while.


I've been thinking about a Teddy successor (which I suppose needs to be called "Taft").

Teddy is very much 'of its time' (SSSE3) and there are a lot of new approaches that seem interesting (AVX512 of Skylake generation, VBMI, Sunny Cove's even bigger slate of instructions, ARM NEON, SVE).

I also have better ideas about followup confirm than I used to. There is also some prospect of picking a 'fragment' out of the whole string within Teddy or equivalent at a position not strictly at its suffix - this can even be done with ordering preserved if you are careful not to make fragment choices that allow o-o-o matches (only possible if strings overlap).

I might do a bit of work on this, but I'm a bit jaded on string matching and regex matching after 13 years.


On the other hand, by now you probably have more fixed function hardware in your brain for string/regex matching than any other human alive.


What an unnerving concept! I think there are many more people with better algorithmic understanding of the problems. I am more of a 'bang on the problem with a stick until it kinda works' type meathead with a few cheesy SIMD tricks up my sleeve.

I am hoping to move on, but I admit I do have a "few more ideas" in that area - possibly even slightly less 'meatheaded' than previous outings. Maybe (although everything looks better on paper).


ack and ag (the_silver_searcher) are two distinct tools. The latter automatically excludes .gitignored and binary files, just like ripgrep. In fact, it seems ag inspired ripgrep's behavior; in the blog post[0] introducing Ripgrep, BurntSushi specifically highlights that rg aims for the usability of ag:

> I will introduce a new command line search tool, ripgrep, that combines the usability of The Silver Searcher [ag] …

[0]: https://blog.burntsushi.net/ripgrep/


>ack and ag (the_silver_searcher) are two distinct tools. The latter automatically excludes .gitignored and binary files, just like ripgrep.

Only I've never been able to make ag (or pt, another similar tool) respect .gitignore and other such settings as well as rg does out of the box. Plus it's slower.


Yes, I believe BurntSushi specifically highlights better .gitignore support in the (long) blog post I linked.


Both of these are slower than rg. ag has issues with big files; it's unreliable.


I've not used ag but just a small shout out for ack: it's a single Perl file (easy to install, v1.x will work with even antique Perl), it works on weird hardware/OS (no compilation required), it contains its own man-page (ack -man) and is fast enough for typical sysadmin tasks and moderate size codebases.

(Ack also supports the full PCRE syntax but that's less of an issue now rg has -P).

If your work takes you to minority platforms ack is certainly worth keeping in your toolbox.


What file sizes did you have issues with?

I'm using ag inside my source directories and cannot remember it not doing the job.


Thanks, did not know that!


> In addition to being really fast, it "just works". By that I mean it automatically excludes the files I want to be excluded, like `.gitignore`d files

I basically never want this, so for me, ripgrep's one flaw is that it never "just works".


The nature of defaults is that they will never please everyone, because opinions on what is useful have been encoded into the defaults. If you want to completely disable ripgrep's smart filtering, then just do this:

    alias rg="rg -uuu"


You can configure ripgrep to default to whatever flags you want by creating a configuration file and defining `RIPGREP_CONFIG_PATH`: https://github.com/BurntSushi/ripgrep/blob/11.0.0/GUIDE.md#c.... Try putting `-uuu` in the file as burntsushi suggested.
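For anyone who wants to try this, here is a minimal sketch of that setup. The path `~/.ripgreprc` is only an assumption for illustration; ripgrep reads whatever file `RIPGREP_CONFIG_PATH` points to. The file holds one argument per line, and lines starting with `#` are treated as comments.

```shell
# Hypothetical location for the config file; any path works.
config_file="${HOME}/.ripgreprc"

# One argument per line; lines beginning with '#' are comments.
cat > "$config_file" <<'EOF'
# Disable all smart filtering (ignore files, hidden files, binary detection).
-uuu
EOF

# ripgrep only reads a config file when this variable is set.
export RIPGREP_CONFIG_PATH="$config_file"
```

To make it stick, the `export` line would go in your shell's startup file. Unlike the alias, this should also apply when another program (an editor plugin, say) runs `rg`, as long as that program inherits the variable.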


Also see: A series of posts on regex parsers by Russ Cox.

https://swtch.com/~rsc/regexp/


I agree, ripgrep is amazing.



