For queries that are already fast, I expect them to remain about as fast. Or at least, I hope they do. :-)
But there are many queries that are not so fast today. For example:
$ time rg-master -Nc 'strengths' OpenSubtitles2018.raw.sample.en
242
real 0.255
user 0.204
sys 0.050
maxmem 903 MB
faults 0
$ time rg -Nc 'strengths' OpenSubtitles2018.raw.sample.en
242
real 0.092
user 0.068
sys 0.024
maxmem 904 MB
faults 0
The second command uses the new substring algorithm. Nothing earth-shattering, to be honest: ripgrep already did a good job of dealing with the common cases.
The revised algorithm is based on the same principles (heuristics derived from an a priori frequency distribution of bytes), but it should be much harder to push it into cases where it runs very slowly. Here's a more pointed example (where the needle is two ASCII space characters); a sketch of the heuristic follows the timings:
$ time rg-master -Nc ' ' OpenSubtitles2018.raw.sample.en
real 0.887
user 0.859
sys 0.027
maxmem 903 MB
faults 0
$ time rg -Nc ' ' OpenSubtitles2018.raw.sample.en
real 0.084
user 0.043
sys 0.040
maxmem 903 MB
faults 0
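To make that concrete, here's a minimal sketch of the rarest-byte heuristic: consult a background frequency table, scan for the needle byte assumed to be rarest, then verify the full needle at each candidate position. The `rank` table and helper functions here are hypothetical stand-ins for illustration, not ripgrep's actual code; the real implementation uses vectorized scanning (via memchr) and a far more refined distribution.

```rust
/// Hypothetical background frequency ranking: higher means "assumed
/// more common" in typical haystacks. A real table would have an
/// entry for all 256 byte values, derived from a corpus.
fn rank(b: u8) -> u8 {
    match b {
        b' ' | b'e' | b't' | b'a' | b'o' => 255,
        b'a'..=b'z' | b'A'..=b'Z' | b'0'..=b'9' => 200,
        _ => 50,
    }
}

/// Position in `needle` of the byte assumed to be rarest.
/// Caller guarantees `needle` is non-empty.
fn rarest_byte_pos(needle: &[u8]) -> usize {
    needle
        .iter()
        .enumerate()
        .min_by_key(|&(_, &b)| rank(b))
        .map(|(i, _)| i)
        .unwrap()
}

/// Find the first occurrence of `needle` in `haystack`: scan for the
/// rare byte, then verify the whole needle around each hit.
fn find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        return Some(0);
    }
    let rare_i = rarest_byte_pos(needle);
    let rare_b = needle[rare_i];
    let mut at = 0;
    while at + needle.len() <= haystack.len() {
        // A real implementation would use memchr's vectorized scan
        // here; a plain iterator keeps the sketch dependency-free.
        let candidate = haystack[at + rare_i..]
            .iter()
            .position(|&b| b == rare_b)?;
        let start = at + candidate;
        if start + needle.len() <= haystack.len()
            && &haystack[start..start + needle.len()] == needle
        {
            return Some(start);
        }
        at = start + 1;
    }
    None
}

fn main() {
    let hay = b"the quick brown fox";
    assert_eq!(find(hay, b"quick"), Some(4));
    assert_eq!(find(hay, b"fox"), Some(16));
    assert_eq!(find(hay, b"zebra"), None);
    println!("ok");
}
```

The intuition behind the slow case above should now be visible: if the needle is two spaces, every byte of it is maximally common, so the "rare byte" scan produces a candidate at nearly every position and the verification step dominates.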
Since rg (which is awesome) understands which programming languages map to which file suffixes, could it make sense to do per-language frequency analysis? {}[] are common in C-style languages, but much less so in Lisp, Python, etc.
It could. But there's a lot of code complexity there, because ripgrep is built from decoupled components. Implementing your idea would require threading that frequency distribution, as an overridable parameter, all the way down to where substring search is actually implemented (in a transitive dependency of ripgrep).
On top of that, it's unlikely to make much of a difference. Not only would the frequency distributions of different languages look pretty similar, but it's still just a guess at the end of the day. There is such a thing as overfitting, too. :-)
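For illustration only, here's roughly what an overridable distribution could look like. The `FrequencyRank` trait and both implementations are made-up names for this sketch, not an actual API:

```rust
/// Hypothetical: a pluggable a priori byte-frequency ranking. Higher
/// rank means "assumed more common"; the searcher would prefer to
/// scan for low-ranked (rare) bytes.
trait FrequencyRank {
    fn rank(&self, byte: u8) -> u8;
}

/// A stand-in for the default English/ASCII-flavored distribution.
struct EnglishRank;

impl FrequencyRank for EnglishRank {
    fn rank(&self, byte: u8) -> u8 {
        match byte {
            b' ' | b'e' | b't' | b'a' => 255,
            b'a'..=b'z' | b'0'..=b'9' => 200,
            _ => 50,
        }
    }
}

/// A C-style-language distribution that treats braces, brackets and
/// semicolons as common, per the suggestion above.
struct CLikeRank;

impl FrequencyRank for CLikeRank {
    fn rank(&self, byte: u8) -> u8 {
        match byte {
            b'{' | b'}' | b'[' | b']' | b'(' | b')' | b';' => 220,
            _ => EnglishRank.rank(byte),
        }
    }
}

fn main() {
    // Under the C-like ranking, '{' counts as common, so a substring
    // searcher would avoid picking it as the byte to scan for.
    assert!(CLikeRank.rank(b'{') > CLikeRank.rank(b'#'));
    println!("ok");
}
```

The trait itself is the easy part; the cost is in threading a value like this through every intermediate crate between ripgrep's CLI and the searcher, and in committing to it as a public API.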