
Regex was taking 5 days to run. So I built a tool that did it in 15 minutes - mgdo
https://dev.to/vi3k6i5/regex-was-taking-5-days-to-run-so-i-built-a-tool-that-did-it-in-15-minutes-c98
======
colanderman
This is a really poorly-researched article.

Had the author taken time to research regular expressions, he would have
discovered the difference between DFA-based and backtracking-based
implementations. [1] He would have also discovered that the former (as
implemented by, say, OCaml) not only gives performance independent of the
number of words searched, but in fact creates and follows the exact same table
as his bespoke algorithm, while covering a wider range of patterns.

(Yes, you can apply the DFA technique even when you need to query submatches.
See Laurikari's thesis [2], which forms the basis of the OCaml regex
implementation.)
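
To make the table-driven idea above concrete, here is a minimal Python sketch of Aho-Corasick multi-keyword search. It is only an illustration under stated assumptions: it is not the article's tool and not how OCaml implements regexes, and the names `build_automaton` and `find_all` are made up for this example. It matches plain substrings, with no word-boundary handling.

```python
from collections import deque


def build_automaton(words):
    """Build goto/fail/output tables for a set of literal keywords."""
    goto = [{}]   # goto[state][char] -> next state; state 0 is the root
    fail = [0]    # fail[state] -> state to fall back to on a mismatch
    out = [[]]    # out[state] -> keywords that end at this state

    # Phase 1: insert every keyword into a trie.
    for word in words:
        state = 0
        for ch in word:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[state][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append([])
            state = nxt
        out[state].append(word)

    # Phase 2: breadth-first pass to fill in the failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the fallback state.
            out[nxt] = out[nxt] + out[fail[nxt]]
            queue.append(nxt)
    return goto, fail, out


def find_all(text, automaton):
    """Scan text once, returning (start_index, keyword) for each match."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            matches.append((i - len(word) + 1, word))
    return matches


automaton = build_automaton(["he", "she", "his", "hers"])
matches = find_all("ushers", automaton)
```

The scan reads each input character once, and the fallback loop is amortized constant per character, so adding more keywords grows the table but not the per-character scanning cost (reporting the matches themselves aside).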

[1]
[https://en.wikipedia.org/wiki/Regular_expression#Implementat...](https://en.wikipedia.org/wiki/Regular_expression#Implementations_and_running_times)

[2] [https://laurikari.net/ville/regex-submatch.pdf](https://laurikari.net/ville/regex-submatch.pdf)

------
glangdale
We are somewhat in amateur hour here. It is well understood that literal
matching is easier at scale than regex matching, and some of us have built
regex matchers that take advantage of this, whether they are given a pure
literal case or regexes that have literal factors.
[https://github.com/intel/hyperscan](https://github.com/intel/hyperscan) is
one such project, which I've worked on for a little while.
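
As a rough illustration of exploiting a literal factor, the Python sketch below runs a cheap substring check before the regex engine. It does not use Hyperscan's API or reflect its internals; `make_prefiltered_search` is a hypothetical helper, and the required literal is supplied by hand here, on the assumption that every match of the pattern must contain it.

```python
import re


def make_prefiltered_search(pattern, required_literal):
    """Return a search function that skips the regex when the literal is absent.

    Assumption: every string matched by `pattern` contains `required_literal`.
    The caller is responsible for that being true; nothing here verifies it.
    """
    compiled = re.compile(pattern)

    def search(text):
        # Fast path: a plain substring test rules out most non-matching texts.
        if required_literal not in text:
            return None
        # Slow path: only now pay for the full regex match.
        return compiled.search(text)

    return search


search = make_prefiltered_search(r"error: \d+", "error: ")
hit = search("disk error: 42 on sda")
miss = search("all good")
```

The design choice is simply to let the literal test reject inputs early, so the regex engine only sees texts that could possibly match.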

