Executive summary (from me, a hasty reader and a sloppy thinker): It matches multiple characters at a time by taking advantage the fact that, typically, more than one character its in a machine word. This quickly becomes hairy, leading to over 50% of the remaining text written in the LaTeX math environment.
Less jokingly, in order to understand what's going on, I think you need to have read+understood Myers' regex algorithm first, which speeds up the straightforward Thompson NFA O(nm) matching by using a table scheme, which they then extend to support performing transitions for multiple characters at once.
Finally ... I'm not sure how much practical use this improvement really is, given:
> "Today, regular expressions have also become a shining example of how ignoring good theory leads to bad programs. The regular expression implementations used by today's popular tools are significantly slower than the ones used in many of those thirty-year-old Unix tools."
This is about as awesome hackery piece of theoretical computer science as you can get, the first asymptotic improvement to regular expression matching in at least 17 years!
to quote from the paper's abstract, the improvement is We show how to solve this ubiquitous task in linear space
and O(nm(log log n)/(log n)3/2 +n+m) time where m is the length of the
expression and n the length of the string. This is the first improvement for
the dominant O(nm/ log n) term in Myers’ O(nm/ log n + (n + m) log n)
bound [JACM 1992]. We also get improved bounds for external memory.
http://www.philipbille.org/files/c10.pdf?attredirects=0
Executive summary (from me, a hasty reader and a sloppy thinker): It matches multiple characters at a time by taking advantage the fact that, typically, more than one character its in a machine word. This quickly becomes hairy, leading to over 50% of the remaining text written in the LaTeX math environment.
Less jokingly, in order to understand what's going on, I think you need to have read+understood Myers' regex algorithm first, which speeds up the straightforward Thompson NFA O(nm) matching by using a table scheme, which they then extend to support performing transitions for multiple characters at once.
Finally ... I'm not sure how much practical use this improvement really is, given:
> "Today, regular expressions have also become a shining example of how ignoring good theory leads to bad programs. The regular expression implementations used by today's popular tools are significantly slower than the ones used in many of those thirty-year-old Unix tools."
From http://swtch.com/~rsc/regexp/regexp1.html