

Elegant exact string match using BWT - bane
http://blog.avadis-ngs.com/2012/04/elegant-exact-string-match-using-bwt-2/

======
bazzargh
Heh. A decade ago, I read this article on duplicate code detection in PMD:

[http://www.onjava.com/pub/a/onjava/2003/03/12/pmd_cpd.html?p...](http://www.onjava.com/pub/a/onjava/2003/03/12/pmd_cpd.html?page=last&x-showcontent=off&x-order=date&x-maxdepth=0)

... and sent Tom a _much_ faster solution based on the BWT. (I tokenized the
codebase and applied the BWT to the tokens; the duplicates can then just be
read off the sorted list of rotations.) It's a neat trick, doesn't take much
code, and I still use it from time to time on languages that I don't have
tools for. It was used for a while in PMD's CPD, but the current version is
based on Rabin-Karp instead.
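
For anyone curious what "reading the duplicates off" looks like, here is a rough sketch of the idea, not the code that was sent to PMD. It assumes a whitespace tokenizer and uses a naive sort of token suffixes, which gives the same ordering as the BWT's sorted rotations when the input ends in a unique sentinel. The name `find_duplicates` and the `min_len` parameter are made up for illustration.

```python
def find_duplicates(tokens, min_len=3):
    """Map each repeated token run (>= min_len tokens) to its start positions.

    Illustrative sketch only: the naive suffix sort is quadratic or worse,
    so a real tool would want a proper suffix-array construction.
    """
    n = len(tokens)
    # Sort suffix start positions by the tokens that follow them.
    # Duplicated runs end up next to each other in this order.
    order = sorted(range(n), key=lambda i: tokens[i:])

    dups = {}
    for a, b in zip(order, order[1:]):
        # Length of the common prefix of two neighbouring suffixes.
        k = 0
        while a + k < n and b + k < n and tokens[a + k] == tokens[b + k]:
            k += 1
        if k >= min_len:
            run = tuple(tokens[a:a + k])
            dups.setdefault(run, set()).update((a, b))
    return {run: sorted(starts) for run, starts in dups.items()}


if __name__ == "__main__":
    # Hypothetical input; a real run would use a language-aware tokenizer.
    source = "a = b + c ; x = 1 ; a = b + c ;"
    for run, starts in find_duplicates(source.split(), min_len=6).items():
        print(" ".join(run), "at token offsets", starts)
```

Raising `min_len` filters out the shorter overlapping runs (such as `= b + c ;`) that are just tails of a longer duplicate.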

