
The problem with suffix arrays, even with a blazing fast SACA, is that they are slow to build. It will take a long time to generate an index for even a moderately sized code repository.

Typically, if you want an index, you build an inverted index, which maps terms (e.g., n-grams or tokens in your favorite PL) to a postings list. The postings list contains all of the documents in which that term occurs.
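
To make that concrete, here is a rough sketch of a trigram inverted index in Python. The function names (`ngrams`, `build_index`, `candidates`, `search`) and the choice of character trigrams are my own assumptions for illustration, not taken from any particular tool. The index only narrows the search to candidate documents; each candidate still has to be checked against the actual text.

```python
from collections import defaultdict


def ngrams(text, n=3):
    """Return the set of character n-grams in text (hypothetical helper)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_index(docs, n=3):
    """Map each n-gram to a sorted postings list of document ids."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in ngrams(text, n):
            postings[gram].add(doc_id)
    return {gram: sorted(ids) for gram, ids in postings.items()}


def candidates(index, query, n=3):
    """Ids of documents containing every n-gram of the query.

    This is a superset of the true matches. Returns None when the query
    is shorter than n, since the index cannot narrow the search then.
    """
    grams = ngrams(query, n)
    if not grams:
        return None
    result = None
    for gram in grams:
        ids = set(index.get(gram, ()))
        result = ids if result is None else result & ids
    return sorted(result)


def search(docs, index, query, n=3):
    """Intersect postings lists, then verify each candidate by substring scan."""
    cands = candidates(index, query, n)
    if cands is None:
        cands = sorted(docs)  # fall back to scanning every document
    return [doc_id for doc_id in cands if query in docs[doc_id]]
```

Usage would look something like `index = build_index(docs)` followed by `search(docs, index, "foo()")`, where `docs` maps a file name to its contents. Real tools store the postings lists compressed on disk, but the shape is the same.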


