
Livegrep: Live searching the Linux kernel source code - alpb
http://livegrep.com/search
======
roadnottaken
<http://livegrep.com/search?q=fuck>

~~~
rollypolly
That was my first search too. I think it'd be interesting to see a list of the
most searched terms.

~~~
nelhage
Hey there. I built this. Top 10 queries, all lengths:

f,s,fu,fuck,fuc,a,c,p,d,t

Top 10, >=4 characters:

fuck,shit,peni,penis,linu,craz,cunt,linus,test,hell

I weep for humanity :)

~~~
kanzure
Bug report: when a word appears multiple times in a single line, your
service returns the line twice, and in both cases it only highlights the first
occurrence in the line, rather than highlighting the second occurrence on the
second pass or consolidating the two results.
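
For what it's worth, here is a rough sketch of the consolidated behaviour I have in mind, assuming a simple line-oriented matcher. The function name `search_lines` is made up for illustration, and this says nothing about how the livegrep backend actually works:

```python
import re

def search_lines(pattern, lines):
    """Return one result per matching line, carrying every match span on it."""
    regex = re.compile(pattern)
    results = []
    for lineno, line in enumerate(lines, start=1):
        # finditer yields all non-overlapping matches, not just the first one
        spans = [m.span() for m in regex.finditer(line)]
        if spans:
            results.append((lineno, line, spans))
    return results
```

Each line shows up once, and the front end can highlight every span in the list.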

------
koenigdavidmj
Any chance of the backend source for this being released? It's 2012; I
shouldn't be stuck with `grep -r` and OpenGrok on a good day.

~~~
dmit
Take a look at <https://code.google.com/p/codesearch/>.

~~~
4ad
Yes, Russ Cox's codesearch is great. I use it to search around 50 million
lines of code. A query is so fast I never bothered to measure it, and
reindexing the whole thing only takes a minute or two on a 5-year-old laptop.
It's so fast I reindex everything at every shell login.

------
a3_nm
Is there an equivalent CLI tool? (Preprocess a tree of files, possibly keep it
in RAM as a daemon, and answer queries in realtime.)

~~~
4ad
Russ Cox's codesearch: <https://code.google.com/p/codesearch/>

Russ Cox wrote the old Google Code Search; codesearch is an implementation in
Go based on the same ideas.

------
alpb
Are they really searching the static source in real time with threads?
That is what I understand from the description. Why not use something
like Lucene? I believe the searched content is pretty much static, so why
not just index it? Wouldn't that be much faster?

~~~
nelhage
Lucene, and most other indexing solutions (to my knowledge) index based on
_words_, not arbitrary substrings.

So they can efficiently answer questions like "Give me all documents
containing the word 'Linus'", but not necessarily "All documents containing
the string '#def'".

That said, I am indexing -- I have my own custom backend that stores an
in-memory index that lets me do arbitrary substring search (and more complicated
queries, such as most character classes) much faster than a full search.

The backend will still fall back on a full regex search if necessary.
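
To make the word-versus-substring point concrete, here is a toy sketch. It assumes a deliberately naive inverted index that tokenizes on `\w+`; it is not Lucene's actual analyzer, and the file names and helper names are invented for the example:

```python
import re
from collections import defaultdict

def build_word_index(docs):
    """Map each word token to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"\w+", text):
            index[word].add(doc_id)
    return index

def substring_scan(docs, needle):
    """The slow fallback: look at every document in full."""
    return sorted(doc_id for doc_id, text in docs.items() if needle in text)

docs = {"a.c": "#define MAX 10", "b.txt": "Linus wrote this"}
index = build_word_index(docs)

word_hits = index.get("Linus", set())    # the word index answers this directly
substr_hits = index.get("#def", set())   # '#def' is never a token, so no hits
scan_hits = substring_scan(docs, "#def") # only a scan (or a substring index) finds it
```

The word index has entries for `define`, `MAX`, and `10`, but nothing that can answer a query for `#def` without reading the documents again.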

~~~
durin42
Do you use anything like a trigram index (see rsc's wonderful posts about how
Google Code Search worked, and <https://code.google.com/p/codesearch/> for a
Go implementation) to speed up the regex codepath?
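
Roughly, the trigram idea looks like the sketch below. This is a simplification that handles only literal strings, whereas the real codesearch derives trigram queries from full regexes; all function names here are my own, not its API:

```python
from collections import defaultdict

def trigrams(s):
    """All distinct 3-character substrings of s."""
    return {s[i:i + 3] for i in range(len(s) - 2)}

def build_trigram_index(docs):
    """Map each trigram to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in trigrams(text):
            index[gram].add(doc_id)
    return index

def candidates(index, docs, literal):
    """Documents that might contain literal; the caller must still verify."""
    grams = trigrams(literal)
    if not grams:
        return set(docs)  # query too short to filter, so consider everything
    result = None
    for gram in grams:
        posting = index.get(gram, set())
        result = set(posting) if result is None else result & posting
    return result

def search(index, docs, literal):
    """Filter with the index, then confirm each candidate with a real match."""
    return sorted(d for d in candidates(index, docs, literal) if literal in docs[d])
```

The index only narrows down which files to look at; the actual match still runs over each candidate.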

~~~
nelhage
I'm using a different data structure -- a suffix array -- but the concept is
pretty similar. I started work on this before Russ released his codesearch
implementation, but I did read his blog posts while I was working on this.
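
For anyone unfamiliar with suffix arrays, here is a minimal sketch of the general technique. It is not the livegrep code: the construction below is the naive sort-all-suffixes approach, which is fine for a demo but far too slow and memory-hungry for a kernel-sized corpus, and the function names are invented:

```python
def build_suffix_array(text):
    """Start offsets of all suffixes of text, in sorted suffix order."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def find_all(text, sa, needle):
    """Return every offset where needle occurs, via two binary searches."""
    n = len(needle)

    # Lower bound: first suffix whose n-character prefix is >= needle.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + n] < needle:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose n-character prefix is > needle.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + n] <= needle:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes starting with needle are contiguous in the array.
    return sorted(sa[start:lo])
```

Because the suffixes are sorted, every occurrence of a substring sits in one contiguous range of the array, so a lookup costs a couple of binary searches instead of a scan over the whole text.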

------
tumblen
I like this search: <http://livegrep.com/search?q=is+a+hack>

------
denzil_correa
<http://livegrep.com/search?q=bastard>

------
icodeforlove
fun stuff -
[http://livegrep.com/search?q=%5Cb(fuck(ing%7Ced)%3F%7Cshit%7...](http://livegrep.com/search?q=%5Cb\(fuck\(ing%7Ced\)%3F%7Cshit%7Cbloody\)%5Cb)

------
prezjordan
How is it so quick? I attempted writing a similar thing in node to grep the
Rails source code, but it was far from real-time.

