
Show HN: An experimental code search engine - sp1982
https://codegrep.com
======
ackfoo
The thing about search is that it is only useful if it lets you find obscure
results. Anyone can find common results; they clog every search engine and
suffocate functionality.

For example, if I enter “UDP broadcast” into your search, I find all the usual
java and cpp results that I can find anywhere. Ho hum. If I wanted those, I
could go use Google or probably just trip over them in my living room.

But I want results in Swift because it is relatively new and obscure. There
are only 4 Swift projects on Github that match the term “UDP broadcast” out of
319 total results. I have to do an advanced search on the native Github engine
to find them.

I think you might want to start by making your search useful for finding the
weird obscure things that people really have to hunt for. Then expand it to
everything else while finding a way of not drowning the oddball stuff with the
common clay (of the New West).

This is where Google, for example, has gone wrong lately. I almost can’t use
it for anything meaningful because any meaningful search (for example, “MacOS
gps”) brings back “5 Amazing GPS apps for Mac that you can’t live without!!” and
many other links with no actual content.

If you want to make a useful search, you have to return the stuff that is hard
to find instead of the easy and useless stuff.

~~~
brobdingnagians
This would be great, especially if it made it easy to find specific examples
of how to do certain things. When I am trying to figure out how to do
something, I usually find several examples that are decently close to the core
idea, then look at the documentation. It helps me understand the documentation
better, so if I could search for a list of examples, it would be invaluable.

Somewhat related, I put in "std::string" under cpp and it didn't return
anything, but "string" returned plenty of instances that included
"std::string". Being able to use namespaces would help in finding obscure
functions with a name that is common, but has a specific module name to make
it unique.
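
One way an indexer could support namespace-qualified queries is to index each qualified name both in full and by its trailing components. The following is a minimal sketch under that assumption; `index_terms` and the regex are hypothetical and are not taken from codegrep's tokenizer.

```python
import re

# Hypothetical pattern: a C++-style identifier, optionally qualified with "::".
QUALIFIED_NAME = re.compile(r"[A-Za-z_]\w*(?:::[A-Za-z_]\w*)*")


def index_terms(source: str) -> set[str]:
    """Return every qualified name found in `source` plus each of its suffixes."""
    terms = set()
    for match in QUALIFIED_NAME.finditer(source):
        parts = match.group(0).split("::")
        # "std::string" is indexed as both "std::string" and "string".
        for i in range(len(parts)):
            terms.add("::".join(parts[i:]))
    return terms


terms = index_terms("std::string name = other::string();")
```

With an index like this, a query for "std::string" and a query for "string" would both hit the first declaration, while "other::string" stays distinguishable from it.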

Any and all improvements in context awareness would be nice too, but probably
more difficult than they would be worth for any short term implementation.

------
nikaspran
It would be great to have a few pre-selected example queries immediately
clickable, to see some examples. That would help showcase why this is cool and
what you can do with it. And maybe why it's better than other search tools
(e.g. the built-in GitHub one).

~~~
sp1982
Good point, will add.

------
jimijazz
This is fun. Why have you restricted the search to just Identifiers, Variables
and Functions? Does this mean that text found in a comment block would not be
a match?

I'm already devising how you could incorporate a bag of words algorithm plus
an embedding to segment/find similar items.
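
A rough sketch of the bag-of-words half of that idea, assuming snippets are compared by cosine similarity over token counts. The function names are hypothetical; a learned embedding would replace the raw count vectors used here.

```python
import math
import re
from collections import Counter


def bag_of_words(code: str) -> Counter:
    """Count lowercase word-like tokens in a code snippet."""
    return Counter(re.findall(r"[A-Za-z_]\w*", code.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


snippet_a = bag_of_words("def send_udp(sock, data): sock.sendto(data, addr)")
snippet_b = bag_of_words("def send_tcp(sock, data): sock.send(data)")
snippet_c = bag_of_words("class Logger: pass")

similar = cosine_similarity(snippet_a, snippet_b)    # shares def, sock, data
unrelated = cosine_similarity(snippet_a, snippet_c)  # shares no tokens
```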

~~~
sp1982
>> Why have you restricted the search to just Identifiers, Variables and
Functions? Does this mean that text found in a comment block would not be a
match?

Yes, by default it matches anything, but if you specify a filter, you can
restrict the search to the available language features.

------
lgessler
If this looks cool you might also be into Google Code Search
[https://github.com/google/codesearch](https://github.com/google/codesearch)

------
gilleain
Feature request/idea: Would be nice to be able to 'pipe' the results of one
search into another.

Imagine a series of search boxes, with the results of the first fed into the
second, and so on. Like:

> codegrep --class "Logger" | codegrep --field "error"
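
A minimal sketch of how such chaining might work, assuming each search returns a set of file paths and the next search only looks inside that set. The `Symbol` record, the `search` function and the sample index are all hypothetical; the `--class` and `--field` flags above are part of the feature request, not an existing CLI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """Hypothetical index record: one declared symbol in one file."""
    path: str
    kind: str  # e.g. "class" or "field"
    name: str


def search(index, kind, name, within=None):
    """Return paths declaring `name` as `kind`, optionally limited to `within`."""
    return {
        s.path
        for s in index
        if s.kind == kind and s.name == name and (within is None or s.path in within)
    }


index = [
    Symbol("log/logger.py", "class", "Logger"),
    Symbol("log/logger.py", "field", "error"),
    Symbol("net/socket.py", "field", "error"),
    Symbol("app/main.py", "class", "Logger"),
]

# Equivalent of: codegrep --class "Logger" | codegrep --field "error"
with_logger = search(index, "class", "Logger")
piped = search(index, "field", "error", within=with_logger)
```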

------
aryamaan
This looks good. I wanted to do something similar for Golang when I started
learning it.

Like for a package say `sync` I want to see the most common methods first and
their documentation.
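
One rough way to get that ordering would be to count how often each exported name of a package is referenced across indexed source. The sketch below assumes a simple regex over Go source text rather than a real parser; `rank_package_members` and the sample sources are hypothetical.

```python
import re
from collections import Counter


def rank_package_members(sources, package):
    """Rank exported names of `package` by how often they are referenced."""
    # Exported Go identifiers start with an upper-case letter.
    pattern = re.compile(r"\b" + re.escape(package) + r"\.([A-Z]\w*)")
    counts = Counter()
    for src in sources:
        counts.update(pattern.findall(src))
    return counts.most_common()


go_sources = [
    "var mu sync.Mutex\nvar wg sync.WaitGroup\n",
    "var mu sync.Mutex\nvar once sync.Once\n",
    "type cache struct { mu sync.Mutex; wg sync.WaitGroup }\n",
]
ranking = rank_package_members(go_sources, "sync")
```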

I find that the current godocs are not very welcoming and treat every
aspect of a package equally, even though some method types are more
important/useful than others.

------
ssijak
First search revealed some funny test code :)
[https://github.com/ornicar/lila/blob/master/modules/shutup/s...](https://github.com/ornicar/lila/blob/master/modules/shutup/src/test/AnalyserTest.scala)

~~~
contravariant
Kind of curious what the story behind the chess-bot stuff is.

------
boyter
Very nice. I run searchcode.com which I have been neglecting recently. That
said I do like to see other code search engines and play around with them.
Seems you have disabled the default regex search offered by elastic? I didn’t
look into the code much but a search of /.*/ yielded no results?

~~~
sp1982
Yes, currently it is a simple term search; regex searches aren't very stable
with respect to performance. Something to think about for the next iteration.

------
pard68
It would be useful if it grouped repositories or had a method of collapsing
results. My search returned the same repo for each result. Not sure if it was
a function of there only being one repository using my term (doubt it, I used
"GCD") or what.

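Collapsing could be as simple as grouping hits by repository before rendering. This is a minimal sketch assuming each hit is a `(repo, path)` pair; the function name and the sample repositories are made up for illustration.

```python
from collections import defaultdict


def group_by_repo(hits):
    """Group (repo, path) hits by repo, keeping first-seen repo order."""
    grouped = defaultdict(list)
    for repo, path in hits:
        grouped[repo].append(path)
    return dict(grouped)


hits = [
    ("example/mathlib", "src/gcd.c"),
    ("example/mathlib", "tests/gcd_test.c"),
    ("example/dispatch", "queue/gcd.swift"),
    ("example/mathlib", "docs/gcd.md"),
]
grouped = group_by_repo(hits)
```

Each repository then appears once in the result list, with its matching files shown underneath or hidden behind an expander.
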
------
oliverx0
This has a lot of potential. I would love to be able to type quicksort and
automatically see a good implementation of it. What is shown right now is not
good enough, but the idea is there. Great job! I also tried ray tracing, and
timsort, but got no results.

------
kureikain
Amazing. Search is super fast.

What do you think about Elm? I want to do more work with it too; I have played
with it on some toy projects and love it so far.

------
sqrt17
It seems to highlight matches both within strings and as normal identifiers
(e.g. to_csv finds matches within strings)
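
To illustrate the distinction, here is a small sketch that separates identifier matches from matches inside string literals. It uses Python's standard `tokenize` module on Python source purely as a stand-in; how codegrep tokenizes is not known, and `classify_matches` is a hypothetical helper.

```python
import io
import tokenize


def classify_matches(source: str, term: str) -> dict[str, int]:
    """Count identifier tokens equal to `term` and string tokens containing it."""
    counts = {"identifier": 0, "string": 0}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == term:
            counts["identifier"] += 1
        elif tok.type == tokenize.STRING and term in tok.string:
            counts["string"] += 1
    return counts


source = 'df.to_csv("out.csv")\nprint("call to_csv to export")\n'
counts = classify_matches(source, "to_csv")
```

A search restricted to identifiers could then ignore the second line, where `to_csv` only appears inside a string.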

------
ddorian43
What's the backend like?

~~~
sp1982
Play framework + Elasticsearch

------
edwinyzh
Do you have a plan to add Pascal/Delphi language support?

~~~
sp1982
Adding new languages is fairly easy as long as I can write a little helper
program to tokenize. I will take a look.
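
For a sense of scale, a helper of that kind might look like the sketch below, which pulls `function` and `procedure` declarations out of Pascal source with a regex. This is an assumption about what such a helper could do, not the one codegrep uses, and a regex will miss cases a real parser would handle.

```python
import re

# Hypothetical helper: match "function Name" or "procedure Class.Name"
# at the start of a line (Pascal keywords are case-insensitive).
DECLARATION = re.compile(
    r"^[ \t]*(function|procedure)\s+([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)",
    re.IGNORECASE | re.MULTILINE,
)


def extract_declarations(source: str) -> list[tuple[str, str]]:
    """Return (kind, name) pairs for each routine declared in `source`."""
    return [(kind.lower(), name) for kind, name in DECLARATION.findall(source)]


pascal_source = """\
function Gcd(a, b: Integer): Integer;
begin
  if b = 0 then Result := a else Result := Gcd(b, a mod b);
end;

Procedure TLogger.Error(const Msg: string);
begin
  WriteLn(Msg);
end;
"""
declarations = extract_declarations(pascal_source)
```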

~~~
azeirah
Take a look at tree-sitter. It's a parser written for the Atom editor; it
supports 20+ languages and is _super fast_, as it was written to parse in
under 1/60th of a second (parse on keypress, should be done before the screen
refreshes).

[https://github.com/tree-sitter](https://github.com/tree-sitter)

It has bindings for Node, Ruby, Rust and Haskell.

------
catchmeifyoucan
What is experimental?

~~~
sp1982
Coverage: only the top few thousand open-source repos from GitHub are being
indexed as of now. Similarly, only a few languages + features are extracted.
I probably should have said alpha :)

------
mandeepj
How is this different from GitHub's search?

~~~
sp1982
A bit more structured, at least when I last hacked on it. The key idea is to
be able to search via language features, for example by function
implementation (vs usage). Let's say you saw a function name in a kernel crash,
like
[https://www.codegrep.com/search?query=ext4_extent_block_csum...](https://www.codegrep.com/search?query=ext4_extent_block_csum_verify&language=c&identifier=function)
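
The definition-versus-usage distinction can be sketched with Python's standard `ast` module. This example works on Python source as a stand-in for the C case in the link; it is not how codegrep extracts features, and `definitions_and_calls` is a hypothetical name.

```python
import ast


def definitions_and_calls(source: str, name: str) -> tuple[list[int], list[int]]:
    """Return line numbers where `name` is defined and where it is called."""
    defined, called = [], []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.name == name:
                defined.append(node.lineno)
        elif isinstance(node, ast.Call):
            # Only plain calls like verify(x); attribute calls are ignored here.
            if isinstance(node.func, ast.Name) and node.func.id == name:
                called.append(node.lineno)
    return sorted(defined), sorted(called)


source = """\
def verify(block):
    return checksum(block) == block.stored

def read(block):
    if not verify(block):
        raise IOError("bad checksum")
    return block
"""
defined, called = definitions_and_calls(source, "verify")
```

A search filtered on "function" would return only the entries in `defined`, which is the useful result when starting from a name in a stack trace.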

~~~
mandeepj
Thanks for the insight. You are correct: GitHub does not support this type
of search.

------
whydoineedthis
Sweet. Now I can find committed AWS super fast.

------
quickthrower2
Useful. Please keep going!

~~~
kreetx
Not sure why someone downvoted; here it is back! :)

The search is fast and the UI is simple, and it works on mobile -- great work,
sp1982!

------
piecu
No C#, useless for me.

~~~
sp1982
Sorry, will try to add it in the near future.

