The thing about search is that it is only useful if it lets you find obscure results. Anyone can find common results; they clog every search engine and suffocate functionality.
For example, if I enter “UDP broadcast” into your search, I find all the usual java and cpp results that I can find anywhere. Ho hum. If I wanted those, I could go use Google or probably just trip over them in my living room.
But I want results in Swift because it is relatively new and obscure. There are only 4 Swift projects on Github that match the term “UDP broadcast” out of 319 total results. I have to do an advanced search on the native Github engine to find them.
I think you might want to start by making your search useful for finding the weird obscure things that people really have to hunt for. Then expand it to everything else while finding a way of not drowning the oddball stuff with the common clay (of the New West).
This is where Google, for example, has gone wrong lately. I almost can't use it for anything meaningful because any meaningful search (for example "MacOS gps") brings back "5 Amazing GPS apps for Mac that you can't live without!!" and many other links with no actual content.
If you want to make a useful search, you have to return the stuff that is hard to find instead of the easy and useless stuff.
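One cheap way to keep the rare results from drowning is to interleave hits across languages instead of returning one flat ranked list. This is only a sketch of that idea: the `lang` and `score` fields and the `interleave_by_language` name are made up for illustration, and nothing here reflects how the actual site ranks results.

```python
from collections import defaultdict
from itertools import zip_longest


def interleave_by_language(hits):
    """Round-robin hits across languages, rarest language first.

    `hits` is assumed to be a list of dicts already sorted by score,
    each with a hypothetical "lang" key.
    """
    buckets = defaultdict(list)
    for hit in hits:
        buckets[hit["lang"]].append(hit)

    # Languages with the fewest hits go first so they are not buried.
    ordered = sorted(buckets.values(), key=len)

    merged = []
    for row in zip_longest(*ordered):
        merged.extend(hit for hit in row if hit is not None)
    return merged


hits = [
    {"lang": "java", "score": 9.0},
    {"lang": "java", "score": 8.5},
    {"lang": "cpp", "score": 8.0},
    {"lang": "java", "score": 7.5},
    {"lang": "cpp", "score": 7.0},
    {"lang": "swift", "score": 3.0},
]
ranked = interleave_by_language(hits)
```

With the sample data the single Swift hit comes out first even though its raw score is the lowest, which is the behavior being asked for here.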
This would be great, especially if it made it easy to find specific examples of how to do certain things. When I am trying to figure out how to do something, I usually find several examples that are decently close to the core idea, then look at the documentation. The examples help me understand the documentation better, so being able to search for a list of examples would be invaluable.
Somewhat related, I put in "std::string" under cpp and it didn't return anything, but "string" returned plenty of instances that included "std::string". Being able to search with namespaces would help in finding obscure functions whose names are common but are made unique by a module or namespace prefix.
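A guess at what is happening: the tokenizer splits on `::`, so the qualified name never makes it into the index. A minimal sketch of an alternative, assuming you control tokenization, is to index each qualified name together with its shorter suffixes. The `index_terms` helper is hypothetical and only handles `::` as a separator.

```python
import re

# An identifier, optionally followed by more "::"-separated identifiers.
QUALIFIED = re.compile(r"[A-Za-z_]\w*(?:::[A-Za-z_]\w*)*")


def index_terms(source: str) -> set[str]:
    """Return each qualified name plus every trailing suffix of it.

    "a::b::c" is indexed as "a::b::c", "b::c" and "c", so both the
    qualified and the bare name are searchable.
    """
    terms = set()
    for match in QUALIFIED.finditer(source):
        parts = match.group().split("::")
        for start in range(len(parts)):
            terms.add("::".join(parts[start:]))
    return terms


terms = index_terms("std::string name = other.to_string();")
```

With this, a query for `std::string` and a query for `string` would both hit the same line, while `std::string` stays distinguishable from some other `string`.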
Any and all improvements in context awareness would be nice too, but probably more difficult than they would be worth for any short term implementation.
It would be great to have a few pre-selected example queries immediately clickable. That would help showcase why this is cool and what you can do with it, and maybe why it's better than other search tools (e.g. the built-in GitHub one).
This is fun. Why have you restricted the search to just Identifiers, Variables and Functions? Does this mean that text found in a comment block would not be a match?
I'm already devising how you could incorporate a bag of words algorithm plus an embedding to segment/find similar items.
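To make the bag-of-words half concrete, here is a rough sketch that compares snippets by cosine similarity over word counts. The embedding half is left out entirely, and the tokenization (lowercased runs of letters) is just an assumption for the example.

```python
import math
import re
from collections import Counter


def bag_of_words(code: str) -> Counter:
    """Count lowercase alphabetic tokens in a snippet."""
    return Counter(re.findall(r"[a-z]+", code.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


sort_a = bag_of_words("def quicksort(items): pivot = items[0]")
sort_b = bag_of_words("def quick_sort(values): pivot = values[0]")
network = bag_of_words("send udp broadcast packet")

similar = cosine(sort_a, sort_b)
unrelated = cosine(sort_a, network)
```

The two sort snippets share `def` and `pivot` and score above zero; the networking snippet shares nothing and scores zero. Plain word counts miss that `quicksort` and `quick_sort` are the same thing, which is where an embedding would presumably earn its keep.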
>> Why have you restricted the search to just Identifiers, Variables and Functions? Does this mean that text found in a comment block would not be a match?
Yes, by default it matches anything, but if you specify a filter you can restrict the search to the available language features.
This looks good. I wanted to do something similar for Golang when I started learning it.
For a package like `sync`, say, I want to see the most common methods first, along with their documentation.
I find the current godocs aren't very welcoming: they treat every aspect of a package equally, even though some methods and types are more important or useful than others.
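A crude way to get that ordering is to count how often each exported member of a package is referenced across a corpus and list the most-used ones first. This is a sketch only: `rank_members` is a made-up helper, the regex ignores import aliases and comments, and it relies on the Go convention that exported names start with an uppercase letter.

```python
import re
from collections import Counter


def rank_members(package, sources):
    """Rank exported members of `package` by reference count.

    `sources` is an iterable of Go source strings. Matches text such
    as `sync.Mutex`; it does not parse the code.
    """
    pattern = re.compile(r"\b" + re.escape(package) + r"\.([A-Z]\w*)")
    counts = Counter()
    for source in sources:
        counts.update(pattern.findall(source))
    return counts.most_common()


corpus = [
    "var mu sync.Mutex\nvar wg sync.WaitGroup\n",
    "type cache struct { mu sync.Mutex }\n",
]
ranking = rank_members("sync", corpus)
```

For the two-file corpus above, `Mutex` ranks ahead of `WaitGroup` because it is referenced twice.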
Very nice. I run searchcode.com which I have been neglecting recently. That said I do like to see other code search engines and play around with them. Seems you have disabled the default regex search offered by elastic? I didn’t look into the code much but a search of /.*/ yielded no results?
Yes, currently it is a simple term search; regex searches aren't very stable with respect to performance. Something to think about for the next iteration.
It would be useful if it grouped repositories or had a method of collapsing results. My search returned the same repo for each result. Not sure if it was a function of there only being one repository using my term (doubt it, I used "GCD") or what.
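Collapsing could be done as a post-processing step over the raw hits. A minimal sketch, assuming each hit is a dict with hypothetical `repo`, `path` and `score` keys (these are not the site's real field names):

```python
from collections import defaultdict


def collapse_by_repo(hits, per_repo=2):
    """Group hits by repository and keep the best `per_repo` of each."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit["repo"]].append(hit)

    collapsed = []
    for repo, repo_hits in groups.items():
        repo_hits.sort(key=lambda h: h["score"], reverse=True)
        collapsed.append({
            "repo": repo,
            "top": repo_hits[:per_repo],
            # How many further hits are folded away behind this group.
            "hidden": max(0, len(repo_hits) - per_repo),
        })

    # Order groups by their best hit.
    collapsed.sort(key=lambda g: g["top"][0]["score"], reverse=True)
    return collapsed


hits = [
    {"repo": "a/gcd", "path": "gcd.c", "score": 5.0},
    {"repo": "a/gcd", "path": "test.c", "score": 4.0},
    {"repo": "a/gcd", "path": "README", "score": 3.0},
    {"repo": "b/math", "path": "num.go", "score": 4.5},
]
groups = collapse_by_repo(hits)
```

Each group then renders as one result with a "show N more from this repo" link driven by `hidden`.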
This has a lot of potential. I would love to be able to type quicksort and automatically see a good implementation of it. What is shown right now is not good enough, but the idea is there. Great job! I also tried ray tracing, and timsort, but got no results.
Take a look at tree-sitter. It's a parser written for the Atom editor, it supports 20+ languages, and it is super fast, since it was written to parse in under 1/60th of a second (parsing on each keypress should finish before the screen refreshes).
Coverage: only the top few thousand open-source repos from GitHub are being indexed as of now. Similarly, only a few languages and features are extracted. Probably I should have said alpha :)
A bit more structured, at least when I last hacked on it. The key idea is to be able to search via language features, for example by function implementation (vs usage). Let's say you saw a function name in a kernel crash, like https://www.codegrep.com/search?query=ext4_extent_block_csum...
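To illustrate the implementation-vs-usage distinction, here is a small sketch using Python's standard `ast` module on Python source. It is a stand-in for the idea only; it says nothing about how the site extracts features, and a C example like the kernel one would need a C parser instead. The `find_symbol` helper and the sample code are invented for the example.

```python
import ast


def find_symbol(source, name):
    """Return line numbers where `name` is defined vs. where it is called."""
    definitions, usages = [], []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.name == name:
                definitions.append(node.lineno)
        elif isinstance(node, ast.Call):
            func = node.func
            # Plain calls are ast.Name; method calls are ast.Attribute.
            if isinstance(func, ast.Name):
                called = func.id
            else:
                called = getattr(func, "attr", None)
            if called == name:
                usages.append(node.lineno)
    return {"definitions": sorted(definitions), "usages": sorted(usages)}


sample = """
def checksum(block):
    return sum(block) % 256

print(checksum([1, 2, 3]))
"""
result = find_symbol(sample, "checksum")
```

A filter such as "function implementation" would then return only the `definitions` entries, while a plain term search would return both lists.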