
Isn't code search more table stakes nowadays? Both github and gitlab have it.

There isn't anything said about end to end, integration or functional testing here. I'm in a world where everyone hacks their own system together onto the same runtime, leading to some wonky outcomes and lots of operational support. Would be interesting if there was a 'google' way to do it.




Last I checked, GitHub's code search was so bad that it's useless. There's definitely a use for good code search.


Search on GitHub is very bad IMO.

The recent semantic references stuff they've added is helping, but that only seems to be available for certain languages/setups and doesn't work cross-repo. Google's xref system allowed you to browse essentially the whole Google codebase - it was amazing. Third-party code was indexed too; I remember my team used code search to track down a bug in NGINX once.

GitHub's normal search feature is bad. I can't even quote stuff for exact matches. I usually end up using BigQuery for GitHub-wide searches [1] or just pull down the repo and grep locally.

[1]: https://www.google.com/amp/s/cloudblog.withgoogle.com/produc...


I use Sourcegraph to search GitHub code most of the time because GitHub's search is awful. Since it has most popular repos indexed already, and it'll clone new ones that you point it at, it's quite handy.


It’s not useless but it leaves much to be desired.

I was recently able to use it to find all repos in my org using git-lfs by searching for .gitattributes with certain properties.

And I was able to search all projects for a particular secret string.
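For anyone without org-wide search, the git-lfs check can be approximated against local clones. A minimal sketch, assuming every repo is already checked out as a subdirectory of one root folder; the function name `find_lfs_repos` and the `filter=lfs` test are illustrative choices, not anything GitHub's search does internally:

```python
import os

def find_lfs_repos(root):
    """Return sorted top-level directory names under `root` that contain
    a .gitattributes file with a non-comment line mentioning filter=lfs."""
    hits = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        if ".gitattributes" not in filenames:
            continue
        attr_path = os.path.join(dirpath, ".gitattributes")
        with open(attr_path, encoding="utf-8", errors="replace") as fh:
            lines = [ln.strip() for ln in fh]
        # Skip comment lines so a commented-out LFS rule does not count.
        if any("filter=lfs" in ln for ln in lines if not ln.startswith("#")):
            # Assumption: the first path component under root is the repo name.
            # A .gitattributes directly in root is reported as ".".
            rel = os.path.relpath(dirpath, root)
            hits.add(rel.split(os.sep)[0])
    return sorted(hits)
```

Calling `find_lfs_repos("/path/to/checkouts")` returns the matching repo directory names; the same walk could look for any other string by swapping the `filter=lfs` check.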


(Disclaimer: Googler)

Internal code search is miles ahead of GitHub/GitLab search and is super fast and reliable. I have not used Sourcegraph (which everyone seems to talk about), but in my past experience at other companies nothing comes close.


I don't know what Google's internal code search is like but if you want to see Chromium's code search it's here

https://cs.chromium.org

also android's

https://cs.android.com/

So you can compare


Nice. It looks like a subset of the internal tool. Even my favorite shortcut is working (l r - reference to current line in current commit)

https://source.chromium.org/chromium/chromium/src/+/master:i...


> Both github and gitlab have it

If you want to search code sitting on your local hard disk try this tool (built on top of Lucene):

https://github.com/Rajeev-K/eureka

I made it when I was frustrated by existing tools such as sourcegraph and opengrok.


Google used to offer code search for public repos but then, predictably, they abandoned it. Now it's only used for Google-owned/run projects. But when it was around it was really amazing. GitHub's code search is quite lame in comparison. I often just run ripgrep locally instead.


regarding google-associated projects: that's true enough, but note that you can e.g. navigate a version of LLVM at

https://cs.android.com/android/platform/superproject/+/maste...


Google’s internal code search is far ahead of GitHub, Gitlab, and even sourcegraph. But you can use it, because they open-sourced it. I don’t know why more people don’t use it.

https://kythe.io/


Author of the post here. I would say that technically Kythe is the open-source version of part of Google's internal code search (specifically, the component that provides precise code navigation), but it doesn't include the search index or the UI. So it's not on its own an end-user product.

That having been said, I love Kythe, and we've actually considered using it as a semantic backend for Sourcegraph (and still might in the future). For the time being, we're using indexers that emit LSIF (https://lsif.dev). This allows us to build on top of the substantial body of work provided by the many open-source language servers (https://microsoft.github.io/language-server-protocol). But Kythe has a far richer schema that can capture all sorts of useful relationships in code. It's awesome and I wish more people were building indexers for it.


Maybe sourcegraph just doesn't exploit the real power of language server indexes, but the C++ language server seems pretty impoverished compared to Kythe. If I ask sourcegraph to find all references to absl::string_view::string_view(const char* str) it instead finds the substring `string_view` in any context, which is quite a useless result. Kythe gives me the actual call sites of that function signature and not the other forms, and Kythe knows the difference between absl::string_view and std::string_view.

Is it just a case of the visible implementation being a bit behind the ultimate capability of the system?
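To make the distinction concrete, here is a toy sketch of substring search versus a reference lookup keyed on the resolved symbol. Everything in it is invented for illustration: the `Occurrence` record, the file names, the source lines and the signature strings are hypothetical, and this is neither Kythe's schema nor Sourcegraph's implementation. It only assumes that some indexer has already resolved each occurrence to a fully qualified symbol.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Occurrence:
    path: str
    line: int
    text: str    # the source line as written
    symbol: str  # fully qualified target, as a compiler-backed indexer might resolve it

# Hand-written, hypothetical index entries.
OCCURRENCES = [
    Occurrence("a.cc", 3, 'absl::string_view name("x");',
               "absl::string_view::string_view(const char*)"),
    Occurrence("a.cc", 9, 'std::string_view other("y");',
               "std::string_view::string_view(const char*)"),
    Occurrence("b.cc", 4, "absl::string_view copy(s);",
               "absl::string_view::string_view(const std::string&)"),
    Occurrence("b.cc", 7, "// string_view is cheap to copy", ""),
]

def text_search(needle):
    """Substring match on the raw line: hits comments and every overload."""
    return [(o.path, o.line) for o in OCCURRENCES if needle in o.text]

def find_references(symbol):
    """Exact match on the resolved symbol: only that one signature."""
    return [(o.path, o.line) for o in OCCURRENCES if o.symbol == symbol]
```

With this data, `text_search("string_view")` returns all four locations, while `find_references("absl::string_view::string_view(const char*)")` returns only the one call site in `a.cc`. The hard part, of course, is producing the `symbol` column, which is what a compiler-backed indexer is for.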


(kythe googler here)

Like beliu said, the Kythe schema is far richer; it has a fully abstract semantic layer in the graph, and is a superset of what can be represented with LSIF. It's not tied to specific text regions -- there are representations of symbols/functions/classes/variables/types that do have pointers to/from text regions.

Note that because of the richness and abstractness, it's theoretically feasible to drive much more than code navigation from the Kythe graph.

And yes, the open source project is just part of it. The large-scale pieces are basically (1) do an instrumented build, (2) run it through the Kythe indexers, (3) post-process the output for serving.

The Kythe OSS project offers solutions for (2) for C++/Java/Go/Typescript/protobuf (and early Rust support). We do have plans to open source support for at least some other languages at some point in the future. (Hedging as best I can here.) Note that the best candidates for Kythe indexing are those languages that admit solid static analysis.

(1) is inextricably tied to the build system. Bazel support should be nearly turnkey; other systems require more (maybe significantly more) work.

There's not-full-scale support for (3) available. (Clearly we use something far more sophisticated internally.) While we'd like to see this fleshed out, expansion of that will depend on non-trivial community contributions.



