
Kythe: A pluggable ecosystem for building tools that work with code - aseidl
http://www.kythe.io/
======
beliu
Kythe seems like an awesome project, and kudos to Google for releasing this in
the open.

For those interested in code analysis and dev tools, another library you might
want to check out is srclib (I'm one of the authors). srclib is an open-source
polyglot code analysis library designed for editors and code explorers. Its
mission, providing a common language-independent schema for building
better language-aware tools, is closely aligned with Kythe's. There's
documentation and a succinct description of the problem we're trying to solve
at [https://srclib.org](https://srclib.org).

srclib currently supports Go, Java, Python, JavaScript, Ruby, Haskell, and
soon PHP. There's a simple command line API that editor plugins can call, and
currently there are srclib plugins for Emacs, Sublime, and Atom. srclib also
powers [https://sourcegraph.com](https://sourcegraph.com).

I'm looking forward to seeing where Kythe goes and hopefully integrating Kythe
and srclib. I think this is a huge step forward toward better tools for
programmers. Just ask anyone who works/used to work at Google about the
quality of their internal dev tools vs. the outside world. Thanks to the Kythe
team for sharing this with the world!

~~~
Meai
My use case is that I have C# and Java code that call each other, and I need
some kind of checker that validates "yes, the C# method signature is still
compatible with the Java method signature". It sounds like Kythe is the right
tool for that. What is your take? Does srclib support something like
this too? It's not as easy as it sounds, because the C# or Java method
signature may have attributes which alter the compatibility in custom ways, so
I can't just check "yes, all 3 params are compatible and the name is correct".
I'd have to plug in some custom logic that takes into account what the
attribute does. What do you think?

~~~
beliu
It's hard for me to say without more details about your specific problem.
srclib does provide a Data field where information like method signatures is
typically emitted, so you can compare the signature for the C# method with
that of the Java one, but I'm not sure if that includes the other attributes
you need to know for your problem.

If there's a lot of custom logic, then it might be better to write an ad hoc
tool that checks the AST of the Java against the AST of the C#. srclib and, I
think, Kythe as well are designed for building language-agnostic tools rather
than for digging into specific language behavior.
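
To make the "ad hoc tool" idea a bit more concrete, here is a rough sketch in Python of what such a checker could look like once both sides have been reduced to a common shape. Everything in it is an assumption made for illustration: the `Signature` record, the shared type names like `int32`, and the `Rename:<name>` attribute are hypothetical and are not part of srclib's or Kythe's schema. The point is only that attribute handling can be plugged in as small rules that run before the comparison.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


# Hypothetical normalized signature; NOT srclib's or Kythe's actual schema.
@dataclass(frozen=True)
class Signature:
    name: str
    params: Tuple[str, ...]            # parameter types in a shared vocabulary
    returns: str
    attributes: Tuple[str, ...] = ()   # e.g. "Rename:getUser" (made-up attribute)


# A rule rewrites a signature to reflect what one attribute does.
Rule = Callable[[Signature, str], Signature]


def rename_rule(sig: Signature, attr: str) -> Signature:
    # "Rename:<new>" means the method is exposed under a different name.
    new_name = attr.split(":", 1)[1]
    return Signature(new_name, sig.params, sig.returns, sig.attributes)


# Custom logic is plugged in here, keyed by attribute name.
RULES: Dict[str, Rule] = {"Rename": rename_rule}


def effective(sig: Signature) -> Signature:
    """Apply every known attribute rule; unknown attributes are ignored."""
    out = sig
    for attr in sig.attributes:
        rule = RULES.get(attr.split(":", 1)[0])
        if rule is not None:
            out = rule(out, attr)
    return out


def mismatches(a: Signature, b: Signature) -> List[str]:
    """Return human-readable differences between two effective signatures."""
    ea, eb = effective(a), effective(b)
    problems = []
    if ea.name != eb.name:
        problems.append(f"name: {ea.name} != {eb.name}")
    if ea.params != eb.params:
        problems.append(f"params: {ea.params} != {eb.params}")
    if ea.returns != eb.returns:
        problems.append(f"returns: {ea.returns} != {eb.returns}")
    return problems


csharp = Signature("GetUser", ("int32", "string"), "string", ("Rename:getUser",))
java = Signature("getUser", ("int32", "string"), "string")
print(mismatches(csharp, java))
```

The hard part this sketch skips is producing those `Signature` records from real C# and Java sources, and that is where a per-language AST tool (or an indexer's output, if it carries enough detail) would come in.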

------
mwcampbell
This sounds like the Grok project that Steve Yegge described in his post about
software conservatives and liberals
([https://plus.google.com/110981030061712822816/posts/KaSKeg4v...](https://plus.google.com/110981030061712822816/posts/KaSKeg4vQtz)).
Anybody know if it's the same project?

I think an interesting possible application of this tool would be source-to-
source compilation between languages. For example, once Objective-C support is
added, could Kythe be the basis for something like j2objc?

~~~
creachadair
Kythe is indeed based on the Grok project that Steve described in
[https://www.youtube.com/watch?v=KTJs-0EInW8](https://www.youtube.com/watch?v=KTJs-0EInW8)

------
mattj
What does this do? I've browsed through the site for a few minutes, and still
have no idea what kind of tools you could build with this that you couldn't
build before.

Is this for cross-language doc generation? Refactoring tools? Something else?

Are there any concrete examples of a tool built on top of this that would
otherwise be impossible / very difficult?

~~~
creachadair
You can build the same tools as before -- the purpose of Kythe isn't to
fundamentally change the kinds of tools you can make, it's intended to make it
easier to glue those tools together.

Google uses this approach internally to generate cross-references for a huge,
heterogeneous multi-language codebase. Linking across generated code,
connecting documentation to its references, and exposing all those features in
editors, code browsers, code-review tools, and so forth, are all a lot easier
when that information has a common representation.

And of course, those problems exist even in much smaller codebases. Kythe
isn't really a "product", but rather an interlanguage for tools that
manipulate source code.

~~~
vezzy-fnord
_Kythe isn't really a "product", but rather an interlanguage for tools that
manipulate source code._

That's a concise explanation. Thanks.

Of course, the bottleneck is always in achieving widespread integration with
existing tooling. Your overview lists requirements for compiler and build
system instrumentation alike, as well as tools that then consume and filter
the graph data. It'll be interesting to see if Kythe gains the needed
mindshare for this.

~~~
creachadair
It will indeed be interesting to see. :)

You're right that the work needed to connect (say) a compiler or static
analyzer to (say) an editor is usually substantial.

Right now, projects typically duplicate this work over and over again, for
each combination of language and editor. We've found that for a lot of common
cases you can re-use the work you did to instrument a given compiler and/or
editor for others to mix and match, if they can agree on a format for the
data.

Obviously this doesn't work for every such problem, but in our experience it's
surprisingly effective for most of the day-to-day tasks engineers need to
solve, such as figuring out what will break if I commit this change to the
repo.
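
As a rough illustration of the "agree on a format" point, here is a small Python sketch. It assumes a deliberately simplified, hypothetical edge record (`source`, `kind`, `target`) and made-up node names; it is not Kythe's actual schema or API. Two imaginary indexers for different languages emit edges in the same shape, and one generic function answers "what might break if this node changes" without knowing which language produced which edge.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, NamedTuple, Set


class Edge(NamedTuple):
    """Hypothetical shared record: `source` depends on or refers to `target`."""
    source: str
    kind: str    # e.g. "calls" or "references"
    target: str


def build_reverse_index(edges: Iterable[Edge]) -> Dict[str, Set[str]]:
    """Map each target to the set of nodes that refer to it directly."""
    reverse: Dict[str, Set[str]] = defaultdict(set)
    for edge in edges:
        reverse[edge.target].add(edge.source)
    return reverse


def affected_by(changed: str, edges: Iterable[Edge]) -> List[str]:
    """Return every node that transitively depends on `changed`."""
    reverse = build_reverse_index(edges)
    seen: Set[str] = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dependent in reverse.get(node, ()):
            if dependent != changed and dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)


# Edges as two separate (imaginary) indexers might emit them.
JAVA_EDGES = [Edge("java:Client.fetch", "calls", "proto:UserRequest")]
GO_EDGES = [
    Edge("go:server.Handle", "references", "proto:UserRequest"),
    Edge("go:main.main", "calls", "go:server.Handle"),
]

print(affected_by("proto:UserRequest", JAVA_EDGES + GO_EDGES))
```

With a shared shape like this, supporting N languages and M tools takes roughly N indexers plus M consumers, instead of one integration for each combination of language and tool.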

------
padator
A similar effort by Facebook, open sourced 4 years ago (I'm one of the authors):
[http://github.com/facebook/pfff/wiki/Main](http://github.com/facebook/pfff/wiki/Main)
with indexers for PHP, C, Java, OCaml, and preliminary support for many other
languages.

------
chuckcode
Seems like most code editors these days have reached that Microsoft Excel
point where most of the requested features are already present and it is a
matter of usability and better ways to help users learn these inherently
complicated tools. I'm constantly surprised at how many really bright people
aren't using their debugging and profiling tools effectively.

The big features I'd like to see are more around collaboration and remote
execution. The ability to easily share, search, and remotely debug a big stack
would be great. GitHub has taken some big steps forward on that, but I'd love
to wrap that up into the editor, with use cases like natively connecting to a
coworker's editor to see what is failing or to review some code.

------
arafalov
Screenshots (of UI example) would be nice.

Is there search built-in or planned? I see some discussion in the storage
format section, but only as a negative statement.

------
endergen
It's not very clear at all what the vision is and how this is supposed to be
used. I can make guesses, but some clarity would be great if any of the Google
people involved are in this thread.

~~~
endergen
My bad. Didn't notice the overview:
[http://www.kythe.io/docs/kythe-overview.html](http://www.kythe.io/docs/kythe-overview.html)

------
dang
We changed the url from
[http://google-opensource.blogspot.com/2015/01/kythe-new-appr...](http://google-opensource.blogspot.com/2015/01/kythe-new-approach-to-making-developer.html)
to the canonical project page. It also links to
[http://www.kythe.io/docs/kythe-overview.html](http://www.kythe.io/docs/kythe-overview.html).

------
justinsb
This seems a much clearer overview:
[http://www.kythe.io/docs/kythe-overview.html](http://www.kythe.io/docs/kythe-overview.html)

~~~
akavel
The [http://www.kythe.io/](http://www.kythe.io/) landing page seems to have
the one-sentence summary:

 _"Kythe_

 _A pluggable, (mostly) language-agnostic ecosystem for building tools that
work with code."_

For me, this is the best of all the listed "overviews", albeit still not fully
clear.

------
ihsw
Seems like a complement to protobuf -- Kythe describes code, protobuf
describes data. Neat.

