
Toward a URL for every function - joeyespo
https://text.sourcegraph.com/a-url-for-every-function-in-the-world-83cf36dfcddb#.s8nn5szeh
======
sqs
Sourcegraph founder here. We built this to make it much easier to grok code.
It saves us hours every day. Would love to hear your feedback!

The README has some good links to try Sourcegraph at
[https://github.com/sourcegraph/sourcegraph/blob/master/READM...](https://github.com/sourcegraph/sourcegraph/blob/master/README.md):

[https://sourcegraph.com/github.com/square/okhttp/-/def/JavaA...](https://sourcegraph.com/github.com/square/okhttp/-/def/JavaArtifact/com.squareup.okhttp3/okhttp/-/okhttp3/Request:type/Builder:type/method:java.lang.String:okhttp3.RequestBody)
(semantic code browsing for Java)

[https://sourcegraph.com/github.com/golang/go/-/info/GoPackag...](https://sourcegraph.com/github.com/golang/go/-/info/GoPackage/net/http/-/NewRequest)
(http.NewRequest used in 8801 repositories)

Sourcegraph supports Go and Java right now. If you want to get access to the
upcoming beta of JavaScript, Python, or other languages, send us a note at
support@sourcegraph.com or
[https://twitter.com/srcgraph](https://twitter.com/srcgraph).

~~~
michaelmior
> These URLs always refer to the latest definition and won’t break if the file
> is edited, unlike links to a specific line number.

You can always get the URL on GitHub for a particular line number _at a
particular revision_ which won't break when the file changes. A persistent
link to a function can break in a different way: the function may keep its
name but change behaviour, so that it no longer does exactly what it
originally did. It's not obvious to me which of these
breakages is more relevant.
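
For concreteness, a revision-pinned GitHub link has the shape
`https://github.com/<owner>/<repo>/blob/<commit>/<path>#L<line>`. A minimal
Python sketch follows; the helper name `github_permalink` is made up for
illustration, and the file path and line number in the usage line are
placeholders, not the real location of any function.

```python
def github_permalink(owner: str, repo: str, commit: str, path: str, line: int) -> str:
    """Build a GitHub blob URL pinned to a commit, so the line anchor cannot drift."""
    return f"https://github.com/{owner}/{repo}/blob/{commit}/{path}#L{line}"


# Placeholder path and line number, used only to show the URL shape.
print(github_permalink("gorilla", "mux",
                       "9fa818a44c2bf1396a17f9d5a3c0f6dd39d2ff8e",
                       "mux.go", 1))
```

Such a link always shows the same bytes, but it says nothing about where the
function lives in later revisions.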

When you talk about "hackable" URLs it would be great to be able to get a URL
for a named function at a particular revision. This solves both problems. I
have an immutable reference to a particular piece of code, but then by hacking
the URL I should be able to still see the most recent version.

~~~
beliu
Exactly! You mean like this?
[https://sourcegraph.com/github.com/gorilla/mux@9fa818a44c2bf...](https://sourcegraph.com/github.com/gorilla/mux@9fa818a44c2bf1396a17f9d5a3c0f6dd39d2ff8e/-/def/GoPackage/github.com/gorilla/mux/-/CurrentRoute)
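
Judging only from the links in this thread, the revision sits right after the
repository path as `@<revision>`, just before the first `/-/`. Under that
assumption (inferred from the examples, not taken from Sourcegraph
documentation), "hacking" a URL between pinned and latest could be sketched in
Python like this; `set_revision` is a hypothetical helper name.

```python
import re

# Assumed shape, inferred from the example links only:
#   https://sourcegraph.com/<repo>[@<revision>]/-/<rest>
_DEF_URL = re.compile(r"^(https://sourcegraph\.com/)([^@]+?)(?:@([^/]+))?(/-/.*)$")


def set_revision(url, revision=None):
    """Return url pinned to `revision`, or unpinned (latest) when revision is None."""
    match = _DEF_URL.match(url)
    if match is None:
        raise ValueError(f"unrecognised URL shape: {url}")
    prefix, repo, _old_revision, rest = match.groups()
    pinned_repo = f"{repo}@{revision}" if revision else repo
    return prefix + pinned_repo + rest


pinned = ("https://sourcegraph.com/github.com/gorilla/mux"
          "@9fa818a44c2bf1396a17f9d5a3c0f6dd39d2ff8e"
          "/-/def/GoPackage/github.com/gorilla/mux/-/CurrentRoute")
print(set_revision(pinned))  # same definition, revision removed
```

Dropping the `@...` part is what turns the immutable reference back into a
"latest version" link.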

~~~
michaelmior
Exactly :) Cool! Although it seems weird to me to have the revision in the
middle of the URL.

~~~
sqs
Where would you want the revision? At the very end? Right now, it's associated
with the repo, which makes sense and is easier on our URL routing. But I'm
curious to hear what you'd prefer.

~~~
michaelmior
Actually on further thought maybe this does make more sense. I don't really
have a strong preference either way anyway.

------
majewsky
When I read the title, I was imagining something more theoretical, e.g. a URL
encoding of lambda calculus terms.

~~~
gavinpc
Same here. As others have indicated, that would be more consistent with the
title (and more interesting).

The more general idea is a content-addressable function repository, where, as
you point out, code would have to be in some kind of normal form. Joe
Armstrong toys with this idea in his talk "The mess we're in," one of my
favorites. [0]

[0]
[https://www.youtube.com/watch?v=lKXe3HUG2l4&t=33m10s](https://www.youtube.com/watch?v=lKXe3HUG2l4&t=33m10s)
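
A toy illustration of the content-addressing idea, in Python: hash a
function's parsed syntax tree rather than its text, so that formatting and
comments do not change the address. Treating the AST dump as the "normal form"
is an assumption made for this sketch, and `function_address` is a made-up
name. It is far weaker than a real normal form, since renaming a variable
still changes the hash.

```python
import ast
import hashlib


def function_address(source: str) -> str:
    """Hash the syntax tree of `source`, ignoring layout, comments and positions."""
    tree = ast.parse(source)
    # include_attributes=False leaves out line and column numbers.
    canonical = ast.dump(tree, include_attributes=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = "def add(x, y):\n    return x + y\n"
b = "def add(x,y):  # sum two numbers\n\n    return (x + y)\n"
c = "def add(x, y):\n    return x - y\n"

print(function_address(a) == function_address(b))  # same tree, different text
print(function_address(a) == function_address(c))  # different behaviour
```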

~~~
genericpseudo
My semantic-web-loving cold dead heart disagrees with you on the "more
interesting".

The one really good idea in the whole Semantic Web train-wreck (and at this
distance, I think it's fair to call it that) was that everything should have a
negotiable, dereferenceable URL. REST includes that core principle.

A lot of single-page web apps make me sad because they've been designed
without reference to that. If what you're building is genuinely an
application, then I get it; but most of the time what you're building is a
catalogue, and everything in that _can_ have an address, so it _should_.

~~~
gavinpc
I agree 100% about catalogues and addressability. I'm also a diehard in that
respect, and I've come to distinguish "apps" in a similar manner, as sites
where either there's no "business case" for an ontology, or the content is too
transient for it to matter.

Naming is hard... I think that's what makes the "semantic web" a kind of
chimera even for those who endorse it. Can a URL really capture the worldwide
identity of a thing? Over all time? And who's going to maintain all those
names?

Suppose that naming everything is intractable as a human effort, but we still
want addressability. Alan Kay takes this to the extreme, saying, why not give
every object on the internet an IP address? [0] Not every resource, every
_object_ , in every program. It sounds facetious, but it's consistent with his
general objects-as-computers-all-the-way-down view. His system designs
(including those from VPRI) express the belief that hard barriers between the
layers of a system (usage and application, application and framework,
framework and OS) account for much of today's uncontrollable code bloat and
the limits on how much scale systems can tolerate. The "everything gets an IP
address" idea is just a recognition that network boundaries will eventually be
seen the same way. From this perspective, it might be fruitful to think about
how we'd identify things on the internet if they were homogeneous with the
objects in our applications.

[0] It's in one of his talks but I don't remember which.

------
alpyne
Sourcegraph folk, are you aware of Rich Hickey's codeq [0][1] for Clojure:

 _codeq allows you to track change at the program unit level (e.g. function
and method definitions) and query your programs and libraries declaratively,
with the same cognitive units and names you use while programming_

[0]
[http://blog.datomic.com/2012/10/codeq.html](http://blog.datomic.com/2012/10/codeq.html)

[1]
[https://github.com/Datomic/codeq#codeq](https://github.com/Datomic/codeq#codeq)

~~~
jpitz
It lacks the ability to track changes, but BBQ
[http://browsebyquery.sourceforge.net/](http://browsebyquery.sourceforge.net/)
can query JVM and CLR programs - and holy crap is that useful when you need
it.

Speaking of tracking changes at the method level, does anyone remember
VisualAge for Java?

------
malchow
Here's an example:
[https://sourcegraph.com/github.com/golang/go/-/def/GoPackage...](https://sourcegraph.com/github.com/golang/go/-/def/GoPackage/encoding/json/-/MarshalIndent)

~~~
sqs
Sourcegrapher here. And if you want to see everywhere in the world that
function is used, check out the usages list on the right side, which takes you
here:
[https://sourcegraph.com/github.com/golang/go/-/info/GoPackag...](https://sourcegraph.com/github.com/golang/go/-/info/GoPackage/encoding/json/-/MarshalIndent).

------
skybrian
For Go in particular, a possible alternative is godoc.org:

[https://godoc.org/flag#Arg](https://godoc.org/flag#Arg)

~~~
Nullabillity
And Java has Javadoc, and so on. This seems interesting in that it's not just
about the semantic docs, but actually provides an IDE-like source view in the
browser. The closest I've seen before would probably be SXR/Scala X-Ray[1],
but this seems much more polished.

[1]: [http://www.scala-sbt.org/0.13/sxr/CrossVersionUtil.scala.htm...](http://www.scala-sbt.org/0.13/sxr/CrossVersionUtil.scala.html)

~~~
skybrian
For embedding into documentation, there's also a question of longevity. Which
URL is less likely to break?

------
alberto_balsalm
Some of you may find Unison interesting:
[http://unisonweb.org/2015-05-07/about.html#post-start](http://unisonweb.org/2015-05-07/about.html#post-start)

------
Shendare
A URL for every function on GitHub, at least. Cool idea.

~~~
oh_sigh
It appears to be namespaced, so presumably sourcegraph could add more repos at
their own leisure.

~~~
sqs
Sourcegrapher here. Indeed. Sourcegraph has repositories hosted elsewhere,
such as
[https://sourcegraph.com/bitbucket.org/gotamer/bbpost/-/def/G...](https://sourcegraph.com/bitbucket.org/gotamer/bbpost/-/def/GoPackage/bitbucket.org/gotamer/bbpost/-/main.go/Options).
These are picked up by automated backend processes; if you have a Git
repository that you'd like to specifically add, just email us
(support@sourcegraph.com) or Tweet at us
([https://twitter.com/srcgraph](https://twitter.com/srcgraph)) for now.

~~~
pc86
Any plans to support Hg or TFS?

~~~
sqs
Yep, we will, but we don't have a timeline for those right now. Any VCS that
can implement this Repository interface
([https://sourcegraph.com/sourcegraph/sourcegraph/-/def/GoPack...](https://sourcegraph.com/sourcegraph/sourcegraph/-/def/GoPackage/sourcegraph.com/sourcegraph/sourcegraph/pkg/vcs/-/Repository))
is fine. We have some code written to support Hg, but nothing ready to release
yet.

------
heynk
This should be a great long-tail SEO boost. It's just like one of (Rap)
Genius's best early SEO advantages, which was that they had a URL for every
line.

[https://moz.com/blog/how-i-would-do-seo-for-rap-genius](https://moz.com/blog/how-i-would-do-seo-for-rap-genius)

------
z3t4
Most JS programmers seem to use modules (require/import) as masqueraded
globals, like importing complex functions instead of just standalone
modules. And in that case it's better to just declare all dependencies in the
root (html file). You would probably want to use a package manager though, to
keep track of name conflicts and manage the script tags (dependencies of
dependencies).

As for central hosting of packages I think it will work. But we will probably
need to be able to have many src attributes in script-tags for redundancy and
optimal caching.

------
zeveb
Neat idea, although I'm not sold on the style of the URLs themselves. It'd be
cool to introduce a new URL scheme:

    
    
        code://github.com/edicl/hunchentoot/master/log.lisp?macro=with-log-stream
    

That would handle multiple definition namespaces. One could use

    
    
        code://github.com/edicl/hunchentoot/master/log.lisp?macro=with-log-stream&commit=0951a0df8fe93d99e6f2aa3f9612a2d6e581e84f
    

to refer to a particular commit. No idea what the equivalent would look like
for other VCSes though.
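
As a rough sketch of how a client might take such a URL apart, here is a small
Python parser for the hypothetical `code://` scheme proposed above. The
function name `parse_code_url` and the returned field names are invented for
illustration. Where the repository ends and the branch and file path begin is
ambiguous in the proposal, so the path is left unsplit.

```python
from urllib.parse import parse_qs, urlsplit


def parse_code_url(url):
    """Split a proposed code:// URL into host, path, definition kind/name, commit."""
    parts = urlsplit(url)
    if parts.scheme != "code":
        raise ValueError(f"not a code:// URL: {url}")
    query = parse_qs(parts.query)
    # The optional commit parameter pins the reference to one revision.
    commit = query.pop("commit", [None])[0]
    if len(query) != 1:
        raise ValueError("expected exactly one definition parameter, e.g. macro=...")
    kind, names = next(iter(query.items()))
    return {
        "host": parts.netloc,
        "path": parts.path.lstrip("/"),
        "kind": kind,        # definition namespace, e.g. "macro"
        "name": names[0],
        "commit": commit,
    }


ref = parse_code_url(
    "code://github.com/edicl/hunchentoot/master/log.lisp"
    "?macro=with-log-stream&commit=0951a0df8fe93d99e6f2aa3f9612a2d6e581e84f"
)
print(ref["kind"], ref["name"], ref["commit"])
```

Using the query key as the namespace is what lets a macro and a function with
the same name get distinct addresses.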

~~~
asimuvPR
I've been toying with roughly the same idea for some months. I've come up with
new URL schemes for JSON objects[1], and (Iot)hings[2]. They differ in what
they do but the purpose is to explore how specific URL schemes could open the
door for improvements.

[1] json://the-domain.com/example

Would return:

    
    
        {"json":"data"}
    

[2] thing://ip-address/example

In this case, it returns JSON for the sake of readability, so:

    
    
        {"name":"car","speed":88,"location":"1985"}

~~~
zeveb
What would be the advantage of those schemes over HTTP? In the case of code, I
can see that there's a potential benefit to being able to refer to conceptual
objects (e.g. variables, functions, macros) within version control systems,
and _maybe_ that's worth breaking with HTTP URLs, although I'm not completely
sold there.

What would the advantage of json: be over Content-Type: TYPE+json?

It's a little easier to see that iot:UUID might indicate something like 'over
any number of protocols, over any number of networks, please contact this
device in my locality' or somesuch.

~~~
asimuvPR
I'm exploring if there is a benefit to using new protocols for networked
things. Mostly research. :)

------
ricardobeat
Please don't do this if you want future-proof URLs - that's the whole point of
linking to a specific commit. Functions and files will get moved, renamed,
refactored and deleted.

------
jesalg
Curious why they decided not to work on adding Ruby support, especially when
the underlying srclib, which they use, has support for it.

~~~
sqs
Sourcegrapher here. We'll definitely release Ruby support in the future.
Because Ruby has a lot of dynamic language features, it's important we do it
well, and it takes some time and thought. We track the coverage % of our analyzers
for all languages we support, and our Ruby support isn't at our quality
threshold yet.

We'll release Ruby when we can do it almost as well as Go (e.g.,
[https://sourcegraph.com/github.com/golang/go/-/def/GoPackage...](https://sourcegraph.com/github.com/golang/go/-/def/GoPackage/net/http/-/NewRequest)).

~~~
jesalg
Nice, looking forward to it!

------
danvoell
I'm not sure if I understand this correctly, but my first thought is, what if
the function that I am using needs to change? For instance, using css, if I
later discover that the design was incorrect, I would rather just change the
design code instead of updating each linking instance.

~~~
sqs
Sourcegrapher here. This is for linking to the source code of functions (and
other definitions), for when you are discussing or explaining code. It's not
for importing code at compile time or runtime.

P.S. Sourcegraph supports semantically linking to CSS as well:
[https://sourcegraph.com/sourcegraph/sourcegraph/-/def/basic-...](https://sourcegraph.com/sourcegraph/sourcegraph/-/def/basic-css/sourcegraph/-/app/node_modules/amplitude-js/documentation/styles/jsdoc-default.css%2523main).

~~~
danvoell
Got it, thanks for the clarification.

------
ed
It'd be cool if you could create a permalink from a github URL (with a line
no. param).

Then the interface could look like tinyurl (anonymously paste a github link,
get a sourcegraph link in return).

Bonus points if it simply redirects you to the new line number on GitHub's
master.

~~~
sqs
We have something like that—and better in some ways. Check out the Sourcegraph
Chrome extension:
[https://chrome.google.com/webstore/detail/sourcegraph-for-gi...](https://chrome.google.com/webstore/detail/sourcegraph-for-github/dgjhfomjieaadpoljlnidmbgkdffpack).
You get jump-to-def by clicking on
code on GitHub. And it uses these semantic URLs in a backward-compatible way,
so that they encode the definition/function name but also the line number (so
anyone not using the extension can still use the URLs).

It's a great idea to make a little app to get a permalink, too!

------
foota
Don't we already have this in the form of NPM?

------
waxjar
What happens if a function is renamed?

~~~
sqs
We don't handle that case right now, but it's a TODO. Will be an interesting
problem to address.

------
sandebert
...for an extremely narrow definition of "world". (GitHub)

~~~
sqs
Sourcegrapher here. Git repositories hosted anywhere with Go or Java code will
work. Admittedly, that's still narrow, but we are just beginning. :) Also, see
[https://news.ycombinator.com/item?id=11856255](https://news.ycombinator.com/item?id=11856255).

------
kazinator
Or, just embed the entire function _in_ the URL. :)

------
Roritharr
Hi, this looks very interesting, but it seems more like a feature I wish
GitLab, Bitbucket, etc. had than something worth a dedicated site.

~~~
sqs
Sourcegraph founder here. If you install the Chrome extension
([https://chrome.google.com/webstore/detail/sourcegraph-for-gi...](https://chrome.google.com/webstore/detail/sourcegraph-for-github/dgjhfomjieaadpoljlnidmbgkdffpack)),
you can get it on GitHub. Support
for GitLab and Bitbucket is coming soon.

And what's more, the Chrome extension actually adds # fragment URLs to
github.com that are semantically meaningful, just like on Sourcegraph.

------
nojvek
Would make a lot of Node modules that are just single functions redundant.

------
partycoder
You mean like RPC? If so take a look at Thrift, Protocol buffers, Avro, or
whatever.

If the idea is to just expose reusable functions, you can take a look at
[https://algorithmia.com/](https://algorithmia.com/)

