

URLs Are The Uniform Way to Locate Resources - Titanous
http://adam.blog.heroku.com/past/2010/3/30/urls_are_the_uniform_way_to_locate_resources/

======
pak
I never liked the idea of putting passwords into URLs... it just gives people
the wrong idea about how they should handle their password.

To me, URLs and passwords are orthogonal. One says "this is where it is", the
other says "let me in please".

------
earle
NO SHIT! This is the top link on Hacker News? Is this a fucking joke?

Why is -incorrect- cursory coverage of baseline RFCs Hacker News worthy?

~~~
psadauskas
Because it's become very apparent how few web developers understand "web". The
more articles like this that point them in the right direction, the better.

------
krainboltgreene
Except URIs aren't uniform. They're used in so many ways as to be confusing.
For instance:

"git://github.com/thoughtbot/paperclip.git" vs
"smtp://user:pass@hostname/domain" vs
"http://news.ycombinator.com/user?id=krainboltgreene" vs "chrome://history/"

The user name appears in the path, the username section, and the query params.
All of these are pretty "standard" uses. So while they might be called
"uniform" the reality is absolutely different.
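
To make the comparison concrete, here is a minimal sketch that runs the four examples above through Python's generic `urllib.parse.urlsplit`. The `describe` helper is a name invented for this illustration, and the parser only applies generic syntax; it knows nothing about what each scheme actually means.

```python
from urllib.parse import urlsplit, parse_qs

examples = [
    "git://github.com/thoughtbot/paperclip.git",
    "smtp://user:pass@hostname/domain",
    "http://news.ycombinator.com/user?id=krainboltgreene",
    "chrome://history/",
]

def describe(url):
    """Split a URL with the generic syntax only (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "username": parts.username,   # only set when "user@" precedes the host
        "host": parts.hostname,
        "path": parts.path,
        "query": parse_qs(parts.query),
    }

for url in examples:
    print(describe(url))
```

Only the smtp example yields a `username`; in the git example the account name is just a path segment, and in the news.ycombinator.com example it is a query value, so a generic parser cannot tell that either one names a user.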

~~~
jerf
"Except URIs aren't uniform."

Sure they are! They're structured data serialized to a binary blob that claims
to be a string but contains no encoding indication with guidance given by an
internet standard for the contents of the string before the first colon but no
guidance whatsoever after that! What's not uniform about that?

Normally I wouldn't be so pedantic but at the point where you're talking about
"mysql://myuser:mypass@db8.myhost.com:3306/mydatabase" as if it's some sort of
solution to a problem you've really dropped the ball.

URLs are only meaningful in a given semantic context. "http" and "https" are
meaningful because we all agree what they mean (see RFCs below). "git" is
meaningful-ish because there's only one thing that plausibly can be said to
give a definition, but your application does not magically gain any
understanding of the subsequent what-might-as-well-be-a-binary-blob merely by
virtue of sticking "git:" in front of it. "mysql" is simply meaningless. I use
Perl and therefore DBI and I observe that it too has a sort of "mysql URL" but
it looks nothing like Ruby's. Uniformity is relative to a _universal_
agreement about what it means, and mysql lacks that. For that matter, git may
very well lack it in the future, if someone implements other gits (and I've
seen attempts). Without that universal agreement, you don't have a URL. You
can't make a URL by fiat.
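
A rough illustration of the "mysql" point, assuming a typical Perl DBI data source name of the form `dbi:mysql:database=...;host=...;port=...` (the exact DSN here is an assumption, not taken from the article). Both strings have a scheme before the first colon, but a generic parser only finds a host in one of them:

```python
from urllib.parse import urlsplit

# URL form quoted earlier in this comment.
url_style = "mysql://myuser:mypass@db8.myhost.com:3306/mydatabase"
# Assumed DBI-style DSN for the same database; credentials are passed separately.
dbi_style = "dbi:mysql:database=mydatabase;host=db8.myhost.com;port=3306"

a = urlsplit(url_style)
b = urlsplit(dbi_style)

print(a.scheme, a.hostname, a.port, a.path)
print(b.scheme, b.hostname, repr(b.path))
```

For the DSN, everything after `dbi:` ends up in `path` as one undivided string and `hostname` is `None`, which is the "opaque blob after the colon" situation described above.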

For non-standard URLs, what follows the colon... and for that matter what
precedes it... is nothing more and nothing less than a binary blob in a
constrained character set. And this article should be treated exactly as if it
were an article about how all your resource location problems go away if you
just express them in terms of opaque binary blobs, because once you leave http
and https behind, that is what you are doing. Not "like" what you are doing,
it _is_ what you are doing, full stop. It may work for you and it may work for
your friends, but that is not by the magic of calling your binary blob a URL,
it is by the magic of agreeing to a way to interpret bits, and that's hardly
any sort of breakthrough.

(Just to be clear, this is vehement agreement that acknowledges that you got
to this point first.)

By the way, since I can already guess that someone will reply with something
like m0th87's point, I invite you to read the URL RFC:
<http://www.ietf.org/rfc/rfc1738.txt>. But read it _carefully_, for what it
_actually mandates_. Section 2.2: "_Many_ URL schemes reserve certain
characters for a special meaning"... none of them are universal to URLs, they
are all scheme-specific, which means you can't trust their meaning in
undefined schemes. Section 2.3: "_Some URL schemes_ ... contain names that
can be considered hierarchical"... / doesn't have a universal meaning, it's
relative to the scheme. 3.1 describes the double-slash, which I can now say
I've seen used incorrectly in both directions. Section 3.5 defining "mailto:"
observes that URLs aren't even necessarily resources. (Section 3.10 defines
file URLs in a way that violates the earlier discussion of double-slash. I
understand why, but it's still a violation.)

And if you want to talk URI (<http://www.ietf.org/rfc/rfc2396.txt> ), section
3 starts right off with "The URI syntax is dependent upon the scheme."

~~~
derefr
So, what you're saying, basically, is that URLs confer no special advantage
over URNs, which _are_ specified to just be binary blobs with a
scheme[/namespace] identifier attached.

I don't agree with this. URLs, in _practice_, are a standardized format,
predicated mostly on how HTTP has handled them. Any active, well-known URL
scheme will use

    scheme://username:password@host:port/resource/path?query=parameters&more=with%20percent%20encoding#and-fragment-identifier

as that is what we consider to _be_ a URL, no matter what the RFC says. And
that format is useful for encoding a great many things. Just because some
libraries have chosen to create things that _resemble_ URLs (such as MySQL, as
you mentioned), does not mean that they _are_ URLs as the term is
descriptively, not prescriptively, defined.

~~~
jerf
The scheme breaks down _in practice_ at the resource point, and even before
then is a stretch in some cases. But at the resource point it's all over.
There's no agreement about "mysql". And...

"Just because some libraries have chosen to create things that resemble URLs
(such as MySQL, as you mentioned), does not mean that they are URLs as the
term is descriptively, not prescriptively, defined."

As it turns out, that's a key part of the point I was making. This is why the
original article is silly.

~~~
derefr
RDBMSes, of course, _don't have_ hierarchically-organized "resources." That's
the whole point of the "R" in there. However, _the rest_ of the URL format
still applies to them. The resource and query-parameter parts of the URL are
indeed "a binary blob"—but that doesn't matter. Why?

The great thing about URLs, within a protocol like HTTP, is that they're
_discoverable_: if the protocol associated with your scheme guarantees some
sort of non-destructive querying operation on a resource (à la a GET or HEAD
operation), then you can retrieve the / resource of a server to get a
_sitemap_, and use the further, hyperlinked URLs to find all of the site's
resources in turn. You're _supposed_ to treat the resource and query parameter
parts of the URL as a binary blob; they're a token you give to the server to
get other tokens.

URLs let you turn your meaningless binary blob (a.k.a. a URN) into a (where to
ask, what to ask for, who I am) triple. This is a useful thing, even if the
"what to ask for" part is still opaque! It means that you can always
_dereference_ a URL starting from the Internet as a whole, whereas with a URN
the "where to ask" part has to be figured out on your own. If you combine this
with a discoverable protocol, you enable your URLs to be _spidered_ and thus
_indexed_, and then they can all be found _and used_ by anyone who has a
single graph edge pointing into your site. That's way better than, say, an
ISBN number, isn't it?

"mailto" and "file" and those other ones you mentioned _aren't_ URLs, _as the
term is commonly used_. They are, in practice, URNs: they consist of a scheme
(namespace identifier) and an opaque blob. They _don't_ decompose into a
(where to ask, what to ask for, who I am) triple.
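
The triple can be sketched in a few lines of Python. `Triple` and `to_triple` are hypothetical names for this illustration, and treating "has a host" as the test for "has a where to ask" is a simplifying assumption, not a rule from any RFC:

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit

class Triple(NamedTuple):
    where: str          # host[:port] to ask
    what: str           # opaque token handed to that host
    who: Optional[str]  # username, if one was given

def to_triple(url: str) -> Optional[Triple]:
    """Return the (where, what, who) triple, or None if there is no host."""
    parts = urlsplit(url)
    if not parts.hostname:
        return None  # nothing to contact: the rest is just an opaque blob
    where = parts.hostname
    if parts.port is not None:
        where = f"{where}:{parts.port}"
    what = parts.path + (f"?{parts.query}" if parts.query else "")
    return Triple(where, what, parts.username)

print(to_triple("http://alice@example.com:8080/a/b?x=1"))
print(to_triple("mailto:someone@example.com"))
print(to_triple("urn:isbn:0451450523"))
```

Under that assumption the http example decomposes cleanly, while the mailto and urn examples return `None` because nothing in them says where to ask.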

URLs are amazing, but to _call_ something a URL, it has to already _be_
uniform, and to decompose into one of those triples. So the original article
was silly (see the sqlite3 example—that's a URN right there)—but at the same
time correct. _Use URLs to locate your resources._ Just don't make something
up and _call_ it a URL; do the actual work of having a standard, uniform
string that has a resource and a location in it. And if you don't have the
weight to make your own standard and make everyone treat it as uniform? _Use
someone else's_. Use HTTP, even! REST is basically people realizing HTTP
guarantees nice things about URLs and taking advantage of that.

~~~
thwarted
_RDBMSes, of course, don't have hierarchically-organized "resources." That's
the whole point of the "R" in there._

Databases contain schemas, which contain tables, which contain columns. The
hierarchy just has fixed types and limited depth.

And then there's S3, which names a resource but doesn't imply any hierarchy,
since the namespace is actually flat, even though it looks like a hierarchical
path with / as a separator.
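
A small sketch of the S3 point in plain Python, with no S3 API involved: the keys live in one flat list, and the "directories" only appear when a listing groups keys by a prefix and a delimiter. The key names and the `list_prefix` helper are made up for illustration.

```python
# A flat namespace: every key is just a string, "/" has no special status.
keys = [
    "photos/2010/march.jpg",
    "photos/2010/april.jpg",
    "photos/index.html",
    "readme.txt",
]

def list_prefix(keys, prefix="", delimiter="/"):
    """Return (keys directly under prefix, common sub-prefixes)."""
    objects, common = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Something follows a delimiter: report only the shared prefix.
            common.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common)

print(list_prefix(keys))
print(list_prefix(keys, "photos/"))
```

Changing `delimiter` to another character gives a different apparent hierarchy over the same keys, which is why the path-like look does not imply real nesting.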

------
richcollins
I don't see why the json format that he mentioned couldn't easily be made
uniform through standard keys (protocol, port, host, path ... etc)
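
As a rough sketch of what that could look like, assuming the key names `protocol`, `user`, `password`, `host`, `port` and `path` (these keys and both helper functions are invented here, not taken from the article), a record with standard keys can round-trip to and from the URL form:

```python
import json
from urllib.parse import urlsplit, urlunsplit

def url_to_record(url):
    """Turn a URL into a dict with assumed standard keys."""
    p = urlsplit(url)
    return {
        "protocol": p.scheme,
        "user": p.username,
        "password": p.password,
        "host": p.hostname,
        "port": p.port,
        "path": p.path,
    }

def record_to_url(rec):
    """Rebuild the URL; does no percent-encoding, so only simple values work."""
    netloc = rec["host"]
    if rec.get("port") is not None:
        netloc += f":{rec['port']}"
    if rec.get("user"):
        cred = rec["user"]
        if rec.get("password"):
            cred += f":{rec['password']}"
        netloc = f"{cred}@{netloc}"
    return urlunsplit((rec["protocol"], netloc, rec["path"], "", ""))

url = "mysql://myuser:mypass@db8.myhost.com:3306/mydatabase"
as_json = json.dumps(url_to_record(url))
print(as_json)
print(record_to_url(json.loads(as_json)))
```

The two forms carry the same information here; the URL is simply the more compact serialization of the same agreed-upon fields.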

~~~
andrewtj
It could — I'm doing something kind of similar with a DNS service I'm
building. It has an HTTP interface which amongst other things exposes DNS-SD
services (which include host, protocol, port and service-specific key-value
pairs).

------
benkant
Surprised you didn't know.

