
Ask HN: Why does a url use ://? - quizbiz
Does anyone know why a URL is structured that way?
======
jacquesm
The ':' separates the scheme from the parts that are specific to the scheme.
The '//' indicates that what follows is a hostname and not a directory.
http:/test is a valid URL indicating a path relative to the machine that the
current resource came from, whereas <http://test> is a URL that specifies a
resource on a machine mapped to the TLD test. The double slash removes the
ambiguity. It really is two pieces: a ':' and a '//'.
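
To make the difference concrete, here is a minimal sketch using Python's standard urllib.parse module. It only shows how one generic parser splits the two strings; it is not a claim about what any particular browser does.

```python
from urllib.parse import urlsplit

# One slash after the colon: there is no host part, so "test" lands in the path.
single = urlsplit("http:/test")

# Two slashes after the colon: "test" is parsed as the host (netloc).
double = urlsplit("http://test")

for label, parts in (("http:/test", single), ("http://test", double)):
    print(f"{label}: scheme={parts.scheme!r} netloc={parts.netloc!r} path={parts.path!r}")
```

In the first case the host is empty and the path is "/test"; in the second the host is "test" and the path is empty.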

Other possible sources of confusion:

<http://test.com:80/>

You could then get:

http:/test.com:/

Position would give it away, but if you then complicate matters further by
relying on the default protocol (http in your browser), it looks like:

:/test.com:/

In many browsers

://test.com:/ is perfectly legal.

Of course, you could strip that down further by dropping the default port
colon to get:

://test.com/

See here for a much longer (and probably better :) ) explanation:

<http://tools.ietf.org/html/rfc1738>

I hope that helps !

------
makecheck
Technically, not all of them do; "<scheme>:" is the common prefix, and
everything else depends on the URL type.

See: <http://www.ietf.org/rfc/rfc1738.txt>

In particular, section 3.1 says: 'The scheme specific data start with a double
slash "//" to indicate that it complies with the common Internet scheme
syntax.' So it is used by URLs that require any of this information:
"//<user>:<password>@<host>:<port>/<url-path>".

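As a rough illustration of that layout, the sketch below splits a URL into the fields named above with Python's urllib.parse. The URL, user name, password and port are made up for the example; the mailto line shows a scheme that does not use the "//" form at all.

```python
from urllib.parse import urlsplit

# Hypothetical URL: user, password, host and port are invented for illustration.
parts = urlsplit("ftp://alice:secret@ftp.example.com:2121/pub/readme.txt")

fields = {
    "scheme": parts.scheme,
    "user": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,
    "url-path": parts.path,
}
for name, value in fields.items():
    print(f"{name:9} {value!r}")

# A scheme without "//": there is no host part, everything after ':' is the path.
mail = urlsplit("mailto:someone@example.com")
print(repr(mail.netloc), repr(mail.path))
```
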
~~~
crux
I should note that both Safari and Firefox understand the URL
'http:news.ycombinator.com/item?id=796434' as well.

~~~
there
But only in the URL bar; both turn a URL like that into a relative URL when
used in an img src, for example.
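
Python's urllib.parse.urljoin happens to apply a similarly lenient rule, so it can serve as a small sketch of that "treated as relative" behaviour. The base page below is hypothetical, and this is Python's resolver, not the code a browser runs.

```python
from urllib.parse import urljoin

# Hypothetical page that embeds the reference; it uses the same scheme (http).
base = "http://example.org/dir/page.html"
reference = "http:news.ycombinator.com/item?id=796434"

# Same scheme and no "//": the rest is resolved as a relative path on the base host.
resolved = urljoin(base, reference)
print(resolved)

# With a different base scheme, urljoin leaves the reference untouched.
other = urljoin("https://example.org/dir/page.html", reference)
print(other)
```

In the first case the hostname-looking text ends up as a directory under example.org, which is why the "//" matters outside the URL bar.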

------
TimothyFitz
Note that a URL like //www.example.com will use the context's protocol, which
means an img src="//www.example.com" will use http or https depending on which
one the page was loaded over. Very handy!
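
A minimal sketch of that resolution, using Python's urllib.parse.urljoin as a stand-in for the browser's resolver (the page URLs and image path are hypothetical):

```python
from urllib.parse import urljoin

reference = "//www.example.com/logo.png"  # hypothetical protocol-relative reference

# The same reference picks up whichever scheme the containing page uses.
for page in ("http://example.org/index.html", "https://example.org/index.html"):
    print(page, "->", urljoin(page, reference))
```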

------
vinoski
In a private conversation some years ago, Tim BL told me that he used to use
Apollo workstations back in the 1980s and that he really liked Apollo
Domain/OS, so he took the // from the Domain distributed filesystem, which
used // as a way of addressing possibly remote files, i.e.,
//hostname/path/to/file. I suspect the \\ in Microsoft UNC pathnames is also
derived from the same, probably due to Paul Leach's influence there as he was
also from Apollo.

------
kwantam
RFC 1738 section 2.1 specifies the colon between the scheme (e.g., "http") and
the scheme-specific-part (e.g., //news.ycombinator.com). The // is specified
in section 3.1 as part of the "common internet scheme syntax." Specifically,
the // is intended to identify the scheme-specific-part as complying with the
CISS.

------
mikeytown2
Kinda wondering, but doesn't a question like this belong on Stack Overflow?

~~~
I_got_fifty
Possibly. But these are the cool questions that make Hacker News Hacker News
and not Digg.

~~~
TeHCrAzY
I think it's the answers, more than the questions, that make Hacker News.

------
DXL
Tim Berners-Lee, the creator of the Web and of URLs, wrote some time ago that
he regretted the double slash in URLs, saying that one would suffice.

~~~
jacquesm
It went a little further than that; I think the scheme would have been:

http:/com/ycombinator/news/item?id=796434

~~~
Sapient
And that would have been awesome.

~~~
jacquesm
Yes, but there would be a couple of problems as well, specifically the DNS
would need some major revamping. The DNS was already operational long before
the web came along and it already used the '.' notation.

http:/com/test/www/someresource

and

http:/uk/co/test/www/someresource

Would have both been valid resources but it would be harder than now to figure
out where the machine boundary is located.

You can't go by 'count' (because of subdomains) and you can't go by www
either.

I think if they had gone that route, the // would have been 'reinvented' for
practical reasons, and it would probably be placed like this:

http:/com/ycombinator/news//some/path/resource.html

I'm not sure what the implications for phishing, certificates and humans
interpreting URLs would have been in that situation either, but I know that I
find it convenient to be able to fish the 'hostpart' out of a URL without
further knowledge on my side.
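
The machine-boundary problem can be shown with a small sketch. The function to_hierarchical below is a hypothetical helper written only for this illustration: it rewrites an ordinary URL into the reversed, slash-separated form discussed above, optionally with an extra separator between the host part and the path.

```python
from urllib.parse import urlsplit

def to_hierarchical(url, separator=""):
    """Rewrite scheme://host/path as scheme:/reversed/host/labels + path.

    Hypothetical notation from this thread, not a real standard.
    """
    parts = urlsplit(url)
    labels = parts.hostname.split(".")[::-1]  # www.test.com -> com, test, www
    rewritten = f"{parts.scheme}:/" + "/".join(labels) + separator + parts.path
    if parts.query:
        rewritten += "?" + parts.query
    return rewritten

# Two different hosts collapse into the same string without a separator.
a = to_hierarchical("http://www.test.com/someresource")
b = to_hierarchical("http://test.com/www/someresource")
print(a, b, a == b)

# An extra "/" between host and path makes the boundary recoverable again.
with_sep_a = to_hierarchical("http://www.test.com/someresource", separator="/")
with_sep_b = to_hierarchical("http://test.com/www/someresource", separator="/")
print(with_sep_a, with_sep_b)
```

Without the separator, both inputs become http:/com/test/www/someresource, so the host can no longer be picked out by position alone.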

~~~
rwmj
Heh, when _I_ first started using the internet, domain names in the UK were
"backwards". My first email address was:

rwj@uk.ac.dl.cxa

In fact the first "domain names" I used were things like "lancs.pdsoft" but I
think those were X.25 names, and the less said about X.25 the better.

At some point around '91 or '92 JANET reversed all the domain names to bring
them into line with IETF standards. This caused some confusion with names
beginning "cs." which could either be the Computer Science dept of some UK
university, or a domain in the old Czechoslovakia.

~~~
alain94040
With UUCP, we used to put machine names in the order they were needed for
routing, so in effect it was backward compared to today.

------
californiaguy2
The : separates the protocol and location, and // is the root level of a path.

~~~
smakz
<http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax>

------
dtby
<http://www.scottaaronson.com/blog/?p=101>

