<!-- On http://zack.is/history -->
<link rel="past-url" href="http://zackbloom.com/history.html">
<link rel="past-url" href="http://zack.is/history.html">
It's not that what we have now is perfect, but the use-case of typing in example.com or googling example is indeed a low enough bar that non-technical users get it, and re-tooling everything to support an alternate format is a non-starter. See also: IPv6 adoption, and not many people are typing in or googling IPs.
You never see an advert saying "visit our site at http://user:email@example.com/app/page1244.html?abc=def#hel..."; it's just "mycompany.com". The structure of the URLs once the user has hit the landing page and clicked a couple of links doesn't need to be known by the user, nor should the user care.
To put it simply: if it were hard, the web wouldn't be as big as it is now. Non-technical people understand simple URIs, and that's all that's needed.
I would say my biggest criticism of it is that it's rather intolerant of human error. A single mistyped character and you are not going to the website you intend. I think that's the reason searching for something on Google can actually be a better experience than trying to figure out the URL directly.
To your advertisement point, there was a long history of companies including the http:// and www in their ads. There's always this of course too: http://i.imgur.com/cfTxpd2.jpg
I think search engines are also powerful in that they can provide context - Google knows that when I search for C#, I am not looking for anything musical... but that inference is at odds with being able to pinpoint a specific content item out of a more or less infinite pile.
We were all forced to learn the technical constructs of mailing addresses and phone numbers, and URLs provide a similar service to a much larger problem set (so it's not surprising to me that they must be more complex).
I'm sure someone smarter than me will find a solution sooner or later...
Sorry, but that's a completely wrong and horrible attitude and it's the sort of sentiment which leads to things like browsers hiding URLs and the resultant rise of even more computer-illiterate users completely dependent on (and thus at the mercy of) search engines and their opaque, proprietary, and sometimes completely idiotic ranking systems... you may find the vigorous discussion here relevant:
This sort of thing is useful for advertising specifically. In the examples I've given:
1) targeted information
2) short url
3) direct-to-relevant-site from the ad
In my experience, in-house or business-wide advertising tends to use just the domain, but campaigns farmed out to marketing firms or campaigns related to specific products tend to use path info as well as domain name.
And while I've never seen auth credentials in an advertised URL, I have frequently seen "example.com/product#platform" style URLs, where #platform (and sometimes ?platform) indicates which magazine the ad appeared in.
You seem to be reinforcing the OP's point.
Since the fragment doesn't get sent to the server, that seems broken …
Yes, I realize that might be problematic. Just passing on the different internet culture.
But on the "pro" side for searching, it should help autocorrect people who can't remember the specific name or the correct spelling.
The definitive guide apparently paraphrased RFC 2396, which clearly defined the semicolon as a segment parameter delimiter, but that RFC was later obsoleted [a] by RFC 3986, which demoted the former standard to a "possible practice" [b], stating that:
> URI producing applications often use the reserved characters allowed in a segment to delimit scheme-specific or dereference-handler-specific subcomponents. For example, the semicolon (";") and equals ("=") reserved characters are often used to delimit parameters and parameter values applicable to that segment. The comma (",") reserved character is often used for similar purposes. For example, one URI producer might use a segment such as "name;v=1.1" to indicate a reference to version 1.1 of "name", whereas another might use a segment such as "name,1.1" to indicate the same. Parameter types may be defined by scheme-specific semantics, but in most cases the syntax of a parameter is specific to the implementation of the URI's dereferencing algorithm. [c]
[a] wouldn't have noticed without: http://stackoverflow.com/questions/6444492/can-any-path-segm...
Recently I have begun using them in a few of my internal APIs, where it is useful to have a distinction between "parameters used to filter a set of items" and "parameters used to request items in a particular format"... an example:
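Something along these lines (the resource names and parameters here are hypothetical, just to show the split):

    GET /orders;status=open;region=eu?format=csv&page=2

The semicolon-delimited path segment parameters say which items are in the set, while the query string only says how to return them.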
Con: consumers of your API get to discover the joys of RFC 3986, because there is essentially zero support for path segment parameters in modern HTTP libraries.
Also previously on HN: https://eager.io/blog/the-languages-which-almost-were-css/
Both excellent reads.
It wouldn't make sense for TLDs like com, net, org, etc., but for trademarked TLDs like barclays, youtube, pwc, etc. visitors could essentially go straight to that webpage with the TLD, like https://youtube
That's because they... actually are domains too. Top Level Domains, as the acronym says.
> It wouldn't make sense for TLDs like com, net, org, etc.
Actually it might --- that's a great place to put all the information about (and maybe instructions on how to register) a subdomain.
But, to use the precise terminology, a domain name with a single 'label' has historically been treated as a "relative" domain, e.g. https://localhost , and no doubt many of you will be familiar with intranet sites that use similar "non-qualified" single-label names.
URLs started as "http://www.example.com" in the late 90's, and then went to "www.example.com", and are now mostly "example.com". For major trademarks, it could be the natural progression to just go to "example". Or not - internet use is weirdly unpredictable.
One way it could be shorter and keep context is if we hijacked a special character the way Twitter does with #example or @twitter -- but I can't imagine standards bodies getting on board with using a special character (or trying to get people to adopt it as a replacement for web URLs).
Only one new TLD does this, and the page is empty: http://мон./ or http://xn--l1acc./
Compared to "real-life addresses", URLs are an absolute pleasure to handle and understand; which naturally raises the question of why so many people seem to have trouble, or are suggesting that others do, with URLs? It's just a "virtual" address, in an easily-parseable format for identifying a location in "cyberspace". Perhaps its the "every time you put an intelligent person in front of a computer, his/her brain suddenly disappears" syndrome (for lack of a better word)?
The advocacy of search engines instead of URLs is also not such a great idea; sadly, search engines like Google today do not work like a 'grep' that lets you find exactly what you're looking for, do not index every page (or allow you to see every result, which is somewhat equivalent), and the results are also strongly dependent upon some proprietary ranking algorithm which others have little to no control over. If relying on links and having them disappear is bad, relying on SERPs is even worse since they are far more dynamic and dependent on many more external factors which may even include things like which country your IP says you're from when you search.
Search engines are definitely useful for finding things, but as someone who has a collection of links gathered over many years, most of which are still alive and yet Google does not acknowledge the existence of even when the URL is searched, I am extremely averse to search engines as a sort of replacement for URLs.
162 Portsmouth St
Denver, CO 42348
This encodes the city in both the ZIP code and the city/state, allowing either to be wrong. If 162 doesn't exist on that street, it's possible they can correct it based on the name. If it's Portsmouth Ave, not St, they can likely figure that out.
None of that flexibility exists in URLs. A single mistyped character or misspelling and you are going to the wrong place with no way of getting where you want (without search engines).
It's a system which works very well for machines, and only passably well for error-prone humans.
But I don't know if it explains why people don't understand URLs. Physical location addresses are no more complicated than URLs (with the exception of URL-encoded blobs in URLs, which are not usually human-readable). That said, I don't usually talk about this stuff with non-tech types, so maybe the average person does understand URLs more or less.
The author does not explain why he is glad that the more generic solution won out. Having strongly-typed queries might have brought us much closer to some approximation of a practical "semantic web" and done wonders for web services, accessibility, and more.
Maybe he is glad because not having any strong typing allowed us to have the flexible, completely free-form web interfaces we have now, but who's to say those wouldn't have emerged anyway; maybe even in a slightly saner form than the horrible mess we have today.
On a completely unrelated note:
> Given the power of search engines, it’s possible the best URN format today would be a simple way for files to point to their former URLs.
This raises immediate concerns about security and spam, but that may be solvable somehow.
That being said, I really enjoyed reading this thoroughly researched history a lot, even more than the previous, also great installment about CSS's history. (But my preference is just because I'm not a Web/Design guy.)
where the DIFF_HASH is a fragment pointing to a particular resource within the commit.
Doing it for links is problematic though. What do you do if the hash doesn't match? Show an error? Show the destination page anyway? What happens if the author _wants_ to change the content? Should links be immutable?
The big difference between SRI and my proposal is that the hash goes in the URL rather than in an attribute.
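Concretely (the URL syntax here is hypothetical, not an existing standard): instead of SRI's integrity attribute on the referencing element, the expected digest would travel in the URL itself, perhaps as part of the fragment:

    https://example.com/library.js#sha256=<base64-digest>

That way any plain link or bookmark could carry the integrity check, with all of the mismatch questions above still to be answered.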
but not the default install of IE9. Figures.
To be fair, the author said browsers, and not incompatible mock-browsers like IE.
I don't find it ridiculous: despite Gopher going out of existence and FTP being a minority, the http vs. https distinction is quite important to the present day, especially considering that a redirect from http to https can be insecure and the proper way to open many sites is explicitly with https://
It might get fixed in the future, but it hasn't happened yet.
That said, I do think it's unreasonable to expect an average Internet user to discern that you specified https. I think a better practice is to try to use https on every site, and only fall back if you decide you are willing to accept the lack of security.
Of course there is the obviously broken `java.net.URL`, but there are so many other libraries and coding practices where programmers just continuously screw up URL/URI/URNs over and over and over. It is like XML/HTML escaping but seems to be far more rampant in my experience (thankfully most templating languages now escape automatically).
In large part I believe this is because of the confusion around form encoding, and because the URI specs came later and supersede the URL specs (but are not actually entirely compatible with them).
In our code base alone we use something like 8 different URL/URI libraries: HTTP Components/Client (both 3 and 4), Spring has its own stuff, JAX-RS aka Jersey has its own stuff, the built-in, crappy, misnamed URLEncoder, the not-that-flexible java.net.URI, and several others that I can't recall. I'm surprised Guava hasn't joined in the game as well.
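To make the URLEncoder complaint concrete, here is a small sketch using only standard JDK classes; the class name is just for illustration, and the output comments reflect the documented behaviour:

    import java.net.URI;
    import java.net.URLEncoder;
    import java.nio.charset.StandardCharsets;

    public class EncodingConfusion {
        public static void main(String[] args) throws Exception {
            // URLEncoder does HTML form encoding (application/x-www-form-urlencoded),
            // not generic URI percent-encoding: a space becomes "+", not "%20".
            String formEncoded = URLEncoder.encode("a b", StandardCharsets.UTF_8.name());
            System.out.println(formEncoded); // a+b

            // The multi-argument URI constructor percent-encodes illegal characters
            // instead, so the same space comes out as "%20" in the resulting URI.
            URI uri = new URI("https", "example.com", "/search", "q=a b", null);
            System.out.println(uri.toASCIIString()); // https://example.com/search?q=a%20b
        }
    }

Mix the two in one code path and you end up with "+" in paths or "%20" in form bodies, which is exactly the kind of screw-up that keeps recurring.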
I would love a small decoupled URL/URI/URN library that does shit easily and correctly. URL templates would be nice as well. I have contemplated writing it several times.
As for non-expiring identifiers to pages/content, the 'expired' URLs are as good identifiers as anything, considering URL redirects exist.
I did find this tidbit of information quite intriguing: the creator of the hierarchical file system attributes his creation to a two hour conversation he had with Albert Einstein in 1952!
If it was a design choice to use a 3-character separator instead of a single character, it seems an odd one.
More here: https://www.w3.org/People/Berners-Lee/FAQ.html#etc
WS and WSS will become more and more commonplace over time. I like that a 25+ year old protocol is forward compatible enough to accommodate new methods of network communication. It is debatable whether HTTPS and WSS are necessary as separate schemes in a URL, but they give a hard guarantee that a secure connection will be made and not silently downgraded, for those who care about such things.
Using basic authentication over SSL, does that mean if you entered https://user:pass@domain that the user and pass would be sent in the clear, or does this get put into the header and encrypted?