
Cool URIs Don't Change (1998) - tarikozket
https://www.w3.org/Provider/Style/URI
======
niftich
This is one of those classic, foundational documents about the Web. But it's
rarely followed. Tool use has come to dominate the form that URIs take; tools
are used both for delegation and to absolve humans from crafting URIs by hand.
Switching tools frequently ruins past URIs.

Additionally, widespread use of web search engines has made URI stability less
relevant for humans. Bookmarks are no longer the only way to find a leaf page
by topic again, and a dedicated person may find that web archives have
preserved the content at its old URI.

Some of this is allowed to happen because the content is ultimately
disposable, expires, or is only relevant to a narrow audience. Some company
websites are little more than brochures. Documents and applications that
matter within an organization can be communicated out of band. Ordinary people
and ordinary companies don't want to be consciously running identifier
authorities forever.

~~~
paxys
The web has evolved well beyond what it was envisioned to be at the time this
was written - a collection of hyperlinked documents.

The eventual demise of the URL will come down to the fact that the concept of
a "resource" simply won't be sufficient to describe every future class of
application or abstract behavior that the web enables.

~~~
scoutt
I don't think it has evolved. I feel that it became more like a hack, on top
of a hack, on top of another hack, and so on.

In the late 90's - early 2000's, HTML started being pushed into fields that,
in my opinion, were unrelated (remember Active Desktop?). Before you had time
to react, HTML was being used to pass data between applications. At the time I
was already doing embedded stuff, and I remember being astonished to learn
that I had to code an HTML parser/server/stack on my small 16-bit micro
because some jerk thought it was a good idea to pass an integer using HTML
(SOAP, for example).

In the meantime, HTML was being dynamically generated, then dynamically
modified in the browser, and then modified back on the server using the same
thing you use to modify it in the browser. It's a snowball that will implode,
sooner or later.

~~~
082349872349872
"a hack, on top of a hack, on top of another hack, and so on" _is_ evolution.

My HN username may be a case in point, drawing from a selection of twice
five[0] digits due to legacy code of Hox genes:
[https://pubmed.ncbi.nlm.nih.gov/1363084/](https://pubmed.ncbi.nlm.nih.gov/1363084/)

[0] "This new art is called the algorismus, in which / out of these twice five
figures / 0 9 8 7 6 5 4 3 2 1, / of the Indians we derive such benefit"

[https://upload.wikimedia.org/wikipedia/commons/thumb/3/35/Ca...](https://upload.wikimedia.org/wikipedia/commons/thumb/3/35/Carmen_de_Algorismo.pdf/page1-1125px-Carmen_de_Algorismo.pdf.jpg)

~~~
scoutt
You might see the Homer Simpson Car[0] and call it evolution too. But what I
see is a mess: a sequence of hacks and bad decisions, just like HTML (and web
stuff) today.

[0] [https://www.wired.com/2014/07/homer-simpson-car/](https://www.wired.com/2014/07/homer-simpson-car/)

~~~
xboxnolifes
Homer Simpson's car did not evolve at all. It was designed in a single
iteration.

------
joosters
While the main concept (don't change your URIs!) is good, I can't agree at all
with their advice on picking names, in particular the 'what to leave out'
section. No subject or topic? The justification for this is flimsy at best -
_'the meaning of the words might change'_. So what? People cope with this all
the time in other media, e.g. old books. It's not too confusing. What's more
confusing is a URI that has all the meaning removed; after all, this whole URI
discussion is about the human appearance of URIs. Take out the topics and you
are just left with dates, numbers and unspecific cruft. If I were designing a
company's website, I'd sure as hell put the product pages under '/products'.

FWIW, the document's own URI is terrible:
[https://www.w3.org/Provider/Style/URI](https://www.w3.org/Provider/Style/URI)
\- who could have any idea what the page is about from that? And what if the
meaning of the word 'Provider' or 'Style' changes in x years from now? :) You
could argue that the meaning/usage of 'URI' has already changed, because
practically no one uses that term any more. Everyone knows about URLs, not
URIs; not many people could tell you what the difference is. So the article's
URI has already failed by its own rules.

~~~
bad_user
IMO that's a pretty good URL. For example, if you drop it often in
conversations, you can remember it, since it's short enough and has no numbers
or awkward characters. I would have preferred lowercase (if you try it with
lowercased letters, it doesn't work), but other than that ...

No, a URL doesn't necessarily have to give you the title of the article, even
if having some related words in it might be good for SEO. If you paste it in
plain text or similar, add a description to it. Here's how:

Cool URIs Don't Change:
[https://www.w3.org/Provider/Style/URI](https://www.w3.org/Provider/Style/URI)

There, now the reader will know what this is about.

~~~
SquareWheel
I don't think being able to remember URIs is particularly useful. In 99% of
cases they're clicked on or shared, not recited from memory.

I'd still get far more value out of this:

[https://www.w3.org/article/1998/cool-uris-dont-change/](https://www.w3.org/article/1998/cool-uris-dont-change/)

~~~
CaptArmchair
I think the short gist of it is: naming things is a hard problem.

I think you both stumbled upon a fundamental part of the discussion: the
tension between finding a way to identify resources (or concepts, or physical
things) in a unique and unambiguous fashion, and the affordances provided by
natural language that allow human minds to easily associate concepts and
labels with the things they refer to.

The merit of UUIDs, hashes, or any other random string of symbols that falls
outside the domain of existing natural languages is that it doesn't carry any
prior meaning until an authority within a bounded context associates that
string with a resource by way of accepted convention. In a way, you're
constructing a new conceptual reference framework for (a part of) the world.

The downside is that random strings of symbols don't map to widely understood
concepts in natural language, making URLs that rely on them utterly
incomprehensible unless you dereference them and match your observation of the
dereferenced resource with what you know about the world (e.g. "Oh!
[http://0x5235.org/5aH55d](http://0x5235.org/5aH55d) actually points to a
review of "Citizen Kane").

By using natural language when you construct a URL, you're inevitably
incorporating prior meaning and significance into the URI. The problem is that
you then end up with the murkiness of linguistics and semantics, and with all
kinds of weird wordplay if you let your mind roam entirely free about the
labels in the URI proper.

For instance, there's the famous painting by René Magritte, "The Treachery of
Images", which correctly points out that the image is, in fact, not a pipe:
it's a representation of a pipe. [1] By the same token, an alternate URI to
this one [2] might read
[http://collections.lacma.org/ceci-nest-pas-une-pipe](http://collections.lacma.org/ceci-nest-pas-une-pipe),
which is incidentally correct as well: it's not a pipe, it's a URI pointing to
a painting that represents a physical object - a pipe - with the phrase "this
is not a pipe."

Another example would be that a generic machine doesn't know whether
[http://www.imdb.com/titanic](http://www.imdb.com/titanic) references the
movie Titanic or the actual cruise ship, unless it dereferences the URI,
whereas we humans understand that it's the movie, because we have a shared
understanding that IMDB is a database about movies, not historic cruise ships.
Of course, when you build a client that dereferences URIs from IMDB, you
basically base your implementation on that assumption: that you're working
with information about movies.

Incidentally, if you work with hashes and random strings, such as
[http://0x5235.org/5aH55d](http://0x5235.org/5aH55d), your client still has to
be founded on the fundamental assumption that you're dereferencing URIs minted
by a movie review database. Without that context, a generic machine would
perceive it as a random string of characters that happens to be formatted as a
URI, and dereferencing it just yields a random stream of characters that can't
possibly be understood.
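
To make that concrete, here is a toy sketch, assuming a hypothetical review
site behind 0x5235.org (the catalog below is entirely invented): the opaque ID
only gains meaning through the authority's lookup table.

    # An opaque identifier carries no meaning on its own; an authority
    # within a bounded context binds it to a resource by convention.
    CATALOG = {
        "5aH55d": {"type": "film-review", "subject": "Citizen Kane"},
    }

    def dereference(opaque_id):
        # Without CATALOG (the shared convention), "5aH55d" is just noise.
        return CATALOG.get(opaque_id)

    print(dereference("5aH55d"))  # {'type': 'film-review', 'subject': 'Citizen Kane'}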

[1]
[https://en.wikipedia.org/wiki/The_Treachery_of_Images](https://en.wikipedia.org/wiki/The_Treachery_of_Images)
[2]
[https://collections.lacma.org/node/239578](https://collections.lacma.org/node/239578)

~~~
SquareWheel
Good comment, thanks for sharing.

It's an interesting topic. I agree with you that identifiers can be intended
for humans or machines, and there are often different features to optimize for
depending on the audience. URIs are a strange middle ground, with the pitfalls
of having to account for both humans and machines.

In an interesting way, each individual website has to come up with its own
system for communication. It may be a simple slug (/my-new-blog/), or it may
be an ID system (?post=3). It could be something else completely.

There is some value in offering that creativity, but a system where URIs are
derived from content also makes a lot of sense to me. You mentioned a hash,
which I think is the right idea.

It seems reasonable that URIs could take inspiration from other technologies
like git, or even (dare I say) blockchain. This leads naturally to built-in
support for archiving older versions, as content is diffed between versions.

There are some fun problems to think about, like how to optimize the payload
for faster connections and then generate reverse diffs for visiting previous
versions. Or whether browsers should assume you always want the newest version
of the page, and automatically fetch that instead.

This solves some problems, and creates many others. Interesting thought
experiment anyway.
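
A minimal sketch of the content-derived idea, git-style (the "content://"
scheme here is invented for illustration): the identifier is a hash of the
bytes, so it can never quietly start pointing at something else.

    import hashlib

    def content_uri(body: bytes) -> str:
        # Derive the identifier from the content itself, as git does;
        # any edit yields a new URI, so old URIs stay valid forever.
        digest = hashlib.sha256(body).hexdigest()
        return "content://sha256/" + digest

    v1 = content_uri(b"Cool URIs don't change.")
    v2 = content_uri(b"Cool URIs don't change!")  # one byte differs
    print(v1)
    print(v1 != v2)  # True: the new version gets a new, permanent URI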

------
mapgrep
Rhetorical question: Why must we charge annually to control domains? Should we
stop doing this in the name of greater URL stability?

The article states early on, "Except insolvency, nothing prevents the domain
name owner from keeping the name." As it turns out, insolvency is a pretty
significant source of URL rot, but so is non-renewal of domains, by choice or
by apathy, whether for financial or mere personal-energy reasons ("who is my
registrar again? Where do I go to renew?"), especially among individuals. You
start a project, and ten years later your interest has waned.

Domains are an increasingly abundant resource as TLDs proliferate. Why not
default to a model where you pay once up front for the domain, and thereafter
continued control is contingent on maintaining a certain percentage of
previously published resources; if you fail at that, some revocation mechanism
kicks in that serves mirrored versions of your old URLs. Funding for these
mirrors comes from the up-front domain fees. Design of the mechanism is left
as an exercise for the reader :-)

~~~
ocdtrekkie
Domain renewal is definitely the lesser cost of maintaining a website. If you
can afford the server, the domain is basically free already.

~~~
mapgrep
Blogger and Tumblr will map a domain to a blog for free.

~~~
dwighttk
And that will definitely not change in 100 years

~~~
mapgrep
Blogger has been serving URLs for something like 17 years. I'd wager its sites
have something like twice the average URL lifespan of the typical site at this
point. What we want right now is _more_ URL stability, not a perfect assurance
of a 100-year URL lifespan. Don't let the perfect be the enemy of the good.

------
dang
If curious see also

2016:
[https://news.ycombinator.com/item?id=11712449](https://news.ycombinator.com/item?id=11712449)

2012:
[https://news.ycombinator.com/item?id=4154927](https://news.ycombinator.com/item?id=4154927)

2011:
[https://news.ycombinator.com/item?id=2492566](https://news.ycombinator.com/item?id=2492566)

2008 ("I just noticed that this classic piece of advice has never been
directly posted to HN."):
[https://news.ycombinator.com/item?id=175199](https://news.ycombinator.com/item?id=175199)

also one comment from 7 months ago:
[https://news.ycombinator.com/item?id=21720496](https://news.ycombinator.com/item?id=21720496)

------
heinrichhartman
I think this is just unrealistic. Let's look at this example:

    http://www.pathfinder.com/money/moneydaily/1998/981212.moneyonline.html

This consists of:

0\. Access protocol

1\. Hostname/DNS name

2\. Arbitrarily chosen path hierarchy

3\. File extension

This is really a description of where to find a document (a "locator", not an
"identifier"). So, if you are:

\- re-organizing or cleaning up your file structure

\- changing or hiding the file extension

\- enabling HTTPS

\- migrating files to a different domain name

then this WILL change the URL. What are you going to do? Not clean up your
space anymore? Stick to HTTP? So URLs DO change. That's just the reality.

If you want something that does not change, don't link to a location; link to
the content directly. E.g.:

\- git hashes do not change

\- torrent/magnet links don't change

\- IPFS links do not change

Or use a central authority that stewards the identifier:

\- DOI numbers don't change

\- ISBN numbers don't change

~~~
BorisTheBrave
> What are you going to do?

The article addresses this by reminding you that though URIs often look like
paths, they can be arbitrarily mapped.

By all means move the resource, but _put a redirect under the old URI_. This
means old links continue to work, which is the key point of the article.
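
As a sketch of how cheap that mapping can be (the paths and port below are
invented): a table of retired paths, served as permanent redirects.

    from http.server import BaseHTTPRequestHandler, HTTPServer

    # Hypothetical mapping of retired paths to their new homes.
    MOVED = {
        "/money/moneydaily/1998/981212.moneyonline.html":
            "/archive/1998/money-online",
    }

    class RedirectHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path in MOVED:
                self.send_response(301)  # permanent: old links keep working
                self.send_header("Location", MOVED[self.path])
            else:
                self.send_response(404)
            self.end_headers()

    # HTTPServer(("", 8000), RedirectHandler).serve_forever()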

~~~
heinrichhartman
Yes. Have you tried to do that for even moderately complex sites?

I have tried to do it a few times, and eventually just gave up. Carrying
forward bad naming decisions from the past is a tremendous effort. When
cleaning up the house, I also don't leave sticky notes in all the places I
removed documents from.

On top of this:

\- When using static site generators, it's not even possible to do 301
redirects (you would have to use an ugly, slow JS version).

\- It does not help if you don't own the old DNS name anymore.

~~~
captn3m0
Using an SSG does not mean you can't have an intelligent server that does
redirects. That's a limitation of certain web hosts (GitHub Pages, for
example).

Netlify allows dead-simple redirects, and so do most other static hosting
platforms.
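
For example, Netlify reads redirect rules from a plain-text `_redirects` file
in the publish directory; a couple of lines like these (paths invented) keep
old links alive:

    /old-post/*    /blog/new-post/:splat    301
    /2019/about    /about                   301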

~~~
cthor
Even GitHub Pages behind Cloudflare is capable of issuing a 301.

------
dfabulich
In the very footer of this page:

> _Historical note: At the end of the 20th century when this was written,
> "cool" was an epithet of approval particularly among young, indicating
> trendiness, quality, or appropriateness. In the rush to stake our DNS
> territory involved the choice of domain name and URI path were sometimes
> directed more toward apparent "coolness" than toward usefulness or
> longevity. This note is an attempt to redirect the energy behind the quest
> for coolness._

It's 2020 and "cool" still has that same meaning, as an informal positive
epithet. I believe "cool" is the longest surviving informal positive epithet
in the English language.

"Cool" has been cool since the 1920s, and it's still cool today. "Cool" has
outlived "hip," "happening," "groovy," "fresh," "dope," "swell," "funky,"
"bad," "clutch," "epic," "fat," "primo," "radical," "bodacious," "sweet,"
"ace," "bitchin'," "smooth," and "fly."

My daughter says things are "cool." I predict that _her_ children will say
"cool," too.

Isn't that cool?

~~~
yellowstuff
"Smooth" is definitely still current slang, with a meaning similar to "cool."
And "smooth" came first:

> Slang meaning "superior, classy, clever" is attested from 1893. Sense of
> "stylish" is from 1922.

> A 1599 dictionary has smoothboots "a flatterer, a faire spoken man, a
> cunning tongued fellow."

It may be time to bring that one back. "Did you see Keith chatting up that
girl at the bar? Total smoothboots."

[https://www.etymonline.com/word/smooth](https://www.etymonline.com/word/smooth)

~~~
thaumasiotes
I would say the 1599 sense more accurately reflects the current sense of
"smooth" than the 1893/1922 citations do.

~~~
ggm
_Sophisticated_ used to mean false, as in sophistry: with intent to deceive.
So a sophisticated wine was an adulterated wine, one that had something other
than fermented grape juice in it.

~~~
fouc
TIL. I was quite surprised that sophistication used to mean deceptive or
misleading behavior.

[https://en.wikipedia.org/wiki/Sophistication](https://en.wikipedia.org/wiki/Sophistication)

~~~
thaumasiotes
"Sophistry" still does.

"Silly" is the standard example of semantic shift over what people generally
perceive to be a pretty extreme distance:
[https://www.etymonline.com/word/silly](https://www.etymonline.com/word/silly)

------
whym
One thing I have been wondering about, speaking of changing URIs: did they
(the W3C) change/merge the domain name from w3c.org to w3.org at some point?
Some old documents seem to point to w3c.org instead of w3.org (e.g.
[http://www.w3c.org/2001/XMLSchema](http://www.w3c.org/2001/XMLSchema)). Not
that it hugely matters; the old (?) w3c.org links still work, since they are
redirected anyway.

Example from a book:
[https://books.google.com/books?id=yLj8m3K0kNoC&pg=PA224&dq=h...](https://books.google.com/books?id=yLj8m3K0kNoC&pg=PA224&dq=http://www.w3c.org/2001/XMLSchema)

~~~
niftich
According to WHOIS, w3c.org is from 1997 while w3.org is from 1994.

A message from a W3C staff member on a W3C mailing list on 1999-06-21 mentions
[1] that w3c.org should redirect to the corresponding page at w3.org, and the
latter is considered the 'correct' domain.

[1] [https://lists.w3.org/Archives/Public/www-rdf-comments/1999Ap...](https://lists.w3.org/Archives/Public/www-rdf-comments/1999AprJun/0064.html)

------
prepend
This is a great link, and I think I'll share it with people. I find that I
struggle to explain why URIs shouldn't change, because it's so ingrained in
me.

One of my pet peeves with OneDrive is that if I move a file, it changes the
URI. So any time someone moves a file, it breaks all the links that point to
it. Or if they change the name from foo-v1 to foo-v2. I wish they'd adopt
Google Docs' model.

~~~
benrbray
I wish operating systems managed files in a similar way. Ideally, filesystems
would be tag-based [1] rather than hierarchy-based. This would make hyperlinks
between my own personal documents much easier and more time-resistant as my
preferences for file organization change.

[1] [https://www.nayuki.io/page/designing-better-file-organizatio...](https://www.nayuki.io/page/designing-better-file-organization-around-tags-not-hierarchies)

~~~
josho
macOS does this. Native Mac apps can somehow preserve file references even
after the source file has been moved or renamed. The unfortunate part,
however, is that many cross-platform apps aren't written using the Mac APIs,
which leaves an inconsistent experience.

I think it's for reasons like this that many Mac users strongly prefer native
apps over Electron or web apps.

~~~
Polylactic_acid
>I think it's for reasons like this that many mac users strongly prefer native
apps over Electron or web apps.

Users on every OS do.

~~~
Wowfunhappy
Could have fooled me with regard to Windows. I'm unfortunately not sure what a
"native" Windows app even is at this point. They've gone through so many
frameworks over the years; everything is a mishmash.

And this isn't just a result of legacy compatibility. If you are a developer
today, and you want to make a _really good_ Windows app, what approach do you
take? Is it obvious?

~~~
Polylactic_acid
On Windows, it's just a resource hog. On Linux and Mac, they stick out like a
pimple on a pumpkin. The number-one annoyance for me: because they are based
on Chromium, which doesn't have Wayland support, Electron apps don't DPI-scale
properly with multiple monitors.

------
bloaf
If you have sequential pages, I don't like dates in the URIs. For example, if
you have something spread over 5 pages (e.g. a 5-part blog post), I should be
able to guess the URIs for all 5 parts given any one of them. Dates mean that
I cannot do that.

~~~
henvic
Cursors can be used to solve this issue sometimes.

[https://en.wikipedia.org/wiki/Cursor_(databases)](https://en.wikipedia.org/wiki/Cursor_\(databases\))

~~~
cortesoft
How would that help you predict the page URLs?

------
matijs
There is a pretty cool bet [1] on longbets.org about exactly this.

[1] [http://longbets.org/601/](http://longbets.org/601/)

~~~
saagarjha
Looks like it’s pretty likely to be lost, which I think is pretty cool.

~~~
rakoo
The author made sure he would lose when he added the 301 clause.

------
vxNsr
> _I didn't think URLs have to be persistent - that was URNs. This is the
> probably one of the worst side-effects of the URN discussions. Some seem to
> think that because there is research about namespaces which will be more
> persistent, that they can be as lax about dangling links as they like as
> "URNs will fix all that". If you are one of these folks, then allow me to
> disillusion you._

 _Most URN schemes I have seen look something like an authority ID followed by
either a date and a string you choose, or just a string you choose. This looks
very like an HTTP URI. In other words, if you think your organization will be
capable of creating URNs which will last, then prove it by doing it now and
using them for your HTTP URIs. There is nothing about HTTP which makes your
URIs unstable. It is your organization. Make a database which maps document
URN to current filename, and let the web server use that to actually retrieve
files._
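
A minimal sketch of the database that quote describes (the table, URN, and
path below are invented): the stable URN resolves to whatever the file happens
to be called today.

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE docs (urn TEXT PRIMARY KEY, path TEXT)")
    # The URN never changes; only the right-hand column does.
    con.execute("INSERT INTO docs VALUES (?, ?)",
                ("urn:example:1998:cool-uris", "/srv/www/Style/URI.html"))

    def resolve(urn):
        row = con.execute("SELECT path FROM docs WHERE urn = ?",
                          (urn,)).fetchone()
        return row[0] if row else None

    print(resolve("urn:example:1998:cool-uris"))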

Did this fail as a concept? Are there any live examples of URNs in active use?

~~~
niftich
URN namespace registrations are maintained by IANA [1].

One well-known example is the ISBN namespace [2], where the namespace-specific
string is an ISBN [3].

The term 'URI' emerged as somewhat of an abstraction over URLs and URNs [4].
People were also catching on to the fact that URNs are conceptually useful,
but you can't click on them in a mainstream browser, making their
out-of-the-box usability poor.

DOI is an example of a newer scheme that considered these factors extensively
[5] and ultimately chose locatable URIs (=URLs) as their identifiers.

[1] [https://www.iana.org/assignments/urn-namespaces/urn-namespac...](https://www.iana.org/assignments/urn-namespaces/urn-namespaces.xhtml)

[2] [https://www.iana.org/assignments/urn-formal/isbn](https://www.iana.org/assignments/urn-formal/isbn)

[3] [https://en.wikipedia.org/wiki/International_Standard_Book_Nu...](https://en.wikipedia.org/wiki/International_Standard_Book_Number)

[4] [https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Hi...](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#History)

[5] [https://www.doi.org/factsheets/DOIIdentifierSpecs.html](https://www.doi.org/factsheets/DOIIdentifierSpecs.html)

------
RcouF1uZ4gsC
That is nice in theory, but in practice, services like archive.org are vital.
If you see a document you want to refer to later, you need to archive it,
either in a personal archive or via archive.org.

There are too many moving parts to trust that even domain names will stay the
same; see GeoCities and Tumblr for recent examples. If you want a document,
you should have archived it.

~~~
JadeNB
The article isn't arguing that URIs _don't_ change; it's arguing that they
_shouldn't_. (The part involving judgement is elsewhere in the title—the word
'Cool'—so it can certainly seem like an assertion of fact rather than of value
at a glance.) It thus seems to me that the response "in practice, URIs _do_
change" doesn't undermine that point; your discussion of the need for some
solution to the problem rather _supports_ their point—if URIs _didn't_ change,
then there wouldn't be a problem to be solved.

(Or maybe your point was deeper: that one can't trust not only that the
resource location won't change, but even that the resource itself will still
be available somewhere? That is true, too! But saying that archive.org is the
solution just makes one massively centralised point of failure. That doesn't
mean we shouldn't have or use archive.org, but that we should regard it as the
best solution we have now, rather than the best solution, full stop.)

------
jacquesm
The problem with URIs is that they weren't foreseen as the gateway to a whole
slew of web applications, whose URIs can have a lifetime no longer than
serving that one request. There is a continuum here, from long-lived useful
URIs all the way to ephemeral ones.

And then there are the URIs that aren't even made for human consumption:
ridiculously long, impossible to parse or pass around. Another class is those
that get destroyed on purpose. Your favorite search engine _should_ just link
to the content. Instead, it links to a script that then forwards you to the
content. This has all kinds of privacy implications, as well as making it
impossible to pass on, for instance, the link to a PDF document you have found
to a colleague, because the link is unusable before you click it, and after
you click it you end up in a viewer.

~~~
vbezhenar
> Your favorite search engine should just link to the content. Instead they
> link to a script that then forwards you to the content. This has all kinds
> of privacy implications as well as making it impossible to pass on for
> instance the link to a pdf document that you have found to a colleague
> because the link is unusable before you click it and after you click it you
> end up in a viewer.

I can copy a Google link just fine.

~~~
jacquesm
Good for you. Now try it a number of times instead of just once, and you'll
see they insert their 'click count' script a very large fraction of the time.

Here is a sample:

[https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&c...](https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwj23pv8ktTqAhVvMewKHVweAgcQFjABegQIAxAB&url=https%3A%2F%2Fwww.ks.uiuc.edu%2FTraining%2FCaseStudies%2Fpdfs%2Fdna.pdf&usg=AOvVaw2YhN-VZjT_k5jb52luA_nX)

Obtained by right-clicking the link to the PDF and then 'Copy link location'.
What you _see_ is not what is sent to your clipboard.

------
EamonnMR
Whenever I see a person or an API use URI instead of URL, I feel like I'm in
an alternate universe. It turns out the distinction is that URIs can include
things like ISBNs, but everything with a protocol string is a URL, so URL is
probably the right term for most modern uses.

~~~
denisw
To be clear, the difference is that a URI generally only allows you to refer
to a resource ("Identifier"), whereas a URL also tells you where to find and
access it ("Locator").

For instance, `https://example.com/foo` tells you that the resource can be
accessed via the HTTPS protocol, at the server with the hostname example.com
(on port 443), by asking it for the path `/foo`. It is hence a URL. On the
other hand, `isbn:123456789012` precisely identifies a specific book, but
gives you no information about how to locate it. Thus, it is just a URI, not a
URL. (Every URL is also a URI, though.)
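
A quick, rough illustration of the difference, using Python's standard library
with those same two strings (a heuristic, not a formal classifier): only the
locator carries a network location.

    from urllib.parse import urlparse

    for uri in ("https://example.com/foo", "isbn:123456789012"):
        parts = urlparse(uri)
        # A locator says where to fetch the resource (a netloc);
        # a bare identifier only names it.
        print(uri, "->", "URL" if parts.netloc else "URI only")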

~~~
fanf2
A URI that cannot be used as a URL (i.e. as a locator for the resource) is a
URN (a name).

------
cryptos
It would be good if more care were taken when designing URL schemes. It is no
accident that URL shorteners are used everywhere.

Look, for example, at this link:

    https://www.amazon.com/Fundamentals-Software-Architecture-Engineering-Approach-ebook/dp/B0849MPK73/ref=sr_1_1?dchild=1&keywords=software+architecture&qid=1594966348&sr=8-1

Maybe each part has a solid reason to exist, but the result is a monster.

I would prefer something like this:

    https://amazon.com/dp/B0849MPK73

And guess what: the short link above actually works! But Amazon didn't adopt
this kind of link as its standard.

~~~
rapnie
The second link is completely undescriptive; just as with bit.ly and other
shorteners, you don't know where you'll end up after clicking it.

~~~
cryptos
Fair point. A compromise could be a somewhat shorter version of the original
link:

    https://amazon.com/Fundamentals-Software-Architecture/dp/B0849MPK73/

This includes the main title of the book plus the ID (this variant also works).

~~~
rapnie
There is also an argument for your original version with just the ID, as far
as unchanging URLs go.

The Amazon URL that includes the title should be fairly stable, but if you
look at e.g. a Discourse forum URL, you see that it contains the topic title,
which can change at any time, and then the URL changes with it. The old URL
still works, because Discourse redirects, but this can't be taken for granted.

So Discourse ends up with these URLs referring to the same topic:

    - https://forum.example.com/t/my-title/12345
    - https://forum.example.com/t/my-new-title/12345
    - https://forum.example.com/t/12345

And the last version may be the best one to use when linking to the topic from
somewhere else.

------
jauco
If you’re interested in taking this to a new level, you should check out
initiatives like:

\- handle.net (technically it's like a URL shortener, but there's an escrow
agreement you need to sign first to make sure the URLs stay available)

\- PURL and w3id.org (which allow easy moving of whole sites to a new domain
name)

\- and of course
[https://robustlinks.mementoweb.org/spec/](https://robustlinks.mementoweb.org/spec/)

------
emmanueloga_
TL;DR (from [1]). Guidelines for the "best" URIs:

* Simplicity: Short, mnemonic URIs will not break as easily when sent in emails and are in general easier to remember.

* Stability: Once you set up a URI to identify a certain resource, it should remain that way for as long as possible ("the next 10/20 years"). Keep implementation-specific bits and pieces such as .php out of it; you may want to change technologies later.

* Manageability: Issue your URIs in a way that you can manage. One good practice is to include the current year in the URI path, so that you can change the URI scheme each year without breaking older URIs.

1:
[https://www.w3.org/TR/cooluris/#cooluris](https://www.w3.org/TR/cooluris/#cooluris)
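
Put together, a scheme following those three guidelines might look like this
(domain and paths invented for illustration):

    https://example.org/2020/cool-uris        short and mnemonic
    https://example.org/2020/cool-uris.php    avoid: leaks the implementation
    https://example.org/2021/...              next year's scheme can differ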

------
dhosek
I'm in the midst of moving a website from MediaWiki to a bespoke solution for
hosting the data, which will enforce structure on what's being presented. In
the process, URLs will change, _but_ part of the migration is setting things
up so that, for example, if someone goes to
[http://www.rejectionwiki.com/index.php?title=Acumen](http://www.rejectionwiki.com/index.php?title=Acumen)
they will be redirected automatically to
[http://www.rejectionwiki.com/j/acumen](http://www.rejectionwiki.com/j/acumen),
so old links will always work. This seems a minimal level of backwards
compatibility (although I wonder whether there is any specific protocol for
implementing this that will keep the search-engine mojo; I don't worry about
it a lot, though, because the site gets most of its traffic from word of mouth
between users).

~~~
emmanueloga_
The point of the article is that someone visiting the old URL should get the
old resource, as opposed to a 404, an error, or some different content. If you
can't keep the old URL, the second-best thing to do is a redirect. (EDIT: I
guess, being pedantic, the point is to design the URLs so you don't need to
change them later, but "get it perfect the first time" is kinda useless advice
:-)

This is what the 301 HTTP status (permanent redirect) is for... [1] So it
seems to me that if you use a 301 you should be good to go.

Also, from a quick search, it seems the recommended thing to do is to remove
the old URLs from your sitemap.

1:
[https://en.wikipedia.org/wiki/URL_redirection#HTTP_status_co...](https://en.wikipedia.org/wiki/URL_redirection#HTTP_status_codes_3xx)

~~~
jerven
Yes, and adding a note doing the 301 will preserve the search engine mojo.

------
ph1l337
It's kind of fun to see that this has been posted on HN several times before,
but never took off.

e.g.:
[https://news.ycombinator.com/item?id=8454570](https://news.ycombinator.com/item?id=8454570)
[https://news.ycombinator.com/item?id=10086156](https://news.ycombinator.com/item?id=10086156)
[https://news.ycombinator.com/item?id=803901](https://news.ycombinator.com/item?id=803901)

In this one,
[https://news.ycombinator.com/item?id=1472611](https://news.ycombinator.com/item?id=1472611),
the URI is actually broken - not sure if it changed or if it was just a
mistake by the OP back then.

------
jcahill
A comment I didn't post 7 hours ago (I was busy):

True. Yet this submission will have dramatically greater visibility than it
otherwise would have, because the HN Facebook bot linked it 5 minutes ago [1].
As a web archivist, I've dealt a lot with the erosion of URI stability at the
hands of platform-centric traffic behavior, and I don't see it letting up any
time soon.

Sidenote: the FB bot page with a far larger audience, @hnbot [2], stopped
posting some months ago.

[1]:
[https://facebook.com/hn.hiren.news/posts/2716971055212806](https://facebook.com/hn.hiren.news/posts/2716971055212806)

[2]: [https://facebook.com/hnbot](https://facebook.com/hnbot)

------
arkis22
Does this go against REST, where a URL is a specific resource and HTTP
transforms it?

~~~
niftich
Fielding's thesis [1] talks about this.

Here are some selected quotes:

6.2.1 _" (...) The definition of resource in REST is based on a simple
premise: identifiers should change as infrequently as possible. Because the
Web uses embedded identifiers rather than link servers, authors need an
identifier that closely matches the semantics they intend by a hypermedia
reference, allowing the reference to remain static even though the result of
accessing that reference may change over time. REST accomplishes this by
defining a resource to be the semantics of what the author intends to
identify, rather than the value corresponding to those semantics at the time
the reference is created. It is then left to the author to ensure that the
identifier chosen for a reference does indeed identify the intended
semantics."_

6.2.2 _" Defining resource such that a URI identifies a concept rather than a
document leaves us with another question: how does a user access, manipulate,
or transfer a concept such that they can get something useful when a hypertext
link is selected? REST answers that question by defining the things that are
manipulated to be representations of the identified resource, rather than the
resource itself. An origin server maintains a mapping from resource
identifiers to the set of representations corresponding to each resource. A
resource is therefore manipulated by transferring representations through the
generic interface defined by the resource identifier."_

[1]
[https://www.ics.uci.edu/~fielding/pubs/dissertation/fielding...](https://www.ics.uci.edu/~fielding/pubs/dissertation/fielding_dissertation.pdf)
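
A minimal sketch of that last paragraph's idea (the resource and
representations below are invented): the identifier stays fixed while the
server maps it to whichever representations currently exist.

    # One stable identifier, many (changing) representations.
    RESOURCES = {
        "/weather/today": {
            "text/html": "<p>Sunny, 25 C</p>",
            "application/json": '{"sky": "sunny", "temp_c": 25}',
        },
    }

    def get(uri, accept="text/html"):
        # The mapping, not the identifier, is what changes over time.
        return RESOURCES.get(uri, {}).get(accept)

    print(get("/weather/today", "application/json"))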

------
indymike
SEO has caused many companies to adopt unsustainable naming schemes. A URL
that references an ID does not have to change when a word in the title of an
article changes.
------
vxNsr
The number-one worst offender here is Microsoft OneDrive. Document name or
location changed? Well, you'll need to re-share the file/folder with everyone.
------
lazysheepherd
> When someone follows a link and it breaks, they generally lose confidence in
> the owner of the server.

Is it a bias I've developed, or has anyone else noticed just how many dangling
links there are on microsoft.com? Redistributables, small tools, patches,
support pages, documentation pages. I've recently found that when a link's
domain is microsoft.com, I subconsciously expect it to be a 404 with about 50%
probability.

------
jabroni_salad
I've noticed that the fashion industry is just rife with linkrot, and the
links spoil very quickly. If you're looking at a forum post older than three
months, chances are links to specific products will instead redirect to the
store's front page or a 404.

Is there a benefit to this? I am mostly just frustrated.

~~~
Something1234
Redirecting to the front page is SEO BS. It's supposed to help your domain
reputation, but honestly I find it obnoxious compared to a standard 404.

------
totorovirus
It's really interesting to see the perils flagged in old findings become
relevant once they become an actual pain for practitioners. The recent hype
around functional programming and immutable data was already out there among
academics in the 90s, but wasn't really put into practice until now.

------
based2
[http://perdu.com/](http://perdu.com/)

------
Polylactic_acid
There is a new reason that probably didn't exist back then: the
application/CMS powering the old pages has been replaced, and it would be a
massive effort to get the old pages working at the same URLs as before.

I think archive.org is the better long-term plan. Not only does it preserve
the URLs forever, it also preserves the content at them.

------
pachico
Side topic, sorry in advance, but am I the only one frustrated by how this
page is rendered in a mobile browser? I know, this probably wasn't an issue
back in 1998, but I would have expected something more resilient to devices
from the W3C. Of course, I might be overlooking issues.

~~~
account42
The site is perfectly responsive (even if the margins are a bit large). The
problem is that the makers of mobile phone browsers decided to assume pages
are not responsive and need a large width unless they include a specific meta
tag - which is an absolutely stupid assumption and not something anyone could
have foreseen in 1998.

------
_pmf_
I have a lot of bookmarks with nice URLs that just don't exist anymore.

------
iggldiggl
"An URI is for life, not just for Christmas."

------
aabbcc1241
That brings us to the story of IPFS and NDN.

~~~
rapnie
And DIDs [https://w3c.github.io/did-core/](https://w3c.github.io/did-core/)

------
tmwed
“Dope” URIs Don't Change, that's gas.

------
wolco
urn?

~~~
mrspeaker
That's the fifth "reason" listed in the article.

