
Choosing between names and identifiers in URLs - bussetta
https://cloudplatform.googleblog.com/2017/10/API-design-choosing-between-names-and-identifiers-in-URLs.html
======
sgentle
The root of the issue here is that URLs are trying to be human-meaningful and
machine-meaningful at the same time, but those requirements are fundamentally
incompatible.

Humans work well with ambiguity and context. When your coworker says "Bob's
birthday is this weekend," you know she means her husband Bob, not Bob from
accounting, who nobody likes. And you even prefer that system to
having an unambiguous human identifier, even a friendly one like
"Bob-4592-daring-weasel-horseradish".

Machines, on the other hand, hate ambiguity and context. Every bit of context
is an extra bit of state that has to be stored somewhere, and now all your
results are actually statistical guesses - how inelegant!

In the early days of computing, there was no separation between the internals
of the machine and its interface. If you worked on a computer, you were as
much the mechanic as the driver. We got used to usernames, filenames, and
hostnames because they were a decent compromise; they were meaningful enough
to humans, and unambiguous enough for machines, so we could use them as a kind
of human-computer pidgin.

But we don't need them anymore, and they were never really very good at either
job anyway. Google's (probably accidental) discovery was that we were using
the web wrong. Everyone was building web directories and portals because they
thought that URLs weren't discoverable, but the real problem was that they
weren't usable. Search was the first human interface to the web.

So Google's going to kill the URL, Facebook's going to kill the username, and
someone (apparently not Microsoft) is going to kill the filename. There'll be
much wailing and gnashing of teeth from the old guard while it happens, but
someday our grandchildren will grow up never having to memorise an arbitrary
sequence of characters for a computer, and I think that's a future to look
forward to.

~~~
ballenf
> Machines, on the other hand, hate ambiguity and context.

When I ask my car to call my wife using only her first name, it suggests a
list of 3 people who I'm not even sure how they got into my contacts list.
Siri, on the other hand, gets it right every time with the exact same request.
I wouldn't say my car hates ambiguity; rather, the programmers failed to
bridge the human/machine interaction gap and meet the person halfway. ("If you
want to talk to a computer, you have to think like one.")

I'd say it's programmers or deadlines that cause the extra work of accounting
for ambiguous data to get skipped. It doesn't take a neural net to look at the
recent-calls list for the most frequently or most recently dialed [wife's
first name].

One irony of your "Bob" example is that sometimes using someone's last name
actually _adds_ ambiguity: "It's Bob Lingendorfer's birthday this weekend!"
... "Who is Bob Lingendorfer? ... Ohhh, you mean your husband!".

Maybe it's not irony; it's just that people read a lot into data and might
assume that all of it is relevant to the task at hand. My car kind of does the
opposite, lazily stopping at the first three "close enough" hits on my wife's
name.

~~~
Banthum
One thing that worries me about computers working with all that contextual
information is that they then need to know all that information.

And since computing is so centralized these days, this means that whatever
company made the software needs to know that context about you too.

There's something to be said for computers staying dumb. I'm okay with my co-
workers knowing my social graph well enough to recognize my spouse's first
name by context. I'm not okay with faceless corporations or governments having
that same information.

~~~
ballenf
Very good point. Can't disagree with you. I _am_ ok, however, with a contacts
system letting me specify a single-name nickname that it prioritizes in
matching / searches.

And I'm probably also ok with the computer knowing as much about me as my
cellular provider does, since all that is probably hoovered up already. Why
should Siri be dumber than the feds?

To take this further afield, it would be interesting to interact with a
"smart" assistant that only learned from info likely to be accessible to
third-party law enforcement and/or aggregators, as a demonstration of the risk
& power.

------
yathern
Great post - I quite like the stackoverflow.com style of
`stackoverflow.com/questions/<question-id>/<question-title>`, where
`<question-title>` can be changed to anything, and the link still works.

This allows for easy URL readability, while also having a unique ID.

In the context of this post (the library example) that would look like

library.com/books/1as03jf08e/Moby-Dick/
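A sketch of how such a route can be resolved, assuming (hypothetically) that
only the ID segment participates in lookup and the trailing title slug is
purely decorative:

```python
import re

def parse_book_url(path):
    """Extract the stable ID from a URL like /books/<id>/<anything>.

    Only the ID is used for lookup, so '/books/1as03jf08e/Moby-Dick'
    and '/books/1as03jf08e/whatever' resolve to the same resource.
    """
    m = re.match(r"^/books/([0-9a-z]+)(?:/.*)?$", path)
    return m.group(1) if m else None

assert parse_book_url("/books/1as03jf08e/Moby-Dick/") == "1as03jf08e"
assert parse_book_url("/books/1as03jf08e/anything-at-all") == "1as03jf08e"
```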

~~~
saurik
Doing this means that:

1) there are now an infinite number of URLs for every one of your pages that
may end up separately stored on various services (mitigated for only some
kinds of service if you redirect to the correct URL),

2) if the title changes, the URLs already distributed are now permanently
wrong, as they stored part of the content (and if you redirect to correct
them, this can lead to temporary loops due to caches),

3) the URL is now extremely long, and since most users don't know whether a
given website does this weird "part of the URL is meaningless" thing, there
are tons of ways of manually sharing the URL that are now extremely laborious,

4) you have now made content that users think should somehow be "readable" but
which doesn't even try to be canonical... so users who share the links will
think "the person can read the URL, so I won't include more context", and the
person receiving the links thinks "the URL has the title, which I can trust
more than what some random user adds".

The only website I have ever seen which I feel truly understands that people
misuse and abuse title slugs and actively forces people to not use them is
Hacker News (which truncates all URLs in a way I find glorious), which is why
I am going to link to this question on Stack Exchange that will hopefully give
you some better context "manually".

meta.stackexchange.com/questions/148454/why-do-stack-overflow-links-sometimes-
not-work/

Many web browsers don't even show the URL anymore: the pretense that the URL
should somehow be readable is increasingly difficult to defend. A URL should
sometimes still be short and easy to type, and these title-slug URLs certainly
don't have that property.

If anything, the other critical properties of a URL are that it is permanent
and canonical, and neither property tends to be satisfied well by websites
that go with title slugs. Including the ID mitigates the problem, but it
leaves the URL in a confusing middle ground where part of it has these
properties and part of it doesn't.

If you are going to insist upon doing this, how about doing it using a # on
the page, so that at least everyone has a chance to know that it is extra,
random data that can be dropped from the URL without penalty, might not come
from the website, and so shouldn't be trusted?

(edit to add:) BTW, if you didn't know you could do this, Twitter is the most
epic source of "part of the URL has no meaning" that I have ever run across,
as almost no one realizes it due to where it is placed in the URL.

twitter.com/realDonaldTrump/status/247076674074718208

~~~
simcop2387
The usual way I've seen to deal with this kind of ambiguity is by doing a 301
redirect so that bookmarks get changed and the URL in the address bar is also
changed. It doesn't fix external parties linking to the site with the now-
deprecated URL, but there was never anything you could reasonably do about
that.
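That canonicalization step is easy to sketch without any particular web
framework (the lookup function and URL shape below are hypothetical):

```python
def canonical_redirect(path, lookup_slug_by_id):
    """Return (status, location) if the path's slug is stale, else (200, path).

    lookup_slug_by_id is a hypothetical function mapping a question ID to
    its current title slug. A 301 tells clients and crawlers to update
    their stored URL (though browsers generally won't rewrite bookmarks).
    """
    parts = path.strip("/").split("/")
    qid = parts[1]
    slug = parts[2] if len(parts) > 2 else ""
    current = lookup_slug_by_id(qid)
    if slug != current:
        return 301, f"/questions/{qid}/{current}"
    return 200, path

slugs = {"148454": "why-do-stack-overflow-links-sometimes-not-work"}
print(canonical_redirect("/questions/148454/old-title", slugs.get))
# (301, '/questions/148454/why-do-stack-overflow-links-sometimes-not-work')
```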

> If you are going to insist upon doing this, how about doing it using a # on
> the page, so at least everyone had a chance to know that it is extra, random
> data that can be dropped from the URL without penalty and might not come
> from the website and so shouldn't be trusted?

The fragment doesn't get indexed by search engines, so not many will see it.
Along with that, in my understanding, having something human-readable in the
URL helps with SEO in at least Google and Bing, so doing this could hurt your
search rankings, which isn't a good thing.

~~~
Tushon
Minor correction, because dealing with this is a part of my job: Almost no
browsers have implemented changing bookmarks in response to 301 redirects.
Link has further context and some testing.

[https://superuser.com/questions/151366/do-browsers-change-
ur...](https://superuser.com/questions/151366/do-browsers-change-urls-of-
saved-bookmarks-in-response-to-301-redirection)

~~~
simcop2387
Interesting, I had always gone with that since the RFC says it should happen.
Good to know.

------
nayuki
The article talks about referring to resources by using URLs containing opaque
ID numbers versus URLs containing human-readable hierarchical paths and names.
They give examples like bank accounts and library books.

This problem about naming URLs is also present in file system design. File
names can be short, meaningful, context-sensitive, and human-friendly; or they
can be long, unique, and permanent. For example, a photo might be named
IMG_1234.jpg or Mountain.jpg, or it can be named
63f8d706e07a308964e3399d9fbf8774d37493e787218ac055a572dfeed49bbe.jpg. The
problem with the short names is that they can easily collide, and often change
at the whim of the user. The article highlights the difference between the
identity of an object (the permanent long name) versus searching for an object
(the human-friendly path, which could return different results each time).

For decades, the core assumption in file system design has been to provide
hierarchical paths that refer to mutable files. A number of alternative
systems have sprouted which upend this assumption - by having all files be
immutable, addressed by hash, and searchable through other mechanisms.
Examples include Git version control, BitTorrent, IPFS, Camlistore, and my own
unnamed proposal: [https://www.nayuki.io/page/designing-a-better-
nonhierarchica...](https://www.nayuki.io/page/designing-a-better-
nonhierarchical-way-to-organize-files) . (Previous discussion:
[https://news.ycombinator.com/item?id=14537650](https://news.ycombinator.com/item?id=14537650)
)

Personally, I think immutable files present a fascinating opportunity for
exploration, because they make it possible to create stable metadata. In a
mutable hierarchical file system, metadata (such as photo tags or song titles)
can be stored either within the file itself, or in a separate file that points
to the main file. But "pointers" in the form of hard links or symlinks are
brittle, hence storing metadata as a separate file is perilous. Moreover, the
main file can be overwritten with completely different data, and the metadata
can become out of date. By contrast, if the metadata points to the main data
by hash, then the reference is unambiguous, and the metadata can never
accidentally point to the "wrong" file in the future.
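For a concrete flavor of hash-based references, here's a small Python sketch
(the file bytes and metadata shape are invented for illustration):

```python
import hashlib
import json

def content_address(data: bytes) -> str:
    """Name a file by the SHA-256 of its contents, as in Git/IPFS-style
    content addressing. The name is stable: same bytes, same name."""
    return hashlib.sha256(data).hexdigest()

photo = b"\x89PNG...fake image bytes..."
name = content_address(photo)

# Metadata stored separately references the file by hash; it can never
# silently point at different data, because changing the data changes
# the address.
metadata = json.dumps({"tags": ["mountain", "vacation"], "file": name})
```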

~~~
hire_charts
A long time ago, around when I was first taking systems programming courses, I
had this vision for a filesystem and file explorer that would do exactly what
you say. I imagined an entire OS without any filepaths for user data (in the
traditional, hierarchical sense). My opinion (both now and back then) was that
tree structures as a personal data filing system almost always make more of a
mess than they solve. Especially for non-techies.

Rather, everything would automatically be ingested, collated, categorized, and
(of course) searchable by a wide range of metadata. Much of it would be
automatic, but it would also support hand-tagging files with custom metadata,
like project or event names, and custom "categorizers" for more specialized
file types.

Depending on the types of files, you could imagine rich views on top -- like
photos getting their own part of the system with time-series exploration
tools, geolocation, and person-tagging with face recognition, or audio files
being automatically surfaced in a media library, with heuristics used to
classify by artist, genre, etc. But these views would be fundamentally
separate from the underlying data, and any mutations would be stored as new
versions on top of underlying, immutable files, making it easy to move things
between views or upgrade the higher level software that depended on views.

This was years ago, and I never got around to doing any of that (it would've
been a massive project that likely would've fallen flat on its face). And now,
in a roundabout kind of way, we've ended up with cloud-based systems that
accomplish a lot of what I had imagined. I'd go so far as to say that local
filesystems are quickly becoming obsolete for the average computer-user,
especially those who are primarily on phones and tablets. It's a lot more
distributed across 3rd party services than what I had in mind, but that at
least makes it "safer" from being lost all at once (despite numerous privacy
concerns).

~~~
kalleboo
Part of that is kind of what Apple has been going for over the past couple of
years with macOS, even though they haven't gone all in and removed the
hierarchical part (since there is so much legacy software, and users would
revolt).

A new user profile will come with a prominent "All my files" live search
shortcut that just shows all your files in a jumble sorted by when you last
used them. Then they expect you to search and filter through them by metadata
(which is automatically extracted/indexed by Spotlight). Then you can save
these searches/filters as saved searches which are live-updating virtual
folders.

------
wyndham
The article's main insight: "URLs based on hierarchical names are actually the
URLs of search results rather than the URLs of the entities in those search
results".

~~~
mjevans
In the most technical sense, both are searches encoded into URI form. The
search for the (hopefully) GUID just happens to be for a specific mechanical
object, while the other describes the taxonomic categorization of what a
matching item would look like.

~~~
always_good
Though their "/search?kind=book&title=moby-dick&shelf=american-literature"
example is fundamentally different in that all filters (being URL query
parameters) are optional and can be arbitrarily combined.

I didn't quite understand the point of the hierarchical "search URL" when you
have the /search one implemented, and they go on to say you could implement
both if you have the time and energy.

~~~
kalleboo
The Internet Archive WayBack machine kind of has an optional filter in a
traditional URL scheme - you can replace the date in a WayBack machine URL
with an asterisk as a wildcard and you'll get either the only entry it can
find or a list of dates

[https://web.archive.org/web/20000831072728/http://charlotte....](https://web.archive.org/web/20000831072728/http://charlotte.acns.nwu.edu:80/jln/wwdc97.html)

vs

[https://web.archive.org/web/*/http://charlotte.acns.nwu.edu:...](https://web.archive.org/web/*/http://charlotte.acns.nwu.edu:80/jln/wwdc97.html)

------
andrewstuart2
"The case for identifiers" is really more of a case for surrogate keys.
Surrogate keys need not be opaque, but rather are distinguished by the fact
that they're assigned by an authority and may be completely unrelated to the
properties of an entity.

Natural keys, meaning entity identification by some unique combination of
properties, are hard to get right (oops, your email address isn't unique, or
it's a mailing list) and a pain to translate into a name (`where x = x' and y
= y' and z = z'`, or `/x/x'/y/y'/z/z'`, etc.).

Surrogate keys, on the other hand, make it easy to identify one and only one
object forever, but only so long as everybody uses the same key for the same
thing.

And as mentioned in the article, the most appropriate approach is usually
both. Often you don't have the surrogate key, so you need to look up by the
natural key; but when you do have the surrogate key, it's fastest and most
likely to be correct if you use that in your naming scheme.
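A minimal SQLite sketch of the surrogate-key pattern (table and column names
are illustrative, echoing the article's library example):

```python
import sqlite3

# A surrogate key (opaque, authority-assigned id) alongside the natural,
# mutable attributes. Lookups by natural key remain possible, but
# references use the surrogate, which survives renames.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE books (
    id INTEGER PRIMARY KEY,   -- surrogate key
    title TEXT, shelf TEXT,   -- natural, mutable attributes
    UNIQUE (title, shelf))""")
db.execute("INSERT INTO books (title, shelf) VALUES (?, ?)",
           ("Moby Dick", "american-literature"))

# Look up by natural key when that's all you have...
(book_id,) = db.execute(
    "SELECT id FROM books WHERE title=? AND shelf=?",
    ("Moby Dick", "american-literature")).fetchone()

# ...then renaming the shelf does not invalidate references held by id.
db.execute("UPDATE books SET shelf='classics' WHERE id=?", (book_id,))
(title,) = db.execute("SELECT title FROM books WHERE id=?",
                      (book_id,)).fetchone()
print(title)  # Moby Dick
```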

------
jey

      There are only two hard things in Computer Science: cache invalidation and
      naming things.
      
      -- Phil Karlton
    

[https://martinfowler.com/bliki/TwoHardThings.html](https://martinfowler.com/bliki/TwoHardThings.html)

~~~
andrewstuart2
Aw, you didn't quote the best one from that page.

"There are 2 hard problems in computer science: cache invalidation, naming
things, and off-by-1 errors."

~~~
kornish
Haha, didn't see that you beat me to it.

It's worth including the third saying on the page just for completeness:

"There are only two hard problems in distributed systems: 2. Exactly-once
delivery 1. Guaranteed order of messages 2. Exactly-once delivery"

------
bo1024
Something was bugging me about this, but I had to think hard to figure it out.

The article is largely based on a misguided premise: the idea that URLs should
be conceptualized as either names or identifiers. URLs are neither: they are
_addresses_ of web pages. The _things located_ at the URL may have names or
identifiers, but by design of the web the stuff located at an address is
mutable while the address is immutable.

This is an important point because it breaks the analogies to books or bank
accounts. A physical copy of Moby Dick is a _thing_ that may be located at a
given address, or not. The work of fiction "Moby Dick" has an ISBN number, but
the ISBN number is metadata, not an address. A bank account number is also
metadata, not an address.

So I get the feeling that URLs should be conceptualized as addresses first and
foremost. This isn't a magic bullet for the problem the blog post addresses
(how to design URLs) but I think it gives some perspective:

* If the "thing" at the URL will always be conceptually the same "thing", but its name or other metadata may change, it makes sense to assign that thing a unique identifier and use this as part of the URL. (Because the thing with this ID will always be found at this address.)

* If the name of the stuff located at the URL is never going to change, it makes sense to use the name as part of the URL. (Because the stuff with this name will always be found there.)

* "Search results" as discussed in the blog post are a special case of the previous point: if a URL will always contain search results for a certain query, it makes sense to use the name of the query as part of the URL.

* There are also URLs that fall outside the name or identifier paradigms. [http://www.ycombinator.com/about/](http://www.ycombinator.com/about/) is the address of a bunch of stuff, which is not necessarily a single coherent thing with either an ID number or a name, but is a very reasonable address at which some content may be located.

Maybe this is all obvious, but to me it really helps think about the issue
whereas the blog post confused some things for me, so I thought I'd share.

~~~
naasking
> The things located at the URL may have names or identifiers, but by design
> of the web the stuff located at an address is mutable while the address is
> immutable.

But an address is a designator/identifier.

~~~
bo1024
I'm not sure about that. An address makes no promises (technically speaking)
about what you will find at that address.

You can give an object some metadata like "current address", but that's
different from saying the address alone identifies the object.

~~~
naasking
> I'm not sure about that. An address makes no promises (technically speaking)
> about what you will find at that address.

I don't see how that's relevant. An address, in principle, merely designates a
particular location, perhaps physical like a street address, or logical like a
memory address. In the context of a search or lookup, you can obtain what's
contained at that address.

Similarly, a URL designates a particular resource location, as exemplified by
its full name, Uniform Resource Locator. In the context of a client/server
request, you can similarly obtain a representation of what's at a URL.

------
spiralpolitik
"The downside of the second example URL is that if a book or shelf changes its
name, references to it based on hierarchical names like this one in the
example URL will break."

The author appears to have forgotten about 3xx redirection codes which were
intended to solve that very problem.

~~~
sametmax
But they have been abused for black-hat SEO and are now considered suspicious
by search engines, so we use them sparingly.

This is why we can't have nice things.

~~~
always_good
I don't buy it.

Redirecting to canonical URLs is canonicalization 101.
[https://support.google.com/webmasters/answer/139066?hl=en#4](https://support.google.com/webmasters/answer/139066?hl=en#4)

Also, what would be an example of same-origin redirect abuse?

~~~
sametmax
Bypassing blacklists when posting links, while still benefiting from crawlers
following the links, comes to mind.

During the 2000s, checking every link posted to a forum or blog was way too
expensive, so sites kept blacklists of dirty words to stop porn sites from
spamming for juice during the PageRank golden years, when any back reference
mattered.

Hence it was just easier, to avoid the filters, to create non-blacklisted
domain names with redirections.

Then another trick was to write a perfectly legitimate page, get google to
index it, then redirect that page to the less legitimate page. Because at the
time Google refreshed once a week (or a month...), you'd get plenty of traffic
and revenue for long enough to be worth it. If you sold niche porn and viagra,
that is.

Another one was just to set up fake sites with different URL schemes, with
stats on them, and get a regular update on which URL formats were getting the
best hits. At the time, URLs were very important in getting points. Then you
would regularly update your most important site's URL scheme accordingly,
several times a year if needed.

~~~
always_good
I have a hard time believing that modern search engines are so incapable that
they have to devalue redirects to the point that honest users have to worry
about it.

~~~
sametmax
Well, that's just what I know about the things we did then. I'm not working in
porn anymore, so I'm missing the new cool tricks, or abuses, depending on your
point of view. But the community is VERY creative.

Now, the last time I massively changed a client website's URLs and noticed a
significant drop in traffic that took a few months to recover was years ago.
So the situation might have changed. But I'm not going to test that assumption
with my clients' money :)

~~~
toast0
It's been a long time since I accidentally got to porn on the internet. I
think that kind of thing is mostly dead (although some of the 'related
articles' sections look pretty iffy), and instead porn monetization has moved
towards people intentionally looking for porn.

~~~
sametmax
It's more thanks to ad blockers than anything though.

------
tejtm
[http://journals.plos.org/plosbiology/article?id=10.1371/jour...](http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.2001414)

Abstract

In many disciplines, data are highly decentralized across thousands of online
databases (repositories, registries, and knowledgebases). Wringing value from
such databases depends on the discipline of data science and on the humble
bricks and mortar that make integration possible; identifiers are a core
component of this integration infrastructure. Drawing on our experience and on
work by other groups, we outline 10 lessons we have learned about the
identifier qualities and best practices that facilitate large-scale data
integration. Specifically, we propose actions that identifier practitioners
(database providers) should take in the design, provision and reuse of
identifiers. We also outline the important considerations for those
referencing identifiers in various circumstances, including by authors and
data generators. While the importance and relevance of each lesson will vary
by context, there is a need for increased awareness about how to avoid and
manage common identifier problems, especially those related to persistence and
web-accessibility/resolvability. We focus strongly on web-based identifiers in
the life sciences; however, the principles are broadly relevant to other
disciplines.

claimer: I am one of the many authors.

------
bvrmn
Many commenters here, and the author of the OP, talk about URLs in the browser
address bar. However, the article has "API design" in its title.

------
jgrodziski
Identifying changing "stuff" in the real world is, for me, a fundamental topic
of any serious data modeling for any kind of software (be it an API, a
traditional database, etc.). Identity is also at the center of the entity
concept of Domain-Driven Design (see Eric Evans's seminal book on that:
[https://www.amazon.com/Domain-Driven-Design-Tackling-
Complex...](https://www.amazon.com/Domain-Driven-Design-Tackling-Complexity-
Software/dp/0321125215)).

I started changing my way of looking at identity by reading the rationale of
clojure
([https://clojure.org/about/state#_working_models_and_identity](https://clojure.org/about/state#_working_models_and_identity))
-> "Identities are mental tools we use to superimpose continuity on a world
which is constantly, functionally, creating new values of itself."

The timeless book "Data and reality" is also priceless:
[https://www.amazon.com/Data-Reality-Perspective-
Perceiving-I...](https://www.amazon.com/Data-Reality-Perspective-Perceiving-
Information/dp/1935504215).

More specifically concerning the article, I do agree with the author's point
of view distinguishing access by identifier from access by hierarchical
compound name, the latter better represented as a search. On the ID front, I
find the Amazon approach of using URNs (in summary: namespaced identifiers)
very appealing:
[http://philcalcado.com/2017/03/22/pattern_using_seudo-
uris_w...](http://philcalcado.com/2017/03/22/pattern_using_seudo-
uris_with_microservices.html). And of course, performance matters concerning
IDs and UUID: [https://tomharrisonjr.com/uuid-or-guid-as-primary-keys-be-
ca...](https://tomharrisonjr.com/uuid-or-guid-as-primary-keys-be-
careful-7b2aa3dcb439).

Happy data modeling :)

EDIT: \- add an excerpt from the clojure rationale

------
lwansbrough
Nice, this reflects the choice I've made with a recent API design. This is
especially important for entity names you don't control.

For example, we ingest gamertags and IDs from players of Xbox Live, PSN,
Steam, Origin, Battle.net, etc. - each have their own requirements in terms of
what is allowed in a username, and even whether or not they're unique. Often
you can't ensure a user is unique by their gamertag alone. You can't even
ensure uniqueness based on gamertag and platform name. Reality is that search
is almost always required in these cases, and that's why we've implemented
search in the way described in this article, with each result pointing to a
GUID representing a gamer persona.

~~~
lloeki
> Reality is that search is almost always required in these cases, and that's
> why we've implemented search in the way described in this article, with each
> result pointing to a GUID representing a gamer persona

This also solves the technical† challenge of handling renaming, even within a
single platform. (Steam, I _hate_ you.)

† Another challenge is social, esp. regarding abuse.

------
jlg23
Missing for me: Timestamps. A lot of data is sufficiently unique if prefixed
with a timestamp, which could be as simple and readable as /2017/10/17/my-
great-blog-post/

------
HumanDrivenDev
A bit of an aside: why is it not standard practice to format UUIDs in a
radix-64 encoding? It cuts the identifier size down from 32 characters to 22.

~~~
jaza
I'm a fan of formatting UUIDs in Base62 (which is like Base64 but doesn't
require any non-alphanumeric characters). I have used Base62-encoded UUIDs in
URLs and APIs on several occasions. It's not standard, but if you google
around you'll see that it's gaining popularity because of the shorter
identifier length.
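A Base62 codec over a UUID's 128-bit integer takes only a few lines. This is a
generic sketch, not any particular library's implementation:

```python
import uuid

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def uuid_to_base62(u: uuid.UUID) -> str:
    """Encode a UUID's 128 bits in Base62: the 32 hex digits (36 with
    dashes) shrink to at most 22 URL-safe characters."""
    n = u.int
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(ALPHABET[r])
    return "".join(reversed(out)) or "0"

def base62_to_uuid(s: str) -> uuid.UUID:
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return uuid.UUID(int=n)

u = uuid.uuid4()
assert base62_to_uuid(uuid_to_base62(u)) == u
assert len(uuid_to_base62(u)) <= 22  # 62**22 > 2**128
```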

------
buro9
The author took an easy way out by recommending a canonical identifier-based
URL _and_ a named URL, and then choosing a library as an example.

Books in a library are seldom renamed, if ever. The named URL would be almost
as permanent as the canonical URL.

However, in their earlier example of a bank account, a personal account's name
is typically the account holder's name plus the type of account, and both of
these could change as a result of marriage, death, or a change in the products
offered by the bank. Even then, the rate of change is low.

A better example that the author could have (should have?) used is that of a
news website where the article title may change frequently and yet there is a
desire to make the link indicate the type of content at the destination...
this is the real crux of the issue.

On a news site, a canonical identifier-driven URL may be correct... but it
does not sell or communicate the story behind the link, and the link is likely
to be shared without context. Sure, you may see
`example.com/news/a49a9762-3790-4b4f-adbf-4577a35b1df7`, but this could be
_any_ news... it is far less obvious what is behind the link than in the
banking example, as the diversity in news stories is huge.

Yet the named URL would likely fail too, as once created and shared it should
not mutate or at least should remain working... and yet the story title is
likely to be sub-edited multiple times as news evolves.

The best scheme was not even mentioned in the article... combining both an
identifier with a vanity named part:
`example.org/news/a49a9762-3790-4b4f-adbf-4577a35b1df7_choosing_between_names_identifiers_URLs`
. The named part can vary as it is not actually used for lookup, only the
prefix identifier is used for lookup.

Though that has its own downside... one can conjure up misleading named
sections for valid identifiers to misdirect and mislead.

------
dreamfactored
Odd that the article doesn't seem to mention the considerations of whether IDs
are a) globally unique and b) unguessable, nor the huge difference between the
URL-param and directory styles: in the directory style, a param's meaning is
inferred from its position, making all params required, and omitting the final
one amounts to it equalling *.

------
baradas
There's also the locality aspect of the problem, which is unaddressed.
Typically humans resolve ambiguity in a finite namespace; e.g. there are only
a few Bobs I know of. If a single human were asked to resolve a "Bob" without
context, it would be a hard problem. I think all name-resolution problems are
related to identification on the basis of attributes, and a URL in a certain
sense is supposed to model enough attributes to help us resolve this. Unlike
humans, we have modeled systems not with distributed and local information,
but with URL resolution handled by a central brain of sorts.

------
DelightOne
> You also need to be careful about how you store your identifiers—the
> identifiers that should be stored persistently by the API implementation are
> almost always the identifiers that were used to form the permalinks. Using
> names to represent references or identity in a database is rarely the right
> thing to do—if you see names in a database used this way, you should examine
> that usage carefully.

What does this mean? Is it just saying: don't use the name hierarchy but
rather the permalink key as the identity in the database?

------
mcdan
Isn't one problem with this that intermediate caches now have two resources
representing the same thing, so invalidation of intermediate caches will be
nearly impossible?

~~~
sneak
Cache invalidation remains one of the two hard problems in computer science
(the other being naming things and off by one errors).

~~~
kmicklas
Off by one errors basically don't exist if you use modern languages and
practices.

~~~
mjevans
As long as humans are still providing input...

------
nazri1

        Those who do not understand UNIX are condemned to reinvent it, poorly. -- Henry Spencer 
    

Hard links, symlinks and inodes.

------
monkeycantype
in the article:

/shelf/{something}

{something} could be a name - 'american literature'

{something} could be an identifier - '20211fcf-0116-4217-9816-be11a4954344'

if someone calls:

https://library.com/locations with { "kind": "Shelf", "name":
"20211fcf-0116-4217-9816-be11a4954344" }

now we have a shelf named with the id of a different shelf

and the meaning of

/shelf/20211fcf-0116-4217-9816-be11a4954344/book

is now ambiguous

i don't know a great way to avoid this

this is unambiguous, but i don't think my co-workers would like it:

/shelf/name/{id}/books

/shelf/id/{id}/books

I think this would be only slightly more popular:

/shelf/name/{id}/books

/shelf/{id}/books

because the thing after shelf/ would not consistently be an id
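The collision can be made concrete with a small resolver sketch (hypothetical data and function names; this is one way to illustrate the comment, not the article's design). An untyped `/shelf/{something}` route has to guess by syntax, so a shelf *named* with another shelf's id becomes unreachable; the typed `/shelf/id/...` and `/shelf/name/...` variants remove the guess:

```python
import re

UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$")

# Hypothetical store: shelf B is maliciously *named* with shelf A's id,
# the exact collision described above.
shelves_by_id = {"20211fcf-0116-4217-9816-be11a4954344": "shelf A"}
shelves_by_name = {"20211fcf-0116-4217-9816-be11a4954344": "shelf B",
                   "american literature": "shelf A"}

def resolve_untyped(segment):
    """/shelf/{something}: guess by syntax -- shelf B is unreachable."""
    if UUID_RE.match(segment):
        return shelves_by_id.get(segment)
    return shelves_by_name.get(segment)

def resolve_typed(kind, segment):
    """/shelf/id/{id} or /shelf/name/{name}: the caller states the type."""
    table = shelves_by_id if kind == "id" else shelves_by_name
    return table.get(segment)

print(resolve_untyped("20211fcf-0116-4217-9816-be11a4954344"))       # shelf A
print(resolve_typed("name", "20211fcf-0116-4217-9816-be11a4954344")) # shelf B
```

Another common mitigation, under the same assumption, is to simply reject names that parse as valid identifiers at creation time.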

------
amelius
Why not make every URL that's shown in the title bar a permalink by default?

That way, you have the best of both worlds in all cases.

If an object tries to use a URL already taken by another object, then a new
URL must be generated (just append something to the end of the name).

~~~
s17n
Because then you are compromising the utility of your url-as-search semantics,
complicating your implementation, and probably distorting your data and/or
schema. A better solution is to make the distinction between "display" and
"permanent" url a first class concept.

~~~
amelius
> Because then you are compromising the utility of your url-as-search
> semantics

Could you explain why?

------
a13n
For Canny, I wrote some awesome code that I'm proud of that turns a "post
title" into a unique URL.

https://react-native.canny.io/feature-requests/p/headless-js-for-ios

For example, a post with title "post title" will get url "post-title".

Then a second post with title "post title" will get url "post-title-1".

Since there's only one URL part associated with each post, it's a unique
identifier.

This gets rid of the ugly id in the URL, for epic URL awesomeness.

Furthermore, if you edit the first post to have "new post title" then its URL
will update to "new-post-title", but "post-title" will still redirect to "new-
post-title".

Someday I'm gonna open source a lib that lets you easily add awesome URLs to
your app. :)
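The behaviour described (slugify, suffix duplicates, redirect old slugs after a rename) can be sketched like this. This is my own minimal reconstruction under those assumptions, not Canny's code, and the class and method names are invented:

```python
import re

class SlugRegistry:
    """Slugify titles, suffix duplicates, and keep old slugs as redirects."""

    def __init__(self):
        self.slug_to_post = {}   # slug -> post id (current and historical)
        self.post_to_slug = {}   # post id -> current slug

    def _slugify(self, title):
        return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

    def assign(self, post_id, title):
        base = self._slugify(title)
        slug, n = base, 0
        # Walk -1, -2, ... until we find a slug not owned by another post.
        while slug in self.slug_to_post and self.slug_to_post[slug] != post_id:
            n += 1
            slug = f"{base}-{n}"
        self.slug_to_post[slug] = post_id   # old slugs remain -> redirects
        self.post_to_slug[post_id] = slug
        return slug

    def resolve(self, slug):
        """Return the *current* slug for whatever post this slug names."""
        return self.post_to_slug.get(self.slug_to_post.get(slug))

r = SlugRegistry()
r.assign(1, "post title")        # -> "post-title"
r.assign(2, "post title")        # -> "post-title-1"
r.assign(1, "new post title")    # -> "new-post-title"
print(r.resolve("post-title"))   # old slug redirects -> "new-post-title"
```

Note the one design decision this forces: because each slug is the unique identifier, slugs can never be reused for a different post, only redirected.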

~~~
jlg23
> For Canny, I wrote some awesome code that I'm proud of that turns a "post
> title" into a unique URL.

Did you mean "slug"? What you are describing is a basic feature of most
blogging software since the inception of blogs...

~~~
a13n
It's way more than that.

– Automatically handling duplicates

– Avoiding needing to include the unique ID in the URL

– Updating the URL after editing the post

– Redirecting previous versions to the new version

~~~
kalleboo
Wordpress has supported all of that for as long as I can remember

~~~
a13n
What's your point?

~~~
vilmosi
The point is you're reinventing the wheel.

------
joshzilla2017
Nice example of usability concerns, but I think bookshelves/<bookshelf>/<book>
is much more intuitive than bookshelves/<bookshelf>/book/<book>.

~~~
noisy_boy
If it were a box instead of a bookshelf, then /book/ makes more sense;
otherwise you don't know which kind of object the <book> id refers to.

------
mirko22
Off topic, but i wish i could open a simple blog page without enabling a ton
of JavaScript :/

------
afandian
Good advice. Interesting that Canonical URLs aren't mentioned.

But the sheer arrogance of serving a webpage that doesn't render any text
unless you execute their JavaScript really annoys me. It's not a fancy
interactive web-app, it's a webpage with some text on it.

~~~
brogrammernot
I understand the frustration, but you have to understand that the vast
majority of individuals render JS on the page and do not use text-only
browsers.

It’s not worth the time to appeal to such a minority share of internet users.

~~~
afandian
Your argument holds for web apps where it might be extra work to do
progressive enhancement. But this is literally a webpage of text. It is _more_
work to get JS involved.

Humans using off the shelf browsers aren't the only ones who consume webpages.

~~~
jonknee
> But this is literally a webpage of text. It is more work to get JS involved.
> Humans using off the shelf browsers aren't the only ones who consume
> webpages.

Sort of, the contents of the post are in a database somewhere. It's not like
someone uploaded a .html page to Blogspot and they converted it into JS. The
JS makes it easier for users to customize templates.

The main reason you'd want to avoid doing something like this is that the
Googlebot would penalize you, but somehow I doubt Google is concerned with
that.

That said, it has a <noscript> version that seems to work fine (I turned off
JS and it renders as expected).

~~~
Nicksil
> Sort of, the contents of the post are in a database somewhere. It's not like
> someone uploaded a .html page to Blogspot and they converted it into JS. The
> JS makes it easier for users to customize templates.

That's not always the case either, however. There are a great many (a
majority?) of database-driven websites out there with framework-rendered
templates: Django, Flask, Ruby on Rails, etc. They are not constructed using
-- nor are they dependent upon -- JavaScript.

