
Ask HN Mods: Can you make submission URLs human-readable? - 3rd3
By now, I probably stumbled a dozen times upon the problem that HN links do not reveal at all what they are about. To solve this problem you could maybe simply add the title string to the item ID like this:

    https://news.ycombinator.com/item?id=7535606-can_i_delete_my_skype_account

I think that would be fairly easy to implement and it could be easily made backward compatible by forwarding the old links to the new format. With a dash as separator it’s still possible to select the ID with a double click (at least on my system). I think that would be a great improvement. Thanks!
======
izietto
I like StackOverflow URLs; f.e.:

    
    
      /questions/22881084/couldnt-find-bar-with-id-472021-polymorphic-associations
    

And every URL that matches /questions/22881084 redirects to the main URL.

The URL of this news would become:

    
    
      /items/7536719/ask-hn-mods-can-you-make-submission-urls-human-readable

~~~
gambler
I hate those. They contain duplicate information, mostly for the sake of
Google rather than users. The fact that /questions/22881084 still works is a
nice touch, though. Guess why it's there? People sometimes still type URLs by
hand. In that case numerics are actually much nicer than a long-winded textual
URL. (An interesting approach would be to use alpha encoding, like ba, bc, bc,
etc for the index. More information, easier to type on mobile devices.)

The only case when I like title-based URLs is when they're hand-crafted,
memorable and short. Something like example.com/movies/terminator2. Even then
you run into problems, because it could be terminator_2 or Terminator_II, or
whatever. The moment that URL turns into Terminator_2_Judgment_Day I begin to
question its usefulness.

What's wrong with synthetic IDs, anyway? People go on and on about
"readability" of URLs, but you have to evaluate it in the proper context.
_Where_ are they more readable? If I'm in the browser, I already have the
title at _three_ places on my screen (tab, window title, page itself).
Sequential numeric ID gives me additional information about order of the
articles and sometimes even allows for neat tricks, like skipping to the next
article by incrementing the number.

~~~
mherkender
Expecting users (or even yourself) to manually enter URLs like
"/movies/terminator2" is just not going to work long-term if you can generate
the URL based on the title of the page.

It's also not necessarily duplicate information (more like extraneous
information), nor is it a disadvantage to have readable URLs. I'm going to
explain why by weighing the pros and cons of the main options as I see them.

/questions/how-do-i-foo-bar

Doesn't allow renaming, can't have duplicates with similar titles. Since the
lookup key is "how-do-i-foo-bar" it's a variable width string, which is less
ideal than a fixed width number/string.

/questions/342993123

Doesn't have the problems above, but now it's not readable. Users can see
these URLs in search engines and perhaps more significantly, when hovering
over links to see where they go. Google's own SEO guide recommends readable
URLs.

/questions/342993123/how-do-i-foo-bar

Best of both worlds. If you redirect "/questions/342993123/*" to
"/questions/342993123/how-do-i-foo-bar", you can even allow users to rename
titles without having to store the old URL, all while having one canonical
version of each "question".
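
A minimal sketch of that redirect rule, assuming a hypothetical in-memory map from question ID to current title (the names `slugify` and `resolveQuestion` and the exact slug format are illustrative, not anything Stack Overflow or HN is known to use):

```javascript
// Hypothetical helpers; names and URL layout are assumptions for illustration.

// Turn a title into a lowercase, dash-separated slug.
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of other characters into one dash
    .replace(/^-+|-+$/g, "");    // trim leading/trailing dashes
}

// Given a request path and a Map of id -> current title, return
// { status: 200 } when the path is already canonical, a 301 pointing at
// the canonical path otherwise, or a 404 for unknown ids.
function resolveQuestion(path, titlesById) {
  const match = /^\/questions\/(\d+)(?:\/.*)?$/.exec(path);
  if (!match || !titlesById.has(match[1])) return { status: 404 };
  const canonical = `/questions/${match[1]}/${slugify(titlesById.get(match[1]))}`;
  return path === canonical
    ? { status: 200 }
    : { status: 301, location: canonical };
}
```

Because only the numeric part is used for the lookup, a renamed title just changes where the 301 points; nothing about old slugs has to be stored.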

One last thing, those IDs aren't sequential, they're just unique. If you
increment a HN news article URL you'll probably end up in a comment (because
articles are just root comments), or on other sites, a deleted/spam
submission. Sequential only works for content that doesn't change, and most
content changes.

~~~
gambler
When I said "duplicate", I meant that both the number and the title are in
themselves sufficient to locate the article.

I get the renaming scenario, but I think it is an edge case.

Most duplication can be avoided by appending date or user name to the title
_if_ there is already a similar title in the database.

/questions/foo_bar-by_mherkender /questions/foo_bar-2001-01-01

At least this way _unique_ URLs will be short(er) and more memorable.
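
Roughly, the fallback could look like the sketch below. The function name `uniqueSlug`, the `taken` set (standing in for whatever the database already holds) and the order of the fallbacks are all assumptions made for illustration:

```javascript
// Sketch only: `taken` is a Set of slugs that are already in use.
function uniqueSlug(title, author, date, taken) {
  const base = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
  // Try the bare title first, then the author suffix, then the date suffix.
  const candidates = [base, `${base}-by_${author}`, `${base}-${date}`];
  const found = candidates.find((slug) => !taken.has(slug));
  // null means every candidate collided and some other scheme is needed.
  return found === undefined ? null : found;
}
```

The `null` case is the interesting one: if the same author posts the same title twice on one day, this scheme alone runs out of names.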

Overall, though, I think you overestimate the usefulness of named URLs,
because you concentrate on specific cases like people linking to something bad
on YouTube (that still has an appropriate title) without giving coherent
description. There are other cases too. A lot of them. They just aren't as
annoying/memorable.

[http://stackoverflow.com/questions/22883751/this-is-not-
so-h...](http://stackoverflow.com/questions/22883751/this-is-not-so-helpful-
is-it)?

I read a small magazine about PC games, and I recall using ID jumping for
navigation a lot. When the numbers are below 1000, it is easier to remember
and type "article=234" than
"/article/Does+Assassin's+creed+break+new+ground?".

Also, I sincerely think that the trend for named URLs has more to do with
Google than what users want or like. If Google liked super-long URLs with lots
of punctuation, most websites would do just that and people would find ways to
rationalize the trend. This doesn't automatically mean title URLs are wrong,
but it's somewhat different from "everyone likes them".

~~~
mherkender
Using dates is a good alternative, but they're not a universal solution. If
Microsoft open-sources Windows XP you can be sure that dozens of articles
titled "Windows XP open-sourced!" will end up on HN that day, and a site that
rejects articles because they have the same title as another has a bad user
interface. That said, if there's a small enough amount of content, it might be
good enough.

I'm not saying sequential ID URLs are bad, they're still possible in my
suggestion, I was just explaining how navigating by incrementing a number
isn't plausible for sites with large amounts of user-generated content. It
usually doesn't scale.

Anyway, I'm not overestimating the advantage of readable URLs. You haven't
given any reason why readability isn't an advantage in URLs, and I've given
several explicit reasons why they are (SEO, hovering over links, etc).

------
dm2
Here are two bookmarklets which might accomplish what you are requesting:

    
    
        javascript:window.location=window.location+'&title='+document.title
    
        javascript:window.location=window.location+'#'+document.title
    

I don't mind the simple IDs for HN URLs.

The thing that I have the most trouble with is when bookmarking comments
sections by dragging the "XX comments" link to my bookmarks bar, the title
then saves as "XX comments" rather than the title of the article. This might
be a browser / HTML issue though, I tried adding a "title" tag but that didn't
fix it.

Updated bookmarklet below:

    
    
        javascript:window.location=window.location+'&title='+document.title.replace(/[^A-Za-z0-9]/g, "_");

~~~
eik3_de
If you add converting non-url-safe characters to dashes I would call that
bookmarklet a >90% sufficient solution that doesn't need any server-side
changes.

Users who need a readable URL just use the bookmarklet.

~~~
dm2
Just replace document.title with document.title.replace(/[^A-Za-z0-9]/g, "_");
and it should work with all titles.
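
For the dash variant requested above, something like the following might do. It is written as a plain function (the name `withTitle` is made up here) so it can be tried outside a browser; it also collapses runs of punctuation into a single dash and trims dashes at the ends, which the one-liner above does not:

```javascript
// Append a dash-separated version of the title as a "title" query parameter.
// Assumes href already has a query string, as HN item URLs do.
function withTitle(href, title) {
  const slug = title
    .replace(/[^A-Za-z0-9]+/g, "-") // one dash per run of unsafe characters
    .replace(/^-+|-+$/g, "");       // no leading or trailing dash
  return href + "&title=" + slug;
}

// As a one-line bookmarklet, the same idea would be roughly:
// javascript:window.location=window.location+'&title='+document.title.replace(/[^A-Za-z0-9]+/g,'-').replace(/^-+|-+$/g,'');
```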

------
jw2013
That is a good request. The old URLs would not need any forwarding if the
title were added as a new GET parameter, such as:

[https://news.ycombinator.com/item?id=7536719&title=Ask_HN_Mo...](https://news.ycombinator.com/item?id=7536719&title=Ask_HN_Mods:_Can_you_make_submission_URLs_human-
readable)

------
Pirate-of-SV
Or by using anchors, it's backwards compatible by design.

    
    
        https://news.ycombinator.com/item?id=7535606#can_i_delete_my_skype_account

~~~
dbaupp
One would want to ensure that the anchor was guaranteed to be distinct from
every id on the page (e.g. if this submission had title 'up_7536771', browsers
would jump to your upvote button if the title was just appended).

~~~
icebraining
You can just prepend a character that is valid in the URL but invalid for an
id, like a colon. E.g.

    
    
      https://news.ycombinator.com/item?id=7536719#:ask_hn_mods
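
A small sketch of building such a link, assuming a made-up helper name `titleAnchor` and an underscore slug like the one in the examples above. The fragment never reaches the server, so the `id` parameter is untouched:

```javascript
// Append "#:" plus a slug of the title; the leading colon keeps the
// fragment from matching element ids such as "up_7536771".
function titleAnchor(itemUrl, title) {
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
  return itemUrl + "#:" + slug;
}

const link = titleAnchor("https://news.ycombinator.com/item?id=7536719", "Ask HN Mods");
const parsed = new URL(link);
console.log(parsed.searchParams.get("id")); // 7536719
console.log(parsed.hash);                   // #:ask_hn_mods
```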

------
nailer
Why not actually have separate resources (and URLs) for each story, rather
than have all stories ever as 'item'?

    
    
        https://news.ycombinator.com/items/7535606/can_i_delete_my_skype_account

~~~
gambler
I never understood why people are so hell-bent on advocating
[https://example.com/item/7536855](https://example.com/item/7536855) over
[https://example.com/item?id=7536855](https://example.com/item?id=7536855), or
even
[https://news.ycombinator.com/?item/7536855](https://news.ycombinator.com/?item/7536855)
in human-readable websites.

The last form is short, allows for the simpler routing and much easier linking
on the backend as you don't have to do any magic to resolve relative URLs.
That's a tangible benefit, and it can make some code and operations much
simpler. (For example, you could add an alternative page called item2, and it
would happily reuse all the resources of item just via relative URL
resolution.) Ability to use direct links (<a href="?comment/111">) also means
more readable templates and easier time finding what links where.

The first form is supposedly "hierarchical" and "more semantic". What does
that buy us in practical terms?

~~~
solipsism
Most modern routing systems I've worked with can match against any of the URL
schemes you describe, without any difference in flexibility. Segments of the
path can be tied to hardcoded routes, or passed into handlers as parameters,
or any combination thereof. And relative URLs work just fine as
"/comment/123". I'm not sure I understand what you mean by "direct link".

~~~
gambler
With enough code you can parse anything. My point is that query-based linking
can greatly simplify your overall codebase and let the _standard URL
resolution_ in the browser do most of the work for you.

Let's look at two examples: example.com/?article/123 vs
example.com/article/123.

In the first case you can link to example.com/assets/main.css by simply using
<a href="assets/main.css">. It will work with no additional code. It will work
from the root of the website, from ?article/123, or ?article/456, or
?any/other/place/you/want.

In the second case you need to reverse your routing logic, because "<a
href="../assets/main.css">" will work on some pages, but will break on paths
with more nesting (such as example.com/articles/about_it/123).
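
The difference can be checked with the standard `URL` constructor, which resolves a relative reference against a base the same way a browser does. The helper name `resolve` is just for this sketch:

```javascript
// Resolve a relative href against the URL of the page it appears on.
function resolve(href, pageUrl) {
  return new URL(href, pageUrl).href;
}

// Query-based pages all share the same path ("/"), so one relative href works everywhere.
const fromQuery = resolve("assets/main.css", "http://example.com/?article/123");

// Path-based pages: what "../" means depends on how deeply the page is nested.
const shallow = resolve("../assets/main.css", "http://example.com/article/123");
const deep = resolve("../assets/main.css", "http://example.com/articles/about_it/123");

console.log(fromQuery); // http://example.com/assets/main.css
console.log(shallow);   // http://example.com/assets/main.css
console.log(deep);      // http://example.com/articles/assets/main.css
```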

In ASP.NET MVC, for example, linking looks like <a
href="@Url.Content("~assets/main.css")">. They've added a shorthand for it in
Razor 2, but that means that the overall codebase is even more complex: your
templating engine parses the content of your templates, finds things that look
like virtual URLs, reads the routing table, performs reverse relative
resolution, and spits out the altered URL. It is extremely difficult to debug
if something goes wrong. Is all this complexity really warranted?

If you're using PHP/Apache, query strings have an additional benefit of not
needing any routing setting in .htaccess. Makes deployments somewhat more
straightforward.

Also, you can easily implement "areas" with different assets by creating an
actual sub-directory with an alternate index script, then copying and editing
assets. This is way, way easier and more maintainable than all of the dynamic
solutions I've seen.

"Wait", some people say, "but what if you change stuff?" If you change
something on the backend, you change your routing tables to match it, so the
external URLs stay the same. Yes, you still need routing. But now it only does
one thing. You've decoupled routing from other components and made templates
much simpler and more readable.

~~~
kaoD
IMHO the most important reason for not doing so: queries are not made for
that! Queries are queries, not routes. You should route with routes
(completely unexpected), not with queries. The difference between query and
route actually _means_ something to the software interacting with your site.

I only hate one thing more than using queries where you should use routes:
using routes where you should use queries (especially common in shady SEO
stuff).

> query-based linking can greatly simplify your overall codebase and let the
> standard URL resolution in the browser do most of the work for you.

It does not simplify my overall codebase at all. In fact, query routing will
add lots of unnecessary code to my codebase if I have to parse the query as a
route, while the route is already parsed as a route.

> In the second case you need to reverse your routing logic

It's just <a href="/assets/main.css"> with a leading slash. Again, that's what
/ is for.

> If you're using PHP/Apache, query strings have an additional benefit of not
> needing any routing setting in .htaccess.

I think that was exactly what early 2000s forum developers thought, but as
they learnt soon it was a really bad idea.

Don't do that. Please.

~~~
gambler
_IMHO the most important reason for not doing so: queries are not made for
that! Queries are queries, not routes. You should route with routes
(completely unexpected), not with queries._

This does not highlight any practical or even realistic theoretical benefits
of using routes over queries. Besides, it's called "path", not "route", and I
don't know of anything in URI spec that would support your vision.

    
    
       The query component contains non-hierarchical data that, along with
       data in the path component (Section 3.3), serves to identify a
       resource within the scope of the URI's scheme and naming authority
    

_It's just <a href="/assets/main.css"> with a leading slash. Again, that's
what / is for._

This breaks if you need to re-deploy to a sub-directory.
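
To illustrate with the standard `URL` constructor, assuming the site has been moved under a hypothetical `/myapp/` directory:

```javascript
// Page served from a sub-directory deployment (the "/myapp/" prefix is made up).
const page = "http://example.com/myapp/?article/123";

// A root-relative href ignores the sub-directory entirely.
const rooted = new URL("/assets/main.css", page).href;

// A document-relative href follows the page to its new location.
const relative = new URL("assets/main.css", page).href;

console.log(rooted);   // http://example.com/assets/main.css
console.log(relative); // http://example.com/myapp/assets/main.css
```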

 _I think that was exactly what early 2000s forum developers thought, but as
they learnt soon it was a really bad idea._

The only reason I know of that it was a bad idea in the 2000s is because early
search engines arbitrarily assigned different values to words in query
strings. This has changed.

~~~
kaoD
> Besides, it's called "path", not "route"

Path and route are synonyms (I might be confused because of my mother tongue)
and I wanted to highlight that fact: you should use routes (paths) to route.

> and I don't know of anything in URI spec that would support your vision.

[http://tools.ietf.org/rfc/rfc3986.txt](http://tools.ietf.org/rfc/rfc3986.txt)
section 3.4 the very first sentence says: "The query component contains non-
hierarchical data"

Your proposal "?article/123" seems very hierarchical to me. You even used
path-like syntax and "routing via queries" sounds inherently hierarchical. You
should use a path instead.

> This does not highlight any practical or even realistic theoretical benefits
> of using routes over queries.

Of course it doesn't if you strip the important part of my comment :P Quoting
myself: "The difference between query and route actually means something to
the software interacting with your site."

It's akin to inventing your own HTTP verb.

> This breaks if you need to re-deploy to a sub-directory.

Then you should use <base>. That's why it's there, so you don't have to jump
through hoops.

Using queries for navigation is just as bad as the defunct habit of using
hashbangs, which served a purpose when there was no alternative. The
technology moved forward and we have better solutions.

~~~
gambler
Firstly, routing is not inherently hierarchical. It's a process of mapping
pieces of URL to variables with the purpose of figuring out which action to
call. So your argument that implies routing is inherently related to URL paths
(by the virtue of their names) is invalid. It can and often does involve
queries.

Secondly, if you ever tried to implement a web framework from scratch, you
would quickly see that routing something like
"?controller=blah&action=blah&id=123" is MUCH easier than dealing with virtual
paths. Aside from relative URL issues in templates, you also need to deal with
differences between real assets and everything else.
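
As a rough illustration of how little code that takes, here is a sketch using the standard `URLSearchParams` API. The handler table and its single entry are hypothetical:

```javascript
// Hypothetical handler table keyed by "controller/action".
const handlers = new Map([
  ["article/show", (id) => `showing article ${id}`],
]);

// Pick a handler purely from the query string; the path plays no part.
function dispatch(url) {
  const params = new URL(url).searchParams;
  const key = `${params.get("controller")}/${params.get("action")}`;
  const handler = handlers.get(key);
  return handler ? handler(params.get("id")) : "404 not found";
}
```

Whether this is really simpler than path routing depends on the framework; the sketch only shows that the query-string side needs no pattern matching at all.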

Thirdly, my proposal does not go contrary to rfc3986. "The query component
contains non-hierarchical data" is not the same as "query component must not
contain any hierarchical data". If it did, you could also argue that it's
"wrong" to put an address into a query string, because, hey, it's a hierarchy.
(And again, routing is not inherently hierarchical.)

Fourthly, base tag doesn't deal with the original use case I described and is
a PITA to maintain.

------
mstolpm
While I agree that HN URLs are very non-descriptive, one problem of the
proposed solution might be submissions that get their titles edited by mods
later?

~~~
latk
That's not an issue if _only_ the ID is used to identify the submission,
whereas the title slug is only human-readable fluff. E.g

    
    
        /item/1234/original-submission
        /item/1234/edited-title
        /item/1234/arbitrary-string
        /item/1234/
    

This could also be done with the current query string approach, e.g.

    
    
        /item?id=1234&title=original-submission
        /item?id=1234&title=edited-title
        /item?id=1234&title=arbitrary-string
        /item?id=1234
    

OP's suggestion of having the title be part of the ID would be quite
inelegant, but possible if the real ID is only the digits up to the first
hyphen.
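
A sketch of that parsing rule, with a made-up function name `parseItemId`; everything after the first hyphen is simply discarded:

```javascript
// Extract the numeric item id from an "id" parameter that may carry a
// trailing "-title" part. Returns null if there are no leading digits.
function parseItemId(idParam) {
  const match = /^(\d+)(?:-.*)?$/.exec(idParam);
  return match ? Number(match[1]) : null;
}

const raw = new URL(
  "https://news.ycombinator.com/item?id=7535606-can_i_delete_my_skype_account"
).searchParams.get("id");
console.log(parseItemId(raw)); // 7535606
```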

Ideally, the title would be completely ignored, but other titles would
redirect to an URL with the current title.

~~~
seanwoods
I actually think the OP's suggestion is the most elegant of everything I've
seen here.

It's a matter of aesthetics.

------
pearjuice
What about items which do not have titles, like comments?

~~~
twic
Either nothing, or:

[https://news.ycombinator.com/item/7536783/pearjuice-what-
abo...](https://news.ycombinator.com/item/7536783/pearjuice-what-about-items)

Or:

[https://news.ycombinator.com/item/7536783/pearjuice-
comment-...](https://news.ycombinator.com/item/7536783/pearjuice-comment-on-
Ask-HN-Mods-Can-you-make-submission-URLs-human-readable)

------
dang
It's an interesting suggestion, but I'm not sure I clearly see the value,
especially compared to the other things we have to do.

(I'm going to demote this thread now, rather than kill it, so anyone who wants
to can continue discussing.)

------
saurik
Why is this a "problem"? I frankly consider it a feature: the URL is shorter,
and it encourages people to put a real title next to the URL that is actually
designed for humans to read and understand. Schemes that embed titles either
lock the title in for all time (which is clearly problematic and confusing
once the URL changes) or make the title just ancillary URL cruft. (I believe
Wordpress goes with the former, while most other websites go with the latter,
including reddit and Stack Overflow; I would be willing to believe that I'm
wrong about Wordpress, though.)

The ancillary URL cruft version of the feature means the page effectively now
has an infinite number of possible URLs and no effective "canonical" one
without running into the "title changed" problem. It doesn't let you know
what's on the other side before clicking, because you can change it to read
whatever you want (or remove it entirely, for the people who like the intrigue
or surprise). By making the URL longer it is more likely to be broken apart
and formatted horribly. It causes people to go "engh, the URL is sufficient"
and then not provide their own title, leaving the reader to have to read the
mangled "all lower case, no punctuation, space characters missing, truncated"
version of the title out of the URL; and even when the mangled form is clearly
confusing it makes adding the title again more costly (longer message that
might no longer fit that feels horribly redundant).

Of course, if the URL is further designed to be human rememberable or human
guessable, where you can reasonably expect the user to type the URL in,
that's great! restaurant.example.com/menu/ is an amazing URL; but
restaurant.example.com/8364928/restaurant-s-daily-menu/ has neither of these
features: these "embed the title" schemes are just abusing URLs as storage for
content that simply should not be in the URL.

It certainly isn't like this is some requirement for success: YouTube has done
very well for itself without succumbing to the "embed the title" mistake.
Facebook does not embed titles of posts in their URLs. Pinterest and Instagram
could easily build such a feature out of the short descriptions attached to
images, and Twitter could have snipped some of the tweet text, but thankfully
none of them chose to. Even after being "in the game" for many years, Google
(of all people) did not do this for Google+ either (though some might say
Google+ isn't highly successful, and some even blame the URLs, it should be
clear that tacking a title to the end of the URL would not solve any of their
problems ;P).

Really, there is only one benefit I've ever seen for putting the titles in the
URLs, which I think was most eloquently described by a user on reddit a few
years ago in a thread on this very topic. I have provided a link to this
thread, but will point out that this is a very niche use case that itself
should probably not be encouraged :/.

    
    
        http://www.reddit.com/r/AskReddit/comments/1gbl0u/why_does_reddit_put_titles_in_their/

~~~
Aqueous
"these "embed the title" schemes are just abusing URLs as storage for content
that simply should not be in the URL."

I don't look at it this way. I think readable URLs are part of the way the web
was designed - this idea that URLs map to individual documents and not to an
application that has to translate between the URL and the resource it
represents.

Quite to the contrary: Having a database id in the URL is sticking model/data
layer logic where it doesn't belong - the view layer of the APP.

As for the permalink problem on changing titles you can simply store a history
of redirects for every title that the article has ever had to the current,
active URL.
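
One way such a redirect history might be kept, sketched with in-memory maps (the class name `SlugHistory` and the response shape are assumptions). Note that it has to refuse a title that another article has ever used, since the slug is the only key:

```javascript
class SlugHistory {
  constructor() {
    this.ownerBySlug = new Map(); // every slug ever used -> article id
    this.currentById = new Map(); // article id -> slug currently in use
  }

  // Record a first or changed title slug; refuses slugs owned by another article.
  setSlug(id, slug) {
    const owner = this.ownerBySlug.get(slug);
    if (owner !== undefined && owner !== id) return false;
    this.ownerBySlug.set(slug, id);
    this.currentById.set(id, slug);
    return true;
  }

  // 200 for the active slug, 301 for a retired one, 404 for an unknown one.
  lookup(slug) {
    const id = this.ownerBySlug.get(slug);
    if (id === undefined) return { status: 404 };
    const current = this.currentById.get(id);
    return current === slug
      ? { status: 200, id }
      : { status: 301, location: "/" + current };
  }
}
```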

~~~
path411
Having an id in the URL lets you have a key to uniquely query. Without it, you
would have to disallow identical titles.

What benefit do you get from having a title in the URL?

I think of a URL as simply an easily repeatable parameter of a web request.
The URL is only one parameter of the request, but it's the only one that's
easy to repeat.

------
fhars
[https://news.ycombinator.com/item?id=7536719&title=this+is+a...](https://news.ycombinator.com/item?id=7536719&title=this+is+a+stupid+idea)

------
6cxs2hd6
Although I like this idea, I wouldn't want this feature to get prioritized
over making the layout "responsive" on mobile browsers.

(I've tried a few HN "apps" on both iOS and Android. Either they reinvent the
UX too much for my taste, or they're no-login/read-only ... or even if they
get those points correct, they have intermittent "no comments" errors when
there are in fact comments. Sigh. Really I just want the web site. Albeit updated
to work better on smaller screens.)

~~~
9248
What's wrong with the desktop version?

It works great on my 320x480 screen in landscape. The only thing that would
make sense would be to limit the viewport width, other than that I don't see
anything wrong.

~~~
throw_away12847
I personally find reading HN on a mobile device really frustrating. Horizontal
scrolling is awkward and navigating between comments is frustrating. HN isn't
winning any awards for UX.

I really do believe that HN is poorly designed for end users. The UI is not
forward with intent, or pleasurable to use. It's been literally years and the
expired link issue hasn't been addressed. ---No, I'll stop you right there.
It's been acknowledged as an implementation bug, but not _addressed_. PG's
lack of user empathy and stubbornness make HN worse.

The only reason people come here is the content.

------
beshrkayali
I agree, but it's probably easier to implement a simple URL forwarder
somewhere else to not mess with HN for now, especially since things are changing
a bit: [http://blog.ycombinator.com/meet-the-people-taking-over-
hack...](http://blog.ycombinator.com/meet-the-people-taking-over-hacker-news)

~~~
k-mcgrady
Wouldn't this be the perfect time to suggest changes? It's important to let
the new people in charge know what the community would like I think.

~~~
beshrkayali
Yeah totally agreed on this. +1 for this and for other suggestions.

------
franze
here are (my) URL rules which have served me very well over the years.

• URL-Rule 1: unique (1 URL == 1 resource, 1 resource == 1 URL)

• URL-Rule 2: permanent (they do not change, no dependencies to anything)

• URL-Rule 3: manageable (1 logic per site section, no complicated exceptions, no exceptions)

• URL-Rule 4: easily scalable logic

• URL-Rule 5: short

• URL-Rule 6: with a targeted keyword phrase

one is more important than two to six combined, two is more important than
three to six combined, and so on.

adding a "human readable text thingie" slug is number six. least important.
meanwhile it makes 1 to 3 much harder, and runs against nr. 5.

so basically: don't do it, never. it complicates everything. if you can
fulfil "human readable" and "short" and every other rule, do it. i.e.:
www.example.com/a/contact is great. even www.example-shop.com/a/blue-coffee-
maschine is great, but www.example.com/b/343435354-you-can-write-here-
everything-you-want-and-i-will-trigger-a-redirect-if-its-wrong is a very very
bad idea.

------
asko_io
If you feel you need to append a title to the URL, why not just do
[https://news.ycombinator.com/item?id=7536719&title=Ask](https://news.ycombinator.com/item?id=7536719&title=Ask)
HN Mods: Can you make submission URLs human-readable

It shouldn't break the URL and the user will see a title in it.

------
jcfrei
I'm just guessing here but one of the reasons might be to prevent ranking high
in search engine results. Search engines bring in lots of misguided traffic
and lots of users which don't fit into the HN community. Again I'm just
guessing...

~~~
duncans
#shit_HN_says

~~~
tptacek
No, as fun as I'm sure that glib comment was to write, he's not making that
up. HN does keep a low Google footprint. I don't know how much it has to do
with keeping the community small, but I appreciate that comments I write here
don't immediately get nailed to the front of a Google SERP.

Here's Paul Graham, briefly, on the subject:

[https://news.ycombinator.com/item?id=5808982#up_5808990](https://news.ycombinator.com/item?id=5808982#up_5808990)

~~~
jcfrei
thanks for providing the link. couldn't find it myself, but this was the
statement I had in the back of my mind.

------
AliAdams
I'm sure there are many more useful changes that could be made which would
bring greater benefit to the community.

Adding the 'initial scale' meta tag so that people can more easily use the
site on their mobiles would be a great start.

------
lelf
Can we start please with many far more annoyUnknown or expired text.

------
leephillips
Instead of changing the URL you could use the "title" attribute on the "a"
tag. On most browsers hovering over the link will display the attribute.

~~~
Pacabel
How does that help in situations where HTML isn't being used, though? I'm
thinking of plain text emails, discussion forums that don't allow custom
markup, discussion forums that generate markup automatically for URLs, and so
forth.

~~~
Bill_Dimm
I find long URLs to be annoying in email since you never know if they will get
wrapped and broken somewhere along the way. And, if you're emailing a URL to
someone you are presumably providing some text to explain what it is, so why
do you need extra junk in the URL?

You can already add whatever you want to the URL if you really think it needs
to be verbose:

[https://news.ycombinator.com/item?id=7536719&title=whatever-...](https://news.ycombinator.com/item?id=7536719&title=whatever-
I-want-here)

