
Ask HN: SEO impact of HN's URL “itemid” vs. an actual title? - a_small_island
What is the reasoning behind only an itemid in the URL for HN? Is there an SEO impact?

For instance, a URL on reddit may be:

https://www.reddit.com/r/askscience/comments/4ea7ee/what_would_the_horizon_look_like_if_you_were/

while a link on HN looks like:

https://news.ycombinator.com/item?id=11465163
======
jedberg
As the person who wrote the code for the SEO part of the reddit URL, I can
tell you that there is definitely an impact. It made a huge difference at the
time, because reddit wasn't really on Google's radar. Today I suspect it would
have less impact for reddit.

For HN, I get the impression that they don't really want to be all that
optimized for Google, so it probably hits their goals just fine, but it
probably does hurt them a little bit. But since the words are in an H1 right
at the top, probably not all that much.

Edit: The code in case anyone is interested:

[https://github.com/reddit/reddit/blob/cfd979fa0119191257eadc...](https://github.com/reddit/reddit/blob/cfd979fa0119191257eadc4ccfcada60968984a1/r2/r2/lib/utils/utils.py#L936)

~~~
avar
To me your comment demonstrates why it's really hard to figure out anything
worthwhile about SEO.

You wrote that code almost 8 years ago[1]. Google and other search engines
have changed a lot since then. Who knows if this has any impact today?

Actually did you even A/B test it at the time? Or just turn it on along with a
bunch of other changes?

It's hard to tell, and there's so much SEO (mis)information out there based on
old anecdotes, and all the players doing A/B testing for this sort of thing on
a big scale keep their data to themselves.

1\.
[https://github.com/reddit/reddit/commit/353ad2a](https://github.com/reddit/reddit/commit/353ad2a)

~~~
rbinv
I'm pretty confident that no one A/B tests URLs for SEO purposes. How would
you A/B test this anyway? You can't exactly serve Google different URLs and
see what works (in fact, this would ruin both approaches).

~~~
jasongill
You're definitely wrong about that - there are whole businesses built on
building "case studies" that try to decipher the inner workings of Google's
algorithm.

These days most tests are run by buying two domains that are a combination of
random letters/numbers, setting up two nearly identical sites, and doing
identical linkbuilding to both - then seeing which one ranks higher.

It's not exactly scholarly research quality, but repeated enough times it
gives you an idea of what's working "right now".

~~~
tobltobs
This method wouldn't work. To make both results comparable you would have to
put the same content on both sites, which would result in one of the two pages
getting hit by the duplicate content problem.

~~~
jasongill
It actually works great; the "duplicate content problem" only impacts pages on
the same domain, not identical content across multiple sites. It's possible to
make a lot of money by taking the content of mild authority sites and putting
it on a high authority domain, where it can outrank the source sites in short
order.

------
dsp1234
_What is the reasoning behind only an itemid in the URL for HN?_

It's easy to code

 _Is there an SEO impact?_

Probably, but as SEO is search engine _optimization_ , if a site doesn't care
about search engines, then it also probably doesn't care about optimization of
those searches.

~~~
awinder
To be fair putting a title and an item id is nearly as easy to code, just the
"overhead" of a rewrite rule.

~~~
bdcravens
It's not exactly a huge amount of work, but validating and sluggifying text is
also part of it.
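
As a rough illustration of the "sluggifying" step, here is a minimal Python sketch. It is not reddit's or HN's code; the function name `slugify`, the underscore separator, and the 50-character default are assumptions made for the example.

```python
import re
import unicodedata


def slugify(title: str, max_length: int = 50) -> str:
    """Turn a free-form title into a lowercase, underscore-separated slug."""
    # Fold accented characters to their closest ASCII form and drop the rest.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", ascii_title.lower()).strip("_")
    if len(slug) > max_length:
        # Truncate, then drop the last (possibly partial) word. If the
        # truncated text has no underscore, it is returned unchanged.
        slug = slug[:max_length].rsplit("_", 1)[0]
    return slug
```

The validation part is mostly the regular expression: anything that is not a letter or digit never reaches the URL, so the slug needs no further escaping. An empty result (for example a title made only of punctuation) would still need a fallback, which this sketch leaves out.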

------
ChuckMcM
Technically "no" the SEO goals for HN would appear to be unaffected by the
choice :-).

And in this case it's actually a good thing. If you scroll through the /new
pages as I do, you will see that a lot of people try to use HN, like Reddit, as
an SEO tool to get more views to their web site. That can be facilitated by a
link baity headline cum URI which gets indexed with the keywords of interest
of the day.

By simply putting 'itemid' in the link text HN gives very little "link love"
to keywords and so is not as easily exploited by "digital presence" folks (aka
people who try to SEO their client's sites or products).

------
cromulent
When I Google for "SEO actual title" this page is the first result.

[https://www.google.com/search?rls=en&q=seo+actual+title&ie=U...](https://www.google.com/search?rls=en&q=seo+actual+title&ie=UTF-8&oe=UTF-8)

~~~
BinaryIdiot
That's a fairly specific search term though. I'm not sure I'd use that as the
only evidence that it doesn't matter.

~~~
scrollaway
It's really not that specific. It's not the _only_ evidence that it doesn't
matter, but if Google can pick up on a 3 word query (one of them being "seo",
an extremely competitive keyword) in <30 mins then it's safe to say HN is
doing fine.

~~~
BinaryIdiot
Google polls sites that provide primarily discussions regularly. Doing an
exact query match makes the most sense to do as step 1 of a query. So even if
HN was trying to actively prevent this I think Google would still have an easy
time.

~~~
dsp1234
_exact query match_

You keep saying exact query match, but commenter above used 3 non-consecutive
words out of the 12 words in the title, and did not use any sort of quoted
search text.

Indeed, it even works with just two words out of the title:

[https://www.google.com/search?q=seo+actual](https://www.google.com/search?q=seo+actual)

The point being that HN's use of just the id in the URL has a minimal, if any,
effect on search ranking.

~~~
BinaryIdiot
> You keep saying exact query match

Only said it once =/

> The point being that HN's use of just the id in the URL has a minimal, if
> any, effect on search ranking.

A single data point absolutely _does not_ indicate whether the id in the URL
affects SEO or not. Your Google search is context specific (to you) so I would
certainly expect it to show up near or at the top. But what about those who
have never touched that page or have never even gone to HN? The more specific
the title the better overall, but none of this gives us data about the id in
the URL being good or bad for SEO.

Likely I think it's more of a UX than an SEO thing. But still don't go off a
context specific, single data point to decide whether something is true in
general or not.

~~~
dsp1234
_Your Google search is context specific (to you) so I would certainly expect
it to show up near or at the top._

New VM at aws, no previous Google searches. Still #2.

------
romanovcode
There is an SEO impact; however, HN is not that oriented toward the general
public and it has no ads, so it doesn't really matter.

~~~
tobltobs
> There is an SEO impact,

Any source for that?

------
chejazi
What you are referring to is a "slug" [1] which adds another searchable
dimension to the content. This appeals to marketers trying to add searchable
keywords to boost discovery. Not having one won't affect HN since
"everything's present" in the forum. For instance, if you google the title of
your post it is ranked #1 in the search results [2]

[1]
[https://en.wikipedia.org/wiki/Semantic_URL#Slug](https://en.wikipedia.org/wiki/Semantic_URL#Slug)

[2]
[https://www.google.com/search?q=Ask+HN%3A+SEO+impact+of+HN%2...](https://www.google.com/search?q=Ask+HN%3A+SEO+impact+of+HN%27s+URL+%E2%80%9Citemid%E2%80%9D+vs.+an+actual+title%3F&oq=Ask+HN%3A+SEO+impact+of+HN%27s+URL+%E2%80%9Citemid%E2%80%9D+vs.+an+actual+title%3F&)

------
krapp
I'm pretty sure HN doesn't want urls to provide an SEO boost either for
themselves or the people submitting - and if so, I would tend to agree with
them.

It's well known that pg doesn't want this site to have mainstream appeal, so
having HN articles list high in search engines would probably be a problem
they would want to avoid, but also, submitters shouldn't have an incentive to
use this site to boost their own SEO by submitting low-quality linkspam.

If it were me, I would go even further and route every link through a
dereferring proxy just to mess with their analytics as well, and block
everyone except maybe IA through robots.txt. For a site which is meant to be
about discussion and thought-provoking stories and not content aggregation for
the sake of ad revenue, I think SEO is a cancer.

------
pgfrd
A site doesn't need to be selling something (ads) to implement good SEO

What is the reasoning behind only an itemid in the URL for HN?

Like someone previously mentioned, easier to code, less thought required
around site architecture and optimization

Is there an SEO impact? Yes, from a basic standpoint, descriptive URLs are
easier to crawl, index, and rank accordingly. They help readers find info
better when searching for questions + answers

From a more high-level standpoint, descriptive URLs and an optimized site
structure help in many ways including SEO, analytics, accessibility, and
more. Reddit does it well

~~~
gnaritas
> What is the reasoning behind only an itemid in the URL for HN?

The reason is that it's the minimum information required to do the job. HN
was built to function by PG, who built it old school as a demonstration of his
custom programming language. It's clearly not been optimized for SEO, and
putting a slug in the URL adds nothing functionally, so the engineer had
no reason to add that feature. It also uses tables for layout, has embedded
style information, and saves all state to files on disk using no database.
It's hardly a "best practices" type application.

------
bhartzer
Frankly, having keywords in the URL is a very minimal factor. If everything
else were equal between a URL with the keywords and one without, there
wouldn't be much of a benefit to the one with the keywords in the URL.

However, if your overall site structure is one that has topics and subtopics
or categories and subcategories, it would help the user see that site
structure. For example, in the reddit example above, users can get directly to
the askscience subreddit by removing part of the URL and going directly to
[https://www.reddit.com/r/askscience/](https://www.reddit.com/r/askscience/).
Setting up URLs like this is a good practice: the URL would then follow the
site's breadcrumb trail.

For H/N, I don't really see any benefit, at this point, for using keywords in
the URLs. There's just no SEO benefit.

------
primaryobjects
I can think of a couple of impacts:

\- Shorter urls are easier to copy and share.

\+ Keywords in the url may increase search engine rank.

\+ Full title in the url helps readers know what they're clicking.

There are advantages to having the full title in the url for both SEM and
readers. However, as others mention, HN wasn't designed for SEM and contains
no ads to profit from it.

~~~
chipperyman573
>Full title in the url helps readers know what they're clicking.

Almost every website I've seen lets you re-write the url. For example,

[https://www.reddit.com/r/worldnews/comments/4eaqjb/something](https://www.reddit.com/r/worldnews/comments/4eaqjb/something)

and

[https://www.reddit.com/r/worldnews/comments/4eaqjb/anything](https://www.reddit.com/r/worldnews/comments/4eaqjb/anything)

both go to the same link.
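
A minimal Python sketch of why that works, assuming (this is not taken from reddit's code) that the router keys only on the id segment and treats whatever follows it as decoration. The pattern and the name `item_id_from_path` are hypothetical:

```python
import re
from typing import Optional

# Hypothetical route pattern: /r/<subreddit>/comments/<id>/<optional slug>/
COMMENTS_PATH = re.compile(
    r"^/r/(?P<subreddit>\w+)/comments/(?P<item_id>[0-9a-z]+)(?:/[^/]*)?/?$"
)


def item_id_from_path(path: str) -> Optional[str]:
    """Return the item id from a comments path, ignoring any trailing slug."""
    match = COMMENTS_PATH.match(path)
    return match.group("item_id") if match else None
```

With a pattern like this, `/r/worldnews/comments/4eaqjb/something` and `/r/worldnews/comments/4eaqjb/anything` both resolve to the id `4eaqjb`, so the slug tells the reader something only if whoever made the link was honest about it.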

~~~
accounthere
HN could simply add a '&title='. No need to change anything in their routes,
just change the link in the main page.

[https://news.ycombinator.com/item?id=11472694&title=ask-hn-s...](https://news.ycombinator.com/item?id=11472694&title=ask-hn-seo-impact-of-hns-url)
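
Roughly, in Python, and assuming the server keeps reading only `id` and ignores any parameter it does not know (an assumption about HN, not something confirmed here). The helper names `item_url` and `item_id_from_url` are made up for the sketch:

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def item_url(item_id: int, title_slug: str) -> str:
    """Build an item link whose `title` parameter is purely decorative."""
    query = urlencode({"id": item_id, "title": title_slug})
    return f"https://news.ycombinator.com/item?{query}"


def item_id_from_url(url: str) -> int:
    """Read back only the `id` parameter; any other parameter is ignored."""
    params = parse_qs(urlsplit(url).query)
    return int(params["id"][0])
```

Only the code that renders links on the listing pages would change; old links without `title` keep resolving to the same item.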

------
brudgers
My understanding is that originally [and perhaps currently] Hacker News uses
the file system for storage and that the id number is the name of a file on
disk. A few years ago, I recall a discussion about a reorganization of the
files from a single [or few?] directories into a more broadly branched tree.
The basis for doing so, IIRC, was to improve performance.

My impression is that this would be a simple way to produce a RESTful
interface. When the resource is a file on disk, how complex does application
layer routing have to be?
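
To make that concrete, a small Python sketch of id-to-file routing with a branched directory layout. The layout itself (a root called `items`, buckets of 1000 ids) is invented for illustration and is not HN's actual scheme:

```python
from pathlib import PurePosixPath


def item_path(item_id: int, root: str = "items", fanout: int = 1000) -> PurePosixPath:
    """Map an item id to a file path, bucketing ids into subdirectories."""
    # Ids 0..999 land in bucket 0, 1000..1999 in bucket 1, and so on,
    # so no single bucket directory holds more than `fanout` files.
    bucket = item_id // fanout
    return PurePosixPath(root) / str(bucket) / str(item_id)
```

Under those assumptions `item?id=11465163` would map to `items/11465/11465163`, and the "routing" is one integer division.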

Anyway, my guess is that general SEO is not particularly high on the list of
features to implement. On the other hand, adding Algolia and the API is
another story. That was also a substantial improvement to the feature
set. It might even turn up the aforementioned discussion about the file system
hierarchy [I think PG wrote the post].

Good luck.

------
dragonbonheur
HN wasn't intended to be used as a tool to improve your SEO. It may increase
your visibility to actual people, or you may sometimes, through HN, get noticed
by other websites, but don't expect it to directly lead to an improvement in
the SERPs.

~~~
wyldfire
I think the question may be regarding indexing HN itself, not the sites-
linked-to. If I wanted to find discussion about "what would a horizon look
like ... " by searching those terms, would reddit be favored because of those
terms in the URL?

~~~
siquick
It depends on a multitude of factors but a huge weighting will be given to the
page with the largest number of _high quality_ links pointing to the site.
Notice I said 'high quality'...

The URL structure isn't always that important, but it helps.

~~~
wyldfire
> The URL structure isn't always that important, but it helps.

If that's the case then all else aside the answer to the question being asked
is, "Yes, there is an impact" and perhaps "but it's not significant enough to
justify changing HN"

------
PhasmaFelis
Why does the URL text matter for SEO? Wouldn't a crawler be looking at the
page content?

------
BorisMelnik
As a user, I would find it helpful to have a more human readable permalink /
slug for the "at a quick glance" purposes. As someone w/ some knowledge of
SEO, it would most likely add value as well.

~~~
tobltobs
Isn't the link text human readable enough?

------
ajonit
HN doesn't really care. Even if slugs didn't provide any SEO benefit, I would
implement them purely for usability reasons. xyz.com/learn-seo looks much
better to a user compared to xyz.com/46765475

------
giarcyevod
Doing 'SEO' is like punching smoke. Build for those visiting and using your
website.

------
accounthere
I don't think search engines are the main source of readers for HN.

------
return0
I never understood why a slug should be considered a signal.

~~~
chrismarlow9
These might be a few:

\- Though it's not much more difficult, it does show you spent a little more
effort on the website to not just leave the pk there. I would think this would
be more relevant for sites not using a framework like WordPress (which we
absolutely know from search results filters that Google can categorize sites
in this way...)

\- Legacy... older websites that are pure static pages will likely have good
keywords in the file name, since it would be maintained primarily by humans
looking at directories of files.

\- Social significance. The link would be more likely to have a better social
impact because the content of the page could be determined whether it's an
anchor tag, posted in IRC, or sent in Gmail. You might argue that just because
the page name has the keywords doesn't mean the content is about that, but I
would argue Google can quickly detect and demote those kinds of things (aka the
page title is only really relevant to your SEO if it's also relevant to the
content... otherwise it's essentially just a random primary key).

Just my thoughts...

~~~
return0
I find all 3 of these contrary to the definition of a URL, and the whole idea
of using slugs just screams "game me".

