

URL Structure for Usability and SEO - ujeezy
http://mattinouye.com/url-structure-for-usability-and-seo

======
tomwalsham
The points are solid, and early planning where possible is great. An MVP often
involves uncertainty, though, so there are a couple of methods I've used to
keep things flexible later on (although it still helps to codify this properly
once the flux settles).

For applications with low expected item limits within taxonomies it can help
to use a numeric id, combined with an open-ended rewrite of the user-readable
content.

<http://www.example.org/items/12-really-awesome-thing>

RewriteRule ^/?items/(\d+)[a-zA-Z0-9_\ -]+$ index.code?route=items&id=$1

Although you probably index your titles anyway, simple numeric key lookup
makes the DB layer happy. Combine this with rel=canonical to your preferred
url, and you can be more flexible with case, specific terms etc. It has the
added bonus of reducing 404s through typo'd links and such.
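
A rough sketch of the same idea outside Apache, in Python: route on the numeric id only, ignore whatever slug follows it, and compute the preferred URL for rel=canonical. The `ITEMS` table and the `resolve_item` helper are made up for illustration, and making the slug optional is my assumption rather than part of the rule above.

```python
import re

# Hypothetical route pattern: a numeric id, then a free-form slug.
# The slug is optional here (an assumption), so /items/12 also resolves.
ITEM_PATH = re.compile(r"^/items/(\d+)[A-Za-z0-9_ -]*$")

# Stand-in for a database table keyed by the surrogate id (made-up data).
ITEMS = {12: "really-awesome-thing"}


def resolve_item(path):
    """Return (item_id, canonical_path), or None if nothing matches."""
    match = ITEM_PATH.match(path)
    if match is None:
        return None  # not an item URL at all
    item_id = int(match.group(1))
    slug = ITEMS.get(item_id)
    if slug is None:
        return None  # unknown id: the caller would serve a 404
    # Only the id was used for the lookup; the slug in the request is ignored.
    return item_id, f"/items/{item_id}-{slug}"
```

A typo'd link such as `/items/12-Realy-Awsome-Thing` still resolves to item 12, and the returned canonical path is what would go into the rel=canonical tag (or a 301 target).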

A couple of other anecdotal points:

* Underscore is considered a letter, not a space, for most purposes

* Pages should either have a (faux) file extension, or no trailing slash. Reserve trailing slash for directories

* Rel Canonical also allows you to carry non-critical data in the query string using [QSA] in the RewriteRule, without polluting the search index with duplicate content. Consider this, for example, for the default view of output (CD cover vs. title list)
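
To illustrate the last point, here is a minimal Python sketch that derives the rel=canonical target by dropping presentation-only query parameters. The parameter name `view` and both helper names are assumptions for the example, not anything from the article.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed: parameters that only change presentation, not the content itself.
NON_CRITICAL_PARAMS = {"view"}


def canonical_url(url):
    """Strip presentation-only query parameters from a URL."""
    parts = urlsplit(url)
    kept = [(key, value) for key, value in parse_qsl(parts.query)
            if key not in NON_CRITICAL_PARAMS]
    # Rebuild the URL without the dropped parameters and without a fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the <link> element that would go in the page head."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'
```

Both `?view=covers` and `?view=titles` then point search engines at the same canonical page, while the application still receives the parameter.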

~~~
minouye
Interesting point on embedding numeric IDs. Looking at the Backcountry.com
example, have you ever attempted to place all content one level beneath the
domain (i.e. all content has a directory depth of 1)?

\- Category: <http://www.backcountry.com/mens-clothing>

\- Item: <http://www.backcountry.com/smith-phenom-goggle>

\- Brand: <http://www.backcountry.com/smith>

I was rather baffled by this approach, but it seems like it was a rather
recent change and I'd assume it's been well thought out. Intermingling brands,
categories, and products makes me think that they must use a lookup table of
sorts (I don't know how you'd use rewrite rules), but I've never seen an
implementation like it before. Any idea on how to approach implementing varied
content all on the same directory depth?

~~~
tomwalsham
For me, a low-value numeric id has never been an impediment. Aesthetically you
might lose a small amount, but many of us use surrogate indices in our
databases and it fits well with that design pattern.

The backcountry example smacks of overzealous in-house SEO to me - the
observation that Google (in particular) tends to favour shallower levels of
structure could lead to choosing that output, but it ignores that the first
three levels appear to hold comparable weight.

I'm not convinced it would offer much benefit. Matt Cutts has indicated in the
past that 'natural' levels make sense, maybe 3-4. First level entries do have
use (see the 'enhanced' results for some sites based on sitemap priorities),
but having 100s of them will likely remove this useful trigger.

From a technical standpoint it's definitely creating obstacles. You create an
additional layer of restrictions in the database index. They must have a table
of unique level-1 indices. So if there is a brand that corresponds to a
generic term (not great for trademarks, but it happens), you will get a
uniqueness collision.
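
As a guess at what such a lookup table involves (I don't know how they actually do it), here is a small Python sketch of a flat namespace where brands, categories and items share one set of unique slugs. The class names and record ids are invented for illustration.

```python
class SlugCollision(Exception):
    """Raised when two pieces of content claim the same top-level slug."""


class FlatNamespace:
    """One shared table of level-1 slugs for every content type."""

    def __init__(self):
        self._entries = {}  # slug -> (kind, record_id)

    def register(self, slug, kind, record_id):
        existing = self._entries.get(slug)
        if existing is not None and existing != (kind, record_id):
            # This is the uniqueness collision: e.g. a brand whose name
            # is also a generic category term.
            raise SlugCollision(
                f"{slug!r} is already used by {existing[0]} {existing[1]}")
        self._entries[slug] = (kind, record_id)

    def resolve(self, path):
        """Map a request path like '/smith' to (kind, record_id) or None."""
        return self._entries.get(path.strip("/"))
```

Every request costs one lookup to find out which kind of page to render, and every new brand, category or item has to be checked against all the others before it can be published.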

It seems well executed, and no doubt their CMS input will inform them of
clashes, but for me this is genuinely not as useful as a 'breadcrumb'
structure from a UX perspective. They've essentially introduced complexity for
the sake of an assumed SEO advantage that my anecdotal evidence can't support,
and in doing so have lowered the utility of their urls, per the principles
(with which I agree) of the parent article.

That they're buying the term "smith phenom goggle" on AdWords and are stuck
on page 2 of the SERPs might back this up.

------
blauwbilgorgel
\- Short over long. Consider removing useless words from the url like
<http://www.example.com/tips-for-designing-good-urls>

\- Concise. Get to the point: describe the page content in the URL

\- Use lowercase. Generally the best idea, both for sharing links and for
technical reasons (Apache is case-sensitive sometimes)

\- Consistency. Stay consistent; make a style guide for URLs if necessary

\- Trailing slashes. Pick either trailing slashes or no trailing slashes and
stick with it

\- Be logical. Follow a logical structure that mirrors the structure of the
site. A good URL might read like a breadcrumb: site.com/category/product-name;
this works for siloing your content. Other sites (such as news sites or sites
without categories) might benefit more from the shortest URL possible.

\- Dashes for spaces. No underscores or %20 spaces.

\- No special chars. Consider replacing é with e and removing any non-alphabet
non-number character like: ' " (

\- Canonical. Google's index should contain only one URL for any given page
content. Use rel=canonical, 301s, or smart URL design to make sure this is the
case.

\- Degradable. What happens if a user visits site.com/category/product-name/
and then removes the /product-name/ part? The URL-structure should allow for
this and site.com/category/ should return content (preferably the category
description)

\- Timeless. If you have an event and you set the date inside the URL, then
after this date has passed, this URL gets less valuable. Either 301 these aged
URL's to the current event URL, or make it so your URL's can be "re-used" for
future events.

\- Optimized for search. Use a keyword tool to find out what users might be
searching for and use the relevant keywords inside your URL. Keyword in URL is
a ranking factor.

\- No excessive use of dynamic variables. These will confuse your users and
search engines.
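
Several of the rules above (lowercase, dashes for spaces, no underscores, é to e, no special characters) can be folded into one slug function. A minimal Python sketch, where the name `slugify` and the exact character policy are my assumptions:

```python
import re
import unicodedata


def slugify(title):
    """Turn a page title into a lowercase, dash-separated, ASCII-only slug."""
    # Decompose accented letters (é becomes e plus a combining accent),
    # then drop everything that is not plain ASCII.
    ascii_text = (unicodedata.normalize("NFKD", title)
                  .encode("ascii", "ignore")
                  .decode("ascii"))
    lowered = ascii_text.lower()
    # Remove anything that is not a letter, digit, or separator.
    cleaned = re.sub(r"[^a-z0-9\s_-]", "", lowered)
    # Collapse runs of spaces, underscores and dashes into a single dash,
    # and trim dashes from both ends.
    return re.sub(r"[\s_-]+", "-", cleaned).strip("-")
```

For example, `slugify("Café Crème: Tips & Tricks")` gives `cafe-creme-tips-tricks`. Non-Latin titles would come out empty with this policy, so a real site would need a fallback.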

------
tomwalsham
I am an absolute advocate for rich urls - they improve user experience,
especially now that browsers such as Chrome are limiting the visibility of page
titles. The SEO benefits are undisputed, and in general I'd always recommend
them.

There is still a case for Devil's Advocate here though. Consider one of the
most popular internet properties, YouTube.

YouTube could switch on rich urls tomorrow, but they don't. In an industry
where pageviews are king and you have trusted branding (no shock redirects), I
believe YouTube's ugly hash urls might well benefit them.

More than a few times I've visited links which I otherwise wouldn't have if
the title was embedded in the url. Occasionally I've been pleasantly surprised
by the content and stayed to watch. Considering the hokey titles of some of
the better content, I think their url structure might genuinely be a benefit.
Also, without their structure there would be no oHg5SJYRHA0

A few other large properties use this (flickr...), and while I think for early
growth you are basically throwing away potential traffic, for sites of a
certain level, there could be gains by not having rich url structure - not
least they can ignore a bunch of complex DB dupe indexing issues by throwing
away titles.

------
dools
Thinking about your URL design in the first stages of a project is only
required when using a framework that inextricably links URL structure to
functionality, like Django, RoR, Symfony, etc.

I'd prefer to use big hideous query strings until the project works, then use
mod_rewrite to do whatever the hell I want with my URLs.

------
Zakuzaa
Is that Posterous toolbar at the bottom opt-in, or did they just make it the
default for everyone? It floats in the middle of the page on iPad.

