
SEO in JavaScript Web Apps - maccman
http://blog.alexmaccaw.com/seo-in-js-web-apps
======
blauwbilgorgel
Another option is to make a website that works without JavaScript. Only
dynamically fetch pages when JavaScript is enabled. Progressive enhancement
rocks.

Visiting [http://monocle.io/posts/how-yield-will-transform-
node](http://monocle.io/posts/how-yield-will-transform-node) without
JavaScript support yields a blank page.

Every page title is <title>Monocle</title>, which isn't very descriptive, and
the meta description is the same for every page. Only via escaped_fragment
does anyone see descriptive page titles; for JavaScript users, every page is
simply titled "Monocle".
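A minimal sketch of the fix, assuming a client-side router with a per-route hook (the `pageTitle` helper and the post object are illustrative, not Monocle's actual code):

```javascript
// Derive a descriptive title for each route; fall back to the
// site name when there is no specific page. Purely illustrative.
function pageTitle(post) {
  return post && post.title ? post.title + " - Monocle" : "Monocle";
}

// In a browser, a router's route callback could then apply it:
if (typeof document !== "undefined") {
  document.title = pageTitle({ title: "How yield will transform Node" });
}
```

The same hook could also rewrite the meta description, so JavaScript users and crawlers see equally descriptive metadata.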

There are no unique content articles to rank no. 1 for; the articles are all
found on other sites. From a quick glance, I don't see Monocle ranking no. 2
much either. Those spots are reserved for other aggregator sites.

The Google guidelines say:

    
    
      Make pages primarily for users, not for search engines.
      "Does this help my users? Would I do this if search 
      engines didn't exist?"
    

I'd extend that to JavaScript apps. Why make escaped_fragment especially for
search engines, and then forget to offer this functionality for human users
too?

~~~
JohnTHaller
Honestly, with every worthwhile browser (both desktop and mobile) supporting
JavaScript and having it enabled by default, it doesn't make much sense to
support browsers with it disabled. The couple million people with NoScript
installed who enable it only on specific sites know to enable it if a site
doesn't work.

~~~
blauwbilgorgel
It does make sense to me, because:

\- Not all search engines can handle JavaScript; Google is leading the way,
but it's not perfect.

\- Those NoScript users could be potential clients or users of your site. I
love it when my conversion rate goes up 1%, and that becomes harder when you
ignore 1-5% of users.

\- Screenreaders generally do not support JavaScript [not true, see comment].
If only one blind user gets to access the content/design I created, then that
is worth it to me. I am thinking as front-end engineer here, not as a business
owner, where time is money. (Also depending on jurisdiction it may be against
the law to be inaccessible).

\- NoScript users will likely bounce in large numbers when they see just a
blank page. Simply adding a <noscript> tag explaining why you need JavaScript
goes a long way.
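A minimal sketch of that <noscript> suggestion, as a server-rendered HTML shell built in JavaScript (the fallback URL, markup, and wording are all illustrative assumptions):

```javascript
// Build an HTML shell whose <noscript> block explains why JavaScript
// helps and links to a plain-HTML fallback version of the site.
function renderShell(contentHtml) {
  return [
    "<!DOCTYPE html>",
    "<html><head><title>Example</title></head><body>",
    "<noscript>",
    '  <p>This site uses JavaScript for faster navigation. You can still',
    '  <a href="/static/index.html">browse the plain-HTML version</a>.</p>',
    "</noscript>",
    contentHtml,                       // server-rendered content for everyone
    '<script src="/app.js"></script>', // enhancement layer for JS users
    "</body></html>"
  ].join("\n");
}
```

Because the content itself is in the page, no-JS visitors get something useful even before they read the notice.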

As a front-end engineer I aim for maximum content accessibility. I don't
meddle in the politics of things ("If we don't drop IE6 support, the web
won't move forward!").

I totally understand the new landscape, where a lot of people have JavaScript
on. Some web apps cannot use progressive enhancement, because the JavaScript
is core to the app. But for "static" content websites like these, it is
certainly possible to make a website that is usable by most users, human and
robot.

~~~
Isofarro
"Screenreaders generally do not support JavaScript."

Please stop saying that. It hasn't been true for a good 5 years or so.

* [http://www.brucelawson.co.uk/2011/javascript-and-screenreade...](http://www.brucelawson.co.uk/2011/javascript-and-screenreaders/)

* [http://webaim.org/projects/screenreadersurvey4/#javascript](http://webaim.org/projects/screenreadersurvey4/#javascript)

* [http://www.w3.org/TR/WCAG20-TECHS/client-side-script.html](http://www.w3.org/TR/WCAG20-TECHS/client-side-script.html)

~~~
blauwbilgorgel
Thank you! I've been behind the times. I'll start viewing it as a general
accessibility issue, not specific to screen readers. But like Steve Klabnik
commented, and from
[http://www.w3.org/WAI/intro/aria.php](http://www.w3.org/WAI/intro/aria.php) ,
there are a few more steps to take to make JavaScript enabled screen readers
play nice:

"WAI-ARIA addresses these accessibility challenges by defining how information
about this functionality can be provided to assistive technology. With WAI-
ARIA, an advanced Web application can be made accessible and usable to people
with disabilities."

~~~
10dpd
But WAI-ARIA has nothing to do with JavaScript; rather, it is the application
of semantic attributes that make non-native controls interpretable by screen
readers.

------
callmeed
For an article that aims to _"put misconceptions to rest"_, it's pretty damn
short.

My companies do website/CMS services for small businesses, so SEO is WAY more
important to us than to an app or a content aggregator like Monocle. If our
clients suspect SEO sucks for our product, they will leave. Conversely, if our
product has a reputation for good SEO, it can drive a lot of business to us.

Also, Monocle.io isn't setting the title tag for any of their URLs, so that's
a pretty poor example to use wrt SEO.

I'd like to see a real, in-depth article that discusses the following:

\- At this point, should I use hash-fragments or pushState?

\- Which front-end JS framework (backbone, ember, angular, etc) has the best
support for SEO features out of the box?

\- Is Rails 4 + Turbolinks SEO-friendly?

\- I'd love to see some kind of experiment/example showing that a
JS/hash-fragment based site can actually rank well when competing against
basic HTML sites. I know that SEO comes down to content and links (more or
less), so experiments like that are hard/impossible. I just used to do a lot
of SEO for Flash sites back in the day. In the end, _you could only do so
much_, and I worry that doing SEO for JS sites is similar.

Just because Google provides the hash-fragment feature doesn't mean they don't
give such sites less weight when ranking.

------
jcampbell1
Monocle.io is not getting indexed correctly.

see:

[https://www.google.com/search?q=%22Node+has+had+great+succes...](https://www.google.com/search?q=%22Node+has+had+great+success+because+of+how+simple+npm+is+to+use%22)

Notice how the monocle.io link is totally useless. Over time, this will get
you killed by Google as they realize your domain is returning useless results.

~~~
maccman
Yeah, it's only failing because I list the summaries of the posts on the index
page, and Google isn't picking up the real link.

It'll take a bit more time for Google to index the rest of the site, but I've
removed the summaries from the index page, so they shouldn't get picked up
like this.

[1] -
[http://monocle.io/?_escaped_fragment_=](http://monocle.io/?_escaped_fragment_=)

------
Isofarro
So he starts with a JavaScript-only application, then retrofits a progressive
enhancement layer behind a Google-specified query string parameter.

He could just as easily have done the core experience first in HTML, got a
URL structure that is friendlier and RESTful, and then layered on the
JavaScript he needs to create the perception of a one-page website.

The bonus of doing it this way is clean URLs for each piece of indexable
content.

Because, really, what's the advantage of
[http://example.com/?_escaped_fragment_=about-
us](http://example.com/?_escaped_fragment_=about-us) over
[http://example.com/about-us](http://example.com/about-us) ?

------
franze
I will save you a lot of pain: just don't do it. The __escaped_fragment__ is
the most idiotic recommendation Google has ever given. Basically you have to
render two views: the user-facing JS view and a server-side rendered basic
HTML __escaped_fragment__ view. Oh yeah, and the __escaped_fragment__ view
will never be visited by your users, nor by yourself - just by Googlebot.

And now guess: which view will be less maintained, out of date, and regularly
broken?

Why? Because where there is no direct feedback, there is no incentive to fix
things.

If you want to do SEO and still go down the "but it's faster with JS" road,
just do "progressive enhancement" and history.pushState. The
__escaped_fragment__ spec is a leftover from the ajaxy Web 2.0 times, and
even then it was a bad idea.

------
nkuttler
I find it a little odd that this article focuses on the hash-fragment
approach and only mentions HTML5 pushState in passing, mostly in terms of how
to avoid it. There are a few scenarios where the hash fragment is more useful
(states that don't map well to URLs), but pushState has the huge benefit of
looking natural AND generally working in non-JS browsers.

I think it would be good to mention Sinatra in the title.

~~~
maccman
I focus on pushState throughout the article. Unfortunately, to spider a
website that uses pushState you have to use a Google spec that was originally
designed for the hash fragment.

~~~
nkuttler
No, that's wrong. You just use links to /the/normal/looking/URL and let the
browser use JS instead when it can. And obviously you serve the real content
at /the/normal/looking/URL instead of hiding it behind the obscure
hash-fragment method.
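A sketch of that approach: real links to real URLs, enhanced with pushState when JavaScript is available. The fetch URL, header, and `#content` selector are assumptions for illustration:

```javascript
// Decide whether a click on a link should be enhanced, or left to the
// browser (modified clicks and external links fall through untouched).
function shouldIntercept(event, link, origin) {
  if (event.button !== 0 || event.metaKey || event.ctrlKey || event.shiftKey) {
    return false;
  }
  return link.href.indexOf(origin) === 0; // same-origin links only
}

// Browser-only enhancement layer; no-JS visitors just follow the links.
if (typeof document !== "undefined") {
  document.addEventListener("click", function (event) {
    const link = event.target.closest && event.target.closest("a");
    if (!link || !shouldIntercept(event, link, location.origin)) return;
    event.preventDefault();
    history.pushState(null, "", link.href);
    // Fetch the same URL the server would render and swap in its content.
    fetch(link.href, { headers: { "X-Requested-With": "XMLHttpRequest" } })
      .then(function (res) { return res.text(); })
      .then(function (html) {
        document.querySelector("#content").innerHTML = html;
      });
  });
}
```

Because every URL also renders on the server, crawlers and no-JS browsers get the same content at the same clean addresses.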

------
digisth
People should keep in mind that you do not always have the benefit of
green-field development; sometimes you have a project already written that
does not have much of a server-side component, and has no budget or time left
for adding real pages or doing graceful degradation (much less progressive
enhancement). In these cases, using a spider is pretty much your only choice.
Some notes/recommendations:

\- I'd recommend PhantomJS (there are some other packages built on top of it,
but for my custom needs, using the original was better)

\- If you spider a whole site, especially if it's somewhat complicated, log
what you're spidering and see if and where it hangs. I started getting some
PhantomJS hangs after ~100 URLs. In that case, it can be a good idea to do
multiple spidering runs using different entry points (I used command-line
options to exclude URL patterns I knew were spidered during previous runs).

\- If you're spidering sites using script loaders (like require.js), pay
careful attention to console errors; if you notice things aren't loading, you
may have to tweak your load timeout to compensate. Using a "gnomon"
(indicator) CSS selector is very helpful here.

\- Add a link back to your static version for no-JS people in case
Google/Bing serves up the static version. This only seemed to be a problem
shortly after spidering, but it's worth doing regardless (later, the search
engines seemed to start serving the real version).

\- For those wondering how to keep the static version up-to-date, use a cron
job, then cp/rsync/whatever your latest version to your "static" directory.

One thing I'd like to add is that I wish PhantomJS supported more of the
things that node.js does (since some of its API is modeled after node's),
particularly synchronous versions of many functions. That aside, it's an
incredibly useful piece of software.
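The multiple-runs idea above can be sketched as a small helper that trims a URL list before each run. The substring-match pattern syntax is an assumption for illustration, not the commenter's actual tooling:

```javascript
// Drop URLs matching any exclude pattern, so a spidering run that hung
// partway can be resumed from a different entry point without
// re-crawling pages already snapshotted in earlier runs.
function filterUrls(urls, excludePatterns) {
  return urls.filter(function (u) {
    return !excludePatterns.some(function (p) {
      return u.indexOf(p) !== -1;
    });
  });
}
```

A wrapper script would read the exclude patterns from command-line options and feed the surviving URLs to the PhantomJS crawl.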

------
chaddeshon
I made a web service that will do JavaScript SEO for you. Check it out at
BromBone.com.

We render your entire page for you and save it as HTML on a CDN. Then you can
just do the simple routing described in this article, but send Google the
page from our CDN instead. That way Google sees the exact same thing as your
users, but you don't have to code it all again.

~~~
tonylampada
> BromBone uses a real web browser to download your web pages. We run all that
> fancy javascript, make all your AJAX calls, and save the result.

Makes me wonder why Google itself doesn't do it like that. Maybe Google
should buy BromBone...

~~~
WalterGR
> Makes me wonder why Google itself doesnt do it like that.

I do believe Google _does_ do it like that.

At least, when I had a Twitter search widget on my site, Google indexed the
content in that widget.

------
geetee
Does it matter that the JS-constructed HTML does not look anything like the
spider-friendly version?

We're about to deploy an AngularJS application that is using PhantomJS to
generate the spider-friendly content on the server. I'd much prefer to do this
simpler method if it works just as well.

~~~
_neil
We're building a small-ish site now with Backbone where we do something
similar with PhantomJS. In our `grunt build` task, PhantomJS saves static
versions of subpages. Each of those pages loads up the same Backbone app, but
the user (and search engine) sees the content they would expect no matter
where they land - without waiting for Backbone to load and run through its
router. And it didn't take much time to set up.

Progressive enhancement is "The Right Way", for sure, but there are some
projects where we just aren't concerned with targeting users without
JavaScript. That said, hopefully this will let us cater to many of those
users, reap the SEO benefits, and provide a better first-landing experience
for those not hitting the home page.

Edit: This is the grunt plugin we are using for this:
[https://github.com/cburgdorf/grunt-html-
snapshot](https://github.com/cburgdorf/grunt-html-snapshot)

------
steeve
For AngularJS apps, you are welcome to try (and improve!) AngularSEO
[https://github.com/steeve/angular-seo](https://github.com/steeve/angular-seo)
(based on PhantomJS)

~~~
benkitzelman
There are a few gems around that handle this in middleware:

[https://github.com/benkitzelman/google-ajax-
crawler](https://github.com/benkitzelman/google-ajax-crawler)

[https://github.com/inossidabile/hashbang](https://github.com/inossidabile/hashbang)

------
tudorconstantin
So, basically, the article says that in order to have good SEO with full-JS
apps, you must also expose a non-JS app?

Like, you have to implement two sets of templates - one "classic", HTML-only,
and one for JS?

------
iopq
[https://www.cityblis.com/](https://www.cityblis.com/) I implemented the
<noscript> solution, which actually shows the same content (without dynamic
positioning) to users and allows them to go to non-JavaScript versions of the
pages. With JS on, I serve dynamically positioned content with scrolling.

I also have an implementation for the search, but it's not pushed yet. It
doesn't render the first page, but it provides pagination for users without
JS.

What do you think of this kind of a solution?

~~~
Amadou
As a NoScript user, I'd like to see an explanation of why I should enable
JavaScript on your site. If you put a link to an explanatory page along with
the "please enable JavaScript for a better experience" header, I think that
would be useful.

Of course, getting the explanation right is going to take some effort: you
don't want to be condescending, and you do want to give a meaningful
explanation - perhaps with pictures or animated GIFs? - but you don't want to
turn it into a dissertation either.

~~~
iopq
Would just adding a line like "This site uses AJAX for a large part of its
functionality" work, or would you prefer a more in-depth description?

~~~
Amadou
I would want to know what I was getting in trade for the increased risk of
enabling javascript. Your proposed message isn't substantially different from
the current message.

Noscript users are going to be more savvy than average users, but they are
still _users_ so if you are going to make any effort to inform them, make sure
it's from the perspective of a user rather than a developer. Developers care
about AJAX, users only care about what they see on the screen.

~~~
iopq
But I'm a developer, so I don't know what the best wording for this is. The
point is, some of the site's functionality breaks without JavaScript, because
it's AJAXed and there is no <form> fallback with POSTs and so forth (that's
not in the development budget).

~~~
Isofarro
You charge extra to do it properly?

~~~
iopq
I only have so many hours in the day, and I've already been assigned to a
different project. They told me they don't have time to support less than 5%
of users.

~~~
Isofarro
You ask for permission to do your job properly?

------
chadscira
Personally, I only serve client-side rendered content to browsers on my
whitelist; everything else gets the server-rendered content.

[http://chadscira.com/](http://chadscira.com/) (set your user-agent to blank)

------
thomasfromcdnjs
I wrote about using the same technique with a headless browser a while ago.

[http://backbonetutorials.com/seo-for-single-page-
apps/](http://backbonetutorials.com/seo-for-single-page-apps/)

Works nicely...

