
This website has both architectural errors and original-content issues.

I see no brazen violation of the Webmaster Guidelines demanding a penalty. As far as I know, manual penalties (and their cause) get communicated through Google Webmaster Tools.

What I do see are architectural issues.


Is an empty page, that is linked to from the homepage. For listing websites: Don't put up pages for which you have no content.

The search result pages are indexable too. The post claims these pages have been blocked in robots.txt, but they aren't:


(Won't link, so as not to pollute your index.)

These pages wreak havoc on your crawl budget.
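A quick way to check a claim like "we blocked search pages in robots.txt" is to test the rules directly. This is a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt content and the example.com URLs are made up for illustration and are not taken from the site in question.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text disallows `url` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that really does block search result pages.
ROBOTS_WITH_RULE = """\
User-agent: *
Disallow: /search
"""

# Hypothetical URLs, for illustration only.
SEARCH_URL = "https://example.com/search?q=pizza"
LISTING_URL = "https://example.com/listing/42"

if __name__ == "__main__":
    for url in (SEARCH_URL, LISTING_URL):
        print(url, "blocked" if is_blocked(ROBOTS_WITH_RULE, url) else "crawlable")
```

Keep in mind that robots.txt controls crawling, not indexing, so a blocked URL can still show up in the index if other pages link to it.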

Then on to the issue of original content:


This isn't content that will ever perform well in search engines. I'd even go so far as to say these results pollute the search results. They are an address and hopefully a tiny copied description. What added value does this have for a visitor? What added value does this have for a search engine to present high up in the results?

Look at how Yelp or LinkedIn got people to enter content on their listing pages. Or even Yahoo, look at all the widgets added to listings, to increase the content size and relevancy.

You even use Facebook comments (which are not crawled all that well by Googlebot, since they are rendered with JavaScript and reside on Facebook's servers, not yours). Any user-generated content that could make your pages more unique, you freely donate to Facebook instead of adding it to your pages in plain text.

Conclusion: No (manual) penalty, but bad, non-engaging, duplicate-content listings combined with a poor site architecture that allows zillions of page combinations with meager to no content on them. The issue is compounded by not making sure your site is canonical:


Site architecture and SEO are massively important for the success or failure of these kinds of websites, and both seem quite poorly thought out at a glance.
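To make the canonical point concrete: the idea is that every variant of a URL should collapse to one preferred form, which you then redirect to or declare in a `rel="canonical"` link. Here is a rough sketch of such a normalizer. The specific rules (force https, drop "www.", strip the trailing slash, remove tracking parameters) and the parameter names are assumptions for illustration; the right rules depend on the site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of parameters that never change page content (hypothetical).
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref"}


def canonical_url(url: str) -> str:
    """Collapse common URL variants into one preferred form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # assumption: the bare domain is the preferred host
    path = parts.path.rstrip("/") or "/"
    # Drop tracking parameters and sort the rest so parameter order does not matter.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key not in TRACKING_PARAMS
    )
    return urlunsplit(("https", host, path, urlencode(query), ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link> tag to place in the page <head>."""
    return f'<link rel="canonical" href="{canonical_url(url)}">'


if __name__ == "__main__":
    print(canonical_link_tag("http://WWW.Example.com/listing/42/?utm_source=x&b=2&a=1"))
```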

If the site is young, the initial boost in rankings (the honeymoon period) will decay over time and settle in line with your page authority, content quality, etc. If you see a decrease in visitors, you could attribute it to the end of this initial boost rather than to a penalty. Current visitor numbers might reflect your content and site quality a lot better.

Look at how SEOmoz helped Yelp produce a solid strategy: http://www.quora.com/How-do-StackOverflow-and-Yelp-achieve-s...

All, please re-read the point about content in JavaScript NOT being crawled. Yes, this is true. Accept it, move on, and don't waste hours or days debating this with your SEO or product manager.

Google will parse JavaScript to find links to index, but don't count on getting credit for any content you put in there.
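If you want to see what a page looks like without any JavaScript running, you can extract the text from the raw HTML and check whether your important content is in it. This is a small sketch using Python's standard `html.parser`; the sample markup is invented, and it only approximates what a non-rendering crawler sees.

```python
from html.parser import HTMLParser


class StaticTextExtractor(HTMLParser):
    """Collect text that is present in the HTML itself, skipping script and style."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def static_text(html: str) -> str:
    """Return the plain text visible without executing any JavaScript."""
    extractor = StaticTextExtractor()
    extractor.feed(html)
    extractor.close()
    return " ".join(extractor.chunks)


# Invented sample page: one review in plain HTML, one injected by a script.
SAMPLE = '<p>Great pizza place</p><script>document.write("Loved the service")</script>'

if __name__ == "__main__":
    print(static_text(SAMPLE))
```

In the sample, only the review written into the HTML shows up; the one injected by the script does not.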

Agree 100%. Look at ways to get each page indexed with original quality content.

These are great points. Is it possible to fix the architectural issues using "noindex" and "nofollow" on the empty pages? Are there other good strategies to clean up issues as detailed above?
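For what the "noindex" idea could look like in practice, here is a rough sketch that picks a robots meta tag per page based on whether the page has anything on it. The function name, the inputs, and the 50-word threshold are all hypothetical; this is one possible approach, not a confirmed fix for the site discussed above.

```python
def robots_meta(listing_count: int, description: str = "", min_words: int = 50) -> str:
    """Return a robots meta tag: keep thin or empty pages out of the index.

    Assumption: a page is worth indexing if it lists at least one item or
    carries a description of at least `min_words` words.
    """
    has_content = listing_count > 0 or len(description.split()) >= min_words
    # "follow" is kept so crawlers can still reach linked pages through an empty one.
    directive = "index, follow" if has_content else "noindex, follow"
    return f'<meta name="robots" content="{directive}">'


if __name__ == "__main__":
    print(robots_meta(0))   # empty category page
    print(robots_meta(12))  # category page with listings
```

Note that a crawler has to be able to fetch the page to see this tag, so a page carrying noindex should not also be blocked in robots.txt. Not generating or linking to empty pages at all, as suggested earlier in the thread, avoids the problem at the source.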
