Hacker News | gboudreau's comments

You should contact Heroku and ask them to stop sending that third DST_Root_CA_X3 certificate in the chain. If you only have a few Heroku apps, you can work around it temporarily by obtaining a Let's Encrypt certificate another way and uploading it to Heroku (their web dashboard allows that). I myself changed the DNS entry temporarily, got an LE cert on another server, then changed the DNS back to point to Heroku. I then uploaded my cert chain (two certificates: mine and the R3) and the private key using the Heroku dashboard. Heroku now has 90 days to fix their side, after which I'll be able to switch back to using ACM.
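To see whether the extra certificate is still being served, you can inspect the chain (e.g. with `openssl s_client -connect yourapp.herokuapp.com:443 -showcerts`) and look at the issuer of each certificate. A minimal sketch of that check in Python, with illustrative issuer strings rather than an actual Heroku chain:

```python
def chain_has_expired_dst_root(issuers):
    """Return True if any certificate in the served chain was issued by the
    expired DST Root CA X3 -- i.e. the cross-signed ISRG Root X1 is still
    being sent, which is what breaks older/strict TLS clients."""
    return any("DST Root CA X3" in issuer for issuer in issuers)

# Issuer lines roughly as printed by `openssl s_client -showcerts`
# (illustrative values, not captured from a real Heroku app):
issuers = [
    "C=US, O=Let's Encrypt, CN=R3",                               # leaf's issuer
    "C=US, O=Internet Security Research Group, CN=ISRG Root X1",  # R3's issuer
    "O=Digital Signature Trust Co., CN=DST Root CA X3",           # the extra cert's issuer
]
print(chain_has_expired_dst_root(issuers))
```

Once Heroku stops sending the third certificate, the last issuer line disappears and the check returns False.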


> How can this happen?

Good question. I don't think Google has published a list of the sources they use to find new websites. It's not too far-fetched to say that any URL they can see through any of their services could be added to the Googlebot queue. Sending an email to a Gmail user with a link to admin.amahi.org might be a trigger... but nothing is certain.

> Does PageRank and their famous incoming link equations not matter?? How can this be so off?

It's not really off. In Google's eyes, www.amahi.org == admin.amahi.org; they became interchangeable. Here are more details on this...

Since the end of 2007, subdomains of a specific domain are all linked together somehow. Ref: Matt Cutts [http://www.mattcutts.com/blog/about-me/] in a blog post here [http://www.mattcutts.com/blog/subdomains-and-subdirectories/]

What this means is that Google knows that www.amahi.org and admin.amahi.org are related. In your case, they were more than just related; they were practically identical (in content). When Google finds duplicate content on your site, it might decide to use one version or the other in search results. In Google's eyes, both URLs are interchangeable, and so they will be used interchangeably. This is particularly apparent in your case: searching for "linux home server" on Google will sometimes show admin.amahi.org and sometimes www.amahi.org as the first result, and searching for "link:admin.amahi.org" will return pages that link to www.amahi.org.
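If you'd rather collapse the duplicate hostname entirely, the usual fix is a permanent (301) redirect from admin to www. A sketch for Apache .htaccess, assuming mod_rewrite is available (the thread doesn't say which server Amahi runs, so treat this as illustrative):

```apache
# Permanently redirect the duplicate subdomain to the canonical one.
# Hostnames are from the thread; adjust for your setup.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^admin\.amahi\.org$ [NC]
RewriteRule ^(.*)$ http://www.amahi.org/$1 [R=301,L]
```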

Ref: Duplicate content [http://www.google.com/support/webmasters/bin/answer.py?hl=en...]

> How long will this last?

Nobody can say. They never give any estimate on how long re-crawling a site will take.

> We have gone to WMT to dramatically accelerate the crawl rate of www.amahi.org

You should do the same for admin.amahi.org, after you've changed a couple of things (more about that below).

> More importantly, how can we prevent this from happening again?

Multiple ways: robots.txt, Canonicalization, Sitemaps...

robots.txt: it's pretty easy to make robots.txt dynamic. In .htaccess:

    <Files robots.txt>
        ForceType application/x-httpd-php
    </Files>

Then in robots.txt itself (note that a valid robots.txt needs a User-agent line before any Disallow):

    User-agent: *
    <?php echo ($_SERVER['SERVER_NAME'] == 'admin.amahi.org') ? "Disallow: /" : "Disallow:"; ?>

This serves "Disallow: /" (block everything) on admin.amahi.org and an empty "Disallow:" (allow everything) on www.amahi.org.
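The same host check can be sketched outside PHP as well (Python here; the hostnames are from the thread, but this is a sketch, not Amahi's actual setup):

```python
def robots_txt(server_name):
    """Return the robots.txt body to serve for a given Host header:
    block all crawling on the duplicate admin subdomain, allow
    everything on the canonical www site."""
    if server_name == "admin.amahi.org":
        return "User-agent: *\nDisallow: /\n"
    return "User-agent: *\nDisallow:\n"  # empty Disallow = allow all

print(robots_txt("admin.amahi.org"))
```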

Canonicalization: See details here [http://www.google.com/support/webmasters/bin/answer.py?answe...]. Basically, you should add a <link rel="canonical" href="http://www.amahi.org/..."> in the <head> section of your pages. This indicates to Googlebot your preferred URL. Again, it's pretty easy with any scripting language... just hardcode the "http://www.amahi.org" part of the @href, and build the rest dynamically.
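A minimal sketch of that canonical-tag generation (Python for illustration; the hostname is from the thread, the path is hypothetical):

```python
def canonical_link(request_path):
    """Build the canonical <link> tag for a page: the hostname part is
    hardcoded to the preferred www site, regardless of which subdomain
    actually served the request, and the path is filled in dynamically."""
    return '<link rel="canonical" href="http://www.amahi.org%s">' % request_path

print(canonical_link("/docs/install"))
```

Emitting this tag on both the www and admin copies of a page tells Google which URL to keep in its index.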

Sitemaps: You should submit sitemaps for both admin.amahi.org and www.amahi.org in Google Webmaster Tools. In fact, you should create only one sitemap, containing only the www.amahi.org URLs you want Google to see, and use that same sitemap for both the www and admin sites. Submitting this sitemap through Google Webmaster Tools could help Googlebot pick up your 301 redirect and your new robots.txt a little faster.
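A sketch of such a single shared sitemap, generated so that every <loc> entry uses the canonical www hostname (the paths are hypothetical, not Amahi's real pages):

```python
def build_sitemap(paths, host="http://www.amahi.org"):
    """Build a minimal sitemap XML where every URL points at the
    canonical www hostname, so the same file can be submitted for
    both the www and admin sites."""
    urls = "".join(
        "  <url><loc>%s%s</loc></url>\n" % (host, path) for path in paths
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + urls +
        "</urlset>\n"
    )

print(build_sitemap(["/", "/docs"]))
```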

Good luck with this!

