
I'm not too surprised. I've got Googlebot still requesting old URLs even though there are no incoming links to them (that I know of) and they've been either 404ing or 301-redirected for six months. I even tried using 410 Gone instead of 404, but it made no difference.

To reiterate this further: I am still 301'ing URLs that have been dead for nearly 5 years, and I still get requests for them. I don't want to 404 them for fear of losing that slight bit of traffic, so I just 301 them. I am really surprised they don't remove these URLs from their cache, and for the life of me I can't think why they don't.
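For what it's worth, the scheme described above (301 the known moves, keep everything else from silently dying) can be sketched in a few lines. This is just a sketch with hypothetical maps and names, not anyone's actual setup; in practice this would live in your server config or framework routing:

```python
# Legacy-URL handling sketch: 301 known moves to preserve residual traffic,
# 410 pages known to be gone for good, 404 everything else.
# REDIRECTS and GONE are hypothetical example data.

REDIRECTS = {"/old-article": "/articles/new-location"}
GONE = {"/retired-section/page1"}

def resolve_legacy(path):
    """Return (status, location) for a request to a legacy path."""
    if path in REDIRECTS:
        return 301, REDIRECTS[path]  # permanent redirect keeps the old link's traffic
    if path in GONE:
        return 410, None             # explicitly tell crawlers the page is gone
    return 404, None                 # unknown path

print(resolve_legacy("/old-article"))
```

Note that, per the comments above, Googlebot apparently keeps re-crawling these paths regardless of whether you answer 301, 404, or 410.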

They might have some obscure incoming url from somewhere else on the net.

Webmaster Tools should show that. If you 404 the page, it'll appear in the errors pane after some time, along with its incoming sources.

It sounds like the only thing requesting the 404'ed page was Googlebot, which I do not believe tells you the referrer. If that's true, then either Google does not clear their cache (which I doubt), or the link exists somewhere on the net, but in a place where no human would find it. I've done some work with web crawlers, and you fall into that kind of hole a lot more often than I would have expected.

I'm not sure I understand: why wouldn't Webmaster Tools show that one hard-to-find link if Googlebot found it?

I removed a whole section from the site at the same time. Webmaster Tools shows the incoming links for every page in that section as other pages in the same section. It's a loop of pages linking to each other and generating "inbound links" even though none of them exists anymore and hasn't for many months.

Yeah, Webmaster Tools is really slow to update. Thankfully they offer a way to delete old pages. If they come back, though, it should show the source of the link.

Same here.
