Hacker News
frederickcook 1486 days ago | link | parent

But you have to wonder how much of that traffic is really spiders crawling old pages with stale links, versus people actually clicking on links from old pages. Probably much more of the former, and that wouldn't be hard to test for.
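
One rough way to run that test, as a sketch only: take (status, user agent) pairs parsed from the access log and count what fraction of 404s come from something that identifies itself as a crawler. The `BOT_PATTERN` marker list and the `spider_share_of_404s` name are made up for illustration, and user-agent matching will miss spiders that pretend to be browsers.

```python
import re

# Hypothetical crawler markers; a real list would need to be much longer.
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)


def spider_share_of_404s(requests):
    """Return the fraction of 404 responses whose user agent looks like a spider.

    `requests` is an iterable of (status, user_agent) tuples, assumed to be
    already parsed out of the server's access log.
    """
    total_404s = 0
    spider_404s = 0
    for status, user_agent in requests:
        if status != 404:
            continue  # only 404s are of interest here
        total_404s += 1
        if BOT_PATTERN.search(user_agent):
            spider_404s += 1
    # Avoid dividing by zero when the log has no 404s at all.
    return spider_404s / total_404s if total_404s else 0.0


sample = [
    (404, "Googlebot/2.1"),
    (404, "Mozilla/5.0 (Windows NT 6.1)"),
    (200, "Googlebot/2.1"),
    (404, "Yahoo! Slurp"),
]
share = spider_share_of_404s(sample)
```

If the share comes out high on real logs, that would back up the guess that most of the 404 load is spiders rather than people.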


grayrest 1486 days ago | link

The linked post itself says it's mostly spiders following links from various places on the web to old/invalid URLs:

> And the biggest performance boost of all: caching 404s and sending Cache-Control headers to the CDN on 404. Upwards of 66% of our server time is spent on serving 404s from spiders crawling invalid urls and from urls that exist out in the wild from 6-10 years ago.

They're still serving the 404 pages, but they're serving them from the CDN instead of from their main server.
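
For anyone wondering what "sending Cache-Control headers on 404" looks like, here is a minimal WSGI sketch. It is not the linked site's code: `KNOWN_PATHS`, `NOT_FOUND_MAX_AGE`, and the one-hour TTL are assumptions for illustration, and whether a given CDN honors the header on error responses depends on how that CDN is configured.

```python
# Hypothetical routing table standing in for the real application.
KNOWN_PATHS = {"/": b"home page\n"}

# Assumed TTL (seconds) that the CDN may keep a 404 before asking again.
NOT_FOUND_MAX_AGE = 3600


def app(environ, start_response):
    """Tiny WSGI app that marks its 404 responses as cacheable."""
    path = environ.get("PATH_INFO", "/")
    body = KNOWN_PATHS.get(path)
    if body is None:
        body = b"not found\n"
        headers = [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
            # Lets a shared cache (the CDN) answer repeat requests itself.
            ("Cache-Control", f"public, max-age={NOT_FOUND_MAX_AGE}"),
        ]
        start_response("404 Not Found", headers)
        return [body]
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` from the standard library will serve it. Repeat hits on the same dead URL can then be answered by the cache until the TTL expires, which is where the origin server saves its time.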

-----

dminor 1486 days ago | link

Still, I wonder what Google does with PageRank that points at a 404 vs. a permanent (301) redirect.

-----



