
The linked post itself says it's mostly spiders following links from various places on the web to old/invalid URLs:

> And the biggest performance boost of all: caching 404s and sending Cache-Control headers to the CDN on 404. Upwards of 66% of our server time is spent on serving 404s from spiders crawling invalid urls and from urls that exist out in the wild from 6-10 years ago.

They're still serving the 404 pages, but they're serving them off the CDN instead of off their main server.
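For anyone wondering what "sending Cache-Control headers on 404" looks like in practice, here is a rough sketch as a bare WSGI app. The post doesn't show their code, so everything here is an assumption for illustration: `KNOWN_PATHS`, the one-hour `NOT_FOUND_MAX_AGE`, and the exact header value are all made up, and whether a given CDN caches 404s based on this header depends on how that CDN is configured.

```python
# Hypothetical set of paths that still exist on the site (assumption).
KNOWN_PATHS = {"/", "/about"}

# Assumed TTL in seconds; the post does not say what value they used.
NOT_FOUND_MAX_AGE = 3600


def app(environ, start_response):
    """Minimal WSGI app that marks its 404 responses as cacheable."""
    path = environ.get("PATH_INFO", "/")

    if path in KNOWN_PATHS:
        body = b"ok\n"
        start_response("200 OK", [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    # Unknown URL: still a 404, but with a Cache-Control header so a
    # shared cache in front of the origin is allowed to answer repeat
    # requests for the same dead URL without hitting the origin again.
    body = b"not found\n"
    start_response("404 Not Found", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
        ("Cache-Control", f"public, max-age={NOT_FOUND_MAX_AGE}"),
    ])
    return [body]
```

The spider still gets a 404 either way; the only difference is which machine spends the time producing it.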




Still, I wonder what Google does with PageRank that flows to a 404 vs. a permanent redirect.
