Can I ask what was going wrong in the past that made Google take so long to crawl your site - and what you did to bring the crawl time down so dramatically?
For reference, when some sites I was tracking spiked above a few hundred ms of crawl time, I grew alarmed and resolved those issues.
Also, another good way to measure things is to look at 'Site Performance' in the Labs section of Webmaster Tools (right below Diagnostics used in this post). You'll get a graph that represents your site relative to the Internet along with improvement suggestions.
Also, while the conclusion seems accurate (Google has talked about faster load times improving ranking in perhaps 1% of cases), I think your reasoning is off.
You wrote: "If a site can be crawled faster - and requires less resources to index, doesn’t it stand to reason that it will be rewarded with higher search rankings?"
It's not resources, it is a matter of user experience. The faster your site loads, the happier Google's searchers who clicked over to you are:
Speeding up websites is important - not just to site owners, but to all Internet users. Faster sites create happy users and we've seen in our internal studies that when a site responds slowly, visitors spend less time there.
I understand the user experience issue, but I'm trying to illustrate that there's also the issue of Google's resources. They are a business after all, no matter how they hope to not be evil.
I don't see how it factors into their resource usage at all. A crawler that's blocked waiting for a reply from your site is unlikely to be using much/any CPU time compared to an active crawler, and that time is surely being used by another crawler on that machine. All crawlers blocked and waiting? That means they can spin up 50 more!
Unless they're doing it wrong, and I very much doubt they are given it's central to their business, it's purely a user experience issue.
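To make the point about blocked crawlers concrete, here is a minimal sketch in Python. It is not how Google's crawler works (nobody in this thread knows that); fake_fetch and timed_crawl are hypothetical names, and the network wait is simulated with asyncio.sleep. It only shows that many requests waiting on slow servers can overlap, so the total wall-clock time stays close to the slowest single wait.

```python
import asyncio
import time


async def fake_fetch(delay_s):
    """Hypothetical stand-in for a network request: waits without burning CPU."""
    await asyncio.sleep(delay_s)
    return delay_s


async def crawl_concurrently(delays):
    """Start every simulated fetch at once and wait for all of them."""
    return await asyncio.gather(*(fake_fetch(d) for d in delays))


def timed_crawl(delays):
    """Run the simulated crawl; return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(crawl_concurrently(delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    # 20 "slow" pages at 300 ms each; run one after another this would take about 6 s.
    results, elapsed = timed_crawl([0.3] * 20)
    print(f"{len(results)} fetches finished in {elapsed:.2f} s")
```

Memory per waiting request and politeness limits per site are ignored here, so treat it as an illustration of the argument rather than a model of a real crawler.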
I don't know the details of how their crawler works, but it seemed to me that if a page takes more time to serve to the crawler, there would be some lost resources.
Even so, faster page load time also means higher AdSense CTR, and more money - though I guess you could make an indirect user experience case for that.
There's bound to be a small amount of memory used by the idle crawler, but if they designed it well, it seems unlikely that that would be a limiting factor.
You make a fair point about the AdSense. I would guess they care more about a better user experience, but it's certainly possible. Everything they do can be viewed as a means to keep people searching and using the web, and clicking on more ads.
I just left a similar comment on the post - it likely wouldn't affect resources. The ranking bonus from fast loading pages is probably because Google knows humans prefer fast loading sites.
Google's page loading times seem kind of flaky. One of our sites' loading times has jumped between 1.5 and 10 seconds and back four times since they started graphing it.
But I have a cron job that tracks the loading times of random pages on our sites (from offsite, every 10 minutes). I get very consistent loading times (nowhere near that variance).
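For anyone who wants to run a similar check, a rough sketch of that kind of timing script in Python follows. This is not my actual script; fetch_page, time_fetch_ms and sample_load_times are made-up names, and it only times the raw HTML download with urllib, not images, CSS or rendering. Scheduling it every 10 minutes is left to cron.

```python
import random
import time
import urllib.request


def fetch_page(url, timeout=10.0):
    """Download one URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def time_fetch_ms(url, fetch=fetch_page):
    """Return the wall-clock milliseconds taken to fetch one URL."""
    start = time.perf_counter()
    fetch(url)
    return (time.perf_counter() - start) * 1000.0


def sample_load_times(urls, sample_size=3, fetch=fetch_page):
    """Time a random sample of pages; returns {url: milliseconds}."""
    urls = list(urls)
    chosen = random.sample(urls, min(sample_size, len(urls)))
    return {url: time_fetch_ms(url, fetch) for url in chosen}
```

Passing the fetch function in as a parameter makes it easy to swap in a fake one for testing without touching the network.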
(Edit: they also suggest gzipping resources that are in fact already gzipped, at least according to the HTTP headers)
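A quick way to check the headers yourself, sketched in Python with only the standard library. The function names are hypothetical; the check asks for gzip via Accept-Encoding and looks at the Content-Encoding response header, which urllib leaves untouched because it does not decompress responses on its own.

```python
import urllib.request


def is_gzip_encoded(content_encoding):
    """True if a Content-Encoding header value lists gzip (or x-gzip)."""
    if not content_encoding:
        return False
    codings = [c.strip().lower() for c in content_encoding.split(",")]
    return "gzip" in codings or "x-gzip" in codings


def url_serves_gzip(url, timeout=10.0):
    """Request a URL with gzip allowed and report whether the server used it."""
    req = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return is_gzip_encoded(resp.headers.get("Content-Encoding"))
```

This only reports what the server claims in its headers, which is the same evidence I was going on above.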
I think this post has a grain of truth behind it, though I don't agree with the reasoning about resource utilization.
I noticed that when one of my sites went from 300ms response times to 30ms, the crawler started indexing more pages per day, and would index deeper, which meant more of my pages in their index. The result was a healthy boost in organic search traffic due to more long tail search matches.
I'm starting to wonder if people are interpreting "resources" to mean "CPU resources," when I really mean "resources" as in time, money, etc.
What's your hypothesis for why the crawler started indexing more pages per day when your response time improved? That seems like evidence of my theory, but I'm a designer by training so maybe there's a more technical explanation I'm missing.
Yeah, I assumed you meant machine resources, since those tend to be the limiting factor for massively parallel things like crawling.
My hypothesis is that there's a single crawler that was looking at my site, and it makes a best effort to crawl as much as it can. I have more pages on my site than can be crawled in a single day at 300ms/page, but at 30ms/page, it can be done in 3-4 hours. I don't know enough about their architecture to make any further guesses than to say that it probably just grabs as much as it can within the time it's focused on my site before something else on the queue becomes a higher priority.
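The arithmetic behind that, as a small Python sketch. It assumes a single crawler fetching strictly one page at a time with nothing but response time counted, which is my guess and not a known fact about Googlebot; crawl_hours and pages_in_budget are names made up for the example.

```python
def crawl_hours(pages, ms_per_page):
    """Hours needed to fetch `pages` pages sequentially at `ms_per_page` each."""
    return pages * ms_per_page / 1000.0 / 3600.0


def pages_in_budget(budget_hours, ms_per_page):
    """How many sequential fetches fit into a time budget given in hours."""
    return int(budget_hours * 3600 * 1000 // ms_per_page)


if __name__ == "__main__":
    # Page counts implied by "3-4 hours at 30ms/page".
    low = pages_in_budget(3, 30)
    high = pages_in_budget(4, 30)
    print(low, high)                   # 360000 480000
    print(crawl_hours(low, 300))       # 30.0 hours at the old speed
    print(crawl_hours(high, 300))      # 40.0 hours at the old speed
    print(pages_in_budget(24, 300))    # 288000 pages per day at 300 ms
```

So under those assumptions the same set of pages needs 30 to 40 hours at 300ms/page, which does not fit in a day, and a full day at the old speed covers only about 288,000 pages.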
See http://googlewebmastercentral.blogspot.com/2010/04/using-sit...