
Stop worrying about Time To First Byte (TTFB) - jgrahamc
http://blog.cloudflare.com/ttfb-time-to-first-byte-considered-meaningles
======
forgotusername
I'd guess this minutia was written in response to some unreferenced critique
of their service; in reality, nobody with sense would use TTFB as a useful
measure of application performance.

In fact, measuring performance anywhere near the HTTP layer is often pointless
when such metrics are so disconnected from what a user actually experiences.
Perhaps a better test would be something like "perceptible latency", taking
into account factors like the browser's ability to progressively render
partially transmitted HTML, or long hangs caused by external objects
(e.g. web fonts, GL models, etc.), especially in a world of increasingly
complex front-end JavaScript blocking the browser's UI thread (syntax
highlighters on blogs, anyone?).

As Apple demonstrated on the iPhone, it's more than possible to hide multi-
second delays using tricks as simple as showing a facsimile image of what will
shortly be loaded. Similarly, a web app that animates a form submit with an
unobtrusive 400ms animation (hiding a 500ms response time) may appear snappier
from a user's perspective than an app with no animation but a 300ms response
time.

Actually, CloudFlare's own signup process is exemplary in this respect: it
buries half-minute-long DNS queries in the background while the user continues
filling out forms.

~~~
TazeTSchnitzel
On the subject of perceptible latency, supposedly this is why Google's Chrome
iPad app feels fast yet is actually slower than Mobile Safari at JavaScript.
If you do the UI right, you can hide latency.

And Microsoft talked a lot, with Metro, about using animations to hide
latency. By animating everything you can hide most of the delay, and only show
loading screens when it's a long wait.

~~~
jexe
Do you have a link to the MS talk/article about their work with Metro? I'd be
very interested.

~~~
TazeTSchnitzel
Oh, er, I don't have a link, but I believe it was from a talk they did at
//build about Metro. It shouldn't be too hard to find.

------
patrickmeenan
Completely ignoring TTFB would be a BAD idea. There is no single metric that
conveys the user experience (or performance).

Certainly, optimizing away 1ms of gzip compression overhead isn't where you
should be spending your time, but I have dozens of cases in the WebPagetest
forums where users had 5+ second TTFB times and needed help figuring out what
was going on. I have even seen cases where it was over 20 seconds (search for
TTFB in the forums and you'll be shocked). It is usually a combination of
shared hosting and excessive database queries by their CMS, but it is common
enough that completely ignoring it would be a very BAD idea.

Companies like Cloudflare can help the real TTFB (without cheating) by
optimizing the connection between the end user and the origin site. Most CDNs
call it DSA: they maintain a persistent connection back to the origin and
eliminate some of the round-trip times. Cloudflare's recently-launched Railgun
feature should have a similar benefit.

It's actually a well-known web performance best practice to "flush the
document early" if you have any expensive back-end processing to do. This
isn't cheating: it involves sending as much of the HTML content as possible up
front to give the browser a head start downloading external resources (CSS,
JS, etc.).

Doing it with just the HTTP headers in order to cheat the metric itself is not
in anybody's interest.
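The "flush the document early" approach can be sketched as a hypothetical
Python WSGI app (the app itself is illustrative, not any framework's real
code): the generator yields the `<head>` immediately, so the browser can start
fetching CSS/JS while the slow back-end work runs.

```python
import time

def app(environ, start_response):
    """Stream the response: <head> goes out before the slow work finishes."""
    start_response("200 OK", [("Content-Type", "text/html")])
    # Flush the part of the page that references external resources first,
    # so the browser can begin downloading them in parallel with our work.
    yield b"<html><head><link rel='stylesheet' href='/main.css'></head>"
    time.sleep(0.1)  # stand-in for expensive back-end processing
    yield b"<body>...page content...</body></html>"
```

Any WSGI server that doesn't buffer generator output will put the first chunk
on the wire right away; the same shape works with chunked transfer encoding in
most server stacks.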

~~~
cbr

> There is no single metric that conveys the user experience (or performance)

True. But if you need to pick one, TTFB isn't as good as something like Speed
Index [1]. Getting your TTFB down will usually help you get your content in
front of the user faster by, as you say, giving it a head-start on external
resources, but that will also show up in the Speed Index.

[1] <https://sites.google.com/a/webpagetest.org/docs/using-webpagetest/metrics/speed-index>

------
xpose2000
I am glad that TTFB worries have been debunked by Cloudflare. However, more
issues still linger:

1) Response times, in general, are terrible with Cloudflare enabled. This
chart [<http://goo.gl/JX1v6>] shows how Googlebot response times change with
Cloudflare enabled and disabled. With Cloudflare enabled in June, response
times are above 1000ms. I disabled it in mid-June and response times returned
to normal.

2) Cloudflare is getting worse. Look at the chart again. Notice the elevated
response times in April? I also had Cloudflare enabled then. They hit around
400-500ms. Then I disabled it again at the first of May.

3) AJAX requests, most notably POSTs, show noticeably increased latency. Some
days are better than others.

I do not have anything "special" enabled. Minimize HTML, JS, CSS is not
enabled, nor is Rocket Loader.

I'd also like to add that when "cache everything" + Minimize HTML, JS, CSS +
Rocket Loader are enabled, response times seem perfectly normal on a different
site hosted on the same box. Googlebot: [<http://goo.gl/DRDnF>]. However, most
of us cannot use this feature unless our content is static.

Everything posted here is based off a free account. I will also be cross-
posting this on the cloudflare blog post that this links to.

~~~
jgrahamc
It might be better to put in a support request on this rather than posting on
the blog. I've highlighted your comment on our internal chat.

------
aaron42net
There's an important optimization around real-world TTFB that Nginx is
missing, however, and it would benefit Cloudflare users.

Even though zlib performs streaming compression, it normally packs compressed
content into blocks larger than 1500 bytes. Browsers can start parsing gzipped
content as soon as they can decompress a block, but they have to wait for a
whole block to arrive, which means that for gzipped content they will often
have to wait for more than the first packet. (Due to TCP slow start, this may
mean another round trip.) This means they are also waiting longer to start
fetching <head> contents.

There's a simple fix for this: call deflate(..., Z_SYNC_FLUSH) on a chunk that
is likely to compress to under 1500 bytes, like maybe all of <head> or the
first 4k. The total compressed size will be slightly larger, but the tradeoff
is usually worth it. Nginx doesn't currently do this, but it would be a nice
optimization to make available.
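The effect is easy to demonstrate with Python's zlib bindings (a sketch of the
technique, not nginx's actual code path): after a Z_SYNC_FLUSH, the bytes
emitted so far form a fully decodable prefix, so a receiver holding only the
first packet can already recover the complete <head>.

```python
import zlib

head = b"<html><head><link rel='stylesheet' href='/main.css'></head>"
body = b"<body>" + b"lorem ipsum " * 2000 + b"</body></html>"

comp = zlib.compressobj(wbits=31)  # wbits=31 selects the gzip container
# Z_SYNC_FLUSH aligns the stream to a byte boundary: everything compressed so
# far becomes independently decodable, at a small cost in compression ratio.
first_packet = comp.compress(head) + comp.flush(zlib.Z_SYNC_FLUSH)
rest = comp.compress(body) + comp.flush()

# A decompressor fed only the first chunk recovers the full <head>.
decomp = zlib.decompressobj(wbits=31)
partial = decomp.decompress(first_packet)
```

Here `partial == head` even though the rest of the stream hasn't arrived;
without the sync flush, the deflate block boundary would typically fall
somewhere inside the body.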

It's possible to work around this in Nginx by using the embedded Perl module
and SSI. After </head>, I use SSI to call Perl's $r->flush, which Nginx
correctly translates into Z_SYNC_FLUSH. The improvement is small, but
measurable.

~~~
jgrahamc
Thanks for that suggestion. I will look into altering nginx to do that.

------
isalmon
"At CloudFlare TTFB is not a significant metric." People might think that this
article was written to educate the masses. The thing is, if you take
webpagetest.org and run speed tests with and without Cloudflare, in MANY cases
'Load time' will actually be slower WITH CloudFlare. And the biggest
difference will be in TTFB.

Why does it happen? Cloudflare works as a proxy, taking HTML from your server
and returning it to the user in optimized form. Because all requests go
through their servers, there's one additional hop, and with this hop TTFB
increases significantly.

Although I agree that TTFB is actually not that important from the end user's
perspective, the reason Cloudflare wrote this article is strictly marketing.

~~~
saurik
The hop is not the problem. In fact, an extra hop can actually be a benefit,
due to various TCP and connectivity factors. The issue might actually be that
"CloudFlare buffers all the content and does complex operations on it before
returning any of it", or maybe that it is simply a poor CDN (not really being
very distributed, for example).

For more information, see a long comment I wrote on this a while back (note:
while reading, remember that a CDN typically has a server very near the end
user but not so near as to be on the other end of the last mile of their
network connectivity; in essence, they are a reverse proxy solution positioned
in the network where a forward proxy would normally go), as well as the
response someone left:

<http://news.ycombinator.com/item?id=2823268>

------
josephscott
The small table comparing a large Wikipedia page is really the main point: a
relative comparison of TTFB.

The mistake is making a relative comparison on such super small numbers.
If your TTFB is less than 2ms, then you are fine. You are better than fine.

If your TTFB is 1.5 seconds, then you are struggling.

When determining whether you have a TTFB issue, comparing the relative
improvement of super small numbers isn't likely to help you much.

------
igrigorik
This is just silly. Time to first byte _does matter_, when you know what to
put in those first bytes.

<https://plus.google.com/114552443805676710515/posts/GTWYbYWP6xP>

~~~
jgrahamc
I've replied on G+, but I think it's worth pointing out that I don't disagree
that time to the first useful byte matters. The issue is that what's being
reported isn't that, and the time of header generation does not depend on the
time to the first useful byte (as the gzip example indicates).

------
henrikschroder
But if you have a dynamic webpage, TTFB measures how long it took you to
process the page and start outputting results. The rigged server in this test
doesn't seem relevant to _that_ case, or am I missing something?

~~~
infinity
The article examined what some of the page load speed tests measure as TTFB:
the time until the arrival of the first character of the HTTP headers.

When using a dynamically generated page it is possible to send the HTTP
headers, then maybe send some content, then do some server-side calculations
or database access, and finally send the rest of the page content. The TTFB
measured by the tests mentioned in the article will not reflect the time
needed for the server-side calculations.

When using a server-side scripting language there may be some kind of output
buffering, which has to be deactivated first.

~~~
bad_user
Not many web frameworks are configured to start writing the response before
the response is actually ready.

This is because when sending the response you need to know the HTTP status of
that response. Is it a redirect? And if you're querying the database lazily,
after your page has started to render, what about DB errors that could happen?
Then you'd need to send an HTTP 500.

This is the drawback of using this feature in Rails 3. You have to ensure that
there's no way something unpredictable can happen, and you have to activate
the feature explicitly in your controllers. And once you've done that, you're
probably already aware that TTFB is not that relevant.

------
ralph
Didn't TTFB become popular because, back then, the start-up/handover time of
the request from the server to the URL handler (CGI, FastCGI, etc.) was
significant, depending on the handler's language? That's far less significant
now. Cutting TCP packets with compression is much more useful.

------
saetaes
One thing that's interesting is that the Navigation Timing API available in
modern browsers varies from being wildly incorrect to accurate, depending on
the browser. In a quick test on my Mac, Firefox 13 showed an incorrect result
(basically, the same issue as WPT and Gomez - almost immediate TTFB), while
Chrome 20 correctly detected the 10s TTFB.

~~~
saetaes
One more data point: On Windows, both IE 9 and Chrome 20 measure TTFB
correctly via NavTiming, while Firefox does not.

------
Kudos
To me it looks like Cloudflare are going to use their position in the
request/response cycle to measure server response times and inject javascript
to enable real user monitoring. It should let them build performance metrics
similar to New Relic's.

------
shawabawa3
I don't understand the point.

Surely TTFB is used to measure server latency (basically a ping). Why would
TTFB be used as a proxy for page load speed when you could just use...page
load speed?

~~~
gilini
I believe they're referring to using TTFB as a performance test, to measure
how long the server took to process your request.

~~~
dknecht
This is a metric that is commonly used in the popular webpagetest.org
performance tests.

------
mstdokumaci
Actually, HTTP headers are not that easy to send early. How do you plan to
create the "Content-Length" header before rendering the whole response?

~~~
WALoeIII
This is why nginx gzips the body before writing the headers out.

You can use 'chunked_transfer_encoding' to enable chunking; combined with
'proxy_buffering off', your backend can stream the body and nginx will gzip
the chunks. Disabling buffering has other consequences, so be sure to read up
and experiment before you go to production.
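For reference, a minimal proxy block along those lines (the directive names
are real nginx directives; the upstream address is a placeholder, and defaults
such as chunked_transfer_encoding being on may vary by version):

```nginx
location / {
    proxy_pass http://127.0.0.1:8080;  # placeholder backend
    # Stream the backend's body instead of buffering it in full first.
    proxy_buffering off;
    chunked_transfer_encoding on;      # on by default in modern nginx
    gzip on;                           # nginx compresses chunks as they pass through
}
```

As the parent says, test this before production: with buffering off, a slow
client ties up a backend connection for the life of the response.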

