PHP's ob_start() pre-allocates 40k memory per call (ilia.ws)
36 points by vlucas on Dec 8, 2011 | 29 comments



If I'm reading the PHP source correctly, in 5.4 this is changed to 16 KB (this entire code block seems to have been rewritten):

http://svn.php.net/viewvc/php/php-src/trunk/main/output.c?vi...

http://svn.php.net/viewvc/php/php-src/trunk/main/php_output....

   #define PHP_OUTPUT_HANDLER_DEFAULT_SIZE     0x4000


It would be great if the PHP docs included such information right where the function is described.


It's probably hidden somewhere in the comments from the last 10 years, next to the awful comments written in foreign languages.


Why on earth are you trying to buffer 1-2K?


Because in PHP the response starts being sent as soon as you produce non-buffered output, after which you can no longer modify headers, cookies, etc.
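
A minimal sketch of that, assuming a made-up header name and cookie:

    <?php
    // Illustrative sketch: without the ob_start(), the first echo would
    // flush the headers to the client and the header()/setcookie() calls
    // below would fail with "headers already sent".
    ob_start();                        // start buffering before any output

    echo "<p>Some early output</p>";   // held in the buffer, nothing sent yet

    header('X-Example: still-fine');   // still allowed: headers not sent
    setcookie('example', 'value');     // likewise

    ob_end_flush();                    // headers and buffered body go out together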


Of course, there's no reason to output everything as you go. You can use string concatenation and output the result at the end of processing rather than 16 layers of ob_start(), which seems a bit excessive.
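
A rough sketch of that concatenation approach (the function and sample data are invented):

    <?php
    // Build the page in one string and emit it once at the end, instead of
    // wrapping every piece of output in its own ob_start()/ob_end_*() pair.
    function render_page(array $rows)
    {
        $html = "<ul>\n";
        foreach ($rows as $row) {
            $html .= '<li>' . htmlspecialchars($row) . "</li>\n";
        }
        return $html . "</ul>\n";
    }

    $body = render_page(array('first', 'second', 'third'));
    header('Content-Length: ' . strlen($body));  // headers still modifiable
    echo $body;                                  // single write at the very end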


I would think the BEST way in theory would be to not output strings at all, but rather build some kind of logical representation via data structures (i.e. a DOM tree as in jQuery, or HTML templates tied to a code-behind as in ASP.NET) and then output that after all processing is done.

What's the best PHP framework/library for that kind of thing?


Funny you should ask that. There's that popular templating framework that is really good at transforming a flat or tree-ish structure of arrays and objects into a flat string. It also has a nice textual persistence format which lets you mix XML literals with processing instructions; well supported by most text editors out there. The framework is not all that great for other uses, but really shines at templating. You mix and match both standard DOM -- including XSLT and whatnot -- and its fast, simple native arrays and objects. Available on many servers. The framework originally targeted Perl, but later on was re-implemented in C as a standalone entity, for better performance.

They call it `PHP'.


Ummm, all of them?

The basic problem you run into is that PHP is basically a templating language at its heart. If you don't immediately open with a "<?" you start sending data. If you want to use PHP as it was originally intended -- as raw HTML with a smattering of dynamic bits in the middle -- without losing the ability to set headers, then ob_start makes sense. If on the other hand you want to use it in the way that most people do, you open with "<?php" and proceed to abuse a whole bunch of bastardized Perl templating functions, followed by an "echo $data; ?>" at the end of your source code.

Now, given that PHP as a templating language is basically terrible, it's not that far out to just say, "To hell with PHP's native templating features, we're going to use our own!" but at that point you are already pretty far gone.


Actually that's not necessarily the best way, because it's very memory intensive. Imagine a 50-row table containing text columns, link columns, images, and buttons. That is a lot of objects in a DOM representation. A procedural loop generating the same table is significantly faster and smaller (but harder to work with). I've found the best thing is some combination -- some data structures and some flat procedural processing.
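
For comparison, a sketch of the procedural version of such a table, with each row emitted as a flat string rather than held as a tree of node objects (columns and data made up):

    <?php
    // Procedural generation: one pass over the data, rows emitted as flat
    // strings, so memory stays roughly proportional to a single row.
    $rows = array(
        array('name' => 'Alice', 'url' => '/u/alice'),
        array('name' => 'Bob',   'url' => '/u/bob'),
    );

    echo "<table>\n";
    foreach ($rows as $row) {
        printf("<tr><td>%s</td><td><a href=\"%s\">profile</a></td></tr>\n",
               htmlspecialchars($row['name']),
               htmlspecialchars($row['url']));
    }
    echo "</table>\n";
    // A DOMDocument version of the same table allocates an object per
    // element and text node -- the "lot of objects" referred to above.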


ob_start() basically is string concatenation.


Your OS does the equivalent of ob_start() on network sockets by default. See the SO_SNDBUF, TCP_CORK, and TCP_NODELAY options to setsockopt().

If data sent over the network weren't buffered, partial frames would be sent most of the time. With buffering, the OS usually sends full frames, making better use of bandwidth at a slight cost in delay and a tiny bit of extra memory. There are even some tunable smarts built in to keep the delay manageable.

The OS tries to assemble full frames from partial writes by applications. If the OS didn't provide that service, it would be up to the application to align its write sizes for optimal bandwidth utilization.

You don't really want PHP developers to have to think about the size of the strings they output, now do you? ;-) For reference, http://www.pbm.com/~lindahl/mel.html -- a guy who hand-picked the intervals between instructions in drum memory for better performance -- not exactly the kind of work for a typical web developer.
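
For illustration, roughly how those knobs look from PHP's sockets extension (host, port, and sizes are placeholders -- in practice this tuning lives in the web server and OS, not in a PHP script):

    <?php
    // Rough sketch only; TCP_NODELAY is defined by the sockets extension
    // on recent builds, and TCP_CORK is not exposed in PHP at all.
    $sock = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
    socket_connect($sock, '192.0.2.1', 80);

    // Enlarge the kernel's send buffer (SO_SNDBUF).
    socket_set_option($sock, SOL_SOCKET, SO_SNDBUF, 64 * 1024);

    // Disable Nagle's algorithm (TCP_NODELAY): small writes leave as
    // partial frames immediately, trading bandwidth efficiency for latency.
    socket_set_option($sock, SOL_TCP, TCP_NODELAY, 1);

    socket_write($sock, "GET / HTTP/1.0\r\nHost: example.com\r\n\r\n");
    socket_close($sock);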


You realize that because the OS provides buffering at the transport layer, applications don't need to care about frame sizes or any of that. And PHP is even further removed from the network, because it communicates only locally with your web server, which in turn may (and probably will) do buffering of its own.


That explains one ob_start() but not 16 of them.


I always treated output buffering/capturing like eval: if you have to use it, ask yourself carefully why, and whether there is a better way.


As any European user of HN knows too well, when you send tiny chunks down to TCP a few ms apart, the majority of the planet gets a jittery, half-rendered experience of your home page that lasts seconds for a handful of actual KB downloaded.

The better question might be "if you want to turn it off, ask yourself why", or something like that.


Using one ob_start() could make sense for certain cases. But 16 of them? I can't think of any reason to nest ob_start 16 levels deep.


This is a common approach in MVC frameworks, where every view (template/partial/widget) is rendered independently and then injected into the main template.


Me neither; I'd be interested in the use case for that many. A few years ago I made a PHP-based template system that used one nested buffer as a poor man's eval of template files (HTML with display-logic PHP); it just boiled down to:

    ob_start();
    include $file;
    $contents = ob_get_contents();
    ob_end_clean();


Yup, this is common. If you put this in a function and use extract you can create a clean working space for a template where you can use $vars without worrying about polluting scope.

I'd assume the 16x calls would be because they have a very complicated set of templates/layouts/partials, all doing their own templating.
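
A sketch of that pattern, which is roughly what an MVC framework's partial rendering boils down to (the function name and file names here are made up):

    <?php
    // Render a template file in its own scope; entries of $vars become
    // local variables inside the included file via extract().
    function render_template($file, array $vars = array())
    {
        extract($vars);         // e.g. array('title' => '...') becomes $title
        ob_start();             // one nested buffer per template being rendered
        include $file;
        return ob_get_clean();  // ob_get_contents() + ob_end_clean() in one call
    }

    // If layout.php itself calls render_template() for its partials, the
    // buffers nest -- a deep partial tree is one way to reach many levels.
    $users   = array('alice', 'bob');
    $content = render_template('partials/user_list.php', array('users' => $users));
    echo render_template('layout.php', array('title' => 'Users', 'content' => $content));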


That's actually exactly what I was doing in the full context.

I can see 16 calls over the lifetime of the request, but what was confusing me was how they could have 16 nested calls forming some ob_start(); ob_start(); ... ob_end_flush() chain. Maybe some crazy way to avoid having to return text up the tree through function returns. But I read the post again and it said nothing about nesting them, so it's probably just something like 16 sub-templates, as you suggest, being processed at different intervals -- not necessarily nested -- with the gc not collecting frequently enough.


> with the gc not collecting frequently enough.

PHP is written in C which is not garbage collected.

PHP itself collects garbage, but this buffer is an internal C buffer and would be released immediately.


...Yeah, not sure what I was thinking with that one, thanks for replying. (My excuse! I've been up over 24 hours already...)


I'm thinking various components, each doing its own ob_start() for its output, something like that.


I feel like most people use this as a stop-gap for gzipping content to reduce bandwidth, but hopefully more people relying on nginx/Apache to handle compression will reduce the circa-2004 PHP mentality that started the practice.

Oh Steve Souders, your quest for speed will doom us all:)
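
For reference, the pattern in question looks something like this; ob_gzhandler is the stock PHP callback, though web-server-level compression is usually the better home for it:

    <?php
    // Circa-2004 pattern: compress the whole response inside PHP by routing
    // all output through ob_gzhandler (it checks Accept-Encoding itself).
    ob_start('ob_gzhandler');

    echo "<html><body>... lots of compressible markup ...</body></html>";

    // The buffer is flushed automatically at the end of the request.
    // These days the same effect is better left to the web server
    // (nginx `gzip on;`, Apache mod_deflate) or zlib.output_compression.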


The more usual solution for growable buffers is to start small and just double each time you hit the limit, not grow it by a static amount. People have tried a lot of different strategies, but there don't seem to be any that work much better for general purpose use.


Depending on the number of objects you're allocating, it's often better (for reduced fragmentation and smooth performance) to use an allocation strategy like GCC's std::deque: allocate fixed-size buffers, and use an exponentially growing array to hold pointers to those buffers. Once you start getting into millions of elements in your std::vector, you'll see long program pauses (as the vector grows and all the elements are copied to the new vector) and sometimes have difficulty finding a large enough contiguous buffer.

This is one reason why the C++ cognoscenti recommend using std::deque as your default array-like container rather than std::vector.


This is not merely a matter of trying and picking what suits you best. If you grow a buffer by fixed amounts, you end up copying O(n^2) items. If you grow it by a fixed factor, you end up copying O(n) items. The difference can be staggering.
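
A quick back-of-the-envelope way to see the difference, counting copied bytes under each strategy (sizes picked arbitrarily):

    <?php
    // Count the bytes copied while growing a buffer to $target bytes,
    // (a) by a fixed increment and (b) by doubling.
    function copies_fixed($target, $step)
    {
        $size = $step; $copied = 0;
        while ($size < $target) {
            $copied += $size;   // existing contents recopied on each realloc
            $size   += $step;
        }
        return $copied;
    }

    function copies_doubling($target, $initial)
    {
        $size = $initial; $copied = 0;
        while ($size < $target) {
            $copied += $size;
            $size   *= 2;
        }
        return $copied;
    }

    // Growing to 100 MB: roughly 134 GB copied in 40 KB fixed steps,
    // versus roughly 168 MB copied when doubling from 40 KB.
    echo copies_fixed(100 * 1024 * 1024, 40 * 1024), "\n";
    echo copies_doubling(100 * 1024 * 1024, 40 * 1024), "\n";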


You only end up copying O(N^2) elements if you copy elements at all. You need not copy: you just keep a vector of pointers to your fixed buffers, a la GCC's deque implementation. That vector of pointers will be copied (and thus ought to grow by doubles), but each fixed buffer remains the same as the deque grows.

I haven't looked at PHP's implementation, and I've long since learned not to expect intelligent, rational implementation decisions from that language community, but it's certainly possible to grow by a fixed amount and remain O(N) by never copying elements at all.
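
A rough sketch of that chunked idea in PHP terms: fixed-size chunks collected in an array, with finished chunks never copied as the buffer grows (the class is invented for illustration; the 16 KB chunk size just mirrors the 5.4 default mentioned above):

    <?php
    // Grow by a fixed chunk size without copying old data: finished chunks
    // sit untouched in an array, only the small in-progress chunk is ever
    // rewritten, and everything is joined once at the end.
    class ChunkedBuffer
    {
        private $chunkSize;
        private $chunks = array();  // completed fixed-size chunks
        private $current = '';      // chunk currently being filled

        public function __construct($chunkSize = 16384)
        {
            $this->chunkSize = $chunkSize;
        }

        public function append($data)
        {
            $this->current .= $data;
            while (strlen($this->current) >= $this->chunkSize) {
                $this->chunks[] = substr($this->current, 0, $this->chunkSize);
                $this->current  = substr($this->current, $this->chunkSize);
            }
        }

        public function get()
        {
            // One pass over all chunks: O(n) total work.
            return implode('', $this->chunks) . $this->current;
        }
    }

    $buf = new ChunkedBuffer();
    $buf->append(str_repeat('x', 50000));
    echo strlen($buf->get()), "\n"; // 50000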



