

Massive RedHat Perl performance issue - aditya
http://blog.vipul.net/2008/08/24/redhat-perl-what-a-tragedy/

======
ajross
One issue worth pointing out is that this slots nicely into the various "Why
should we have to learn C?" discussions that pop up here from time to time.

This is why. Eventually, something dumb will happen inside your tools, and
you'll need to figure it out. Blind faith in any of the software you rely on
is bad; you need to know how it works. And to a first approximation, all the
software you rely on is written in C.

~~~
kingkongrevenge
I don't disagree, but diagnosing this particular problem didn't require knowing
any C, just the perl profiler and another perl build.

~~~
ajross
Only if by "diagnosing" you mean "lucking out and finding a pre-existing
bugzilla entry on the problem". Someone earlier had to identify the problem
and crawl through the RH patch sets looking for it. The author, who doesn't
want to think beyond perl, was stuck without this. Not everyone is so lucky.

~~~
kingkongrevenge
Uh, no. Diving into C was unnecessary to solve this problem any way you slice
it. In fact, it would have been the most inefficient way to figure out that
another perl build was needed.

------
tptacek
They caused an order of magnitude slowdown without hitting the disk? That's
impressive.

Apple totally fucked sqlite for a while (it may still be; I compile from source
now) by doing a full filesystem flush (not fsync) on every commit:

[http://adiumx.com/pipermail/adium-devl_adiumx.com/2008-April/004823.html](http://adiumx.com/pipermail/adium-devl_adiumx.com/2008-April/004823.html)

There was a (crazy-talk) rationale for it, though: fsync wasn't thought to be
"reliable" enough, so the order-of-magnitude slowdown was for our own good.
Doubt bless() really "needed" the slowdown for RHEL.

~~~
andrewf
From the fsync man page on OS X 10.5:

    
    
         Fsync() causes all modified data and attributes of fildes to be moved to
         a permanent storage device.  This normally results in all in-core
         modified copies of buffers for the associated file to be written to a disk.
    
         Note that while fsync() will flush all data from the host to the drive
         (i.e. the "permanent storage device"), the drive itself may not
         physically write the data to the platters for quite some time and it may be
         written in an out-of-order sequence.
    
         Specifically, if the drive loses power or the OS crashes, the application
         may find that only some or none of their data was written.  The disk
         drive may also re-order the data so that later writes may be present,
         while earlier writes are not.
    
         This is not a theoretical edge case.  This scenario is easily reproduced
         with real world workloads and drive power failures.
    

It's still Apple's fault on some level (after all, they control everything
from the fsync implementation to the hard drives they choose to ship in Apple
hardware) but from the perspective of the guy configuring sqlite, the full
filesystem sync makes sense.

------
paul
This is a great example of why it's important to actually determine the root
cause of a performance problem before making any decisions about how to fix
it. Performance problems are very often something stupid and not at all what
you would expect.

~~~
spc476
But I have to wonder how RedHat is compiling the packages. A few years ago I
was bitten by this, when the system-supplied regex library I was linking
against (I was writing a C program) was actually slower than a shell script
with 20 greps in a pipe (<http://boston.conman.org/2003/01/12.1>). This took
quite a while to track down and even then I found it hard to understand what
RedHat did when compiling the library in question.

Way to go, RedHat!

~~~
maw
It's pretty easy to find out by pulling down one of their srpms, or by looking
at fedora cvs. Why don't you do that?

------
durana
I've found that a good practice for building systems is not to rely on the
software that comes with the OS for the specific task the system is being
built for. When I can, I always build task-specific software from source; not
because the software that ships with the OS is always bad, but because
building from source gives you a lot more control over the software
(compile-time features, paths, etc). You can also typically get a more recent
release when building from source, since it doesn't have to go through the OS
vendor.

~~~
SwellJoe
This seems like a good idea on the surface, but it has some pretty serious
negative consequences.

When you need to replicate your environment, you now have to build all of the
custom bits exactly as they are on the production system, rather than simply
running "yum install perl foo bar baz". Depending on the length of your
dependency chain, and the dependency chain of all of those components, this
could be incredibly time consuming, even _if_ you don't make any mistakes in
the building process. Building a binary tarball of all the stuff you need is
an option, but then compatibility issues with existing system libs and such
are bound to happen, and that's pretty ugly from a paths and upgrades
perspective.

You also make your environment less standard. A new hire is going to have to
learn not only your application, but also all the crazy town details about
your particular and very specific deployment (and set up their own copy of it
on their own system). If everything except your app comes from OS-standard
packages, you can expect someone familiar with RHEL/CentOS or Debian/Ubuntu or
whatever OS you use to know where most things are right off the bat.

You'll probably do more things wrong with your build than the OS vendor did
with theirs. In my business, I see a lot of custom PHP builds, for example,
and almost every single one of them is broken in more than minor ways (and we
end up hearing about it, and trying to figure out what they did wrong in their
build). Your OS vendor version has a _lot_ of people banging on their builds
and reporting bugs. I'd pretty much always bet that their build is better than
yours from a reliability perspective.

It makes it harder to replicate your deployment if something catastrophic
happens to your production box. Packages are more resilient to library changes
and such than a big ball of crud tarball of your binary builds. And you won't
want to spend several hours rebuilding on the new target machine while you're
offline. A complete system backup could be restored...I dunno if you've ever
done that on a remote system before, but I assure you it is non-trivial and
stressful.

What I would instead recommend is to find out which components you need custom
(I'm not denying that sometimes you really do need, for example, perl 5.10 and
the OS has 5.8.8--it happens, and that's fine), and build new packages in the
native format and dump them into a yum or apt repository. It takes an extra
day or two, if you don't already know how to do it, but it'll save you many
times that amount of time in the future--and those future hours might be far
more stressful than the ones you spend setting things up now.
Rebuilding a package from SRPM or a deb source bundle is usually pretty
easy...bumping revisions in dramatic ways might not be trivial, but
recompiling with specific options is no problem at all. And, one can usually
find a source package of the latest and greatest in the devel branch of the
OS, which makes even major revision bumps easy (though, because it is the
devel branch, you're probably giving up some maturity in the package...far
fewer testers on the devel versions).

~~~
durana
I agree, if you don't know what you are doing then you can screw things up
pretty good by building things from source. Although it's harder, you can
still screw things up installing software from OS vendors too. Both approaches
require care. I've found build/deployment automation and documentation to be
the things that address most of the problems highlighted here. There's a lot
of cool software out there in this area that helps. Building task specific
software from source is definitely worth it, you've just got to know what you
are doing.

------
aditya
The weird thing is that open source vendors - for all their talk - are almost
as bad as the closed source ones. At least there's a build-from-source
solution, though.

~~~
orib
Sure, but with open source vendors, you can fix it yourself if it's broken,
and it's important enough. That's the essential difference.

All vendors suck to different degrees. Nothing ever works perfectly. Open
source gives you the ability to do something if the suckiness affects you,
though. With closed source, you're stuck until the vendor gets around to your
bug. With larger vendors, this may take forever.

~~~
jrsims
Also, you have transparency in what you are running, which is important from a
rights perspective.

