
How we spent two days making Perl faster - elnatnal
http://blog.booking.com/making-perl-faster.html
======
ChuckMcM
Nice. One of the more interesting things about perl's internals is that it
does spend a lot of time with data structures that 'pre-compute' or partially
compute the desired result as an optimization. It is my belief that this has
been the "secret sauce" of perl performance without a VM for a long time.

~~~
jerf
Perl... performance?

[http://benchmarksgame.alioth.debian.org/u32/which-programs-a...](http://benchmarksgame.alioth.debian.org/u32/which-programs-are-fastest.html)

Perl is solidly in the lowest tier of serious language performance, and that
result is quite robust no matter how you query that site, and correlates to my
personal experiences as well.

Perl isn't alone... it's joined by many of its 1990s-style dynamic scripting
language buddies with very similar performance. And of course many fine
programs are written in them nevertheless, so I don't mean this as a
criticism... in fact if you read this as a "criticism" that needs to be
"rebutted" in some sense you're entirely missing my point. I really shouldn't
have to "qualify" this to fend off the obvious replies, but I've been around
enough to know I do. But, it's true that a serious professional should be
aware that Perl is in the slowest executing set of languages, and it is very
possible for the difference to be significant for you in many commonly-
occurring tasks.

~~~
DougWebb
Having developed in Perl for many years, on a project that required fairly
high performance, I learned that the key to getting good performance out of
Perl code is to use it idiomatically.

Perl has a _lot_ of built-in functions, keywords, and operators. It's also a
bytecode-compiled language (yeah, Perl is compiled when you run it, the
compiler is just so fast you don't notice it.) But a lot of those built-ins
don't compile to the bytecode equivalent of machine code for a VM like you
might expect; they compile to a single instruction that's got an optimized
implementation written in C. When you write idiomatic code, you tend to use
built-ins in a way that allows those optimized instructions to be used, which
gives you near-C level performance.

For example, if you want to loop over every member of a list, perform an
operation on each member, and produce a new list based on the results of those
operations, you could use a 'for' loop for that. But if you do, then the Perl
compiler needs to produce bytecode that mirrors the generic for loop and the
full code in the body of the loop, pretty much as you wrote it. The compiler
can't make any assumptions about the intended behavior. If you instead use the
'map' keyword, the compiler knows that you're transforming one list into
another by running a bit of code on each member of the list, and it'll produce
much more optimized code to do that.
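
The dispatch-cost argument above can be illustrated with a toy interpreter. This is only a sketch under loud assumptions: the opcode names (`LOAD`, `DOUBLE`, `MAP_DOUBLE`, and so on) and the `run` function are hypothetical and are not Perl's real op set. It counts how many times the interpreter loop dispatches an op, not wall-clock time, and it assumes a non-empty input list.

```python
# Toy model, NOT Perl's implementation: opcode names and run() are hypothetical.
# It counts interpreter dispatches for two ways of doubling a list.

def run(program, data):
    """Execute a list of (opcode, arg) pairs; return (result, dispatch_count)."""
    out = []
    acc = None      # single working register
    i = 0           # loop index used by the fine-grained program
    pc = 0          # program counter
    dispatches = 0
    while pc < len(program):
        op, arg = program[pc]
        dispatches += 1
        pc += 1
        if op == "LOAD":            # acc = data[i]
            acc = data[i]
        elif op == "DOUBLE":        # acc = acc * 2
            acc *= 2
        elif op == "PUSH":          # append acc to the output list
            out.append(acc)
        elif op == "INCR":          # i += 1
            i += 1
        elif op == "JUMP_IF_MORE":  # loop back while elements remain
            if i < len(data):
                pc = arg
        elif op == "MAP_DOUBLE":    # one coarse op: the whole loop runs in the host language
            out = [x * 2 for x in data]
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return out, dispatches

# "C-style" loop: five small ops dispatched per element.
FINE = [("LOAD", None), ("DOUBLE", None), ("PUSH", None),
        ("INCR", None), ("JUMP_IF_MORE", 0)]

# "map-style": a single op covers the whole transformation.
COARSE = [("MAP_DOUBLE", None)]

data = list(range(1, 1001))
fine_result, fine_dispatches = run(FINE, data)
coarse_result, coarse_dispatches = run(COARSE, data)
print(fine_dispatches, coarse_dispatches)
```

Both programs produce the same list, but in this model the fine-grained one goes through the dispatch loop five times per element while the coarse one goes through it once. Whether a real Perl 'map' beats a real 'for' loop by a similar margin is a separate question that only a benchmark can answer.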

It's kind of ironic. You can write 'C' style code in Perl but you won't get
C-like performance that way, you'll get Perl-like performance. But if you
write 'Perl' style code instead, you'll get C-like performance.

~~~
alayne
Perl is not bytecode based. It compiles to a data structure, essentially an
AST. I've never seen anything close to C-like performance from Perl, not in
anything computational.

~~~
DougWebb
I was being kind of loose in my technical description. Perl is similar to true
bytecode-based languages in that it gets compiled into a lower-level
representation that is not native machine code, and then that representation
gets executed by a program that behaves like a cpu whose native machine code
is the symbols used in the representation.
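
A minimal sketch of that execution model, assuming a chain of op nodes where each node carries a pointer to the function that implements it and that function returns the next node to run. The names `Op`, `link`, `run_ops`, and the `pp_*` functions are hypothetical and chosen for illustration; this is a model of the idea, not Perl's source.

```python
# Illustrative sketch only: Op, link, run_ops and the pp_* functions are
# hypothetical names, modelling "a program that behaves like a CPU" over ops.

class Op:
    """One node of the compiled representation."""
    def __init__(self, impl, arg=None):
        self.impl = impl    # function that implements this op
        self.arg = arg      # optional constant operand
        self.next = None    # op to execute afterwards (None = end of program)

def pp_const(op, stack):
    stack.append(op.arg)            # push a constant
    return op.next

def pp_add(op, stack):
    right = stack.pop()
    left = stack.pop()
    stack.append(left + right)      # replace two operands with their sum
    return op.next

def pp_multiply(op, stack):
    right = stack.pop()
    left = stack.pop()
    stack.append(left * right)      # replace two operands with their product
    return op.next

def link(ops):
    """Chain ops in execution order and return the first one."""
    for current, following in zip(ops, ops[1:]):
        current.next = following
    return ops[0]

def run_ops(first_op):
    """The 'CPU': keep calling the current op until there is no next op."""
    stack = []
    op = first_op
    while op is not None:
        op = op.impl(op, stack)
    return stack

# (2 + 3) * 4, laid out in execution order
program = link([Op(pp_const, 2), Op(pp_const, 3), Op(pp_add),
                Op(pp_const, 4), Op(pp_multiply)])
result = run_ops(program)
print(result)
```

The run loop itself knows nothing about addition or multiplication; all the work lives in the op implementations, which is why fewer, larger ops mean less time spent in the loop.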

But you're right, Perl's implementation of this strategy isn't like classic
Pascal or modern Java's implementations. But that doesn't alter my point;
writing idiomatic Perl gives you a smaller AST with symbols that execute
optimized code.

You're also right about Perl not having C-like performance for anything
computational, because there you're dealing with Perl's data types, which are
powerful and flexible but not efficient, as the article describes. My point
was more about looping, set operations, string manipulation, conditional
statements, and data structures. Perl has both efficient and inefficient ways
to handle all of those, and it pays to learn to use the efficient expressions.

------
rurban
Nice to see that they made good use of my pictures, which I have to draw
manually in PostScript.

I also use them daily to come up with similar optimizations, but on a grander
scale, to support proper type optimizations (which might then reach the
php/lua/javascript performance range).

But to be fair: Bodyless NVs and cached class pointers for methods are the
biggest win in 5.22. OP_SIGNATURE didn't make it.

~~~
tumba
Thank you for PerlGuts Illustrated!

[http://cpansearch.perl.org/src/RURBAN/illguts-0.49/index.htm...](http://cpansearch.perl.org/src/RURBAN/illguts-0.49/index.html)

------
makmanalp
Very neat! Somehow booking.com is one of those bastions of perl and seems to
attract some top talent that way. It would be cool to read / hear more about
whether it's more or less of a pain to hire people, to find libraries to
interface with newer databases and tools, etc etc.

------
lisper-
[http://benchmarksgame.alioth.debian.org/u64/benchmark.php?te...](http://benchmarksgame.alioth.debian.org/u64/benchmark.php?test=all&lang=perl&lang2=v8&data=u64)

The results for MRI and CPython are similar. I hate to sound harsh, but what's
the point of incrementally optimizing perl, MRI, or CPython when the result is
still going to be an order of magnitude slower than SBCL, LuaJit, V8, or
CogVM? In fact, I suspect enough person hours have already gone into
optimizing these C runtimes that had they known better when they started, they
could have already had something like SBCL by now.

Performance matters. By simply changing your language to one with a modern
implementation, you can save considerable money on hardware. The argument that
Perl/Ruby/Python are so much more expressive/powerful/whatever and therefore
are worth the added cost is much less compelling when there are other,
comparably dynamic languages like JavaScript with implementations that are
much, much faster.

~~~
bane
>but what's the point of incrementally optimizing perl, MRI, or CPython when
the result is still going to be an order of magnitude slower than SBCL,
LuaJit, V8, or CogVM?

because you can't write Perl or Python for SBCL, LuaJit, V8 or CogVM.

~~~
lisper-
My point is that the effort incrementally improving these C runtimes would be
better spent producing modern implementations for them with just-in-time or
ahead-of-time native code compilation, like other dynamic languages already
enjoy. In light of V8 or SBCL, tweaking Perl's C runtime just seems like a
premature optimization.

~~~
hyperpape
I don't know about Perl, but in the case of Python and Ruby, I think there are
C extensions that would have to be replaced with a JIT. So PyPy has a 5-6x
speed improvement, but can't be deployed in many places.

~~~
SoftwareMaven
This hits us hard. I would love to deploy almost everything we do on pypy, but
legacy code dictates the use of C modules that don't have pure python
implementations that come close to the necessary functionality.

Compatibility with the massive ecosystem of existing code is something
everybody who says "just rewrite the runtime with X" seems to forget.

------
scott_s
It's not necessary to write macros in C if you want to avoid runtime function
calls. Functions can be inlined. There exist keywords to request inlining, and
most compilers have extensions to force it. But in practice, optimizers will
inline small functions like this if they have access to the definition.

------
est
This is off topic, but I really wish CPython 3.x could make more speed
improvements like these. We cannot bet everything on PyPy.

Ruby 3.0 has a grand plan for JIT, removal of GIL and everything. Perl 6 is
catching up, PHP 7 doubled its speed and will release this year.

~~~
dragonwriter
> Ruby 3.0 has a grand plan for JIT, removal of GIL and everything.

Source? AFAICT, the Ruby community has in the last couple years just started
talking seriously about ideas about what Ruby 3 _might_ look like, but there
is no "grand plan".

~~~
est
[http://hrnabi.com/2015/05/12/7035/](http://hrnabi.com/2015/05/12/7035/)

It's from RubyKaigi 2014

video
[https://www.youtube.com/watch?v=zt56zjNf84Q](https://www.youtube.com/watch?v=zt56zjNf84Q)

You are right, it's about what Ruby 3 might look like

------
azianmike
Great writeup! I've never done perl before but it's still a great read!

------
firefoxhane
Nice write up

------
phkahler
Somehow this headline reminded me of the time the GIMP developers accidentally
integrated GEGL in a few weeks.

[http://gimpfoo.de/2012/04/17/goat-invasion-in-gimp/](http://gimpfoo.de/2012/04/17/goat-invasion-in-gimp/)

"What was planned as a one week visit turned into 3 weeks of GEGL porting
madness. At the time this article is written, about 90% of the GIMP
application’s core are ported to GEGL..."

That was in 2012 and it's still not done. Almost, but not yet.

~~~
guelo
That GIMP refactoring looks like the CADT model[1] while these (accepted) Perl
patches look more like the work of talented professionals.

[1] [http://www.jwz.org/doc/cadt.html](http://www.jwz.org/doc/cadt.html)

~~~
phkahler
I hadn't seen that before. Good point. The purpose of a rewrite is usually
claimed to be that progress can't be made with the code as-is. So one would
expect that the rewrite would quickly show some progress _beyond_ the old
version.

~~~
Someone
Not necessarily. If you are near the top of Mont Blanc, there's no way to go
up far in small steps. If you are confident Mount Everest exists, you can
decide to go walk there, but you certainly won't see progress for quite some
time.

There are equivalents in software, for example the removal of the GIL in
Python (never done successfully in the sense that the majority of users started
using it) or, equivalently, the effective GIL in OS kernels (done successfully
a couple of times, AFAIK, even though it theoretically requires inspection of
all driver source code)

A JIT-ted Python similarly may eventually be the better choice, but the road
getting there may be long and harsh.

