

PHP Hacking aka trying to push PHP internals forwards - grumpycanuck
http://www.xarg.org/2011/06/php-hacking/

======
pbiggar
I've read a lot of the PHP internals source code (I'm an author of phc), and
this post leaves me a tiny bit optimistic, and a tiny bit disappointed.

I completely agree with the need to fork to make PHP a better language. The PHP
internals community is the most awful OSS community I've ever been involved
in, and getting shit done there just isn't going to work. So this is a good
first step.

I also like the things he has taken away. PHP suffers from too many
unnecessary options. I know that "unnecessary" is often subjective and in the
eye of the beholder, but the OP seems to have a good eye for this. The things
he removed are a blight on the codebase, so well done.

The optimizations are so-so. While I can see the benefits of the hardcoded
constants, the other optimizations are hacks. And that has historically been
the problem with PHP optimizations - they're all hacks. The engine needs
rearchitecting to be fast, and all these hacks need to be removed, not more
added.

Take for example that strlen is slow. The problem is that all function calls
in PHP are slow. And if you look at the function calling code, it's very
obvious why. The correct solution is to make _all_ function calls fast,
presumably by using some kind of inline cache, or just refactoring a ton of
that crap out.

(Actually, someone introduced a patch to PHP-internals about two years ago to
cache function calls somehow. I don't remember the exact details, but it was
similar to the concept of an inline cache, perhaps storing a function cache
struct in some bytecode or something. As I recall, it was denigrated as not
being a full PIC. Sigh).
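
For readers unfamiliar with the idea, here is a toy sketch of what caching a call at the call site means. It is written in userland PHP purely for illustration and is not how the Zend Engine or the patch mentioned above worked; the `CallSite` class, its `lookups` counter, and the `$table` array are all hypothetical names. A real inline cache would also need invalidation when a function is redefined, which is omitted here.

```php
<?php
// Toy model only: each call site remembers the target it resolved on its
// first call, so later calls skip the name lookup.
class CallSite
{
    private $name;
    private $cached = null; // resolved callable, filled on the first call
    public $lookups = 0;    // counts slow-path lookups, for illustration

    public function __construct($name)
    {
        $this->name = $name;
    }

    public function call(array $table, array $args)
    {
        if ($this->cached === null) {
            // Slow path: resolve the name in the function table once.
            $this->lookups++;
            $this->cached = $table[$this->name];
        }
        // Fast path: reuse the cached target.
        return call_user_func_array($this->cached, $args);
    }
}

$table = array('strlen' => 'strlen'); // stand-in for a function table
$site = new CallSite('strlen');
$site->call($table, array('abc'));
$site->call($table, array('abcd'));
echo $site->lookups, "\n"; // 1
```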

Finally, I really don't like the added functions. No problem with the
functions themselves, just that adding functions to core PHP is not useful to
anyone but the author. What if they clash with a user's function names? Why
couldn't they just be implemented in user space? Etc. This is very much going
in the wrong direction.

In the future, I'd like to see an effort like this go wild and make backwards
incompatible changes like making the needle/haystack parameters consistent,
but that's optimistic. (If they were to do it, phpcompiler.org could well be
used to provide a 2to3-like tool for doing the user-code porting
automatically).

So overall, some good and some bad. If the OP focussed on making optimizations
which improved the code base, removing more crap, and avoided adding "new"
things, this would be really nice.

------
adaml_623
Good to see. It's about time PHP got forked even if experimentally.

It seems to me that you could fix a lot of the problems with PHP by breaking
backwards compatibility.

~~~
jcoby
Many of the problems and annoyances with PHP could be fixed while keeping
backwards compatibility.

Just a short list I can think of now:

\- Short array syntax. Completely optional but would clean up the language
quite a bit. This has been proposed several times and shot down each time for
no good reason.

\- Promote complex primitives into objects. With the proper interfaces it
wouldn't break any old code. Once this is done, deprecate the myriad of str*
and array* functions in favor of the object methods. This also cleans up the
"what order do i pass in the args" problem that so many people cite.

\- Named parameters. Gets rid of passing in default params or an array blob
for functions/methods that take lots of optional params. It also opens up the
way for DSLs.

\- Replace PEAR with something better. The quality of PEAR modules is really
low and PEAR always seems to be broken one way or another.

\- Add a list and hash type. PHP's array is very slow for some cases, and
sometimes you need to enforce the datatype you're working with.

\- Use exceptions for errors. Could be a runtime flag to prevent breaking
things. It's incredibly easy to write bad code because PHP makes it easy to
ignore problems. Things like functions that connect to the db or trying to
retrieve data from a stream don't throw exceptions on error.
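
As a rough sketch of the "exceptions for errors" item, existing PHP can already approximate it in userland by installing an error handler that throws `ErrorException`. The path below is a made-up example, and this only covers errors that reach the handler (warnings, notices), not fatal errors.

```php
<?php
// Turn PHP warnings and notices into ErrorException instances.
set_error_handler(function ($severity, $message, $file, $line) {
    if (!(error_reporting() & $severity)) {
        // Respect the current error_reporting level.
        return false;
    }
    throw new ErrorException($message, 0, $severity, $file, $line);
});

try {
    // Without the handler, fopen() emits a warning and returns false here.
    $fh = fopen('/nonexistent/path/file.txt', 'r'); // hypothetical path
} catch (ErrorException $e) {
    echo "caught: ", $e->getMessage(), "\n";
}
```

Whether this should be on by default is the open design question; as a handler it stays opt-in, much like the runtime flag suggested above.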

~~~
adaml_623
Well to be honest if you deprecate the str* and array* functions you are
pretty much breaking backwards compatibility.

I'm not saying it wouldn't be a good thing but those functions are one of the
defining features of PHP imho

~~~
josegonzalez
Deprecation is not the same thing as removal.

------
hackernewz
Boo, deleted short tags. Why do people hate short tags again? I can't remember
because <? is not valid XML so there shouldn't be any problems with mixing php
and xml... hmm... I wonder ....

~~~
philolson
<?= is always available as of PHP 5.4, regardless of the short_open_tag setting.

~~~
romaniv
Awesome. I remember arguing for it on the mailing lists and I'm extremely
happy they did this.

With echo tag you can have a very simple and powerful templating system
written in the language itself. (I usually just use two methods,
Template::show and Template::get, which both are less than 20 lines long.)
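
A minimal sketch of what such a pair of methods might look like. Only the names `Template::show` and `Template::get` come from the comment above; the bodies, the parameter names, and the temporary template file are assumptions for illustration.

```php
<?php
// Hypothetical minimal template helper built on plain PHP files.
class Template
{
    // Render a template file and return the result as a string.
    public static function get($file, array $vars = array())
    {
        extract($vars, EXTR_SKIP); // expose $vars as local variables
        ob_start();
        include $file;
        return ob_get_clean();
    }

    // Render a template file directly to the output.
    public static function show($file, array $vars = array())
    {
        echo self::get($file, $vars);
    }
}

// Usage: write a tiny template that uses the echo tag, then render it.
$tpl = tempnam(sys_get_temp_dir(), 'tpl');
file_put_contents($tpl, 'Hello, <?= htmlspecialchars($name) ?>!');
echo Template::get($tpl, array('name' => 'World')), "\n"; // Hello, World!
unlink($tpl);
```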

------
josegonzalez
I wonder if it wouldn't have been better to import the PHP source-tree first,
then apply each patch one at a time/in a batch. That way applying these
patches at a later date would have been easier.

~~~
mattyb
That's what was done.

<https://github.com/infusion/PHP/commit/790d551ac9ef8e204b44f4030291536137b0be25>

~~~
josegonzalez
No it wasn't. As sc68cal says, it's a patchbomb. And if you think otherwise,
let me know how I can revert his changes to remove short tags and the mysql*
changes using a single git command (hint, there isn't).

I'm all for changes to an open source project - whether it acts like one or
not - but every open source developer should, at some point, learn that
gigantic patchsets with lots of unrelated changes are a big no-no.

~~~
sc68cal
I'm just pissed that he didn't even bother to actually fork the project. On
Github! WOW! All the previous commits before the fork? Gone. Poof. It's
completely without any context. Even though there's a big "FORK" button!

~~~
mattyb
Well what project would he fork? The (apparently) official PHP mirror is way
out of date:

<https://github.com/php/php-src>

~~~
sc68cal
Well, then just use git svn to update the master of his fork to match the
official PHP svn. It'll just be a fast forward anyway.

Two birds with one stone.

------
grumpycanuck
I'm a long-time PHP user (since 1998) but I never really peered inside the
discussions of the internals of PHP. Now that I'm more connected to others in
the PHP world, what you see inside the mailing list for PHP internals is what
I would label as obstructionism and an attitude that seems to imply that if
you cannot code the requested changes yourself, don't even bother asking.

Lead, follow, or get out of the way are the only three choices available to
any language.

~~~
hackernewz
Also, if you can code the change yourself, it gets rejected as "not a bug" a
few times, then they think about it and say that it's too late for any
reasonable release and that it will go into PHP 6 or 5.3 or something that you
won't upgrade to because it breaks too much other stuff.

------
ldng
I was hoping for more profound changes. I've never read PHP internals but I'm
under the impression the opcode language is not 'jitable' because it's an
informal mess, while even Python and Lua are getting there ... It seems that
Opcode caching is the most you can get out of the language as is.

------
Revisor
And in a true PHP fashion, the new functions follow at least two naming
conventions. Cf. _str_random()_ vs _strcut()_

PHP has no vision, it draws no people with vision and suffers for it very
much. I say it as someone working with it for historical reasons.

------
jrockway
So, are strings in PHP cstrings, or are they a length/address pair?

~~~
pbiggar
length/address pair.

~~~
jrockway
I see. Why is strlen such a hit then?

~~~
pbiggar
function call overhead. isset is a builtin.
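
To make the comparison concrete, here is a small sketch of the usual idiom: a minimum-length check written once with `strlen` and once with the `isset` language construct. The variable names are illustrative, and how much faster the second form is (if at all) depends on the PHP version.

```php
<?php
$s = "hello";

// Function call: goes through the normal call machinery.
$long_enough_a = strlen($s) >= 5;

// Language construct: offset 4 exists only if the string has >= 5 bytes.
$long_enough_b = isset($s[4]);

var_dump($long_enough_a === $long_enough_b); // bool(true)
```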

------
viraptor
This might not be completely off-topic here: is there some reason array,
resource and object were never folded into one type (object)? It seems like
array and resource are kept separate just because it's done this way
internally (with array and resource super classes). Why can't resources or
arrays be native-code-backed objects like some modules in Python?

It seems like many special cases were left over from old versions and the
inertia prevents any change.

------
tcdent
Minor nitpick given the scope of these additions, but his use of a boolean
argument to enhance implode is not my favorite.

    
    
      implode(',', $array, true)
    

A new function (or even leaving it the way it was) is far more readable.

    
    
      implode_keys(',', $array)
    
      implode(',', array_keys($array))
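
As a sketch of the first alternative, `implode_keys` can live entirely in user space. The function is hypothetical (it is not a PHP built-in), and the `function_exists` guard is an assumption added to avoid clashing with a definition from elsewhere.

```php
<?php
// Hypothetical userland helper: join the keys of an array with $glue.
if (!function_exists('implode_keys')) {
    function implode_keys($glue, array $array)
    {
        return implode($glue, array_keys($array));
    }
}

echo implode_keys(',', array('a' => 1, 'b' => 2, 'c' => 3)), "\n"; // a,b,c
```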

------
voidr
This should be the mainline version, a lot of good stuff especially the new
array syntax.

------
koski
Does anyone know if there are any benchmarks about this?

------
aba_sababa
Mmm, string looping!

