Hacker News
Php-o: metaprogramming PHP to give it a saner API (github.com/jsebrech)
72 points by Joeri on Feb 7, 2013 | 72 comments


Interesting concept; the chainable object especially showcases what PHP is missing by not treating data types as objects. It makes you wonder whether it could be simplified further, though, i.e., by combining s(), a(), o() and c() into a single one-letter function that does it all, jQuery style. I think it'd have a better chance of adoption that way.


jQuery is a reference for designing a clean API, and I suspect (this library aside) many PHP coders would have liked it if PHP 6 had moved this way. But the PHP core devs are waaay too conservative.


You wouldn't be able to do the JSON stuff and the string wrapping with a single function, but otherwise that could be a good idea.


Why not? You could test if the string is valid json, and if it is do object wrapping, otherwise do string wrapping. This method is used often in jQuery, figuring out what to do depending on what you pass to it. EDIT: and I guess a second, optional argument to force a specific data type could be added for those rare cases where you have a json string and you want to run string functions on it.
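
A minimal sketch of that dispatch idea, assuming hypothetical names (`w()`, `StringBox`, `ObjectBox`) that are not part of php-o. It only treats JSON objects and arrays as structured data, since otherwise plain strings like "42" would stop being strings, and it takes the optional second argument to force string wrapping:

```php
<?php
// Hypothetical names for illustration only; php-o does not define these.
final class StringBox {
    public $value;
    public function __construct($value) { $this->value = $value; }
}

final class ObjectBox {
    public $value;
    public function __construct($value) { $this->value = $value; }
}

// Single entry point: decide how to wrap based on what was passed in.
function w($input, $forceType = null) {
    if ($forceType === 'string') {
        return new StringBox((string) $input);
    }
    if (is_string($input)) {
        $decoded = json_decode($input);
        // Only JSON objects/arrays count as structured data here.
        if (json_last_error() === JSON_ERROR_NONE
            && (is_object($decoded) || is_array($decoded))) {
            return new ObjectBox($decoded);
        }
        return new StringBox($input);
    }
    return new ObjectBox($input);
}
```

With this shape, `w('{"a":1}')` wraps the decoded object, while `w('{"a":1}', 'string')` keeps the raw JSON text available for string methods.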


As a PHP programmer the inconsistency of the string functions is not only bothersome, it's embarrassing. This seems like an interesting fix for that.


As a PHP programmer the inconsistency of the string, array, regex, et al. functions is an occasional annoyance blown way out of proportion, typically by developers who don't even use the language. The horse is dead, continuously beating it is tedious and avoids the elephant in the room, namely that for the tasks it was invented to handle it still (despite all its innumerable and interminably cited warts) serves more pages than any other language with no real change on the horizon. You want to gripe about something? Let's talk about Drupal's internals, or better yet, JavaScript.


I think trying to cast stones at JavaScript from PHP land is foolishness, but I agree with your assessment of the PHP string function situation. This project seems to produce PHP code that is an order of magnitude uglier than ordinary PHP. (No mean feat.)

Why not simply implement a nice string class wrapped around ordinary strings with consistently parameterized chainable methods? The resulting code would be cleaner and more idiomatic (in a good sense). I don't think the "win" from doing this is worth the bother (unless maybe UTF8 support were managed as well).
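
Roughly what such a class could look like, as a sketch only: `Str` and its method names are invented here, it uses the byte-oriented functions (so no UTF-8 awareness), and every method returns a new wrapper so calls can be chained.

```php
<?php
// Hypothetical immutable string wrapper; not the php-o implementation.
final class Str {
    private $value;

    public function __construct($value) {
        $this->value = (string) $value;
    }

    public function trim() {
        return new self(trim($this->value));
    }

    public function upper() {
        return new self(strtoupper($this->value));
    }

    // The wrapped string is always the subject, so argument order stays consistent.
    public function replace($search, $replacement) {
        return new self(str_replace($search, $replacement, $this->value));
    }

    public function __toString() {
        return $this->value;
    }
}

$result = (string) (new Str('  hello world '))
    ->trim()
    ->replace('world', 'php')
    ->upper();
```

Swapping the internals for the mbstring equivalents would be the place to add the UTF-8 support mentioned above.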


Written much JavaScript without the benefit of jQuery?


I have, actually, and straight JavaScript actually isn't so bad for modern browsers, given document.querySelectorAll, etc. jQuery, at this point, is some pleasant syntactic sugar and a whole bunch of backward-compatibility hackery.

But if you're asking about writing JS without jQuery and including support for old browsers (which I've also done), then fair's fair: you have to support PHP4 (which came out about the same time as IE6), and you don't get anything other than the standard library either.

For a non-trivial app, I'm not sure which of those is less pleasant.


Tons.

I've also used JavaScript to automate Adobe's products, extend Cheetah 3d, build web apps in node.js, and develop games in Unity 3d. JavaScript is far from perfect, but it's a nice language.

Note that jQuery is great for fixing problems in the DOM API, but it doesn't fix JavaScript (which has its own problems but no more than any other useful language). Much of jQuery is obsolete in modern browsers: events are handled fairly well in most browsers, and querySelectorAll replaces jQuery for most lookups. I don't like jQuery's iterators, but writing better ones is quite easy.

PHP's problems are with its libraries (themselves far more horrible than the DOM API) and also the language itself. I do find PHP useful but its problems far outnumber JavaScript's.


jQuery seems more like an API to talk to the DOM than an extension or wrapper around Javascript itself. JS simply doesn't reach the same depths of madness and ineptitude as PHP does at the most fundamental levels of the language. Even if it's still fairly insane compared to languages that are actually kinda elegant.


This doesn't fix that, it just makes it worse.

This is an experiment in meta-programming PHP to give it a saner API.

Sweet.

s() turns all the standard string functions into methods with identical behavior:

s($haystack)->pos($needle)

OK. Cool. Sounds good.

->preg_replace(), ->in_array()

Yep. Awesome.

The s() function also implements JavaScript's string API:

->charAt(), ->indexOf(), ->lastIndexOf()

...

......

.........


The javascript API is there because it's the only one I can keep remembering, I always forget the exact PHP functions and have to keep the php docs open. There also didn't seem to be any naming collisions, so it is harmless. Having developed some test and demo code on top of it, I've come to realize it feels too alien inside of PHP to actually use, so that part is probably going away.


JavaScript uses camelCase; the C functions use_underscores. Mixing and matching the two has been a complaint about the entire PHP library for a long time.


An interesting thing in a language would be to say that `indexOf` is an identifier aliased to `index_of`. It would mean that the choice of convention would rest with the user of the libraries instead of the API developers (who will never agree on a single notation).
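
PHP has no such language-level aliasing, but the effect can be approximated per class with the `__call` magic method. This is a sketch under that assumption; the `Aliased` class and its `index_of` method are made up for the example.

```php
<?php
// Hypothetical class: methods are defined once in snake_case,
// and camelCase calls are redirected to them.
final class Aliased {
    public function index_of($haystack, $needle) {
        $pos = strpos($haystack, $needle);
        return $pos === false ? -1 : $pos;
    }

    // Invoked only when the called method does not exist, e.g. indexOf().
    public function __call($name, $args) {
        // Insert an underscore before each inner capital, then lowercase.
        $snake = strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $name));
        if (method_exists($this, $snake)) {
            return call_user_func_array(array($this, $snake), $args);
        }
        throw new BadMethodCallException("Unknown method: $name");
    }
}

$a = new Aliased();
$viaSnake = $a->index_of('haystack', 'st');
$viaCamel = $a->indexOf('haystack', 'st');
```

Both calls reach the same method, at the cost of an extra dispatch for the aliased spelling.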


Multiple calling conventions that allow for inconsistent use of functions for a language with multiple naming conventions resulting in already inconsistent code is... not the most fantastic idea I've ever entertained.

It's an interesting idea, but there's only one right answer here.


Sure, I was talking about a 'new' programming language. PHP already has plenty of work to do fixing its own standard library.

The point is, the standard libraries, especially in languages such as C++, Python or JavaScript, are far from being the only libraries used by a project. Even if they follow consistent conventions, an external library author may follow a different one. Using both libraries can lead to multiple naming conventions in a single program, which doesn't help readability. It would be nice to be able to prevent this.


    string: haystack, needle
    array: needle, haystack

Beyond that, many of the functions take their ordering from their C counterparts.
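
A few standard functions that follow this rule of thumb (the variable names are only for illustration):

```php
<?php
$text  = 'metaprogramming PHP';
$langs = array('php', 'perl', 'python');

// String functions: haystack first, needle second.
$pos  = strpos($text, 'PHP');
$tail = strstr($text, 'PHP');

// Array functions: needle first, haystack second.
$found = in_array('perl', $langs);
$key   = array_search('python', $langs);
```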


Getting to know this convention a while ago has made a huge improvement in how I code PHP. I never really knew what the order for haystack / needle parameters were before I read this. Now it's easy to remember.


The biggest draw card here is the multibyte string handling by default and sanitization of argument ordering in string and array functions.

The object stuff kind of started to lose me. The chainables are cool, but for me I could have stopped reading at s() and a() and been happy.

I think it'll definitely be worth experimenting with even if just for those two things, thanks for sharing!


Aside from anything else, the function call and object instantiation overhead of this library is insane.

The string comparison function provided as an example is 4 times slower than the code it's based upon, and 3 times slower than a completely equivalent multibyte-aware version.


Libraries like this are bound to introduce overhead, but in the javascript world you don't hear many people complain about the overhead of jQuery, and it is there. I think for most people a string comparison 4 times slower won't mean much. That's not to say it's "okay" to be slow, just saying it may not be big for those not pushing PHP to its boundaries. In fact last time I checked Symfony and Zend had a tremendous impact in terms of overhead, but you (almost) never hear negative comments on them. Heck, even something OOP is slower than the same thing written in procedural style in PHP.


The example I'm talking about is the first non-trivial example in the readme - 34 lines of believable code, not an individual function of the library.

The library is used about once every 3.5 lines, which feels less dense than most jQuery code I see, but that whole function runs 3 times slower than the plain MB-aware equivalent.

Some of the library functions might be implemented inefficiently but from a quick glance they're nearly all very thin wrappers.

Overhead from libraries like Symfony and ZF is quite acceptable since they offer much higher level features which you'd have to code yourself otherwise.

However making all your code (excluding I/O) run up to 3 times slower, just to add some syntactic sugar, is insane.


What causes the overhead? Is there a way to get the sugar without the calories?


A trivial example:

  $len = mb_strlen("wtf");
    # call function mb_strlen

  $len = s("wtf")->len();
    # call function s
      # allocate string object
      # instantiate string object
      # call string ctor
        # set property
    # call method len
        # get property
      # call function mb_strlen
    # destroy object

There's no way to use a syntax like this without either massive overhead, or massive changes in the engine.

There is a somewhat similar experimental project - https://github.com/nikic/scalar_objects - but that's not meant for real world use either.

Personally I think the language, despite its flaws, does its job fine. I don't think it needs turning into Javascript-with-sigils.
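
For anyone who wants to measure this on their own setup, here is a rough micro-benchmark sketch. `BoxedString` and `boxed()` are stand-ins written for this example, not php-o's code, and it uses `strlen` rather than `mb_strlen` so it runs without the mbstring extension. Timings vary by PHP version and machine, so none are quoted here.

```php
<?php
// Hypothetical wrapper mirroring the call sequence traced above.
final class BoxedString {
    private $value;
    public function __construct($value) { $this->value = $value; }
    public function len() { return strlen($this->value); }
}

function boxed($value) {
    return new BoxedString($value);
}

// Run a callable $iterations times and return elapsed seconds.
function timeIt($fn, $iterations) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

$s = 'wtf';
$n = 100000;
$direct  = timeIt(function () use ($s) { return strlen($s); }, $n);
$wrapped = timeIt(function () use ($s) { return boxed($s)->len(); }, $n);

printf("direct: %.4fs, wrapped: %.4fs\n", $direct, $wrapped);
```

The closure call adds the same fixed cost to both sides, so the ratio understates the difference somewhat.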


Isn't jQuery overhead more or less dwarfed by the performance of the DOM itself? I'm also not confident that I could write DOM code that's as fast in every case as jQuery -- it includes a lot of knowledge/tricks to achieve the best performance across all browsers.

This project is just replacing a function call with a method call -- it's all overhead.


Well, you are right, but Symfony for instance has a lot of caching strategies, and is usually used with APC.


I like the three one-letter functions and their effects. It's not entirely unlike how jQuery spices up DOM elements, and feels natural that way.

Then, suddenly, however, there came a validation engine with annotations and reflection and whatnot. Huh? Isn't that a whole different thing? Why is this bundled with my nice little jQuery-for-PHP-scalars? It's not just that it feels like it's a different problem entirely and thus should be separate, the code feels different. Like it has a different philosophy behind it.

It looks pretty nice, but why can't I use the validation engine without PHP-o? Or the other way around?


Author here. It's called an experiment for a reason. I'm going to productize it by separating it in the way that you propose. The validator API was ripped straight out of Java; I'll see about normalizing its style.


I think this type of approach will never stick with PHP community (at least with the "serious" developers).

This hurts so many principles and directions that PHP community is FINALLY diving into, like SOLID and stuff. Accepting simple things like strings are not objects is the way to improve how PHP devs build their stuff and start focusing on important things like defining de-facto libraries and joining forces to make them the best and most flexible available.

Don't get me wrong, the idea is pretty good (even if it's a simple attempt to mask procedural PHP functions)... but it seems like a Swiss Army knife in the end.


I don't think this is a project that I would see being used anywhere, but I do think it introduces an interesting idea for the language's evolution. I am glad to see it here.


Looks neat. Although I doubt that "hacks" like this will become popular among PHP users any time soon, a proliferation of similar hacks might give the PHP core devs some ideas about how to clean up the language in the next few major versions. PHP could introduce a native object-oriented way to interact with built-in types in PHP 6, just like they did with SPL in PHP 5. Then, in PHP 7, they would deprecate the old functions, and in PHP 8, finally remove them in favor of the new style. After all, new major versions are for backwards-incompatible changes, right? The whole cycle might take a decade, but it would be worth it.

Anyway, here are some suggestions.

1. "echo 'error message' and exit" isn't a particularly elegant solution, and in fact symbolizes everything that is wrong with PHP. What about throwing an exception instead?

2. The ability to specify the charset at object creation and convert to another charset later would be quite helpful, because nobody can remember all those mbstring functions. s('str', 'EUC-KR')->convert('UTF-8') maybe?

3. Why is O.php checking whether magic quotes are enabled? Since you're not even touching GET/POST variables, magic quotes have nothing to do with your library. Are you going to throw an error every time you discover suboptimal settings in the user's PHP environment?

4. Don't modify session settings until the user calls session_start(). They might not want to use your session handling functions, only your string and array functions. Simply including the script should have as few side effects as possible. This helps integration with existing apps.

5. Some of the methods that I'd really love to see in the string class are startsWith(), endsWith(), and contains(), copied straight from .NET. It sucks that I have to do strpos() === FALSE every time I want to check whether a string contains another string, or !strncmp($a, $b, strlen($a)) every time I want to check whether a string starts with another string.

6. While we're trying to clean up PHP's API, why not merge the case-sensitive and case-insensitive versions of string functions into a single method with an optional flag? This is another area where the API is terribly inconsistent, what with 'i's thrown in at random places and sometimes even 'case' to denote the case-insensitive version.

EDIT: Related to 5 and 6, I wrote a similar library back in 2010 for fun, which I put online just now [1]. It doesn't use the clever iterator interface that you incorporated into your library, but I do think that my method names make more sense.

[1] https://gist.github.com/kijin/4736544
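
To make suggestions 5 and 6 concrete, here is a small sketch that combines them: `contains`, `startsWith` and `endsWith` with one optional case-insensitivity flag. `StrUtil` is a name invented for this example; it is not taken from php-o or from the gist above, and it is byte-oriented rather than multibyte-aware.

```php
<?php
// Hypothetical helper class; one flag replaces the separate "i"/"case" variants.
final class StrUtil {
    public static function contains($haystack, $needle, $ignoreCase = false) {
        if ($needle === '') {
            return true; // avoids the empty-needle warning in older PHP versions
        }
        $pos = $ignoreCase
            ? stripos($haystack, $needle)
            : strpos($haystack, $needle);
        return $pos !== false;
    }

    public static function startsWith($haystack, $needle, $ignoreCase = false) {
        $len = strlen($needle);
        return $ignoreCase
            ? strncasecmp($haystack, $needle, $len) === 0
            : strncmp($haystack, $needle, $len) === 0;
    }

    public static function endsWith($haystack, $needle, $ignoreCase = false) {
        $len = strlen($needle);
        if ($len === 0) {
            return true;
        }
        if ($len > strlen($haystack)) {
            return false;
        }
        // Compare only the last $len bytes of the haystack.
        return substr_compare($haystack, $needle, -$len, $len, $ignoreCase) === 0;
    }
}
```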


Author here.

Because O is currently designed to be the first thing code starts with, for green field programming, there was nowhere for the exception to throw to. Hence the exit.

Magic quotes are being checked because I have some ideas for how to improve the PDO API to prevent SQL injection. Those ideas have yet to turn into code for O though.

I've been contemplating turning this experiment into a usable library for production code, which would mean splitting it up into constituent parts, and making O.php a container for those parts. Then the separate parts could be used in isolation. I suspect the string and array handling by itself would be quite useful even in established projects, and it seems people here agree. Will start working on that.

Thanks for your suggestion about character set conversion. That would be a good addition. I'll add it.


> Because O is currently designed to be the first thing code starts with, for green field programming, there was nowhere for the exception to throw to. Hence the exit.

An uncaught exception will end the script, just like exit() except with a stack trace and the ability to catch it if the end user wants that. There's really no excuse for any function to just exit().
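
A sketch of the difference, assuming a hypothetical `requireSetting()` bootstrap check (not a function from O.php). Left uncaught, the exception stops the script much like exit() would; here the caller chooses to handle it instead:

```php
<?php
// Hypothetical environment check that throws instead of calling exit().
function requireSetting($enabled, $name) {
    if (!$enabled) {
        throw new RuntimeException("Required setting is not enabled: $name");
    }
}

try {
    requireSetting(false, 'example.setting');
    $message = 'ok';
} catch (RuntimeException $e) {
    // The caller decides what to do: log, fall back, or rethrow.
    $message = 'caught: ' . $e->getMessage();
}

echo $message, "\n";
```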

> Magic quotes are being checked because I have some ideas for how to improve the PDO API to prevent SQL injection.

PDO prevents SQL injection through parameters; how do you plan on improving on that?


Parameters require you to be well-behaved with using them consistently. I was thinking of disallowing the use of quotes in queries unless you use an 'unsafe' method (naming prefix). Still have to see how it would pan out.


> Parameters require you to be well-behaved with using them consistently.

That's the whole point! Since it looks like O.php was intended as a framework for new projects rather than integration into existing code base, you could just tell people that they should use parameters when using O.php. No legacy code to support = all queries will be written from scratch anyway.

Please, don't reinvent the wheel of preventing SQL injection. It's a solved problem, and solving it again is boring. Bringing a bit of sanity to PHP, on the other hand, can be a much more exciting task.


Murphy's law applies. If there is a way to do it wrong, people will do it wrong. There's a reason SQL injection remains in the OWASP Top Ten. I disagree that it is a solved problem.

The design of O is in part based on my experience using Zend Framework on large projects. I've learned that the only way to get people to write good code is not to educate them on the right way, but to make the right way the default / easy / lazy way. That's the issue with PHP as-is. It's possible to write good code in it, but it is decidedly not the default.

But, I see now that I will first ship what's in O already in a way that is reusable on other projects, mostly because I want to start mixing it with my own production code as well. After that I'll circle back to PDO.


> the only way to get people to write good code is not to educate them on the right way, but to make the right way the default / easy / lazy way.

Completely agreed. Binding each parameter manually, or even writing a simple prepare/execute pair, is probably too much hassle for somebody who is used to the utter simplicity of a mysql_query() function call.

Which is why, in one of my own home-baked libraries, I hide all of the prepare/bind/execute complexity behind a single method call:

    $rows = DB::query('SELECT * FROM table WHERE col1 = ? AND col2 > ?', $param1, $param2);

Behind the scenes, it's PDO with prepared statements. On the surface, it's just as simple as mysql_query(), except you don't even have to worry about string interpolation. You can also pass the parameters as an array if you want to.

I'm not saying that this is the best way to do it, but if the way you're planning to do it is anything that looks remotely similar (prepared statements and bound parameters behind the scenes), then you have my apologies for having been too cynical without seeing the actual code. On the other hand, if it's just a sanitizer based on a blacklist of special characters, I would persist in my criticisms.
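
The commenter's library isn't shown, so the following is only a guess at the general shape of such a wrapper: a `DB` class with a `connect()` helper (an assumption added for this sketch) and a `query()` that prepares, binds and executes in one call. The usage lines assume the pdo_sqlite driver is available and use an in-memory database.

```php
<?php
// Hypothetical wrapper; not the commenter's actual code.
final class DB {
    private static $pdo;

    public static function connect(PDO $pdo) {
        $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        self::$pdo = $pdo;
    }

    // Accepts query('... ? ...', $a, $b) or query('... ? ...', array($a, $b)).
    public static function query($sql) {
        $params = array_slice(func_get_args(), 1);
        if (count($params) === 1 && is_array($params[0])) {
            $params = $params[0];
        }
        $statement = self::$pdo->prepare($sql);
        $statement->execute($params);
        // Only fetch when the statement actually produces a result set.
        return $statement->columnCount() > 0
            ? $statement->fetchAll(PDO::FETCH_ASSOC)
            : array();
    }
}

DB::connect(new PDO('sqlite::memory:'));
DB::query('CREATE TABLE items (name TEXT, qty INTEGER)');
DB::query('INSERT INTO items (name, qty) VALUES (?, ?)', "o'reilly", 3);
$rows = DB::query('SELECT name FROM items WHERE qty > ?', array(1));
```

The value containing a quote is passed as a bound parameter, so it never becomes part of the SQL text.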


In the codebase I work on professionally we use Zend_Db for a similar approach, except we use named parameters (on Oracle). It works well until you have to construct complex queries, and then the temptation to start concatenating quoted values into the query is overpowering.

I will stew a bit more on what approach (if any) is best for queries.


Disallowing quotes isn't a half-bad idea but it won't catch everything; one could still SQL-inject into a query that expects a non-parameterized numeric value.
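
A small illustration of that gap, with a made-up query and input. The injected text contains no quote characters, so a quote-based check would let it through, while validating the value as an integer (or binding it as a parameter) would not:

```php
<?php
// Attacker-controlled value for a numeric column; note it has no quotes.
$input = '1 OR 1=1';

// Concatenating it changes the meaning of the WHERE clause.
$unsafe = 'SELECT * FROM users WHERE id = ' . $input;

// A "no quotes allowed" rule sees nothing wrong with this query.
$hasQuote = strpbrk($unsafe, "'\"") !== false;

// Validating as an integer rejects the input outright.
$id = filter_var($input, FILTER_VALIDATE_INT);
```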

I prefer to abstract query building enough that most of the time I'm not constructing query text directly.


This is probably overly cynical of me, but I feel like a lot of what's keeping PHP alive at this point is inertia, and if some major version ahead required that literally every standard library call be differently expressed (and by consequence, every PHP app essentially be rewritten), the case for continuing to use the language would get weaker than I think it already is. At that point, why not jump ship for something more powerful/performant/whatever?


Large code bases are rarely rewritten from scratch. They evolve over many years, and over time, old code gets replaced with new code, piece by piece. If a language evolves over a similarly long time span (10 years or more), and new features are adopted gradually, I don't see why a backward-incompatible change would require anything more than the usual amount of change when the last of the old code is retired. Call it inertia if you want, but this looks to me to be a good thing if you have a large code base to maintain.


How about developers? I know this isn't a problem for the latest hot startup, but businesses would have teams of people that are well versed in PHP. Sure, they probably could change languages for a new project, but for a new app joining a group of PHP apps it makes a lot of sense to keep the same language.

Personally I am unsure who drives contributions these days, is it individuals/ hobbyists/ small companies or is it being driven by large companies with large existing PHP codebases?


> and sometimes even 'case' to denote the case-insensitive version.

To be fair, that's copied from libc (and a lot of the other naming in the string library too): http://linux.die.net/man/3/strcasestr


Having to go through the pain of the almost arcane APIs everyday, I appreciate this project and the author. Looks neat.


The sanest thing I've seen about PHP in a long time.


Maybe, but if several people develop several APIs like this, it will mostly make everyone's lives more difficult. This should be a part of the standard distribution.


i have something similar on the back-burner that i haven't pushed public yet, but only a single, one-letter function like jquery that takes any type. there are soooooo many functions to abstract. even though most are one-liners, i ran out of steam :)

certainly not something to use all the time, especially in perf-oriented code, but it's nice sugar.


Cool, love it - the multi-byte support and baked-in validation is awesome, going to play around with this.


This library looks like it will make my life a lot easier. Many thanks, Joeri!


*facepalm


It kind of resembles what a wildly popular JavaScript lib does, doesn't it?


[deleted]


Javascript is the only language that runs in the browser. There are tons of great alternatives to PHP to run on a server.


This comment implies that there aren't any other great alternatives to jQuery in the browser.

What about Dojo Toolkit or the YUI library? I don't see how JavaScript is to blame here.


I think the point he's making is that you don't need to find a jQuery-like library for PHP because you can just use another language.

I personally say nuts to that. With all of the latest projects, PHP development is actually getting to be pretty non-shitty.


"Non-shitty" is a pretty low bar though. When you have a wide variety of choices, you may as well hold out for "good" or "great", even if it means learning a new language, some new technologies or even a new mindset. Having to learn something or spend some time setting things up is a trivial cost for any non-trivial project: increasing your productivity when programming or maintaining code is a very good return on investment.


Unconstructive comment. Which alternatives, and why? I don't think there's even one which is a clear cut alternative (where the downsides are minor, when compared with the upsides). On speed alone, PHP gets a huge head start when comparing alternatives.

Note that I'm not saying PHP is great. It's a cesspool of language defects. However, its success has merit.


OK, please use any decent PHP web framework and compare the speed to, let's say, Python/Ruby, not to mention something like PyPy.

"Hello World" is not a real-world performance benchmark.


Why would you compare a PHP framework to either Python or Ruby?


Because people do not develop web apps with bare Python/Ruby.


Sorry, I don't think what you wrote makes any sense.

He's comparing "any decent php web framework" to Python or Ruby, because neither of those languages are used without frameworks to develop web-apps?


No, because PHP is rarely used to develop applications without a proper level of abstraction, so it makes sense to compare apples to apples.


Not always. I am obliged to use PHP at the moment.


How good is PHP? I was thinking about peer review the other day, and how one measures peer review. One of the most important metrics is the number of citations an article acquires. If an article is published and, over the next 20 years, is only cited 3 times, we can say that it failed to influence its field, but if an article accumulates 1,000 citations, we can say that it definitely had some impact.

Then I thought about PHP, and how it compared with other languages. I went to Google and started searching:

influenced by gosling java

influenced by hickey clojure

influenced by matz ruby

influenced by Stroustrup C++

influenced by larry wall perl

and then finally:

influenced by lerdorf php

You look through some of those links and you notice a difference. A lot of people write about the brilliant things that have been said by Gosling and Hickey and Matz and Larry Wall and Stroustrup but no one ever cites Rasmus Lerdorf as an inspiration, nor does anyone seem to think he has ever said anything especially brilliant or insightful. I'm sure if you dig you might find exceptions, but as a rough metric of who has influenced whom, I think this reveals something important about how computer programmers perceive the quality of PHP and one of its core contributors.

(You could counter-argue that PHP has several core contributors, in which case, I would counter-counter-argue by asking that you please suggest a core contributor to PHP who is quoted with the same admiration expressed for the other architects that I've listed here.)


You're not wrong, but this doesn't seem to relate to the topic at-hand, except peripherally. I get it, PHP sucks. We all know PHP sucks. We've been talking about how much PHP sucks on HN for probably years.

How about we talk about something new, such as the library this fucking post is actually about. If you really want to grind your ax against PHP (which we all generally agree is terrible), please make this its own post and submit it separately, so that we can actually talk about the submission.


Lerdorf seems like quite an understated guy from what I've seen of videos of him talking on YouTube. He doesn't seem to want to influence anybody; clearly a smart guy, but more of a regular "working class" developer who accidentally became a language inventor.

I think he's said a few times on the record that he didn't really know anything about language design when he invented PHP.


Symfony2 gets quite a bit of love here on HN... as much love as anything PHP-related can get.

Its creator, Fabien Potencier, is usually cited as someone bringing sanity and quality to the PHP world.


Not to say that you don't have a point, but how is this relevant to the linked project?


No one should be influenced by PHP's ad-hoc language design. PHP is an idiosyncratic, programmable API for web apps - it's not really a proper programming language. More pidgin than high elvish.


The point you present is interesting.

What also interests me is that the links on your HN profile seem to be Wordpress (built on PHP) related.


"Because Rasmus Lerdorf doesn't sit with the popular kids" is not a valid reason to dislike PHP.


Sure, Lerdorf is smarter than most hackers. But if you compare him with Guido, Larry, Gosling, etc., he is no match. And I think it shows in the language.



