I think the "path of least resistance" is important: developers are time-constrained, understanding-constrained, lazy (if they're virtuous), etc. There's a big incentive to do whatever is easiest/quickest.
When I last used PHP, about 5 years ago, there were OOP APIs cropping up to replace many of the standard global functions; namespaces had just been introduced; closures had become useful; frameworks like Symfony (and Drupal 8) were becoming established, rather than the old "plugin" approach of throwing around arbitrary code and hoping for the best; dependencies were being managed by composer; files could be autoloaded from sensible locations; testing frameworks like PHPUnit and PHPSpec had become best practice; etc.
Yet all of those things were opt-in and verbose. The path of least resistance was still:
Hello <?php echo $_GET['name']; ?>
Compare this to something like Java: it favours class-based OOP so much that even "hello world" needs a class. The path of least resistance is to do things "right" (from Java's perspective). Haskell's path of least resistance is simultaneously easier ("hello world" is just `main = putStrLn "hello world"`) and harder (`main` uses the `IO` type, whose API enforces certain conventions).
Deprecation warnings, linters, etc. can help with this; but PHP's only real strength is its installed base of code and developers; changing the language too much would throw away this advantage (akin to being a new language, see Python 3 and Perl 6); not changing it enough prevents the more serious and/or systemic issues from being dealt with.
I wish the language designers and users luck, but I'm really hoping to never use it again ;)
You are absolutely right that the path to this has been fixing the path of least resistance. Laravel and other frameworks have made it easy to get variables using their preferred methods (and automatically sanitizing input). Learning resources (tutorials, docs) utilize the OO functions and many of the older functions were deprecated and have been removed for years. If you look at one of the more infamous functions, mysql_real_escape_string (https://www.php.net/manual/en/function.mysql-real-escape-str...) that was removed entirely in PHP 7. If you look at the more modern mysqli, they've chosen to have both escape_string and real_escape string, but one is an alias to the other. Similarly, sane defaults have become the norm, especially since most developers are using containers or VMs/vagrant to program. The one major outstanding issue is the api naming is still inconsistent.
On a sidenote, the PHP docs are great with pages showing examples and notes talking about potential issues. When I switched to python I couldn't believe how bad the docs were in comparison.
I've got nothing against DSLs and template languages and such, but most of the ones I encounter make me sigh and say, "I know you thought it would be easier but it's just one more thing I have to figure out how to debug, without any of my usual debugging tools."
I did get a little perverse amusement at writing HTML files containing nothing other than single `<?php` element; where that PHP's job was to generate HTML; and it did so via a templating language (e.g. Twig or Smarty)...
One thing that shocked me during my brief foray into PHP was how heavily popular PHP apps like WordPress rely on the filesystem. Want to move a site? Copy the database, and copy a bunch of files. It seems like there's a cultural expectation that applications run on a single server in your closet. Of course you don't have to write code this way... but people still do.
echo "hello world";
There - no XSS, no HTML, no class requirements.
- The path of least resistance in PHP is often directly at odds with the "proper" way of doing things (including many things listed in this article).
- This is important because PHP's intended domain is Web development, so this can easily cause security flaws.
- Any attempt to "fix" this example will inevitably make it longer, more complicated, harder to remember, etc.
Your PHP "hello world" does not demonstrate those problems, since it has no XSS or HTML, so it is a poor example.
After this demonstration, I abstracted to the more general concepts of "path of least resistance" and "'proper' ways to do things", irrespective of language or domain.
To clarify my meaning for these phrases, and to demonstrate how language designers can use the former to push the latter, I gave two examples: Java pushing its preferred approach of class-based OOP, by disallowing raw, top-level statements; and Haskell pushing its preferred approach of monadic I/O, by enforcing the type of `main` and restricting the available APIs. Notice that neither of those examples make the "proper" way easier (that can be very difficult, in general); instead they eliminate anything that's easier (like top-level statements), such that the "proper" way is the easiest thing that's left. In other words, Java's requirement that even "hello world" be wrapped in a class is a good thing for Java programmers, since allowing top-level statements like `System.out.println("hello world");` would undermine the principles of the language. This general argument is not specific to PHP, and certainly not a direct comparison to the XSS snippet.
After defining and demonstrating these general concepts, I then returned to the specific case of PHP, to point out how backwards-compatibility with its legacy of easy, insecure approaches undermines the attempts to improve the situation. In other words, the "proper" way to write PHP (at least, back when I used it) is to use OOP, namespaces, type hints, escaping of user input, etc. Yet the design of the PHP language discourages all of those, by providing easier alternatives which are "improper" (like in my XSS example); and removing those alternatives (in the same way that Java forbids top-level statements) would break almost all existing PHP projects and require the majority of PHP developers to change their habits; and doing so would eliminate PHP's main selling point (installed base and developer mindshare).
I hope that clarifies why the comparison is not disingenuous (i.e. because I'm not making the comparison that you claim).
If I were to make a comparison of that vulnerable PHP code against something else, it would need to be against a language which is primarily designed for Web development (to avoid being disingenuous), and it should be designed to make the "proper" approach the easiest. The Ur/Web language fits these criteria nicely, and (from a quick skim of the tutorial at http://www.expdev.net/urtutorial/step1.html ) the equivalent to that PHP would be:
fun greet data = return <xml>
PHP, on the other hand, requires manual escaping with htmlentities() ... It is very, very error prone.