> PHP's poor design and implementation have probably cost the world millions of lost man-hours; a loss that could have been avoided by choosing a better-designed solution.
Could you give some examples of this?
I'm not a huge fan of PHP, but it generally works, as folks like FaceBook have proven.
I think their object model is a good example. They introduced OOP in version 4 but implemented it in a way that was unusable. They put it right in version 5, but that broke older code -- and it was still weird. Only in the latest release did it become possible to get the right class name in a static method of a class that is inherited by another. Everything they do, they do wrong[1], and it takes them years to get it into approximately acceptable shape.
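To make the "right class name in a static method" complaint concrete, here is a minimal sketch of the behaviour being asked for. It is written in Python rather than PHP, purely as an illustration; the class names `Model` and `User` and the method `table_name` are hypothetical. PHP 5.3 added late static binding (`static::` and `get_called_class()`) to allow the same thing.

```python
class Model:
    """Hypothetical base class; not taken from any real framework."""

    @classmethod
    def table_name(cls):
        # `cls` is the class the method was called on, not the class
        # where the method was defined.
        return cls.__name__.lower()


class User(Model):
    """Inherits table_name() without overriding it."""


if __name__ == "__main__":
    print(Model.table_name())
    print(User.table_name())
```

The point is that the inherited method sees `User`, not `Model`, when called as `User.table_name()`.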
The only thing that saves them is that most PHP users have no clue what they are doing, have no training of any sort, and know nothing but PHP and maybe Perl, VB, or whatever.
[1] Okay, I'm exaggerating and the latest iteration is no longer the menace it once used to be.
> PHP is considered extremely weak performance-wise
I don't know, it seems a lot of the very high trafficked sites (Facebook, Yahoo) run PHP. Yes, benchmarks do show PHP doing poorly, but when was the last time anyone used PHP to crunch numbers? It's apparently fast at what it needs to be fast for.
> There is poor to no support of multithreading
That is not a flaw. PHP is a share nothing architecture; each request is completely independent. This is a good match for the stateless nature of HTTP. Multithreading isn't at all appropriate for PHP and I can't think of a scenario where it would be needed.
> There is no eventing system
PHP is a programming language not a framework. I'd make more of a comment about the various PHP frameworks but "eventing system" seems kind of vague. Maybe you can clarify this a bit and then I'll comment.
> As a web framework, it sins in mixing code & design (opposed to e.g. Django-Python or Ruby on Rails)
Nobody mixes in code & design in PHP anymore -- you won't find it terribly common among the various PHP frameworks. I do use it from time to time as a debugging aid and my compiling template engine uses it as the compilation target.
> I don't know, it seems a lot of the very high trafficked sites (Facebook, Yahoo) run PHP
I'm completely for PHP as a simple and accessible language. The previous author wanted to hear about the shortcomings of PHP and I tried to provide some. PHP sites such as Wikipedia probably need more servers (since every request is a separate process). They might be saving a lot of manpower by using PHP, so it might be an OK tradeoff.
By the way, Facebook uses Erlang for its performance-critical parts (such as chat).
> PHP is a share nothing architecture; each request is completely independent.
You could still use an agent-like messaging system for shared-nothing multithreaded/multiprocessed environment.
> I'd make more of a comment about the various PHP frameworks but "eventing system" seems kind of vague. Maybe you can clarify this a bit and then I'll comment.
e.g. When using cURL you must poll the handler until the request is complete. Obviously you'd need multithreading to achieve a better control flow in such a scenario.
> Nobody mixes in code & design in PHP anymore
I've written and debugged some wordpress plugins, and I can't say I came to the same conclusion as you...
> PHP sites such as Wikipedia probably need more servers (since every request is a separate process).
Citation needed. But every request isn't a separate PHP process; PHP is linked into Apache. Apache starts up a pool of processes and reuses them for requests. It's all highly efficient.
> By the way, Facebook uses Erlang for its performance-critical parts (such as chat).
Actually, it's not so much the performance critical parts but the parts where PHP's share-nothing architecture isn't appropriate. I doubt they're using a front-end web server for chat either.
> You could still use an agent-like messaging system for shared-nothing multithreaded/multiprocessed environment.
If you had the need, I suppose you could. Although I'm not sure what point you're trying to make here.
> When using cURL you must poll the handler until the request is complete. Obviously you'd need multithreading to achieve a better control flow in such a scenario.
You don't really need multithreading. JavaScript has callbacks for everything, for example, but isn't multithreaded. So this isn't so much a problem with PHP as it is a lack of design of cURL. But then cURL is a C library.
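The callback style described above can be sketched without any threads. This is a minimal, assumption-laden illustration in Python (not PHP and not cURL): local socket pairs stand in for network connections, and the function name `collect_with_callbacks` is hypothetical. The OS reports which sockets are readable and a registered callback runs for each one, all on a single thread.

```python
import selectors
import socket


def collect_with_callbacks(messages):
    """Send each payload over its own socket pair and gather the data
    via callbacks, all on the calling thread."""
    selector = selectors.DefaultSelector()
    received = {}
    writers = []

    def make_callback(name):
        def on_readable(sock):
            # Runs only once the OS has reported the socket as readable.
            received[name] = sock.recv(4096)
            selector.unregister(sock)
            sock.close()
        return on_readable

    for name, payload in messages.items():
        reader, writer = socket.socketpair()
        reader.setblocking(False)
        selector.register(reader, selectors.EVENT_READ, make_callback(name))
        writer.sendall(payload)
        writers.append(writer)

    # The event loop: wait until some socket is readable, then dispatch
    # to the callback that was registered for it.
    while len(received) < len(messages):
        events = selector.select(timeout=1.0)
        if not events:
            break  # nothing arrived within the timeout; give up
        for key, _mask in events:
            key.data(key.fileobj)

    for writer in writers:
        writer.close()
    selector.close()
    return received
```

Calling `collect_with_callbacks({"db": b"rows", "cache": b"hit"})` returns both payloads keyed by name, with no polling loop in the caller.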
> I've written and debugged some wordpress plugins, and I can't say I came to the same conclusion as you...
WordPress is old and not well designed. (Much like PHP itself.)
> Apache starts up a pool of processes and reuses them for requests. It's all highly efficient.
It's highly efficient if you have a small number of requests at any given time; it's not as efficient during peak hours.
> If you had the need, I suppose you could. Although I'm not sure what point you're trying to make here.
There are several use-cases I can think of, the simplest being more than one blocking request for rendering your page, and using multi-process to reduce the latency to the maximal block time instead of the sum of all block times.
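A rough sketch of that latency argument, in Python for illustration. Threads stand in for the worker processes mentioned above, and `fake_blocking_call` is a hypothetical placeholder that just sleeps instead of talking to a real database or web service.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_blocking_call(delay):
    """Hypothetical stand-in for a blocking request; sleeps, then returns."""
    time.sleep(delay)
    return delay


def run_sequentially(delays):
    """Issue the calls one after another: total time is roughly the sum."""
    start = time.perf_counter()
    results = [fake_blocking_call(d) for d in delays]
    return results, time.perf_counter() - start


def run_concurrently(delays):
    """Issue the calls at once: total time is roughly the longest call."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(fake_blocking_call, delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.2, 0.3, 0.1]
    _, sequential = run_sequentially(delays)
    _, concurrent = run_concurrently(delays)
    print(f"sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

With three calls the sequential version waits for all of them in turn, while the concurrent version finishes shortly after the slowest one.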
> You don't really need multithreading. JavaScript has callbacks for everything, for example, but isn't multithreaded. So this isn't so much a problem with PHP as it is a lack of design of cURL. But then cURL is a C library.
JavaScript on the server needs solutions such as node.js. On the client side, well, most UI apps have a single UI thread and delegate jobs to worker threads. In JavaScript, this is usually done via XMLHttpRequest.
Actually it's quite fine during peak hours. You could, however, say it's a waste of resources during the lull times. But I'm not using those resources for anything else anyway; the entire purpose of the server is to service web requests. If a bunch of processes are sitting there wasting memory being idle, that's not a problem.
> There are several use-cases I can think of, the simplest being more than one blocking request for rendering your page
That's a fair case but I haven't personally encountered it yet. A good example, however, would be building a mashup page of content from a bunch of different web services.
It's true; many languages allow you to handle concurrent downloads by letting the OS call a function when the socket with the data is readable. (This is part of an "event loop".)
Many language implementations call this threading, however, because it is. You don't need one heavyweight OS thread for every thread of control in your application; many applications that make heavy use of IO easily handle thousands of threads inside one OS thread. It's a very useful abstraction.
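As a hedged illustration of "thousands of threads inside one OS thread", the sketch below uses Python's asyncio coroutines as a substitute for the lightweight threads being described; the function names are hypothetical. Each task records the OS thread it ran on, so the result shows how many OS threads were actually involved.

```python
import asyncio
import threading


async def lightweight_task(task_id, delay):
    # Awaiting yields control to the event loop instead of blocking
    # the underlying OS thread.
    await asyncio.sleep(delay)
    return task_id, threading.get_ident()


async def run_many(count, delay):
    tasks = (lightweight_task(i, delay) for i in range(count))
    return await asyncio.gather(*tasks)


def os_threads_used(count=1000, delay=0.05):
    """Run `count` concurrent tasks and report the OS thread ids they used."""
    results = asyncio.run(run_many(count, delay))
    return len(results), {ident for _, ident in results}


if __name__ == "__main__":
    finished, thread_ids = os_threads_used()
    print(f"{finished} tasks ran on {len(thread_ids)} OS thread(s)")
```

All the tasks wait concurrently, yet only the calling OS thread is ever used.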
> Multithreading isn't at all appropriate for PHP and I can't think of a scenario where it would be needed.
Not blocking the entire PHP process while waiting for database results, waiting for a file to download, waiting for memcached to respond, etc.
I would (and do) use an event loop for this sort of thing, but lightweight threads are technically a better abstraction. Web apps would use a lot less memory (and hardware) if people were not so afraid of threads.
(This happens a lot; a bad implementation of something taints the whole category. pthreads, Java threads, Perl ithreads, etc., are broken, so all threading is bad. C++'s object model is bad, so all object oriented programming is bad. It's sad.)
BTW: some good thread implementations include Haskell's and Perl's Coro.
> Not blocking the entire PHP process while waiting for database results, waiting for a file to download, waiting for memcached to respond, etc.
The thing is, I need those database results -- I can't do anything else until I have them. I received a single request from the user and I'm spitting out a single HTML response back. That's what PHP is all about. And multithreading as an optimization wouldn't by itself improve performance, because I'd be taking CPU away from other requests that are running concurrently.
> Web apps would use a lot less memory (and hardware) if people were not so afraid of threads.
Fundamentally I don't disagree; there's a lot of cool work going on with event-based web servers. However, I think those designs are great for very specific tasks like chat servers but require too much fiddling when doing traditional stateless web work. Still, it may indeed be the future.
> The thing is, I need those database results -- I can't do anything else until I have them.
You could start requesting the database results for a user that is now waiting in the tcp connect queue, cutting latency significantly.
Most people handle this with load balancing between processes, but your blocked-for-DB-results-process is sitting idle using system resources, like memory.
If you only have one user using your site at a time, then this isn't a concern. But for everyone else, blocked processes are wasteful.
> but require too much fiddling when doing traditional stateless web work
Well, yeah, because you don't have threads. When you do, everything is handled by libraries for you; you can think you're blocking, but actually not block the process.
> You could start requesting the database results for a user that is now waiting in the tcp connect queue, cutting latency significantly.
That's already how it works. If one process is blocked waiting for results, another process will get its turn. Apache is one process per HTTP connection (when not running threaded).
> but your blocked-for-DB-results-process is sitting idle using system resources, like memory.
Either way it will need to use system resources. I can't throw away the resources of the user waiting for results; I need them once the results come in! Processes really don't take up that much memory and typically you're reusing a small number of them over and over anyway.
> If you only have one user using your site at a time, then this isn't a concern.
If somebody is blocked then someone else's request gets a turn at the CPU. Blocking a single process is completely inconsequential to a multi-process web server serving multiple users.
> Processes really don't take up that much memory and typically you're reusing a small number of them over and over anyway.
They take up several orders of magnitude more memory than lightweight threads. I did some benchmarks a few months ago showing this; check searchyc if you care. Edit: here you are: http://news.ycombinator.com/item?id=794409
Anyway, when you start thinking about "web applications" instead of "web pages", lightweight threads make a lot more sense. And, you'll find that treating a web application like it's just a bunch of web pages leads to many problems down the road. (Performance is the least of your worries.)
Finally, "I definitely don't need to know about this abstraction because my favorite language doesn't have it" is dangerous thinking.
> They take up several orders of magnitude more memory than lightweight threads.
Premature optimization; when a 1st grader can count to the number of processes you need for a large-scale web application, optimizing it any more is a waste of time and effort.
> You'll find that treating a web application like it's just a bunch of web pages leads to many problems down the road.
Really? I find it refreshingly helpful. Instead of your large application being a large application, it's actually just a series of very small applications. If a page crashes, it's pretty much no harm at all. If your whole application crashes, you're screwed. If you want to add some functionality, just add it on.
> Finally, "I definitely don't need to know about this abstraction because my favorite language doesn't have it" is dangerous thinking.
"Oooh shiny" is also dangerous thinking. Is it really better to waste your time using up as many threads as possible or actually solving real problems for a few hundred processes?
Sidebar: Why is it always you, Jonathan Rockway? Just how many posts do you make on Hacker News in a day? :)
I wonder what you're comparing PHP to. PHP is faster than Ruby and about the same speed as Python. Threading is not a pressing need in the sort of environment PHP typically runs - multiprocess web servers. Python and Ruby, when used for web apps also manage just fine with 'little to no support of multithreading'. And as someone else pointed out, it's not a framework to begin with (although there are PHP frameworks). The language is not without its warts and limitations but I don't think they're the ones you've listed.
Let's put the term "framework" aside. The Wikipedia article states it is "a basic conceptual structure used to solve or address complex issues, usually a set of tools, materials or components". I believe a scripting language that is pre-designed for website design could be considered a framework, but it's not that important whether there is a consensus.
I'd like to see your source for the performance claim. To my own I can bring e.g.
I found threading a great shortcoming of PHP. I had web services that required data from several sources and needed to query them in parallel. There are about 370,000 results for +php +multithreading on Google, so I assume it's more than just me.
The benchmarks you're linking are pretty limited (especially the first one that tests just instantiation and method dispatch). Also synthetic but somewhat more comprehensive -
As I said, if threading is a problem in PHP, it's as much of a problem in Ruby and Python. Common workaround is async message queueing with a great number of solutions available for all of these languages. There's a parallel curl available in recent versions of PHP that covers the simple 'multiple data services' case reasonably well.
You're referring to curl_multi. It's an absolute necessity unless you want to shell out to extra PHP processes. Still, you have to poll it and the code is not friendly.
Ruby is no gem either (I've always wanted to say that), with the one advantage of having dedicated high-level programmers as advocates within the community.
I haven't seen anything close to RoR or Django in PHP. Nor did I see a Twisted or node.js variant.
Crawling is a problem that is fundamentally different from serving up dynamic websites. I've written a PHP-based crawler framework; it simply uses a database as a central synchronization area and a bunch of parallel scripts to do the crawling.
I could have done the same using curl but for technical reasons chose to do it this way. It's definitely a workaround kind of solution though.
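The "database as a central synchronization area" approach might look roughly like the sketch below. This is not the crawler described above: Python and SQLite are swapped in for illustration, and the table name `crawl_queue`, its columns, and both function names are hypothetical. Each parallel script would open its own connection to the same database file and call `claim_next` to pick up work.

```python
import sqlite3


def create_queue(conn, urls):
    """Create the shared work table (hypothetical schema) and seed it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS crawl_queue ("
        " url TEXT PRIMARY KEY,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " worker TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO crawl_queue (url) VALUES (?)",
        [(url,) for url in urls],
    )
    conn.commit()


def claim_next(conn, worker):
    """Mark one pending URL as claimed by `worker`; return it, or None."""
    while True:
        with conn:  # commits on success, rolls back on error
            row = conn.execute(
                "SELECT url FROM crawl_queue"
                " WHERE status = 'pending' ORDER BY url LIMIT 1"
            ).fetchone()
            if row is None:
                return None  # nothing left to crawl
            cursor = conn.execute(
                "UPDATE crawl_queue SET status = 'claimed', worker = ?"
                " WHERE url = ? AND status = 'pending'",
                (worker, row[0]),
            )
            if cursor.rowcount == 1:
                return row[0]
        # Another worker claimed the row first; try the next one.
```

The conditional `UPDATE ... AND status = 'pending'` is what keeps two workers from taking the same URL: only one of them sees a row count of 1.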
I hope to replace the whole thing with a clojure/jvm based solution.
PHP is actually pretty fast in its way, because a lot of the stuff is just plain wrappers for C functions. Use PHP right and do as much as possible with the built-in functions and it's pretty fast -- build complex OO junkworks and it'll slow down a little.
I am involved with two very high-volume sites. One is written in Java, using Tomcat as the server, and it needs 3 servers to handle between 650K and 1.5M uniques daily; server load is OK but fairly high. The other is a PHP-based service that serves up 1.8 million pages on a good day to 50K uniques; the server load on that box is a relaxed 0.6.
When used properly, PHP performs quite well. Note that this is without any front-end caching; each page gets generated anew.
Otherwise I wouldn't have used them as examples. They are basically implementations of the same kind of site but on different platforms, the one more successful than the other.
In terms of raw performance I'd say the Java environment is a little bit faster, but not by much.
In terms of programmer performance I'd say the PHP environment is a lot faster than the Java environment.
I would qualify the Java as more maintainable, though it is less stable in production than PHP (we see the occasional freeze on Java; the Apache/PHP stuff seems to run itself year after year without any issues).
I wonder whether it's not so much Java, but the excessive "architecture" around the language. My software uses the Jetty web server, a minimal framework, and JRuby for application code. It's very, very fast, stable, not particularly heavy on memory consumption, and I get good programmer productivity with the Ruby language. Importantly, I get to use all the shiny Java libraries instead of battling with Ruby gems.
I have come to the conclusion that the JVM combined with the vast number of libraries is great, but the Java ecosystem is not terribly optimal and is overcomplicated.
One day someone will come up with a 'runnable war file' that contains the whole web application+container and does not require unpacking or other manipulation, entirely stand-alone.
Would it actually be that hard to put everything in a single jar file and generate a class to boot it? Is it more than tracing dependencies and setting up the servlet configuration? Admittedly I have avoided the whole Java infrastructure, but using Jetty as an embedded web server has been painless.
While one could argue for PHP being a "language" upon which frameworks are built, it really is just a framework. A low-level framework, that is. Frameworks built on top of it are just higher-level, more abstract frameworks.
I come from 4+ years of using it professionally and most of the higher level frameworks I'm talking about (Kohana, Zend, CodeIgniter, etc...) really only exist to overcome a lot of PHP's more abstract weaknesses (even though they can't really do much about the interpreter implementation).
I can easily say that something written in Python, Scheme, or whatever-you-choose will run a lot better, take less time to build, and be easier to manage down the road.
The primary problem in this industry is that the majority of web application programmers grew up on PHP, are running their own IT consulting shops or startups by now, and have zero experience with other languages. That makes them much less interested in adopting a different web development paradigm when an aspiring web developer with Python/Erlang/Scheme/Ruby skills under their belt tries to build something with that tool chain in their employ.
The less knowledge and/or experience with other languages you have, the more rooted you are in what you are comfortable with. Education is a good thing, and that is what I love about Hacker News; the people here experiment with new languages and try out new technologies for their projects/products - which is much more progressive than the enterprise situation.
You have some points, but I think they're not very strong. PHP seems fast enough for many popular and large websites, I'm not sure how important multithreading is to most web applications out there (same with events), and the last criticism is thought of in the PHP community as a feature. PHP easily suffices as a quick, simple templating language when you don't feel like you need to bring in Smarty or some other thing specific to templating.