Hacker News

Apache was always a badly designed HTTP server. Forking a process for every request was a stupid idea for any public web server (see also: CGI). It's the only reason web servers got Slashdotted back then and why they rarely do now. Forking processes is incredibly expensive compared to an event loop. Event-based HTTP servers inevitably won; it just took a while. Using select() wasn't ideal, but then epoll() made things optimal.

The idea that Apache was ever good enough was wrong then and is wrong now. We always needed a more efficient design.

I'm sure someone could write a somewhat more efficient version of nginx, but I'm not sure they could do it with as many features as nginx has, which means you'd probably just end up switching to nginx at some point. I think Cloudflare uses nginx. They could probably save on machines if they used a more minimal server.




So, you don't know enough to know that Apache is not tied to any particular concurrency model, nor that it hasn't used the concurrency implementation you've described (process-per-connection) in decades, yet you feel you know enough to make recommendations for web servers? Even when using the now quite old prefork MPM, Apache maintains a process pool rather than forking a process per connection, and it is not subject to the performance problems you're alleging (memory usage can be high with that MPM under high-concurrency workloads, but its performance in most deployments is not all that bad).

Further, "forking processes" is actually not incredibly expensive on Linux, and in fact, pthreads on Linux are implemented by the same code (clone() with varying levels of sharing). Forked applications on Linux are pretty much just as fast as threaded applications. It is a myth based on extremely outdated knowledge (fork on Solaris, for instance, and some other UNIX variants, had a history of being slow; but, Linux has always had a very fast fork).

It is true that event-based thread (or process) pool concurrency implementations can provide superior performance to thread- or process-per-connection implementations, for a variety of reasons, but Apache has that covered. I'm gonna guess you've never even used or seen an Apache installation that forked a process for every request (because it's been so long since that was a thing Apache did), so I'm not sure how you could believe it works that way.

Where did you get all of these assertions from? Are there sites out there propagating these crazy claims about Apache? And, if so, why? What does one gain by trash-talking a project that was instrumental in helping build the open web and still powers more websites than any other web server in the world? And, does it well, I might add. There are some good reasons a reasonable admin might choose nginx over Apache. But, they aren't because Apache is a terrible piece of software written by incompetent people.

In short, your comment has negative value, by providing misleading and outright incorrect information.

Edit: And, this is why I hate it when performance is the measuring stick people use to discuss web servers. It begins to seem like it is a useful metric for comparing web servers, when it really is not for 99% (or more) of deployments. Apache is fast enough. nginx is fast enough. Pick your web server based on other characteristics, because otherwise you're almost certainly making decisions based on the wrong things.


> Further, "forking processes" is actually not incredibly expensive on Linux, and in fact, pthreads on Linux are implemented by the same code (clone() with varying levels of sharing). Forked applications on Linux are pretty much just as fast as threaded applications. It is a myth based on extremely outdated knowledge (fork on Solaris, for instance, and some other UNIX variants, had a history of being slow; but, Linux has always had a very fast fork).

I think you are referring to vfork vs. fork. While not terribly expensive, forking is certainly more expensive than the alternatives. (There is a reason why none of the Apache MPMs do that unless you give them a very messed-up config.)

However, you are right that the real issue isn't the cost of the fork (which, duh, if it were Apache would totally have you covered!!). It's more about the address space used up by each process/thread. That becomes a limiting factor for high levels of concurrency, though below levels where it may be a limiting factor the model tends to execute very efficiently (arguably more efficiently).

Apache's "event" MPM isn't really quite the same as engines like nginx & lighttpd... it works quite well, but it even describes itself as a "hybrid multi-process/multi-threaded server".
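That hybrid shape is visible in how the event MPM is tuned. The values below are hypothetical and the module path is illustrative; the directive names are from Apache 2.4.

```apache
# Load the event MPM (Apache 2.4; path is illustrative)
LoadModule mpm_event_module modules/mod_mpm_event.so

<IfModule mpm_event_module>
    # A few processes, each running a pool of worker threads:
    # the "hybrid multi-process/multi-threaded" design in practice.
    ServerLimit            4
    ThreadsPerChild       64
    MaxRequestWorkers    256   # 4 processes x 64 threads
</IfModule>
```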

As for sites "propagating" this information, there is certainly the ol' traditional: http://www.kegel.com/c10k.html

You can also find some pretty well-established companies in the web hosting business that really ought to know whereof they speak, like, say, DreamHost: http://wiki.dreamhost.com/Web_Server_Performance_Comparison


The performance is not the only problem; the config files are just terrible compared to Nginx's.


That's a valid consideration. If the Apache configuration file is difficult, then choosing something else might make sense. I'm not saying everyone should use Apache. I'm saying that almost no one should be using performance as the metric by which they judge web servers.

Though, while we're on the subject, it's a little too easy to configure nginx in insecure ways, and several configuration examples on the web exhibit pretty bad practices. But, on the whole, I agree that nginx is pretty easy to set up and maintain, and it is a great piece of software.

It's interesting that folks have interpreted my comments to mean people shouldn't use nginx and should always use Apache. I've never suggested that (and, I find it funny, because I was the biggest proponent of adding nginx support to the control panel software I work on). All I've said is that I recommend people not choose their webserver based on performance, because they're all (Apache and nginx in particular, and the one OP posted about) fast enough for the vast majority of websites and environments.


You're arguing against several strawmen of your own invention. Preforking is still forking, for one. Under heavy load you're forking to keep up with new connections while existing processes are tied up very slowly serving responses to bandwidth-constrained clients. Also, I never claimed forking is expensive. I said it's expensive compared to an event loop. Correcting things I didn't say might feel good, but you're just lying to yourself. Regurgitating what you've read about how Linux processes and clone() work is stereotypical sysadmin bloviating.

Several other things you say are equally wrong. Claiming that Apache has been away from a process model for decades is dishonest. It's not even decades old and the Apache project itself contradicts you.

Preforking is still recommended for "sites requiring stability" and is in (most?) common usage: http://httpd.apache.org/docs/current/mpm.html

> The server can be better customized for the needs of the particular site. For example, sites that need a great deal of scalability can choose to use a threaded MPM like worker or event, while sites requiring stability or compatibility with older software can use a prefork.

Your religious devotion to Apache and claim that "Apache is fast enough" ignores the reality of what nginx can do with so much less CPU and memory. It's not a technical argument but a religious one. Nginx is good enough. Apache is not. It never was, people have lived with it for too long because people like yourself buried their heads in the sand. You're still doing it.


> Preforking is still forking, for one. Under heavy load you're forking to keep up with new connections while existing processes are tied up very slowly serving responses to bandwidth constrained clients.

No, that's not strictly true. If you have min & max servers fixed at the same value and the max requests per child set absurdly high, you won't fork much if at all under heavy load. Pre-forking can result in a lot of forking and requests being blocked while you fork if a) you have a surge in traffic and b) your min servers isn't set high enough to cover the surge or alternatively c) your max requests per child is low enough that you are constantly having processes exit.
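The fixed-pool setup described above looks roughly like this in Apache 2.4 prefork terms. The values are hypothetical; MaxRequestWorkers and MaxConnectionsPerChild are the 2.4 names for the older MaxClients and MaxRequestsPerChild.

```apache
<IfModule mpm_prefork_module>
    # Pin the pool: min and max spare servers equal, so the parent
    # never forks (or reaps) children in response to load swings.
    StartServers           50
    MinSpareServers        50
    MaxSpareServers        50
    MaxRequestWorkers      50
    # 0 = children never exit after N requests, so no re-forking churn.
    MaxConnectionsPerChild  0
</IfModule>
```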

> I never claimed forking is expensive. I said it's expensive compared to an event loop.

Agreed, although that's kind of a meaningless statement (an event loop is something you go through on a per-request basis), and it misses the real problem with the multi-process model: the virtual address space consumed by each process/thread.

> It's not a technical argument but a religious one. Nginx is good enough. Apache is not.

Umm... that sounds like a religious argument in its own right. Apache is certainly good enough for plenty of people, and more importantly with all the dynamic content on sites, the web server tends to be a pretty unimportant factor in the performance of many sites. Apache brings other things to the table which are often valued for projects, and there is no reason that needs to be considered a "religious" decision.


> I never claimed forking is expensive. I said it's expensive ...

Yeah, no, that seems to be exactly what you are saying. Just nitpicking.


It's a good thing we have http://httpd.apache.org/docs/current/mod/event.html then.


I've spent a decent amount of time trying to make evented Apache work. It's never been worth it when a better choice that just does it is right there for the taking.

(And god help you if you try to do it with Passenger.)



