I'm still using FastCGI! It works well on Dreamhost.
The Python support is not good! In theory you just write a WSGI app, and it will work under a FastCGI wrapper.
But I had to revive the old "flup" wrapper, since Dreamhost has Python 2. I downloaded an older tarball and built it myself.
Use case: I parse thousands of shell scripts on every release and upload the results as a ".wwz" file, which is just a zip file served by a FastCGI script:
https://www.oilshell.org/release/0.8.1/test/wild.wwz/
So whenever there's a URL with .wwz in it, you're hitting a FastCGI script!
This technique makes backing up a website a lot easier, as you can sync a single 50 MB zip file rather than 10,000 tiny files, whose filesystem metadata takes forever to stat(). It's more rsync-friendly, in other words.
I also use it for logs in my continuous build: http://travis-ci.oilshell.org/jobs/
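A minimal sketch of how such a script might look, assuming the flup wrapper mentioned above and a made-up document root (hypothetical, not the actual script):

    # Hypothetical sketch of a .wwz responder: a WSGI app that serves
    # members out of a zip file. Paths and names are illustrative.
    import zipfile

    DOC_ROOT = '/home/example/www'  # made-up document root

    def application(environ, start_response):
        # e.g. PATH_INFO = '/release/0.8.1/test/wild.wwz/index.html'
        path = environ.get('PATH_INFO', '')
        zip_rel, sep, member = path.partition('.wwz/')
        if not sep:
            start_response('404 Not Found', [('Content-Type', 'text/plain')])
            return [b'expected a .wwz path\n']
        try:
            with zipfile.ZipFile(DOC_ROOT + zip_rel + '.wwz') as z:
                body = z.read(member)
        except (OSError, KeyError):
            start_response('404 Not Found', [('Content-Type', 'text/plain')])
            return [b'not found\n']
        start_response('200 OK', [('Content-Type', 'text/html')])
        return [body]

    if __name__ == '__main__':
        # Run under flup's FastCGI-to-WSGI wrapper.
        from flup.server.fcgi import WSGIServer
        WSGIServer(application).run()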
Does anyone know of any other web hosts that support FastCGI well? I like having my site portable and host independent. I think FastCGI is a good open standard for dynamic content, and it works well on shared hosting (which has a lot of the benefits of the cloud, and not many of the downsides).
FastCGI provides that, but the Python libraries are not that well documented and are sometimes unmaintained. For some reason this appears to be a "cultural" thing and not a technical issue.
I wrote a whole bunch of comments about my plans to work on that here (andyc):
I'm probably going to share my .wwz Python WSGI/FastCGI script. And show the hacks I did to deploy it on Dreamhost.
Long term I want to build FastCGI support into https://www.oilshell.org/ so the same problem doesn't exist there! i.e. the fact that you CAN deploy Python on many shared hosts, but nobody does because the ecosystem doesn't support it. Whereas the PHP ecosystem does support it, but the language is hard to learn.
nginx on Gentoo has a relatively straightforward path to using FastCGI (fcgiwrap). I'm "oldschool" in that I don't really care about what fancy functionality I gain with some new tech (like nodejs) when all I want to do is speak to a very specific API endpoint, or handle a specific API call from some server like rocket.chat or Mattermost.
I contemplated using nodejs or something else, but the ability to bang out a script that answers on a URL on nginx is so much easier, in Python or otherwise.
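To illustrate, the kind of script fcgiwrap runs is just a program that reads the CGI environment variables and writes an HTTP response to stdout; this one is a made-up example in Python:

    #!/usr/bin/env python3
    # Hypothetical CGI script: echo the request method and query string
    # as JSON. fcgiwrap execs this once per request.
    import json, os, sys

    def main():
        # CGI passes request metadata via environment variables
        method = os.environ.get('REQUEST_METHOD', 'GET')
        query = os.environ.get('QUERY_STRING', '')
        sys.stdout.write('Content-Type: application/json\r\n\r\n')
        json.dump({'method': method, 'query': query}, sys.stdout)

    if __name__ == '__main__':
        main()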
I went a step further and spent most of my free time this year writing CGI/C apps.
This was not nostalgia either: the reason for doing that is that I was writing webapps for my pinephone. Toying with the phone, I decided I wanted my apps to be webapps rather than GTK apps, so that I can access them either from mobile or laptop (through the local network), but I didn't want to have the apps running all the time, so that they consume less energy (which directly translates to battery lifetime on a mobile). Turns out that CGI is perfect for that: the only processes always running are nginx and fcgiwrap; all my apps are started only on demand, for the lifetime of the request.
I did expect a big performance hit, but I was surprised it was not so bad. I guess that's because they are C apps rather than written in languages which require loading an interpreter before running anything. One app that I rewrote from libmicrohttpd actually had better performance (although it was the first time I used libmicrohttpd, so I probably did something incorrectly).
> Turns out that CGI is perfect for that: the only processes always running are nginx and fcgiwrap; all my apps are started only on demand, for the lifetime of the request.
This is a really nice idea. It almost makes me wish browsers could just make a local CGI request directly, without a webserver at all. A quick search turns up an extension for ye olde Firefox: https://github.com/RufusHamade/lcgi. I suppose with native messaging APIs it's probably possible with new extension APIs too, if a bit more convoluted.
My contract is up next month... how are you finding the pinephone? I just need web, gps, voice calls, email, camera, music, and a battery lasting over 24 hours mostly on standby. Is it a daily driver yet?
I would say that for battery life alone, that probably wouldn't be for you yet. Regarding other apps, while they do work, it's quite a severe downgrade from android/iOS - mostly, I think, because the pinephone is so low end hardware. I do use it as my main phone, but that's only because I find hacking it worth those sacrifices. In your case, you'll probably want to wait for the Librem 5 to see how it compares, or for pine64 to release a buffed up pinephone (if they ever do that).
>I just need web, gps, voice calls, email, camera, music, and a battery lasting over 24 hours.
Isn't this what most people use their phone for? What exactly are you compromising on in this list? You might as well say you just need a fully-functional smartphone, not a hacking project.
Video playback (Netflix), potentially choice of audio playback (I choose Spotify, not local mp3s or something), banking apps, authenticator, an app store, games, ridesharing, home security camera viewer. Probably the number 1 smartphone app wasn't included: Facebook.
I'm guessing 75% of most people's phone use is compromised, at least by changing to a web app. What the GP listed would be about 20% of my use, personally.
Not having Google/Apple ecosystem is a feature, not a compromise for those going to the PinePhone. Those making that choice need to have realistic expectations on what they're getting, and be able to accept compromises on the availability/polish on services like web, gps, email, camera, music, and also battery life. If you have to ask, it's probably not for you.
Yup, that's how we used to do it in the 90s: as long as the code is compiled, it has fairly decent performance. The downside is that debugging is far harder than with scripting languages.
Heh, that takes me back! In 1994, I was a high schooler volunteering at a famous west coast science museum (it was either that or work at Taco Bell, thanks dad), and one of my tasks was to make a searchable employee directory...
I don't remember if there was any other option on the server we had, but I had to write that sucker in C! Reading and parsing a .csv file, handling search options, working as a CGI program, what a pain, especially for a dumb high schooler like me!
I _think_ I checked it but didn't find any way to run cgi apps, only fastcgi (maybe I should check again, I'm not 100% sure about that).
Although, now that I understand fastcgi better, I guess fcgiwrap could be adapted for lighttpd, if it doesn't work for it as is (my package manager's description mentions explicitly it's for nginx).
Don't worry, I don't think these lessons were forgotten -- we've made our way back full circle already with "serverless functions", if you squint a little bit.
I think coming full circle is a sign that they were indeed forgotten - as in "Those who cannot remember the past are condemned to repeat it". Although the newer iterations of this idea are more managed and auto-scalable and such.
Let's dive a little into how serverless & FastCGI might be related.
The thing that FastCGI brought over cgi-bin was that an application process could be left open to communicate with the server, whereas the cgi-bin model required spawning a new process for each request.
If one reads the AWS Lambda docs, they'll see the execution context[1] has a similar behavior. AWS will spin up new instances, but these instances will serve multiple requests, via a fairly custom "function" interface defined for various runtimes (but which is actually, typically an http interface). There is a standard HTTP api for runtimes to use to retrieve function invocations[2].
With FastCGI, the front-end server uses a socket to push request messages to app servers, which reply in order. Whereas with Lambda & its above-mentioned runtime API, the runtime is retrieving requests from Amazon at its own pacing, & fulfilling them as it can. So there's a push vs. pull model, but in both cases, the application server is talking a fairly custom protocol to the front-end server.
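To make the pull model concrete, here is a sketch of the loop the runtime API docs describe (the endpoints are the documented ones; the handler and error handling are placeholders):

    # Sketch of a custom Lambda runtime's pull loop. The runtime polls
    # for the next invocation, then posts the result back.
    import json, os, urllib.request

    api = os.environ['AWS_LAMBDA_RUNTIME_API']
    base = 'http://%s/2018-06-01/runtime/invocation' % api

    while True:
        # Long-poll for the next event (the "pull" side of the model)
        with urllib.request.urlopen(base + '/next') as resp:
            request_id = resp.headers['Lambda-Runtime-Aws-Request-Id']
            event = json.load(resp)

        result = json.dumps({'echo': event})  # placeholder handler

        # Post the result back for this specific invocation
        urllib.request.urlopen(urllib.request.Request(
            base + '/%s/response' % request_id,
            data=result.encode(), method='POST'))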
Also, though, there are some cgi-bin-like behaviors seen in some serverless systems. Serverless is a big umbrella with a lot of different implementation strategies. One optimization is the use of checkpoint-restore: an app server is brought up to a "ready to serve" state, then the host operating system takes a "snapshot" of the process. When new instances of the process are needed, the serverless system can "restore" this memory-mapped process & the resources it was using, bringing it up in a ready-to-serve state quickly. This behavior is more cgi-bin-like, in that it's a technique for spawning new serving processes quickly, although few serverless systems go as far as cgi-bin went with a per-request process. Nonetheless, OpenWhisk, for example, was showing off start times decreasing from 0.9-0.5s for node, python, and java app servers down to .09s-0.7s using these checkpoint-restore capabilities of the OS.
And this is where the squinting comes in! As far as I'm concerned, the differences are mostly implementation details. As far as the idea of deploying small, relatively self-contained chunks of functionality goes, I think serverless is a revisit of the principles of CGI. I've actually given a tiny presentation on this if anyone would like to read my view in depth[0], which I think fully explains my stance. Huge grain of salt though -- I wasn't there when all this happened, and I only caught the end of the cgi-bin era.
> The thing that FastCGI brought over cgi-bin was that an application process could be left open to communicate with the server, whereas the cgi-bin model required spawning a new process for each request.
Absolutely (I read/skimmed the article, and this was brought up) -- serverless functions are the same here, because they also allow machines to stay running for some indefinite period.
> If one reads the AWS Lambda docs, they'll see the execution context[1] has a similar behavior. AWS will spin up new instances, but these instances will serve multiple requests, via a fairly custom "function" interface defined for various runtimes (but which is actually, typically an http interface). There is a standard HTTP api for runtimes to use to retrieve function invocations[2].
> With FastCGI, the front-end server uses a socket to push request messages to app servers, which reply in order. Whereas with Lambda & its above-mentioned runtime API, the runtime is retrieving requests from Amazon at its own pacing, & fulfilling them as it can. So there's a push vs. pull model, but in both cases, the application server is talking a fairly custom protocol to the front-end server.
This is one of the reasons I said "serverless functions" instead of Lambda. While AWS happens to specify their operational semantics that way, there is no need for anyone else to. While improvements are necessary, I still think this is fairly close to FCGI, and FCGI could absolutely serve as a "serverless" provider implementation.
Needless to say, how long you keep around the process, or how you checkpoint restore (criu[1] is very interesting, for anyone who's never seen it) and move processes are all implementation-specific in my mind.
I disagree a lot with this article's opinions on PHP. It sounds like coming at it from the perspective of someone who hasn't used it in over a decade. Modern PHP with frameworks is just about as good as it gets for web development.
Edit: oh lol, this was written in 2002, alright, that confirms that.
> I disagree a lot with this article's opinions on PHP. It sounds like coming at it from the perspective of someone who hasn't used it in over a decade. Modern PHP with frameworks is just about as good as it gets for web development.
Hard disagree, but I'd love to learn more about the things PHP is good at compared to other frameworks and languages. Rails & Django are the frameworks to beat in terms of completeness and productivity, Flask/Sinatra for Python/Ruby-flavored micro-frameworks, NodeJS for massive concurrency and parallelism (Node does support threads as well as subprocesses) and a large (sometimes sketchy) ecosystem and ease of contributions. There are the compiled languages as well, with various advantages -- single-binary deploys, speed, error avoidance with compile-time type declaration/inference, etc.
What does modern PHP and a framework (like Laravel? what is popular in the space these days?) do better than its contemporaries do? IIRC HHVM, which seemed to be a bright spot (?), was abandoned by FB... Would love to hear about PHP's bright spots from a PHP enthusiast.
PHP was built to parallelize requests out of the box. It will use all of your cores automatically with no work on the developer’s part. It’s seemingly hard to find benchmarks comparing _modern_ versions of PHP (not behind Apache) to Node, but once you get Apache out of the way of PHP7, the performance of modern PHP screams.
HHVM as you mentioned lost momentum when PHP7 largely met or surpassed its performance. That’s not a bad thing though, it just wasn’t hugely necessary anymore. The major upside is everyone gets the boost now, whereas HHVM was a nightmare to setup if you were not Facebook.
Beyond that, I truly believe the stateless nature of PHP requests is by far the easiest mental model to work with. As cross-request state is held exclusively and explicitly elsewhere (Redis, Memcache, SQL), it’s remarkably easy to reason about at scale.
This makes it easy to scale! Just throw another server on the load balancer and everything works. I work on a relatively large educational product, and we auto-balance up to 30+ servers during the week to handle a couple million users sending continuous streams of data, down to a minimum of 2 on weekends. It’s taken almost no consideration on the developers’ part because of PHP’s naturally stateless, clean-slate-per-page-load model.
We also have a fair bit of Go in production, largely for CPU-intensive operations. I love Go and use it for the majority of my newer personal projects. The major advantage of PHP that keeps it our primary language is the speed at which I can develop and iterate. The difference that save/refresh vs. save/recompile/restart/refresh makes in speed is not to be underestimated.
Myself as well as our lead dev have been using Go since before 1.x and we both still agree that PHP is better for rapid prototyping as well as just generally getting things out the door quickly.
> HHVM as you mentioned lost momentum when PHP7 largely met or surpassed its performance. That’s not a bad thing though, it just wasn’t hugely necessary anymore. The major upside is everyone gets the boost now, whereas HHVM was a nightmare to setup if you were not Facebook.
I had no idea this was why!
Thanks for sharing this insight, great to hear the rationale of teams that are choosing PHP for good reasons these days, the points you've made definitely make sense.
I work mostly with Django at my work. I still choose Lumen / Laravel for my personal projects. Both Django and Laravel are amazing but I'm still more effective with Laravel. It has way more functionality out of the box.
I can’t find any good information on implementing a FastCGI-compliant server. Does anyone happen to have information about how to do this? I think it would work well for some applications I have, and I can’t find anything to increase my know-how for taking in incoming FastCGI requests. Think nodejs in particular, but language-agnostic documents would be best.
I can't tell you how to implement one from scratch if that's your question, but I use apache httpd with mod_fcgi.
This server isn't directly exposed, only valid requests are proxied to it.
I have no reason to think this is the best approach; it's a near-forgotten foundational layer of an application that has been stable and reliable for a decade, that is under constant attack and regularly pen-tested.
And probably add logic to allow the web server to restart the process if it should fail/crash, and maybe recycle it every Nth hour. Bad C++ code could easily make your process crash.
Apache's mod_fastcgi/mod_fcgid used to try and manage its own FastCGI processes, but the result was rather brittle. It was also a non-starter when the FastCGI processes were on a different machine for load balancing purposes.
It's much simpler to use an independent monitoring program to manage your processes than to rely on the web server. Modern init systems can do this for you, or you can use tools like monit. Docker will probably work, too. Node has PM2. PHP has FPM, which can automatically recycle processes on a preconfigured schedule. Not that they need to be recycled, because PHP is very stable these days, but it's just a one-liner in a config file if you need it.
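As a sketch of the init-system approach, a hypothetical systemd unit for a FastCGI process (names and paths made up) could be as small as:

    [Unit]
    Description=Example FastCGI app (illustrative only)

    [Service]
    ExecStart=/usr/local/bin/myapp.fcgi
    Restart=on-failure
    # Recycle the process daily, like FPM's scheduled restarts
    RuntimeMaxSec=86400

    [Install]
    WantedBy=multi-user.target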
Implementing fastcgi is a lot of fun, and by far not the hardest protocol out there. It's certainly easier than HTTP/1.1, HTTP/2, and (elder gods have pity on us) HTTP/3. So if you don't have access to a library that does it for you, going by the spec will get you there.
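For a sense of scale: every FastCGI message on the socket starts with the same fixed 8-byte header, so a sketch of the framing layer (short reads and the record-type state machine omitted) fits in a few lines:

    # Parse the fixed FastCGI record header defined in the spec:
    # version, type, requestId (16-bit BE), contentLength (16-bit BE),
    # paddingLength, reserved. The record types (BEGIN_REQUEST, PARAMS,
    # STDIN, STDOUT, END_REQUEST, ...) are layered on top of this.
    import struct

    FCGI_HEADER = struct.Struct('>BBHHBB')  # 8 bytes, big-endian

    def read_record(sock):
        header = sock.recv(8)  # real code must loop on short reads
        version, rtype, request_id, content_len, padding_len, _ = \
            FCGI_HEADER.unpack(header)
        content = sock.recv(content_len)
        sock.recv(padding_len)  # discard padding
        return rtype, request_id, content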
If you're working with a stack like nodejs, I don't see how FastCGI would be any faster or better in any way.
In the late 90s I used FastCGI with Perl to do basically everything... personal sites, the local Lee newspaper, even a math expert system funded by an NSF grant that interfaced with MATLAB. Nowadays, HTTP server extensions and cloud services easily match FastCGI performance, while providing many scaling, logging, and denial-of-service protections and features built in. It's hard to argue for FastCGI anymore, but everyone wants to reinvent the wheel, and some of those wheels are nice. Who needs EDI when we have SOAP? Who needs either when we have REST?
A fastcgi (or similar, uwsgi is great) server gives you a lot of control over the application lifetime that you have to implement on your own if you use http.
You also have less control over buffering and since you don't see the actual connection, you have to rely on special headers to get things like remote ip and remote protocol. Not a big deal, but still nice.
~15 years ago, I was tasked with implementing FastCGI support for a webserver I was writing in C++, so that it could proxy some downstream crud stuff written in Python. After mulling over it for a couple of days, I ended up implementing SCGI[1] support instead. SCGI is like FastCGI, but a lot simpler to parse, IIRC.
Yes, SCGI ranks even higher on the "forgotten" scale (not as sure about the "treasure" part). It's pretty great if you have to implement at least one end yourself (new http server, new connected language/framework).
One of the advantages of fcgi is that it can be launched "in place" like cgi, with the server restarting the backend every N requests. Does SCGI support doing this?
FastCGI was one of my favorite silver bullets for improving the speed of /cgi-bin/ programs. I used to have slow machines where the Perl interpreter startup time would be significant, so rewriting the script around an fcgi loop made a huge difference. The script could stick around serving requests, and if you feared a memory leak, you could just exit() after a hundred requests or whatever, and the FastCGI spawner would restart them on the next request.
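The shape of that trick, sketched in Python rather than Perl (fcgi_accept and handle stand in for whatever FastCGI binding you use; the counter is the point):

    # Serve requests in an accept loop, then exit after N requests so
    # the FastCGI spawner restarts a fresh process, clearing any leaks.
    MAX_REQUESTS = 100

    def serve(fcgi_accept, handle):
        served = 0
        while fcgi_accept():      # blocks until the next request
            handle()              # produce the response
            served += 1
            if served >= MAX_REQUESTS:
                break             # exit; spawner respawns on demand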
It wasn't as fast as something like mod_perl [1], but much easier to integrate with quickly hacked scripts, as long as you were lexically scoping everything with my() and not screwing up your globals. Fun times.
It's too bad this never got more traction; I always thought the ability to define roles was an incredibly smart and useful feature. Defining 'authorizer' and 'filter' allows you to stack all sorts of independent apps - responder provides data, filter does rendering, authorizer is SSO, for example.
Worth mentioning that OpenBSD does have FastCGI support in its httpd server, but it only supports the responder role.
Yes, but still just the responder role. Not many servers support all the roles, which is the part I find most useful. I only recall hearing of one implementation, from a well-known company, that supported filters and authorizers, and that was from a while back; I also haven't looked into it in a number of years and really can't remember.
In a similar vein, I've used https://github.com/shellinabox/shellinabox (present in the standard Debian repo) to give easy access to a few commands to relevant people at work who don't like my preferred method for such temporary/intermittent needs (a collection of ssh/sudo/similar-based wrappers, sometimes hooked together using byobu/parallel-ssh/watch to produce a hacked-together dashboard-like display). Mostly read-only stuff (though fully interactive use is possible), even then nothing really sensitive (mostly status information), and local access only (I wouldn't trust something like this on a public network, no matter how careful I thought I was being with the setup), but that is useful enough. For a quick dashboard view, throw together a quick page that links a couple of instances in iframes.
After hearing about CGI I became a bit obsessed with its simplicity and started writing my blog[1] as a CGI app, eventually incorporating a FastCGI library later.
Performance-wise I don't know how it would compare to other options, but I think it's a great way to expose C/C++ code to the web.
One advantage of FastCGI is that it's inherently easy to scale horizontally and load balance because it basically forces the apps to have no state. This is one of the big benefits that PHP happens to have for that reason.
It's possible to use http/2 between the load balancer and the app server (haproxy 2.0+ supports this, caddy supports h2, but not h2c). FastCGI has been around longer, and there are probably more reverse proxies that support it on the backend than h2c (or even just h2), but I'm not sure if there are any technical advantages to using FastCGI as the protocol over http/2.
There is no difference between the two because they are the same thing, just architected differently. The networking is done by, say, nginx rather than a server that is part of the application.
fastcgi applications can "scale to 0" and cost nothing when they are not used. An application server must always have at least one open socket on a running machine.
fastcgi would probably do better over http/2 than other protocols not designed for multiplexing.
Java servlets were likely inspired by FastCGI, and WSGI (Web Server Gateway Interface, Python) is known to be inspired by both, so I doubt that's true. There were just more convenient solutions than having to write the IPC part yourself.
It's interesting to revisit technologies I couldn't comprehend when I got started in programming, way back in 1995 or so. Now I finally understand exactly what's going on.
So we've just rediscovered stateful web applications. I've been doing those since the early 90s and am still doing it now (well, among lots of other things). The difference for me is that I never went for FastCGI. I would instead embed an HTTP server in the application itself. In the beginning, when the whole concept was young, said server would face the outside world directly; now it hides behind reverse proxies like nginx.
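A minimal sketch of that approach (in Python for brevity, though the commenter's apps are C++): the app listens on localhost and nginx proxies to it.

    # Embedded HTTP server: the application process itself speaks HTTP
    # on a local port, behind a reverse proxy such as nginx.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = b'hello from the app process\n'
            self.send_response(200)
            self.send_header('Content-Type', 'text/plain')
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    # Bind to 127.0.0.1 only; the proxy faces the outside world.
    HTTPServer(('127.0.0.1', 8080), Handler).serve_forever()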
The issue with that to me is the concern of memory leaks. With PHP/FastCGI, it's not a problem because after every request, the memory is thrown away. You just need to trust that PHP-FPM itself doesn't leak.
But that said, I do use https://reactphp.org/ occasionally for certain types of tasks (like running a websocket server) and that works great too, but it takes extra care to make sure not to use blocking IO and to take more care of memory management.
Modern C++ greatly simplifies the situation with memory (unless you specifically want to play with it). My latest server application code contains exactly zero explicit de-/allocations.
It's actually less forgotten now than it was in 2002.
Apache's FastCGI support was atrocious until mod_proxy_fcgi arrived only a few years ago. This severely limited FastCGI adoption in Apache territory. But sometime in the mid-00s lighttpd came along, and then nginx. Both of them not only supported but required FastCGI to interface with PHP. So the lighttpd guys developed spawn-fcgi, which was a huge step ahead of Apache's mod_fastcgi/mod_fcgid. With the increasing popularity of nginx, PHP itself adopted FPM (FastCGI Process Manager), which went yet another step ahead of spawn-fcgi. The performance gains were unbelievable! That was sometime between 2006 and 2008 IIRC, just when PHP was starting to clean up after itself.
PHP-FPM is now the preferred way to run PHP no matter what webserver you use (perhaps with the exception of IIS). It keeps things running fast and smooth without giving up PHP's straightforward, CGI-like execution model.
FastCGI doesn't set environment variables. Only CGI does that, since the CGI binary is executed for each request. FastCGI binaries run as a server listening on a unix/tcp socket for connections from the front-end web server. It works like a webapp listening for HTTP requests on a local socket, with reverse HTTP proxy rules set up on the front-end web server.
For example, the HTTP header Proxy may be converted to "HTTP_PROXY" and some application servers may interpret it as the environment variable HTTP_PROXY (I seem to remember HHVM did it). Good servers have measures in place to handle that header, but it can bite you if you are implementing a new server.
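A sketch of the mapping in question, with the usual mitigation (dropping the Proxy header) shown; the function name is made up:

    # CGI-style header-to-variable mapping: "Proxy: ..." would become
    # HTTP_PROXY, which HTTP client libraries read as an outbound proxy
    # setting -- the "httpoxy" problem.
    def cgi_meta_vars(headers):
        env = {}
        for name, value in headers:
            key = 'HTTP_' + name.upper().replace('-', '_')
            if key == 'HTTP_PROXY':
                continue  # common mitigation: drop this header
            env[key] = value
        return env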
Yes. Both have the same issue, since it's the same thing. CGI starts one process per request; FastCGI reuses a process across requests to improve performance.
Does anyone know how FastCGI performance compares with fast VM languages that self-host, like Go, C#, Java? I remember it being fast compared to CGI with Perl... but that's a pretty low bar.