
What Were CGI Scripts? - UkiahSmith
http://rickcarlino.com/2019/07/20/what-were-cgi-scripts-html.html
======
llarsson
To answer the title: what we called FaaS back in the day. And back in those
days, the suspender-wearing UNIX sysadmins said that it was just a more
restricted version of inetd.

Wonder what the next generation will call it?

And yes, I am being sarcastic. There are differences. But do recall that when
AWS Lambda came out, it had the exact same limitations w.r.t. one process per
call, needing one fresh connection to a DB per request to handle, etc.

~~~
jeremyjh
Lambda is still single threaded, right? You will only be sent one request at a
time per process. So now it's at best FastCGI.

~~~
naniwaduni
compare inetd in wait mode

------
knadh
This makes me nostalgic. Spent inordinate amounts of time writing terrible,
unmaintainable Perl scripts and FTPing them to cgi-bin on free 5MB hosting
plans on F2S and Netfirms. Most hosts wouldn't provide error logs, so there
was nothing more than a 500 page from httpd to tell you that something had
gone wrong. Debugging anything was an endeavor. Then of course, PHP came along
and killed cgi-bin.

If I had to pick one thing to represent the CGI era, it would be Matt's
FormMail [1].

[1]
[https://www.scriptarchive.com/formmail.html](https://www.scriptarchive.com/formmail.html)

~~~
deckard1
> PHP came along and killed cgi-bin

PHP was using CGI like Perl. And then PHP had Apache's mod_php, which I
believe is still widely used today.

Perl, of course, had mod_perl. And later, PSGI/Plack followed by Catalyst,
Mojolicious, Dancer, etc.

~~~
johannes1234321
mod_perl had the issue that it hooked deep into the webserver, which wasn't
"shared hosting safe". mod_php had fewer hooks and some basic cross-vhost-access
mitigations (safe_mode, open_basedir, etc.) which allowed independent users to
put PHP scripts on the same server, which in turn made web hosting cheap.

------
JJMcJ
CGI is still a reasonable choice for low query rate and simple pages, like an
AJAX responder.

Especially when there's no framework already in use on this particular
server.

Why add <framework of the day> to your setup when every 10 minutes or so, it's
time to read the temperature sensor and return a string with the reading.
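
A responder along those lines can be a dozen lines of Python; a sketch (the
sysfs path is a placeholder, real thermal-zone paths vary by machine):

```python
#!/usr/bin/env python3
# Tiny CGI "AJAX responder": read a sensor, print one string, exit.
# SENSOR_PATH is a placeholder; thermal-zone paths differ per machine.
import os

SENSOR_PATH = "/sys/class/thermal/thermal_zone0/temp"

def read_temp_c(path=SENSOR_PATH):
    """sysfs reports millidegrees Celsius; convert to degrees."""
    with open(path) as f:
        return int(f.read().strip()) / 1000.0

if __name__ == "__main__" and os.path.exists(SENSOR_PATH):
    # CGI response: headers, a blank line, then the body -- all on stdout.
    print("Content-Type: text/plain")
    print()
    print(f"{read_temp_c():.1f}")
```

No framework, no long-lived process; the OS cleans up everything when the
script exits.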

Also, since it's served by a transient process, there is no problem of memory
leaks, file handle leaks, other resource capture by long running processes.

Per Murphy's Law, of course, everything is fine until one day the internal app
goes public and suddenly there are 500 queries a second.

~~~
msla
> CGI is still a reasonable choice for low query rate and simple pages

By which you mean, "Almost every website in the Universe which isn't actively
being DDoSed" with the understanding that essentially _nobody_ can stand up to
a modern, competent DDoS.

> Per Murphy's Law, of course, everything is fine until one day the internal
> app goes public and suddenly there are 500 queries a second.

Omae wa mou shindeiru. You are already dead.

~~~
deckard1
It's humorous to think every random person on HN is working on a site that
gets more traffic than Slashdot did in 1999. A site that ran on mod_perl and,
by modern standards, fairly basic hardware.

And for the most part, CGI went on the decline the day mod_perl came out.
Which was 1996. So we really didn't use basic CGI for that long. The
popularity of putting Perl right in Apache eclipsed running standalone Perl
via CGI. It would be at least another decade before the world started to move
on from this arrangement (mod_php or mod_perl).

------
shagie
One of the bits of CGI (I wrote a lot of perl back in the day) is that parts
of it are still there, under the covers.

Looking at RFC 3875
[https://tools.ietf.org/html/rfc3875](https://tools.ietf.org/html/rfc3875) -
you see things like PATH_INFO, PATH_TRANSLATED, QUERY_STRING...

Pull up Java HttpServletRequest,
[https://javaee.github.io/javaee-spec/javadocs/javax/servlet/...](https://javaee.github.io/javaee-spec/javadocs/javax/servlet/http/HttpServletRequest.html)
and there is getPathInfo(), getPathTranslated(), and getQueryString() along
with many of the other parameters that would be familiar to someone writing a
CGI.

You can find them in C# -
[https://docs.microsoft.com/en-us/dotnet/api/system.web.httpr...](https://docs.microsoft.com/en-us/dotnet/api/system.web.httprequest.pathinfo?view=netframework-4.8)
- PathInfo, QueryString and the like.

You can find them in Haskell -
[https://hackage.haskell.org/package/happstack-server-7.5.1.1...](https://hackage.haskell.org/package/happstack-server-7.5.1.1/docs/Happstack-Server-Routing.html)
Route by pathInfo, and the QUERY_STRING is clear in
[https://hackage.haskell.org/package/happstack-server-7.5.1.1...](https://hackage.haskell.org/package/happstack-server-7.5.1.1/docs/Happstack-Server-Internal-Types.html#t:Request)

CGI scripts aren't dead... they just got better plumbing.

------
patsplat
OP does a poor job explaining what CGI scripts were.

From the RFC linked in the OP:
[https://tools.ietf.org/html/rfc3875#page-23](https://tools.ietf.org/html/rfc3875#page-23)

CGI defined an interface between a web server and an executable that would
provide a response.

- Request meta-data (i.e. path, query string, and other headers) passed as
environment variables

- Request body passed via stdin

- Response header and body passed via stdout

In this way, a webserver like Apache could provide a platform for a wide array
of languages. Yes there were security and scaling concerns, but it also was an
opportunity to rapidly release and iterate on a product.
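
That contract is small enough to sketch in a few lines of Python (the
variable names come straight from RFC 3875):

```python
#!/usr/bin/env python3
# Minimal CGI responder. Request metadata arrives in environment
# variables, the request body on stdin, and the response -- headers,
# a blank line, then the body -- goes to stdout.
import os
import sys

def respond(environ, stdin, stdout):
    method = environ.get("REQUEST_METHOD", "GET")
    query = environ.get("QUERY_STRING", "")
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = stdin.read(length) if length else ""

    stdout.write("Content-Type: text/plain\r\n\r\n")
    stdout.write(f"method={method} query={query} body_bytes={len(body)}\n")

if __name__ == "__main__":
    respond(os.environ, sys.stdin, sys.stdout)
```

Any language that can read environment variables and stdin/stdout can play;
that was the whole point.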

------
mapgrep
I’m really grateful I had to cut my teeth on cgi because it forced me to
understand the whole http request/response cycle in detail. There were
libraries to help (CGI.pm, anyone?) but they stayed down at a pretty low level
(helping parse params from an url or POST, for example). To learn how to
implement a web login form, I had to understand how cookies worked, how to set
and then read a cookie header, how to send the right headers with the
response, how to encode the cookie payload, how to store the password locally.
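
That cookie round-trip is roughly the following, sketched with Python's
stdlib (the header names are standard; the session scheme itself is made up
for illustration):

```python
#!/usr/bin/env python3
# The cookie round-trip a CGI login form had to handle by hand:
# emit a Set-Cookie header with the response, then read it back out
# of the HTTP_COOKIE environment variable on the next request.
from http import cookies

def set_session_header(session_id):
    """Build the Set-Cookie response header line."""
    c = cookies.SimpleCookie()
    c["session"] = session_id
    c["session"]["httponly"] = True  # keep it away from page scripts
    return c.output()

def get_session(environ):
    """Recover the session id from the raw Cookie header, if any."""
    c = cookies.SimpleCookie(environ.get("HTTP_COOKIE", ""))
    return c["session"].value if "session" in c else None
```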

Today, a web framework will do this grunt work for you. Which honestly, if
you’re handling passwords, is in many ways a good thing. But people are less
likely to learn the basics of how their app talks to a browser.

~~~
heavenlyblue
>> But people are less likely to learn the basics of how their app talks to a
browser.

Do they need to? Do you yourself understand every minuscule detail of radio
wave propagation before making a cell call to your friend?

~~~
Izkata
It certainly helps knowing enough to go outside if you have no signal, rather
than just thinking your phone is broken.

~~~
heavenlyblue
That's the same as knowing that "there's a thing called cookie that needs to
be 'set' for my user to be authenticated".

------
aasasd
CGI and derived protocols actually got one thing right compared to the
reverse-http-proxying of today: they pass request headers in protocol
variables, not the other way around. In CGI, no amount of accidental
misconfiguration would permit a request to overwrite custom request variables
on the way from the server to the script.

I mean, it's probably possible to configure a server to escape the original
request headers as ‘HTTP-<Header-Name>: value’ and add custom ones on top, but
I haven't seen it done, and frameworks depend on the headers being there
intact.

~~~
ademarre
That happened with the HTTP_PROXY environment variable and the Proxy request
header: [https://httpoxy.org/](https://httpoxy.org/)

Since the CGI "protocol variables" are actually environment variables, it
creates a namespace collision and an injection opportunity for environment
variables beginning with "HTTP_".
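
The mapping rule behind the collision is simple (it follows RFC 3875's
protocol-specific meta-variable naming):

```python
# RFC 3875's header-to-variable rule is what creates the collision:
# each request header becomes "HTTP_" + its name upper-cased with
# dashes turned into underscores. A client-supplied "Proxy:" header
# therefore lands in HTTP_PROXY, the very variable many HTTP client
# libraries consult for their outbound proxy setting.
def cgi_meta_variable(header_name):
    return "HTTP_" + header_name.strip().upper().replace("-", "_")
```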

~~~
aasasd
Well, putting protocol fields in environment variables, un-namespaced, is one
thing CGI hasn't gotten right. That's where FastCGI and SCGI come in.

Also, once in a while I begin thinking that PHP is a pretty nice language,
certainly doing its job and very performant compared to Python or Ruby, even
if PHP code resembles Java more and more. And I forget the numerous
questionable semantic-breaking decisions. But then, bam:

> _Warning: if PHP is running in a SAPI such as Fast CGI, this function will
> always return the value of an environment variable set by the SAPI, even if
> putenv() has been used to set a local environment variable of the same name.
> Use the local_only parameter to return the value of locally-set environment
> variables._

------
z92
Written as if it's a 25 year old tech no one uses now?

I still use cgi scripts. Though these aren't "scripts", rather compiled
binaries written in C.

That made some of the most calculation-heavy pages load in under a second,
when they were taking more than 10 seconds in PHP.

~~~
hnarn
Do you have any more information available on this? I'm curious about the use
case for using compiled binaries directly over PHP.

~~~
Bubbadoo
It sounds as though the compiled binaries optimize the compute portion that
the interpreted PHP code was taking 10 seconds to run. It's still likely
running as a CGI process, that is, as a fork/exec in a separate address space.
If you're using mod_php or one of the many PHP engines available for modern
web servers, you're already enjoying a scalability and execution-time
improvement over standard CGI. Except with compute-heavy code, apparently.

------
mperham
I built a billing site for my software business and had to decide which tech
to use. SPA? React + Rails? Elixir? Keep in mind we are talking about single
digit req/min.

I went with CGI. It has some drawbacks but consider the advantages:

    
    
      * Requires nothing but Apache running, minimizes attack surface area
      * Deploy is a simple `git pull`, no services to restart
      * No app server running 24/7 so I don't have to monitor memory usage or anything else.
    

I love it. Takes little to no maintenance because it never changes. Runs on a
$5/mo droplet.

[https://www.mikeperham.com/2015/01/05/cgi-rubys-bare-metal/](https://www.mikeperham.com/2015/01/05/cgi-rubys-bare-metal/)

~~~
laumars
I'm not about to dictate to someone that their choices are better nor worse
than anybody else's but I don't think you make a particularly convincing
argument in favour of CGI:

> _Requires nothing but Apache running, minimizes attack surface area_

Sure, but the attack surface remaining (CGI) is far less secure than pretty
much everything else out there.

> _Deploy is a simple `git pull`, no services to restart_

Same could be said for a dozen other service side frameworks.

> _No app server running 24 /7 so I don't have to monitor memory usage or
> anything else._

I mean, if you exclude the server you're talking about, then sure. But you
still have a server running 24/7 that you need to monitor so the framework
becomes somewhat irrelevant from that perspective.

---

Personally I still use CGI for personal projects I hack together. But none of
them are directly open to the internet.

~~~
chasil
The thttpd server is efficient in the extreme, can constrain itself to a
chroot(), and is able to execute CGIs.

In that environment, even the largest attack surface will have great
difficulty escaping confinement.

~~~
laumars
Apache httpd 2.4 can be chrooted (maybe 2.2 as well? Or at the very least some
distros backported that feature to 2.2) but it's not enabled by default and
can take some setup to get working properly if you're not a seasoned
sysadmin.

chroot is definitely not common amongst Apache installations, let alone common
amongst CGI usage (which likely comprises more than just Apache httpd
users).

However, one blessing is that at least the execution directory is limited to
/cgi-bin (even if software running inside cgi-bin can fork its own processes
outside of that directory).

It's also worth noting that "efficient" is a poor choice of words for CGI, be
it in the context of security or its more common usage in terms of
performance. CGI works by forking processes (which is slooooow). In fact, CGI
is ostensibly a $SHELL for HTTP (if you look at the mechanics of how shells
are written vs how CGI works). Someone elsewhere in this discussion described
CGI as a "protocol" and I wouldn't even go that far, because all the HTTP
headers and suchlike are passed to the forked process via environment
variables. In fact you could literally set the same environment variables in
Bash (for example), then run your CGI-enabled executable on the command line
and get the same output as if you hit httpd first.
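
That "run it from the shell" point is easy to demonstrate; a sketch in Python
rather than Bash (subprocess stands in for the shell, and the environment
variables are the standard CGI set):

```python
#!/usr/bin/env python3
# Invoking a CGI program by hand, no web server involved: set the
# same environment variables httpd would, feed the body on stdin,
# and read the response off stdout. Works with any CGI executable.
import subprocess

def run_cgi(program, query="", body=b""):
    env = {
        "GATEWAY_INTERFACE": "CGI/1.1",
        "SERVER_PROTOCOL": "HTTP/1.0",
        "REQUEST_METHOD": "POST" if body else "GET",
        "QUERY_STRING": query,
        "CONTENT_LENGTH": str(len(body)),
    }
    result = subprocess.run([program], input=body, env=env,
                            capture_output=True)
    return result.stdout
```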

But as I said in my first post: I don't hate CGI. Far from it; it's been a
great tool over the years and I still use it now for hacking personal stuff
together. But it's also not something I'd trust on the open internet any
longer. It's one of those cool pieces of tech that simply doesn't make sense
on the modern internet any longer (like VRML, Shockwave and Flash, the web 1.0
method of incremental page loads (I forget the name), FTP (that protocol
really needs to die!), etc.).

------
Animats
CGI scripts don't have to run with web server privileges. Nor should they.
They should be set-UID to some other user.

I still use FCGI with Go programs. FCGI launches a service process when
there's a request, but keeps it alive for a while, for later requests. It can
fire up multiple copies of the service process if there's sufficient demand.
If there are no requests for a while, the service processes are told to exit.
Until you get big enough to need multiple machines and load balancers, that's
enough.

~~~
zonidjan
Set-UID (by itself, at least) is _not_ a feature to drop privileges. On Linux,
you _must_ also use setresuid to set all IDs to the EUID. And then you must
hope that you were able to execute setresuid before any vulnerabilities can be
triggered.

You should tell your web server to run CGI processes as a different user
instead (e.g. suexec).

------
genmon
Python 3.8 is expected to include PEP 594 "Removing dead batteries from the
standard library"... one of the modules scheduled to be deprecated is cgi.

[https://www.python.org/dev/peps/pep-0594/](https://www.python.org/dev/peps/pep-0594/)

As out of date as the module is, reading the PEP made me nostalgic for the
days of hammering out a quick CGI script, and I've probably got a few of those
scripts still chugging away.

~~~
nandhp
PEP 594, actually:
[https://www.python.org/dev/peps/pep-0594/](https://www.python.org/dev/peps/pep-0594/)

~~~
genmon
Thanks — fixed

------
znpy
It's worth noting that the RHCE (Red Hat Certified Engineer) exam used to
require candidates to at least be able to write a simple CGI script and make
it available via Apache.

I haven't checked how this changed in the last RHCE update, but still.

I had the opportunity/necessity to write a couple of CGI scripts. While it's
not comfortable at all, the nice thing is that the HTTP server makes no
assumptions about what language/runtime you're using. Literally any
out-of-the-box Apache will be able to run CGI scripts, no matter what language
you used to write them.

If for any reason you cannot install other runtimes, you can still use CGI.

Whether you should, that's a different matter.

------
NelsonMinar
I still do a fair amount of CGI; it's just fine for low traffic simple web
services. Startup time for the script isn't awesome, to fix that you want FCGI
or SCGI or whatever. Apache's default MPM these days is event (or worker)
which scales CGI pretty well. Beware that it's not thread safe, so if your CGI
script is doing something multithreaded it will break. (Related: if you're
doing something multithreaded, it's time to graduate from CGI).

~~~
covener
Multiple CGIs under worker and event get their own processes; they don't run
in the same process as Apache nor each other.

------
rcdwealth
The author speaks in the past tense, showing a lack of knowledge. CGI programs
are in use; nothing has "passed". All major web servers have CGI support, and
it sees plenty of use in web applications. It is a standard protocol that is
not going to go away.

------
avip
CGI is still "widely" (I don't know how to quantify) used in web interfaces
for IoT devices that did not drink the Lua kool-aid. That's _a lot_ of CGI out
there. Not dead yet.

------
__david__
One of my most popular sites was CGI until 2014. Now it's just static files,
rendered from the same cgi program. And the reason it changed wasn't related
to performance.

------
qubyte
I cut my web-teeth on CGI. I was a TA instructing physics undergrads in C (I
think around 2007). I only really knew C and python myself, and was tasked
with building a page which students could upload programs and results to. It
was the perfect abstraction given the tools I had and my primitive
understanding of HTTP at the time.

It took a few years to realise I was hooked, and while I don't use CGI
nowadays, I'm really glad I started out with it.

------
chriswarbo
I'd like to point out that CGI (and its descendents) can be used with any
language, not just C (e.g. there are various false dichotomies in these
comments between PHP vs CGI+C).

There are also many fast, compiled languages which, unlike C, are memory-safe,
make string handling easier, are higher-level, have stronger types, etc. In
particular, ML-family languages like Rust/Ocaml/ReasonML/StandardML/Haskell
are really good as meta-languages for safely generating other languages like
HTML (that's what the "ML" stands for ;) )

It would be a shame if an inexperienced developer got the impression that
their pages would load faster if they taught themselves C. Yes, it's possible
to write reasonably-safe C; no, an Internet-facing CGI application isn't a
good idea for a first C project.

------
FlorianRappl
Has anyone else noticed that the page is constantly making requests to log the
user's behavior, e.g., on scroll? Also some of these requests (always?) fail.

It was annoying, as the page jumped around on these requests and the icon
indicated loading activity. If you have to do it, do it in the background.

~~~
astura
This weirdness also polluted my history with the page at different scroll
intervals. I had to click back about 20 times to get back to HN because every
scroll action was in my history.

------
pmontra
> since the server was directly executing the script, security issues could
> easily creep in (the script shares the permissions of the HTTP server).

This is still the case. PHP often runs as www-data, Rails or Django or Node or
whatever often run as a normal user (usual guess, Ubuntu user id 1000) with
read/write access to all the files in that user home directory. Running in a
container gives some isolation now.

Anyway, writing my first CGI script in C back in 1994 was quite a hell (not a
very convenient language for string processing), then Perl and CGI.pm got the
upper hand for a while.

------
bin0
What _were_ CGI scripts? Not sure that people have stopped using them. Old
things don't always need replacing; sometimes they still work fine (though
you have to watch your security).

------
miohtama
How do CGI scripts differ from Amazon Lambda and other serverless solutions?

~~~
singlow
For CGI you have to maintain a server running Apache or some other web server
software and persistent storage for a copy of your script along with a means
of deploying them. You are limited to concurrency equal to the number of
cores/threads and memory on your server or you have to manage some means of
load balancing between multiple servers which you maintain along with
deploying your script to each of them.

Certainly you could run a cluster of servers or even an auto-scale group of
servers with all of your cgi programs and handle lots of http requests.
Serverless/Lambda means you don't have to do this.

Also, not all lambdas are handling HTTP requests, so you'll need to figure
out the best way to deal with other events, monitor queues, etc.

~~~
throwaway2048
The real difference is amazon is doing it for you, not that it isn't
happening.

~~~
singlow
I believe I framed my entire response from the perspective of how it is
different for you. That is the whole point of Lambda. You do less.

------
klingonopera
Acting like CGI is dead was the reason I (feel like I) wasted three years with
PHP. It's not, and if the thought of "programming" webpages in C/C++ sounds
appealing to you, you should definitely check it out.

The obvious advantages: Total (binary) control of data streams and structures
and execution as well as (nigh-)zero runtime overhead.

The obvious disadvantages: Static and rigid, not well suited for rapidly
changing requirements (unless they are coded in), longer development times and
increased complexity.

------
dehrmann
Isn't FastCGI (ok, I guess it's not really true CGI) still the canonical way
to set up PHP with a webserver?

~~~
liveoneggs
FastCGI is a binary (vs. text), enhanced version of the CGI protocol, so it
is less expensive to parse. CGI (the protocol) doesn't specify
execute-per-request; that's just a convention.

In the wild you will find php-cgi, php-cgid, mod_php, and php-fpm (fastcgi)
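
The cheap-to-parse point comes from FastCGI's fixed 8-byte binary record
header; a sketch of decoding one (field layout per the FastCGI 1.0
specification):

```python
# FastCGI's fixed 8-byte binary record header is the part that's cheap
# to parse, compared with CGI/HTTP's free-form text headers.
# Layout: version, type, request id (2 bytes), content length (2 bytes),
# padding length, reserved -- all big-endian.
import struct

def parse_record_header(data):
    version, rtype, request_id, content_len, padding_len, _reserved = \
        struct.unpack("!BBHHBB", data[:8])
    return {"version": version, "type": rtype, "request_id": request_id,
            "content_length": content_len, "padding_length": padding_len}
```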

------
gscott
Find the path to Pearl and the magic will be guaranteed. I installed a lot of
Perl cgi scripts.

------
edf13
Used to do mine in Perl... a tangled web of Perl scripts. They worked at the
time!

~~~
xrd
Remember Lincoln Stein and CGI.pm?

[https://en.wikipedia.org/wiki/Lincoln_Stein](https://en.wikipedia.org/wiki/Lincoln_Stein)

I wrote a Perl module (Smil.pm:
[https://metacpan.org/pod/Smil](https://metacpan.org/pod/Smil)) that mimics
his, wrote him about it, and it was thrilling when he responded.

------
samstave
This makes me ask the question: How much knowledge are we going to forget in
the next 50 years...

---

We have forgotten so much. But our mental velocity has been so vast in the
last ~100 years... we are going to forget key knowledge soon.

------
Blackthorn
I wonder -- are there benchmarks out there of fastcgi vs reverse proxy via
http server for various languages? I'd expect that fastcgi would be faster,
but everyone uses reverse proxying nowadays.

------
haolez
What would be the main benefits of today's serverless against classic CGI? I
can think of auto scaling, but I'm not sure it wasn't possible (or even used?)
in the past.

------
dev_dull
What’s old is new again! Many of the benefits (and drawbacks) live on in
WebAssembly and serverless execution.

------
torstenvl
"Were?"

CGI is still the best choice for this. It's easier to guarantee your server's
Perl installation has all the latest security patches than it is to guarantee
your users' browsers all do. AJAXy nonsense is mostly too error-prone, too
insecure, and too slow.

------
jedberg
I miss the days when I had shell on a unix box and could just drop a .pl file
into my ~public_web directory and have random people on the internet run the
script with the permission of the webserver user. Back before there was any
concern with doing such a thing.

------
purplezooey
crap I feel old now

------
Lowkeyloki
s/Were/Are/

There, I fixed it for you.

------
saagarjha
I wonder if we've reached the point where many of the people who would have
been using CGI scripts (if they had not been succeeded by newer technologies
to provide dynamic functionality) no longer know what it is.

~~~
tuvistavie
I am pretty sure we reached this point quite a few years ago. From what I
know, most people learning web programming nowadays start with Rails, Express
JS or whatever framework of the day they have been recommended, and use the
integrated HTTP server for development, I would assume, often without even
really worrying about what is actually happening. CGI scripts are definitely
not on the list of things to learn there.

------
nivexous
CGI scripts sound like a very simple idea when described here, but I avoided
learning about them at the time because the name was so arcane. Is there a
good reason why these weren’t simply called “executable pages”, or “response
programs”, etc.?

~~~
goatinaboat
Because Perl scripts were only one application of CGI, which was intended to
be a generic interface between the web and any sort of backend. The thing that
became Oracle Web Application Server for example started life as a direct
interface between a web server and an Oracle database called WOW for Web-
Oracle-Web. There were others. Pretty soon the “application server” took over
that role as a thing in its own right, and instead of a Perl script directly
invoked by the web server you would have a JSP or something.

