

Ask HN: Do people build websites in Awk? - bgilroy26

I've been reading through Classic Shell Scripting by Arnold Robbins and Nelson H. F. Beebe; it's a great book!

Seeing how full-featured the classic Unix/BSD/GNU/Linux etc. toolkit is has left me wondering why Perl, Python, and Ruby are so successful.

If there are absolutely no server-side applications made out of these classic software tools, it seems like there must be a reason why.

Are there web applications made out of Bash, sed, AWK and all that? Or, for reasons of security, complexity or something else, has putting all of these tools under one roof (first in the form of Perl, later Python and Ruby) won the day?
======
fusiongyro
I actually wrote a static site generator with Make and m4. Never used it in
production, just as a proof-of-concept:
[http://old.storytotell.org/blog/2009/07/13/how-to-manage-a-w...](http://old.storytotell.org/blog/2009/07/13/how-to-manage-a-website-destructively.html)

If the question is "why didn't Unix win?" the answers are basically
historical. The original way to create pages dynamically was CGI. The idea was
that you'd have some Unix crap on the server you wanted to expose over the
web, so you'd write up your form and the form target would be a CGI script.
The way CGI is specified, the arguments show up in argv and a bunch of
incidental stuff winds up in the environment. This is one reason why PHP's
global situation is so strange. Anyway, the request would come in, the server
would parse it, fork and exec your CGI binary passing in the data and whatever
came out got sent back to the client.
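That request flow fits in a few lines of shell. This is a hedged sketch of the CGI convention described above, not any particular server's API: the server exports variables like REQUEST_METHOD and QUERY_STRING, and whatever the script writes to stdout (headers, a blank line, then the body) goes back to the client. `cgi_respond` is a name invented for this sketch; a real CGI script would just run this code at top level.

```shell
#!/bin/sh
# Sketch of the CGI convention: request data arrives in environment
# variables; the response is headers, a blank line, then the body,
# all written to stdout.
cgi_respond() {
    printf 'Content-Type: text/plain\r\n\r\n'
    printf 'method: %s\n' "${REQUEST_METHOD:-GET}"
    printf 'query: %s\n' "${QUERY_STRING:-}"
}

cgi_respond
```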

CGI is a great piece of classic Unix engineering, because it's completely
language agnostic. Your CGI scripts could be shell, Perl, C, anything. The
cost is also classic Unix: it's expensive, due to a lot of forking and re-
doing the same work, and it's a bit of a security problem since it raises a
lot of questions about which permissions are relevant and who should be
running what. These questions led to a couple interesting solutions (suexec
and FastCGI) that were apparently too much like epicycles and not enough like
ellipses, or we'd all know about them.

mod_php and mod_perl basically killed CGI themselves by solving the
performance problem and paring the security problem back down to just the web
server itself. I suspect there's heavy inspiration from server-side includes
but I wasn't there for that part so I don't know if that's true or just sounds
reasonable. The other side of the tree, the Ruby/Python/Java side, wound up
deciding that HTTP wasn't so complex it warranted doing a lot of backflips
just to go through Apache, so they wound up creating their own servers so they
could manage state (both application and interpreter/VM) more appropriately
for their platform. This school of thought seems to have won the day, except
for PHP and Perl, which still toe the hard line on integration.

I have wondered for a while if CGI's performance penalties would really be all
that noticeable today, for lightly loaded sites. It would be interesting to
take a look into. If so, the problem is really just that one request shouldn't
map onto one command. FastCGI was an attempt to create CGIs that endured
longer than a single request. My impression is that it didn't become a big
deal because by the time it was invented CGI was already on the way out and
it's somewhat more brittle.

If there were a reasonable alternate way to represent the request that maps
well onto Unix it might be possible to revive the dream.
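On the thread's actual question, the plumbing really is simple enough for awk to do the request parsing itself. A hypothetical sketch (function name invented here), assuming a CGI-style QUERY_STRING of the form `a=1&b=2` and deliberately skipping percent-decoding:

```shell
#!/bin/sh
# Hypothetical awk-backed CGI handler: the shell speaks the CGI
# protocol on stdout, awk splits QUERY_STRING into key/value pairs.
awk_cgi() {
    printf 'Content-Type: text/plain\r\n\r\n'
    printf '%s\n' "${QUERY_STRING:-}" | awk -F'&' '{
        for (i = 1; i <= NF; i++) {      # each "key=value" field
            if (split($i, kv, "=") == 2)
                print kv[1] " = " kv[2]
        }
    }'
}
```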

~~~
chromatic
The state of the art in Perl web deployment these days is PSGI, which shows
the direct influence of Rack and WSGI. A static front-end proxy works nicely
with a backend app server. See <http://plackperl.org/> for more information.

~~~
fusiongyro
I take it you just search for the word Perl and reply whether or not you
actually have anything relevant to say.

------
patio11
I worked on a project which was indecently large/successful that ran on awk.
No details (NDA) but there, you now have confirmation that there exists a
production system on AWK. Obligatory disclaimer, quoting Chris Rock: "You can
drive a car with your feet if you have to, but that don't make it a good
idea."

There are SUBSTANTIAL advantages to switching to e.g. ruby, which gives you
all the fun of scripting languages and also a community which has already
poured thousands of man-years of effort into making stuff you can use for web
applications, rather than you spending time implementing e.g. the world's
first (and worst) oAuth/Twitter wrapper for gawk.

~~~
bgilroy26
>and also a community which has already poured thousands of man-years of
effort into making stuff you can use for web applications

I would wager a goodly amount that you've hit the nail on the head here.

It seemed at first as though bundling all of the unix environment together and
abandoning "The Unix Way" was a triumph of advertising and people's laziness.
Thinking further, it's pretty clear that mobilizing a community around a
varied basket of tools would have been nigh impossible.

It's pretty interesting how important this developer/social aspect has been to
the development of the web. Larry Wall has certainly been in on the joke all
along.

------
IvyMike
Back in the early days of the web (I'm talking 1993-1995), perl had a bunch of
huge advantages:

1) It derived a lot of syntax from the classic unix tools. So if you knew sh,
sed, awk, troff, grep, you were halfway to knowing perl.

2) The performance of fork() on many unixes was bad. Or at least, bad enough
to make you avoid starting a lot of processes. With perl, you could usually do
everything in-process.

3) Bash existed, but did not have the wide acceptance that it has today.
Remember, Linux was brand spankin' new, and we were mostly using proprietary
SunOS/Solaris/HPUX/AIX systems at the time. Those systems shipped with
/bin/sh, not /bin/bash. Heck, most sysadmins did not bother to install bash,
because what was the point of installing some non-standard, buggy, redundant
shell? (Whether this was actually true at the time is more up for debate, but
that was the attitude.)

4) Bash didn't have associative arrays until much later, I believe.

5) There were things awk just had trouble doing, or at least doing quickly.
Perl had a lot of the sysadmin-y system calls available as pure perl
functions.

6) Perl's regular expressions were second to none.

7) The comp.lang.perl community was probably the best, or at least busiest, of
the applicable newsgroups at the time.

8) The perl community was very quick to provide example CGI code, so if you
wanted to write a CGI script, you could bootstrap yourself very quickly with
perl.

If you can ever find a copy of the 1991 edition of Programming Perl, you
should pick it up--it was an amazing introduction to the language, and
explains a lot of the advantages over other scriptish languages at the time.

And of course when Perl 5 came out, the language gained another huge advantage
with the module system and pseudo-OO features.
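On point 4 above: awk's arrays have been associative (string-keyed) since the original 1977 awk, and perl exposed the same idea as hashes, while bash only gained associative arrays in version 4.0. A quick illustration of why that mattered for text-munging scripts; `word_count` is a name invented for this sketch:

```shell
# awk's associative arrays out of the box: count word frequencies
# in one pass, keyed directly by the word itself.
word_count() {
    printf '%s\n' "$1" | awk '
        { for (i = 1; i <= NF; i++) count[$i]++ }
        END { for (w in count) print w, count[w] }
    ' | sort
}

word_count 'to be or not to be'
```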

------
srl
Here you go: <http://werc.cat-v.org/>

I personally think the standard unix tools are best used for static site
generation - I wouldn't want to try handling arbitrary input with programs
that tend to have semantics incompatible in the edge cases.
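The static case really is a good fit, since the input is trusted and known ahead of time. A hypothetical minimal "generator" (function name invented) to show how little is needed: wrap each page body in a shared HTML skeleton.

```shell
#!/bin/sh
# Hypothetical minimal static page renderer: wrap a page body in a
# shared HTML skeleton. A real generator would loop over source
# files and redirect each result to its output path.
render_page() {
    title=$1
    body=$2
    printf '<html><head><title>%s</title></head><body>\n' "$title"
    printf '%s\n' "$body"
    printf '</body></html>\n'
}

render_page 'Hello' '<p>Built with nothing but sh.</p>'
```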

~~~
bgilroy26
I guess I didn't click far enough through. I saw this Werc page
yesterday and thought that it was an April Fools' post!

Thanks for helping me give the page a second look.

------
rachelbythebay
I've been known to use "fly" (a script-ish wrapper to libgd) within a shell
script running as a CGI to do super quick prototyping of dynamic graphics for
web pages. It's nothing I'd keep around for real work, though. It was just too
heavy.

One time, I had the problem of taking a huge data dump and reformatting it, so
I wrote a shell script in the traditional way: grep, cut, and things like
that. It was terribly slow. It was so slow that I managed to write, test, and
run a C program which did the same thing before the shell script could finish.
That's how slow it can be.
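For what it's worth, that kind of slowness usually comes from forking a process per input line rather than from the tools themselves; a single streaming pass with one process is typically orders of magnitude faster. A hypothetical before/after (function names invented, extracting the second comma-separated field):

```shell
#!/bin/sh
# Slow pattern: fork cut(1) once per line of input.
slow_second_field() {
    while IFS= read -r line; do
        echo "$line" | cut -d, -f2
    done
}

# Fast pattern: one awk process streams every line.
fast_second_field() {
    awk -F, '{ print $2 }'
}
```

Both produce the same output; only the fork count differs.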

------
niggler
The USACO site was originally a bash script (dunno what Kolstad uses now)

