I disagree with the sentiment of this blog post, because it implies (unless I mi...

aaronem · on March 4, 2014

Sure. But then you hit a case where your URI scheme doesn't map 1:1 to the part of the filesystem your web server is looking at. Sometimes you can just hang ?id=foo off the end of the URI and be done with it. Sometimes you're doing something more complex, and an application-level URI router suddenly looks useful, precisely because it makes it easy to map arbitrarily and in detail between the things your application knows about and the URIs your client interacts with.

It's a bit much, too, to say "All frameworks are too complicated to be worth the effort because Django!" Even Django's partisans acknowledge that it's on the very high end of the complexity curve; indeed, it's the only framework I know of which requires more up front from the developer than Rails. I'm lately more and more of the opinion that, in almost all cases, most of that up-front investment is wasted; you can get 90% of the benefit from a microframework such as Ruby's Sinatra or Python's Flask, either of which can be mastered from a standing start inside a weekend, and save the suffering involved in Rails or Django for when, if ever, you actually need that last ten percent.

Basically, you want to choose the right tool for the job. If you're putting in two screws, just grab your trusty Apache-brand screwdriver and go to it. If you're putting in forty, you're better off breaking out the Flask power drill and a screwdriver bit; sure, you can put in forty screws with a hand driver, but you won't do your wrists any favors in the process. And if you're putting up a frame for a two-story house, you're really going to need something like Django's powder-actuated, magazine-fed nail gun, but for anything smaller, it'd be overkill.

twistedpair · on March 4, 2014

The purist approach is nice, and we've all been there, but eventually you'll want to pool those db connections for speed, handle security comprehensively, and do things with URLs that Apache does not do, like dynamically registering and unregistering paths that have no corresponding script file.
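To make the pooling point concrete, here is a minimal sketch of what "pool those db connections" means. It assumes SQLite from the standard library as a stand-in database, and `ConnectionPool` is a hypothetical name, not any framework's API. Real pools also deal with timeouts, broken connections, and transactions, which this leaves out.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical fixed-size pool: connections are opened once and reused."""

    def __init__(self, size, database=":memory:"):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a connection be handed to another thread.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing it


pool = ConnectionPool(size=2)
with pool.connection() as conn:
    row = conn.execute("SELECT 1").fetchone()
```

Each request borrows a connection and gives it back, so the cost of opening one is paid at startup rather than per page.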

Of course you can attempt to add those things yourself, but often the best code you'll write is the code you don't write at all because someone already did. Don't reinvent square wheels.

oneeyedpigeon · on March 4, 2014

Or you could keep your markup out of the database and db connections become a non-issue (granted, I'm talking about the 99.9% of websites that are content-centric as opposed to web apps).

gaius · on March 5, 2014

We were doing all of those things in the 1990s.

For many "developers" these days the job consists of trawling the Internet looking for a "framework" that matches their project, then tweaking it.

yeahbutbut · on March 4, 2014

I agree with the sentiment that the framework "disease" has gotten a little out of hand (it seems like we have a new one every few weeks now...). However, these same people would come up with a dozen ways of routing URLs to handler functions if they weren't given some sort of standard solution to use.

Basically, frameworks are a way to work on a team without everyone reinventing solutions to problems that are almost beneath notice when trying to get dynamic content on the web.

The reason we don't have single CGI scripts for each url, and instead route everything through a single bootup script is to centralize things like session management, database initialization, memcache, configuration... etc. Things that you could do on every page, but that would be too cumbersome, repetitive, and generally don't change often.
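As a rough sketch of that single-entry-point idea in plain WSGI: the shared setup lives in one place and every page handler receives it. The handler names, the `CONFIG` dict, and the `call_app` helper are all hypothetical stand-ins; a real app would put its session, database, and cache setup where the dict is.

```python
from wsgiref.util import setup_testing_defaults

# Shared setup runs once at startup (stand-in for config, DB, memcache clients).
CONFIG = {"site_name": "example"}


def home(environ, context):
    return f"Welcome to {context['config']['site_name']}"


def about(environ, context):
    return "About us"


HANDLERS = {"/": home, "/about": about}


def application(environ, start_response):
    """Single entry point: build the shared context, then hand off to a page handler."""
    context = {"config": CONFIG}
    handler = HANDLERS.get(environ.get("PATH_INFO", "/"))
    if handler is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found"]
    body = handler(environ, context).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [body]


def call_app(path):
    """Tiny helper that invokes the app without a server, for trying it out."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body
```

The page handlers never repeat the setup; they only see the `context` they are given.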

When working with Django, or Zend, or Cake, or whatever the framework of the day is, I've often just wanted to go back to the simplicity of plain WSGI. But then I see the mess that people still make when everything is done for them and realize that would just create a new hell.

It takes discipline to program at a lower level, and understanding. Both of those qualities are in short supply.

mattgreenrocks · on March 4, 2014

> It takes discipline to program at a lower level, and understanding.

Why don't we train developers for this instead of constantly telling them they need to learn flash-in-the-pan-framework.js?

Illotus · on March 5, 2014

Well, because when you want to get stuff done you really don't want to invent the wheel again. Why not reuse the work someone else has done before?

This is not to say that understanding the web at a low level isn't useful; it certainly is. However, in my opinion it is better to, for example, build a simple MVC framework as a learning experience and use something battle-tested in production.

mattgreenrocks · on March 5, 2014

Reuse is good. But my fear is that people learn the intricacies of Rails over SOLID Ruby. The basics are worth practicing over and over again -- the benefits accumulate much like compound interest. Whereas there's no guarantee of Rails being around in ten years.

There's a certain sort of intellectual deference that is given to these frameworks (much of it has to do with how readily people accept things said by famous/rich people). For example, is MVC really the best way to write traditional webapps?

We need more iconoclasts.

DasIch · on March 4, 2014

You don't separate the headers from the body, producing a potential vulnerability for header injection or response splitting attacks. Both are severe security problems and both are easily avoided by using a web framework.

While it may not appear to be the case, developing secure web applications is rather complicated, that's why we have frameworks and that's why these frameworks can be somewhat complex. That doesn't make not using them the simpler solution.

userbinator · on March 5, 2014

> While it may not appear to be the case, developing secure web applications is rather complicated, that's why we have frameworks and that's why these frameworks can be somewhat complex. That doesn't make not using them the simpler solution.

I disagree. The more complexity you introduce, the more code is needed, the greater the chance of bugs, and the greater the chance of those bugs not being discovered sooner.

js2 · on March 5, 2014

> You don't separate the headers from the body

Looks like it does to me. The content-type is printed with a \n and then Python implicitly adds a second newline. Or did you mean something else?

DasIch · on March 5, 2014

Hm, you're right. Response splitting and with that header injection should still be possible though, I think.

In any case having to manually make sure to print newlines in the right places and escape user input in headers correctly is insane.
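For illustration, this is roughly the check a hand-rolled script would have to do itself before printing any header that contains user input. It is only a sketch: `safe_header` and `build_response` are hypothetical names, and rejecting CR/LF covers response splitting but not every header-related problem.

```python
def safe_header(name, value):
    """Format one header line, refusing names or values that contain CR or LF."""
    if any(ch in name + value for ch in "\r\n"):
        raise ValueError("header contains a line break")
    return f"{name}: {value}\r\n"


def build_response(headers, body):
    """Join the validated header lines, the blank separator line, and the body."""
    head = "".join(safe_header(name, value) for name, value in headers)
    return head + "\r\n" + body


response = build_response([("Content-Type", "text/plain")], "hi")
```

A value such as `"/x\r\nSet-Cookie: evil=1"` raises `ValueError` here instead of silently becoming a second header.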

bmelton · on March 4, 2014

> I swear half the devs I meet lately are more concerned with javascript frameworks and rewriting their own webserver than they are about actually serving content to users.

But a good URL router allows for code separation, and to a lesser extent, separation of concerns, which directly speaks to the ability to deliver, uhhh, baked goods.

If you're just building static HTML files, then sure, your URL routing is, and probably should be handled by Apache, because each page is its own destination. In modern web development though, you have an application, and that application would likely get big and unruly if you didn't make some efforts to modularize the application into URL addressable resources.

"Handling URLs" then, is as much a part of baking a cake as sifting the flour, or measuring the baking soda.

oneeyedpigeon · on March 4, 2014

"If you're just building static HTML files, then sure, your URL routing is, and probably should be handled by Apache, because each page is its own destination. In modern web development though, you have an application"

This is part of the problem, though: most websites should just be collections of static files, but copious kool-aid swallowing has led to heavyweight frameworks being the default architecture.

bmelton · on March 4, 2014

I suppose, but that's a highly variable statement to make. Undoubtedly, there are many websites that could be reduced to static equivalents, but as I deal in web applications, that's kind of a tough pill to swallow.

With apps, even if I build a single-page app, I have to build an API to power it, and that API needs to know the difference between "customers" and "customers/customer_id", and the easiest logical way to manage that is through URL routing.
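A minimal sketch of that kind of routing, assuming numeric customer IDs. The `route` decorator, `dispatch`, and the handler names are made up for illustration and are not taken from any particular framework.

```python
import re

# Hypothetical route table: (compiled pattern, handler) pairs, checked in order.
ROUTES = []


def route(pattern):
    """Register a handler for URL paths matching the given regular expression."""
    def register(handler):
        ROUTES.append((re.compile(pattern), handler))
        return handler
    return register


@route(r"^/customers/?$")
def list_customers():
    return "all customers"


@route(r"^/customers/(?P<customer_id>\d+)/?$")
def show_customer(customer_id):
    return f"customer {customer_id}"


def dispatch(path):
    """Call the first handler whose pattern matches, passing named groups as arguments."""
    for pattern, handler in ROUTES:
        match = pattern.match(path)
        if match:
            return handler(**match.groupdict())
    return "404 Not Found"
```

With this, `dispatch("/customers")` reaches the list handler and `dispatch("/customers/42")` reaches the single-customer handler with `customer_id="42"`, without either URL corresponding to a file on disk.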

Even in static apps, URL routing is kind of a burden, unless you never link to internal pages and never grow your content beyond what you can keep in your head.

jknupp · on March 4, 2014

The goal was not to convince anyone that they needed to use a web framework. Rather, I hoped to explain to novices exactly what a web framework is and what problems it solves.

lewaldman · on March 4, 2014

Couldn't resist!!! :P

Apache? Python? Why!?

Nginx maybe?

  http {
      upstream database {
          postgres_server  127.0.0.1 dbname=test user=test password=test;
      }

      server {
          location / {
              postgres_pass   database;
              postgres_query  "SELECT * FROM cats";
              rds_json        on;
          }
      }
  }

I'm actually using it... (and planning to use for much deeper things tho).

https://github.com/FRiCKLE/ngx_postgres/

https://github.com/agentzh/rds-json-nginx-module

edit: reformatted code block

jebblue · on March 4, 2014

Because software engineering should happen with programming languages plus service engines, the service engine should not be the total replacement for all.

lewaldman · on March 4, 2014

Should not be the total replacement for all because?

Also, you could use Lua with nginx to increase the flexibility (if the lack of it was what you meant).

vertex-four · on March 5, 2014

OpenResty[0] is a web application framework that consists of nginx and a bunch of plugins, including Lua scripting.

[0] http://openresty.org/

rahimnathwani · on March 5, 2014

Why not, _if_ you are operating at such scale that the extra performance outweighs any additional development/maintenance cost/complexity?

I think this is what Taobao uses/used: https://github.com/alibaba/tengine

scarecrowbob · on March 4, 2014

"Imagine a baker who was more concerned with building ovens than baking bread."

I'd imagine that at some point, it's hard to make enough bread without considering the ovens.

blhack · on March 4, 2014

You're right.

But my point is that sometimes people get caught up in premature optimization. The baker analogy, within the context of what I'm saying, is about a baker who never bakes any bread, because they spend all of their time iterating on ovens, and not bread.

Having some super fancy bread-making-machine is totally vital...eventually, or if you're doing something very cutting-edge, or out of the ordinary.

But not when you're writing a personal blog.

scarecrowbob · on March 5, 2014

Fair enough points; there is indeed a difference between being more concerned about the ovens than the bread and all.

That said, I have worked with/for a lot of bakers who have given literally no thought to the oven (I've done a lot of cheap PHP freelancing), and I occasionally ruminate on finding a better oven and on the choices that led to my current situation in the kitchen.

There are a fair number of folks out there who bake plenty of bread, but who could do so in more sanitary conditions with easier to reproduce recipes if they were open to some different methods.

I agree that a personal blog is probably not the right point for that kind of methodology, but there are also a lot of folks who need to learn the difference between yeast and random fungi that they found by googling "free wordpress themes".

couchand · on March 4, 2014

Good point. The metaphor is much stronger than the grandparent realized: a great baker should spend some time to find the best oven, install it in their bakery, then (hopefully) never think about ovens again.

jeffbr13 · on March 4, 2014

> then (hopefully) never think about ovens again.

If "baker" denotes an individual whose job is to bake bread in a small bakery, then perhaps.

But we're more like industrial chefs, hired to "make" bread at scale (e.g. for Tesco or Walmart). As software engineers, our roles may be much more about combining, improving, or making our own ingredients, machinery, and processes.

Then, when it comes to baking the bread, we're able to produce much more with less effort, or even hand off the final production to others.

nashequilibrium · on March 4, 2014

yes, but then you hire a specialist at building ovens and tailor it to your bread.

sopooneo · on March 5, 2014

Thank you! But I'm slipping into things I don't understand, and maybe you could illuminate. I assume that for both your examples, we are using, say Apache, to listen on port 80 and route requests to files. So how do you get apache to not just deliver the file as-is, "#!/usr/bin/python" and all? Does Apache typically know to execute and return the result?

Or if my questions aren't making sense, that would be good to know as well.

myhf · on March 5, 2014

You can configure Apache to treat some files as CGI executables [1], which get invoked by their file path, receive the HTTP request through environment variables and standard input, and print the HTTP response on standard output. It's how most of "Web 1.0" was built, and it supports streaming in a way that most MVC frameworks don't.

[1] http://en.wikipedia.org/wiki/Common_Gateway_Interface
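As a rough sketch of what such a script looks like, assuming the server has been configured to execute it: the web server puts request details in environment variables such as `REQUEST_METHOD` and `QUERY_STRING`, and whatever the script writes to standard output becomes the response. The `render` function is a hypothetical helper, split out only so the logic can be tried without a server.

```python
#!/usr/bin/python
import os
import sys


def render(environ):
    """Build a minimal CGI response: header line, blank line, then the body."""
    method = environ.get("REQUEST_METHOD", "GET")
    query = environ.get("QUERY_STRING", "")
    headers = "Content-Type: text/plain\r\n"
    body = f"method={method} query={query}\n"
    return headers + "\r\n" + body


if __name__ == "__main__":
    # Under CGI the server sets these variables before running the script.
    sys.stdout.write(render(os.environ))
```

The blank line between the header and the body is what tells the server, and then the browser, where the headers end.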

collyw · on March 5, 2014

Sometimes if you haven't got things configured properly, you do return the script as text rather than the output of the script.

The old school default was to put scripts in the cgi-bin folder (wherever that had been configured to be) and make them executable. The server would then run the script and return its output.

marcosdumay · on March 5, 2014

Sorry, but I'll have to disagree. Frameworks are great. Yep, sometimes a simple one will be enough, but you'll still need one.

Yep, it's quite simple to do CGI by hand... But you'll want all your pages to look the same, so you'll need to add some template functionality to your code. Also, you'll certainly need to handle data, so add some data validation there. You'll also benefit from some better abstraction over your database, and connection pooling... and while we are talking about finite resources, you'll want to limit the number of threads Apache launches.

There is certainly more. I certainly won't remember all the troubles of reinventing that wheel, as I haven't done it since the '90s.

Illotus · on March 5, 2014

This. Even when your needs are relatively simple there is a lot of work in building a web app from scratch.

slavik81 · on March 4, 2014

I've been on the other side of an exchange like that. I'm not a web developer, so "Apache" doesn't mean anything to me. I wanted to know "What does Apache do, and how does it interact with the code you write?" but they didn't seem to understand what I was asking.

I found the article informative. I can understand what their 'simplest' example does knowing only what sockets are. However, I have no idea what CGI is. Your example is magic to me. The things the example is supposed to teach are hidden behind its abstraction.

rjbond3rd · on March 5, 2014

CGI lets any server-side script "print" to the browser. (Apache "httpd" is a modular "do everything" web server, which you can customize and fine-tune for performance -- but some say it has a steep learning curve.)

untog · on March 4, 2014

How do you handle deployments when your URL handling is in Apache? Do you deploy the Apache config at the same time as your code?

Seems messy to say the least.

eCa · on March 4, 2014

Way back in my Apache/CGI days the file system was the url handling. No need to touch httpd.conf, just ftp the files..

These days, with Mojolicious behind nginx, it's much cleaner. (But it is good to have used a less abstract setup.)

Kluny · on March 4, 2014

There's no reason to touch the Apache config to handle URLs. .htaccess files do the job just fine and they're located in the directory structure of your application. Or am I totally misunderstanding what you're talking about?

wldlyinaccurate · on March 4, 2014

We put all of our web server config in the web server config - overrides like .htaccess need to be compiled/interpreted/whatever at the beginning of each request so they can add unnecessary latency.

Deployments are still easy. Either symlink the web server config, or (in our case) deploy an RPM which copies the config to the right location. Then it's just a matter of a graceful restart.