I disagree with the sentiment of this blog post, because it implies (unless I missed something) that you need to use some complicated web framework to make the internet go.
You don't, and I've recently been encountering a LOT of very confused would-be devs who get stuck behind some tangled mess of django installation.
The "simplest" web application could look like this:
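#!/usr/bin/python
import cgi
form = cgi.FieldStorage()
# Fall back to None when no "name" parameter was submitted
name = form.getvalue("name", None)
# The blank line after the header ends the header section
print("Content-Type: text/plain\n")
print("Your name is %s" % (name,))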
Or maybe you want to get fancy:
#!/usr/bin/python
import MySQLdb
# Placeholder credentials; substitute your own
db = MySQLdb.connect(user="user", passwd="password", db="test")
c = db.cursor()
c.execute("select something from sometable")
results = c.fetchall()
print("Content-Type: text/plain\n")
# hey look, CSV data!
for line in results:
    print(",".join(str(col) for col in line))
etc. etc.
I encountered a dev the other day (I guess I'm a shitty dev, then?) who asked me how my application handled URLs.
uhh...apache?
"No, how does it handle URLs, though"
"Apache. WTF are you talking about?"
"No, like if you go to example.com/foo/bar/, how does it know what to send me?"
"...?"
I swear half the devs I meet lately are more concerned with javascript frameworks and rewriting their own webserver than they are about actually serving content to users.
Imagine a baker who was more concerned with building ovens than baking bread.
Keep in mind what your goals are when you're working: are you looking to write a javascript framework, or are you looking to write a web application? Then evaluate what path you want to take from "not having an application" to "shipping a product".
Sometimes (probably many times), you don't need to make it as complicated as you're making it, and apache will work just fine.
Sure. But then you hit a case where your URI scheme doesn't map 1:1 to the part of the filesystem your web server is looking at. Sometimes you can just hang ?id=foo off the end of the URI and be done with it. Sometimes you're doing something more complex, and an application-level URI router suddenly looks useful, precisely because it makes it easy to map arbitrarily and in detail between the things your application knows about and the URIs your client interacts with.
It's a bit much, too, to say "All frameworks are too complicated to be worth the effort because Django!" Even Django's partisans acknowledge that it's on the very high end of the complexity curve; indeed, it's the only framework I know of which requires more up front from the developer than Rails. I'm lately more and more of the opinion that, in almost all cases, most of that up-front investment is wasted; you can get 90% of the benefit from a microframework such as Ruby's Sinatra or Python's Flask, either of which can be mastered from a standing start inside a weekend, and save the suffering involved in Rails or Django for when, if ever, you actually need that last ten percent.
Basically, you want to choose the right tool for the job. If you're putting in two screws, just grab your trusty Apache-brand screwdriver and go to it. If you're putting in forty, you're better off breaking out the Flask power drill and a screwdriver bit; sure, you can put in forty screws with a hand driver, but you won't do your wrists any favors in the process. And if you're putting up a frame for a two-story house, you're really going to need something like Django's powder-actuated, magazine-fed nail gun, but for anything smaller, it'd be overkill.
The purist approach is nice, and we've all been there, but eventually you'll want to pool those db connections for speed, to handle security comprehensively, and to do things with URLs that Apache does not do, like dynamically registering/unregistering paths that have no concrete corresponding script file.
Of course you can attempt to add those things yourself, but the best code you'll often write is that code you don't write at all because someone already did. Don't reinvent square wheels.
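As a rough illustration of registering and unregistering paths that have no script file behind them, here is a minimal sketch. The Router class and its method names are made up for this example and are not taken from any particular framework.

```python
class Router:
    """Maps URL paths to handler callables, with no script file per path."""

    def __init__(self):
        self._routes = {}

    def register(self, path, handler):
        self._routes[path] = handler

    def unregister(self, path):
        # Removing an unknown path is a no-op rather than an error
        self._routes.pop(path, None)

    def dispatch(self, path):
        handler = self._routes.get(path)
        if handler is None:
            return "404 Not Found", "no handler for %s" % path
        return "200 OK", handler()


router = Router()
router.register("/foo/bar/", lambda: "hello from /foo/bar/")
```

Calling router.dispatch("/foo/bar/") runs the registered handler; after router.unregister("/foo/bar/") the same path falls through to the 404 branch. A real framework would add pattern matching, HTTP methods, and so on.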
Or you could keep your markup out of the database and db connections become a non-issue (granted, I'm talking about the 99.9% of websites that are content-centric as opposed to web apps).
I agree with the sentiment that the framework "disease" has gotten a little out of hand (it seems like we have a new one every few weeks now...). However, these same people would come up with a dozen ways of routing URLs to handler functions if they weren't given some sort of standard solution to use.
Basically, frameworks are a way to work on a team without everyone reinventing solutions to problems that are almost beneath notice when trying to get dynamic content on the web.
The reason we don't have single CGI scripts for each url, and instead route everything through a single bootup script is to centralize things like session management, database initialization, memcache, configuration... etc. Things that you could do on every page, but that would be too cumbersome, repetitive, and generally don't change often.
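A minimal sketch of that single entry point, assuming plain WSGI. CONFIG, PAGES, and the page functions are hypothetical names, and the call helper exists only to exercise the app without a server.

```python
from wsgiref.util import setup_testing_defaults

# Shared setup runs once at startup, not once per page.
CONFIG = {"site_name": "example"}  # hypothetical configuration


def home(environ, config):
    return "Welcome to %s" % config["site_name"]


def about(environ, config):
    return "About %s" % config["site_name"]


PAGES = {"/": home, "/about": about}


def application(environ, start_response):
    """Single entry point: every request passes through here."""
    page = PAGES.get(environ.get("PATH_INFO", "/"))
    if page is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    body = page(environ, CONFIG).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def call(path):
    """Drive the app without a server, for a quick check."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body
```

Session handling, database initialization, and caching would hang off the same spot as CONFIG, which is the point of routing everything through one script.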
When working with Django, or Zend, or Cake, or whatever the framework of the day is, I've often just wanted to go back to the simplicity of plain WSGI. But then I see the mess that people still make when everything is done for them and realize that would just create a new hell.
It takes discipline to program at a lower level, and understanding. Both of those qualities are in short supply.
Well, because when you want to get stuff done you really don't want to invent the wheel again. Why not reuse the work someone else has done before?
This is not to say that understanding the web at a low level isn't useful; it certainly is. However, in my opinion it is better to build, for example, a simple MVC framework as a learning experience and use something battle-tested in production.
Reuse is good. But my fear is that people learn the intricacies of Rails over SOLID Ruby. The basics are worth practicing over and over again -- the benefits accumulate much like compound interest. Whereas there's no guarantee of Rails being around in ten years.
There's a certain sort of intellectual deference that is given to these frameworks (much of it has to do with how readily people accept things said by famous/rich people). For example, is MVC really the best way to write traditional webapps?
You don't separate the headers from the body, producing a potential vulnerability for header injection or response splitting attacks. Both are severe security problems and both are easily avoided by using a web framework.
While it may not appear to be the case, developing secure web applications is rather complicated, that's why we have frameworks and that's why these frameworks can be somewhat complex. That doesn't make not using them the simpler solution.
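To make the header injection concern concrete, here is a small sketch of the kind of check a framework typically does for you. The function names are invented for this example; the idea is simply to refuse any header value that contains a line break, since that is what lets an attacker smuggle in extra headers or a second response.

```python
def safe_header_value(value):
    """Reject header values containing CR or LF."""
    if "\r" in value or "\n" in value:
        raise ValueError("line break in header value")
    return value


def build_response(headers, body):
    """Assemble header lines and body into one response string."""
    lines = ["%s: %s" % (name, safe_header_value(value)) for name, value in headers]
    # A blank line separates the header section from the body
    return "\r\n".join(lines) + "\r\n\r\n" + body
```

A value such as "/x\r\nSet-Cookie: evil=1" passed as a Location header raises ValueError here instead of being written out as two headers.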
> While it may not appear to be the case, developing secure web applications is rather complicated, that's why we have frameworks and that's why these frameworks can be somewhat complex. That doesn't make not using them the simpler solution.
I disagree. The more complexity you introduce, the more code is needed, the greater the chance of bugs, and the greater the chance of those bugs not being discovered sooner.
> I swear half the devs I meet lately are more concerned with javascript frameworks and rewriting their own webserver than they are about actually serving content to users.
But a good URL router allows for code separation, and to a lesser extent, separation of concerns, which directly speaks to the ability to deliver, uhhh, baked goods.
If you're just building static HTML files, then sure, your URL routing is, and probably should be handled by Apache, because each page is its own destination. In modern web development though, you have an application, and that application would likely get big and unruly if you didn't make some efforts to modularize the application into URL addressable resources.
"Handling URLs" then, is as much a part of baking a cake as sifting the flour, or measuring the baking soda.
"If you're just building static HTML files, then sure, your URL routing is, and probably should be handled by Apache, because each page is its own destination. In modern web development though, you have an application"
This is part of the problem, though: most websites should just be collections of static files, but copious kool-aid swallowing has led to heavyweight frameworks being the default architecture.
I suppose, but that's a highly variable statement to make. Undoubtedly, there are many websites that could be reduced to static equivalents, but as I deal in web applications, that's kind of a tough pill to swallow.
With apps, even if I build a single-page app, I have to build an API to power it, and that API needs to know the difference between "customers" and "customers/customer_id", and the easiest logical way to manage that is through URL routing.
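For instance, telling "customers" apart from "customers/customer_id" can be sketched with a small table of regular expressions. The handler names and return values below are hypothetical.

```python
import re


def list_customers():
    return "all customers"


def show_customer(customer_id):
    return "customer %s" % customer_id


# Order matters only if patterns overlap; these two do not.
ROUTES = [
    (re.compile(r"^/customers/?$"), list_customers),
    (re.compile(r"^/customers/(?P<customer_id>[^/]+)/?$"), show_customer),
]


def route(path):
    """Return the result of the first matching handler, or None."""
    for pattern, handler in ROUTES:
        match = pattern.match(path)
        if match:
            # Named groups become keyword arguments to the handler
            return handler(**match.groupdict())
    return None
```

Microframeworks wrap roughly this idea in decorators and add HTTP method handling on top.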
Even in static apps, URL routing is kind of a burden, unless you never link to internal pages and never grow your content beyond what you can keep in your head.
The goal was not to convince anyone that they needed to use a web framework. Rather, I hoped to explain to novices exactly what a web framework is and what problems it solves.
Because software engineering should happen with programming languages plus service engines; the service engine should not be a total replacement for everything else.
But my point is that sometimes people get caught up in premature optimization. The baker analogy, within the context of what I'm saying, is about a baker who never bakes any bread, because they spend all of their time iterating on ovens, and not bread.
Having some super fancy bread-making-machine is totally vital...eventually, or if you're doing something very cutting-edge, or out of the ordinary.
Fair enough points; there is indeed a difference between being more concerned about the ovens than the bread and all.
That said, I have worked with/for a lot of bakers who have given literally no thought to the oven (I've done a lot of cheap PHP freelancing), and I occasionally ruminate on finding a better oven and on the choices that led to my current situation in the kitchen.
There are a fair number of folks out there who bake plenty of bread, but who could do so in more sanitary conditions with easier to reproduce recipes if they were open to some different methods.
I agree that a personal blog is probably not the right place for that kind of methodology, but there are also a lot of folks who need to learn the difference between yeast and random fungi that they found by googling "free wordpress themes".
Good point. The metaphor is much stronger than the grandparent realized: a great baker should spend some time to find the best oven, install it in their bakery, then (hopefully) never think about ovens again.
If "baker" denotes an individual whose job is to bake bread in a small bakery, then perhaps.
But we're more like industrial chefs, hired to "make" bread at scale (e.g. for Tesco or Walmart). As software engineers, our roles may be much more about combining, improving, or making our own ingredients, machinery, and processes.
Then, when it comes to baking the bread, we're able to produce much more with less effort, or even hand off the final production to others.
Thank you! But I'm slipping into things I don't understand, and maybe you could illuminate. I assume that for both your examples, we are using, say Apache, to listen on port 80 and route requests to files. So how do you get apache to not just deliver the file as-is, "#!/usr/bin/python" and all? Does Apache typically know to execute and return the result?
Or if my questions aren't making sense, that would be good to know as well.
You can configure Apache to treat some files as CGI executables [1], which get invoked by their file path, receive the HTTP request through environment variables and standard input, and print the HTTP response to standard output. It's how most of "Web 1.0" was built, and it supports streaming in a way that most MVC frameworks don't.
Sometimes if you haven't got things configured properly, you do return the script as text rather than the output of the script.
The old-school default was to put scripts in the cgi-bin folder (wherever that had been configured to be) and make them executable. Apache would then run the script and return its output.
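To show what the cgi module in the earlier examples is hiding, here is a rough sketch of the CGI contract without it. The server puts request metadata in environment variables and the request body on standard input, and whatever the script writes to standard output goes back to the browser. The handle function and the "nobody" default are assumptions for this example, and the two calls at the bottom simulate requests rather than talking to a real server.

```python
import io
from urllib.parse import parse_qs


def handle(environ, stdin):
    """Build a CGI response from request metadata and body."""
    # GET parameters arrive in the QUERY_STRING environment variable
    params = parse_qs(environ.get("QUERY_STRING", ""))
    if environ.get("REQUEST_METHOD") == "POST":
        # POST bodies arrive on stdin; CONTENT_LENGTH says how much to read.
        # Simplification: treats the length as characters, fine for ASCII forms.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        params.update(parse_qs(stdin.read(length)))
    name = params.get("name", ["nobody"])[0]
    # Headers, a blank line, then the body
    return "Content-Type: text/plain\r\n\r\nYour name is %s\n" % name


# Simulated requests; under a real server these come from os.environ and sys.stdin.
get_response = handle({"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Ada"}, io.StringIO(""))
post_response = handle({"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "8"}, io.StringIO("name=Bob"))
```

In an actual CGI script the last step would be something like sys.stdout.write(handle(os.environ, sys.stdin)), with the file marked executable and placed where Apache is configured to run it.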
Sorry, but I'll have to disagree. Frameworks are great. Yep, sometimes a simple one will be enough, but you'll still need one.
Yep, it's quite simple to do CGI by hand... But you'll want all your pages to look the same, so you'll need to add some template functionality to your code. Also, you'll certainly need to handle data, so add some data validation there. You'll also benefit from some better abstraction over your database, and connection pooling... and while we are talking about finite resources, you'll want to limit the number of threads Apache launches.
There is certainly more. I won't remember all the troubles of reinventing that wheel, as I haven't done it since the '90s.
I've been on the other side of an exchange like that. I'm not a web developer, so "Apache" doesn't mean anything to me. I wanted to know "What does Apache do, and how does it interact with the code you write?" but they didn't seem to understand what I was asking.
I found the article informative. I can understand what their 'simplest' example does knowing only what sockets are. However, I have no idea what CGI is. Your example is magic to me. The things the example is supposed to teach are hidden behind its abstraction.
CGI lets any server-side script "print" to the browser. (Apache "httpd" is a modular "do everything" web server, which you can customize and fine-tune for performance -- but some say it has a steep learning curve.)
There's no reason to touch the Apache config to handle URLs. .htaccess files do the job just fine and they're located in the directory structure of your application. Or am I totally misunderstanding what you're talking about?
We put all of our web server config in the web server config. Overrides like .htaccess need to be read and interpreted at the beginning of each request, so they can add unnecessary latency.
Deployments are still easy. Either symlink the web server config, or (in our case) deploy an RPM which copies the config to the right location. Then it's just a matter of a graceful restart.