make all install PREFIX=$HOME/opt/mongrel2
From your commit log:
> Disable target install-py for now, it requires sudo.
Agreed: re: disabling app that needs root unnecessarily. I don't want anything fucking with my OS python.
Have you considered having it simply call pip instead, so it installs to the current virtualenv?
Any ideas on it other than disabling?
Keeping everything in one dir also makes deployment way simpler.
I'm not sure I understand the motivation for putting the code on github unless changes can flow both ways.
But, I really should just follow this so I can get the changes as an email.
Based on anecdotal evidence only, the rate at which articles move up and down the HN front page pretty much ensures you'll see everything that hits it each day if you check HN twice: once in the morning and once in the late afternoon or evening.
I don't think that's too much HN browsing to prevent you from being uber productive, as long as you're not also browsing tons of other sites on the net too.
Perhaps some examples of its benefits right in the introduction?
I haven't tried it yet. But it's certainly interesting.
Ragel parser? Um, why should I care if I'm not going to dig into the code myself?
because zedshaw is so fucking awesome? Not a compelling reason to try a web server to me, is it to you?
sqlite configuration? Maybe, but that's not super compelling to me at the moment.
despite my sarcasm, my question is sincere. I'm curious how mongrel2 could be an improvement over what is currently used, which for me is mostly apache
Any other reasons?
What Mongrel2 has over Apache is its ability to run async jssockets, HTTP long-poll style work, and regular HTTP at the same time. If you hit an application where you want to do some async socket type stuff, that's when you should look at Mongrel2.
Of course, that's not all it does, but if you already love your Apache and its crazy config file format and weirdo 1995-style syntax, then Mongrel2's only addition is the extra protocols.
I'll definitely check out the async stuff, that is compelling!
Configure your web server in any language you want (so long as it has sqlite bindings).
Serve your web pages from any back-end you want (so long as it has zeromq).
As it turns out, zeromq and sqlite are low barriers to entry for any language.
You're right though, in that you can do all of this with apache, if you're willing to write the right config generator and install (or create) mod_*. Mongrel2 just makes it easy.
edit: Also, what zed said :)
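Since the config really is just a sqlite database, the "any language" claim is easy to picture. Here's a minimal sketch using Python's stdlib sqlite3 module — note the one-table schema below is a simplified stand-in for illustration, not Mongrel2's real config schema (the real config.sqlite has more tables and columns — server, host, route, handler, and so on — so check what m2sh generates before poking at a real one):

```python
# Sketch: "configure your web server in any language you want (so long
# as it has sqlite bindings)". The schema here is a simplified
# stand-in, NOT the real Mongrel2 config schema.
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for config.sqlite
db.execute("CREATE TABLE server (name TEXT, port INTEGER)")  # simplified

# Writing config is just SQL -- no custom DSL or config parser needed.
db.execute("INSERT INTO server VALUES (?, ?)", ("main", 6767))
db.commit()

# Reading it back is just SQL too.
name, port = db.execute("SELECT name, port FROM server").fetchone()
print(name, port)  # -> main 6767
```

The same few lines work from Ruby, Tcl, C, whatever — which is the whole point of using sqlite as the config format instead of yet another bespoke config-file syntax.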
There's no way I can use it for anything remotely close to production but I have some hobby ideas baking...
In fact, there have been discussions on the httpbis list to downplay or remove HTTP pipelining, since it's ambiguous and causes performance headaches.
"In Mongrel2 we use a parser that rejects invalid requests from first basic principles using technology that's 30 years old and backed by solid mathematics."
The mind boggles... (Unless he's just being satirical here).
Personally, my advice to anyone thinking of using mongrel2 would be to write your own webserver. You'll learn far more than you ever dreamt. If a webserver isn't that important to your success/failure, use a battle-worn webserver - apache etc
IMHO a webserver is one of the things you want to be relaxed, laid back, and basically not care if the client gets things wrong. Just serve up what they look like they wanted.
The original Mongrel was known for its strict request handler, which didn't let many common security attacks through. In fact, in the Ruby world, many other non-mongrel web servers used the mongrel handler for that very reason.
A hand-written HTTP parser is kind of like writing a "black-list" of what the server rejects. Since there's no algorithm backing it, the only thing you can do is list out all the things you can think of, or have run into, that are "wrong".
Using a parser (well, lexer really) like Ragel I can make something that's relaxed, but it's more of a white-list of what it accepts. The algorithm explicitly says this particular set of characters in this grammar is all that I'll answer to.
If you then write the grammar so that it handles 99% of the requests you run into in the wild, you get the same relaxed quality as a hand written one, but it explicitly drops the 1% that are invalid or usually hacks.
This is also the same parser that powers a large number of web servers in multiple languages, so it's proven to work.
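The "white-list of what it accepts" idea can be sketched in a few lines. Here plain Python `re` is standing in for Ragel just to show the shape — this is not Mongrel2's actual grammar, and the token definitions are simplified assumptions:

```python
# Sketch of grammar-as-white-list: the pattern is built from an
# explicit grammar for an HTTP/1.x request line, so anything outside
# that grammar is rejected by construction -- no enumerating attacks.
# (Plain `re` standing in for Ragel; tokens simplified for the demo.)
import re

METHOD  = r"[A-Z]+"          # the grammar names each token explicitly...
PATH    = r"/[!-~]*"         # ...e.g. paths are visible ASCII only
VERSION = r"HTTP/1\.[01]"
REQUEST_LINE = re.compile(rf"^({METHOD}) ({PATH}) ({VERSION})\r?\n$")

def accept(line: str):
    """Return (method, path, version) if the line is in the grammar, else None."""
    m = REQUEST_LINE.match(line)
    return m.groups() if m else None

print(accept("GET /index.html HTTP/1.1\r\n"))    # in the grammar: accepted
print(accept("GET /\x00evil HTTP/1.1\r\n"))      # NUL in path: None, rejected
```

The NUL-byte request never had to be anticipated by anyone; it falls outside the grammar, so it's dropped automatically. That's the difference from a hand-rolled black-list, where someone has to have thought of NUL bytes first.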
A grammar is theoretically provable (yes, that is a double entendre). An ad-hoc implementation is not provable and exhaustively testing its validity is unrealistic for anything but trivial grammars.
Sorry but I'm even more confused now. What are we proving?
"The result is most of the requests "look like" a desire to serve up viruses or spam."
I have no idea what you mean by that.
HTTP is a trivial grammar. The parser is the simple bit. What you do with the headers and how you respond to them is the more interesting bit.
Why would rejecting invalid requests be desirable? Why not just serve up what we think they want? (Of course there's levels of 'invalid'. Reject the crazies, but allow some).
Ad-hoc parsers can be shown to accept all "OK" strings that somebody used to test the parser, and can be shown to reject all "not OK" strings that somebody used to test the parser. "The problem with idiots (and black-hats) is that they are so ingenious." The only way to prove that an ad-hoc parser is truly correct is to run all possible strings through it, complete with a priori knowledge of which strings are OK and which are to be rejected. This is an O(infinite) problem (i.e. the halting problem: http://en.wikipedia.org/wiki/Halting_problem).
Guessing intent is a wormhole: how close does the request need to be? What if you guess wrong?
The combination of ad-hoc parsers with guessing intent is a potent way to introduce security flaws in your program. In the case of a web server, the "attack surface" is the whole internet, i.e. there is a huge number of idiots and black-hats that could potentially attack your program.
War story: in a previous life, the company decided it needed a custom code-standards checker program (the result of a chain of four or five decisions, all of them really stupid, but that is a different war story). They contracted out the creation of the program, complete with a requirement that the contractor write the test cases (fox in the hen house). The program was a POS (how did you know that was coming???).
When I looked at the test cases: they had one "positive" (i.e. catches a "bad" construct) test case and NO "negative" (i.e. does not have false positive) test cases. As a result, when run on real code, the "standards checking" program was actively sabotaging good code!
The HTTP parser is simple enough to not have any concerns in itself if written properly.
You should fix the security issues.
This doesn't make sense. Why should a particular piece of the application not be coded with security in mind?
> You should fix the security issues.
One part of this is sanitizing user input. Why would you not do this as early as possible?
The place to block application specific hacky looking requests isn't in the general HTTP request parser. It's in the 'application specific' stuff.
And the purpose isn't to "block application specific hacky looking requests", it only does that as a side-effect — this isn't some inane IDS bullshit sold to PHBs. It's not looking for exploit signatures, it just sanitizes all input as a consequence of correctness.
HTTP requests come from millions of different browsers. Some with bugs, some with idiot creators, etc etc.
My point was that an HTTP request parser is trivial to write correctly. What you do with the headers and request later on are where sometimes you need to be careful.
TBH, though, I think I'm just in a different world from all of this mongrel stuff.
EDIT: I see now that it's "language agnostic" based on using ZeroMQ to shuttle request/responses between Mongrel2 and a language with a ZeroMQ library. Sounds cool, but also makes me wonder exactly how that's different from someone writing, say, mod_zeromq for Apache, and attaching handlers to ZeroMQ in the same way Mongrel2 does? Am I missing something?
In Mongrel2 it's kind of like everything's a long poll or an async socket. That lets you do a ton of very cool things you can't do easily in Apache or other web servers. Sure, you could hack them in, but it's a nightmare.
Want to write a backend in PHP for a real-time chat? That is very difficult with Apache and mod_php. With Mongrel2 and ZeroMQ, it is almost trivial.
Mongrel2's asynchronous pub/sub networking paradigm opens up possibilities for real-time communication to browsers. When websockets get real, Mongrel2 may very well be the way we all start using them.
Where Mongrel2 has some ground to cover is handling traditional server-generated pages. Want to run an existing PHP application with a framework like CodeIgniter? You can't do it from Mongrel2 without proxying to nginx or apache (yet). Hopefully this is something that will come along in Mongrel2 v2.0.
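The "chat is almost trivial" claim comes down to the handler protocol: a Mongrel2 handler talks a tiny framed protocol over ZeroMQ, and one response frame can be delivered to many browser connections at once. The framing below is my recollection of Mongrel2's wire format (a netstring of space-separated connection ids, then the body) — verify it against the manual before relying on it. Wiring it up is then just a pyzmq PULL socket for incoming requests and a PUB socket for responses:

```python
# Sketch of the Mongrel2 response framing (my recollection -- check the
# manual): "SENDER_UUID LEN:ID ID ID, BODY". Because the id list can
# name many connections, broadcasting a chat line is a single send.

def encode_broadcast(sender_uuid: str, conn_ids: list, body: str) -> str:
    """Build one frame that Mongrel2 fans out to every listed connection."""
    ids = " ".join(conn_ids)
    return f"{sender_uuid} {len(ids)}:{ids}, {body}"

# Pushing one chat message to three connected browsers is one frame:
frame = encode_broadcast("54c6755b", ["1", "2", "3"], "zed: hi all")
print(frame)  # -> 54c6755b 5:1 2 3, zed: hi all
```

With Apache + mod_php there's no equivalent primitive: each request owns one connection, so fan-out means long-running processes and hackery. Here the server owns the connections and the backend just names recipients.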
I have written several HTTP servers myself, and I know it's doable, but the world does not need another webserver that only has this distinguishing feature.
My question is if all of this can be done with Mongrel2?
The features sound cool and significantly evolved past current webservers. And the documentation is extensive and so far, an enjoyable read - now that's impressive. :)
Mongrel was written by Zed in Ruby. Mongrel2 is a new project.
/this note is me sharing what I've learned from zed's tweets, I've not played with the software yet, please correct if wrong.
Also, "does seem like a sensible choice"?
Apart from that, it's completely written in C and uses language-agnostic ZeroMQ to pass "requests" back to the handler, which can be written in any language with a ZeroMQ library (pretty much all of them).