I have no interest in a religious war over languages either, so no worry about that. I just don't like to see Alioth's one small data point extrapolated into a trend.
The benchmark tests are toy programs that solve a small set of problems under constraints that create a known bias in favor of languages like C++. They are an objective measure of something, but that something is not language performance and compactness is real-world, non-toy programs that solve problems unlike the ones in the game.
Impressions are admittedly not the best way to gauge such things, but they're better than relying on a test that does not make any attempt to address the question at hand.
My personal heuristic is to assume Alioth is roughly right with a largish margin of error, and then look for anecdotal evidence of specific areas that the game does a particularly poor job of reflecting. For Lisps, code size appears to be a large blind spot based on everything I have seen. Lisp's ability to create mini-languages and very high-level abstractions — a large source of its brevity — is pretty much useless on the scale of the Benchmarks Game.
I don't know if you're trolling or cocksure, but no, that is not what I was talking about. I said the constraints create a bias, not the measurements themselves. For example, the performance measurements are biased against most garbage collected languages because the rules don't allow any options to fine-tune the GC's behavior (which can make a big difference). Obviously, there are no equivalent rules forbidding people from fine-tuning C++'s manual memory management.