The algorithms differ considerably across languages, and the PHP one is particularly inefficient. The algorithms must be held constant for this "benchmark" to be valid.
As of right now this is just a programming competition across a variety of languages.
I'm imagining an alternate version of the Wizard of Oz, where Dorothy clicks her ruby keyboard and says, "All benchmarks are flawed. All benchmarks are flawed. All benchmarks are flawed."
This is not an uncommon technique for comparing programming languages. We are not just trying to compare the performance of different implementations; we are also trying to compare the performance of different idiomatic programming styles in different languages. The shootout does the same thing. And yes, the shootout is also flawed.