Either the hypervisor scheduler is shit or the abstraction is costly. I reckon it's down to the reduced effective CPU cache availability and the I/O mux overhead.
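If you want to eyeball the cache-availability theory yourself, a crude pointer-chasing micro-benchmark run on bare metal and then in the guest will show where the latency cliff sits. This is just a hypothetical pure-Python sketch (interpreter overhead dominates the absolute numbers, so compare the shape of the curve between the two environments, not the values):

    import random
    import time

    def chase_ns(n_elems, steps=1_000_000):
        """Average ns per dependent load over a random cycle of n_elems slots."""
        perm = list(range(n_elems))
        random.shuffle(perm)
        nxt = [0] * n_elems
        # Link the permutation into one cycle so each load depends on the last.
        for a, b in zip(perm, perm[1:] + perm[:1]):
            nxt[a] = b
        i = 0
        t0 = time.perf_counter()
        for _ in range(steps):
            i = nxt[i]
        return (time.perf_counter() - t0) / steps * 1e9

    if __name__ == "__main__":
        for k in range(10, 23, 2):
            n = 1 << k
            print("%8d KiB working set: %6.1f ns/load" % (n * 8 // 1024, chase_ns(n)))

If the cliff moves to a noticeably smaller working set under the hypervisor, that's consistent with co-scheduled VMs eating your cache.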
This was HP kit end to end, VMware certified, Debian stable on and off ESX 4.
Second: why did you implement a large-scale VMware install without either having a testing period before sinking contract dollars and license costs into it, or at least having contract terms to opt out if their claims didn't match reality?
We did have a testing period, which was unfortunately run by a Muppet who decided the loss was acceptable. The Muppet no longer works for the organisation.
We were running heavy loads and there was nothing like a 20-30% hit. I'm not saying you didn't see one, but this isn't magic or a black box: we had a few spots where we needed to tune (e.g. configuring our virtual networking to distribute traffic across all of the host NICs), but it performed very similarly to the bare-metal benchmarks in equivalent configs.
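For what it's worth, the parity claim is easy to sanity-check: run the same dumb throughput probe in the guest and on equivalent bare metal. A minimal sketch, nothing ESX-specific about it (the port number and run length are arbitrary):

    import socket
    import sys
    import time

    PORT = 5001            # arbitrary test port
    CHUNK = 64 * 1024      # 64 KiB per send/recv
    DURATION = 10          # seconds per client run

    def server():
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("", PORT))
        srv.listen(1)
        conn, addr = srv.accept()
        total, t0 = 0, time.time()
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            total += len(data)
        secs = time.time() - t0
        print("%s: %.1f MB/s over %.1fs" % (addr[0], total / secs / 1e6, secs))

    def client(host):
        payload = b"\0" * CHUNK
        conn = socket.create_connection((host, PORT))
        end = time.time() + DURATION
        while time.time() < end:
            conn.sendall(payload)
        conn.close()

    if __name__ == "__main__":
        # usage: probe.py server | probe.py client <host>
        if sys.argv[1] == "server":
            server()
        else:
            client(sys.argv[2])

Run several clients in parallel if you want to see whether teaming actually spreads flows across the uplinks; a single TCP stream will typically hash to one NIC.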
What precisely was slower, anyway: network, local disk, SAN? Each of those has significant potential confounds for benchmarking.
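Local disk is the classic trap: the page cache can make a naive write test report RAM speed on one box and something honest on the other. A hypothetical sketch showing the gap (file path and sizes are arbitrary):

    import os
    import time

    PATH = "probe.dat"          # arbitrary test file
    CHUNK = b"\0" * (1 << 20)   # 1 MiB writes
    TOTAL_MB = 256

    def write_mb_per_s(sync_each):
        fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        t0 = time.time()
        for _ in range(TOTAL_MB):
            os.write(fd, CHUNK)
            if sync_each:
                os.fsync(fd)    # force each write to the media
        rate = TOTAL_MB / (time.time() - t0)
        os.fsync(fd)            # flush stragglers outside the timed region
        os.close(fd)
        return rate

    if __name__ == "__main__":
        print("buffered: %6.1f MB/s" % write_mb_per_s(False))
        print("fsync'd:  %6.1f MB/s" % write_mb_per_s(True))
        os.unlink(PATH)

If the buffered number comes out an order of magnitude above the fsync'd one, any benchmark that didn't flush was comparing caches, not disks; the same logic applies to SAN tests landing in an array's write-back cache.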