There's now an Apache effort to produce a fully packaged, validated, and deployable stack of Hadoop components. The project is called Apache Bigtop (incubating), and its relationship to Cloudera's CDH is like the relationship between Debian and Ubuntu. We make it super easy for folks to deploy the released versions of the Bigtop distribution, either via packages: http://bit.ly/rHpybV or VMs: http://bit.ly/tBGmNt
I've run clusters with and without Cloudera, and I'd never go back to running without it. It just works, which is often something you can't say when setting up a new cluster.