Visibility is one part of the problem. 50k requests/min is only ~833/s. The reality is that a single dyno should be able to more than handle that sort of load, especially if it is a simple app. People are doing 10k connections on a single laptop, so 833/s should be a piece of cake. So, yes, visibility is a big issue here because you have no idea whether you need 10, 11, 12 or 20 dynos to serve 50k requests/min. You just guess, and when you guess wrong you end up with a cascading failure of H12s and other issues. Never mind that very few apps have a steady stream of traffic; most have big peaks and dips depending on the time of day and HN popularity... and now we are back to the autoscaling discussion.
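For what it's worth, here is the back-of-the-envelope math as a rough sketch. The helper names, the 10 ms and 50 ms service times, the one-request-per-dyno concurrency and the 70% utilization target are all my assumptions for illustration, not Heroku figures. It just applies Little's law (busy workers = arrival rate x mean service time).

```python
import math


def requests_per_second(requests_per_minute: float) -> float:
    """Convert a per-minute request rate to a per-second rate."""
    return requests_per_minute / 60.0


def dynos_needed(
    requests_per_minute: float,
    mean_service_seconds: float,   # assumed average time to serve one request
    concurrency_per_dyno: int = 1,  # assumed: one request at a time per dyno
    target_utilization: float = 0.7,  # assumed headroom; not a Heroku number
) -> int:
    """Rough dyno estimate from Little's law, ignoring variance entirely."""
    rps = requests_per_second(requests_per_minute)
    # Little's law: average number of requests in service at any moment.
    busy_workers = rps * mean_service_seconds
    return math.ceil(busy_workers / (concurrency_per_dyno * target_utilization))


if __name__ == "__main__":
    print(round(requests_per_second(50_000), 1))  # 833.3
    print(dynos_needed(50_000, 0.010))            # 12
    print(dynos_needed(50_000, 0.050))            # 60
```

Note that this only works on averages. It says nothing about what happens when some requests are slower than the mean, which is exactly where the guessing starts.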
Another key part of your statement is 'with very little variation'. The code pretty much can't be doing anything other than serving up static content, because as soon as a request needs any sort of IO or CPU, it throws the system into H12 hell. Yes, a CDN will take load off your Heroku dynos, because god forbid that a dyno actually do anything itself. Except that you forget that not all apps are web apps, and in my case there is no reason to add a CDN when I'm just serving requests and responses to an iPhone app.
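To make the 'very little variation' point concrete, here is a toy sketch. It assumes a single dyno that serves one request at a time in FIFO order, with requests arriving every 100 ms; the function name and every number in it are made up for illustration. One 3-second request in a stream of 50 ms requests is enough to back up everything behind it.

```python
def queue_waits(arrival_interval_ms: int, service_times_ms: list[int]) -> list[int]:
    """Return how long each request waits before a single FIFO worker starts it.

    Request i is assumed to arrive at i * arrival_interval_ms.
    """
    waits = []
    free_at = 0  # time (ms) at which the worker next becomes idle
    for i, service in enumerate(service_times_ms):
        arrival = i * arrival_interval_ms
        start = max(arrival, free_at)  # wait if the worker is still busy
        waits.append(start - arrival)
        free_at = start + service
    return waits


if __name__ == "__main__":
    # Five fast requests, one slow 3 s request, then 100 more fast ones.
    services = [50] * 5 + [3000] + [50] * 100
    waits = queue_waits(100, services)
    delayed = sum(1 for w in waits if w > 0)
    print(max(waits))  # 2900
    print(delayed)     # 58
```

With uniform 50 ms requests nobody waits at all; with one slow request, the next 58 requests are delayed, the first of them by 2.9 seconds, even though the dyno is only about 50% utilized on average.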
The other part of the problem is being able to actually do something about it. I've tried anywhere between 50 and 300 dynos (yes, we got the limit increased). If we could just throw money at the problem, that would be one thing, but nothing we tried resolved the H12s that we see, and our paid support contract was no help either.
"If you're running a blog that serves 600 rpm / 10 reqs/sec off of two dynos, you don't need to sweat it."
Once again, we are back at the same conclusion... don't use Heroku if you want to run a production system.