However, if someone chose us purely because of a routing algorithm, we probably weren't a great fit for them to begin with. We're not selling algorithms, we're selling abstraction, convenience, elasticity, and productivity.
I do see that part of the reason this topic has been very emotional (for us inside Heroku as well as our customers and the community) is that the Heroku router has historically been seen as a sort of magic black box. This matter required peeking inside that black box, which I think has created a sense of some of that magic being dispelled.
Developers don't want to pay for abstraction just for abstraction's sake, they want to pay for abstraction of difficult things. Intelligent routing is one of those difficult things. Random routing is easy, which is of course why it's also more reliable, but also why you're seeing people feeling like they didn't get what they paid for.
I should be clear; this doesn't affect me personally but I am totally sympathetic with the customers who are bent out of shape about this and I still see a divide between your response and why people are upset, and I'm trying to help you bridge that divide.
It's interesting — very few customers are actually bent out of shape about this. (A few are, for sure.) It's more non-customers who are watching from the sidelines that seem to be upset. I do want to try to explain ourselves to the community in general, and that's what this post was for. But my first loyalty is to serving our current customers well.
I'm using gunicorn with python, and if I use the sync worker the request queue easily hits 10 seconds and nothing works; if I switch to gevent or eventlet, new-relic tells me that redis is taking the same time stuck getting a value. This is using the same code in my current provider that works just fine with eventlet and scales well.
To add insult to injury, adding dynos actually degrades performance.
Can you email me at adam at heroku dot com and I'll connect you with one of our Python experts? I can't promise we'll solve it, but I'd like to take a look.
Would you be willing to pair up with one of our developers on your app's performance? If so email me (adam at heroku dot com).
Fwiw, I've found the addon / db connection limits to be the primary blocker when load testing so far.
Seems to me you don't get it. Sure there are some very vocal non-customers but you also have a lot of potential customers and users (spinning up free instances) evaluating your product and hoping for a better product. I agree that your true value is the abstraction you provide. Some of these potential customers want to ensure Heroku is as good an abstraction as promised to justify the cost and commitment.
I'd argue that we dropped the ball on that before (on web performance, at least), and are rectifying it now.
A major selling point of Heroku is that scaling dynos wouldn't be a risk. This guarantee is now gone and it's not coming back soon even if routing behavior is reverted, because users prefer good communication and trust with their providers. The responses of Heroku are blithe non-acknowledgement acknowledgements of this problem.
>A lot of Heroku's apparent value came from the intelligent routing feature. Everybody knew that it was harder to implement than random routing, that's why they were willing to pay Heroku for it.
is being addressed by Adam in this comment:
>Heroku's value proposition is that we abstract away infrastructure tasks and let you focus on your app. Keeping you from needing to deal with load balancers is one part of that. If you're worried about how routing works then you're in some way dealing with load balancing.
I think Adam is getting to a really fair point here, which is that nobody really minds whether the particular algorithm is used. If A-Company is using "intelligent routing" and B-Company uses "random routing," but B-Company has better performance and slower queue times, who are you going to choose? You're going to choose B-Company.
At the end of the day, "intelligent routing" is really nothing more than a feather in your cap. People care about performance. That's what started this whole thing - lousy performance. Better performance is what makes it go away, not "intelligent routing."
This is why it was good marketing for Heroku to advertise intelligent routing, instead of just saying 'oh it's a black box, trust us'. You need to know, at the very least, the asymptotic performance beavhior of the black box.
And that's why the change had consequences. In particular, RapGenius designed their software to fit intelligent routing. For their design the dynos needed to guarantee good near-worst-case performance increased with the square of the load, and my back-of-the-envelope math suggests the average case increases by O(n^1.5).
The original RapGenius post documents them here: http://rapgenius.com/James-somers-herokus-ugly-secret-lyrics
The alleged fix, "switch to a concurrent back-end", is hardly trivial and doesn't solve the underlying problem of maldistribution and underutilization. Maybe intelligent routing doesn't scale but 1) there are efficient non-deterministic algorithms that have the desired properties and 2) it appears the old "doesn't scale" algorithm actually worked better at scale, at least for RapGenius.
That's not the point.
As I think of it, "performance" is an observation of a specific case while intelligent/random can be used to predict performance across all cases.
"A major selling point of Heroku is that scaling wouldn't be a risk"
This is interesting, especially the word "risk." Can you expand on this?
To my point of view, the routing is "random" thus kind of unpredictable. If scaling becomes more of an issue with my business, the last thing I want is to have random scaling issues that I can not do anything about because the load balancer is queuing requests randomly to my dynos.
I want my business to be predictable and if I'm not able to have it I'm going to pack my stuff and move somewhere else.
For now, I'm happy with you except for your damn customer service. They take way too long to answer our questions!
Can you email me (adam at heroku dot com) some links to your past support tickets? I'd like to investigate.
Thanks for running your app with us. Naturally, I expect you to move elsewhere if we can't provide you the service you need. That's the beauty of contract-free SaaS and writing your app with open standards: we have to do a good job serving you, or you'll leave. I wouldn't want it any other way.
Fire-fighting during the scaling phase is a problem that every fast-growing software-as-a-service business will probably have to face. I think Heroku makes it easier, perhaps way easier; but I hope our marketing materials etc have not implied a scaling panacea. Such a thing doesn't exist and is most likely impossible, in my view.