
Github and Engineyard part ways - jcapote
http://www.engineyard.com/blog/2009/engine-yard-and-github-transition/
======
tmornini_ey
Hey all. Wanted to chime in on the discussion.

My blog post is direct and to the point. There's honestly no mystery about
this; I laid out all the details, possibly excepting one.

We designed our infrastructure offering 3 years ago. We _knew_ that the vast
majority of websites would produce nearly read-only file I/O. GFS's less than
stellar write performance isn't a problem in that typical case.

Along comes Github, and they have an entirely different disk I/O profile to
the rest of our customers. Github was built on a shared-something architecture
because GFS made it quick and easy to get up and running, developing features,
and attracting users. This is a good thing!

Github has been very vocal about their dislike of the GFS filesystem we use
for shared filesystem access. GFS doesn't scale forever, and we've never
suggested it does. It could scale far larger than it has at Github, but fizx
hit the nail on the head: we weren't willing to do it for free, and Github was
unwilling to pay our price.

We warned Github many, many moons ago that given their growth rate, in order
to scale their application smoothly and inexpensively, a shared-nothing
architecture was eventually going to be needed. We offered to help them with
that architecture, as we saw Github running wonderfully atop a cloud service
such as EC2.

Rather than do the re-architecture now, they've chosen instead to move to a
vendor who can provide them a high performance, high availability, non-
commodity, proprietary network file server infrastructure. I suspect it will
work well for them, and we'll all enjoy a faster Github. :-)

From my perspective, I really want everyone to understand that there's no bad
blood on the EY side, and hopefully none on the Github side. Business is
business, decisions need to be made each and every day.

~~~
seiji
I much prefer being talked to this way instead of hearing about "non-Ruby
repositories."

~~~
liquidcool
Personally, I like hearing about both the technical and the commercial reasons
driving a business decision. The fact that GitHub did not serve Ruby projects
exclusively (or primarily) was an important factor to EY - why should they
neglect to mention that?

------
seiji
I think they could have worded this better: "GitHub offers the largest free
storage quota among the big SCM hosters, and we came to the conclusion that we
didn’t want to subsidize that quota for non-Ruby developers."

It sounds odd that Engine Yard hosts an entire for-profit company for free
just in exchange for some complimentary accounts.

My imagination tells me GitHub is tired of being slow and Engine Yard couldn't
do anything more to help them. Instead of the news being "GitHub Leaves Engine
Yard Because It's Too Slow" the news is "Engine Yard Kicks GitHub To The Curb
Because GitHub has Non-Ruby Repos."

I don't see either headline being positive towards Engine Yard.

~~~
anonysock
Ok, as far as I can tell, the real story is that Engine Yard relies on a
GFS/SAN setup that doesn't scale in the unique way that Github needs.

If you think about it, Github is one of the few sites that actually directly
uses the filesystem heavily. Everyone else hits scaling issues on the DB
first.

    
    
      The sad thing of all of this is it's not really a matter  
      of scaling, and it never has been. Our bottleneck has 
      always been the file system. GFS just... sucks. I'm sorry, 
      but I have to say it. Case in point, your graph. The first 
      rebuild I ran timed out because of GFS. The second one ran 
      fine, took maybe a minute to process, if that. GFS impacts 
      everything... gem build failures due to cloning... GFS. 
      Network graphs taking long time to build... GFS. Caching 
      jobs not completing... GFS. I think you see where I'm 
      going here. There's no plans to deploy the new code to the 
      live servers, and I think the reason is that we're afraid 
      it'll make GFS performance worse, not better. But on the 
      new servers where we don't have to fight GFS, it's 
      amazing.

~~~
ezmobius
Funny thing is that we told github that gfs would not scale for them over a
year ago, we also outlined how to move to a shared nothing chunk server
architecture. They didn't take our advice so it's mostly their own
architecture decisions that were holding them back with regards to gfs.

Anyway, there seems to be plenty of armchair quarterbacking on this one. The
real story is that we can't afford to host them for free anymore.
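For readers unfamiliar with the term, a shared-nothing layout can be sketched roughly as follows: each repository lives on exactly one file server, and every front end works out which server that is, instead of all machines mounting one shared filesystem. This is only an illustration of the idea, not what Engine Yard proposed or what Github runs; the server names and the `server_for_repo` helper are hypothetical.

```python
import hashlib

# Hypothetical file server names; the thread does not describe a real layout.
FILE_SERVERS = ["fs1", "fs2", "fs3", "fs4"]


def server_for_repo(repo_path, servers=FILE_SERVERS):
    """Pick the one server that owns a repository's files.

    Uses a stable hash (not Python's per-process randomized hash()) so that
    every front end computes the same answer for the same repository.
    """
    digest = hashlib.sha1(repo_path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    for repo in ["alice/project.git", "bob/dotfiles.git"]:
        print(repo, "->", server_for_repo(repo))
```

Plain modulo hashing moves most repositories whenever a server is added, so a real deployment would more likely keep an explicit repository-to-server table or use consistent hashing.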

~~~
trevorturk
FWIW - thanks for hosting them for so long. GH is a wonderful service, and I'm
sure EY contributed greatly to its success.

------
stephenjudkins
When evaluating hosting providers, the quote Engineyard gave us was
tremendously expensive.

They offered a great deal of value-added services, but we found that it would
be cheaper to go with the second-most expensive option (Amazon EC2) while also
hiring a qualified full-time sysadmin.

Whether this was a wise choice or not given our experience with EC2 is up for
debate. But even if we had gone with Engineyard and had no problems at all it
would be hard to justify a couple of developers' salaries for it.

~~~
lsb
Just out of curiosity, what were your options that were cheaper than EC2, and
why'd you reject them?

~~~
stephenjudkins
There were several hosting services that were a good deal cheaper. It costs
$576/mo to run an extra large EC2 instance fulltime, while purchasing a server
with similar specs only costs $2k or so. The data center across town could
give us an equal amount of server power for much cheaper. We're in our own
data center right now, in another city, and we simply have a contractor we pay
to go in and occasionally replace broken equipment. It really is quite cheap
to do it this way.

Various hosts offered greater degrees of managed services at various prices.
None really offered the flexibility that EC2 offered, or offered that much
over simply colocating our own servers. With less traffic or fewer servers I
can see many of these hosts being a great deal.

There also were several places that gave us completely screwy quotes that I
can imagine were only meant to trick CEOs who don't know any better. One
example is a place that offered us the use of an HTTP load balancer for the
"low, low price" of $1000 a month. A lot of places offered expensive managed
services, but when talking to the people who would serve us it was clear they
didn't know what they were talking about. We did not seriously consider these
places.

To Engineyard's credit, they were the only managed host who sounded like they had
knowledge that would be really valuable to us. They also made clear that they
would devote significant resources to getting our site working smoothly. We
would have paid a premium for this service, but not the premium they asked.

EC2 offers an amazing amount of flexibility that we found highly valuable. A
sysadmin and developer spent most of a day diagnosing an issue with a DB
server that ended up being a flaky drive. In EC2, we would have simply killed
the server, fired up a new instance, waited an hour for its replication to
catch up, and returned it to the rotation.

To return to EC2 we would have to drastically reduce the load on our MySQL
instances, since we found that EBS volumes had somewhat unpredictable and not
that great performance. Reducing our reliance on MySQL is something we're
doing anyways, since on its current course it would have been difficult and
expensive to continue scaling up in any data center.

~~~
lsb
Not to second-guess you, but extra-large reserved instances going full-time
for 3 years go for $250/mo via
<http://aws.typepad.com/aws/2009/08/lower-pricing-for-amazon-ec2-reserved-instances.html>
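As a rough check on the figures quoted in this subthread, the break-even point between renting and buying can be worked out directly. The three dollar amounts come from the comments above; the `monthly_colo_cost` parameter is a hypothetical placeholder for rack space, power, bandwidth and remote hands, for which the thread gives no number.

```python
# Figures quoted in the thread; everything else is an illustrative assumption.
EC2_ON_DEMAND_PER_MONTH = 576   # extra large instance running full-time
EC2_RESERVED_PER_MONTH = 250    # 3-year reserved figure quoted by lsb
SERVER_PURCHASE_PRICE = 2000    # "only costs $2k or so"


def breakeven_months(purchase_price, monthly_rent, monthly_colo_cost=0.0):
    """Months after which buying a server is cheaper than renting one.

    monthly_colo_cost is a hypothetical per-server running cost; it defaults
    to zero only because the thread does not state one.
    """
    monthly_saving = monthly_rent - monthly_colo_cost
    if monthly_saving <= 0:
        return float("inf")  # owning never catches up with renting
    return purchase_price / monthly_saving


if __name__ == "__main__":
    for label, rent in [("on-demand", EC2_ON_DEMAND_PER_MONTH),
                        ("reserved", EC2_RESERVED_PER_MONTH)]:
        months = breakeven_months(SERVER_PURCHASE_PRICE, rent)
        print(f"{label}: break-even after {months:.1f} months")
```

With no running costs counted, the purchase pays for itself in about 3.5 months against the on-demand price and 8 months against the reserved price; any real colocation cost pushes both numbers out.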

------
calambrac
Jesus Christ, there's a lot of drama-mongering in this thread. Grow the fuck
up, it's a business decision that both sides had an interest in seeing happen.

------
davidw
> We identified the bottlenecks and supported GitHub and the community by
> making patches to ssh to allow key lookup in MySQL rather than a text file.
> That remains, to this day, one of the finest examples of Engine Yard support
> and it makes me extremely proud just thinking of it.

This seems... odd to me. It doesn't feel like the right boundary between
businesses. From a hosting provider, I expect good, steady service, reboots, a
root console, and that they'll fix anything that's on their end (hardware, for
instance).

Patching ssh is development work, and is something I would expect to pay for
to meet specific goals, but not something that comes as part of my hosting
package. I mean, what if you are just cruising along, and _don't_ need any
deep hacking over a couple of months. Is your money being wasted? With actual
developers, I could redeploy them to do other things. Can I do that with EY?

They seem like really good, sharp guys, but I don't quite get the business, I
guess.
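For context on why key lookup in a database matters at all: a flat key file has to be scanned line by line on every login, while a database can answer the same question with one indexed query. The sketch below illustrates that difference only. It is not the actual Engine Yard patch, which the thread does not describe; it uses SQLite from the Python standard library as a stand-in for MySQL, and the table, column and function names are all hypothetical.

```python
import sqlite3


def build_key_index(rows):
    """Create an in-memory table of (public_key, username) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ssh_keys ("
        "public_key TEXT PRIMARY KEY, username TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO ssh_keys VALUES (?, ?)", rows)
    return conn


def find_user_by_key(conn, public_key):
    """Indexed lookup of a single key; returns the username or None."""
    row = conn.execute(
        "SELECT username FROM ssh_keys WHERE public_key = ?", (public_key,)
    ).fetchone()
    return row[0] if row else None


def find_user_in_flat_file(lines, public_key):
    """What a flat text file forces: a linear scan over every entry."""
    for line in lines:
        stored_key, username = line.strip().rsplit(" ", 1)
        if stored_key == public_key:
            return username
    return None


if __name__ == "__main__":
    entries = [("ssh-rsa AAAA1", "alice"), ("ssh-rsa AAAA2", "bob")]
    conn = build_key_index(entries)
    print(find_user_by_key(conn, "ssh-rsa AAAA2"))
```

Both functions return the same answer; the difference is that the scan grows with the number of registered keys while the primary-key lookup stays close to constant.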

~~~
tmornini_ey
From our perspective, it was a matter of helping our partner and the Ruby
community succeed.

We're working really hard to make Ruby on Rails succeed. This is just one of
the many ways that we've pitched in to help it do so.

During that downtime, when the "Twitter's problems are Ruby on Rails" FUD was
running high, would anyone have believed that a Github scale failure wasn't a
Ruby on Rails problem? We weren't willing to test it, so we solved the
problem, i.e. we put our money where our mouth is -- that Ruby on Rails scales
just fine, and the bottlenecks are generally elsewhere.

~~~
davidw
> From our perspective, it was a matter of helping our partner and the Ruby
> community succeed.

I suppose what seems confusing to me is that there are a lot of things you
_could_ do to help your customers succeed, but many of them are fairly
expensive - such as cutting your rates, or doing high end development work.

With more basic hosting, say EC2, I know that Amazon isn't going to do beans
for me, so I'm on my own. I know exactly what the price does and doesn't
include, and what I have to provide myself. EY seems fuzzier... it's almost
like having an extra developer on staff in terms of talent, but it is and it
isn't: you can't tell that person to go off and do something else.

Say I move one of my sites to EY; is your "whatever it takes" attitude going
to include fixing up my ugly design/graphics work? :-)

~~~
nakajima
It seems to me that the patch wasn't for "GitHub, the EngineYard client", but
instead, "GitHub, the Ruby community resource", just as they support JRuby,
Rubinius, and Rails development.

If you operated a site that EngineYard saw as a valuable resource for the Ruby
community, I'm sure they'd help you out in whatever way possible as well.

~~~
davidw
What you describe is that paying customers subsidize 'community resources',
which is great for me as a mooching Rails (and github) user, but perhaps not
so great for paying customers. I don't think they'd put it that way; and
indeed I believe they must provide a lot of 'extras' for their customers for
the cost of that service. Still though, it seems that it's a service that will
be best with clients who make a lot of help requests, getting their money's
worth.

------
yesimahuman
Engine Yard and Github's business relationship aside, is anyone else sick of
everything these days being about the tool or language people are using? For
some reason we hear more from the Ruby side than any other about how great
their language is, but frankly the only thing that matters are the products
made from it. I don't know, I'm just kind of sick of Ruby fan boys.

~~~
carbon8
_"we hear more from the Ruby side than any other about how great their
language is"_

Um, that crown _unquestionably_ goes to a big chunk of the python community
right now. Reddit alone provides a steady stream of both extreme fawning over
python and vitriol targeted at ruby, and your comment and its upvotes
demonstrate that it's present here, as well.

It's also quite a colossal double standard to complain about Ruby-specific
hosts considering the existence of GAE, all the django-targeting hosting
companies <http://www.google.com/search?q=django+hosting> and all the
companies providing language-specific services for PHP, Java and every other
mainstream language.

~~~
yesimahuman
I'm not complaining about the hosts, I'm complaining about the extreme focus
we put on the language being used rather than the products being made.

~~~
carbon8
_"I'm complaining about the extreme focus we put on the language being used
rather than the products being made."_

What "extreme focus on the language"?

Simply mentioning "Ruby" in the context of a Ruby-only hosting company's
decisions does not constitute "extreme focus" any more than mentioning
"Python" or, more recently, "Java" when discussing the Google App Engine. It's
completely uncontroversial to anyone who doesn't have an axe to grind.

Also, the services provided by the companies (or as you put it, "the products
being made") are Ruby and Git hosting, with EY also maintaining several Ruby
implementations. I'm not sure how you could say anything about EY's "products
being made" without mentioning Ruby.

~~~
yesimahuman
Perhaps in the context of what services EY are providing it isn't
controversial, since they clearly just provide Ruby hosting. In general though
I feel like lately everyone is so dead set on the langs/tools they are using
that we tend to focus on the languages rather than the general CS topics
behind all of them. This was just a good time to bring up my annoyance I
guess.

------
datums
Bottom line: it's a business decision. I'm sure if staying with EngineYard was
an option they would have gone that route. Moving to EC2 is probably a lot
cheaper than other pricey, RackSpace-like solutions. It's part of growing up:
you sometimes have new friends, but never forget those you grew up with. I
wonder how many new EY customers use their github accounts?

------
mitchellh
Does anyone have any idea what host github is moving to?

~~~
pjhyett
We're moving to Rackspace.

~~~
judofyr
Cloud or dedicated/managed?

~~~
pjhyett
Real hardware and lots of it. Once we get moved over and everything's humming
along nicely, we'll do a writeup or two about the new infrastructure.

~~~
jnewland
Nice, I'm really looking forward to that. I'd love to hear about the
transition too - big moves aren't easy, as I'm sure you guys know.

------
llimllib
Silly question: the GFS referred to here is Global File System
(<http://www.redhat.com/gfs/>) not Google File System, right?

~~~
tmornini_ey
You are correct, sir.

------
jwr
I think it's rather funny that this would never be news if not for the
keywords. I mean, a customer changes a hosting provider for business reasons.
Yawn. Who cares.

But, drop in "Ruby" and "Git" and see it as front page news on HN with a
discussion going off on tangents.

------
omouse
Don't offer a free storage quota or limit how many projects you can host.
There, problem solved. The people who leave will no longer be a drain on your
servers and the people who stay will be paying for the maintenance of those
servers.

~~~
matthewcford
That would kill the majority of open source projects hosted on github, free
hosting for public repos is one of the reasons it's so great. I'm willing to
pay for private repos, but if I had to pay for my public ones too, I'd just
not host them on github.

~~~
omouse
Maybe those open source projects should charge a fee as well?

------
callmeed
Does this mean I'll no longer get a free GitHub account as an EngineYard
customer?

EDIT: N/M, found the answer.

Does this mean I can have a free GitHub account because I'm also a Rackspace
customer? :)

