
Going Colo - wyclif
http://blog.pinboard.in/2012/06/going_colo/
======
brandon
It's really cool that Maciej is so open about his setup. This is especially
true given how opinionated people are about these sorts of operations. For
example:

I'm hoping the Hurricane Electric facility he mentions is Fremont 2. The
Fremont 1 facility has been plagued with power and packet loss issues over the
last year (ask your friends with Linodes in Fremont all about it ;)

It also seems a shame to put both of the deployments so close together.
Proximity to your machines is a non-issue with a decent colo provider: his
example of driving to Vegas every time a drive fails assumes that the provider
would be unable or unwilling to sell him 20 minutes of remote hands. Spend a
little extra money up front for out-of-band access to your infrastructure and
leverage the provider for the physical stuff!

~~~
lsc
>Spend a little extra money up front for out-of-band access to your
infrastructure and leverage the provider for the physical stuff!

This /sounds/ like a good idea until the reality of trying to get someone
random to fix an actual hardware problem over the phone sets in.

It only takes one idiot pulling the wrong drive once.

(to be clear, spend the money on oob management, too. Serial consoles are
essential if you want to have a hope of knowing why that server rebooted last
night. And you need to be able to monitor your power draw; it really doesn't
cost that much more to spring for the PDU that can monitor and switch the
outlets.)

~~~
rdl
I was half thinking of setting up colo in Hawaii specifically _to_ cause
people to have to make work-covered travel plans there to "fix things". The
official justification would be to be under US law and have lower latency
access to Asia (although I think Seattle to North Asia is about the same as
HNL to North Asia, and certainly cheaper/better bandwidth).

The problem is power is even more expensive there than in SF.

------
krobertson
HE Fremont is a total ghetto. 15A per full rack, essentially no cooling. It's
just a massive open warehouse. Fremont 2 is much better than Fremont 1, but
still had one power outage when I was there.

Connectivity-wise, they are pretty good. They may be a little more DDoS-prone
sometimes, though.

I've been in several Sacramento datacenters and currently have a half rack at
Webion in Rancho Cordova. Ragingwire is freakin' expensive.

The pricing with prgmr.com looked pretty good though. SJ gets better bandwidth
pricing, and is probably just about as crappy on power.

If you're looking for a serious deployment, you generally have to look outside
CA. Great connectivity, but power is what will kill you.

~~~
lsc
if you are using serious power and are willing to go super high density (e.g.
you are willing to pay the premium for blades), BAIS in Santa Clara is not a
bad deal; north of 10kW usable for under two grand a month. Santa Clara power
prices are dramatically better than San Jose power prices.

I'm trying to move my own stuff in the direction of Santa Clara.

Of course, at the low end? I mean, if you are the $200/month customer that is
thinking about buying from me? that sort of thing probably makes less
difference.

The funny thing about he.net is that if you work out the cost per watt, they
are more expensive than most places after negotiation. (On the other hand,
he.net gives you the real price up front; most other places start 30-50% over
what the real price is.)

~~~
krobertson
Doing the quick math from what I remember HE's pricing to be, it is cheaper
per watt... at my peak I had a full rack and 4.2kW and was only actually using
half a rack. It just feels wrong to have to get 3 racks at HE and basically
have just 4 servers per rack.

HE just seems to run at weird economies of scale. Pretty much no cooling,
70-80% of the warehouse empty, and $1/Mbit (are they lower now?). If Fremont 2
gets more populated, will it inherit the same power issues? Part of me doesn't
want to know.

I've heard good things about SVTIX though... haven't toured or had anything in
there, but it seems to have a good rep.

~~~
lsc
You had 4.2kW in a rack at he.net? I didn't parse that at first. Ah, you must
have had 3 racks (or at least 3 15A circuits), as that'd be about that much
usable power.

They'd only sell me one 15A 120V circuit, which is 12A usable: 1.44kW.
Something like $500/month before bandwidth, or $347 per kW.

Compare that to $600/month for 20A 120V at SVTIX (I'm going to charge you
$650, and throw in some bandwidth if you buy from me. If you go direct with
the guy that runs the place, maybe you can do better, maybe not, but you'll
definitely spend more time negotiating with him than with me). That's 16A
usable, 1.92kW: $339 or so per kW retail, including 10Mbps of bandwidth.

(I'm not saying my prices are great; I think my prices are a reasonable
no-haggle deal. Compare this to the quote from BAIS in Santa Clara for 11.5kW:
6x 20A 120V circuits in a single 47U rack, 96A usable, for two grand; that's
$174 per kW. Of course, at that point you are paying extra for higher-density
servers, and the up-front price was rather higher.)
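
(To keep the arithmetic straight, here is a rough sketch of the math behind
those per-kW figures; it assumes the usual 80% continuous-load derating per
breaker at 120V, and just plugs in the prices quoted in this thread:)

```python
# Rough $/kW math for the quotes above; assumes 80% usable current per
# breaker at 120V. The prices themselves come from this thread.

def dollars_per_kw(monthly_price, breaker_amps, circuits=1, volts=120, derate=0.8):
    usable_kw = breaker_amps * derate * volts * circuits / 1000.0
    return monthly_price / usable_kw

print(dollars_per_kw(500, 15))                # he.net Fremont:   ~$347/kW
print(dollars_per_kw(650, 20))                # SVTIX via prgmr:  ~$339/kW (incl. bandwidth)
print(dollars_per_kw(2000, 20, circuits=6))   # BAIS Santa Clara: ~$174/kW
```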

The power at SVTIX has been solid for me, and the owner of the data center
keeps it way colder than it needs to be. Which, I guess, is good marketing?
But it costs money; 70-80F isn't going to kill a modern server.

I think that he.net Fremont is a fine temperature; 80 degree temps aren't
going to hurt your servers (much above that and you start to worry), but the
power at Fremont 1, at least, has been horrible. I have two legacy racks there
that I'm trying to get out of, and man, they've gone down three times in the
last two years, which is horrible by the standards of such things. Once, the
power surge when the power came back toasted one of my servers.

It was miserable. One of those "this is why you should get dedicated servers
rather than co-locating your own stuff" experiences. I screwed up the
configuration on my PDU, so my stuff didn't come online after the first
outage; I was paged and had to leave a party. I fixed it, and then the next
day (early, if I remember correctly, or maybe it just seemed early) the power
went out again, and this time it fried one of my servers, so I had to shuffle
in, zombie-like, and swap it out with a spare.

~~~
krobertson
Ohh no, 4.2kW in Sacramento... I was just thinking in terms of what that would
be at HE.

I must need sleep, since I just realized I calculated kW/$. :)

With my math corrected though, I'm getting about $338/kW.

------
justauser
As mentioned in the blog, "So many factors to consider!" Do your homework
early and upfront even if you're planning to colo just an Atom or Soekris box.
I can tell you that moving many racks' worth of equipment via semi-truck from
one datacenter to another is not an enjoyable experience.

Ping - Power - Pipe: Ask everything you need. If a datacenter is unwilling to
share or discuss who they are, what they do, and what they offer, then walk
away. Colocation is a buyer-beware environment.

Here's a VERY ROUGH list/braindump of questions that AT THE VERY LEAST you
need to know before placing equipment with the colocation provider/datacenter.

Tier I, II, III, or IV datacenter? Who owns the datacenter? Investors? What's
your ASN number(s)? Carrier-neutral facility? How many connectivity providers
are present in the data center? Does the data center run BGP over these
providers? IP transit providers - bandwidth pricing? Overage pricing? Straight
$/Mbps/month? Who manages your network? IP allotment? Assignment procedure?
High density for micro VMs (Heroku style)? 200 VMs per host means 200 IPs per
host. SWIP/rwhois updates on assignments? Cross-connects? Cross-connect
pricing? Charges? Local loop (last-mile connectivity) charges? BGP support? Is
our traffic on a private/separate VLAN from your other customers?

Infrastructure? -Redundancy in detail -Brands -Networking -Connection -Power.
Maintenance programs?

A/B feeds from two separate breakers or breaker panels, or better yet from two
utility power service providers, if they are truly A/B feeds? A/B diesel
generators? Additional fuel storage on site? Is a PDU included? If so, how
many outlets, and is it accessible from the web or telnet? What type of
hand-off/drop to your rack will you be provided (Ethernet)? Are there raised
floors? Are there any fire suppression systems in place? Physical/video
security? Biometrics? Screening of employees?

Insurance requirements to colocate equipment with you?

On-site visit requirements? 24x7 staffing? Remote Hands - Cost and Abilities?
What's included? Shipping/Receiving of hardware?

SAS 70 Type II or SSAE 16 certification? Audit available for review? -Common
practices -Staffing -Security -Emergencies -Escalations -Contacts. Outage
alerts? Notifications through Twitter or a similar offsite status page? RFOs
for past outages? What has been your corrective action?

Power density per rack? 80% rule or true metered? Can you rack the equipment
for us? Who provides cabling? Types of rails for servers? Billing: 95th
percentile or other? Rack size? Partitioning available? 4U/10U/quarter/half?
Cages? Lockable or baker's racks? Shared with others?

Finally, ask for a listing of ALL fees.

~~~
wmf
A perfect example of the time-wasting that he says you should avoid.

~~~
merlincorey
Maybe for 1 or 2 servers in a quarter rack; but all of these questions and
concerns are very salient if you are filling full rack(s) or cages and plan to
get your own ASN and such (as any serious internet company should, since
otherwise you are not PART OF THE INTERNET). Not getting the answers to these
questions is the difference between quite a lot of pain, heartache, and money
spent and having everything run (relatively) smoothly.

~~~
moe
I have to agree his list is a little unfortunate. It contains all the basic
questions that you _should_ ask (responsibilities/transits/power density
etc.), but then also rather specialized things (PDUs, RFOs) that are a little
over the top when you're going to rent half a rack without significant growth
prospects.

That said, the colo _should_ indeed answer all these questions. It's just that
most small shops will not be able to make sense of the answers...

If you're a small shop looking to dip a toe into colo then a good summary
question to ask is "Who else do you host?". If they have a few big-names to
share then they'll probably be good enough for you, too.

Ideally, ask for permission to ring up one or two of their reference customers
and then actually do that. Colos don't like that very much - but if they
outright refuse then you'll know you'd better look at another one.

~~~
justauser
Ideally, most of this would be on a provider's website. Unfortunately, the
NAPs of the world rely on multi-level sales staffing that presents what they
think is important (or differentiates them). The smaller colos who envision
themselves as the equivalent of the NAPs refrain from sharing much info as
well, and then the consumer ends up in a dim corner of "kinda/sorta generally
acceptable commodity (service and quality) level colocation." It is then your
burden to explain to a customer why their LOLcats website is loading slowly
for Uncle Bob. SLAs are hard and most providers treat them like a joke. If
you're a startup and want to pass the buck, then just go with AWS. There is a
solution for everyone.

~~~
lsc
Yeah, I find this incredibly irritating. Just getting the real price can take
months of back and forth. On the low end, the cost of just the negotiation can
dominate everything else.

------
rosser
I'm also currently working with Luke, of prgmr.com, to set up a new home for
my colo-ed server. (In fact, I'm getting the same 10U, 4 amp package.) He and
his team have been fantastic. I can't recommend them enough.

~~~
dholowiski
Honestly, I found that page (<http://prgmr.com/san-jose-co-location.html>)
fascinating. Way better than the linked article.

~~~
rdl
EDIT: I was wrong (based on misreading the website).

They are single-homed to Cogent (at least that's how it appears; I didn't
check a looking glass), which is fine for a hobby server, but not really good
for a startup.

~~~
lsc
nope. I'm single homed to egihosting, who is mostly he.net but also
globalcrossing and nlayer.

<http://www.datacentermap.com/as/47066_upstreams.html> (47066 is me)
<http://www.datacentermap.com/as/18779_upstreams.html> (18779 is egihosting)

Now, that's not a whole heck of a lot better from my point of view; but
egihosting (they are another co-lo provider you might want to check out) is
multihomed... so if you are the sort that thinks outsourcing means you don't
have to worry about it, I have outsourced my multihomed bandwidth to
egihosting.

The references to Cogent were about my last attempt to get a second upstream.
(The deal appears to have fallen through? We will see. I will have to write
about it. It was a long story, in which a mistake I made 7 years ago comes
back to bite me in the ass.)

I mean, yeah, the network isn't awesome, but eh, it is reasonable, I think,
for the cost. I think that if you have physical hardware, you are better off
spending the money on a second location (and for the difference between my
level and 'premium' co-location, well, you can host with me, you can duplicate
your setup with someone else in my price range, and probably have some money
left over).

Two cheap ones are usually more reliable than one good one.

(especially as this is way more of a 'market for lemons' than auto buying.
It's really hard to get true information about the reliability of various
providers.)

~~~
rdl
Oh! Sorry -- I was confused. That's a bit better. (I like nlayer in
particular). egi is pretty good.

Being single homed to a small but competent provider who is multihomed is IMO
better than being single homed to a large provider, in that the small provider
is a lot more likely to actually help you out when things go wrong.

Cogent obviously would be a nice way to get some cheap extra bandwidth, too. I
just fear depeering days :)

~~~
lsc
I agree that singlehoming to a good tier-2 provider is much better than being
singlehomed to a tier-1 of any quality; it's not so much the support I fear as
the... as you said, the depeering.

(of course, if you are looking for and willing to pay for serious bandwidth,
you want to look for three or more upstreams. And in that case, having 3 tier
1 providers is better than having 3 tier 2 providers, as you are less likely
to get route overlap, and tier 1 providers, all other things being equal,
usually have better latency.)

------
pcowans
Just a couple of observations based on experience:

\- You'll most likely run out of power well before you run out of physical
rack space.

\- You'll (hopefully) spend less time in the data centre than you think,
especially if you have good lights out access, remote power bar access etc. If
you're locating in a large city you can often get a better deal somewhere in
the suburbs rather than downtown with very little downside (at least that
seems to be the case here in London).

\- Interconnects between racks can be awkward to get set up, and you can't
necessarily get the rack next door when you want more space.

\- Redundant everything will save you a lot of hassle, but gets quite
expensive. In particular redundant power will save you from downtime when the
data centre decide they need to take one of the power feeds down for
maintenance (it happens), and setting up redundant (external) networking has
the same effect. Having a second network for admin access prevents you from
being locked out in some failure cases. Hot swappable drives are useful for
systems which are awkward to take offline (e.g. database servers), but not
massively useful elsewhere if you have good redundancy at the server level.

\- Data centre rack floors are about the most unpleasant place to work you can
imagine :-)

\- If you can get a handle on how competent the data centre are at mitigating
DoS attacks, it's a useful datapoint. If they aren't on the ball, an attack
against anyone hosted in the same place will take you offline too. Having said
that, I'm not entirely sure there's a fool-proof way to determine this ahead
of time.

\- You can generally borrow a cart with a keyboard and monitor on it to wheel
up to your rack, so no need to have those permanently installed.

\- Keep a stash of network cables, cable ties etc. at the bottom of your rack
- it's annoying to show up only to find you've forgotten that kind of thing.

\- You may need to register authorised employees before showing up, so make
sure you do that before taking your new intern to help rack a server. Expect
them to ask for ID before letting you in.

\- Data centres (at least some of them) will take deliveries on your behalf,
so there's normally no need to get things shipped to your office then drive
them out yourself.

------
nsxwolf
So, what's the consensus regarding colo vs. the cloud? I've enjoyed walking
through rows of colo'd equipment, both as a security guard and as a technology
professional. There's a lot of romance to it.

It also seems that we've entered an era where you're a fool if you colo as a
lean startup. Is there a best practice, book, really good blog post, etc. that
helps one determine when it makes sense to colo vs. just spinning up more
Heroku dynos?

~~~
rdl
IO heavy applications still don't belong in the cloud. They may make more
sense in dedicated servers vs. coloing your own hardware, though.

~~~
lsc
>IO heavy applications still don't belong in the cloud.

I agree.

>They may make more sense in dedicated servers vs. coloing your own hardware,
though.

Depends on what skills you have already, and then on your time-to-money
exchange rate. If you use a lot of compute power, you can save a lot of money
with a relatively small time investment, if you already have the skills.

~~~
rdl
Even though I like messing with hardware, if you've only got a few racks of
equipment, some kind of managed hosting often makes sense for alternate sites
just due to cost of sending staff there to mess with things. Maybe use colo at
your primary site, but managed hosting for a DR site somewhere.

~~~
lsc
I think "the cloud" is perfect for DR. Have a script that spins up your site
in the cloud, test it one day a month, and pay for it one day a month. You go
down for a day? You pay for one day in the cloud.
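
(As a very rough illustration of the shape of such a script, here's a sketch
against EC2; the AMI, instance type, and counts are placeholders, and a real
version would also need to restore data and repoint DNS:)

```python
# Hypothetical DR spin-up sketch (boto3); the AMI, instance type, and counts
# are placeholders, and data restore / DNS cutover are left out entirely.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

def spin_up_dr_site():
    resp = ec2.run_instances(
        ImageId="ami-XXXXXXXX",     # placeholder: pre-baked image of your stack
        InstanceType="m5.large",    # placeholder sizing
        MinCount=2, MaxCount=2,     # e.g. one web node, one database node
    )
    return [i["InstanceId"] for i in resp["Instances"]]

def tear_down(instance_ids):
    ec2.terminate_instances(InstanceIds=instance_ids)

# Run spin_up_dr_site() on the monthly test day (or during a real outage),
# tear_down() when you're done, and you only pay for the hours it ran.
```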

Of course, this does eat the same amount of labor as building for the cloud
/and/ building for co-lo, so you still have the time cost.

~~~
krobertson
It is great for DR if it can support your IO load, considering the cloud sucks
for IO. :)

One project moved from AWS to a managed hosting provider with gear built to
our specs mostly due to IO issues... continued to grow and then went back to
AWS looking to do an easy DR process. Had some issues just with EC2 being able
to keep up with replication at that point.

The one thing I'd like to try with AWS though is their Direct Connect service.
Basically colo in a facility near them and get a cross connect to EC2 (think
it focuses on VPC). <http://aws.amazon.com/directconnect/>

IO loads on real hardware, EC2 for all the other stuff.

~~~
lsc
Well, really, it's virtualization[1] that sucks for I/O, not "the cloud" - you
could build something very EC2-like without shared disk; in fact, I think
Amazon might.

But yeah. Active/active is a far better solution than what I describe, I mean,
modulo the extra cash it costs.

[1] Well, really, it's sharing spinning disk... the actual overhead of
virtualization is pretty low by this point, but if you have two people
accessing the same spinning disk? Each gets rather less than half the
performance they'd have gotten if they had the whole disk to themselves, due
to the physics of spinning disk and how much worse random access is than
sequential. (Two simultaneous sequential read/write streams to the same disk
can look a lot like random access, depending on queueing.)
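
(A toy model of why that happens, with made-up numbers for seek time,
sequential throughput, and chunk size:)

```python
# Toy model of two sequential streams sharing one spinning disk. The disk
# numbers (120 MB/s sequential, 8 ms per seek, 1 MB chunks) are assumptions
# for illustration, not measurements.

def per_stream_mb_s(streams, chunk_mb=1.0, seq_mb_s=120.0, seek_s=0.008):
    if streams == 1:
        return seq_mb_s                 # a lone sequential reader never seeks
    # With several streams the head ping-pongs between them, so every chunk
    # now costs a seek plus its transfer time.
    time_per_chunk = seek_s + chunk_mb / seq_mb_s
    disk_total = chunk_mb / time_per_chunk
    return disk_total / streams

print(per_stream_mb_s(1))   # 120 MB/s
print(per_stream_mb_s(2))   # ~31 MB/s each -- well under the 60 MB/s "fair half"
```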

------
kintamanimatt
The article says that you're not allowed to use the full rated capacity of a
circuit. Can anyone tell me why? This seems weird to me.

Also, is his concern about earthquakes really warranted? I'd have thought that
a DC would have been able to withstand more or less anything thrown its way.

~~~
duskwuff
Most electrical equipment, including computers, draws significantly more power
while starting up than it does at steady state. If everyone's server were
pulling the maximum rated capacity at steady state, you'd probably trip a
breaker when everything started up (for instance, if power had just been
restored after a failure, or if several machines were power cycled
simultaneously).
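
(A rough example with made-up wattages: even a rack that respects the 80% rule
at steady state can blow past the breaker if everything spins up at once:)

```python
# Illustration of the startup-headroom problem. Per-server wattages are
# made-up example numbers; the 80% continuous-load rule is the usual one.
BREAKER_W = 15 * 120           # 1800 W rated (15 A at 120 V)
USABLE_W  = BREAKER_W * 0.8    # 1440 W continuous under the 80% rule
STEADY_W  = 200                # assumed per-server steady-state draw
STARTUP_W = 350                # assumed per-server draw while fans/disks spin up

servers = int(USABLE_W // STEADY_W)   # 7 servers "fit" at steady state
print(servers * STEADY_W)             # 1400 W: fine
print(servers * STARTUP_W)            # 2450 W: trips the 1800 W breaker if all
                                      # start at once -- hence staggered power-on
```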

~~~
cagenut
It's this. You need the headroom for power-on/reboots. Blade chassis are smart
enough to stagger starting up each blade for this reason. 1Us don't know about
each other.

~~~
kika
1Us also "know" as long as you use proper PDUs (Power Distribution Units,
basically power strips with a tiny brain). PDUs will also measure the amps for
you, you can poll them for numbers, and you can powercycle servers remotely.
They're usually worth every penny.

------
caseyf
We moved from a single dedicated server to a colo 5 years ago and we've never
looked back. It's saved us a lot of money. We started with 2 lonely servers in
24U and now we've got 9 servers, a couple routers, a couple switches, and some
other stuff in there.

Here are my tips for people who are considering a colo. These are really all
about working comfortably within a confined space (a half rack or smaller).

* 10U is not that much. Don't buy space that you aren't going to need but compare prices and find out about options for expansion. Space isn't expensive but you might have trouble buying space that isn't bundled with power.

* Keep your wiring organized. Don't say "I'll fix that later".

* This goes hand in hand with the previous: If everything has redundant power, are you going to run out of outlets before you fill your space? Consider installing a metered PDU (or PDUs) before you add any machines.

* You probably don't need 2U servers. The 1Us that I currently buy have 2 CPUs, 10 x 2.5" drives (also handy for SSD) and redundant power: [http://www.supermicro.com/products/system/1U/1027/SYS-1027R-...](http://www.supermicro.com/products/system/1U/1027/SYS-1027R-N3RF.cfm)

Also, like justauser said, get a complete price list. Sucks to grow and find
out that you are going to get hosed on cross connects or extra power or
something like that.

------
sargun
So, I have a lot of opinions about datacenters.

Bandwidth - So, this seriously matters, but you shouldn't spend a lot of money
on it. The biggest things about bandwidth and connectivity are reliability,
latency, and capacity (in order from most important to least important).

Reliability: This is things like packet loss, how often your connectivity is
impaired, etc. This will have the largest UX impact. Most bandwidth providers
will have a pretty good story about this. Typically, the reliability delta
between a $15/mo provider and a $150/mo provider doesn't amount to any
realistic difference. Don't waste your money.

Latency: Although most of the applications we'll be working on won't be
sensitive to connectivity latency, it does create an upper bound on the speed
of TCP data transfer (look at the "long fat pipe" problem). All of us have
big, fat, long pipes to our houses and phones now. You can't do much about
this other than locate your datacenter well. The other thing is, you can
ensure that your transit provider has good peering relationships and transit.
I would recommend looking at the list of peers through a looking glass
(bgp.he.net is a particularly good one).
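
(The bound itself is roughly window size divided by round-trip time; a quick
sketch with example numbers, assuming an unscaled 64 KB window:)

```python
# Single-connection TCP throughput is capped at roughly window / RTT, since at
# most one receive window can be in flight per round trip. Example numbers.

def max_tcp_mbps(window_bytes, rtt_ms):
    return window_bytes * 8 / (rtt_ms / 1000.0) / 1e6

print(max_tcp_mbps(64 * 1024, 10))    # ~52 Mbps with a 64 KB window at 10 ms RTT
print(max_tcp_mbps(64 * 1024, 150))   # ~3.5 Mbps at 150 ms RTT -- the "long fat pipe"
```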

Capacity: This is the question of how much bandwidth you can soak up at any
given point. You shouldn't ever have to think about this problem; you're
probably not going to have to scale your bandwidth faster than what your
upstream already has. Nonetheless, if you're running a bandwidth-intensive
site, you should consider this, because you could potentially choke your
upstream. They will more than likely just go ahead and blackhole you in order
to preserve the rest of their revenue stream.

------
bschlinker
Interesting that the author chose to use a Kill-a-Watt for power consumption
monitoring... it seems like a better idea to use the planning applications
available from the major vendors to determine usage depending on workload,
etc. Dell and HP both have a calculator where you set the system (RAM, CPU, #
of HDs, etc.) and the load, and it will provide you with an approximate power
use, along with a peak load.

~~~
ajdecon
The calculator is useful for approximations, but I've been bitten a few times.
The only real way to know, as always, is to _measure_ it.

The Dell/HP salesmen are worse if you try to talk about power, especially if
you do HPC. They will often flat-out refuse to believe that you're going to
take 100 servers, stuff GPUs and all the RAM you can find into them, and run
them at 100% for years on end.

------
nodesocket
Surprised the author was able to get 100Mbps unmetered for such low prices.
I'd guess the bandwidth is not multi-homed and is from a single carrier.
Usually you pay for a standard 100Mbps drop, but are then billed at the 95th
percentile against the number of megabits purchased in your contract.
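
(For anyone unfamiliar with 95th-percentile billing, it's usually computed
like this; the traffic samples below are made up:)

```python
# 95th-percentile ("burstable") billing: the provider samples your port every
# 5 minutes for the month, discards the top 5% of samples, and bills the
# highest remaining value against your commit. Sample data here is made up.

def billable_mbps(samples_mbps):
    ordered = sorted(samples_mbps)
    return ordered[int(len(ordered) * 0.95) - 1]

samples = [15] * 960 + [300] * 40   # mostly ~15 Mbps, with <5% of samples bursting
print(billable_mbps(samples))       # 15 -- the bursts fall in the discarded top 5%
```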

~~~
kika
HE in Fremont is a very special place. I have about 10 racks there. They're
cheap and bandwidth is free, the support is awesome, but it's a ghetto, to put
it simply. Or a hippie camp. The racks are telecom ones, not computer ones
(which means that they're shallow and narrow; a 30"-long server prevents the
back door from closing). The power is not redundant and is only 15A per rack
tops (extra amps are ridiculously expensive).

But for a frugal startup it's perfect - cheap, good connectivity, awesome
support, easy access. Not very reliable, right, but you don't have many users,
do ya? :-)

~~~
smgoller
They just opened Phase 3 in Fremont 2. All the racks are 36" deep and there is
tons of available space. I agree that power is expensive, but the bandwidth
cost can't really be beat.

------
dredmorbius
If you're looking at Sacramento, consider the flood risk.

The entire city sits in a flood plain, and is protected by floodwalls, levees,
dikes, and flood gates.

The last major flooding of significant structures that I'm aware of was in
1956 (mostly affecting areas north and east of Sacramento State), but given
changing weather patterns and grave concerns over the delta levee system,
nothing is certain.

Most of the major datacenters are clustered along US50 (which follows the
American River), with others downtown and in West Sacramento, also within
vulnerable zones.

Probabilities of a major flooding event exceed those of a quake. Though of
course the Fremont/Hayward datacenters are directly on the Hayward fault.

------
troyk
If you're looking to Sacramento for seismic or other reasons, the place for
scrappy startups is iStreet Solutions. I'm local to Sac and have had all my
servers parked there since 2003 with only a few hiccups. If you inquire, ask
for Mark (the owner) and ask for the "Troy Deal" (I am his first customer; I
leased a cage of an otherwise empty datacenter in 2003 and spent way too many
creepy nights behind Windows servers and a KVM switch before discovering the
serenity of headless Linux). Not that he'll give you the same deal as his
first customer, but it doesn't hurt to ask!

------
jwegan
Another big question for colos is redundant power. Preferably they have two
separate lines all the way to the utility and test their generators on a
regular basis.

------
whimsy
Hey, you spelled prgmr.com wrong. (You got it right in the <a> tag though.)

------
mdemare
10 terabytes of fanfic? I thought that you could fit the Library of Congress
in less. 10TB is 20M novels of 100K words, unzipped.
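
(Rough arithmetic behind that figure, assuming plain text at about 5-6 bytes
per word:)

```python
# Sanity check on "10TB is 20M novels of 100K words": assumes plain,
# uncompressed text at roughly 5-6 bytes per word (an assumption).
TB = 10 * 10**12
WORDS_PER_NOVEL = 100_000
for bytes_per_word in (5, 6):
    print(TB // (WORDS_PER_NOVEL * bytes_per_word))   # ~20M and ~16.7M novels
```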

~~~
astine
"I thought that you could fit the Library of Congress in less."

Nope. I've worked there, they've got multiple petabytes of data, though a
major portion of that is audio and video. Even the books aren't stored as txt
files.

------
halayli
This is very similar to what I ended up doing. I bought a server, and hosted
it at he.net.

------
Chirael
Nice write-up - thanks for this!

------
papsosouid
Somewhat on-topicish, does anyone have any recommendations for colocation in
Toronto? The only recommendation I've gotten for a reliable provider has been
for Peer1, but they are insanely expensive and both their support and sales
staff are quite the opposite of helpful. For reference (since Peer1 tends to
make up "whatever we think we can get away with" pricing), I was quoted $60/Mb
for 95th percentile or $0.85/GB (which puts 1TB of data transfer at nearly
$1000!).

~~~
barkingcat
Give 151 Front a try. <http://www.151-front-street.com/>

