
Network Update: Multihomed, Increased Transit, Peering - stenius
https://blog.linode.com/2016/11/02/network-update-multihomed-increased-transit-peering/
======
pjungwir
> per-customer VLANs

I am looking forward to that! Linode is my go-to hosting service, but it's a
little troubling that anyone in the datacenter can hit your private IPs [1].
On the other hand, maybe it shouldn't matter, and you should always act like
the network is compromised. Isn't trusting their private network how Google
leaked traffic to the NSA? Still, it seems like a nice improvement that would
make compromises less likely.

[1] [https://blog.linode.com/2008/03/14/private-back-end-network-support/](https://blog.linode.com/2008/03/14/private-back-end-network-support/)

~~~
misframer
> _On the other hand, maybe it shouldn't matter, and you should always act
> like the network is compromised._

What about cases like AWS's VPCs?

~~~
vgt
To add color to both your comment and the parent's:

Everything at Google Cloud is encrypted at rest and in transit [0]. Any GCE
project is essentially a VPC by default, and a global one at that [1] (aka no
need to VPN between regions). Traffic between GCE zones/regions never hits
public wire by default, and Google will carry your packet to the nearest
Google POP around the world on its private backbone [2].

(work at Google Cloud, but not on networking/GCE)

[0] [https://cloud.google.com/security/encryption-at-rest/](https://cloud.google.com/security/encryption-at-rest/)

[1] [https://cloud.google.com/docs/compare/aws/](https://cloud.google.com/docs/compare/aws/)

[2] [http://peering.google.com/#/infrastructure](http://peering.google.com/#/infrastructure)

~~~
nodesocket
Nice, great information. Google Cloud has the networking model right.

------
secure
I appreciate how transparent they are about their locations, transit and
peering.

I’m looking to replace one of my VPSes at DigitalOcean because of stability
issues (I need to reboot the VM every couple of months; it just drops off the
network entirely).

Linode seems like a good alternative. My criteria for this application are ≥
1GB of RAM, SSD storage, low RTT to my other VPS, and native IPv6 support.

------
geuis
I fucking love Linode. I've been hosting with them for years and over time
I've gotten more performance and more data transfer for the same money.

[https://jsonip.com](https://jsonip.com) is hosted with Linode and supports
millions of requests a day. It's been a great home for the service.

~~~
grubles
Just don't store cryptocurrency there:

[https://news.ycombinator.com/item?id=3655137](https://news.ycombinator.com/item?id=3655137)

~~~
tbrownaw
Don't store it on any cloud service.

------
swalsh
I guess it never occurred to me before, but with the increased number of
attacks lately it's been near the top of my mind. These guys seem to be
throwing around the physical addresses of data centers pretty freely. What is
the security of these places like? How decentralized are we really? It seems
like a few strategic strikes could deal a devastating blow to our edge
infrastructure. I know personally, my servers are only hosted in a single
datacenter. The company I work for is in 3 datacenters, but I'm not sure the
other 2 data centers could handle the full load for an extended period of time
if the primary one was completely taken down.

Granted, it's not as big a deal as power plants and the like, but if you're
looking for soft targets, it's a scary thought.

~~~
jlgaddis
These are all very well-known datacenters with several layers of physical
security (there are standards and certifications for datacenters). Their
locations aren't exactly secret.

Most datacenters have fences/gates and require access cards and/or biometrics
to get in and move around inside the building. Once inside, you can only get
into your own cages.

It's not like you can walk up, knock the door in with a battering ram, and
then have access to everything inside.

~~~
aroch
>It's not like you can walk up, knock the door in with a battering ram, and
then have access to everything inside.

Well, I mean, yes you can. The actual doors/gates used aren't 'milspec'
intrusion rated. They're `better-than-home-depot` doors (all steel, steel
door frames, reinforced). Cage doors are often hilariously flimsy (thin metal
sheets/bars).

Certainly breachable by even modestly equipped attackers.

The reason you pick real DCs and not the basement of your fortified house is
that there's human security in addition to the physical security measures.
Which means someone will notice if you try to bust down the door.

~~~
CodeWriter23
And here I thought the reasons to choose a DC over my basement were multi-
homing, abundant bandwidth, redundant air conditioning, and battery/diesel
electrical backup.

~~~
aroch
We were talking about security...

But yes, it's almost like DCs were built with this in mind. Who would've
guessed...

~~~
CodeWriter23
Security in my basement > Security in average data center. Similar quality
locked doors. No visitors. Armed response by owner.

------
vetrom
Are they still running a hard-to-audit ColdFusion CMS?

~~~
ksec
Wanted to know that as well. They were working on a major rewrite. Not sure
if it's finished yet.

------
StanAngeloff
This is good news for their users, including us, given the frequency of DDoS
attacks lately. Hardly a month has gone by without their status page flagging
an incident report involving increased traffic to one of their datacentres as
a result of a DDoS attack.

------
hhw
"we now manage our own true service provider network, allowing us to deliver
robust and reliable connectivity."

What's needed to combat DDoS attacks is distributed defense. Without their own
backbone / private transport links between all of their locations, their
network is just a disparate set of data centres, and there is no advantage to
their having multiple locations so far as protection from DDoS attacks is
concerned.

They also fail to mention the capacity of each of the links. They could be
anywhere from 1Gbps to 100Gbps, but I presume they'd mention anything 40Gbps
and up as a selling point, so let's give them the benefit of the doubt and
assume they're using all 10Gbps links and not 1Gbps. On that assumption, they
range from 50Gbps (Singapore) to 100Gbps (London) per location.

It's an impressive list to look at in aggregate, but not really that much for
any one location in 2016, especially given a company of their size and
visibility, when you can rent shared access to a 200Gbps+ botnet for $19.99.
[https://www.nanog.org/sites/default/files/20161015_Winward_The_Current_Economics_v1.pdf](https://www.nanog.org/sites/default/files/20161015_Winward_The_Current_Economics_v1.pdf)

Instead of buying transit from up to 7 carriers per location, when there are
starkly diminishing returns after 3 or 4 so far as routing performance is
concerned, they should have bought higher capacity to each provider (to
ensure at least 10Gbps of unused capacity per provider beyond regular
legitimate traffic), external DDoS mitigation, or domestic backbone links,
and turned up more capacity at the LA Any2 (for Asia) and NYIIX (for Europe)
to absorb the majority of DDoS traffic, which comes from those regions. With
up to 7 carriers, they simply have 7 different points of failure, each at
only 10Gbps, while getting worse deals on transit pricing due to lower
volumes with each provider.

~~~
dsl
You don't need a private backbone to be able to mitigate attacks across
multiple locations. I've done it fighting off multi-hundred Gbps attacks and
it was never an issue. You can QoS your own intra-site GRE tunnels.
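
For illustration, a minimal scapy sketch of that layering, with made-up
addresses; the DSCP mark on the outer IP header is what the QoS policy would
match on:

    # GRE-encapsulated "clean" traffic forwarded between sites; QoS at
    # each hop matches the outer header's ToS/DSCP bits, not the inner
    # packet. Addresses and ports here are hypothetical.
    from scapy.all import GRE, IP, TCP

    inner = IP(src="10.0.1.5", dst="10.0.2.9") / TCP(dport=443)  # customer flow
    outer = IP(src="192.0.2.1", dst="198.51.100.1", tos=0xB8)    # DSCP EF (46) << 2
    (outer / GRE() / inner).show()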

Linode is moving 200-300 Gbps globally. That is about 37.5 Gbps per location,
and when you figure in 20% utilization (because you need to be able to
burst)... they have about 300 Gbps of transit per location. Spread across 3-5
carriers, I would guess they have 40-100 Gbps from each. Way more than your
estimated 10 Gbps.

As far as "routing performance" they appears to be buying from a few Tier 1
networks per location, and a mix of regional Tier 2s. That is in line with
best practices. Sometimes to reach the right networks you do need to spin up
circuits with multiple Tier 2s, there is no such thing as "diminishing
returns" if you are doing traffic engineering properly.

The right way to build networks is to meet your performance needs first and
foremost, have enough headroom to grow and serve your customers, and work with
your upstreams to manage incoming attacks. An external scrubbing service makes
no sense when you can adapt your network, as Linode has done, to easily
blackhole targets at your upstreams' edge.

I applaud their efforts. This is some smart network engineering.

~~~
hhw
Why the heck would you send traffic out transit links using GRE tunnels, when
it's more cost effective and you have more control over your own private
backbone links? At the scale I speculated 1/10th of what you're suggesting, it
would have already been cost effective to operate a backbone. If they're
anywhere near the scale you're suggesting, then it should be a no brainer for
them to operate their own backbone of 100Gb waves.

I'm quite skeptical of the numbers you're citing though. Their PeeringDB
profile suggests they only have 10-20Gbps of peering per city. Considering
their profile was updated just a few days ago, I would be inclined to consider
those listed capacities accurate. Although they mention 'hundreds of Gbps' of
capacity per city, they also mention sending up to 50% of traffic through
peering in London, where they only have 50Gbps of total peering capacity.
Perhaps you're right about that 300Gbps of capacity per location, in which
case they would run much lower utilization rates on transit than on peering.
But that would be an even worse allocation of spending than in my initial
assessment, considering a port at an exchange is much cheaper than a transit
link with a CDR. It also leaves them highly vulnerable to DDoS attacks
through exchanges.

For a content network, public peering at exchanges in North America just with
route servers and networks that have open policies would result in 30-40% of
traffic going through the exchange. They would easily do more traffic at the
exchange than any one transit in a mix of 3-4, let alone in a mix of 5-7.

With any significant private peering, easily 60% of traffic could be
settlement-free. And guess what: most significant peers require peering at
multiple locations with a full set of prefixes, which requires that you have
a backbone. With the traffic levels you're suggesting they run, they should
be able to negotiate settlement-free peering with many major regional Tier
2's, making it even less sensible to be purchasing from multiple ones.
Considering most Tier 2's within a given region will all peer with each
other, there are very few improvements to be had by turning up additional
ones. Where there's the most room for improvement is being on the right
long-haul fiber paths, in which case, given their North American focus, they
should be buying from Level3, and they probably could at rates comparable to
their current agreements by concentrating more of their commits at fewer
providers. If they had their own transport, they could also determine which
fiber paths they take across their backbone to ensure optimal latency. Beyond
that, given that Tier 1's all peer with each other by definition, it's just a
matter of dumping local traffic out any one of them without traversing a
congested peering link. The microseconds it takes to go an extra AS hop
within a city have an indistinguishable impact on performance.

I'm not sure what best practices you're referring to. Who else can you name
that utilizes up to 7 transit providers in a given city without operating
their own backbone? The only one I can personally think of is Internap, from
when they abandoned building their own backbone halfway through turning it
up. Ask their former network engineers, from their golden years when they had
their highest market share, how that worked out for their network and their
business.

Are you a current Linode customer in one or more locations? If you were, you'd
probably have experienced packet loss issues on a regular basis due to DDoS
attacks. There's a reason why they're performing these network upgrades:
they've had near-daily network interruptions due to DDoS attacks since
Christmas of last year, with some outages lasting almost a day. Smart network
engineering would never have let their network become that unreliable in the
first place. And if you were going to blame a lack of budget for that, my
suggestions would be even more appropriate for them, as they would allow them
to scale their network in a much more cost-effective way. An external
scrubbing service makes sense when they've been ineffective at mitigating
attacks to date. Your network is only as resilient to DDoS attacks as your
weakest links, and spreading capacity across a larger number of providers
instead of concentrating higher capacities with fewer of them makes it much
easier to saturate connectivity to one of them.

The only way Linode's current network strategy makes sense, assuming that it's
not due to technical oversight, is if it's marketing driven. That's a fair
reason, but it should also be fair to call them out on it. I'm not sure why
you feel that strategy is in any way optimal, when it's the opposite of the
models of hosting companies most renowned for their networks. Take for example
SoftLayer, who went to great lengths to build out their own backbone fairly
early on. I may halfheartedly agree that Linode's network upgrade strategy
might be smart marketing, but I would wholeheartedly disagree that it's smart
network engineering. It's not cost effective, is sub-optimal for resiliency
against attacks, and fails to leverage peering effectively.

~~~
dsl
> Why the heck would you send traffic out transit links using GRE tunnels

I was responding to your statement that you needed a backbone to be able to
deal with DDoS attacks. That is simply untrue. Most hosting providers announce
separate space from each location and do not backhaul. If you do want to sink
attacks closer to the source (which only really makes sense if you have
highly diverse POPs), you can GRE the clean traffic between sites.

Your PeeringDB profile indicates you push 10 Gbps at peak. As this is getting
quite long in the tooth for an HN thread, email me next time you are in the
Bay Area. I'd happily share some operational tips for running high-volume,
attack-sinking networks that really require a whiteboard. Heck, maybe we can
get someone from Linode to join us too for beers. :)

~~~
alexforster
We're usually around at nanog!

------
neom
Reminds me of DigitalOcean in 2015. Curious what a "per-customer VLAN" is in
reality.

~~~
VLM
Maybe one way to describe MPLS is that it's kinda a VPN for VLANs. Or a way
to put VLANs in something like a VLAN, sorta.

I can't speak for them, but I worked at what boils down to a semi-competitor
a decade ago doing network stuff. MPLS is old stuff now, and you can google
the specific Cisco model numbers and MPLS if you'd like to read configuration
guides.

Superficially, only having 4096 VLAN IDs on an ethernet connection appears to
be a big problem if you have more than 4096 customers. However, MPLS label
space is 20 bits, so you're good to a million customers.

Then you have some "fun" mapping games such that your router connects traffic
on MPLS label 123456 (which is your customer number) to local ethernet
interface port wtf on vlan 100 or whatever you have been given.
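
If you want to convince yourself of the arithmetic, here's a quick Python
sketch, packing the RFC 3032 label stack entry; the "customer 123456" label
is just the hypothetical mapping from above:

    import struct

    print(2 ** 12)  # 4096    -- 802.1Q VLAN ID is 12 bits
    print(2 ** 20)  # 1048576 -- MPLS label is 20 bits, "a million customers"

    # An MPLS label stack entry (RFC 3032) is 32 bits:
    # label (20) | traffic class (3) | bottom-of-stack (1) | TTL (8)
    def label_stack_entry(label, tc=0, bos=1, ttl=64):
        return struct.pack("!I", (label << 12) | (tc << 9) | (bos << 8) | ttl)

    print(label_stack_entry(123456).hex())  # customer number as label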

At least that would have been cutting edge a decade ago and probably still is
today.

It's unlikely to be any more, or any less, secure than anything else in a
virtualized cloudy environment.

~~~
jlgaddis
> _Or a way to put VLANs in something like a VLAN sorta._

802.1ad, a.k.a. "Q-in-Q" [0]
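
Roughly, the frame carries two stacked tags right after the MAC addresses; a
sketch with hypothetical VLAN IDs:

    import struct

    # Each tag is TPID (16 bits) + TCI: PCP (3) | DEI (1) | VID (12).
    def vlan_tag(tpid, vid, pcp=0, dei=0):
        return struct.pack("!HH", tpid, (pcp << 13) | (dei << 12) | (vid & 0xFFF))

    s_tag = vlan_tag(0x88A8, vid=100)  # provider's outer service tag (802.1ad)
    c_tag = vlan_tag(0x8100, vid=42)   # customer's inner 802.1Q tag
    print((s_tag + c_tag).hex())       # '88a800648100002a'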

> _However, MPLS label space is 20 bits, so you're good to a million
> customers._

VXLAN, cf. RFC 7348 [1], is the latest coolness, allowing for up to 16M
"virtual" networks (using 24 bits) and bridging layer 2 over IP (4789/UDP).

[0]: [https://en.m.wikipedia.org/wiki/IEEE_802.1ad](https://en.m.wikipedia.org/wiki/IEEE_802.1ad)

[1]: [https://tools.ietf.org/html/rfc7348](https://tools.ietf.org/html/rfc7348)

