Essentially, they have spent the last 20 years building their software, which runs on Motorola, MIPS, PowerPC, etc., running arcane switching protocols (not always even interoperably). And these 'software-less' switches can be made by almost anyone, since the software is their secret sauce.
Think of it as going from Minicomputers, which were custom boxes which had custom hw/sw from a few vendors, to PCs, which have an 'open' design and are designed for interoperability.
That's what OpenFlow does to the switching/networking ecosystem.
And since none of the incumbents want to commit hara-kiri, a few startups are trying to do this: Nicira, BigSwitch, etc. Many others have OpenFlow-compatible switches, but nowhere near the scale that Google would need in their datacenters.
Brilliant stuff. And I'd love to see Cisco die because of this - they've kept the industry back for long enough.
I can see that within an organization's internal network, they can assess the importance of different communications, and route accordingly. So on the internal side, it's potentially a big win.
But across the globe, who can assign the priority of traffic accurately and impartially? And isn't the decentralized nature of the current architecture an important feature, because of the way it can route around problems (be they technical or regulatory) of its own accord, without requiring a higher authority to tell it how (and thus without being susceptible to the agenda of that authority)?
If you haven't dealt with network gear before, it's like going back 3-4 decades in general computing: bizarre, obscure UIs; features hamstrung by very strict limits; management as a bolt-on afterthought generally treated as a profit center ("We'll sell you tools to deal with our arcane UI!"); paid bug fixes which have to be installed by hand; very limited interoperability across vendors; etc.
OpenFlow allows an entity to keep a globally consistent state, and calculate the rules by which each of the nodes should forward. This logically centralised control can then enable higher utilisation of the network. Think about traffic reports on the radio -- if you are driving and know that there is a bottleneck on one highway, then you can take an alternative route.
EDIT: I use "Global" in this context for within an AS, not necessarily internet-wide.
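To make the "logically centralised control" idea concrete, here is a minimal sketch, with an assumed toy topology and link costs (not any real OpenFlow API): a controller that holds the whole network graph and computes every node's next hop toward a destination, rather than each node discovering paths on its own.

```python
# Sketch of logically centralised control: the controller knows the whole
# topology and computes each switch's forwarding rule from that global view.
# The topology and costs here are made up for illustration.
import heapq

topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1},
    "C": {"A": 4, "B": 1},
}

def next_hops(dest):
    """Dijkstra from `dest` over the global view; returns each node's next hop."""
    dist, hop = {dest: 0}, {}
    queue = [(0, dest)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, cost in topology[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                hop[neighbor] = node  # neighbor forwards toward dest via `node`
                heapq.heappush(queue, (nd, neighbor))
    return hop

print(next_hops("C"))  # {'A': 'B', 'B': 'C'}: A reaches C via B (cost 2, not 4)
```

The point is that the bottleneck-avoidance from the radio analogy falls out automatically: because the controller sees the expensive A-C link, it routes A's traffic via B without A having to discover that itself.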
Thank you for that analogy, because it actually serves to illustrate my concern.
Here in NJ there's a station that reaches most of the state, and makes a big deal of its every-15-minute traffic reports. I used to listen to these while commuting, until I found from experience that their reports, at least for the roads I deal with, carried data that was either so stale as to be useless, or was just plain wrong. So now I don't listen to that station anymore. Instead, I use an app called Waze for my phone. This uses crowd-sourced data (i.e., decentralized), which also isn't wholly dependable (there's not always another user there ahead of me to make a report, and it's still susceptible to gaming), but on the whole it gives me a better picture of the traffic situation.
Note also the use of "logically centralized", not "physically centralized".
Today, to do this, you may need to configure several switches and routers between the server and the source and destination of its traffic, while still not being able to globally optimize.
Depending on security considerations, it may even preclude certain servers from being in certain racks, based on the switch they are connected through.
edit: downvoters, have you configured STP in a data center? Have you had a single VMware ESX instance shut down the root VLAN on a DC? Spanning tree is being addressed with solutions like this.
You have knowledge. I could learn stuff from you. I learn almost nothing from your comment "fuck spanning tree".
This kind of behind the curtain stuff is mysterious to many people. I would welcome something that taught me more about it. I'd especially welcome informed insights from someone who works with the technology.
Spanning Tree Protocol: http://en.wikipedia.org/wiki/Spanning_Tree_Protocol
STP, as it's called, builds loop-free networks: simple, single paths through layer 2 (see: Ethernet) networks. Think of spanning tree as a large state table tracking all MAC addresses on a network. If the state table reveals duplicate entries (i.e., duplicate paths) for a single MAC address, it literally brings down the entire network to recreate the paths without duplicate entries.
Most managed (commercial) Ethernet switches speak the Spanning Tree Protocol, which allows synchronization of MAC tables between switches. However, by default, the VMware vSwitch does not speak this protocol. This creates problems when you multi-home servers (connect a single server to multiple switches). The vSwitch does not participate in spanning tree, and by default it "load-balances" by transmitting frames from the various ports it has accessible. This, in the traditional switches' eyes, constitutes a loop in the network and can bring an entire Ethernet domain down. This is a horrific scenario during which all participating hosts lose network access for 15 seconds or more, depending on the configuration (STP vs. RSTP). If the vSwitch remains active with its default settings, the network may be down until a network engineer realizes the problem or the server is taken offline.
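The loop symptom described above boils down to a switch seeing the same source MAC arrive on more than one port. Here's a toy sketch of that detection logic (a hypothetical learning switch, not real STP, which actually exchanges BPDUs between switches):

```python
# Toy learning-switch sketch: the same source MAC showing up on two different
# ports is the telltale sign of a duplicate path (loop) in the network.
mac_table = {}  # MAC address -> port it was last seen on

def learn(mac, port):
    """Record which port a source MAC arrived on; flag a possible loop."""
    previous = mac_table.get(mac)
    mac_table[mac] = port
    if previous is not None and previous != port:
        return f"possible loop: {mac} seen on port {previous} and port {port}"
    return "ok"

print(learn("aa:bb:cc:00:00:01", 1))  # ok
print(learn("aa:bb:cc:00:00:01", 2))  # possible loop: ... port 1 and port 2
```

A multi-homed server that blindly transmits the same MAC out both uplinks triggers exactly this condition on the upstream switches, which is why the vSwitch scenario is so disruptive.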
The reason I say "fuck spanning tree" is that, as a network engineer, I've taken entire data centers offline due to a mistaken configuration on an ESX host (which I did not have visibility into at the time). This is obviously not a good way to go about production practices.
Network coordination services, like the one developed by Google, stand a good chance of replacing this antiquated protocol. Everyone in the Ethernet networking world has been plagued by STP and its related quirks. I, for one, am very happy to see its demise and hope for a future free of such potentially disruptive technology in data centers.
Telephone networks tend to use these (on a scheme called SS7) because in most countries, the telephone networks were built by monopolies. It was possible to develop the entire network as a single system and thus to obtain very high efficiencies for certain use cases.
Google goes a step further. What they seem to have done is married circuit-based networking with batch planning. The network itself is circuit based -- rather than each packet "finding" its own way, it can be routed end-to-end by a central plan. But the decision of what to move when can also be planned. Note the reference to "simulating a load". That's similar to what mainframe batch planning achieves.
As usual, everything old is new again.
So what's next -- Google reinvents X.400 and X.500 (not the special-needs version, LDAP)?
There will be a shakeup, because telling the hardware manufacturers their software isn't good enough isn't a great way to start that conversation, but it's inevitable. They will insist that the software is like that for a reason, since admitting otherwise means saying the last twenty years of development have been done the wrong way.
I think it's highly interesting. I went to school for computer science and found computer networking very interesting. There seems to be a certain level of dismissal of the complexity of networking by people who write applications. Writing a one-line Java socket that connects to another TCP port is trivial, but the details are tedious. In the same way, we forget how difficult it is to get phone calls to work because the end result is simple: phones ring.
OpenFlow will need to reinvent the wheel unless the existing hardware manufacturers decide to give them a head start, which is unlikely. If it's open source it will evolve quickly, however. There are many difficult decisions and engineering problems to solve, which I suppose is a good sign.
Because of this if you want the latest and greatest features from Cisco you have to run all Cisco. Or Juniper. You can't just buy 10 Cisco switches and 10 Juniper switches and all run the same operating system. Compare this to PC hardware where I could buy any combination and install any OS I want.
End result is that nearly a decade later, physically installing more than 3 servers at any given time is a project that takes many weeks. In one instance, installation of a small cluster took over 3 years.
It might be technically cool in the video, but my cellphone contract costs more than my broadband one, and the connection is more stable too. I don't want to pay a phone company while they get to freeload on my ISP contract, and I don't want my phone calls dropping out when my broadband does.
Copyright (c) 2008 The Board of Trustees of The Leland Stanford Junior University
We are making the OpenFlow specification and associated documentation (Software) available for public use and benefit with the expectation that others will use, modify and enhance the Software and contribute those enhancements back to the community. However, since we would like to make the Software available for broadest use, with as few restrictions as possible permission is hereby granted, free of charge, to any person obtaining a copy of this Software to deal in the Software under the copyrights without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
The name and trademarks of copyright holder(s) may NOT be used in advertising or publicity pertaining to the Software or any derivatives without specific, written prior permission.
It seems they're rolling their own license, as opposed to adopting any of the other open licensing schemes out there. Any thoughts why they might do this?
Why bother trying to illustrate with examples, if you're gonna write things like that?
Now they reap the benefits.
This is true even on a small scale. When I was upgrading the phone wiring in my house a number of years ago I went ahead and ran ethernet and coax all over the place while I was at it. If you're already on your back in a crawl space, it isn't that much more hassle to drill a few extra holes. :-)
See for example this from a decade ago: (http://www.businessweek.com/bwdaily/dnflash/aug2001/nf200108...)
And this, from about ten years ago:
> This is true even on a small scale. When I was upgrading the phone wiring in my house a number of years ago I went ahead and ran ethernet and coax all over the place while I was at it. If you're already on your back in a crawl space, it isn't that much more hassle to drill a few extra holes.
I agree. Wired is always handy. Other people - people buying your house - may disagree. Those people might prefer wifi, even if wifi is unsuitable for the building, even if there are very many neighbours with over-powered wifi on nearby channels.
Remember that AT&T commercial from 1997 where they said "and imagine all these services coming to you from one company" and it showed a data jack, implying that you would get everything through that AT&T cable... well, Google wants to be that port.
Furthermore, traffic controller(s) would constantly gather information about traffic conditions from each intersection, and tell them how to direct the traffic to optimise road usage.
This way, the traffic controller knows everything about the optimal way traffic should flow, and it only has to make decisions based on the number of intersections rather than the number of cars. If you consider that one intersection could have tens of thousands of cars per day, that provides a huge saving in computation.
This differs from the analogy in the article: Intersection = Router, Car = Packet.
Traditionally the argument has been that specialized hardware doing function X will be faster (but not customizable or upgradable) than doing function X in software on commodity hardware (slower but upgradable).
And FPGAs were pointed out as somewhere in the middle (some hardware customization using software).
What are your thoughts on the argument that using commodity hardware and implementing routing algorithms, etc., in software will be more flexible but slower?
Is there a significant performance/speed cost when you implement core networking features in software?
The routing layer in hardware acts on a set of cached rules which are very simple. Simplified, you can imagine them to be of the form Matching Rule -> Routing Action. The matching rule selects on packet fields such as source port or destination IP. The routing actions could be "forward to port" or "drop packet". All of this is just as fast as in any other commodity router.
What is special in OpenFlow is another possibility for the "routing action" field (or for unmatched packets): you can send certain packets up to the software level, to the OpenFlow controller. This can be a centralized server, with the logic implemented in software. The software decides how to route these kinds of packets and sends the answer back to the router. The rule is then cached, and from then on the routing for this flow is as fast as for all the other packets.
This last bit is the only part which is slower compared to commodity routers. A really great solution in my opinion.
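Here's a toy model of that fast-path/slow-path split, with an assumed rule and packet format (real OpenFlow matches many more header fields, and the controller speaks a wire protocol to the switch): matched packets take the cached rules, and a table miss is punted to controller logic whose answer is then cached.

```python
# Toy OpenFlow-style flow table: cached (match -> action) rules form the
# fast path; a table miss goes to the "controller", whose decision is cached.
flow_table = []  # list of (match_dict, action) pairs

def controller_decide(packet):
    # Placeholder for the controller's (slow, software) logic. Made-up policy:
    # traffic to 10.0.0.0/8 is forwarded out port 2, everything else dropped.
    if packet["dst_ip"].startswith("10."):
        return {"dst_ip": packet["dst_ip"]}, "forward:2"
    return {"dst_ip": packet["dst_ip"]}, "drop"

def handle(packet):
    # Fast path: look for a cached rule whose fields all match the packet.
    for match, action in flow_table:
        if all(packet.get(k) == v for k, v in match.items()):
            return action
    # Table miss: ask the controller, then cache its answer so subsequent
    # packets of this flow stay on the fast path.
    match, action = controller_decide(packet)
    flow_table.append((match, action))
    return action

print(handle({"dst_ip": "10.0.0.5"}))  # forward:2 (slow path, via controller)
print(handle({"dst_ip": "10.0.0.5"}))  # forward:2 (fast path, cached rule)
```

Only the first packet of a flow pays the round trip to the controller, which is the "only part which is slower" mentioned above.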
Perfect timing from Google.
In my naivety, I'd expect the main benefits of openflow to be on the WAN links, so you could get away with 6 or so ports in a PC-like chassis running software routing, with dumb local switches?
What am I missing?
To be clear, it's not literally Cisco/Juniper hardware, but it's similar (ASIC-based).
> on the WAN links you could get away with 6 or so ports in a PC-like chassis running software routing
ASICs are line rate, denser, and cheaper per port than x86 servers. Roughly $15K buys you either an x86 server with 6 10G ports or an OpenFlow switch with 64 10G ports. Since Google has over 100 ports per switch (according to EE Times), they presumably need the ports and the savings is significant at that scale.
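A quick back-of-the-envelope on those figures (the rough $15K and port counts come from the comment above, not vendor price lists):

```python
# Per-port cost comparison at the same ~$15K price point (rough figures).
server_cost, server_ports = 15_000, 6    # x86 server with 6x 10G ports
switch_cost, switch_ports = 15_000, 64   # OpenFlow switch with 64x 10G ports

print(server_cost / server_ports)  # 2500.0 dollars per 10G port
print(switch_cost / switch_ports)  # 234.375 dollars per 10G port
```

That's roughly a 10x difference per port before even counting power, rack space, or the line-rate guarantee of the ASIC.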
The switching hardware is ASIC-based and super optimized, with CAM-based switching on VLAN tags, destination addresses, etc. A 1U switch could have 36 ports at 1 Gbps, and needs to switch 36 Gbps within the switch, plus perhaps 10 Gbps upstream.
So line-rate switching of 36 Gbps+ in 1U is possible today only with ASICs, typically from Broadcom.
I was thinking you could have "dumb box, many ports" and "smart box, few ports". Each cab needs one of the former, but you could get away with not many of the latter?
Cisco/Juniper boxes were smart with many ports, but smart only within each box. That is, you had to configure each one individually. You could do static provisioning with a single tool across multiple boxes, but with changes taking on the order of minutes. If you had to create a VLAN across ten boxes in 5 different racks, you would spend quite a bit of time doing that, since you would need to find spare VLANs unused across the fabric, configure all the switches in between, etc.
Now with OpenFlow, all the switches are controlled by the central controller, and flow configuration is dynamic -- i.e., you don't need to fiddle with individual configs. When a flow starts, the controller is queried, and if there's an appropriate flow setup on the controller, it is implemented, with locally free VLANs, etc.
Basically the entire fabric becomes as dynamic & smart as the controller, instead of each switch being smart and static.
I built a proof of concept for something like this for my master's thesis in 2000.
I wish it had occurred to me that unlike the user being in control (in active networks), the network operator could be in control.
This is not about Dr. Evil-style centralized control of the whole Internet.