

Introducing “Wedge” and “FBOSS,” the next steps toward a disaggregated network - hiteshiitk
https://code.facebook.com/posts/681382905244727/introducing-wedge-and-fboss-the-next-steps-toward-a-disaggregated-network/

======
SEJeff
Regardless of what people think of Facebook and their business, this is a
pretty big deal. As a very large tech company, they have the time and talent
to develop their own switches. If they are releasing the reference
implementations, unlike Google, this helps anyone else trying to build the
next big web company. The more information the merrier.

~~~
tambourine_man
Yes, but only because they lag behind Google in this regard. Companies embrace
and propose open standards in the areas they don't dominate. As soon as they
hit a core competitive advantage, they get as closed as possible.

~~~
planckscnst
I'm not so sure of this. Google has published multiple documents [1]
mentioning the limitation of top-of-rack switching capacity and data center
bisection bandwidth: how they need to design around it, and why it makes sense
to use commodity switches despite the burden on software design.

[1] Here is one less than a year old:
[http://www.morganclaypool.com/doi/abs/10.2200/S00516ED2V01Y2...](http://www.morganclaypool.com/doi/abs/10.2200/S00516ED2V01Y201306CAC024)

~~~
magicalist
I didn't read it to see what you were referring to, but that's the second
edition of an older book, so maybe they didn't update everything for the new
edition. Google has been banging the software-defined networking drum for a
while now (configuration is part of the compute engine APIs now, for
instance[1]).

[1] [http://googlecloudplatform.blogspot.com/2014/04/enter-androm...](http://googlecloudplatform.blogspot.com/2014/04/enter-andromeda-zone-google-cloud-platforms-latest-networking-stack.html)

~~~
planckscnst
Yes, I just looked at the place where it specifically talks about switching
costs (chapter 1 around page 18/19), and it hasn't changed.

------
nbm
I work in a somewhat related team (Traffic/CDN) at Facebook, and I'm very
excited about what this is going to allow us to do in future.

Current switches just don't support the deployment, monitoring, and
configuration power we have for servers. While we've done a lot (probably
close to the most that can be) to bring them somewhat close to par, Wedge
should not only leapfrog to equality, but also use the same infrastructure -
and gain whenever the server processes improve.

Being able to quickly canary new features (without doing a firmware upgrade
before turning them on and again after turning them off), to get detailed
logging and monitoring while reusing our existing tools for correlation and
comparison, and to do some things we are currently forced to do on separate
machines opens up fairly large opportunities.

~~~
joshAg
What managed switches did you guys try and find lacking?

~~~
nbm
This isn't my area, so I don't know what we've previously disclosed or what
agreements we might have in place with vendors about, say, not mentioning
them, so I don't feel comfortable disclosing that.

If it was anything like experiences in other cases, my suspicion (as I said,
not my area) is that switch vendors have relatively few customers like us
(certainly few that discover bugs, change configuration, upgrade firmware
nearly as often as we do, or who make use of the particular set of features we
do at the same time), and so some things we really would want would not really
be in their interest to work on relative to things that would be useful to
most of their customers.

At some point it probably became worth trying something new like Wedge/FBOSS
(which while technically hard at least can build on our experience building
hardware and software for servers) in the hopes of improved turn-around time
and getting the features we want further down the line.

I'll try to remember to track down someone from the team to give a less vague
answer after lunch.

------
AaronFriel
I really wish one of the big hardware vendors would just start shipping
validated, certified and warrantied Open Compute Project hardware at the
substantial savings that can be had from it. Or maybe I'm ignorant here and
there are few savings to be had.

The rest of this post is just a rant from my perspective in the SMB space.

In my space, everything that's worth getting is too expensive, and everything
else is crap. The switch and storage market is a racket, as near as I can
tell, where every opportunity to get you to pay another 20 or 30 percent
premium over what you had before is taken with selling you features you don't
want or can't use. Software defined storage is, ultimately, limited by your
network. Software defined networking is here (OpenFlow, network
virtualization) but SDN is being used as a value-add to get customers to pay
_even more_. The result is that software defined storage is a crapshoot (only
as high quality as the network) and whether or not you save money is
debatable.

Shared storage is a tremendous racket because adding "SAS" to anything doubles
or triples its price. Consumer SSDs are advancing the state of the art much
faster than enterprise tech (which tends to accommodate slower purchasing
cycles and longer service lifetimes), but to get an older, slower SSD for a
shared SAS JBOD means paying five or six times as much per gigabyte.

I really want a virtual SAN that doesn't suck, and a network that doesn't cost
$1000 per port to connect a handful of servers. Alas, it doesn't look
like anything like that is coming soon.

~~~
wmf
In theory you just need an interposer to use SATA drives in a SAS enclosure:
[http://www.dataonstorage.com/dataon-products/6g-sas-to-sata-...](http://www.dataonstorage.com/dataon-products/6g-sas-to-sata-interposer-card-and-kit.html)
But that still doesn't solve the problem of the JBOD itself costing more than
the disks (less so with SSD) or ZFS costing more than hardware RAID, etc.

For (relatively) cheap networking check out
[http://www.colfaxdirect.com/store/pc/home.asp](http://www.colfaxdirect.com/store/pc/home.asp)

~~~
AaronFriel
I'll start with what I know firsthand: those interposers are _not_ supported
for many technologies, including Windows Server Clustered Storage Spaces. This
is straight from DataOn storage reps.

And what I know second-hand: many SATA SSDs have terrible failure modes in the
form of RESET storms when behind an interposer. That is, you can end up in a
situation where you have to shut down all hosts attached to the SSDs and power
them off before they return to life. Not a good situation. The interposers
apparently greatly exacerbate this problem.

I will check out your link on networking, thank you :)

------
sfeng
It's somewhat hilarious that providers like Cisco are working so hard on
nonsense like the Internet of Things™ while ignoring the work that will
actually define the future of networking (and should have been done a decade
ago).

------
ChuckMcM
I am so excited about this. While I realize switches are perhaps one of the
last bastions of overpriced software, I would _love_ to have a switch that
is just a freakin' switch: one that isn't trying to be all things to all
people and doing that badly. I've got Blade, HP, Cisco, Supermicro, and
Mellanox switches that have been in this role (Top of Rack) and so often they
bite the big one when it comes to some random protocol going nuts. Every
single site outage in nearly 4 years of 'launch' has been due to a switch bug.

~~~
wmf
Quanta + Cumulus is pretty much already there.

------
josu
Are ASICs gaining popularity or are they already broadly used?

~~~
wmf
Ethernet switches have used ASICs pretty much since forever.

------
shawnreilly
I can't wait to build one of these!

------
kv85s
"We’re big believers in the value of disaggregation" -- says the world's
biggest data aggregator.

The irony is amazing.

~~~
Smudge
They go on to define their terms: "breaking down traditional data center
technologies into their core components" -- very little to do with what we
think of as "data aggregation."

So, I wouldn't exactly call it ironic. More... homophonic? Homophonically
ironic?

~~~
taylorwc
Homophironic. You can TM that.

