Nowhere does this mention Facebook as a founding member. This dates back to at least 2015.
> We created a 100G single-mode optical transceiver solution, and we're sharing it through the Open Compute Project.
What gives? Looks to me like someone may be taking more credit than they deserve.
EDIT:
> The starting point for this specification is the CWDM4 MSA, a standard that was agreed upon in 2014 by several optical transceiver suppliers. It uses a wider wavelength grid (CWDM = 20 nm spacing) and, for many of the different technology approaches, does not require a cooler inside the module to keep the laser wavelength stable.
So Facebook took something that already existed and tweaked it slightly by relaxing the constraints so that it works only in the data center.
The next step down in size will be to SFP/SFP+ dimensions. I can see it coming: a QSFP is already a lot smaller than a first-generation 100GbE CFP.
Facebook didn't do any extra engineering work. They just specced less sensitive Rx parts, and a 1 dB less powerful Tx, with the optics OEMs.
I guess that depends on how you define "engineering."
They're basically trying to create a new class of transceiver. It remains to be seen if this will take off or not, but since it is part of the OCP effort, the chances are good that it will be taken seriously by QSFP vendors.
OCP is generating a lot of activity and change on the networking side. Whether it just becomes a race to the bottom where only the giant suppliers survive or whether it creates a new eco-system with more players and interesting technology remains to be seen.
I think that in order to bring more OEM vendors in, we need to see the other big players accept the relaxed specs as well.
Hopefully, we don't end up with another dozen different 100G or 200G MSAs that work from 15-55C.
I would guess the NRE to develop either is similar and that the design for either is almost the same. Perhaps Facebook is just trying to get the optics cost down by negotiating discounts on the non-yielding MSA parts that would have otherwise had to get thrown out?
Although you'd pay at least 10-20x that if it had a big vendor name on the top.
We've been using them with great success at Netflix in our flash storage appliances. We are able to serve well over 90Gb/s from single machines with these NICs using our tuned/enhanced FreeBSD and nginx.
Incredible
We do all of our stack "traditionally", in the kernel, whereas DPDK moves things into userspace. By using a traditional stack with async sendfile in the kernel, we benefit from the VM page cache, and reduce IO requirements at peak capacity. There are no memory-to-memory copies, very little kernel/user boundary crossing, and no AIO. Using a single-socket Intel Xeon E5-2697A v4, we serve at 90Gb/s using roughly 35-50% CPU (which will increase as more and more clients adopt HTTPS streaming).
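To illustrate the general idea of letting the kernel move file data straight to a socket, here is a minimal sketch in Python. It is not Netflix's implementation: it uses the ordinary blocking `socket.sendfile()` wrapper (which calls `os.sendfile()` where the platform supports it), not FreeBSD's async sendfile, and the helper names `serve_file` and `demo` are made up for this example.

```python
import os
import socket
import tempfile


def serve_file(sock: socket.socket, path: str) -> int:
    """Send the whole file at `path` over `sock` and return the bytes sent.

    The file contents are handed to the kernel instead of being read into
    a userspace buffer and written back out.
    """
    with open(path, "rb") as f:
        # socket.sendfile() falls back to plain send() if os.sendfile()
        # is unavailable or unsuitable for this socket.
        return sock.sendfile(f)


def demo(payload: bytes):
    """Round-trip `payload` through a temp file and a local socket pair."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    sender, receiver = socket.socketpair()
    try:
        sent = serve_file(sender, path)
        sender.close()  # signal EOF to the receiver
        chunks = []
        while True:
            chunk = receiver.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
        return sent, b"".join(chunks)
    finally:
        receiver.close()
        os.unlink(path)


if __name__ == "__main__":
    sent, received = demo(b"static content\n" * 256)
    print(sent, len(received))
```

The demo keeps the payload small so it fits in the socket buffer without a concurrent reader. A real server would call something like `serve_file` on an accepted TCP connection.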
There is no question that FreeBSD is lacking in a number of areas. For example, device support is a constant struggle.
Regardless of those things, the OpenConnect boxes are doing a pretty small subset of possible server tasks: basically serving static content from disk and updating that content occasionally. This is a task FreeBSD has excelled at basically forever. FreeBSD is a pretty stable target to tweak on as well. Netflix moved TLS bulk encryption into sendfile(), which avoids transitions between kernel and user space by putting more stuff in kernel space, rather than the DPDK method of putting more stuff in user space. They've continued to tweak sendfile, which I imagine helped them get up to nearly 100Gbps out.
I haven't had the pleasure of running a 100G network, but I had to do very little tuning to saturate 10G with FreeBSD on an HTTP download site, and TLS CPU was the primary thing holding me back when moving it to HTTPS. Bulk download was moved away from my team before I got new servers with 2x10G and fancier processors, so I was never able to see if I could saturate that too :(
Kernel bypass will get you line-rate 40Gbps with 64-byte packets.
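For a sense of what line rate with 64-byte packets means in packets per second, here is a back-of-the-envelope calculation. It assumes standard Ethernet framing overhead on the wire (8 bytes of preamble plus start-of-frame delimiter and a 12-byte inter-frame gap); the function name is just for illustration.

```python
PREAMBLE_SFD_BYTES = 8     # 7-byte preamble + 1-byte start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12  # minimum idle time between frames, in byte times


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Maximum frames per second for a given link speed and frame size."""
    wire_bits = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / wire_bits


if __name__ == "__main__":
    pps = line_rate_pps(40e9, 64)
    print(f"{pps / 1e6:.2f} Mpps")  # prints 59.52 Mpps
```

That is roughly 60 million packets per second, or under 17 ns per packet, which is why per-packet kernel overhead matters so much at this size.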
We've been showing IPsec over 40g links during the past week.
https://twitter.com/netgateusa/status/853694461456646144
In a lot of scenarios the big pipe is right at the load balancer so this is useful.
On the distance, there are other specs that support 100Gb up to 20km on there!
Edit: I think you're referring to Omni-Path. It should be clear that Intel has no 100GbE card available. That link is also PCIe 3.0 x8, so it can do a max of ~60Gbps to the host.
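The ~60Gbps figure can be sanity-checked from the PCIe 3.0 numbers: 8 GT/s per lane with 128b/130b encoding. The sketch below computes only the raw link bandwidth per direction; packet (TLP) and flow-control overhead, which it does not model, bring the usable figure down further.

```python
PCIE3_GT_PER_LANE = 8.0           # gigatransfers per second per lane
PCIE3_ENCODING = 128.0 / 130.0    # 128b/130b line encoding efficiency


def pcie3_raw_gbps(lanes: int) -> float:
    """Raw PCIe 3.0 bandwidth per direction in Gbps, before protocol overhead."""
    return lanes * PCIE3_GT_PER_LANE * PCIE3_ENCODING


if __name__ == "__main__":
    for lanes in (8, 16):
        print(f"x{lanes}: {pcie3_raw_gbps(lanes):.1f} Gbps")
```

An x8 slot comes out at about 63 Gbps raw, so a 100G port behind it cannot be fed at full rate. An x16 slot is needed for that.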
Not too unreasonably expensive - roughly in line with a GeForce 1080, and that's after the Official HP Hardware markup. I wonder how much the switch/cabling is...
Do people really do this?! (I don't have DC experience, serious question)
Single-channel 100 Gb/s would mean 2-3 mm long bits. Imagine that :-)
So if you have a 1.5RU 32-port 100GbE top-of-rack switch, it can serve up to 120 servers, leaving two 100GbE ports free for uplink.
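The bit length follows from dividing the propagation speed by the bit rate. A quick check, assuming a single serial 100 Gb/s channel with one bit per symbol, and a group index of about 1.47 for silica fiber (a typical value, used here as an assumption):

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0
FIBER_GROUP_INDEX = 1.47  # assumed typical value for silica fiber


def bit_length_mm(bit_rate_bps: float, refractive_index: float = 1.0) -> float:
    """Physical length of one bit in flight, in millimetres."""
    velocity = SPEED_OF_LIGHT_M_S / refractive_index
    return velocity / bit_rate_bps * 1000.0


if __name__ == "__main__":
    print(f"vacuum: {bit_length_mm(100e9):.2f} mm")
    print(f"fiber:  {bit_length_mm(100e9, FIBER_GROUP_INDEX):.2f} mm")
```

That gives roughly 3 mm in vacuum and about 2 mm in fiber.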
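The 120-server figure works out if each downlink 100GbE port is split into four 25GbE server links with breakout cables. That breakout factor is my assumption about what the comment means, not something stated in it:

```python
def servers_per_switch(total_ports: int, uplink_ports: int, breakout: int) -> int:
    """Servers reachable when every non-uplink port is split `breakout` ways."""
    return (total_ports - uplink_ports) * breakout


def oversubscription(servers: int, server_gbps: float,
                     uplink_ports: int, uplink_gbps: float) -> float:
    """Ratio of total server-facing bandwidth to total uplink bandwidth."""
    return (servers * server_gbps) / (uplink_ports * uplink_gbps)


if __name__ == "__main__":
    n = servers_per_switch(total_ports=32, uplink_ports=2, breakout=4)
    print(n)                                    # prints 120
    print(oversubscription(n, 25.0, 2, 100.0))  # prints 15.0
```

Under that assumption the rack is 15:1 oversubscribed on the uplinks, which may or may not be acceptable depending on how much traffic stays inside the rack.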
The MSA is 5 dB @ 2000m, so where is the additional loss coming from in the OCP spec? Different connectors?
