
ExaLink Fusion – Ultra low latency switch - deadgrey19
http://exablaze.com/exalink-fusion
======
robin_reala
Out of interest, what’s the performance of current switches in this market
segment? I don’t have anything to relate 110ns to.

~~~
deadgrey19
"Latency from 550ns to 2 microseconds"
([http://www.arista.com/en/products/7250x-series](http://www.arista.com/en/products/7250x-series))

"Consistent latency as low as 350ns for all packet sizes"
([http://www.arista.com/en/products/7150-series](http://www.arista.com/en/products/7150-series))

So this is about 3x faster than current devices.
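
To make the "about 3x" figure checkable, here is a quick sketch in Python using only the numbers quoted above. The dictionary and function names are made up for illustration, and the inputs are vendor best-case figures, not measurements.

```python
# Best-case port-to-port latencies quoted in this thread, in nanoseconds.
# These are vendor marketing numbers, not independent measurements.
QUOTED_LATENCY_NS = {
    "Arista 7250X": 550,
    "Arista 7150": 350,
    "ExaLINK Fusion (layer 2)": 110,
}


def speedup(baseline_ns: float, candidate_ns: float) -> float:
    """Return how many times lower the candidate latency is than the baseline."""
    return baseline_ns / candidate_ns


fusion_ns = QUOTED_LATENCY_NS["ExaLINK Fusion (layer 2)"]
for name, ns in QUOTED_LATENCY_NS.items():
    if name.startswith("Arista"):
        print(f"{name}: {ns} ns is {speedup(ns, fusion_ns):.1f}x the Fusion figure")
```

Against the 7150 that works out to roughly 3.2x, and against the low end of the 7250X range it is 5x.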

------
myrandomcomment
So Arista has a switch with an FPGA, the 7124FX. The market for HFT has
crashed. When Arista shipped the 7124S the HFT guys were using the Cisco 4900
at 4µs and the 7124S was at 600ns. It was quite a change. The markets went
crazy. The chip was called Bali, by Fulcrum Microsystems; Intel bought them.
The follow-on chip, Alta, was very very very late. Arista did the 7124SX,
which got the latency down to 500ns. The Bali itself was 300ns but the PHY
chips added the additional latency (in and out). This switch replaced the
7124S but there was not a mad rush to upgrade. Going from 4µs to 600ns was an
order of magnitude, but going from 600ns to 500ns, not so much.

Cisco, who got killed by Arista in this market, spent ~$100M and came up with
the Nexus 3548. It had a "warp mode" that could do ~50ns. This mode, however,
required lots of pre-planning and was fixed.

Market data feeds are multicast. The handoff from the exchange was 1G, or even
as low as 100Mb in the Asian markets back then. If you added up every feed you
could buy it was around 3G. You would never do this on one link. The servers
for looking at the data would join groups to get the feed. The servers used
10G NICs when the traffic load on them was only around ~100MB. Once again, for
latency. Serialization delay was the key here. The market order would go back
up to the market on another path. The idea was the HFT guys would want to
process the data faster than the rest of the people. The link down to them and
the link up to order was the same for everyone. If you went into the NYSE colo
the cable length to the router was the same whether you were in the rack next
to it or on the other side of the datacenter.

Anyway, back to switches. Arista shipped the 7150 around the same time as
Cisco shipped the 3548. It was around 350ns using the Alta chip. The reality
was that this was low enough and the traders started to look for other places
to tweak.
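
To put numbers on the serialization delay point: the time to clock a frame onto the wire is just bits divided by link rate. A minimal sketch, assuming 64-byte and 1500-byte frames as illustrative sizes and ignoring preamble and inter-frame gap (the function and variable names are mine):

```python
def serialization_delay_ns(frame_bytes: int, link_bps: float) -> float:
    """Time to clock one frame onto the wire, in nanoseconds."""
    return frame_bytes * 8 / link_bps * 1e9


# Link rates mentioned in the comment above.
LINKS = {"100 Mb/s": 100e6, "1 Gb/s": 1e9, "10 Gb/s": 10e9}

# Frame sizes are assumptions chosen for illustration.
for frame_bytes in (64, 1500):
    for name, bps in LINKS.items():
        delay = serialization_delay_ns(frame_bytes, bps)
        print(f"{frame_bytes:>5} B on {name:>8}: {delay:>10.1f} ns")
```

A 1500-byte frame takes about 1.2µs on 10G versus 12µs on 1G, so the faster NIC saves far more than the 100ns difference between switch generations, even when the link is nowhere near full.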

Calling this the fastest switch in the world and then printing 5ns is
misleading. It is 5ns as a Layer 1 patch panel. Not really what you want for
market data, which needs multicast, PIM, IGMP snooping, BGP, ACLs, etc. For
the 110ns, is this multicast with all features enabled? Let me know if I
missed a link.

BTW, if you look back to when the 7124S was released, there were others that
built a switch based on the Bali chip, like BNT. An important point to note is
that the chip is line-rate multicast but that is not enough. Processing the
joins/leaves and programming the chip is a function of the software, and that
depends on the quality of the code and the CPU system in the switch. Arista
won because of this. Here is a link to a bake-off from 2010.

[http://www.networkworld.com/article/2241525/virtualization/a...](http://www.networkworld.com/article/2241525/virtualization/arista--blade-win-top-spot-in-data-center-switch-test.html)
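
To illustrate what "processing the joins/leaves" means on the software side, here is a toy sketch of the state a switch has to keep. The class and method names are hypothetical and this is not any vendor's API; real IGMP snooping also deals with queries, timers and router ports, all of which are left out.

```python
from collections import defaultdict
from typing import Dict, Set


class SnoopingTable:
    """Toy control-plane state: multicast group -> set of member ports."""

    def __init__(self) -> None:
        self._members: Dict[str, Set[int]] = defaultdict(set)

    def join(self, group: str, port: int) -> None:
        # In a real switch this is the point where software reprograms the chip.
        self._members[group].add(port)

    def leave(self, group: str, port: int) -> None:
        ports = self._members.get(group)
        if ports is None:
            return
        ports.discard(port)
        if not ports:
            # Nobody is listening any more, so drop the group entry.
            del self._members[group]

    def egress_ports(self, group: str, ingress_port: int) -> Set[int]:
        # Forward to every member port except the one the frame arrived on.
        return self._members.get(group, set()) - {ingress_port}


table = SnoopingTable()
table.join("239.1.1.1", port=3)
table.join("239.1.1.1", port=7)
table.leave("239.1.1.1", port=3)
print(table.egress_ports("239.1.1.1", ingress_port=1))
```

The chip forwards at line rate once the table is programmed; how quickly the table follows a burst of joins and leaves is down to the software and CPU, which is the point being made above.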

~~~
coreyoconnor
I used to work for Fulcrum Microsystems. Super fun. Your summary of the market
around that is great. Thanks! I had not thought about them in a while.

The technology involved in enabling 64 octet frames to be shoved around in
300ns is fascinating. The software control of these systems was just as
fascinating. These chips had a high level of programmability in the frame
handler. How to use that programmability was an open question when I left
Fulcrum.

~~~
sargun
What happened to Fulcrum? I know Intel bought them. The FM6000 is still their
top of the line chip. When can I get a new switch that's competitive to T2,
with FlexPipe?

~~~
wmf
There are tidbits of info about the FM10000 going around but I haven't seen
speeds and feeds. It seems to integrate the NICs into the switch; I'm not sure
what the point of that is.

------
bio4m
The really interesting bit for me is that there's a plugin module with an FPGA
that can be programmed by the end user.

Which means simple apps won't even have their traffic leave the switch. I can
think of a ton of uses for that, especially from a security perspective.

~~~
lrm242
Exablaze also sells NICs with an FPGA, as does Solarflare. You can get an FPGA
in an Arista switch as well. Very cool stuff indeed.

~~~
deadgrey19
I don't think Arista sells that device anymore. Having said that, the Arista
device (7124FX?) only had the FPGA on 8 out of 24 ports and the latency
through the transceivers was pretty terrible. Very few people took it up.

The Solarflare device had (has?) the FPGA in a strange place, behind the NIC
controller. AFAIK, the Exablaze NIC is pure FPGA which again saves on latency.

------
crxgames
And the high frequency trading guys just busted out the checkbooks.

------
deadgrey19
Exablaze today announced at the London STAC Summit that the company has
introduced the world’s fastest network switch and application platform, the
ExaLINK Fusion. The ExaLINK Fusion performs conventional layer 2 switching at
approximately 110 nanoseconds latency and layer 1.5 switching at 100
nanoseconds, significantly faster than any existing switching device. The
ExaLINK Fusion preserves the sub-five nanosecond layer 1 switching fabric and
related capabilities of its industry-leading ExaLINK 50 device, and adds layer
2 switching functionality implemented within a Xilinx UltraScale FPGA. The
layer 1 switching fabric is used as a central connection point for front panel
line cards and internal application-specific modules.

------
nomnombunty
I have doubts about how useful the ExaLink Fusion is in practice. Many
exchanges require at minimum a layer 3 switch to terminate at the cross
connect. In those cases, you cannot directly connect an ExaLink switch to the
exchange.

I am quite surprised that no one mentioned the Cisco Nexus 3548. Switching at
L2 with 110ns latency is not that impressive considering that the Cisco Nexus
3548 switches packets (L2/L3) in 50ns (with warp span turned on):
[http://www.cisco.com/c/en/us/products/switches/nexus-3548-sw...](http://www.cisco.com/c/en/us/products/switches/nexus-3548-switch/index.html)

~~~
deadgrey19
I think this is what they mean when they call it a "layer2+" device.

110ns is the device in its capacity as a full layer 2(+) switch. The "warp
span" equivalent would be using the layer 1 broadcast groups functionality,
which runs at 5ns, 10x faster.
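
As a rough mental model of a layer 1 broadcast group: the crosspoint just wires one input to a fixed set of outputs and never looks at the packets. The sketch below is a toy model with made-up names and an arbitrary port count, not the actual ExaLINK configuration interface.

```python
class Crosspoint:
    """Toy layer 1 crosspoint: each output is wired to at most one input."""

    def __init__(self, num_ports: int) -> None:
        self.num_ports = num_ports
        self._source_of = {}  # output port -> input port feeding it

    def _check(self, port: int) -> None:
        if not 0 <= port < self.num_ports:
            raise ValueError(f"port {port} out of range")

    def broadcast(self, input_port: int, output_ports) -> None:
        """Replicate everything arriving on input_port to all output_ports."""
        outputs = list(output_ports)
        for port in [input_port, *outputs]:
            self._check(port)
        for out in outputs:
            # Rewiring an output replaces whatever was feeding it before.
            self._source_of[out] = input_port

    def outputs_for(self, input_port: int):
        return sorted(o for o, i in self._source_of.items() if i == input_port)


fabric = Crosspoint(num_ports=48)  # port count is arbitrary here
fabric.broadcast(input_port=0, output_ports=[1, 2, 3])
fabric.broadcast(input_port=4, output_ports=[3])  # output 3 now follows input 4
print(fabric.outputs_for(0))
```

There is no MAC table or per-packet lookup in this model, so the fan-out is fast but static: it only changes when someone reconfigures the wiring.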

------
patrickg_zill
So, can high-end 10GE cards actually make use of the lower latency? Another
comment says that 350ns is the current best and this is faster... but how much
slop is there in the 10GE standard? What I mean is, if the latency is less
than the standard timing allows, it may not offer a practical benefit.

~~~
KaiserPro
You can also pipe 10GigE over InfiniBand, which is pretty cheap.

That has the advantage of quicker RDMA than 10Gig Ethernet, so while the
switching may be slower, the processing is faster because data is piped
directly into memory.

[http://www.mellanox.com/page/performance_infiniband](http://www.mellanox.com/page/performance_infiniband)

~~~
deadgrey19
Ethernet now also supports RDMA via the RoCE standard. No idea how it compares
to InfiniBand RDMA, but it has been heavily exploited in this project from
Microsoft Research:
[https://www.usenix.org/system/files/conference/nsdi14/nsdi14...](https://www.usenix.org/system/files/conference/nsdi14/nsdi14-paper-dragojevic.pdf)

~~~
justincormack
Plus they mention it in their opencompute nodes so it must be shipping
[https://gigaom.com/2014/10/30/microsoft-tweaks-its-server-sp...](https://gigaom.com/2014/10/30/microsoft-tweaks-its-server-specs-as-part-of-open-compute-project/)

------
electic
Pretty impressive. Does anyone know about availability and the pricing for
this?

~~~
deadgrey19
They said in the presentation (paraphrasing) "Evaluation units available late
2014, retail units available early 2015". My impression is that pricing will
be similar to other high-end switches e.g. $20-30K.

------
sargun
Any idea what the forwarding plane is made of in these switches?

~~~
otherdude438
Vitesse crosspoint ASIC and Xilinx FPGA

~~~
matthurd
Yeah, besides the Xilinx FPGA in the module, the PCB interconnect would have
to be specifically the Vitesse Crosspoint VSC3144XHR-12
[https://www.vitesse.com/products/product/VSC3144](https://www.vitesse.com/products/product/VSC3144)

and you can see from the product brief
([https://www.vitesse.com/products/download.php?fid=4548&numbe...](https://www.vitesse.com/products/download.php?fid=4548&number=VSC3144))
many people use it in telco oriented gear for back-planes and the like.

Here is my take on why you shouldn't buy an ExaNIC from Exablaze
([http://meanderful.blogspot.com/2014/11/dont-buy-exanic-from-...](http://meanderful.blogspot.com/2014/11/dont-buy-exanic-from-exablaze.html))

Just remember that I'm very biased...

--Matt.

