Mellanox has apparently been under activist investor pressure to reduce its R&D expenses and pay more dividends. And then there were the rumors that Intel was interested, but apparently Nvidia offered more in the end.
From an HPC perspective I think it's good Nvidia got the deal. Intel is already quite a dominant force in that market, and if they had gotten the deal it wouldn't have surprised me if they had simply sunsetted it in favor of their own Omni-Path (which they could then develop at a leisurely pace due to lack of competition).
Though as I have mentioned before, I do wonder about the long-term prospects for Infiniband as a technology. Modern high-end ethernet does many of the same things with RDMA (RoCE), though I believe IB still has a latency advantage. And multipathing with ethernet is weird, seems both Trill and SPB are kind of dead, and most players seem to do multipathing at the L3 level (which might not be good for latency?). And in contrast to ethernet, IB is pretty much a single-player technology nowadays, so is the market big enough to bear the R&D costs to keep developing it?
Ethernet tooling for HPC has a ways to go, but I suspect in the future it will be more competitive. Especially if specialty fabric vendors cut down on R&D.
CLOS fabric designs seem to be winning the war these days which I think favors Ethernet in the long run. Better flow distribution on aggregate links and now widespread support for MC-LAG means you can build a really wide CLOS network with L2-only.
That means that these switches, while fast, cannot check the packets for correctness (they don't have the full packet). That they will have "aborted" packets. That in some important ways these networks have the problems of the "half-duplex" networks of old.
Broadcom focuses on features for packet transmission. That means these Mellanox switches are pretty much restricted to situations where you want to have a set of servers on a single network segment and nothing else (not even an upstream connection). If that's exactly what you need, great. But mostly you're going to need more.
Your information may be old; Mellanox has pretty much the same feature set as Broadcom now.
Also, you can run cumulus on the switch, which is pretty awesome.
But yes, seems EVPN + VXLAN is the way the industry is going nowadays to build eth CLOS fabrics, whereas Trill & SPB seem more or less dead, for some reason.
Everybody is saying ethernet is simpler to manage than IB, but IME at least for HPC the opposite is true. IB is more or less plug and play: you get RDMA, multipathing etc. right out of the box. Whereas if you wanted to set up an equivalent thing with ethernet, you'd have to set up DCB, RoCEv2, EVPN+VXLAN+BGP (or something equivalent).
Intel is far worse to deal with, and additionally engages in anti-competitive architectural wars, preventing other vendors from interfacing with the CPU bus.
As a result we have NVLink, and now we will soon have official IB cards with NVLink ports, and probably ARM cores too.
Huh? Nvidia is by far the worst option when it comes to GPUs if you want to run Linux. Both Intel and AMD manage to have excellent open source drivers, while Nvidia's is a proprietary mess that everyone complains about and doesn't work all that great with typical distro update mechanisms.
IB is actually easier to reason about and debug than DCE, but obviously a different community of practice.
But I think this is the way the market is going in the longer term.
What I'm getting at is setting up clusters larger than what you can fit behind a single switch. So you'll want e.g. a CLOS fabric with multipathing (the typical IB setup, FWIW). As Trill and SPB seem pretty dead, it seems the momentum is to do the multipathing at the L3 level, using the aforementioned EVPN+VXLAN+BGP, or something similar.
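For anyone unfamiliar with what "multipathing at the L3 level" usually means in practice: it is typically ECMP, where a leaf switch hashes each flow's 5-tuple and uses the result to pick one of several equal-cost uplinks. A minimal sketch follows. The `Flow` type, the `pick_next_hop` helper and the spine names are made up for illustration, and CRC32 is only a stand-in for whatever hash a real switch ASIC implements.

```python
import zlib
from typing import NamedTuple, Sequence


class Flow(NamedTuple):
    """The classic 5-tuple that identifies a flow."""
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


def pick_next_hop(flow: Flow, next_hops: Sequence[str]) -> str:
    """Pick one equal-cost next hop by hashing the flow's 5-tuple.

    CRC32 is an assumption made for this sketch; real hardware uses
    its own (often configurable) hash function.
    """
    key = "|".join(str(field) for field in flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]


# Hypothetical spine switches reachable from one leaf at equal cost.
SPINES = ["spine1", "spine2", "spine3", "spine4"]

if __name__ == "__main__":
    # Protocol 17 is UDP; 4791 is the UDP destination port used by RoCEv2.
    flow = Flow("10.0.1.5", "10.0.2.9", 17, 49152, 4791)
    print(pick_next_hop(flow, SPINES))
```

Every packet of a given flow hashes to the same uplink, so ordering within the flow is preserved, but the choice is static: a few large flows can land on the same link regardless of load.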
Or maybe you could do one "provisioning and admin" VLAN that spans the entire cluster and uses spanning tree, while the high-performance RDMA stuff uses per-leaf VLANs and L3 multipath routing? Is that simpler and better performing than EVPN + VXLAN?
What is the routing latency on such BGP setups BTW? I find it hard to imagine you can get even close to eth (not to mention IB!) L2 latencies. Or can the fast paths be done in hw (or FPGAs)?
In most ASICs everything is the same latency since packets go through the whole pipeline whether they use all the functionality or not. Anyway, plain routing would have to be as fast as or faster than VXLAN encap + routing.
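One thing that can be put in numbers is the extra data VXLAN puts on the wire. The sketch below builds the 8-byte VXLAN header as laid out in RFC 7348 and adds up the usual encapsulation overhead, assuming an IPv4 outer header with no options and no outer VLAN tag. The function names are made up for this example, and it says nothing about the ASIC pipeline latency itself, only about serialization time on the link.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag from RFC 7348

# Header sizes in bytes (assumption: IPv4 outer header without options,
# no outer 802.1Q tag).
OUTER_ETHERNET = 14
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: 8 flag bits followed by 24 reserved bits.
    # Word 2: 24-bit VNI followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encap_overhead_bytes() -> int:
    """Bytes added in front of the original Ethernet frame."""
    return OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER


def serialization_ns(num_bytes: int, link_gbps: float) -> float:
    """Time to put num_bytes on the wire; bits divided by Gb/s gives ns."""
    return num_bytes * 8 / link_gbps


if __name__ == "__main__":
    extra = encap_overhead_bytes()
    print(f"{extra} extra bytes, {serialization_ns(extra, 100):.1f} ns at 100 Gb/s")
```

Under these assumptions the encapsulation adds 50 bytes per packet, which is about 4 ns of wire time at 100 Gb/s.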
Their ethernet accelerator VMA stands for "Voltaire Messaging Accelerator".
Later Marvell went on to acquire Cavium for 6 billion dollars with the aim of building an infrastructure company. Though from the company's latest earnings release it seems that the deal isn't really a good one.
Edit: sorry somehow glossed over your mention of OmniPath, didn't mean to restate what you already said.
Again, I am surprised it’s valued at only $6.9Bn.
Pardon my ignorance, but how come mobile apps and websites get valued at 10+ or 20+ Bn dollars, while someone who creates real technology is valued at only $6.9Bn?
Imagine a building, say a shopping center, airport, or a city main square, that gets the same number of visitors per day as some of those apps, and it might start to make more sense. Take Clash of Clans for example, valued at $10 billion. It has around 100 million daily active players. One of the busiest airports, the Hartsfield-Jackson Atlanta International Airport, has 104 million passengers annually. Times Square in New York receives about fifty million visitors per year.
If something, anything, gets 100 million visitors PER DAY, the value of such real estate, even if digital, is immense.
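To make the daily versus annual comparison concrete, here is the arithmetic as a small sketch. The visitor counts are simply the figures quoted in this comment (not independently verified), and the helper names are made up for the example.

```python
# Figures as quoted in the comment; they are not verified here.
APP_DAILY_USERS = 100_000_000              # Clash of Clans daily active players
AIRPORT_ANNUAL_PASSENGERS = 104_000_000    # Hartsfield-Jackson Atlanta
TIMES_SQUARE_ANNUAL_VISITORS = 50_000_000

DAYS_PER_YEAR = 365


def per_day(annual_count: int) -> float:
    """Convert an annual visitor count to an average daily count."""
    return annual_count / DAYS_PER_YEAR


def ratio_to_app(annual_count: int) -> float:
    """How many times more daily visitors the app has than the venue."""
    return APP_DAILY_USERS / per_day(annual_count)


if __name__ == "__main__":
    venues = [
        ("Atlanta airport", AIRPORT_ANNUAL_PASSENGERS),
        ("Times Square", TIMES_SQUARE_ANNUAL_VISITORS),
    ]
    for name, annual in venues:
        print(f"{name}: {per_day(annual):,.0f} per day, "
              f"app is about {ratio_to_app(annual):,.0f}x larger")
```

With these numbers the app sees roughly 350 times the daily traffic of the airport and 730 times that of Times Square.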
There are also different ways in which you can use the data to help you create another app with high daily active players. An app can have the whole world as their oyster, unlike shopping centers, etc.
To the point of this article, it's kind of surprising to me at least that licensing companies like ARM and Qualcomm have been so much more successful than Mellanox.
In contrast, Mellanox produces a niche product. It is critical to HPC and some deep pocketed finance people, but it is never going to be at the volume of e.g. ethernet.
The long term aspect is that, as Moore's law scaling becomes harder and harder, distributed computation becomes more essential. Right now we are in a dip in peak compute rate requirements, because we have invented battery powered computers and 4G, and hence cloud. But the work done in the cloud isn't very taxing, mainly single threaded. I'm confident we'll soon be seeing a lot heavier parallel computation in the cloud, and stuff that won't fit into one or four GPUs. The tradeoffs between PCIe and IB after 4*16x start to favour IB, especially if the IB silicon is on the NVLink switch complex. So from the perspective of integration bringing much greater IB volume, the acquisition multiplies the worth of MLNX by an order of magnitude -- if nVidia can execute.
At one point, 6 of the top 10 supercomputers in the top 500 were interconnected with Quadrics, and over 1/3 with Myrinet. However, Quadrics and Myricom are both long gone. And neither of them sold for anything close to $6.9B.
I worked for Myricom when Mellanox was starting out with IB. I recall stories from that time of customers that would try IB for something like 1/2 to 1/3 of what we charged. But they could never get it to work, and ripped it out and installed Myrinet. Sadly, this made our management smug. Mellanox eventually ate their lunch because we never responded to their marketing and pricing war.
I don't think it was just the price or marketing (although Mellanox at least had sales people -- never heard from anyone at Myricom even after buying $$ of gear). Mellanox always seemed to be pushing the envelope, which might have been more R&D dollars. Never had problems with IB not working (CX-4 and up) but I guess the gremlins depend on scale.
Can you name many apps and websites that closed (aka IPO or sold, not funding round) at 10+/20+ billion? Unless you're going to go for the stupid "uber is an android app my kid could make for their senior project" shtick, I think most 10b+ apps are backed by _real_ technology
(Not interested in playing the no true technology game, just pointing out that snapchat is the highly valued mobile app that comes to my mind)
(https://engineering.linkedin.com/open-source as a starting point)
This is more like a race car company not being as valuable as Toyota. Yes, the technology of the race car is unbelievable, but there are only a few thousand race cars in the whole world and there's a Toyota in every household.
The market just has a big dissonance in how it estimates and discounts those future earnings. Amazon is seen as a future huge profit maker, while Apple is viewed as perpetually at risk of a large scale decline in its recent profitability.
Sure, the Nvidia driver is closed-source and a pain to work with for OS developers, but for the use-cases it's designed for (CUDA etc), it's far and away the best-in-class on Linux.
To my knowledge, there are no systems in the TOP500 running AMD chips or GPUs. Intel has some competition in the CPU space (POWER series, some ARM, etc) but if GPUs are in those systems, they're Nvidia.
It's unfortunately a sad truth.
CUDA won, and it is now the de facto standard for almost every application that runs on GPUs. Nvidia succeeded in locking the entire HPC community into their bloated, badly maintained, crappy software stack, and this is very regrettable.
Any admin / integrator who has had to deal with Nvidia bloatware under Linux hates it, and for very good reasons.
Biggest issue with AMD on Linux right now is that they sometimes seem to forget to fully enable support in patches before release. Like the RX 590 had to have firmware updates post release because they forgot to do everything I guess.
nVidia GPUs were always recommended over AMD because their support was significantly better before the mainline Radeon/Radeon si/amdgpu drivers really started being great. nVidia will still run better now, but the benefit of the open source driver ecosystem outweighs that for me.
Wanted to try out SwayWM, but they don't work around how nvidia handles things in comparison to what everyone else does.
Works in Ubuntu, but could not for the life of me get 2 monitors working in Arch.
How it works with open source drivers is that you mainline your drivers so that the kernel maintainers maintain the drivers for you, for free. Choosing to keep your drivers closed source means committing to keeping your drivers up-to-date with changes in the kernel, or writing an open-source shim that does that. Which approach is more "fragile"?
Graphics card drivers happen to be very complex beasts, somewhere along the stack they do need a compiler for multiple custom and often proprietary architectures, which most definitely no-one will maintain for free for you. There is a huge incentive to keep most of the code platform independent and only maintain a minimal kernel specific component. This part of the code base is a comparatively trivial part, essentially the kernel should get out of the way as much as possible. The kernel specific abstractions like KMS are examples of such comparatively trivial things.
Not sure what you mean by "maintain for free". Full coverage testing is not free.
However, they're hot and need serious juice to run, so you cannot just shove 36 of them to a rack and just power them on.
At NetApp we were an early customer of Mellanox (I told the founder that their name sounded like a poison gas :-)) which Steve Kleiman claimed implemented Infiniband in anger. It was a good technology for the clustering team. Later as they grew and diversified into ethernet switches we bought a couple of their big core switches at Blekko. And at the current company we use their 40g network adapters to connect to high speed SDR hardware.
So now they are going to be part of Nvidia.
I get that this helps Nvidia in being more data center centric, but does it help them build better machine learning architectures? It does seem to be the only system that benefits from custom hardware more than the cost of that hardware. It seems that loosely coupled shared nothing clusters are not good machine learning back ends.
If they want to continue building large GPU accelerated workloads, pairing more tightly with networking seems like an obvious move.
Well.. "almost every important company in the 3D area filed lawsuits against NVIDIA" :)
It would be interesting to know what other computing technologies looked like 20 years ago. Is there any good place to find that online?
Firingsquad was a big one. It was originally started by a guy who won John Carmack's Ferrari in a Quake tournament, if I remember the story right: http://web.archive.org/web/19990101000000*/firingsquad.com
Others you may want to look at:
tomshardware -really great coverage around intel's Rambus RDRAM debacle around 2000.
Aces Hardware- this site was really in depth for the time, but it updated infrequently before just stopping altogether around 2004. https://www.aceshardware.com/ Their stuff is still online.
Those were the main sites that I would check that I can remember...
They also have some sort of a parallel VLIW CPU architecture that they've been trying to get off the ground for a while now called TILE/TILE64 so that might also play into things.
However since NVIDIA opened their offices in Israel a while ago they might simply be looking for an acquihire since Mellanox is a fabless semi chip maker it kinda fits that also.
Mellanox has driven IB speeds for more than a decade, limited only by PCIe bandwidth. Since they've had NICs that do both IB and Ethernet, they've been driving the ethernet market as well. We've been using their 100G adapters since 2015 (when they were first to market by a big margin). Even today, there are only a handful of vendors that can deliver a 100g NIC. I worry that if Mellanox stops driving port speed, we'll see a slower increase in the speed of NICs due to the lack of competition (eg, 400g will take longer..).
IB still has lower latency than Ethernet at least on paper especially when it comes to RDMA but I don't know how much of an issue that is for these applications.
But overall I'm not sure how much it matters to Mellanox since they are also the ones who are making the high speed Ethernet switches and host adapters.
You also need to use their cables and transceivers (or a similar alternative) for these speeds; it doesn’t matter whether you are using Ethernet or IB.
I had a TileGX dev board and ported our product at the time (nearly 10 years ago). It was an ok arch but that’s a tough niche to fight for.
I’ve worked on several projects with them, and found they generally do a good job of feeding the beast when the OS and drivers are properly tuned.
I’m not informed enough to call good or bad, but will instead say it’s interesting, especially in the HPC space (and the emerging AI space).
Mellanox has also embraced bringing RDMA to things like Ceph and working with the broader vendor ecosystem like Red Hat for using this in production.
I hope Nvidia doesn't taint the good reputation of this company.
They are still the company to go to for infiniband, but infiniband has lost much of its appeal for tasks that are not true supercomputing.
Ethernet nowadays can do RDMA, soft guarantees on latency, in-order and reliable delivery at lower costs, and an option to reuse existing L2 networks. Mellanox has squeezed the infiniband cow dry.
And who did the Ethernet RDMA protocol (RoCEv1/v2), and who sells the RDMA-compatible NICs that everyone in the HPC world currently uses?
Hint: the name starts with Mella, too.
I have no idea what it is, actually. Some ideas that may or may not matter (or might not even be correct):
- IB is a couple of decades younger, so it could benefit from accumulated knowledge of how to do fast protocols. (Not an explanation per se)
- Simpler forwarding. In IB the subnet manager gives out the LIDs that are used for routing within a subnet. They are shorter than an eth MAC (16 vs. 48 bits), so the lookup circuits in the switches can be smaller and faster(?), and also since the LIDs are assigned by the subnet manager rather than being burned in at the factory, they can be distributed taking into account the subnet topology, allowing switches to use LID Mask Control (LMC) filtering. Similarly, all routes within a subnet are calculated statically a priori by the subnet manager (load balancing among multiple paths is only static round robin, not dynamically load dependent), and don't have to be calculated on the fly by the switches.
- FEC rather than retransmission in case of corruption.
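The LID forwarding idea in the list above can be sketched roughly as follows: the switch holds a plain table indexed directly by destination LID (no hashing or learning), and with an LMC of n a port answers to 2^n consecutive LIDs, which the subnet manager can map to different output ports to get static multipathing. The `Switch` class and its method names are invented for this illustration and this is not how any real subnet manager is structured; the unicast LID range and the 0..7 LMC range are as I understand the InfiniBand spec.

```python
from typing import List, Optional, Sequence

UNICAST_LID_LIMIT = 0xC000  # unicast LIDs run from 0x0001 to 0xBFFF


def lids_for_port(base_lid: int, lmc: int) -> List[int]:
    """Return the 2**lmc consecutive LIDs that address one port.

    The base LID must have its low `lmc` bits clear.
    """
    if not 0 <= lmc <= 7:
        raise ValueError("LMC is a 3-bit value (0..7)")
    if base_lid & ((1 << lmc) - 1):
        raise ValueError("base LID must be aligned to 2**lmc")
    return [base_lid + i for i in range(1 << lmc)]


class Switch:
    """Toy switch with a linear forwarding table indexed by destination LID."""

    def __init__(self) -> None:
        self.lft: List[Optional[int]] = [None] * UNICAST_LID_LIMIT

    def program(self, base_lid: int, lmc: int, out_ports: Sequence[int]) -> None:
        """Done ahead of time by the (hypothetical) subnet manager:
        spread a destination's LIDs over output ports, static round robin."""
        for i, lid in enumerate(lids_for_port(base_lid, lmc)):
            self.lft[lid] = out_ports[i % len(out_ports)]

    def forward(self, dlid: int) -> Optional[int]:
        """Per-packet work is a single array lookup."""
        return self.lft[dlid]


if __name__ == "__main__":
    sw = Switch()
    sw.program(base_lid=0x10, lmc=2, out_ports=[1, 2])
    for dlid in lids_for_port(0x10, 2):
        print(hex(dlid), "-> port", sw.forward(dlid))
```

A sender picks a path by choosing which of the destination's LIDs to address, so the switches themselves never make a load-dependent decision.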
For everything else, RDMA on Ethernet buys you the ability to reuse your L2, and this matters way way more to people running DC businesses than anything else.
You can do that today with Cumulus Linux, on Mellanox switches, no less. Switches from other manufacturers as well:
Also, there are multiple solutions to run Linux on switches, most notably Cumulus Linux and VyOS
I just hope Nvidia does not change the company much. For example Netflix's Open Connect Appliance, if I remember correctly, was running on FreeBSD + Mellanox 100Gb NICs. All because of their top-notch FreeBSD drivers.
The answer: GPUoF like NVMeoF
Now NVidia only needs to buy Xilinx...
I'm not sure it would be in anyone's interest for both to be acquired by Intel, even if IB isn't very relevant anymore outside of supercomputing.
Cavium bought QLogic in 2016, Marvell in turn bought Cavium in 2018.
This acquisition has big implications for HPC in particular.
IMO, more of the issues with these advanced networking protocols is latency - something that is usually more addressable via protocol dev than just hardware. Would be very interesting to see nVidia try to acquire Xilinx and supercharge their "Aurora" low overhead transfer protocol...
Is it, really?
FPGAs are now all over HPC, the 5G rollout is going to put a LOT of really expensive FPGAs in new towers, and high-end networking gear is about to get a bump due to PCI going up.
I don't know how much damage the Intel acquisition did to Altera, but if Altera has been AWOL with big customers due to the Intel acquisition, Xilinx may have a massive amount of customer wins locked in for a very long time.
I wish Xilinx would die in a hot flaming pit of Hell for many and various reasons, but they are the 500lb gorilla of FPGAs.
My original comment reacted to nVidia acquiring Xilinx and I am trying to explain why that won't happen: it's too expensive.
Furthermore, Palestinians in the West bank regularly talk to me about their situation, because I'm what you would call a "settler" and I buy in their towns, and I pick them up hitchhiking, and I talk to them with no borders. They all have family in Jordan and will happily tell you how much better their "occupied" life is than their Jordanian family. And yes, I've been to Jordan and I've been to Egypt and I've been to Lebanon (albeit in uniform on that one).
Dislike Israel's policies as much as you want, I'm unhappy with many of them as well. Hate our PM, you'll find good reason to. But there is no need to lie or put spin on the fact that 99% of Israelis have no qualm with Palestinians, nor do they pay them less for equal labour. Likewise, 99% of Palestinians have no qualm with Israel or Israelis, and want (like us) to work hard, come home and love their children, and live in peace with their neighbours.
Funnily enough, I also have family in Jordan. Let me tell you, their lives are much better than those under Israeli occupation.
Another poster here did mention that the fine article mentions that Mellanox can hire three Palestinians for the price of one Israeli. That is not typical of the Israeli high tech sector. It _is_ typical of some other labour markets, notably building and agriculture.
Note that I'm specifically referring to the West Bank. I'm sure that the situation in Gaza is so much worse that in my worst dreams I cannot imagine it. Where is your family from? I'm in Eshkolot, in the south Hebron hills, right next to Al-Ramadin.
Even if completely wrong, the GP is pretty clearly posting in good faith (as you are too, I'm sure).
As an alternative, sharing some details of your position would be more interesting and helpful. Only if you want to, of course.
> "For the price of one Israeli engineer, an [Israeli] company can hire three Palestinians in the West Bank"
If you look at Israeli public opinion polls from the past ten years or so, you get a very different view. 30% feel hatred when they hear Arabic spoken in the street, 50% say they would refuse to work at a job where the direct supervisor was an Arab.
I'm sure you are aware that you didn't give a counterargument.
Maybe moderate people support bigoted politicians for some reason and politicians think that bigotry gets votes, I don't know. But it surely doesn't look good.
Using all of these new media channels under his absolute or partial control he galvanised the populace against the treacherous left, and perpetuated a very strong siege mentality (which we Jews are, understandably, prone to) while positioning himself as the only viable leader capable of fending off all those who wish to destroy Israel (which is easy, because there are plenty of people actually seeking that end)
The last few years he really upped his game by talking directly to the public through Facebook and WhatsApp (because the media is apparently against him...) and using much more crude language, a la Trump. This new media operation is managed by his son.
He and his far right allies also orchestrate a public campaign against the so called left junta in the high court. This is done via his affiliated media, and by bringing about blatantly populist non 'constitutional' laws that the court rejects, which he can use as proof for the 'leftist' court.
(We don't have a formal constitution, but we have something similar)
Similarly, he also promotes the idea that a leftist junta is controlling the media, the police (after his appointed chief of police didn't curtail investigations against him) and all public bureaucrats. Oh and his son is posting antisemitic memes depicting Soros copied from the American alt right. So yeah, the Israeli PM is using classic antisemitic tactics against the Israeli left. (We live in weird times... :( )
But the employment rates are high, TV is entertaining and the economy is stable, so most people can easily buy into the idea that only Netanyahu can lead us.
So if you'd asked me a few years ago, I'd have said the people of Israel are moderate, but fearful for their lives. Since then, the constant hammering of propaganda has been corroding some basic democratic values. Younger people are less moderate, and paradoxically are more prone to the siege mentality (although Israel is stronger than ever military wise) and nationalistic chauvinism. So I don't know if the future is bright. We still have a vocal opposition, and a moderate (albeit leaderless) majority, so I hope when he leaves the scene we'll manage to come back to some sane discourse, but I won't bet on it.
Note that the polls suffer from "survivor bias": the people answering the polls are those who feel that they need to shout their opinion to the world. I don't answer polls, and with elections coming I get several SMS polls per day sent to my phone.
30% feel hatred when they hear Arabic? Who are they polling?!? There is absolutely no way that is true for the general population. You cannot go anywhere in Israel and not hear Arabic.
50% would refuse to work under Arab supervision? Again, this is ridiculous. I don't think money-sucking Jews would refuse to work under Hitler.
Got curious and googled. Couldn't find that last statistic, but it seems 58% (give or take) of residents in the city of Ashkelon were in favor of terminating public works projects (specifically construction of bomb shelters at kindergartens) where Arab workers were employed.
Whether this is an effective proxy for all of Israel or even for the question at hand is unknown to me, and it probably isn't the case considering Israeli AG Yehuda Weinstein warned the city's mayor not to execute the decree, but I'm not terribly focused on the decree itself as much as I'm focused on the population backing it. 58% of a city would appear to be fearful of an employed population because of their ethnic heritage... that's terrifying.
(p.s. it seems mayor Itamar Shimoni ultimately backed off the move, possibly one of the rare cases where an elected public official made a more well-informed decision than what was desired by the official's constituents. https://www.jta.org/2014/11/23/israel/ashkelon-mayor-decides... )
Disclosure: I'm an American-born person of Persian descent, though I'm likely unable (and certainly unwilling) to revisit Persia/Iran under the current regime.
Often polls will conflate the terms Arab, Muslim, Palestinian, Gazan, and a few other words. You will see that they will ask a question that is interpreted to the poll taker as "would you agree that Gazans who have bombed Ashkelon should be forbidden to work in Ashkelon" and then reported as "Ashkelon residents in favor of terminating construction of bomb shelters at kindergartens where Arab workers are employed".
The polls are _designed_ to present a specific picture, they are not designed to inform about the true nature of the situation. Just ask yourself, why are so many polls being taken, what is their purpose. You know as well as I do that there are no disinterested parties here, everybody has an agenda. At least I state my agenda and position clearly.
You'll also be surprised to know that I know not a single Israeli, not one, who has any qualm with the Iranian people. We're terrified of their nuclear program, but we remember the days of friendship between our countries and a significant portion of Israelis are of Iranian descent. Iran pretty much attacks us via proxy today (Hezbollah), but we see that as a manifestation of their current religious regime and not as representative of the Persian people.
Ah, the "No true Scotsman" defense.
Not just "you"; everyone. It is illegal. Even the UN has called for an end to this practice. Most countries don't recognize Israel's occupation of the West Bank. The UNSC has condemned the practice. Israel routinely destroys Palestinian homes and villages for the benefit of you people (the "settlers").
Read more about "settlers": https://www.btselem.org/topic/settler_violence
> "You can't say your PM is a jerk, but not the people because the people voted for him (or at least his party)."
No, it's not the people. It's some people that voted Likud in. Much like some Americans voted Trump in. Where's the strawman?
30% of Americans even today support Trump. So one would be right to claim that 30% of Americans are jerks.
“'For the price of one Israeli engineer, an [Israeli] company can hire three Palestinians in the West Bank, and they have very high motivation'”
Likewise, the claim that only 1% of Palestinians and Israelis have any qualms with one another has no backing in any data or in real life. It isn't true at all, and there are actually polls, maybe studies, showing that it isn't true. Too much spin.
Would be cool to hear from a Palestinian living there as well
Gaza is literally a concentration camp that is extremely hard to get out of. Moreover, the Israelis block imports of food and building supplies, essentially trying to starve the inhabitants to death.
What OP is saying is similar to saying that black people loved slavery and then segregation, because their masters were so fair and kind to them.
I wonder if it can even ever be resolved.
From Human Rights Watch : Israel maintains entrenched discriminatory systems that treat Palestinians unequally. Its over half-century-long occupation of the West Bank and Gaza involves systematic rights abuses, including collective punishment, routine use of excessive lethal force, and prolonged administrative detention without charge or trial for hundreds. It builds and supports illegal settlements in the occupied West Bank, expropriating Palestinian land and imposing burdens on Palestinians but not on settlers, restricting their access to basic services and making it nearly impossible for them to build in much of the West Bank without risking demolition. Israel’s decade-long closure of Gaza, supported by Egypt, severely restricts the movement of people and goods, with devastating humanitarian impact. The Palestinian Authority in the West Bank and Hamas in Gaza both sharply restrict dissent, arbitrarily arresting critics and torturing those in their custody.
Note that in the West Bank itself, there are not many opportunities for the children to play together if the parents do not already know each other, because each side is wary of the other. So much of that interaction happens on the other side of the Green Line, with Arab children that do not live in the West Bank.
Assuming this were true, how exactly is it a good thing?
But I think it's a good thing. People working together is one of the best ways to get a better understanding of people you would normally never meet. I've met a ton of different people through tech and it's one of the best parts about it. Meritocracy.
Well, except for the airspace and territorial waters, and enforcing its security zone on the Gaza side of the Oslo Accord demarcation line.
From what I read* about adobedtm.com it's not necessarily advertising.