
Sad in a way, but no surprise. I recently summarized my opinions on Hacker News[1] in response to a question about why Netflix uses Linux instead of Solaris, which might be of interest here:

"I worked on Solaris for over a decade, and for a while it was usually a better choice than Linux, especially due to price/performance (which includes how many instances it takes to run a given workload). It was worth fighting for, and I fought hard. But Linux has now become technically better in just about every way. Out-of-box performance, tuned performance, observability tools, reliability (on patched LTS), scheduling, networking (including TCP feature support), driver support, application support, processor support, debuggers, syscall features, etc. Last I checked, ZFS worked better on Solaris than Linux, but it's an area where Linux has been catching up. I have little hope that Solaris will ever catch up to Linux, and I have even less hope for illumos: Linux now has around 1,000 monthly contributors, whereas illumos has about 15.

In addition to technology advantages, Linux has a community and workforce that's orders of magnitude larger, staff with invested skills (re-education is part of a TCO calculation), companies with invested infrastructure (rewriting automation scripts is also part of TCO), and also much better future employment prospects (a factor that can influence people wanting to work at your company on that OS). Even with my considerable and well-known Solaris expertise, the employment prospects with Solaris are bleak and getting worse every year. With my Linux skills, I can work at awesome companies like Netflix (which I highly recommend), Facebook, Google, SpaceX, etc.

Large technology-focused companies, like Netflix, Facebook, and Google, have the expertise and appetite to make a technology-based OS decision. We have dedicated teams for the OS and kernel with deep expertise. On Netflix's OS team, there are three staff who previously worked at Sun Microsystems and have more Solaris expertise than they do Linux expertise, and I believe you'll find similar people at Facebook and Google as well. And we are choosing Linux.

The choice of an OS includes many factors. If an OS came along that was better, we'd start with a thorough internal investigation, involving microbenchmarks (including an automated suite I wrote), macrobenchmarks (depending on the expected gains), and production testing using canaries. We'd be able to come up with a rough estimate of the cost savings based on price/performance. Most microservices we have run hot in user-level applications (think 99% user time), not the kernel, so it's difficult to find large gains from the OS or kernel. Gains are more likely to come from off-CPU activities, like task scheduling and TCP congestion, and indirect ones, like NUMA memory placement: all areas where Linux is leading. It would be very difficult to find a large gain by changing the kernel from Linux to something else. Just based on CPU cycles, the target that should have the most attention is Java, not the OS. But let's say that somehow we did find an OS with a significant enough gain: we'd then look at the cost to switch, including retraining staff, rewriting automation software, and how quickly we could find help to resolve issues as they came up. Linux is so widely used that there's a good chance someone else has already hit an issue and either had it fixed in a later version or documented a workaround.

What's left where Solaris/SmartOS/illumos is better? 1. There's more marketing of the features and people. Linux develops great technologies and has some highly skilled kernel engineers, but I haven't seen any serious effort to market these. Why does Linux need to? And 2. Enterprise support. Large enterprise companies whose focus is not technology (e.g., a breakfast cereal company) want to outsource these decisions to companies like Oracle and IBM. Oracle still has Solaris enterprise support that I believe is very competitive compared to Linux offerings.

So you've chosen to deploy on Solaris or SmartOS? I don't know why you would, but this is also why I wouldn't rush to criticize your choice: I don't know the process whereby you arrived at that decision, and for all I know it may be the best business decision for your set of requirements.

I'd suggest you give other tech companies the benefit of the doubt for times when you don't actually know why they have decided something. You never know, one day you might want to work at one."

I feel sorry for the Solaris engineers (and likely ex-colleagues) who are about to lose their jobs. My advice would be to take a good look at Linux or FreeBSD, both of which we use at Netflix. Linux has been getting much better in recent years, including reaching DTrace capabilities in the kernel.[2] It's not as bad as it used to be, although to really evaluate where it's at you need to be on a very new kernel (4.9 is currently in development), as features have been pouring in.

Also, since I was one of the top Solaris performance experts, I've been creating new Linux performance content on a website[3] that should also be useful (I've already been thanked for this by a few Solaris engineers who have switched). I've been meaning to create a FreeBSD page too (better, a similar page on the FreeBSD wiki so others can contribute).

FreeBSD feels to me to be the closest environment to Solaris, and would be a bit easier to switch to than Linux. And it already has ZFS and DTrace.

[1] https://news.ycombinator.com/item?id=12837972
[2] http://www.brendangregg.com/blog/2016-10-27/dtrace-for-linux...
[3] http://www.brendangregg.com/linuxperf.html




I respect that Linux works for Netflix. However IMHO:

BTRFS/ZoL doesn't beat illumos ZFS. FreeBSD's ZFS is pretty standalone within the OS; it has only recently been able to deal with a hot spare drive, and that's it. IO scheduling on FreeBSD is spartan; it will always favor large IO and starve small reads/writes.

LXC doesn't beat FreeBSD jails or Solaris zones since LXC is not considered a security boundary.

Open vSwitch can perhaps measure up to illumos Crossbow.

systemd doesn't beat SMF on illumos. I think SMF really nailed it (systemd is overkill, and plain RC scripts in FreeBSD are a pain).

So IMHO Solaris/Illumos/SmartOS sits nicely between Linux and FreeBSD.



"why Netflix uses Linux instead of Solaris"

Just a reminder, we also run FreeBSD on our CDN servers at Netflix.


Right, thanks, I should have said "on the cloud" and "Cloud OS team" instead of "OS team".


Is there public information available about why Netflix uses Linux as opposed to FreeBSD for those pieces of infrastructure?


Netflix runs Java and the Oracle JVM runs on Linux but not FreeBSD.

I have no information, but there aren't very many dots to connect here.


Has Linux reached parity with BSD in terms of the TCP stack? My understanding was that it still wasn't as efficient but that info is outdated.


Linux has been beating BSD for at least 8-10 years when it comes to TCP. When it comes to new features in TCP-land, Linux easily beats it. Google added Receive Side Scaling / Receive Flow Steering support to Linux years ago; as an example, it is still a WIP in FreeBSD. Also take a look at much of the bufferbloat research that has recently been merged into Linux, etc.


The RSS in Linux is not particularly useful (for Netflix-scale workloads) because it does not integrate the RSS hashing across the entire stack, so all you get is connection sharding. With real RSS, as done in Windows and FreeBSD where the kernel has intimate knowledge of the hash key and algorithm, you can use RSS to split the TCP hash table up and make it per-CPU. By using multiple accept sockets for per-CPU workers, you can effectively keep everything for a single connection on a CPU, and run almost everything with no cross-CPU contention. You can't move the connection around at will between CPUs, but you don't care to, because no connection is special (in a Netflix workload); it's just one of tens of thousands.
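Roughly, the per-CPU accept-socket pattern looks like the sketch below. I'm using SO_REUSEPORT and Linux-style affinity calls purely for illustration (the port number and function names are made up, and the FreeBSD RSS-aware variant additionally aligns each socket with an RSS bucket), so treat this as the shape of the idea rather than the exact API:

    /* Sketch: one pinned worker thread per CPU, each with its own
     * accept socket on the same port via SO_REUSEPORT. */
    #define _GNU_SOURCE
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <pthread.h>
    #include <sched.h>
    #include <string.h>
    #include <sys/socket.h>

    static void *worker(void *arg) {
        int cpu = (int)(long)arg;

        /* Pin this worker to its CPU. */
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

        /* Give it a private accept socket on the shared port. */
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));

        struct sockaddr_in sa;
        memset(&sa, 0, sizeof(sa));
        sa.sin_family = AF_INET;
        sa.sin_port = htons(8080);
        bind(fd, (struct sockaddr *)&sa, sizeof(sa));
        listen(fd, 1024);

        for (;;) {
            int c = accept(fd, NULL, NULL);
            /* Handle c entirely on this CPU; with stack-wide RSS the
             * packets for this connection already arrive here, so
             * nothing bounces between CPUs. */
            (void)c;
        }
        return NULL;
    }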

Adrian Chadd did most of the FreeBSD RSS work, and gave a good talk about it at BAFUG: https://www.youtube.com/watch?v=7CvIztTz-RQ

The RSS in Linux was just used for load spreading, the last I checked (I haven't used Linux much since I left Google 1.5 years ago). If this has improved, I'd love to hear about it.

Linux RFS depends on the packets being dispatched to the correct CPU for the connection by the interrupt handler running wherever the packet happened to land. This has cache & memory locality implications, especially on NUMA.

Linux aRFS lets the NIC do the steering. Unfortunately, each connection requires an interaction with the NIC to poke it into the steering table, and most NICs can't steer 100,000 connections.

So, to sum up, Linux has a lot of cool tech for steering individual connections and support for that varies greatly by NIC. Windows and FreeBSD use standard RSS to predictably steer an unlimited number of connections. For a large CDN server, the latter is more useful. However, for low-latency / high bandwidth applications, I can see the advantage to aRFS.
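If you want to check what steering Linux is actually doing, newer kernels (3.19+) can report which CPU the stack is processing a given connection on. A small sketch, assuming the SO_INCOMING_CPU socket option is available (the helper name is mine):

    /* Sketch: verify steering by asking which CPU handles this
     * connection (Linux >= 3.19). */
    #include <stdio.h>
    #include <sys/socket.h>

    #ifndef SO_INCOMING_CPU
    #define SO_INCOMING_CPU 49  /* from Linux's asm-generic/socket.h */
    #endif

    void report_steering(int connfd) {
        int cpu = -1;
        socklen_t len = sizeof(cpu);
        if (getsockopt(connfd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len) == 0)
            printf("fd %d is being processed on CPU %d\n", connfd, cpu);
    }

Comparing that against sched_getcpu() in the worker shows whether the connection's packets are landing on the CPU that services it.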


I wouldn't say 8 to 10 years. A major bug in Linux's default congestion control algorithm was only fixed last year:

https://bitsup.blogspot.com.au/2015/09/thanks-google-tcp-tea...

Linux is the platform of choice for bufferbloat research, although FreeBSD isn't far behind in adopting the results of it:

https://lists.freebsd.org/pipermail/freebsd-ipfw/2016-April/...


I guess my information was not just outdated, but clearly wrong.


Don't get me wrong, BSD is still absolutely solid, but for anything cutting edge, Linux is spanking the pants off of it.


As is often the case, it depends on the specifics of the application and on those building a solution. As far as raw performance is concerned, FreeBSD performs very well.

Netflix gets nearly 100Gbps from storage out to the network on their FreeBSD+NGINX OCA appliances. Some details in the "Mellanox CDN Reference Architecture" whitepaper at http://www.mellanox.com/related-docs/solutions/cdn_ref_arch..... The closest equivalent I've found on Linux was a blog post on BBC streaming getting about 1/4 of the performance.

Chelsio has a demo video (with terrible music) showing 100Gbps on a single TCP session using TCP zero copy, with <1% CPU usage: https://www.youtube.com/watch?v=NKTApBf8Oko.

At SC16 NASA had a "Building Cost-Effective 100-Gbps Firewalls for HPC" demo, using FreeBSD and netmap: https://www.nas.nasa.gov/SC16/demos/demo9.html


Thanks for the reference to the whitepaper. FWIW, I'm the kernel guy at Netflix who has been doing the performance work to get us to 100Gb/s from a single socket.

Another interesting optimization we've done (and which needs to be upstreamed) is TLS sendfile. There is a tech blog about this at http://techblog.netflix.com/2016/08/protecting-netflix-viewi.... We don't have a paper yet about the latest work, but we're doing more than 80Gb/s of 100% TLS-encrypted traffic from a single-socket Xeon with no hardware encryption offloads.
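For context, the serving loop is the classic FreeBSD sendfile() path sketched below (file pages go from the buffer cache to the socket without copying through userspace); in the patched kernel the TLS record framing and encryption happen in-kernel along that same path, which this sketch doesn't show:

    /* Sketch: zero-copy file serving with FreeBSD's sendfile(2). */
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    int serve_file(int filefd, int sockfd, off_t len) {
        off_t off = 0;
        while (off < len) {
            off_t sent = 0;
            if (sendfile(filefd, sockfd, off, (size_t)(len - off),
                         NULL, &sent, 0) == -1 && sent == 0)
                return -1;  /* hard error, nothing transferred */
            off += sent;    /* partial transfers advance the offset */
        }
        return 0;
    }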


I just wanted to thank you publicly for all your hard work on this. The community will benefit greatly from it. If I recall correctly (correct me if I am wrong), didn't you also port FreeBSD to the Alpha many moons ago? I loved the Alpha and it broke my heart when it died. Sad panda :(


Doug Rabson did most of the early work on alpha. I sent him enough patches that he sponsored me for a commit bit. My primary desktop for several years was running FreeBSD/alpha. First was an AlphaStation 600, then an API UP1000.

I was very sad when alpha got axed, but I agreed with killing it. FreeBSD is about current hardware.


You're spot on regarding the app and FreeBSD performing very well. Don't disagree with you one bit. Also, great link on the Netflix CDN work, they're doing some really fascinating stuff. It is nice to see the openness.

I work directly with both of the gents who gave this talk about 100G networking[1] (on Linux) and still find that much of the actual cutting-edge research is done on Linux. Perhaps I'm biased! I've also been to one of Mellanox's engineering offices (Tel Aviv) to speak with their engineers at my previous employer 7-8 years ago. They told me they do almost all of their prototyping and initial development on Linux, RHEL to be specific, and then port to other platforms.

Maybe I was wrong on some of this, but my use case (due to my employer's industry being finance) is lower latency, where Linux absolutely and positively crushes anything else.

    [1] http://events.linuxfoundation.org/sites/events/files/slides/100G%20Networking%20Toronto_0.pdf


Mellanox is now one of the role model vendors in the FreeBSD ecosystem. They have a handful of BSD developers as well as sales and support staff that are in tune with the needs of high scalability FreeBSD users.


> Maybe I was wrong on some of this, but my use case (due to my employer's industry being finance) is lower latency, where Linux absolutely and positively crushes anything else.

Actually, while we're on the subject, SmartOS with CPU bursting from illumos is the leader in low-latency trading:

http://containersummit.io/events/sf-2015/videos/wolf-of-what...


That is a slick platform they've built, but I still don't see how it is competitive with Linux for very low latencies. He mentions trading at microseconds, but we're building microwave radio networks to trade at nanoseconds. Unless this has changed extremely recently, Solaris/illumos, and hence SmartOS, still don't have tickless kernels. I recall Solaris having a 100 Hz tick by default, which you could change to 1000 Hz with a special boot flag. Linux has had dynticks since fairly early 2.6 kernels, and with the modern 3.x kernels (RHEL7+) you've got full-on tickless via the nohz_full options. Without this, the kernel interrupts the application to use CPU time.

Additionally, I don't believe (experts please correct me if this is wrong) SmartOS has an equivalent to Linux's isolcpus boot command line flag (or cpu_exclusive=1 if you're in a cpuset) to remove a CPU core entirely from the global scheduler domain. This prevents any tasks from running on that CPU, including kernel threads. (Kernel threads will still occasionally interrupt applications if you simply set the affinity on pid 1, so that doesn't count.)
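For reference, the Linux-side setup I'm describing is just boot parameters. An illustrative example (the CPU range is made up, and rcu_nocbs is the RCU-callback offload flag typically paired with these):

    # Reserve CPUs 2-15 for the latency-critical app: no scheduler, no tick.
    isolcpus=2-15 nohz_full=2-15 rcu_nocbs=2-15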

These two features, along with hardware that is configured to not throw SMIs, allow Linux to get out of the way of applications for truly low latency. As far as I'm aware, this is impossible to do in Solaris/SmartOS. I'm not even getting into the SLUB memory allocator being better, or the lazy TLB in Linux massively lowering TLB shootdowns, etc. There is a reason why virtually every single major financial exchange in the world runs Linux (CME in Chicago, NYSE/NYMEX in New York, LSE in London, and Xetra in Frankfurt): it is better for the low-latency use case.


You asked for an expert to correct you if you're wrong, so here it is: this is just completely wrong and entirely ignorant of both the capacity of the system and its history.

On timers: we (I) added arbitrary resolution interval timers to the operating system in 1999[1] -- predating Linux by years. (We have had CPU binding and processor sets for even longer.) The operating system was and is being used in many real-time capacities (in both the financial and defense sectors in particular) -- and before "every single major financial exchange" was running Linux, many of them were running Solaris.

[1] https://github.com/joyent/illumos-joyent/blob/master/usr/src...
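From userland, those cyclic-backed timers are reachable through the standard POSIX timer API with the CLOCK_HIGHRES clock. A minimal sketch (assuming you have the proc_clock_highres privilege that arbitrary-resolution timers may require; compile with -lrt):

    /* Sketch: a 50us interval timer on Solaris/illumos, backed by the
     * cyclic subsystem via CLOCK_HIGHRES. */
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>

    static volatile sig_atomic_t ticks;
    static void on_tick(int sig) { (void)sig; ticks++; }

    int main(void) {
        timer_t tid;
        struct sigevent sev;
        struct itimerspec its;

        signal(SIGUSR1, on_tick);

        memset(&sev, 0, sizeof(sev));
        sev.sigev_notify = SIGEV_SIGNAL;
        sev.sigev_signo = SIGUSR1;

        /* CLOCK_HIGHRES requests a high-resolution (cyclic) timer. */
        if (timer_create(CLOCK_HIGHRES, &sev, &tid) != 0) {
            perror("timer_create");  /* may need proc_clock_highres */
            return 1;
        }

        memset(&its, 0, sizeof(its));
        its.it_value.tv_nsec = 50000;     /* first fire in 50 us */
        its.it_interval.tv_nsec = 50000;  /* then every 50 us */
        timer_settime(tid, 0, &its, NULL);

        while (ticks < 100)
            pause();                      /* wait for 100 firings */
        return 0;
    }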


Thank you Bryan for the correction, I did after all ask for it :)

One final question while I've got you, which your response didn't seem to address. Does the cyclic subsystem allow turning off the CPU timer entirely, à la Linux's nohz_full? If so, I stand corrected.


Yes, it does -- the cyclic subsystem will only fire on CPUs that have a cyclic scheduled, which won't be any CPU that is engaged in interrupt sheltering via psradm.[1] This is how it is able to achieve hard real-time latency (and indeed, it was used for some hardware-in-the-loop flight simulator systems in the defense sector that had very tight latency tolerances).

[1] https://illumos.org/man/1m/psradm


You should really chat to some HFT folk in NYC before making that conclusion.


This is an adolescent evaluation. FreeBSD will have a new TCP stack with BBR made public in a couple of months. It will be easier to correctly deploy and more cohesive than Linux. The entire packet path is more cohesive and easier to debug and tune using DTrace, although Linux might have caught up here recently. By volume, FreeBSD is doing at least 30% of Internet-facing traffic between a well-known company and some quieter giants. BTW, it only took a half-dozen people collaborating across 3 companies about 2 years to catch up to the state of the art.

I've done a great deal of reading and research on OS ethos; IMO a thriving, production-worthy operating system can be maintained with as few as 40 people in total. The superiority of Linux feels exaggerated, and systems innovation has chilled because of it.


What's WIP about FreeBSD's RSS?


I'd have to recheck. Linux has done some interesting new systems engineering work with BPF kernel bypass to improve network performance (the eXpress Data Path project, used by Facebook).
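If you haven't seen XDP, the idea is that a small BPF program runs in the driver before any skb is allocated. A minimal sketch of what such a program looks like (compiled with clang -O2 -target bpf), just to give the flavor:

    /* Sketch: a minimal XDP program. A real filter would parse the
     * packet between ctx->data and ctx->data_end and return XDP_DROP
     * for unwanted traffic. */
    #include <linux/bpf.h>

    #ifndef SEC
    #define SEC(name) __attribute__((section(name), used))
    #endif

    SEC("xdp")
    int xdp_pass_all(struct xdp_md *ctx)
    {
        (void)ctx;
        return XDP_PASS;  /* let everything through, unmodified */
    }

    char _license[] SEC("license") = "GPL";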


"""Has Linux reached parity with BSD in terms of the TCP stack?"""

I'm not sure what you mean. Linux has led TCP implementations for a decade now.


Two years ago, Facebook was trying to hire someone to make Linux's network stack as good as FreeBSD's:

http://m.slashdot.org/story/205565


I work for Google and previously worked at LBL on a team that developed optimized network stacks. I have a fair amount of experience running large-scale data transfers as well as low-latency networking in LANs.

The Linux network stack is great. It's the system of choice for nearly every researcher in the networking field. I don't know what Facebook meant in their case.


You have more experience than I do in this area, but I am going to reply anyway in the belief that I can contribute some useful information to the discussion. First, there is a discussion of this here:

https://www.quora.com/How-is-FreeBSDs-network-stack-superior...

The main remark seems to be:

> The predominant difference is that the FreeBSD network stack was much more carefully designed. The Linux stack was less careful and thus is much more haphazard. Also, more work has been put into optimizing the FreeBSD stack.

It is not my area of expertise, but the Linux sk_buff seems to fit the description of haphazard, while the FreeBSD mbuf seems to fit the description of more carefully designed. The same could be said about epoll versus kqueue.
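To make the epoll/kqueue comparison concrete, here is the same "wait until this fd is readable" step in both APIs. It's only a rough sketch (error handling omitted, and you'd reuse the epoll/kqueue fd in real code), but it shows the design difference: kqueue registers and waits in a single kevent() call, while epoll splits setup and waiting across syscalls:

    #ifdef __linux__
    #include <sys/epoll.h>

    int wait_readable(int fd) {
        int ep = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
        epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev);  /* register... */
        struct epoll_event out;
        return epoll_wait(ep, &out, 1, -1);     /* ...then wait */
    }
    #else  /* FreeBSD */
    #include <sys/types.h>
    #include <sys/event.h>
    #include <sys/time.h>

    int wait_readable(int fd) {
        int kq = kqueue();
        struct kevent change, out;
        EV_SET(&change, fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
        return kevent(kq, &change, 1, &out, 1, NULL);  /* register + wait */
    }
    #endif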

The remark about more work in optimizing the FreeBSD stack also seems to be true. While I cannot speak for everything in FreeBSD's network stack, I do know that FreeBSD's netmap far exceeded anything Linux could do at the time, and while it is available on Linux, I never hear of it being used anywhere but on FreeBSD:

http://info.iet.unipi.it/~luigi/netmap/

FreeBSD's network stack had plenty of innovative things in development at the time Facebook's post was made:

https://wiki.freebsd.org/201405DevSummit/NetworkStack

That included additional contributions from a major network equipment vendor that had made many contributions throughout the years. If I checked the commit history, I imagine I would find performance work done by said vendor. From what I can tell, FreeBSD's network stack is improving regardless of whether the rest of us hear about it.

Lastly, there have been multiple things discovered to be wrong in the Linux network stack since that Facebook job listing. Two prominent ones that I recall offhand are:

https://blog.cloudflare.com/the-story-of-one-latency-spike/

https://bitsup.blogspot.com.au/2015/09/thanks-google-tcp-tea...

They both could fall into the category of stability problems to which Facebook had alluded. The second one more so, though:

> The end result is that applications that oscillate between transmitting lots of data and then laying quiescent for a bit before returning to high rates of sending will transmit way too fast when returning to the sending state. The consequence of this is self-induced packet loss along with retransmissions, wasted bandwidth, out-of-order packet delivery, and application level stalls.


For the Cloudflare example, they didn't actually identify any problems with Linux. What they learned, correctly, is that TCP autotuning needs to be enabled if you want high performance out of your network stack.

This is covered by my previous team's page: https://fasterdata.es.net/host-tuning/linux/ Note: "On newer Linux OSes this is no longer needed." (i.e., it's already set properly).
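For anyone following along, the tuning on that page boils down to a few sysctls. An illustrative example, with the caveat that the right ceilings depend on your bandwidth-delay product, so treat these numbers as placeholders:

    # Autotuning is on by default on modern kernels:
    net.ipv4.tcp_moderate_rcvbuf = 1
    # Raise the ceilings autotuning can grow buffers to (min/default/max):
    net.core.rmem_max = 67108864
    net.core.wmem_max = 67108864
    net.ipv4.tcp_rmem = 4096 87380 67108864
    net.ipv4.tcp_wmem = 4096 65536 67108864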

For the second one, they fixed a bug in Linux's TCP CUBIC implementation. FreeBSD didn't get CUBIC until 8.2, which was around 2009. So, you're criticizing Linux for having a bug in a feature that FreeBSD didn't even add until 7 years ago.

Again, I will repeat: I worked on a team that did multi-OS TCP/IP optimization. What you're describing in terms of oscillation is a well-known problem in many implementations. All of the people doing research on this are now using Linux as their platform for research and development.


If you "worked on a team that did multi-OS TCP/IP optimization", then you should know that there is no one size fits all solution that is quantitatively better for everyone.

Not implementing CUBIC in FreeBSD when there was a bug in the only implementation of it in the world could have been an advantage in certain situations, including Facebook's.

There seems to be a hubris among many Linux users that Linux is the best solution in the world for everything, and it is not. There is always someone who does something better. Maybe not in everything, but the same applies to Linux: no matter how good it becomes, it is not the best at everything. Networking is a broad topic. I don't think Linux is the best in every area of networking. I am not even sure it is the best in many of them, given that many platforms do things very well, and at some point it is hard to be better.


My opinion on this, as it always has been, is that Linux gets some new features first, before FreeBSD, but they are always done sloppily and half-assed, and they start getting refined and fixed about the time FreeBSD implements the same feature after thinking about it, designing it well, and then implementing it.


With regard to https://wiki.freebsd.org/201405DevSummit/NetworkStack, saying they had plenty of innovative things is really misleading. Most of the comments are "Linux has this nice thing that works well, let's copy it".


In the BSD community, this awareness of other operating systems is seen as a strength and a point of pride. It's telling that Linux fans see it as a weakness.

Subsystems are now done with up-front design and some degree of consensus in the BSDs, closer to the cathedral and commercial development than the bazaar of Linux. This necessarily means we are not usually at the forefront of cutting-edge features. It doesn't necessarily mean we don't have features before Linux; if the idea exists in academia or other OSes enough to reason about, it's reasonable to propose, design, and build. Netmap is a good example. The new FreeBSD selectable TCP stacks are another, where we avoid incremental growing pains and baggage. When these designed features hit, they tend to be coherent, usable, obvious, and lasting.

My opinion of Linux features is that little due diligence was done, especially public acknowledgement of inspiration and why one route was taken over another. For instance, the Linux KPIs are littered with questionable decisions made in isolation. epoll and the various file notification calls are examples. That attitude manifested strangely up to userland through IPC/DBus with the continued systemd drama.

A little bit of logical inference: there are financial drivers behind vendors fleeing the Linux kernel in preference of userspace (i.e. Intel's DPDK and SPDK). One is licensing, which is not an issue with BSD or userland. The other is the rate and quality of KPI churn. Linux KPIs break all the time, switch licenses all the time, and it is a general nuisance to maintain a vendor tree, whether it is open or closed source. The good side is that hopefully drivers and products end up open source. The bad side is that, in many modern usages, that does not happen, because the GPL is not relevant to hosted services, and because of low motivation/quality/incentive/license violation for IoT-type things. The BSDs start with no pretense of GPL nor flippant APIs, so it is a lot more comfortable to consume and build great products.


That is only in 4 items, and Linux only appears in a few comments on each of them.

This remark seems more to me like a statement of belief that no one else can do good things other than Linux. That is far from true.


"In linux these are managed with device-independent tools, which is much better than the custom methods we have now, and avoids polluting the ifnet with extra information."

"In linux, buffers in the tx queue hold a reference to the socket so completions can be used to notify sockets. Implementing the same mechanism in FreeBSD should be relatively straightforward. "

"We don’t have software TCP segmentation, we have to carry information in the mbufs. Performance was doubled, without hardware support, by doing segmentation very low in the stack, right before input into driver. (Student project.) Linux calls this approach GSO, pushing large segments through the stack; the hardware can do segmentation if supported, otherwise we do it at the bottom layer. Simplifies TCP code since you can send arbitrarily large segments. "

"Linux has their standard ifnet interface, with a single pointer to the extensions; if the interface does not support them, the system still runs. If it does, have interfaces to configure numbers of queues, numbers of buffers, etc. All of this is slow-path (configuration) code. Think we should go for a similar route — ease configuration of 10gig interfaces"

The rest of the stuff in there is just low-level optimizations to update the design that was written out in the original FreeBSD book.

I never said that people can't do good things in OSes other than Linux. I said that Linux's networking stack has been better than BSD's for ten years; I can cite numerous factual arguments and research papers to support this, along with my extensive experience with Linux (my experience with BSD is less, but enough to know its stack isn't magically better).


If you want to say one is better, then you ought to at least define what being better means. "better" clearly does not mean the same thing to both of us.

Linux does have plenty of nice things and plenty of nice work, but I am not going to dismiss everything being done elsewhere by declaring Linux to be "better". At best, I would say that it is ahead in some areas, behind in other areas, and the same in many areas. As for what some of those "other areas" are, I recall Adrian Chadd implementing time-division-multiplexed Atheros wifi support in FreeBSD that Linux does not have. Netflix also contributed a rather nice thing to FreeBSD that Linux did not have:

http://arstechnica.com/security/2015/04/it-wasnt-easy-but-ne...

There are plenty of nice things in both platforms. Labelling one as "better" just doesn't do justice to either of them. It ignores opportunities for the "better" one to improve by denying that opportunities for improvement have been demonstrated to exist. It also denies the "lesser" one the acknowledgement of having done something worthwhile.


The BSD feature you mention was added to Linux 6 months later.

When I say something is "better", I mean "I've looked at the data, and integrated over a wide range of parameters".

I'm still waiting to hear about a magical BSD feature that is better. That hasn't happened in about 10 years, hence my statement.


I just linked one feature that by your own definition was better, and you replied that Linux got it in 6 months, rather than "I did not realize FreeBSD does some great things in networking first".

If you are as experienced in networking as you claim, you should stop waiting to hear about magical features that are better. Nothing will ever impress you as being magical. That is a downside of having experience.

Maybe you would find talking to an actual expert on FreeBSD's network stack more interesting. I am not one, and while I could list several other things I know, I am clearly not doing it justice.



I definitely wish DTrace and ZFS had come to Linux earlier, and without the stupid license restrictions of ZFS, which Canonical currently flouts. Both of those were far better than the alternatives that Linux came up with.


That's a great find. And look at what Facebook have done: invested in BPF and eXpress Data Path.


That job posting provides no specific details about how BSD is "better" than Linux, except for making unspecific allegations about stability.


I imagine that it worked better in their testing, and rather than jump ship, they decided to invest in making Linux better. If they had known how FreeBSD was better at that time, they likely would have patched Linux rather than look for talent that could figure it out.


Maybe ten years ago, when the first generation of PCI-X Intel chipset 10GbE NICs were available for servers, the driver support and TCP offload engine support were much better on FreeBSD. But the situation is reversed now with all of the drivers and updates in the v4.x series Linux kernel.


ABD was merged two days ago. That closed much of the gap between ZoL and Solaris ZFS in one stroke. :)


ABD?


ARC Buffer Data. It replaces slab-allocated ZIO buffers with scatter-gather lists.
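The gist, as an illustrative sketch (these are not the actual ZoL types, just the shape of the idea): instead of one large, virtually contiguous slab allocation per buffer, the data is held as a chain of page-sized chunks, which sidesteps the fragmentation problems of big allocations:

    /* Sketch: a buffer as a scatter-gather chain of page-sized chunks. */
    #include <stdlib.h>

    #define CHUNK 4096

    struct sg_chunk {
        void            *data;   /* one page-sized allocation */
        struct sg_chunk *next;
    };

    struct sg_buf {
        size_t           size;
        struct sg_chunk *head;   /* chunks need not be contiguous */
    };

    struct sg_buf *sg_alloc(size_t size) {
        struct sg_buf *b = calloc(1, sizeof(*b));
        struct sg_chunk **tail = &b->head;
        b->size = size;
        for (size_t got = 0; got < size; got += CHUNK) {
            *tail = calloc(1, sizeof(**tail));
            (*tail)->data = malloc(CHUNK);
            tail = &(*tail)->next;
        }
        return b;
    }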


Brendan Gregg? What does he know about Solaris?!?

(I keep the DTrace book within reach when I sit at the keyboard. This is fan mail. Many thanks, for your work has helped me become a better computer person.)


> FreeBSD

Why not OpenBSD? I'm not an advocate of either; I'm trying to learn more about their usefulness in real-world applications.


OpenBSD is awesome, but if you're a performance engineer it won't be your first choice: the CPU scheduler, NUMA support, etc. aren't great, and there are no advanced performance-tracing tools like DTrace or Linux's perf-flavour-of-the-week.


To be fair, FreeBSD's NUMA support is nothing to write home about, and we tend to avoid NUMA hardware (though I have done a cool hack to improve VM scaling using fake NUMA domains that I really need to upstream).

SMP scalability in general is far ahead of OpenBSD's the last time I looked, as is device support for 100G NICs, NVMe storage, etc.

The performance monitoring is also far ahead on FreeBSD, with tools like DTrace, Intel's PCM tools, and Intel's VTune available for FreeBSD.


Brendan, the difference between Solaris and Linux is mainly scalability. Linux scales well on clusters such as SGI UV3000 scale-out servers, or top500 supercomputers. These scale-out clusters serve one scientist running HPC number-crunching workloads for 24-48h. Scale-out workloads are easy to parallelize: they do a calculation on the same set of grid points over and over again, which fits into a CPU cache and can run on each separate compute node. All SGI UV2000/UV3000 use cases are HPC number crunching, analytics, etc.

OTOH, enterprise business workloads (SAP, OLTP databases, etc.) typically serve thousands of users simultaneously. They do payroll, accounting, etc. Such workloads cannot be cached in the CPU cache, so you need to go out to RAM all the time. RAM latency is typically 100 ns, and a dependent access every 100 ns is only 10 million accesses per second, which corresponds to a 10 MHz CPU. Do you remember 10 MHz CPUs? This means business workloads have huge scalability problems, because you need to place all CPUs on the same bus, in one single large scale-up server. If you try to run business workloads on a scale-out server, performance will drop drastically as data is shuffled among nodes on a network instead of on a fast bus.

Thus, business workloads use a single large scale-up server, with at most 16 or 32 sockets. This domain belongs to Unix/RISC and mainframes. HPC number crunching uses large clusters such as the SGI UV3000, which has 10,000s of cores.

The largest Linux scale-up server is the new HP Kraken. It is a redesigned old Integrity Unix server with 64 sockets. The x86 version of the Integrity maxes out at 16 sockets only. Other than that, the largest x86 servers are vanilla 8-socket servers from IBM, HP, Oracle, etc.

Linux devs only have access to 1-2 socket PCs, so Linux cannot be optimized or tested on large 8-16 socket servers. Which Linux dev has access to anything larger than 4 sockets? No one. Linus Torvalds? No, he does not work on scalability on 16-socket servers. There is no Linux dev working on scalability on 16-socket servers. Why? Because, until last year, 16-socket x86 servers hardly even existed! Google this if you want: try to find a 16-socket x86 server other than the brand-new HP Kraken and SGI UV300H. OTOH, Unix/RISC and mainframes have scaled to 64 sockets for decades.

Look at the SAP benchmarks. The top scores all belong to 32-socket Unix/RISC servers doing large SAP workloads. Linux on x86 occupies the bottom part, doing small SAP workloads. The HP Kraken has bad SAP scores considering it has 16 sockets; they are almost the same as the 8-socket x86 SAP scores. Bad scalability.

Thus, if you want to run workloads larger than 2-4 sockets, you need to go to Unix/RISC. Linux maxes out at 2-4 sockets or so. The new Oracle Exadata server sporting the SPARC T7 (same as the M7 CPU) runs Linux, and it maxes out at 2 sockets. If you want 16-socket workloads, you must go to Solaris and SPARC. All large business servers use Unix or mainframes. No Linux anywhere.

Linux = small business workloads. Solaris = large business workloads. And the big money is in large business servers. If Oracle kills off Solaris, then Oracle is stuck at 2-4 sockets (small revenue). Only Solaris can drive large business servers (big revenue).

It does not make sense to kill off Solaris, because then Oracle could not offer (expensive) large business servers. Oracle would then be stuck with small, cheap business servers running Linux and Windows.

Regarding Linux vs Solaris code quality: https://en.wikipedia.org/wiki/Criticism_of_Linux#Kernel_code...




