Hacker News
"Solaris being canned, at least 50% of teams to be RIF'd in short term" (thelayoff.com)
288 points by QUFB 292 days ago | 233 comments



Both the site and the article seem like the dodgy kind that we typically penalize, and it's not clear that this is more than just a rumor. But the comments below are so good, especially the ones by primary sources, that we've taken off the penalties.


Sad in a way, but no surprise. I recently summarized my opinions on hackernews[1] in response to why Netflix uses Linux instead of Solaris, which might be of interest here:

"I worked on Solaris for over a decade, and for a while it was usually a better choice than Linux, especially due to price/performance (which includes how many instances it takes to run a given workload). It was worth fighting for, and I fought hard. But Linux has now become technically better in just about every way. Out-of-box performance, tuned performance, observability tools, reliability (on patched LTS), scheduling, networking (including TCP feature support), driver support, application support, processor support, debuggers, syscall features, etc. Last I checked, ZFS worked better on Solaris than Linux, but it's an area where Linux has been catching up. I have little hope that Solaris will ever catch up to Linux, and I have even less hope for illumos: Linux now has around 1,000 monthly contributors, whereas illumos has about 15.

In addition to technology advantages, Linux has a community and workforce that's orders of magnitude larger, staff with invested skills (re-education is part of a TCO calculation), companies with invested infrastructure (rewriting automation scripts is also part of TCO), and also much better future employment prospects (a factor that can influence people wanting to work at your company on that OS). Even with my considerable and well-known Solaris expertise, the employment prospects with Solaris are bleak and getting worse every year. With my Linux skills, I can work at awesome companies like Netflix (which I highly recommend), Facebook, Google, SpaceX, etc.

Large technology-focused companies, like Netflix, Facebook, and Google, have the expertise and appetite to make a technology-based OS decision. We have dedicated teams for the OS and kernel with deep expertise. On Netflix's OS team, there are three staff who previously worked at Sun Microsystems and have more Solaris expertise than they do Linux expertise, and I believe you'll find similar people at Facebook and Google as well. And we are choosing Linux.

The choice of an OS includes many factors. If an OS came along that was better, we'd start with a thorough internal investigation, involving microbenchmarks (including an automated suite I wrote), macrobenchmarks (depending on the expected gains), and production testing using canaries. We'd be able to come up with a rough estimate of the cost savings based on price/performance. Most microservices we have run hot in user-level applications (think 99% user time), not the kernel, so it's difficult to find large gains from the OS or kernel. Gains are more likely to come from off-CPU activities, like task scheduling and TCP congestion, and indirect, like NUMA memory placement: all areas where Linux is leading. It would be very difficult to find a large gain by changing the kernel from Linux to something else. Just based on CPU cycles, the target that should have the most attention is Java, not the OS. But let's say that somehow we did find an OS with a significant enough gain: we'd then look at the cost to switch, including retraining staff, rewriting automation software, and how quickly we could find help to resolve issues as they came up. Linux is so widely used that there's a good chance someone else has found an issue, had it fixed in a certain version or documented a workaround.

What's left where Solaris/SmartOS/illumos is better? 1. There's more marketing of the features and people. Linux develops great technologies and has some highly skilled kernel engineers, but I haven't seen any serious effort to market these. Why does Linux need to? And 2. Enterprise support. Large enterprise companies where technology is not their focus (eg, a breakfast cereal company) and who want to outsource these decisions to companies like Oracle and IBM. Oracle still has Solaris enterprise support that I believe is very competitive compared to Linux offerings.

So you've chosen to deploy on Solaris or SmartOS? I don't know why you would, but this is also why I wouldn't rush to criticize your choice: I don't know the process whereby you arrived at that decision, and for all I know it may be the best business decision for your set of requirements.

I'd suggest you give other tech companies the benefit of the doubt for times when you don't actually know why they have decided something. You never know, one day you might want to work at one."

I feel sorry for the Solaris engineers (and likely ex-colleagues) who are about to lose their jobs. My advice would be to take a good look at Linux or FreeBSD, both of which we use at Netflix. Linux has been getting much better in recent years, including reaching DTrace capabilities in the kernel.[2] It's not as bad as it used to be, although to really evaluate where it's at you need to be on a very new kernel (4.9 is currently in development), as features have been pouring in.

Also, since I was one of the top Solaris performance experts, I've been creating new Linux performance content on a website[3] that should also be useful (I've already been thanked for this by a few Solaris engineers who have switched). I've been meaning to create a FreeBSD page too (better, a similar page on the FreeBSD wiki so others can contribute).

FreeBSD feels to me to be the closest environment to Solaris, and would be a bit easier to switch to than Linux. And it already has ZFS and DTrace.

[1] https://news.ycombinator.com/item?id=12837972 [2] http://www.brendangregg.com/blog/2016-10-27/dtrace-for-linux... [3] http://www.brendangregg.com/linuxperf.html


I respect that Linux works for Netflix. However IMHO:

BTRFS/ZoL doesn't beat Illumos ZFS. FreeBSD's ZFS is fairly standalone within the OS; only recently did it gain hot-spare support, and that's about it. IO scheduling on FreeBSD is spartan; it will always favor large IO and starve small reads and writes.

LXC doesn't beat FreeBSD jails or Solaris zones since LXC is not considered a security boundary.

Open vSwitch can perhaps measure up to illumos Crossbow.

Systemd doesn't beat SMF on Illumos. I think SMF really nailed it (Systemd is overkill and plain RC scripts in FreeBSD are a pain).

So IMHO Solaris/Illumos/SmartOS sits nicely between Linux and FreeBSD.



"why Netflix uses Linux instead of Solaris"

Just a reminder, we also run FreeBSD on our CDN servers at Netflix.


Right, thanks, I should have said "on the cloud" and "Cloud OS team" instead of "OS team".


Is there public information available about why Netflix uses Linux as opposed to FreeBSD for those pieces of infrastructure?


Netflix runs Java and the Oracle JVM runs on Linux but not FreeBSD.

I have no information, but there aren't very many dots to connect here.


Has Linux reached parity with BSD in terms of the TCP stack? My understanding was that it still wasn't as efficient but that info is outdated.


Linux has been beating BSD for at least 8-10 years when it comes to TCP. When it comes to new features in TCP-land, Linux easily beats it. Google added Receive Side Scaling / Receive Flow Steering to Linux years ago, and it is still a WIP in FreeBSD as an example. Also take a look at much of the bufferbloat research recently that has been merged into Linux, etc.


The RSS in Linux is not particularly useful (for Netflix-scale workloads) because it does not integrate the RSS hashing across the entire stack. So all you get is connection sharding. With real RSS, as done in Windows and FreeBSD where the kernel has intimate knowledge of the hash key and algorithm, you can use RSS to split the TCP hash table up and make it per-CPU. By using multiple accept sockets for per-CPU workers, you can effectively keep everything for a single connection on a CPU, and run almost everything with no cross-CPU contention. You can't move the connection around at will between CPUs, but you don't care to, because no connection is special (in a Netflix workload); it's just one of tens of thousands.
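To make the "multiple accept sockets for per-CPU workers" idea concrete, here is a minimal userspace sketch (Python, assuming Linux and SO_REUSEPORT, which lets the kernel shard incoming connections across several sockets bound to the same port; the helper name is mine, not from any of the systems discussed):

```python
import os
import socket

def make_worker_socket(port, cpu):
    # One accept socket per worker: SO_REUSEPORT lets several sockets
    # bind the same port, and the kernel shards connections across them.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.listen(128)
    # Pin this worker to one CPU so per-connection state stays CPU-local
    # (in a real server each worker would be its own forked process).
    os.sched_setaffinity(0, {cpu})
    return s
```

Note this only gives you the accept-side split; with full RSS integration as described above for FreeBSD and Windows, the NIC's hash also decides which CPU the packets land on, so the whole path for a connection stays on one core.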

Adrian Chadd did most of the FreeBSD RSS work, and gave a good talk about it at BAFUG: https://www.youtube.com/watch?v=7CvIztTz-RQ

The RSS in Linux was just used for load spreading (the last I checked, I haven't used Linux much since I left Google 1.5 years ago). If this has improved, I'd love to hear about it.

Linux RFS depends on the packets being dispatched to the correct CPU for the connection by the interrupt handler running wherever the packet happened to land. This has cache & memory locality implications, especially on NUMA.

Linux aRFS lets the NIC do the steering. Unfortunately, each connection requires an interaction with the NIC to poke it into the steering table, and most NICs can't steer 100,000 connections.

So, to sum up, Linux has a lot of cool tech for steering individual connections and support for that varies greatly by NIC. Windows and FreeBSD use standard RSS to predictably steer an unlimited number of connections. For a large CDN server, the latter is more useful. However, for low-latency / high bandwidth applications, I can see the advantage to aRFS.


I wouldn't say 8 to 10 years. A major bug in Linux's default congestion control algorithm was only fixed last year:

https://bitsup.blogspot.com.au/2015/09/thanks-google-tcp-tea...

Linux is the platform of choice for bufferbloat research, although FreeBSD isn't far behind in adopting the results of it:

https://lists.freebsd.org/pipermail/freebsd-ipfw/2016-April/...


I guess my information was not just outdated, but clearly wrong.


Don't get me wrong, BSD is still absolutely solid, but for anything cutting edge, Linux is spanking the pants off of it.


As is often the case it depends on the specifics of the application and on those building a solution. As far as raw performance is concerned FreeBSD performs very well.

Netflix gets nearly 100Gbps from storage out the network on their FreeBSD+NGINX OCA appliances. Some details in the "Mellanox CDN Reference Architecture" whitepaper at http://www.mellanox.com/related-docs/solutions/cdn_ref_arch..... The closest equivalent I've found on Linux was a blog post on BBC streaming getting about 1/4 of the performance.

Chelsio has a demo video (with terrible music) using TCP zero copy of 100Gbps on a single TCP session, with <1% CPU usage https://www.youtube.com/watch?v=NKTApBf8Oko.

At SC16 NASA had a "Building Cost-Effective 100-Gbps Firewalls for HPC" demo, using FreeBSD and netmap: https://www.nas.nasa.gov/SC16/demos/demo9.html


Thanks for the reference to the whitepaper. FWIW, I'm the kernel guy at Netflix who has been doing the performance work to get us to 100Gb/s from a single socket.

Another interesting optimization we've done (and which needs to be upstreamed) is TLS sendfile. There is a tech blog about this at http://techblog.netflix.com/2016/08/protecting-netflix-viewi.... We don't have a paper yet about the latest work, but we're doing more than 80Gb/s of 100% TLS encrypted traffic from a single socket Xeon with no hardware encryption offloads.
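For readers unfamiliar with the baseline being extended here: plain sendfile(2) already moves file pages to the socket without a round trip through userspace buffers, and the kernel-TLS work keeps that path while encrypting in-kernel. A sketch of the unencrypted primitive (Python on Linux; the helper name is mine):

```python
import os
import socket
import tempfile

def serve_file(sock, path):
    # sendfile(2): the kernel pushes file pages straight into the socket,
    # so the payload never passes through userspace buffers.
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset < size:
            sent = os.sendfile(sock.fileno(), f.fileno(), offset, size - offset)
            if sent == 0:
                break
            offset += sent
    return offset
```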


I just wanted to thank you publicly for all your hard work on this. The community will benefit greatly from it. If I recall correctly (correct me if I am wrong), didn't you also port FreeBSD to the Alpha many moons ago? I loved the Alpha and it broke my heart when it died. Sad panda :(


Doug Rabson did most of the early work on alpha. I sent him enough patches that he sponsored me for a commit bit. My primary desktop for several years was running FreeBSD/alpha. First was an AlphaStation 600, then an API UP1000.

I was very sad when alpha got axed, but I agreed with killing it. FreeBSD is about current hardware.


You're spot on regarding the app and FreeBSD performing very well. Don't disagree with you one bit. Also, great link on the Netflix CDN work, they're doing some really fascinating stuff. It is nice to see the openness.

I work directly with both of the gents who gave this talk about 100G networking[1] (on Linux) and still find that much of the actual cutting-edge research is done on Linux. Perhaps I'm biased! I've also been to one of Mellanox's engineering offices (Tel Aviv) to speak with their engineers at my previous employer 7-8 years ago. They told me they do almost all of their prototyping and initial development on Linux, and RHEL to be specific. They then port to other platforms.

Maybe I was wrong on some of this, but my use case (due to my employer's industry being finance) is lower latency, where Linux absolutely and positively crushes anything else.

    [1] http://events.linuxfoundation.org/sites/events/files/slides/100G%20Networking%20Toronto_0.pdf


Mellanox is now one of the role model vendors in the FreeBSD ecosystem. They have a handful of BSD developers as well as sales and support staff that are in tune with the needs of high scalability FreeBSD users.


> Maybe I was wrong on some of this, but my use case (due to my employer's industry being finance) is lower latency, where Linux absolutely and positively crushes anything else.

Actually, while we're on the subject, SmartOS with CPU bursting from illumos is the leader in low latency trading:

http://containersummit.io/events/sf-2015/videos/wolf-of-what...


That is a slick platform they've built, but I still don't see how it is competitive with Linux for very low latencies. He mentions trading at microseconds, but we're building microwave radio networks to trade at nanoseconds. Unless this has changed extremely recently, Solaris/Illumos and hence SmartOS still don't have tickless kernels. I recall Solaris having a 100hz tick by default which you could change to 1000hz with a special boot flag. Linux has had dynticks since fairly early 2.6 kernels and with the modern 3.x kernels (RHEL7+), you've got full on tickless via the nohz_full options. Without this, the kernel interrupts the application to use cpu time.

Additionally, I don't believe (experts please correct me if this is wrong) SmartOS has an equivalent to Linux's isolcpus boot command line flag (or cpu_exclusive=1 if you're in a cpuset) to remove a cpu core entirely from the global scheduler domain. This prevents any tasks from running on that CPU, including kernel threads. Kernel threads will still occasionally interrupt applications if you simply set the affinity on pid 1, so that doesn't count.
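The userspace half of that recipe is just affinity; the isolation itself is boot-time configuration. A sketch, assuming Linux (the cmdline below is an example; the core number is arbitrary and the helper name is mine):

```python
import os

# Kernel side (boot cmdline, not code): e.g.  isolcpus=3 nohz_full=3
# removes core 3 from the general scheduler domain and stops its
# periodic tick while a single task runs there.

def pin_to_cpu(cpu):
    # Userspace side: move the latency-critical process onto the
    # isolated core; pid 0 means "this process".
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)
```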

These two features, along with hardware that is configured to not throw SMIs, allow Linux to get out of the way of applications for truly low latency. As far as I'm aware, this is impossible to do in Solaris/SmartOS. I'm not even getting into the SLUB memory allocator being better or the lazy TLB in Linux massively lowering TLB shootdowns, etc, etc. There is a reason why virtually every single major financial exchange in the world runs Linux (CME in Chicago, NYSE/NYMEX in New York, LSE in London, and Xetra in Frankfurt), it is better for the low latency use case.


You asked for an expert to correct you if you're wrong, so here it is: this is just completely wrong and entirely ignorant of both the capacity of the system and its history.

On timers: we (I) added arbitrary resolution interval timers to the operating system in 1999[1] -- predating Linux by years. (We have had CPU binding and processor sets for even longer.) The operating system was and is being used in many real-time capacities (in both the financial and defense sectors in particular) -- and before "every single major financial exchange" was running Linux, many of them were running Solaris.

[1] https://github.com/joyent/illumos-joyent/blob/master/usr/src...


Thank you Bryan for the correction, I did after all ask for it :)

One final question while I've got you that your response didn't seemingly address. Does the cyclic subsystem allow turning off the cpu timer entirely ala Linux's nohz_full? If so, I stand corrected.


Yes, it does -- the cyclic subsystem will only fire on CPUs that have a cyclic scheduled, which won't be any CPU that is engaged in interrupt sheltering via psradm.[1] This is how it is able to achieve hard real-time latency (and indeed, it was used for some hardware-in-the-loop flight simulator systems in the defense sector that had very tight latency tolerances).

[1] https://illumos.org/man/1m/psradm


You should really chat to some HFT folk in NYC before making that conclusion.


This is an adolescent evaluation. FreeBSD will have a new TCP stack with BBR made public in a couple of months. It will be easier to correctly deploy and more cohesive than Linux. The entire packet path is more cohesive and easier to debug and tune using DTrace, although Linux might have caught up here recently. By volume, FreeBSD is doing at least 30% of Internet-facing traffic between a well-known company and some quieter giants. BTW it only took a half dozen people collaborating between 3 companies about 2 years to catch up to the state of the art.

I've done a great deal of reading and research on OS ethos, IMO a thriving and production worthy operating system can be maintained with as few as 40 people in total. The superiority of Linux feels exaggerated, and systems innovation has chilled because of it.


What's WIP about FreeBSD's RSS?


I'd have to recheck. Linux has done some interesting new systems engineering work with BPF kernel bypass to improve network performance (the eXtreme Data Path project, used by Facebook).


> Has Linux reached parity with BSD in terms of the TCP stack?

I'm not sure what you mean. Linux has led TCP implementations for a decade now.


Two years ago, Facebook was trying to hire someone to make Linux's network stack as good as FreeBSD's:

http://m.slashdot.org/story/205565


I work for Google and previously worked at LBL on a team that developed optimized network stacks. I have a fair amount of experience running large scale data transfers as well as low-latency network in LANs.

The Linux network stack is great. It's the preferred system of choice for nearly every researcher in the networking field. I don't know what Facebook meant in their case.


You have more experience than I do in this area, but I am going to reply anyway in the belief that I can contribute some useful information to the discussion. First, there is a discussion of this here:

https://www.quora.com/How-is-FreeBSDs-network-stack-superior...

The main remark seems to be:

> The predominant difference is that the FreeBSD network stack was much more carefully designed. The Linux stack was less careful and thus is much more haphazard. Also, more work has been put into optimizing the FreeBSD stack.

It is not my area of expertise, but the Linux sk_buff seems to fit the description of haphazard while the FreeBSD mbuf seems to fit the description of more carefully designed. The same could be said about epoll versus kqueue.
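For what it's worth, one layer up the epoll/kqueue split disappears: portable code can sit above both. Python's selectors module, for instance, picks epoll on Linux and kqueue on the BSDs behind one interface. A small sketch (the function name is mine):

```python
import selectors
import socket

def readable_names(timeout=1.0):
    # DefaultSelector is epoll(7) on Linux and kqueue(2) on the BSDs;
    # the portable API is identical either way.
    a, b = socket.socketpair()
    sel = selectors.DefaultSelector()
    sel.register(a, selectors.EVENT_READ, data="reader")
    b.send(b"ping")  # make `a` readable
    ready = sel.select(timeout)
    names = [key.data for key, _ in ready]
    sel.close()
    a.close()
    b.close()
    return names
```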

The remark about more work in optimizing the FreeBSD stack also seems to be true. While I cannot speak for everything in FreeBSD's network stack, I do know that FreeBSD's netmap far exceeded anything Linux could do at the time and while it is available on Linux, I never hear of it being used anywhere but on FreeBSD:

http://info.iet.unipi.it/~luigi/netmap/

Development of FreeBSD's network stack had plenty of innovative things in development at the time Facebook's post was made:

https://wiki.freebsd.org/201405DevSummit/NetworkStack

That included additional contributions from a major network equipment vendor that had made many contributions throughout the years. If I checked the commit history, I imagine I would find performance work done by said vendor. From what I can tell, FreeBSD's network stack is improving regardless of whether the rest of us hear about it.

Lastly, there have been multiple things discovered to be wrong in the Linux network stack since that facebook job listing. Two prominent ones that I recall offhand are:

https://blog.cloudflare.com/the-story-of-one-latency-spike/

https://bitsup.blogspot.com.au/2015/09/thanks-google-tcp-tea...

They both could fall into the category of stability problems to which facebook had alluded. The second one more so though:

> The end result is that applications that oscillate between transmitting lots of data and then laying quiescent for a bit before returning to high rates of sending will transmit way too fast when returning to the sending state. This consequence of this is self induced packet loss along with retransmissions, wasted bandwidth, out of order packet delivery, and application level stalls.


For the Cloudflare example, they didn't actually identify any problems with Linux. What they learned, correctly, is that TCP autotuning needs to be enabled if you want high performance out of your network stack.

This is covered by my previous team's page: https://fasterdata.es.net/host-tuning/linux/ Note: "On newer Linux OSes this is no longer needed." (i.e., it's already set properly).
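Concretely, autotuning means the kernel sizes the receive buffer itself between the min and max of net.ipv4.tcp_rmem (when net.ipv4.tcp_moderate_rcvbuf=1), and hand-setting SO_RCVBUF switches autotuning off, which is the trap the Cloudflare post describes. A tiny parser for the three-value sysctl, as a sketch (helper name is mine):

```python
def parse_tcp_rmem(raw):
    # net.ipv4.tcp_rmem is "min default max" in bytes; with
    # tcp_moderate_rcvbuf=1 the kernel autotunes between min and max.
    lo, default, hi = (int(v) for v in raw.split())
    return {"min": lo, "default": default, "max": hi}
```

In use you would feed it the contents of /proc/sys/net/ipv4/tcp_rmem.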

For the second one, they fixed a bug in Linux's TCP cubic implementation. FreeBSD didn't get cubic until 8.2, which was around 2009. So, you're criticizing Linux for having a bug in a feature that FreeBSD didn't even add until 7 years ago.

Again, I will repeat: I worked on a team that did multi-OS TCP/IP optimization. What you're describing in terms of oscillation is a well-known problem in many implementations. All of the people doing research on this are now using Linux as their platform for research and development.


If you "worked on a team that did multi-OS TCP/IP optimization", then you should know that there is no one size fits all solution that is quantitatively better for everyone.

Not implementing cubic in FreeBSD when there was a bug in the only implementation of it in the world could have been an advantage in certain situations, including Facebook's.

There seems to be a hubris by many Linux users that Linux is the best solution in the world for everything and it is not. There is always someone who does something better. Maybe not in everything, but the same applies to Linux. No matter how good it becomes, it is not the best in everything. Networking is a broad topic. I don't think Linux is the best in every area of networking. I am not even sure if it is the best in many of them, given that many platforms do things very well and at some point, it is hard to be better.


My opinion on this, as it always has been, is that Linux gets some new features before FreeBSD, but they are done sloppily and half-assed, and they start getting refined and fixed about the time FreeBSD implements the same feature after thinking about it, designing it well, then implementing it.


With regard to https://wiki.freebsd.org/201405DevSummit/NetworkStack saying they had plenty of innovative things is really misleading. Most of the comments are "linux has this nice thing that works well, let's copy it".


In the BSD community this awareness of other operating systems is seen as a strength and pride. It's telling that Linux fans see it as weakness.

Subsystems are now done with up front design and some degree of consensus in the BSDs, closer to the cathedral and commercial development than the bazaar of Linux. This necessarily means we are not usually at the forefront of cutting edge features. It doesn't necessarily mean we don't have features before Linux; if the idea exists in academia or other OSes enough to reason about it's reasonable to propose, design, and build. Netmap is a good example. The new FreeBSD selectable TCP stacks are another, where we avoid incremental growing pains and baggage. When these designed features hit, they tend to be coherent, usable, obvious, and lasting.

My opinion of Linux features is that little due diligence was done, especially public acknowledgement of inspiration and why one route was taken over another. For instance, the Linux KPIs are littered with questionable decisions made in isolation. epoll and the various file notification calls are examples. That attitude manifested strangely up to userland through IPC/DBus with the continued systemd drama.

A little logical inference: there are financial drivers behind vendors fleeing the Linux kernel in favor of userspace (i.e. Intel's DPDK and SPDK). One is licensing, which is not an issue with BSD nor userland. The other is the rate and quality of KPI churn. Linux KPIs break all the time, switch licenses all the time, and it is a general nuisance to maintain a vendor tree whether it is open or closed source. The good side is that hopefully drivers and products end up open source. The bad side is that, in many modern usages, that does not happen, because GPL is not relevant to hosted services, and because of low motivation/quality/incentive/license violation for IoT type things. The BSDs start with no pretense of GPL nor flippant APIs, so it is a lot more comfortable to consume and build great products.


That is only in 4 items and Linux only appears in a few comments on each of them.

This remark seems more to me like a statement of belief that no one else can do good things other than Linux. That is far from true.


"In linux these are managed with device-independent tools, which is much better than the custom methods we have now, and avoids polluting the ifnet with extra information."

"In linux, buffers in the tx queue hold a reference to the socket so completions can be used to notify sockets. Implementing the same mechanism in FreeBSD should be relatively straightforward. "

"We don’t have software TCP segmentation, we have to carry information in the mbufs. Performance was doubled, without hardware support, by doing segmentation very low in the stack, right before input into driver. (Student project.) Linux calls this approach GSO, pushing large segments through the stack; the hardware can do segmentation if supported, otherwise we do it at the bottom layer. Simplifies TCP code since you can send arbitrarily large segments. "

"Linux has their standard ifnet interface, with a single pointer to the extensions; if the interface does not support them, the system still runs. If it does, have interfaces to configure numbers of queues, numbers of buffers, etc. All of this is slow-path (configuration) code. Think we should go for a similar route — ease configuration of 10gig interfaces"

the rest of the stuff in there is just low level optimizations to update the design that was written out in the original FreeBSD book.

I never said that people can't do good things in OSes other than Linux. I said that Linux's networking stack has been better than BSD's for ten years. I can cite numerous factual arguments and research papers to support this, along with my extensive experience with Linux (my experience with BSD is less, but enough to know its stack isn't magically better).


If you want to say one is better, then you ought to at least define what being better means. "better" clearly does not mean the same thing to both of us.

Linux does have plenty of nice things and plenty of nice work, but I am not going to dismiss everything being done elsewhere by declaring Linux to be "better". At best, I would say that it is ahead in some areas, behind in other areas and the same in many areas. As for what some of those "other areas" are, I recall Adrian Chadd implementing time division multiplexed atheros wifi support in FreeBSD that Linux does not have. Netflix also contributed a rather nice thing to FreeBSD that Linux did not have:

http://arstechnica.com/security/2015/04/it-wasnt-easy-but-ne...

There are plenty of nice things in both platforms. Labelling one as "better" just doesn't do justice to either of them. It ignores opportunities for the "better" one to improve by denying that opportunities for improvement have been demonstrated to exist. It also denies the "lesser" one the acknowledgement of having done something worth while.


The BSD feature you mention was added to Linux 6 months later.

When I say something is "better", I mean "I've looked at the data, and integrated over a wide range of parameters".

I'm still waiting to hear about a magical BSD feature that is better. That hasn't happened in about 10 years, hence my statement.


I just linked one feature that by your own definition was better and you replied that Linux got it in 6 months, rather than "I did not realize FreeBSD does some great things in networking first".

If you are as experienced in networking as you claim, you should stop waiting to hear about magical features that are better. Nothing will ever impress you as being magical. That is a downside of having experience.

Maybe you would find talking to an actual expert on FreeBSD's network stack more interesting. I am not one, and while I could list several other things I know, I am clearly not doing it justice.



I definitely wish dtrace and zfs had come to linux earlier, and without the stupid license restrictions of ZFS which Canonical currently flouts. Both of those were far better than the alternatives that Linux came up with.


That's a great find. And look at what Facebook have done: invested in BPF and eXtreme Data Path.


That job posting provides no specific details about how BSD is "better" than Linux except making unspecific allegations about stability.


I imagine that it worked better in their testing and rather than jump ship, they decided to invest in making Linux better. If they knew how FreeBSD was better at that time, they likely would have patched Linux rather than look for talent that could figure it out.


Maybe ten years ago when the first generation of PCI-X Intel chipset 10GbE NICs were available for servers, the driver support and tcp offload engine support were much better on FreeBSD. But the situation is reversed now with all of the drivers and updates to the v4.x series Linux kernel.


ADB was merged two days ago. That closed much of the gap between ZoL and Solaris ZFS in one stroke. :)


ADB?


ARC Data Buffer. It replaces SLABs in ZIO buffers with scatter gather lists.


Brendan Gregg? What does he know about Solaris?!?

(I keep the DTrace book within reach when I sit at the keyboard. This is fan mail. Many thanks, for your work has helped me become a better computer person.)


> FreeBSD

Why not OpenBSD? I'm not an advocate of either; I'm trying to learn more about their usefulness in real world applications.


OpenBSD is awesome, but if you're a performance engineer it won't be your first choice - the CPU scheduler/NUMA support/... isn't great, there are no advanced performance-tracing tools like DTrace or Linux's perf-flavour-of-the-week, etc.


To be fair, FreeBSD's NUMA support is nothing to write home about, and we tend to avoid NUMA hardware (though I have done a cool hack to improve VM scaling using fake NUMA domains that I really need to upstream).

SMP scalability in general is far ahead of OpenBSD's the last time I looked, as is device support for 100G NICs, NVMe storage, etc.

The performance monitoring is also far ahead on FreeBSD, with tools like DTrace, Intel's PCM tools, and Intel's VTune available for FreeBSD.


Brendan, the difference between Solaris and Linux is mainly scalability. Linux scales well on clusters such as SGI UV3000 scale-out servers, or top500 supercomputers. These scale-out clusters serve one scientist running HPC number-crunching workloads for 24-48h. Scale-out workloads are easy to parallelize: they do a calculation on the same set of grid points, over and over again. All this fits into a cpu cache and can run on each separate compute node. All SGI UV2000/UV3000 use cases are HPC number crunching, analytics, etc.

OTOH, enterprise business workloads (SAP, OLTP databases, etc.) typically serve thousands of users simultaneously. They do payroll, accounting, etc. Such workloads can not be cached in the CPU cache, so you need to go out to RAM all the time. RAM latency is typically 100ns, which corresponds to a 10 MHz CPU. Do you remember 10 MHz CPUs? This means business workloads have huge scalability problems: you need to place all CPUs on the same bus, in one single large scale-up server. If you try to run business workloads on a scale-out server, performance drops drastically as data is shuffled among nodes over a network instead of a fast bus.

Thus, business workloads use one single large scale-up server, with at most 16 or 32 sockets. This domain belongs to Unix/RISC and Mainframes. HPC number crunching uses large clusters such as the SGI UV3000, which has 10,000s of cores.

The largest Linux scale-up server is the new HP Kraken. It is a redesign of the old Integrity Unix server, with 64 sockets. The x86 version of the Integrity maxes out at only 16 sockets. Other than that, the largest x86 servers are vanilla 8-socket servers from IBM, HP, Oracle, etc.

Linux devs only have access to 1-2 socket PCs, so Linux can not be optimized nor tested on large 8-16 socket servers. Which Linux dev has access to anything larger than 4 sockets? No one. Linus Torvalds? No, he does not work on scalability on 16-socket servers. There is no Linux dev working on scalability on 16-socket servers. Why? Because, until last year, 16-socket x86 servers hardly even existed! Google this if you want; try to find a 16-socket x86 server other than the brand new HP Kraken and SGI UV300H. OTOH, Unix/RISC and Mainframes have scaled to 64 sockets for decades.

Look at the SAP benchmarks. The top scores all belong to 32-socket Unix/RISC doing large SAP workloads. Linux on x86 holds the bottom, doing small SAP workloads. The HP Kraken has bad SAP scores considering it has 16 sockets; it is almost the same as the 8-socket x86 SAP scores. Bad scalability.

Thus, if you want to run workloads larger than 2-4 sockets, you need to go to Unix/RISC. Linux maxes out at 2-4 sockets or so. The new Oracle Exadata server sporting the SPARC T7 (same as the M7 CPU) runs Linux and maxes out at 2 sockets. If you want 16-socket workloads, you must go to Solaris and SPARC. All large business servers use Unix or Mainframes - no Linux anywhere.

Linux = small business workloads. Solaris = large business workloads. And the big money is in large business servers. If Oracle kills off Solaris, then Oracle is stuck at 2-4 sockets (small revenue). Only Solaris can drive large business servers (big revenue).

It does not make sense to kill off Solaris, because then Oracle can not offer (expensive) large business servers. Then Oracle will be stuck with small cheap business servers running Linux and Windows.

Regarding Linux vs Solaris code quality: https://en.wikipedia.org/wiki/Criticism_of_Linux#Kernel_code...


From my perspective, if this rumor is true, it's a relief. Solaris died the moment that they made the source proprietary -- a decision so incredibly stupid that it still makes my head hurt six years later.

Fortunately, Solaris was open long enough that we in the open source world were able to fork it with illumos[1]. And because illumos became the home for many of us that brought Solaris its most famous innovations (e.g., ZFS, DTrace and zones), it should come as no surprise that we've continued to innovate over the last six years. (Speaking only for Joyent, we added revolutionary debugging support for node.js[2], ported KVM to it[3], completed and productized Linux-branded zones[4], added software-defined networking[5] and developed first-class Docker integration[6] -- among many, many other innovations.)

So illumos (and derivatives like SmartOS, OmniOS and DelphixOS) is vibrant and alive -- but one of our biggest challenges has been its association with the name "Solaris": I don't think of our system as Solaris any more than I think of it as "SVR4" or "SunOS" or "7th Edition" or any of its other names -- and the very presence of Solaris has served to confuse. And indeed, it is my good fortune to be working with a new generation of engineers on the operating system -- engineers for whom the term "Solaris" is entirely distant and its presence as an actual (if proprietary) system befuddling.

So if the rumor is true (and I suspect that it is), it will allow everyone to know what we have known for six years: Solaris is dead, but its innovative spirit thrives in illumos. That said, I do hope that Oracle does the right thing and (re)opens Solaris -- allowing the East Berliners of proprietary Solaris to finally rejoin their brethren in the free west of illumos!

[1] https://www.youtube.com/watch?v=-zRN7XLCRhc

[2] https://github.com/joyent/mdb_v8

[3] https://www.youtube.com/watch?v=cwAfJywzk8o

[4] https://www.youtube.com/watch?v=TrfD3pC0VSs

[5] http://dtrace.org/blogs/rm/2014/09/23/illumos-overlay-networ...

[6] https://www.joyent.com/blog/triton-docker-and-the-best-of-al...


While I'd love to think that illumos will rise and be great like Solaris was again, after several years I now think that's an incredible long shot.

The death of Solaris may well be a death blow to illumos as well. It sounds like Oracle, the owners of the Solaris code and copyrights, aren't seeing a future for it. That's an incredible vote of no confidence from the very owners of the code. And the positive energy they have put into Solaris at large for years (marketing, sales, staff) will cease.

While I loved Solaris and illumos back in the day, in the end I'm glad I left and switched to Linux and FreeBSD. I'm working on similar technical challenges with much bigger impact. It's been more difficult, but also more rewarding.


Is it fair to judge Solaris by the fact that Oracle has mismanaged it? I don't mean to imply that Solaris necessarily had a brilliant and uncomplicated future before the acquisition, but the acquisition turned it from a product developed as a (profitable?) labor of love into a tool for extracting ransoms from hostages. A vote of no confidence from a parasite just speaks to how valuable it is to the parasite, not to any inherent value.


I know I speak for many in the BSD camp when I say we thank you too for ALL your work in our corner of the open source world.


What would we need to see from Illumos for it to qualify as being great like Solaris was?


It's already great: SmartOS is the best one can get when it comes to running a cloud, public or private, and if that cloud must, without compromise, function correctly in the face of even the most severe failures, hardware or software wise. Zones + ZFS + fault management architecture (fmadm(1M) / svcadm(1M)) make it possible.

Have a piece of software which must run on GNU/Linux? No problem, it'll happily run inside of an lx-branded zone with zero performance penalty, where both it (/usr) and the illumos native commands will be available (/native), so one can have one's cake and eat it, too. Otherwise - there are 14,000 packages ready to run, something Solaris never, ever had.

It's not a desktop operating system, it doesn't have that kind of a mass adoption. But on the other hand, when one considers just how Windows-like GNU/Linux became (systemd), it's better that it doesn't: it does one thing and does it well, and that's powering the high performance, massive clouds. For desktop, there's macOS, and that's fine.


FWIW the packages are essentially NetBSD and the dependencies can be spectacular (pkgin in git). But lx works really, really well.


"Vibrant and alive" matches my experience with the SmartOS community. I have watched daily discussions with Joyent folks and the community, and it has been a delight.

Another factor feels critical for me as well. Troubleshooting has felt much faster on SmartOS and Triton due to the quality of logging and monitoring methods. Troubleshooting feels like O(1) because one often knows where to look and the tools are there to gather the data.

Triton and SmartOS are killer technologies, but the quality of interactions with the community are no less so. That's what makes them true open source, IMHO.


Bryan, I just wanted to say thanks for everything. I remember attending a Sun event (as a corporate C dev) where they introduced this thing called DTrace, and a very excited young engineer (clearly the smartest guy in the room), managed to infect us all with his enthusiasm. Though I haven't worked with Solaris much since then, I always was impressed with the quality of the engineering. I think I might check out Illumos this weekend :-)

Edit: Apologies for misspelling your first name.


Thank you for the kind words! One of the things that's exciting about illumos right now is that I see so many young technologists who remind me of that excited engineer you describe. ;) For example, look at the presentations from this year's OpenZFS Developer Summit[1]. Yes, there are some established names there, but there are also new ones -- young engineers who are attracted to this system for the same reason I was two decades ago: not for the system itself but for its community of talented, passionate technologists who emphatically believe in innovation in the operating system.

So hoping that you do indeed check out illumos this weekend; I think you'll find that while some of the names have changed, the spirit remains vibrant!

[1] http://open-zfs.org/wiki/OpenZFS_Developer_Summit_2016#Prese...


A quickie take can be found on LX zones in this slideshare deck, also by bcantrill http://www.slideshare.net/bcantrill/illumos-lx

Big fan of Solaris and zones, though at the moment using a mix of other technologies.

One thing I did notice about Solaris, at least in the Linux 2.6.x days: Solaris is amazing at handling low-memory situations. Once I logged in via SSH to a server that was swapping continuously and had about 2MB RAM left over - it was still somewhat responsive, while Linux of that era would have bogged down under the same situation.


Even current Linux kernels behave very poorly under memory pressure (see the various 'kswapd 100% CPU' issues, but also many issues with OOM, kernel panics and so on).


No kidding. If not for earlyoom [0], every few hours my machine would grind to a screeching halt with the hard drive thrashing (and yes, I got rid of swap ages ago but it still happens) because the kernel doesn't know what to do with large amounts of RAM being used. Before discovering earlyoom, I would powercycle my machine whenever it happened because a powercycle was faster than waiting for the kernel to finish its tantrum.

[0] https://github.com/rfjakob/earlyoom


Solaris does some odd things when emulating Linux memory semantics. IIRC Linux will "always allocate" (overcommit), then randomly shoot things in the head via the OOM killer if it overstepped the mark. Solaris will block until it can actually reserve the memory, but that can be a long, long time. It's also possible (probably only on 'too small' boxes) to allocate memory faster than the ARC can get out of the way (you can limit it, https://gist.github.com/RantyDave/4c3a3683a5403040434dda2ead...).


Is there a family tree anywhere of OpenSolaris/Illumos derived distros and what their userspace utilities 'look' like?

I'm mostly interested as a developer of config management tools where our support tends to look like "shrug, probably acts like solaris". I just want a rosetta stone for those distros, particularly when it comes to packaging and service management.

It'd be nice to know which ones are dead and which ones aren't as well; we're still carrying around definitions for NexentaCore that I'm not sure are useful to anyone any more.


They have a base list at http://wiki.illumos.org/display/illumos/Distributions but it probably is a starting place and not quite detailed enough for the information you want.


Thanks that's actually a better start than anything else I've seen.


@bcantrill - I read about your implementation of the Docker remote API through Triton. Is this something that's open source that we can play with? The Docker Captains were talking about platforms available for the Docker engine today


@alexellisuk - If you need any help exploring docker on triton or lx-branded zones on vanilla smartos, you should definitely stop by #smartos on irc.freenode.net. There are about 40-60 active high quality SysAdmins / Engineers that can answer any questions and point you in the right direction. I've never received such incredible support, and I'm not even a customer.

I recall one event where I was trying to run a KVM branded zone on a CN running on an esxi host using the vmx3net drivers and it would just core dump. 2 hours after talking to (The Man, The Myth, The Legend) rmusttachi, he had a new platform image compiled and running that fixed the min mtu size bug that was in the illumos vmx3net driver. "NEVER EXPERIENCED ANYTHING LIKE THAT" in any other community.

Alexellisuk, I will warn you... once/if you switch... trying to go back is difficult, and your forehead might get sore, depending on how hard you slam it on your desk when you try to use something like "Mesosphere". "Come on everyone, lets create custom docker images to handle dynamic Marathon port assignments because I don't have an IP" (dig + awk to find https... ...isn't https 4...4..."STOP" "WRONG" it's ${PORT4} which is 10240... ...I sadly live in this reality at the moment) /barf.



I dunno, I think the word "Solaris" helps as an enterprise (tm) thing - it elevates SmartOS from a mere fringe OS to a fringe OS that people have relied, and do rely, upon.


I'm actually surprised it took this long. I did unix admin work, including SunOS and Solaris since the early 90's.

I was a big fan of Solaris, and it had an edge over Linux for quite some time...as did the Sparc hardware over 32 bit x86.

The writing on the wall was around 2003, when AMD opteron servers came out. 64 bit Linux on dirt cheap, fast, servers.


Sun responded swiftly with OpenSolaris for x86. It's a lot easier to evaluate this in hindsight and see that cheap x86 was not really a barrier; OpenSolaris could have reached Java-level ubiquity save for politics and inertia. I remember how tainted things seemed when reading LWN.net, which in retrospect was largely FUD around the CDDL and fear of actual competition. The continued rise of Linux was really just inertia at that point; it was largely inferior outside desktop use, but had too much sunk-cost investment from IBM, SGI, and HP (from which a lot of the real scalability culture was imported - RCU, NUMA, and locking refinements in particular).

UNIX rose to prominence because it's what everyone learned in college in the '80s. Linux, because you installed it on your laptop or a VM in high school or college late '90s. Don't underestimate the power of this.


I disagree regarding Solaris x86.

There was a lot of commercial software that either didn't have Solaris x86 binaries at all, or only had 32 bit binaries.

It was arguably "better" from a purely technical view, but cheaper beat out better.


That sounds very believable and would have slowed commercial SPARC users from switching to whitebox x86, likely intentionally, but would have had no relevance to startups and the rapid onslaught of hyperscalers.

A video that amuses me with respect to engineering culture and organizational blindness is Cantrill doing a DTrace demo at Google in 2007: https://www.youtube.com/watch?v=6chLw2aodYQ. The audience seems completely unaware of the significance of what they are seeing. The length of time it took for Linux to get cogent tracing support is telling. GOOG could have single-handedly propped up an extra-Sun OpenSolaris community, and there would have been nice symbiosis considering their early container usage and how long that took to grow as well.


Sun changed its mind about whether SunOS / Solaris x86 was going to be a real, serious thing several times over the years. I wouldn't blame vendors for steering clear of the mess until it worked itself out. Which it certainly did...


> It was arguably "better" from a purely technical view, but cheaper beat out better.

The commodity always wins. Never forget that.


Steve Jobs might disagree :)


Solaris had some neat integration with Java, containerization way before lxc, and DTrace which still lives on.

The reason I think it didn't rise to ubiquity in the same way that Linux did is the lack of customization.

One can easily customize the Linux kernel for their use case (i.e. Embedded), compile it, add busybox/dropbear, and you have a decent starting point for an embedded OS. You couldn't do the same with Solaris.


In the pre-Linux world, Solaris was the best ecosystem out there. Working with various flavors of Unix was frustrating compared to Solaris. It became apparent that Sun was in trouble right after Linux started gaining a foothold.

Now that I am thinking about that time, it felt like it happened overnight. Linux overtook everything else so fast. I personally used Linux and BeOS in 1999-2001 and always thought Linux was coming, and then it just happened.


Oh BeOS. One of the best promises that failed to succeed.


BeOS was really amazing. You could encode video from two video capture cards at the same time on a 600MHz P3. You could turn off a processor in a multi-CPU system.

Be realized they couldn't compete with Windows, so they wanted to sell dual-boot boxes. But the Microsoft EULA for OEMs banned that (similar to the OHA and Google not allowing Google-Android manufacturers to create Amazon-Fire products).


> Be realized they couldn't compete with Windows

Crazy as it sounds, BeOS wouldn't run on x86 for the first few years. They were trying to capture Apple people. It also came close to being what became OS X, but Apple walked away when Be upped the price.

https://www.wired.com/2015/05/os-almost-made-apple-entirely-...

PS The only desktop OS that could pull off this stunt. Play a bunch of videos and music files and unplug the computer. Boot back up and everything is playing again just as you left it.


Interestingly it wasn't initially targeted at the PowerPC (let alone the PowerMac), the original BeBox prototypes used 2 AT&T ATT92010 Hobbit processors and 3 AT&T DSP3210 DSPs.

https://web.archive.org/web/20110806044928/http://www.bebox....


I always thought of BeOS as basically a reboot of the NeXT. Same niche, same promise, same flop ;-(. BeOS was later so they avoided the H/W side-road. A good thing.


No, it was a different niche, different promise and only one of them flopped (after all NeXTSTEP lives on as macOS, iOS, watchOS, tvOS, etc)

NeXTSTEP was Steve Jobs' attempt to build an OS that fulfilled the promise of what he saw during his visit to PARC (as opposed to just the graphical interface which is what was implemented with the Mac). It was a true-multi-user Unix with a beautiful UI and an object-oriented framework that was far more influential than its marketshare would have suggested (it led to Microsoft starting Cairo, IBM building WorkplaceOS, and Apple/IBM sinking fortunes and thousands of man-hours into Taligent/Pink).

BeOS was Jean-Louis Gasse's attempt to build a successor to the Mac (including Quicktime which came after Jobs) but built to be multiprocessor and SMT (symmetric multitasking) friendly from the start. It was --like the Mac-- a single-user OS but intended to extract all the performance possible from "modern hardware"

ACCESS Co., the current owner of the PalmOS and BeOS assets, basically frittered away whatever potential BeOS had. So it definitely flopped.


They didn't - BeOS was first launched only for their own line of hardware, the BeBox - https://en.wikipedia.org/wiki/BeBox


Didn't know that. The numbers sold are tiny.


Technologically it didn't flop, nor did Amiga. Inferior technology won in the 90s.


> You could turn off a processor in a multi-cpu system.

I saw a BeOS demo of turning off processors. The GUI allowed you to uncheck all the processor checkboxes and the machine goes dead. :)


I loved the sense of humor in the API as well.

int32 is_computer_on(); //Returns 1 if the computer is on. If the computer isn't on, the value returned by this function is undefined.

double is_computer_on_fire(); //Returns the temperature of the motherboard if the computer is currently on fire. If the computer isn't on fire, the function returns some other value.


It really was. IIRC, I remember installing it from a floppy (was there a CD too? My memory is fuzzy) on a Pentium based computer in the late 90s (R4 maybe... I think the floppy is buried in a box in my basement).

I was amazed that back then I could have four (!) windows open playing videos (albeit at low res) at the same time without hiccups -- doing it on Windows 95 on the same box would choke it up.


But some are still trying to keep it alive: https://www.haiku-os.org/


I know. So brilliant. Still remember the first demo I saw. Just floored me.



I like how the Australians call layoffs "being made redundant"


It's actually a legal term. A redundancy in Australia means that the position you fill is no longer required. You're paid out accordingly when let go, and the company cannot hire for that exact position (title, role, job) for some time.

The determination of whether a company violated that (as in, they're getting rid of YOU, not the position) becomes a whole set of legal arguments.

But essentially/idealistically;

redundancy = getting rid of the position and it can't come back under another name.

firing = getting rid of you as a person.


Oddly enough, in the US it's legally advantageous for a company to disguise a firing as a layoff, but not the other way around.

Why? Because if you get laid off, you can collect unemployment, but if you get fired for cause, you can't. But if you think the company pretended they had cause when they didn't, you can appeal, and the company will have to spend considerable resources defending their position that they had cause. As such, many employers legally classify all firings as layoffs because it's often not worth the hassle. And there's no penalties to doing so, either.

So if you get fired but the company officially considers it a layoff, it's a good thing for you: you dodged a bullet.


Unemployment benefits, and how they're financed, vary by state. In NC, they keep track of a company's layoff history and adjust the unemployment tax (levied on the employer) according to that history (hypothetically). There was a series of articles about some companies evading these and other taxes in 2012[0]. I also remember reports about unemployment tax shenanigans in New Jersey, but I can't find them quickly.

[0] http://www.newsobserver.com/news/special-reports/the-ghost-w...


I worked for a large company (in the US) a few years back, and they're in chipmaking so it's common to staff up / down based on market conditions. So what the management would do is, keep a list of people to eliminate in the next 'RIF' (reduction in force), and next time a RIF came around they'd clean house. Nobody was ever fired from that company.

Of course, as these things go, it became a verb in management speak. "Yeah, remember so-and-so? He was ok but didn't really get it so we RIFfed him last year".


That's also commonly used in the US.


I'm surprised they don't list the IBM term: "RA" ("resource action").


Like we needed more reasons to find 'newspeak' abhorrent ...


Thanks! That was a new one for me.


Looks like they're doubling down on cloud. Met with an Oracle recruiter yesterday. Shitload of money being paid to poach from AWS/Azure. I think they're too late to start building out a full-blown cloud offering.


AWS just announced yesterday that their Postgres RDS is HIPAA compliant. I imagine the rest of the federally restricted data usages will follow shortly after (I think AWS is already certified in some areas?).

Given that, I see no reason why anyone should indulge Oracle or patronize them given their revenue-model.


AWS has also been certified to be PCI Level 1 compliant for a few years now.

As far as Oracle Cloud appeal is concerned - I can totally see the big "enterprise" type IT departments using Oracle/Weblogic stack going for it at least in the "paid POC" type mode to get things rolling.


> I can totally see the big "enterprise" type IT departments using Oracle/Weblogic stack going for it at least in the "paid POC" type mode to get things rolling.

As someone who works at an "enterprise" - the default is AWS. They have the consulting network, the certifications, and the list of other big companies already using them. Their biggest challenger is Azure, because Microsoft are already in the enterprise, and have good stories to tell around helping you cloudify your Office deployment model, Exchange, etc etc etc. At that point "hosting VMs" is an easy upsell for them.


"Nobody's gotten fired for buying, uh, Oracle."


Browse federal websites and check how many Sun icons you still see. My hypothesis: Oracle killed Solaris once they realized USDS and 18F had solid long term growth prospects.


I don't track these things closely, but hasn't 18F come to be considered a disappointment in terms of results-per-dollar-spent, so far? (Not to mention results-per-unit-hype)


I think between USDS and 18F, 18F has been showing more stable growth and they're on an eat-what-you-kill budget.


Good to know. I've been more-or-less able to keep clear of the government purchasing super-fun-games for a few years and anything I may have ever known about that world has gotten stale. Thanks!


I believe it was already HIPAA compliant (or else I need to go into hiding).

The path to HIPAA compliance in AWS is just to arrange a business associate agreement with Amazon.


Amazon's BAA allows you to use only approved services [1] to process, store, and transmit ePHI. The list isn't very long, it's currently just 10 AWS services, and it doesn't include some basic ones like SQS. RDS with PostgreSQL was just added to that list this week (Aurora was also added, which is neat because now that it has a PostgreSQL front-end, I have two reasons to play with it).

[1] https://aws.amazon.com/compliance/hipaa-compliance/


RDS was only HIPAA for MySQL and SQL Server. PostgreSQL certification is brand new and a huge deal for some of my projects.


So I suppose this means storing data in Postgres RDS is HIPAA compliant now by default? I am not an expert in this, but I do have to sit through a day of training every year for this.

You should probably be doing a bunch of other things to be HIPAA compliant in AWS, it's not just a box you check off.

In the past you could be HIPAA compliant and use Postgresql RDS by signing a business associate agreement and doing things like using dedicated instances in their own VPC.


> So I suppose this means storing data in Postgres RDS is HIPAA compliant now by default?

At a minimum, you'd still have to sign that BAA with them. I mention that not for you, but for anyone else at home thinking "oh, I can deploy RDS/PostgreSQL and be OK with HIPAA without doing anything else!" That's (still) not the case.

In logic terms, this certification is necessary but not sufficient. It's not sufficient by itself, but it is a hard requirement because RDS hasn't been covered under their BAA up until the last day or so. That is, it wasn't covered the last time I checked, maybe a week ago, but it is now today. This was confirmed by our AWS tech reps when we recently talked to them: they absolutely did not HIPAA certify PostgreSQL the last time we asked about it. And oh, how I promise you we talked about it.

> In the past you could be HIPAA compliant and use Postgresql RDS by signing a business associate agreement and doing things like using dedicated instances in their own VPC.

Citation needed. We were told multiple time by our reps and solution architect that RDS+PostgreSQL was not certified in any way. The only AWS options we had for HIPAA PostgreSQL were 1) hosting our own instance (that is, not using RDS in any way, just plain old EC2) or 2) paying a third party for managed PostgreSQL hosting.


In the past (before this week), if you were storing ePHI in RDS with PostgreSQL, you weren't following the terms of the BAA.


They won't get far in the small player segment with their current try-before-you-buy policy. Their demo/test services suck.

https://cloud.oracle.com/en_US/tryit

You need to enter a phone to receive a verification code.

This is no longer a matter of convincing a few huge players. You need the mid/small size community to build vibe & hype. Oracle hasn't learned this lesson yet.


While late, they, like MS, have the name and history. It is the present day version of "won't be fired for buying IBM".


They've been building cloud for a while (acquisitions of Nebula and other teams), but probably with an anemic team.


Well, if not for this being 2016, it would be the saddest news I've heard in a while, but this came within a week of Ron Glass dying, so I guess it's only the second-saddest news item this week alone.

I guess it'll still live on in illumos and its distributions like SmartOS, OpenIndiana, etc., but still... Solaris brought the computing world so many innovations (NFS, ZFS, DTrace, etc.), and it's going to take a while for it to fully sink in that it's gone.


Solaris was finally done 10 years ago when, instead of responding to what the market needed, they were touting DTrace (and pouring all the dev-cycle resources into it, as far as I could see) as the biggest and shiniest feature. Granted, that small and, from a market POV, insignificant feature really was the biggest and shiniest thing in Solaris at the time (compared to the rest :) ... and that is what did Solaris in. And an easy install - insert the disk and go, Windows and Linux style - never materialized ... I think some people still have PTSD from that.


Don't forget about both ZFS and Zones, both of which are pretty important and demanded by the market. Just ask Nexenta and Joyent.


Yeah, Solaris totally should have been focused on actual customer needs, like containers.


They did, they were called zones.


One of those times where a sarcasm punctuation would be useful.


I suppose it's too much to hope for that Oracle re-opens Solaris.


Getcher open-source fork of Solaris here: http://wiki.illumos.org/display/illumos/illumos+Home

It's been around for a while, initially started by some of the people flooding out of Sun in the immediate wake of the Oracle acquisition, in a (very slightly) less messy version of the Hudson/Jenkins split.


Would you mind pointing me at the Solaris kernel source code? Directly, please, and not a leaked version of it.


It's not Solaris -- it's illumos and has been for a long time. And the source code is available on GitHub.[1]

[1] https://github.com/illumos/illumos-gate


I explicitly asked for a direct link to the kernel source code. Because I never found that online. Your link is basically a pointer to the haystack.



Thank you very much, that seems to be what I was looking for. I had feared, the Solaris kernel hadn't been made publicly available.



> not a leaked version of it.

It's a fork. If you're interested how it came to be check out this epic talk by Bryan Cantrill https://www.youtube.com/watch?v=-zRN7XLCRhc



There is still SmartOS if you're into this:

https://en.wikipedia.org/wiki/SmartOS


Yeah, the Joyent stuff is best bet for FOSS Solaris lovers.


"Oracle says [claims of Solaris being canned] flat out wrong."

https://twitter.com/TheRegister/status/804451784324366336


Sun/Solaris/SPARC and running sunhelp.org was a major part of my life for over a decade.

This is sad to see, but the acquisition of Sun by Oracle pretty much started the downhill slide.


One can probably argue the other way, the downward (market) slide is what led to the Oracle acquisition.


Sad to see this happen, but I saw this coming. I set up an OpenSolaris NAS at home just weeks before Oracle closed Solaris 11, and (with the particular exception of ACLs) loved ZFS. Closing Solaris and offering a "free" download under vague, menacing terms sealed its fate. My NAS became an OpenIndiana system for a while, but it's now CentOS 7 + ZoL.

RIP Sun.


It amazes me that people still do business with Oracle.


Thank God we've still got HPUX.


And AIX.


I can't tell if that's sarcastic.

It used to be said [...] that AIX looks like one space alien discovered Unix, and described it to another different space alien who then implemented AIX. But their universal translators were broken and they'd had to gesture a lot.

-- Paul Tomblin


I think they both are being sarcastic. I am migrating from Solaris to AIX (stupid government) at work and AIX sucks.


Well, I know I was being sarcastic.

Although I suppose that if you held a gun to my head and forced me to select a commercial Unix for a project, and if my stunned perplexity didn't get me killed, HPUX would be an admirable choice.


Wow, I didn't realize Intel still made Itaniums and that HP still sold boxes built around them. Or that HP-UX 11i was, like a zombie, still alive somehow.


Grrr. SMIT[1], a tool that is in many cases non-optional for accomplishing administrative tasks on AIX boxes, making it nigh impossible to script common stuff that was easy on any other unixish OS.

[1]https://www.ibm.com/developerworks/aix/library/au-smit/


I've always used smit (or smitty) to do an initial configuration, then hit F6 to get the actual commands that get executed, and script the rest from there.
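For anyone who hasn't touched AIX: smit/smitty append every command they actually run to a log file (`~/smit.script`), which is what makes the "configure once interactively, then script it" workflow above possible. A rough sketch of recovering those commands; the log contents here are simulated, since this obviously isn't an AIX box, and the `mkvg`/`chfs` invocations are just illustrative examples.

```shell
# smit/smitty log each executed command to ~/smit.script, preceded by a
# timestamp comment line. Simulate such a log for illustration:
SMIT_SCRIPT=/tmp/smit.script
cat > "$SMIT_SCRIPT" <<'EOF'
#[Tue Dec  6 10:02:11]
mkvg -y datavg hdisk1
#[Tue Dec  6 10:03:45]
chfs -a size=+1G /home
EOF

# Strip the timestamp comments to recover the reusable command lines,
# ready to paste into your own automation:
grep -v '^#' "$SMIT_SCRIPT"
```

On a real AIX system you would skip the simulated `cat` and just read `~/smit.script` after driving the change through smitty.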


When you're IBM, you want everything to look like AS/400.... or clunky Java Swing apps.


I know IBM is weird... but AS/400 or whatever they call it now is still just an amazing platform. It's one of those things that ends up doing things that are super critical. In some ways it doesn't have real competitors. OpenVMS? NonStop, if there's an infinite amount of money?


AS/400 was a multi language VM environment long before Java. It really was/is amazing. I wish there was a way for newer generations to learn about things like this. I'm not sure how that would look. Maybe a History of Systems book or something?


There are many retro computing enthusiast groups around (depending where you live; none here around Dublin, it seems), but I've yet to see one dedicated to midrange or mainframe systems.

It is a shame. Many challenges we find today happen to have been solved in the 60s. Then in the 70s, then in the 80s...


Exactly: everyone knows their retro consoles and home computers, but I can't find anyone with any interest in the systems you mentioned. Bummer.


Worse: even platforms that are still in use, such as IBM zSeries and iSeries, are very poorly represented in tutorial space.


I'd love me some IBM i hands on day.


Is there a Hercules for iSeries?


It's not exactly retro either. I have a friend who works at an absurdly large financial institution... all of their credit card transactions clear through iSeries/OS400/whatever-they-call-it...

EDIT: One thing that I love is the fact that they distributed apps in intermediate form and then compiled at installation time. (Sounds familiar, right?)


Heck, I work on an AS400/iSeries/IBM i or whatever they are called.


The terminal font used to be beautiful...


I imagine many UNIX folks who never touched AIX aren't aware that around the 2000 time-frame, the .so model used in AIX was similar to the Windows one.

There were import libraries, and symbols to be exported had to be defined in export files.

Of course, eventually they converged into the standard UNIX model for shared objects.


AIX really wasn't all that bad back when I was using it. It's very unlike HP-UX/*BSD/Solaris/Linux (sometimes annoyingly so), but it's generally rock-solid and very, very easy to manage.

Unlike what someone else wrote, it was IIRC always possible to do anything SMIT did via the command line, but frankly SMIT was often easier & faster than doing it by hand.

One of AIX's problems is that it didn't IMHO age well. It was designed for a different world than the one we live in now, and one thing it never felt designed to do was evolve.

Would I recommend AIX for any project in the future? Hell no, because it would mean inviting IBM into the project, and that is second only to inviting Oracle into a project on the list of gross management errors. But it wasn't bad software for its time, and I don't think it deserves to be hated.


Not only was it possible to do everything in SMIT (or smitty) on the command line, smit would tell you the exact command line and options that it was running for any given operation, making it very easy to script and learn.


AIX is dead; even IBM jumped on the Linux/open-source bus for the dying PowerPC platform.


Worked a lot with Solaris 9/10 a few years ago, in the end-of-Sun days. Amazing systems. Reliable AF. Zones and ZFS were a delight. The SPARC/Solaris/JVM combo worked great in enterprise apps. We had one server running, with some minor upgrades, with zero issues/reboots, for more than 2 years.

Then... it stopped being great, and an x86 Linux machine running the same JVM cost a fraction and performed equally, if not better. Difficult to justify the licensing and support costs in the Oracle days.

Farewell Solaris.


I'd forgotten that Solaris was still alive. I wonder to what extent their IP portfolio will linger on, like the zombie corpse that was SCO being used against Linux.


SCO was very much alive (though perhaps suffering a severe case of senility) when it started going after Linux.


Here's how I understood it, in a nutshell:

- SCO decides to sell off its Unix and becomes Tarantella

- Established Linux vendor Caldera buys the rights to distribute SCO Unix

- Caldera changes its name to SCO and subsequently starts filing lawsuits


Is this the end of SPARC hardware? I know Oracle has been supporting Linux/SPARC development, including a distribution: https://oss.oracle.com/projects/linux-sparc/.


Fujitsu went with ARM instead of SPARC for their new supercomputer[1][2]. It might be another indicator.

1) https://www.top500.org/news/fujitsu-switches-horses-for-post...

2) https://news.ycombinator.com/item?id=12018287


Honestly I hope so. Good riddance. SPARC is strange compared to other RISC architectures. I was never a fan of the register windows (and associated overflows) and the global/out/local/in register types. Reading the ASM was nasty in my experience.


Let it live through open source. A donation to Apache?


That is not the way of the Oracle.

Oracle does not care. Oracle thrives on badwill.


Oracle totally and freely threw away every bit of goodwill that Sun had fostered over the years, when they bought the company. Chunked it in the dumpster out back, didn't care.

The "old" Sun: Encouraged hobbyist use of hardware, put out software under a "free unless you need to pay for support" term, open-sourced Solaris [1], was generous with hardware donations [2] to various organizations, and realized that if a sysadmin liked playing with Sun gear at home, they were more likely to recommend it at work.

The "new" Sun: Oracle flips everyone the bird with both hands, won't even communicate with you unless it's about a paid support contract.

[1] I was lucky to be one of the 250 people picked as the OpenSolaris test/release/publicity team; still have my "xxx of 250" poster print on the wall of my home office. [2] They gave a Netra T1 and a disk shelf to us to run the Sun-Managers mailing list with, told me to keep a review-unit T1000 to run sunhelp.org on, and sent me a loaded Ultra 10 after a bit of a "misunderstanding". These are just three examples of many, many instances. [3] http://www.sunhelp.org/letters/


Unfortunately, the goodwill didn't pay the bills.


Makes me wonder what would have happened if IBM had been the suitor instead of Oracle, as I saw rumored.


Someone in the know at the time claimed to me that IBM's plan was to keep the hardware and customers, and mitigate anti-trust concerns by spinning off the software to Red Hat.

Given RH's compulsive open-sourcing of acquisitions, it's one of the great tragedies of the software industry that IBM got cold feet over concerns that Sun was facing violations of anti-bribery laws.


Oracle goes for the jugular - in this case, the cheque-writing vein ...


Oracle hates open source and free software even more.


Great to see that Solaris still generates as much interest as this. Frankly, given how well the OS works, and how far ahead Oracle has taken the SPARC hardware, I can't see them just chucking it all away. The M7s and S7s are freaking awesome.

That, and the whole "Larry likes Larry's stuff more than anyone else's" thing...

@brendangregg: I'll bet ya $10 that neither Solaris nor SPARC are going away any time soon. :)


RIF:

A generic reduction in force, of undetermined method. Often pronounced like the word riff rather than spelled out. Sometimes used as a verb, as in "the employees were pretty heavily riffed".

https://en.wikipedia.org/wiki/Layoff#Common_abbreviations_fo...


So, will Oracle open source what they closed after killing OpenSolaris? I'd also appreciate ZFS with GPL compatible license.


Prediction: they aren't going to do the decent thing and open-source the whole thing when they stop it.


Is there anything interesting left that's not already in illumos?


Some of their changes to ZFS that were under development at Sun before OpenSolaris was killed would be useful to the community. Also, there was a ton of driver work that they did that the Illumos community could use.


Which drivers? I just looked at a document of theirs which details how they ripped out a whole bunch of drivers on the i86pc platform. For example, if I were crazy enough to upgrade to Solaris 11.3, my cadp160(9D) wouldn't work any more because they ripped it out. All of a sudden, my system could no longer talk UltraSCSI 160. IPFilter - gone. They ripped out a whole bunch of libraries like libpng. Paravirtualization has been ripped out. Cheap UltraSPARC hardware support like the M3000, T1000, T2000, T5220, T5240, T3-1, all ripped out. /etc/defaultrouter gone. Zone archive formats gone. Adobe Flash player gone. sysidtool(1M) gone. smdiskless(1M) gone. SmartCard support gone. lx-branded zones gone, and they tell me to use Xen on Linux, no way! PostgreSQL - gone.

http://www.oracle.com/technetwork/systems/end-of-notices/eon...

No thanks.


USB 3.0 for one. Then there is their port of Intel's video driver from Linux and drivers for wireless NICs.


xhci for SmartOS is actually code-complete and in its final testing.[1]

[1] https://www.mail-archive.com/smartos-discuss@lists.smartos.o...


Awesome. Thanks for letting me know that Robert got this working. It is probably obvious by now that I am behind on reading mailing lists. I'll try to catch up before year's end. :)


I don't care enough about USB 3.0 - servers don't need it, and I boot my servers from the network via PXE and DHCP anyway.

intel video - couldn't care less either - I just buy Nvidia, download their SVR4 Solaris package and am up and running in seconds. It JustWorks(SM), so I always buy Nvidia - go sgi engineers.


After a certain level of driver support, more drivers are nice, but they have diminishing returns in the number of people impacted by them. It is quite understandable that it does not matter to you. My point that having those drivers released as OSS for integration into illumos would be nice still stands, though. :)


Support for ZFS pool/FS versions that were added after the source was closed back up


I thought everyone who cared about that kind of thing moved to Linux or BSD years ago?


The interesting parts of Solaris (ZFS and dtrace) aren't under a GPL-compatible license, so the distribution situation is kind of interesting for people who want them. (I think it's something like "they can only be distributed as source, and there's a big script to make your package manager automatically compile the kernel module when you install it, and then your kernel is tainted".)


1. So how does FreeBSD ship it? I thought their whole OS (kernel + main userspace) is under a BSD license.

2. Why can't Linux vendors ship source code to be compiled on clients' computers, so no distribution (forgot the legal term) takes place?

3. If the patent license only covers the code they released (and they reserve the right to sue over reimplementations), what will FreeBSD (or illumos) do if there's a bug in the code? Once you change the code, you can get sued.


1. CDDL is fully BSD compatible. The license is file based, so it's non-infecting, and binaries can be re-licensed. Win/win.

2. Most already do. Some even believe that 1 makes CDDL compatible with GPL as well and so ship binaries.

3. Patent protection in CDDL is extremely strong. Rumor has it that Oracle wanted to kill illumos via litigation, but never went ahead with it because they knew they'd never win because of the CDDL.


3. So why can't Linux black-box re-implement it?


In my opinion, reimplementing ZFS from scratch would cost at least $100 million and take 5 to 10 years of development by many talented people. Given that we already have the source code under the CDDL and it is good enough, no one is willing to spend that kind of money. Why would they? There is no business case.

If anything, Oracle's software patents are a case against it because they could sue a clean room implementation like they did with Android's Java implementation. They would have a stronger case too due to the hundreds of patents covering ZFS. That is the elephant in the room with btrfs that no one discusses. :/

Anyway, I see no need to reimplement ZFS from scratch after consultation with attorneys of the SFLC and others.


Linux has a black-box reimplementation of DTrace (with different design sensibilities, but essentially all the same functionality) under the name of eBPF, and bcc for the userspace bits. Linux is trying to do that for ZFS under the name of btrfs, but it's not as good.


Oracle also ported DTrace to Oracle Linux, and it was available if you purchased support from Oracle.


> 2. Why can't Linux vendors ship source code to be compiled on clients computers, so no distribution (forgot the legal term) takes place?

For ZFS, they do. Debian's legal advisors say that shipping it as a DKMS module (i.e., as you describe) is fine, so they do that.

https://bits.debian.org/2016/05/what-does-it-mean-that-zfs-i...

https://packages.debian.org/jessie-backports/zfs-dkms

Ubuntu's legal advisors say that shipping the compiled module is fine, so they ship that.

https://insights.ubuntu.com/2016/02/18/zfs-licensing-and-lin...

DTrace is rather more closely coupled to the kernel than a filesystem, and there are good Linux-native alternatives now.

http://www.brendangregg.com/blog/2016-10-27/dtrace-for-linux...


FreeBSD ships a large chunk of software under licenses other than BSD, and used to ship even more until the last couple of versions.


illumos / SmartOS. And we've had lotsa help from our friends in the FreeBSD camp, bless their hearts.


Oracle is such a destructive force. Getting bought out by Oracle = slowly being ripped apart limb by limb.


Hardcore Solaris user here: Solaris 10 on Intel is the last Solaris I'm ever going to run. After:

a) Bart Smaalders (with plenty of help from Glynn Foster and Shawn Walker) introduced IPS into Solaris

b) they did away with JumpStart(TM)

c) they did away with compressed Flash(TM) archives

d) they did away with sparse zones

e) Oracle closed the source code

I went to illumos / SmartOS and never looked back. I understand from a fellow engineer who still runs Oracle Solaris that 11.3 is the latest version, and I couldn't care less. I will never accept IPS because it has no preinstall / preremove / postinstall / postremove on purpose. Never, ever. I do my own builds of SmartOS from source code, PXE boot it from the network, and all is well with the world.
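For readers who never packaged for Solaris, a sketch of what the "no preinstall/postinstall" complaint means (the package paths and service FMRI below are invented for illustration): SVR4 packaging let a package ship arbitrary hook scripts that pkgadd ran as root, while IPS deliberately replaces hook scripts with declarative actions and actuators in the manifest.

```
# SVR4 packaging: the package can ship a "postinstall" shell script,
# which pkgadd executes with root privileges after laying down files:
#!/bin/sh
/usr/sbin/svcadm restart svc:/network/myapp:default

# IPS: no hook scripts at all. The manifest declares the file and an
# actuator; pkg(1) restarts the named SMF service itself when the file
# is installed or updated:
file path=usr/lib/myapp.so owner=root group=bin mode=0755 \
    restart_fmri=svc:/network/myapp:default
```

The IPS design trades the flexibility of arbitrary scripts for safer, more predictable (and image-relocatable) installs, which is exactly the trade-off the commenter above rejects.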


I wonder what this means for the future of ZFS and its license? Possibly no change.


OpenZFS is the only meaningful fork now. Unless Oracle open sources the changes to ZFS from 2010 when OpenSolaris was axed, those changes are lost. And even if open sourced, I expect those changes would be (selectively) merged into OpenZFS.


I wonder how this affects Joyent.


Joyent won, that's how. If this rumor is indeed true, it means Bryan's vision and perceptiveness beat Larry, albeit indirectly.

Joyent's SmartOS is built on illumos, and illumos is a fork of Solaris Express / OpenSolaris / ONNV, and a whole bunch of former Solaris kernel engineers, who were at key positions at Sun Microsystems, and are now across several successful companies, still commit fixes and features into the illumos source code. For example, illumos has OpenZFS and KVM, two major features Snoracle Solaris doesn't have and can't take back unless they open source the code again, not that anyone cares what they do.


I don't see how it could. Except that maybe we'll see some more Solaris refugees coming in.

illumos has been a thriving project for over six years, fully independent from Oracle. There has been zero code sharing, and little interaction of any kind.


IIRC, Joyent's stuff is built on Illumos, which is an open-source fork of Solaris 10 created when Oracle announced Solaris 11 would be closed source. The Illumos developers put in considerable work ensuring Illumos would be feature-compatible with Solaris 11, but it's not like they were adding new bits of Oracle code or anything.

Illumos will continue on their own just like they've been doing for the last few years. They'll probably have even more freedom to innovate, since they won't have to worry about chasing Solaris 12 compatibility.

Edit: And Joyent is one of the major sponsors of Illumos, too, so their platform definitely isn't going away.


Well, improved access to skilled developers, given the layoffs.


Worse than something like this happening is seeing former Solaris engineers rushing to gloat over what would be the mass riffing of their former colleagues (and the remaining Solaris "elders" who helped them become who they are today).


This is a baseless rumor. I have flagged this submission. It's just someone trolling the internet.


what a brilliant idea for a website. they need to change the homepage to just be the latest layoff news.


wasn't that the whole point behind http://f*ckedcompany.com/ back in the late 90s and early 00s?


So what you're saying is that there are a bunch of unix engineers looking for jobs.


RIP UNIX

SVR4 / UNIXWARE / SCO / AIX.

Without you, Linus would never have cloned you as his own.


Other vendors throwing in the towel interestingly means AIX will continue on much longer. The UNIX(R) market is shrinking, but high-margin. HP-UX and AIX are the only UNIX(R) choices left, and IBM is a much more stable entity. IBM has a pretty solid strategy with [Open]POWER, embracing Linux while continuing to sustain i and AIX. AIX is both less sticky and less compelling than i, but nonetheless 27 years of grabbing niches like DBs and large sw suites means it will be around for at least another decade.


The difference between Solaris and Linux is mainly scalability. Linux scales well on clusters such as SGI UV3000 scale-out servers, or Top500 supercomputers. These scale-out clusters serve one scientist at a time, running HPC number-crunching workloads for 24-48h. Scale-out workloads are easy to parallelize, doing a calculation on the same set of grid points over and over again. All this fits into a CPU cache and can run on each separate compute node. All SGI UV2000/UV3000 use cases are HPC number crunching, analytics, etc.

OTOH, enterprise business workloads (SAP, OLTP databases, etc.) typically serve thousands of users simultaneously. They do payroll, accounting, etc. Such workloads cannot be cached in the CPU cache, so you need to go out to RAM all the time. RAM latency is typically 100 ns, which corresponds to a 10 MHz CPU. Do you remember 10 MHz CPUs? This means business workloads have huge scalability problems, because you need to place all CPUs on the same bus, in one single large scale-up server. If you try to run business workloads on a scale-out server, performance drops drastically as data is shuffled among nodes over a network instead of on a fast bus.
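A quick back-of-the-envelope check of the 100 ns / 10 MHz claim above: if a CPU stalled on a cache-missing RAM access on every operation, its effective rate would be the reciprocal of the memory latency.

```python
# Effective operation rate of a CPU that waits out a full RAM access
# (~100 ns) on every operation: rate = 1 / latency.

RAM_LATENCY_S = 100e-9           # 100 ns per cache-missing memory access

effective_hz = 1 / RAM_LATENCY_S # operations per second
effective_mhz = effective_hz / 1e6

print(f"{effective_mhz:.0f} MHz")  # -> 10 MHz
```

Real CPUs of course overlap memory accesses and hit cache most of the time; the point of the comment is that workloads which miss cache constantly degrade toward this latency-bound figure.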

Thus, business workloads run on a single large scale-up server, with at most 16 or 32 sockets. This domain belongs to Unix/RISC and mainframes. HPC number crunching uses large clusters such as the SGI UV3000, which has 10,000s of cores.

The largest Linux scale-up server is the new HP Kraken. It is a redesigned old Integrity Unix server with 64 sockets; the x86 version of the Integrity maxes out at only 16 sockets. Other than that, the largest x86 servers are vanilla 8-socket boxes from IBM, HP, Oracle, etc.

Linux devs only have access to 1-2 socket PCs, so Linux cannot be optimized or tested on large 8-16 socket servers. Which Linux dev has access to anything larger than 4 sockets? No one. Linus Torvalds? No, he does not work on scalability on 16-socket servers. There is no Linux dev working on scalability on 16-socket servers. Why? Because, until last year, 16-socket x86 servers hardly even existed! Google this if you want; try to find a 16-socket x86 server other than the brand-new HP Kraken and SGI UV300H. OTOH, Unix/RISC and mainframes have scaled to 64 sockets for decades.

Look at the SAP benchmarks. The top scores all belong to 32-socket UNIX/RISC systems running large SAP workloads. Linux on x86 occupies the bottom, running small SAP workloads. The HP Kraken has bad SAP scores considering it has 16 sockets; they are almost the same as the 8-socket x86 SAP scores. Bad scalability.

Thus, if you want to run workloads larger than 2-4 sockets, you need to go to Unix/RISC. Linux maxes out at 2-4 sockets or so. The new Oracle Exadata server sporting the SPARC T7 (same as the M7 CPU) runs Linux, and it maxes out at 2 sockets. If you want 16-socket workloads, you must go to Solaris and SPARC. All large business servers use Unix or mainframes. No Linux anywhere.

Linux = small business workloads. Solaris = large business workloads. And the big money is in large business servers. If Oracle kills off Solaris, then Oracle is stuck at 2-4 sockets (small revenue). Only Solaris can drive large business servers (big revenue).

It does not make sense to kill off Solaris, because then Oracle cannot offer (expensive) large business servers. Then Oracle would be stuck with small, cheap business servers running Linux and Windows.

Regarding Linux vs Solaris code quality: https://en.wikipedia.org/wiki/Criticism_of_Linux#Kernel_code...


Going the way of UNIXWARE / AIX.

RIP UNIX.


AIX is still around.



