FreeBSD has lower latency, and Linux has faster application speeds (quora.com)
134 points by krn 8 months ago | 75 comments

Speaking as somebody who works on the FreeBSD kernel at Netflix, the whole article is kind of nonsensical. We don't use FreeBSD because of "latency". We serve clients over https that could be tens or hundreds of ms away using the kernel TCP stack. The folks who care about latency are high frequency traders, who care about every nanosecond. They tend to use userspace / hardware offloaded solutions and do things like busy-waiting for messages.


Have you ever considered using Dragonfly? It appears to dramatically beat FreeBSD in perf tests (2017).


I’m curious to hear your take on Dragonfly since you’re a kernel developer.

Thanks in advance for all your work.

Edit: more recent links (2018) with even higher perf



I have looked at Dragonfly with interest, due to how they manage VM. I really like how they shard the page queues to be per-cpu. With that said, a lot of work has been done by jeffr in the last 6 months to a year to improve NUMA performance (which we partially sponsored) which has dramatically improved VM scalability even on single socket boxes, and which has allowed us to remove a lot of our local hacks.

I considered giving Dragonfly a try in the past, but just never had the time. There would be a depressing amount of work before we could even consider Dragonfly, mostly centered around our async sendfile (which is now upstream in FreeBSD), our TCP changes: BBR (not yet upstream), RACK (now upstream) and TCP pacing (now upstream). Not to mention unmapped mbufs (not yet upstream) and kernel TLS (also not yet upstream). Also, the last time I looked, Dragonfly did not have drivers for Chelsio or Mellanox 10/25/40/100GbE NICs. Without even just one of these things, performance will be so bad that any comparison testing would be meaningless.

Thanks for the thoughtful reply.

Just a thought, the Linux/BSD community might hugely benefit if someone with your deep knowledge released a set of perf test scripts that anyone can run locally to regression test network perf. That way, the OSS community can integrate those perf test scripts into their commit/regression test pipeline.

It would be great to have this kind of test, but often the regression test is going to be running the system under production load. Benchmarks have a way of testing something, but not quite what you need to test. The interactions within the whole system are important.

For the Netflix boxes, they're pushing 40Gbps+; not a lot of the community is going to be able to test that, unless they have fairly expensive networks lying around.

You're spot on. We don't really have good benchmarks that approximate our production traffic.

BTW, 40Gb/s was so 2015. I have a box serving at 156Gb/s :)

As someone who grew up around Telex machines and current loop interfaces: consider me impressed!

Oh my! I recall there was an article about pushing 100Gb/s not long ago and hitting the memory bandwidth limit. Now it is 150Gb/s already? Are you guys going to try something crazy like 400Gb/s? At this rate Netflix could start selling its appliance as another business.

> BTW, 40Gb/s was so 2015. I have a box serving at 156Gb/s :)

That's right, I wanted to put 100+, but I wasn't totally sure. I stopped counting when you got way beyond the 20G connectivity on the servers I manage.

For what it's worth: Sepherosa Ziehau spent some years serving out huge streams of video in a company similar to Netflix, in China. He used DragonFly. So, it may not have the same changes you are listing, but it may be closer than you think.

If I may ask, why do you use FreeBSD in particular then?

Native (properly working) ZFS, I don't need any other reason :)

It works on Linux, but being able to boot from it is still a huge hassle.

I tried to install Ubuntu onto root ZFS and it wasn't much of a hassle, just a simple howto, and everything worked.

I've been running root on ZFS (GELI- or LUKS-encrypted, no less) since FreeBSD 7 and Ubuntu 12, and to be frank the story on Linux just isn't that great. Documentation for root on ZFS only exists for Arch, Debian, Gentoo, and Ubuntu, with only Debian and Ubuntu having documentation for encrypted root ZFS pools. Last I checked, not a single Linux distribution's installer supported ZFS, so if you want root on ZFS, you must install Linux by hand. That's great if you have unlimited time, but it sucks for any kind of deployment at scale.

Contrast with FreeBSD, with first-class support for installing root on ZFS since 10.0-RELEASE, four years ago.

I’m going to go out on a limb and speculate there might be different hurdles for a single installation vs many installations.

Does it have the same level of QA?

No sources or benchmarks?


> But unless you’re using Fedora, your servers aren’t gonna taste it soon. Just by bad timing, the new Ubuntu LTS (18.04) due out next week will still use 4.15, which it will support for 5 years. Since CentOS and Debian are even further behind on kernels, you won’t see a lot of “fast Linux” in production until April 2020 when Ubuntu 20.04 LTS comes out.

If you are at a point where the differences would matter to you, you can (and will) probably just install a newer kernel and have the improvements available today. Just because upstream doesn't have it doesn't mean you couldn't compile it yourself or use backports.

> No sources or benchmarks?

There was this in TFA:


The benchmark results at that link are from machines with different processor clocks. The results are invalid.

This benchmark is kind of meaningless. What is a "TCP request response" actually measuring? Any Linux server can certainly handle more than 340 HTTP reqs/sec for example.

The "netperf" benchmarking software appears to be from 1993 (complete with a webpage from that era! https://hewlettpackard.github.io/netperf/) so I have a lot of doubts that it is using modern networking APIs or taking advantage of modern hardware.

Netperf is probably the most widely adopted network benchmark out there. Almost every company I've seen has used it for performance and QA testing. For what it does, it is quite efficient. I remember when I was doing 10GbE drivers in the mid 2000s, people would complain of terrible performance on "modern" tools like iperf, but would see 10Gb/s using netperf. This is because "modern" tools did things like gettimeofday() around every socket read or write, making them basically gettimeofday() benchmarks at high message rates, but netperf did the gettimeofday() around the entire test.
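To make the timing point concrete, here is a minimal Python sketch of the two measurement styles described above. It is not netperf or iperf code: `CountingClock`, `time_each_op` and `time_whole_run` are hypothetical names, and `op` stands in for a socket read or write.

```python
import time


class CountingClock:
    """Wraps time.perf_counter and counts how many times it is read."""

    def __init__(self):
        self.reads = 0

    def __call__(self):
        self.reads += 1
        return time.perf_counter()


def time_each_op(n_ops, op, clock):
    """Read the clock before and after every operation (2 * n_ops reads)."""
    total = 0.0
    for _ in range(n_ops):
        start = clock()
        op()
        total += clock() - start
    return total


def time_whole_run(n_ops, op, clock):
    """Read the clock once before and once after the entire run (2 reads)."""
    start = clock()
    for _ in range(n_ops):
        op()
    return clock() - start
```

With a cheap `op` and a large `n_ops`, the first style spends a growing share of its time in the clock itself, which is the effect the comment attributes to per-read/per-write gettimeofday() calls. How large that share is depends on the platform's clock cost, so no numbers are claimed here.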

Netperf's problem has always been that it is single connection, single-threaded. Most people use it coupled with patches or scripts to run many copies of netperf in parallel.

I suspect what happened in these tests is that the author just ran a single copy of the tests, and did not bother to adjust the interrupt coalescing settings. So what he really measured was different interrupt coalescing settings in the FreeBSD and Linux drivers. If he'd run 500 or 1000 copies, I'll bet he'd see vastly different results.

BTW, if you're looking for a modern network benchmark, check out uperf (http://uperf.org/). It seems to have been largely abandoned after the Oracle acquisition, which is a shame, because it was just plain awesome. It could replicate many scenarios very realistically, supported multi-threading, etc.

Iperf 2.0.10+ uses clock_gettime() when a timestamp is needed. For TCP and no interval reporting the only calls needed are at the beginning and end of the test. The performance problem we hit with 2.0.5 had to do with insufficient shared memory between traffic threads and the reporter thread.
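A rough back-of-the-envelope model of why a single-copy request/response run is so sensitive to per-packet delay: with one request in flight per connection, each connection completes at most one transaction per round trip, so any fixed delay added per transaction (a coalescing timer, say) lands directly in the result. The function name, the microsecond figures and the `capacity_tps` ceiling below are all made up for illustration; nothing here is measured.

```python
def rr_rate(base_rtt_us, added_delay_us, connections=1, capacity_tps=None):
    """Upper bound on transactions/sec for synchronous request/response.

    Each connection keeps exactly one request in flight, so it can finish
    at most one transaction per (base_rtt_us + added_delay_us).
    capacity_tps is a hypothetical ceiling standing in for whatever
    saturates first (CPU, NIC, link) once many connections run in parallel.
    """
    per_txn_us = base_rtt_us + added_delay_us
    rate = connections * 1_000_000 / per_txn_us
    if capacity_tps is not None:
        rate = min(rate, capacity_tps)
    return rate
```

Under these assumed numbers, `rr_rate(50, 0)` and `rr_rate(50, 50)` differ by a factor of two for a single connection, while with 1000 connections and a shared ceiling both settings hit the same cap. That is the sense in which a single-copy run can end up comparing coalescing defaults rather than the stacks themselves.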

It's also incorrect even for LTS supported kernels: LTS releases incorporate new kernel versions along the way as part of the hardware enablement work, though they are not used by default.

For example, here's 4.13 in the previous LTS (16.04), up from the default 4.4: http://packages.ubuntu.com/linux-signed-image-generic-hwe-16...

> Just because upstream doesn't have it doesn't mean you couldn't compile it yourself

True, but if you are managing production machines, compiling kernels for every errata and security update is pretty low on the list of things you get enthusiastic about. Also a lot of commercial or scientific linux software is only "certified" and supported on stock RHEL kernels.

On the other hand, if you're making or losing money on latency (e.g., you're a high-frequency trader), compiling kernels for every potential performance improvement is a thing you're enthusiastic about, a lot of security fixes (local privilege escalations, drivers for desktop hardware, etc.) don't apply to you, and you don't care about things being "certified" because you have the expertise to fix problems in-house as necessary.

And from the other direction, if you're making or losing money on security, expecting distro kernels to turn around security fixes in a timely manner seems like a mistake. At least know how to rebuild your distro kernel with a local patch.

The amount of filesystem/networking issues I've seen fixed and all the performance improvements I got from just updating a kernel probably make up for the time that it caused issues.

I guess it takes a critical mass plus a certain infrastructure maturity before running on whatever the latest kernel is makes sense.

As for the 'certified on stock RHEL kernel' software... I try to keep away from those, but I'm picky and pretty lucky when it comes to job selection :)

There's always http://elrepo.org/tiki/kernel-ml (and also kernel-lt) so you don't have to compile things yourself. But things like VMWare and various proprietary device drivers will break.

The post seems to be confusing FreeBSD userspace (which OSX adopted parts of) and the FreeBSD kernel (which has the mentioned lower network latency). OSX didn't adopt the FreeBSD kernel, but instead started with the Mach microkernel and has been going in their own direction since.

Not just userspace - OSX adopted large pieces of FreeBSD kernel, including the syscall layer and the network stack. Also, Mach isn't technically a microkernel - the whole kernel runs in Ring 0.

But yeah, OSX kernel is vastly different from FreeBSD - first, there were huge differences to start with, only some pieces of kernel were adopted, and finally there were thousands of man-hours put into each system afterwards, diverging them even more.

Not that it matters (the post is mostly just unfounded opinion), but:

> OSX didn't adopt the FreeBSD kernel, but instead started with the Mach microkernel and has been going in their own direction since.

MacOS kernel uses a combination of Mach and BSD kernel code. It started with 4.4BSD, and at some point the bulk of BSD code was updated with code from FreeBSD 5.

After that, it was piecewise updated with more modern stuff, the MAC framework (used to implement sandboxing on iOS and macOS) and DTrace from newer FreeBSD kernels.

There is a great talk on OS X internals from CCC here:


It particularly talks about the unique hybrid kernel approach and how there is nothing like it.

BSD itself took the VM subsystem from Mach (so BSD is kind of an interesting hybrid as well). While it has been heavily modified in the last 25+ years, the VM portion of the FreeBSD kernel is still stylistically distinct from BSD portions of the kernel source code.

A better explanation I think is found here:


With such a long post, I would have expected some technical details for why FreeBSD has lower latency, and Linux has faster application speeds.

FreeBSD was used in 1999 to render The Matrix on 32 Pentium II boxes because the software in Linux compatibility mode on FreeBSD was faster than natively on Linux, that is a fact:


FreeBSD can be several times faster than Linux when it comes to the network stack:


But often Linux is faster; you just need to find a benchmark that favors one or the other. Both are fast in general.

Why on earth would you link to a Phoronix screenshot instead of the article that explains it?! Those results look suspicious enough[1] that I wanted to look it up. But of course I can't, and Google isn't helping me much.

Shame! Shame, shame, shame!

[1] Factor of 3+ differences not just between BSD and Linux but between otherwise very comparable linux distros. Something's weird with the setup there, like it's measuring default firewalling overhead or something and not kernel behavior.

Looks like source is https://www.phoronix.com/scan.php?page=article&item=netperf-...

I guess they have everything just set to default distro settings, which likely in some cases include a firewall while others don't. They also don't mention which NIC they are using; the motherboard has two different NICs (i218-LM & I210-AT).

Those benchmarks were run on machines with different processor clocks and different motherboard chipsets. The results are invalid.

> Phoronix

> Something's weird with the setup there, like it's measuring default firewalling overhead or something and not kernel behavior.

This is something you come to expect if you've seen enough of what Phoronix publishes.

A bit of a tangent but: take another look at FreeBSD, folks.

If we re-did our infrastructure from scratch we'd look at either FreeBSD or Alpine Linux. Both would offer an escape hatch from the needless bloat that mainstream Linux has become.

Dependency radii of packages in CentOS are horrible. Your minimal installs are hovering above 1GB. Debian is better at 400MB or so. Alpine is great, in that you can get this down to well under 100MB.

This has been a long standing complaint of mine[1], and an argument I've had with many "software must come packaged only from the provider of the distro" people.

Physical and logical footprint is a real thing. Ridiculous secondary/tertiary dependencies are a waste of that precious resource, and an unneeded/unwanted installation of useless bits.

[1] http://scalability.org/2018/04/distribution-package-dependen...

Although I agree with you that this bloat is undesirable, I'm not convinced it's actually a problem with the distro. (My primary experience is with Ubuntu, and to a lesser extent, mostly by extension, Debian)

Rather, I think it's a problem with the package maintainers themselves. I certainly admit that for very many packages, this is a distinction without a difference.

However, my point is that this dependency bloat isn't based on some kind of policy that the distro is encouraging or even enforcing. Rather, the problem is merely that too many source packages create too few targets, perhaps because the original packagers never foresaw the need (for, e.g., both an X version of some graphics or GPU-related library and a non-X one for pure server work).

I've occasionally gotten around this problem by custom-building a local version of the package that avoids the dependency bloat. That violates the "software must come packaged only from the provider of the distro", unless one considers the local superset a forked distro.

> Physical and logical footprint is a real thing.

Ultimately, I think you and I are in a vanishingly small minority here. Virtualization (even the full, pre-container kind, with a full copy of the kernel on every instance) gained a startling level of popularity, and was even lionized as a money-saving tool (which, of course, it was, for some "IT" shops).

> "There are some 'BSD-inspired' networking improvements to Linux starting with the 4.16."

By "inspired by BSD" does the author mean the code has been copied and pasted from FreeBSD to Linux, something which cannot be done in reverse because of the encumbrance of the GPL?

It can be done - kind of. Take a look at how the drm-next-kmod works in FreeBSD. It's a port of Linux GPU drivers using linuxkpi - a Linux kernel API compatibility layer, basically a huge set of wrappers to implement Linux kernel functions, e.g. printk(), using FreeBSD mechanisms (printf(9) in this case; https://svnweb.freebsd.org/base/head/sys/compat/linuxkpi/com... for the curious). Part of linuxkpi uses code transplanted from Linux, and that part does that by living in the Ports Collection, like any other third party software.

Of course this prevents it from being committed to FreeBSD base. But from the user point of view it doesn't matter - it's just that you have both linuxkpi.ko and linuxkpi_gplv2.ko loaded automatically when you load i915kms.ko, which is the "real" driver, and there's no license compatibility problem.

Copy + pasting code between Linux and FreeBSD kernels isn't going to work (although see trasz's post about how code is reused for graphics drivers -- doing that for networking would probably not be a good choice). There's a lot of differences in data structures, locking, etc. Copying concepts can be fruitful though.

BSD is free to adopt the GPL at their leisure. Who knows, companies might then start contributing to the project instead of building OS's on top of it, looking at you osx.

GPL supporters are all about freedom -- until someone advocates for the freedom to make a proprietary product on top of the open source base.

More specifically it's about freedom for all users, transitively.

Except for developers who wish to use GPL code without it virally requiring them to relicense everything under GPL, that freedom does not exist.

The key word is transitively. Those developers can do everything except depriving their users of freedoms originally granted by the copyright holder.

GPL supporters are all about liberty, and like we all know, liberty does not give everyone freedom to do everything that they wish.

Yes... that is the point.

Oh I forgot -- at HN you never question the GPL. And FreeBSD sux because it's not GPL. Got it. I'll stop wasting my time even mentioning it again.

You didn't question the GPL, you just explained how it works.

It's about freedom for the users and developers.

The freedom to enslave others is not a freedom that one protects. I believe there was a little thing called the civil war where that point was made.

> The freedom to enslave others is not a freedom that one protects. I believe there was a little thing called the civil war where that point was made.

Allowing people to make proprietary code from BSD licensed code is not enslavement. Also if someone improperly used GPL licensed software to make proprietary software it would be breach of contract or copyright infringement not enslavement.

Liberty involves many restrictive laws, with those against slavery being among the biggest, along with those against murder and assault. No one is suggesting that laws may only cover behavior worse than slavery, or else the country does not believe in liberty.

Thing is though that the BSD and GPL license have absolutely nothing to do with slavery. So I have no idea why you and the other poster keep bringing up slavery in the context of open source licensing.

The reason is that some people use the words freedom and liberty as synonyms, while others draw clear distinctions between them and argue that the GPL does not fit their definition. Thus people who disagree have to use clear examples to explain why laws that protect liberty restrict some freedoms in order to create liberty.

If you prefer to use philosophy terms, conditions are negative rights. Those are similar to things like "I have the right to not get assaulted, I have the right to not get enslaved, I have the right to not get robbed, I have the right to not get murdered, I have the right to not get recorded in the shower", and so on. Negative rights are freedoms to not have things happen to you, achieved by restricting what other people may do. Positive rights are things like: I have the right to health care, I have the right to be on public ground, I have the right to social security, or I have the right to make requests of the government under the Freedom of Information Act, and so on. Positive rights usually, but not always, demand something from others. Most constitutional laws tend to be negative rights.

Combining negative and positive rights is how liberty is created. It's a balance between freedom to do things and restrictions to not do things to others, with the goal of maximizing free will and personal agency for society. The GPL gives freedom to do anything, but with conditions that limit what you can do to other people. Just like with liberty, the GPL tries to maximize free will and personal agency and uses negative and positive rights to reach a balance.

One could ask why the free software movement picked freedom when they intended to say liberty, and the reason is historical and localized. RMS did not want to associate the movement with the libertarian political platform. Maybe they thought that most people don't really distinguish between freedom and liberty, but nowadays liberty is generally favored over freedom.

What are you talking about? A previous poster said...

"The freedom to enslave others is not a freedom that one protects. I believe there was a little thing called the civil war where that point was made."

This was said in the context of the BSD and GPL licenses and GPL advocates not liking closed source software being made from BSD software even though it's perfectly within the license to do so. This has nothing, and I repeat absolutely nothing, to do with enslavement.

"GPL supporters are all about liberty -- until someone advocates to make a restrictive product on top of the open source base."

"The liberty to restrict others is not a liberty. Laws [insert common law here] restrict behavior and it is not against liberty to have such laws."

Is that simple enough for you?

It is. You allow your code to be used in the subjugation of others.

Using code that I BSD license is not the subjugation of others.

If you used it to create proprietary code it is.

Using BSD licensed software to create proprietary code is not subjugation, in fact creating proprietary software from BSD code is perfectly within the license. If someone doesn't want their code used in proprietary work they would release it under a more restrictive license. Lastly you should probably look up the definition of subjugation before tossing it around liberally in conversation.

the OSX core/kernel (XNU+darwin) has actually been open source for a long time. The interesting bits of the userspace (cocoa, carbon or whatever) were not BSD derivative.

The benchmark uses an outdated API.

Misleading on distro release dates.

No explanation on the improvements.

I would vote down this post if I could.

You can down vote on quora.

Doesn't Dragonfly (FreeBSD fork as of 2003) have even lower latency?

I reckon if someone made Java a first class citizen on FreeBSD, it might actually get a larger amount of usage.

Wouldn't anyone really into IP latency optimization be using something like DPDK on Linux? If you want high-performance networking, then using the kernel isn't the best idea IMO.

Latency and throughput ("application speeds"), certainly in computing, are competing qualities.

Well, here's just one measurement using iperf 2.0.12. All clocks are synchronized to a GPS-disciplined oven-controlled oscillator (OCXO). Connected via 1Gb/s. FreeBSD latency is significantly better.

Source: Fedora 28 Xeon machine; receivers: FreeBSD 11.1 and Fedora 25 (same Brix platform), connected via a Cisco 300 switch.


FreeBSD:

    root@zeus:/usr/local/src/iperf2-code # iperf -s -u -e --udp-histogram=10u,100000 --realtime
    ------------------------------------------------------------
    Server listening on UDP port 5001 with pid 53703
    Receiving 1470 byte datagrams
    UDP buffer size: 41.1 KByte (default)
    ------------------------------------------------------------
    [ 3] local port 5001 connected with port 52536
    [ ID] Interval Transfer Bandwidth Jitter Lost/Total Latency avg/min/max/stdev PPS
    [ 3] 0.00-10.00 sec 1.25 MBytes 1.05 Mbits/sec 0.004 ms 0/ 892 (0%) 0.086/ 0.060/ 0.150/ 0.014 ms 89 pps
    [ 3] 0.00-10.00 sec T8(f)-PDF: bin(w=10us):cnt(892)=7:160,8:143,9:181,10:159,11:242,12:6,16:1 (5/95%=7/11,Outliers=0,obl/obu=0/0)

Fedora 25:

    [root@hera iperf2-code]# iperf -s -u -e --udp-histogram=10u,10000 --realtime
    ------------------------------------------------------------
    Server listening on UDP port 5001 with pid 16669
    Receiving 1470 byte datagrams
    UDP buffer size: 208 KByte (default)
    ------------------------------------------------------------
    [ 3] local port 5001 connected with port 35894
    [ ID] Interval Transfer Bandwidth Jitter Lost/Total Latency avg/min/max/stdev PPS
    [ 3] 0.00-9.99 sec 1.25 MBytes 1.05 Mbits/sec 0.010 ms 0/ 892 (0%) 0.261/ 0.098/ 0.319/ 0.021 ms 89 pps
    [ 3] 0.00-9.99 sec T8(f)-PDF: bin(w=10us):cnt(892)=10:1,21:1,23:6,24:107,25:141,26:209,27:151,28:124,29:81,30:47,31:19,32:5 (5/95%=24/30,Outliers=0,obl/obu=0/0)


    [root@rjm-clubhouse-28 rjmcmahon]# iperf -s -e -u --udp-histogram=10u,100000 --realtime
    ------------------------------------------------------------
    Server listening on UDP port 5001 with pid 26016
    Receiving 1470 byte datagrams
    UDP buffer size: 208 KByte (default)
    ------------------------------------------------------------
    [ 3] local port 5001 connected with port 20343
    [ ID] Interval Transfer Bandwidth Jitter Lost/Total Latency avg/min/max/stdev PPS
    [ 3] 0.00-10.01 sec 1.25 MBytes 1.05 Mbits/sec 0.010 ms 0/ 893 (0%) 0.094/ 0.064/ 0.476/ 0.023 ms 89 pps
    [ 3] 0.00-10.01 sec T8(f)-PDF: bin(w=10us):cnt(893)=7:12,8:38,9:291,10:311,11:198,12:28,13:10,14:1,15:1,42:1,45:1,48:1 (5/95%=8/11,Outliers=3,obl/obu=0/0)
    [ 4] local port 5001 connected with port 58391
    [ 4] 0.00-10.00 sec 1.25 MBytes 1.05 Mbits/sec 0.013 ms 0/ 892 (0%) 0.079/ 0.048/ 0.127/ 0.009 ms 89 pps
    [ 4] 0.00-10.00 sec T8(f)-PDF: bin(w=10us):cnt(892)=5:1,6:10,7:115,8:265,9:425,10:51,11:18,12:3,13:4 (5/95%=7/10,Outliers=0,obl/obu=0/0)
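For anyone who wants to compare the histogram lines above programmatically, here is a small Python sketch that parses a "PDF:" line and recomputes the 5/95% bins. The function names are hypothetical, and the nearest-rank percentile rule is an assumption on my part rather than something taken from the iperf source, although it does reproduce the 5/95% values printed in the outputs above.

```python
import re


def parse_histogram(line):
    """Return (bin_width_us, {bin_index: count}) from an iperf 2 'PDF:' line."""
    width = int(re.search(r"bin\(w=(\d+)us\)", line).group(1))
    body = re.search(r"cnt\(\d+\)=([\d:,]+)", line).group(1)
    bins = {}
    for pair in body.split(","):
        index, count = pair.split(":")
        bins[int(index)] = int(count)
    return width, bins


def percentile_bin(bins, fraction):
    """Smallest bin index whose cumulative count reaches fraction * total.

    Nearest-rank style rule; assumed, not confirmed against iperf itself.
    """
    total = sum(bins.values())
    running = 0
    for index in sorted(bins):
        running += bins[index]
        if running >= fraction * total:
            return index
    raise ValueError("empty histogram")
```

Feeding it the FreeBSD line gives bins 7 and 11 for the 5th and 95th percentiles, and the Fedora 25 line gives 24 and 30, i.e. the bulk of the Fedora 25 samples sit well over a dozen 10us bins higher.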

He forgot to say that Dave Miller brought in siphash for recent networking hashes on Linux, which should bring the performance of the Linux networking stack down closer to Ruby/Python levels. About 20 slower. There's nothing slower than this, and it doesn't improve security; it only increases the accompanying security theatre. It's a mess over there.
