
Linux 3.8 introduced unprivileged user namespaces [pdf] - majke
http://man7.org/conf/meetup/understanding-user-namespaces--Google-Munich-Kerrisk-2019-10-25.pdf
======
geofft
The given title (currently "Linux 4.6 introduced unpriviledged user
namespaces") isn't accurate - unprivileged user namespaces have been around
since this commit:
[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5eaf563e53294d6696e651466697eb9d491f3946)

That commit shipped in kernel 3.8, released in 2013, and the linked PDF says
that is when CLONE_NEWUSER itself was released. In other words, user namespaces
have been available to unprivileged users for as long as they have existed.

Some downstream distributors, including Debian, started carrying a patch that
added a restriction back in via the kernel.unprivileged_userns_clone sysctl.
This is because allowing unprivileged user namespaces exposes a lot of attack
surface to users. There's nothing fundamentally unsafe about it, but there are
many kernel components for which bugs are only exploitable by unprivileged
users if the feature is on. See
[https://lwn.net/Articles/673597/](https://lwn.net/Articles/673597/) for more
discussion. (I thought a version of that patch eventually got accepted
upstream, but I can't find it.)

The only reference to kernel 4.6 in the document is CLONE_NEWCGROUP, which is
entirely unrelated to unprivileged user namespaces.
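
To see which of these knobs a given machine has, something like the following could work. It is a minimal sketch, not an authoritative check: `kernel/unprivileged_userns_clone` is the distro-patch sysctl mentioned above, while `user/max_user_namespaces` is a separate upstream limit that I am bringing in as an assumption about what is worth checking. The function name and the three return strings are made up for illustration, and other mechanisms (seccomp, LSM policy) can still block namespace creation even when this reports "not blocked".

```python
from pathlib import Path


def userns_sysctl_status(proc_sys="/proc/sys"):
    """Return 'blocked', 'not blocked', or 'unknown' based on two sysctls."""
    root = Path(proc_sys)

    def read_int(rel):
        try:
            return int((root / rel).read_text().strip())
        except (OSError, ValueError):
            return None  # file missing or unreadable on this kernel

    # Distro patch (e.g. Debian): 0 means unprivileged creation is refused.
    patched = read_int("kernel/unprivileged_userns_clone")
    if patched == 0:
        return "blocked"

    # Upstream limit: 0 means no user namespaces can be created at all.
    limit = read_int("user/max_user_namespaces")
    if limit == 0:
        return "blocked"

    if patched is None and limit is None:
        return "unknown"  # neither sysctl is present
    return "not blocked"


if __name__ == "__main__":
    print(userns_sysctl_status())
```

The `proc_sys` parameter exists only so the function can be pointed at a fake directory tree when testing.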

~~~
dang
Ok, we've subtracted the pair (1, -0.2) from Linux in the title above.

Edit: corrected from 0.8

Edit edit: corrected from (1, 0.2)

~~~
toraobo
Linux versions are not decimal, you subtracted 1.-2

~~~
dang
Oh dear. Fixed. Thanks!

~~~
naniwaduni
Subtracting (1, 0.2) would've left 3.4!
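
For anyone following the arithmetic, here is a small sketch of why version numbers are better treated as integer tuples than as decimals. The helper names are hypothetical; nothing here is how the title was actually edited.

```python
def parse_version(text):
    """Split '4.6' into (4, 6) so each component is compared as an integer."""
    return tuple(int(part) for part in text.split("."))


def subtract(version, delta):
    """Component-wise subtraction; a delta component may be negative."""
    return tuple(v - d for v, d in zip(version, delta))


old_title = parse_version("4.6")
fixed = subtract(old_title, (1, -2))
print(".".join(map(str, fixed)))  # 3.8

# Why decimals mislead: 3.10 is newer than 3.8, but not when read as a float.
print(parse_version("3.10") > parse_version("3.8"))  # True
print(float("3.10") > float("3.8"))                  # False
```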

~~~
dang
I need to stop doing this so hastily. Argh!

------
tony
Will this make it easier to run docker without root / sudo?

That could lower the barrier to entry and make the docker tutorials that get
passed around more canonical. I would love it if docker "just worked" across
machines when testing locally, commands and all.

Even sharing docker in open source projects, there's a learning curve for me
where commands in the README won't work. Is it my docker installation? Version
differences with docker? Docker compose? Did the container images I'm pulling
in change some way?

Do I need sudo or not? I guess with proper group permissions I'm okay - but
will the developer(s) I share instructions with have these permissions ready
to go?

On StackOverflow, copy/pasting docker CLI commands and configs (even given
proper context for how they fit into a larger whole) probably has a 50% success
rate for commands, and maybe 10% for a config of some sort (e.g. compose files).

~~~
feanaro
> That could fix the barrier of entry and make docker tutorials passed around
> more canonical. I would love it if docker "just worked" across machines when
> testing locally, commands and all.

For an almost drop-in daemonless, rootless docker replacement for this use
case, see podman ([https://podman.io/](https://podman.io/)). You can `alias
docker=podman` and it will just work.

You do need a bit of configuration: specifically, you need to create
`/etc/subuid` and `/etc/subgid` if they don't exist and add subordinate
UIDs/GIDs, which will be used to map the container's users and groups. E.g.,
as root:

    usermod --add-subuids 100000-165536 $USER
    usermod --add-subgids 100000-165536 $USER
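
To make the file format concrete: each line of `/etc/subuid` and `/etc/subgid` is `name:start:count`. The sketch below converts a `FIRST-LAST` range into such a line. It assumes the range given to `usermod` is inclusive at both ends, which is my reading of shadow-utils rather than something stated above; the function names and the user `alice` are hypothetical.

```python
def range_to_subid_entry(user, id_range):
    """Convert an inclusive 'FIRST-LAST' range to a 'user:start:count' line."""
    first, last = (int(x) for x in id_range.split("-"))
    if last < first:
        raise ValueError("range end precedes range start")
    return f"{user}:{first}:{last - first + 1}"


def parse_subid_line(line):
    """Parse one 'user:start:count' line into (user, start, count)."""
    user, start, count = line.strip().split(":")
    return user, int(start), int(count)


entry = range_to_subid_entry("alice", "100000-165536")
print(entry)                    # alice:100000:65537
print(parse_subid_line(entry))  # ('alice', 100000, 65537)
```

Under that inclusive assumption, the range above allots 65537 IDs; a block of exactly 65536 would be `100000-165535`.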

~~~
aorth
> You can `alias docker=podman` and it will just work.

Yes, and it's super awesome. Unless you need docker-compose!

~~~
thinkmassive
Podman supports pods, hence the name. It follows the k8s approach, and it’s
nowhere close to a drop-in replacement for docker-compose, but managing
multiple related containers is supported as a core feature.

[https://developers.redhat.com/blog/2019/01/15/podman-managin...](https://developers.redhat.com/blog/2019/01/15/podman-managing-containers-pods/)

------
hiasen
I would recommend watching the same (or similar) talks about namespaces in
Linux given by Michael Kerrisk at NDC TechTown (Kongsberg) in September 2019.

I learnt a lot from these talks.

[https://www.youtube.com/watch?v=0kJPa-1FuoI](https://www.youtube.com/watch?v=0kJPa-1FuoI)

[https://www.youtube.com/watch?v=73nB9-HYbAI](https://www.youtube.com/watch?v=73nB9-HYbAI)

------
rwmj
Has anyone tried fuzzing random sequences of unshare(2), clone(2), setuid(2),
capabilities, uid/gid map calls (etc) to see if there is a sequence that
eventually gains real root or some other privilege escalation? I'm dubious
that Linux is theoretically sound, what with the multiple layers of historical
baggage.

~~~
pcwalton
Syzkaller fuzzes that stuff.
[https://github.com/google/syzkaller](https://github.com/google/syzkaller)

~~~
rwmj
It fuzzes individual syscalls. Does it string them together into sequences?
(Edit: yes it does, but it doesn't look for priv escalations, only crashes,
kernel panics and the like)

~~~
simcop2387
It can be adapted to privilege escalations, but it doesn't do it on its own.
You have to give it a stub that runs after the syscalls and checks whether the
permissions are intact.
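
Roughly, such a stub could snapshot credentials before and after the fuzzed sequence and compare them. The sketch below is not syzkaller's API. It is a hypothetical standalone check that parses the `Uid` and `CapEff` lines of `/proc/self/status`, and all function names are invented for illustration.

```python
def parse_creds(status_text):
    """Extract the effective UID and effective capability mask."""
    creds = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key == "Uid":
            # Fields are: real, effective, saved, filesystem.
            creds["euid"] = int(value.split()[1])
        elif key == "CapEff":
            creds["cap_eff"] = int(value.strip(), 16)
    return creds


def looks_escalated(before, after):
    """True if the process gained euid 0 or any new effective capability."""
    gained_root = before["euid"] != 0 and after["euid"] == 0
    gained_caps = (after["cap_eff"] & ~before["cap_eff"]) != 0
    return gained_root or gained_caps


def read_self_creds():
    """Snapshot the current process's credentials (Linux only)."""
    with open("/proc/self/status") as f:
        return parse_creds(f.read())
```

One caveat: a process that has legitimately entered a new user namespace sees euid 0 (once mapped) and a full `CapEff` relative to that namespace, so this naive comparison would report false positives. A real check would need to observe from the initial namespace or attempt an operation that requires genuine privilege.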

~~~
sitkack
Does it trace existing programs and use those as seeds in how to compose
syscalls?

~~~
simcop2387
I don't think it's got the ability on its own to do that, but I can imagine
you could do it with strace and some other scripting.
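
As a sketch of the "some other scripting" part, here is one way to reduce strace text output to an ordered list of syscall names. It assumes the usual `name(args) = ret` line shape, optionally prefixed by a PID, and makes no attempt to produce input in any fuzzer's own program format; the regex and function name are my own.

```python
import re

# Matches an optional "[pid 123] " or "123 " prefix, then "name(".
SYSCALL_LINE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def syscall_sequence(strace_output):
    """Return syscall names in the order they appear in strace output."""
    names = []
    for line in strace_output.splitlines():
        match = SYSCALL_LINE.match(line)
        if match:  # skips signal, exit, and "resumed" lines
            names.append(match.group(1))
    return names


sample = """\
unshare(CLONE_NEWUSER) = 0
openat(AT_FDCWD, "/proc/self/uid_map", O_WRONLY) = 3
write(3, "0 1000 1\\n", 9) = 9
[pid  4242] close(3) = 0
+++ exited with 0 +++
"""
print(syscall_sequence(sample))  # ['unshare', 'openat', 'write', 'close']
```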

------
_pmf_
If Michael Kerrisk is reading this: would be great to have a small
supplementary book to the great "The Linux Programming Interface" that handles
namespaces, capabilities et al.

~~~
Huggernaut
I had the absolute pleasure of attending a week of all-day sessions with
Michael Kerrisk, and on 2 or 3 of those days he covered only container-related
primitives. His materials were great and he was an excellent teacher.

------
tyingq
Typo in title, should be "unprivileged". Not nitpicking, just should be
searchable later.

~~~
loudmax
Also the bane of many a DBA trying to update permissions.

------
AdrienLemaire
I have never used OpenBSD, but isn't Linux getting closer to their equivalent
pledge and unveil?
[https://news.ycombinator.com/item?id=17277067](https://news.ycombinator.com/item?id=17277067)

This is great news, because I can't switch over to OpenBSD (docker, bluetooth,
etc) or more folkloric distributions like VoidOS and Qubes. Going to make a
bunch of Anki cards today to remember these namespaces and how to use them!

------
zorked
The author's book, The Linux Programming Interface, is an amazing book and a
new classic.

------
sargun
user namespaces are a super rad feature. They've protected us at $WORK from
multiple vulnerabilities that have come out.

~~~
londons_explore
How do you use them? Are we talking about developer desktop PCs here? Or user
namespaces as part of a containerization setup?

The main use of user namespaces seems to be running stuff that wants to be
root as non-root. It would seem better to simply fix all those tools to not
check if they are root, and instead just try to do the thing they were trying
to do.
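
For context, the "be root without being root" mechanism boils down to entering a new user namespace and writing a one-line ID map. The sketch below is a minimal illustration under some assumptions: Linux, a single-threaded process, unprivileged user namespaces permitted, and a kernel new enough to have `/proc/self/setgroups`. The function names are hypothetical, and the call is made through `ctypes` rather than any higher-level wrapper.

```python
import ctypes
import os

CLONE_NEWUSER = 0x10000000  # value from <sched.h>


def id_map_line(inside_id, outside_id, count=1):
    """Format one uid_map/gid_map line: 'inside outside count'."""
    return f"{inside_id} {outside_id} {count}\n"


def become_root_in_new_userns():
    """Enter a new user namespace and map the caller to UID/GID 0 inside it."""
    uid, gid = os.getuid(), os.getgid()  # capture IDs before unsharing
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(CLONE_NEWUSER) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    # An unprivileged writer must disable setgroups before writing gid_map.
    with open("/proc/self/setgroups", "w") as f:
        f.write("deny")
    with open("/proc/self/uid_map", "w") as f:
        f.write(id_map_line(0, uid))
    with open("/proc/self/gid_map", "w") as f:
        f.write(id_map_line(0, gid))
    return os.getuid()  # 0 inside the namespace if the mapping took effect
```

Calling `become_root_in_new_userns()` changes the calling process for good, so it is best tried in a throwaway interpreter. The resulting "root" only has privileges over resources owned by the namespace, not over the host.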

~~~
woadwarrior01
For developer desktops, there's firejail[1].

[1]:
[https://github.com/netblue30/firejail](https://github.com/netblue30/firejail)

~~~
AdrienLemaire
Lovely! Thanks for sharing this awesome-looking tool :)

edit: oh, it's also mentioned on slide 53 of the document, along with Flatpak

