
Linux 3.8 Released - moonboots
http://kernelnewbies.org/LinuxChanges
======
contingencies
If I understand correctly, user namespace support was the major thing missing
for the secure use of Linux Containers (LXC). This should bring widespread,
extremely rapid container-based virtualization under Linux closer to a
reality. Red Hat is nominally backing this via libvirt, but mainly offers
paravirt-based solutions. IBM, who authored a lot of the kernel code, seems to
be interested in the same stuff for large-scale servers. We live in interesting
times.

~~~
DASD
Does this solve the problem of constraining root to the container?

~~~
dylanvee
Yes.

~~~
DASD
Thank you.

------
andyl
I know open-source is old-hat by now - but damn - look at the man-years of
contributions. Linux is a miracle.

~~~
chubot
I've had the same sentiment for many years... Linux is indeed a miracle.

But after spending a lot of time with it recently (e.g. getting LXC running
last night), I've concluded that it would be very hard to design a system with
a security API that is worse than Linux.

The issue is that Linus doesn't design ANYTHING. He doesn't believe in design;
he only believes in evolution.

Unix was designed, whereas Linux is mostly a bunch of code bolted on top of
Unix. It's not sustainable in the long term. Someone needs to actually design
something eventually, so there is a stable base for more evolution.

Spend some time looking through these:

    
    
        - traditional Unix ACL-based security
        - traditional resource limits
        - chroot (not secure, but used as a "part" of many security solutions)
        - capabilities
        - seccomp
        - LSM-based
          - SELinux
          - AppArmor
          - ...
        - LXC
          - cgroups
          - namespaces (apparently completed with this kernel release)
          - LXC user space tools
        - ptrace sandboxing 
          - (at least a dozen projects use this)
        - user mode linux
        

And you'll realize it's just a huge mess. I'm sure the complexity makes Linux
measurably more insecure in practice. Or it just provides employment for a lot
of people -- who knows.

There's never going to be a way to clean this all up, since people are relying
on all of it.

I don't have that much experience with the alternatives; I'm sure they're
messy in their own right. (I've used many OSes, but not security-wise.) But
this definitely has me looking towards FreeBSD and such. Too bad it is more
expensive on EC2.

I mentioned Minix 3 here before -- it's probably a pipe dream, but being a
microkernel, it seems like a good basis for a future secure Unix. It actually
was designed in some sense.

From what I gather people take the existence of root escalation exploits on
Linux for granted. If that weren't so (and it shouldn't be with a
microkernel), then traditional Unix security might actually cover a lot of
cases that all these hacks on top are patching up.

EDIT: Also, Linux should look to DJB for guidance. Out of all the hairiness,
how do you even do this on Linux (or any Unix)?
<http://cr.yp.to/unix/disablenetwork.html> It just seems crazy.

~~~
bcantrill
With the disclaimer that I very much have a dog in the fight, you might want
to look at illumos[1] and its distributions like SmartOS[2] and OmniOS[3]. It
has a secure, robust container model (with a hat tip to FreeBSD jails for
providing inspiration over a decade ago) and a mature least-privilege model
that minimizes attack surface -- not to mention ZFS, DTrace, KVM and other
goodies. At the very least, you can take solace in knowing that others share
your desire for cleaner alternatives...

[1] <http://smartos.org/2011/12/15/fork-yeah-the-rise-and-development-of-illumos-2/>

[2] <http://smartos.org/>

[3] <http://omnios.omniti.com/>

~~~
chubot
Thanks for the links; I had heard of SmartOS but not known much about the
technology.

What sort of disappointed me about LXC is that you end up with an init process
and 7 or 8 children of it in each container. I am more interested in
sandboxing at the level of a single process. In a lot of cases you just want
to run somebody else's Python code and look at its stdout; you don't need to
spin up init and family to do that.
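
The "run somebody else's Python code and look at its stdout" case can be
sketched with nothing but the traditional resource limits mentioned earlier in
the thread. This is a minimal sketch, not a sandbox: it caps CPU time and
wall-clock time for a child interpreter and captures its output, but does
nothing about filesystem or network access. The names `run_limited` and
`_limit_cpu` and the default limits are made up for illustration.

```python
import resource
import subprocess
import sys


def _limit_cpu(seconds):
    """Return a function that caps CPU time; it runs in the child before exec."""
    def apply():
        _, hard = resource.getrlimit(resource.RLIMIT_CPU)
        # An unprivileged process cannot raise its hard limit, so clamp to it.
        limit = seconds if hard == resource.RLIM_INFINITY else min(seconds, hard)
        resource.setrlimit(resource.RLIMIT_CPU, (limit, limit))
    return apply


def run_limited(source, cpu_seconds=2, wall_seconds=10):
    """Run `source` in a separate Python process and capture its stdout/stderr.

    Assumption: a CPU rlimit plus a wall-clock timeout is enough for the demo.
    Real isolation would still need namespaces, seccomp, or similar on top.
    """
    return subprocess.run(
        [sys.executable, "-I", "-c", source],  # -I: isolated mode, ignores user site and env
        preexec_fn=_limit_cpu(cpu_seconds),    # applied between fork and exec
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        timeout=wall_seconds,                  # raises TimeoutExpired if exceeded
    )
```

Calling `run_limited("print(6 * 7)")` returns a `CompletedProcess` whose
`stdout` holds whatever the child printed; a child that spins forever is
killed by the kernel once it exceeds the CPU limit.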

There are a hundred and one projects like this but most of them seem half-
baked.

Capsicum [1] looks like what I'm interested in; there seemed to be effort
around a Linux port a couple years ago but I don't think it happened. Does
Illumos/SmartOS provide anything like this?

[1] <http://www.cl.cam.ac.uk/research/security/capsicum/>

~~~
justincormack
You don't need init in each container, and the encouraged model of having a
whole distro in a container is bonkers. Play around with clone(2)/unshare(2)
directly, and it is fairly simple. All you need to know about pid 1 is that if it
terminates, your namespace goes away, and orphan processes will reparent to it (and
some signals are blocked). If you have a single process then this doesn't
matter really. You can do all this from Python I expect, I have done it all
from Lua with no issues.
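
For the Python route, calling unshare(2) directly might look roughly like the
sketch below. It goes through ctypes to libc; the `CLONE_NEW*` values are the
ones in `<linux/sched.h>`, while `namespace_flags`, `unshare` and
`interfaces_in_fresh_netns` are names made up here. It assumes a kernel with
user namespaces (3.8 or later) that lets unprivileged users create them, which
some distros and container runtimes disable.

```python
import ctypes
import os
import socket

# Namespace flags from <linux/sched.h>.
NAMESPACE_FLAGS = {
    "mnt": 0x00020000,   # CLONE_NEWNS
    "uts": 0x04000000,   # CLONE_NEWUTS
    "ipc": 0x08000000,   # CLONE_NEWIPC
    "user": 0x10000000,  # CLONE_NEWUSER
    "pid": 0x20000000,   # CLONE_NEWPID
    "net": 0x40000000,   # CLONE_NEWNET
}


def namespace_flags(*names):
    """OR together the flags for the named namespaces."""
    flags = 0
    for name in names:
        flags |= NAMESPACE_FLAGS[name]
    return flags


def unshare(flags):
    """Thin wrapper over unshare(2); raises OSError with errno on failure."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(flags) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def interfaces_in_fresh_netns():
    """Fork a child, move it into new user+net namespaces, list its interfaces."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child: no init, no distro, just one process
        try:
            os.close(read_fd)
            unshare(namespace_flags("user", "net"))
            names = ",".join(name for _, name in socket.if_nameindex())
            os.write(write_fd, names.encode())
            os._exit(0)
        except BaseException:
            os._exit(1)
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        data = pipe.read()
    _, status = os.waitpid(pid, 0)
    if status != 0:
        raise RuntimeError("unshare failed; user namespaces may be unavailable")
    return data.split(",") if data else []
```

Where it works, `interfaces_in_fresh_netns()` should report only the loopback
device, since a new network namespace starts with nothing else. The parent
process is untouched because the unshare happens after the fork.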

~~~
chubot
OK from what I understand "LXC" is basically the user space tools that give
you the distro in the container... it's more of a VM model.

But yeah I think I just need the underlying cgroups, and possibly some of the
namespaces. Although I don't care all that much if untrusted code can see what
processes are running; just as long as it can't affect them.

Just curious what you were using containers for from Lua? Sounds interesting.

~~~
justincormack
I started using them largely for testing netlink code, as it is much easier to
create some isolated network devices than risk messing about with the real
ones. This is part of a fairly comprehensive Linux binding for Lua
<https://github.com/justincormack/ljsyscall>

------
arielweisberg
I have high hopes for the automatic NUMA balancing work. We are getting a lot
of cores per socket capable of generating a lot of traffic per core and the
disparity which was already pretty large continues to grow.

That said, the scheduler does pretty well; it beats manually binding without a
lot of experimentation.

------
ch33zer
Does anyone else love reading these, even though they are mostly way over your
head? There is something about kernel release notes that makes them
fascinating...

------
loser777
3.7.x has been borderline breaking on my laptop for a month now and I'm fairly
sure that the issues I've experienced affect a large number of users
(excessive heat, low battery life on Sandy/Ivy-Bridge based notebooks).
Compiling 3.8 as I'm typing this.

------
mtgx
Pretty excited about seeing F2FS in mobile devices eventually, as mobile
devices still need all the I/O performance they can get, considering most
manufacturers are using cheap flash storage, and even the high-end ones aren't
that fast.

~~~
knackers
Agreed. Really looking forward to trying this on my Raspberry Pi.

------
stefantalpalaru
Will the ext4 inline data be enabled by default? Any idea how older kernels
will deal with files inlined by 3.8?

~~~
wmf
Older kernels refuse to mount a filesystem with feature flags they don't
understand; see INCOMPAT_INLINE_DATA in
<https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout>
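
The refusal rule can be sketched as a mask check on the superblock's
incompatible-feature word. The offsets and the flag values below follow the
Ext4 Disk Layout page linked above; `OLD_KERNEL_SUPPORTED` is a hypothetical
stand-in for whatever set a given older kernel actually knows, and the function
names are made up. This illustrates the idea and is not the kernel's code.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # the superblock starts 1024 bytes into the device
MAGIC_OFFSET = 0x38        # s_magic, little-endian 16-bit
INCOMPAT_OFFSET = 0x60     # s_feature_incompat, little-endian 32-bit
EXT4_MAGIC = 0xEF53

INCOMPAT_FILETYPE = 0x0002
INCOMPAT_EXTENTS = 0x0040
INCOMPAT_FLEX_BG = 0x0200
INCOMPAT_INLINE_DATA = 0x8000

# Hypothetical: the incompat features some pre-3.8 kernel understands.
OLD_KERNEL_SUPPORTED = INCOMPAT_FILETYPE | INCOMPAT_EXTENTS | INCOMPAT_FLEX_BG


def unknown_incompat_bits(image, supported=OLD_KERNEL_SUPPORTED):
    """Return the incompat feature bits in `image` that `supported` lacks."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    (magic,) = struct.unpack_from("<H", sb, MAGIC_OFFSET)
    if magic != EXT4_MAGIC:
        raise ValueError("no ext2/3/4 superblock found")
    (incompat,) = struct.unpack_from("<I", sb, INCOMPAT_OFFSET)
    return incompat & ~supported


def can_mount(image, supported=OLD_KERNEL_SUPPORTED):
    """A filesystem is mountable only if no unknown incompat bit is set."""
    return unknown_incompat_bits(image, supported) == 0
```

Here `image` is the first couple of kilobytes of the block device or image
file, read as bytes. With inline data enabled, the 0x8000 bit falls outside the
older mask, so `can_mount` returns False.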

------
msoad
Does F2FS require special hardware? Can I format my SSD with F2FS in the near
future?

~~~
drivebyacct2
Yeah, I'd like to know more about this. Should I be using it instead of EXT4
on my old, boring desktop SSD?

~~~
simcop2387
At the moment it's had some reports of syncing issues (e.g. not syncing when
it should), and a few performance oddities. All that being said, I'm not sure
I've seen reports of any corruption outright but it is fairly new compared to
the ext family. I'd say stick with EXT4/3 if you're using the desktop ssd for
anything serious, if not, go for it; it doesn't look like it'll kill your cat
and/or wife.

~~~
drivebyacct2
Oh, I don't care if it eats my root install. I've got a "reinstall everything
and reconfigure everything" script that gets me back to 99% after a new
install. Plus, I keep my home partition on a "totally stable TM" RAID10 BTRFS
array, so I don't mind a bit of risk.

~~~
wazoox
Isn't "totally stable" and "btrfs" somewhat contradictory?

~~~
specto
_So far_ my experience has been btrfs is stable, meaning I haven't had any
data corruption and TRIM works great. That being said I have multiple backups
and images of the system :)

------
SoapSeller
Storing file data inside inodes is a really great feature - it should make
usages of file-system-as-DB approaches usable in a lot of cases where it
didn't make sense before.

------
codex
It's human nature to tool worship. I am no exception. But please, let's
worship tools that enable radical innovation. The Linux kernel was that tool--
in 1999. It's still improving, but it's not worthy of this much attention. It
merely allows existing innovation to work better. It's like celebrating the
latest Xeon processor--cool, yes, but not worth this much collective
distraction.

------
a_a_a
Support for 386 being removed. My children compile kernels on their 386 to
keep warm in the winter. Linux 3.8 MURDERS CHILDREN (<http://xkcd.com/1172/>)

~~~
beatgammit
I guess it's time to upgrade to a 486. I hear they've come down in price.

------
drivebyacct2
Sad. This had Thunderbolt hotplug working on my Mac at one RC and then it got
removed and didn't get added back in. Thunderbolt Ethernet adapter does work,
but it was kernel panicking yesterday on rc7. I'll have to see tonight if it's
fixed.

To others, where would I report such an issue if it's still broken?

~~~
caf
The linux-pci mailing list: <http://vger.kernel.org/vger-lists.html#linux-pci>

