
Two Objects Not Namespaced by the Linux Kernel (2017) - setra
https://blog.jessfraz.com/post/two-objects-not-namespaced-linux-kernel/
======
haberman
> The current set of namespaces in the kernel are: mount, pid, uts, ipc, net,
> user, and cgroup. [...] [Time is] not namespaced. [...] The kernel keyring
> is another item not namespaced.

I've always argued that "everything is a file" is an exaggeration. These
moments make the extent of that exaggeration clear.

If everything truly was a file, the only thing you would need to namespace is
the filesystem. But in reality there are a lot of other kernel objects that
are not files at all.

~~~
zapita
You are 100% correct. “Everything is a file” was more of an early design
insight, which was gradually abandoned as new features were added.

There is a movement of “Unix purists” who lament this deviation from founding
principles, and advocate for a return to them. The most notable example is
Plan 9.

In Plan 9, everything actually is a file. And exactly as you said, all
resources are namespaced via the filesystem. It’s quite elegant and practical.

Sadly Plan 9 has remained a fringe OS, and although it influenced mainstream
operating system design in many ways (including the concept of /proc), I wish
that influence had been stronger.

~~~
AceJohnny2
I also liked QNX, when I worked with it.

You really did access devices through the /dev/ system, and device-drivers
were userspace programs that created files in /dev/.

If your driver crashed, you could kill the userspace driver (which deleted the
file under /dev) and restart it (assuming hardware blah blah blah).

~~~
Someone
_”device-drivers were userspace programs that created files in /dev/. If your
driver crashed, you could kill the userspace driver (which deleted the file
under /dev)”_

I think that shows not everything is a file. If everything were, you would
start the driver by creating the file (say as a hard link from a file in /dev
to the driver executable) and kill the driver by _rm_ -ing the file.

(Chances are that, if you follow this through, this idea won’t support
everything we want to do with drivers, but if so, that’s an indication that
“everything is a file” doesn’t work)

~~~
zapita
To give you a sense of how far Plan9 took the idea... To open a tcp
connection, you create a special “control file” at `/net/tcp/ctl` or some
similar path, then write newline-terminated text commands to the file
descriptor. That descriptor now represents your socket. You can also browse
its contents as a directory (in plan9 each node in the filesystem can be both
a regular file and a parent directory).

------
wmf
Since this was written a time namespace was proposed:
[https://www.phoronix.com/scan.php?page=news_item&px=Linux-Ti...](https://www.phoronix.com/scan.php?page=news_item&px=Linux-Time-Namespace-RFC)

~~~
DonHopkins
This proposal implements clock offsets, but does it support continuous time
scaling? One clock-handy use case would be to run your programs really slow or
fast (or backwards!), for testing purposes.

Kaleida Labs' ScriptX (a multimedia programming language kinda like Dylan with
classes) had built-in support for hierarchical clocks within the container (in
the sense of "window" not "vm") hierarchy. The same way every window or node
has a 2D or 3D transformation matrix, each clock has a time scale and offset
relative to its parent, so anything that consumes time (like a QuickTime
player, or a simulation) runs at the scaled and offset time that it inherits
through its parent containers. And you can move and scale each container
around in time as necessary, to pause movies or simulations. You could even
set the scale to a negative number, and it played QuickTime movies backwards!
(That was pretty cool in 1995 -- try playing a movie backwards in the web
browser today!)
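
A minimal sketch of that idea in Python, not ScriptX's actual API: the `Clock` class, its field names, and the `now` method are all hypothetical, chosen only to show how a scale and offset relative to a parent compose down a hierarchy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clock:
    """Hypothetical clock whose time is a scaled, offset view of its parent's."""
    scale: float = 1.0
    offset: float = 0.0
    parent: Optional["Clock"] = None

    def now(self, root_time: float) -> float:
        # Ask the parent for its time first, then apply our own transform.
        parent_time = self.parent.now(root_time) if self.parent else root_time
        return parent_time * self.scale + self.offset


root = Clock()                                        # follows the root time directly
window = Clock(scale=0.5, parent=root)                # runs at half speed
movie = Clock(scale=-1.0, offset=10.0, parent=window) # runs backwards from 10

# As root time advances from 4 to 8, the movie clock moves from 8 down to 6.
print(movie.now(4.0), movie.now(8.0))
```

Changing `window.scale` rescales everything below it at once, which is the property the comment describes: children stay synchronized with whatever their container does.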

[http://www.art.net/~hopkins/Don/lang/scriptx/tech-qa.html](http://www.art.net/~hopkins/Don/lang/scriptx/tech-qa.html)

Q: How does the ScriptX core class library compare to class libraries
available with other programming languages (e.g. MFC, OWL, MacApp, or TCL)?

A: The ScriptX core class library has many similarities to other object
oriented frameworks in that it provides many basic services common to all
applications built on them. All frameworks provide classes for creating
windows, handling keyboard and mouse events, reading and writing files, etc.

Where ScriptX is unique is in its focus. The ScriptX core classes are oriented
towards interactive, media rich applications. For example, clocks and timing
are fundamental in the ScriptX class library; most other frameworks have no
concept of timing built in.

ScriptX also tightly integrates media data (bitmaps, video, audio) with the
class library, and hides the details of storing, retrieving, and presenting
media to the user.

Q: What are the major sections of the core class library?

Clocks, players, and animation.

Time is a fundamental element of the core classes. Starting with basic clocks,
subclasses extend the capabilities for animation, video, and audio playback.

Clocks can be tied to underlying hardware clocks or slaved in a hierarchical
fashion to other clock objects. Varying the rate of a master clock, all sub-
clocks will stay synchronized to the master clock, permitting the programmer
to precisely control time in a title. Clock hierarchies also free the
programmer from dealing with differences in performance between slower and
faster CPUs.

Player classes build upon clocks. These classes allow you to create and play
sequences of actions that take place over time. These sequences can be used to
create animations as well as control other presentation elements such as video
or sound.

A special type of clock object, the action list player, can be used to play
actions in sequence at specified times. Various action objects are added to an
action list specifying the time at which the action is to occur. Action
objects are used to move graphic elements on the screen, execute ScriptX code,
or modify the action list.

Other player classes provide simple ways to play digital video, audio, and
MIDI. As with all players, the clocks underlying these players can be sped up,
slowed down, or run backwards.

Q: Can video be synchronized with other events?

A: Yes, internal video players are based on ScriptX clocks and can be slaved
together to provide synchronization with animations and other events. For
example, buttons can appear in a window at precise times based on video
playback.

~~~
cyphar
> This proposal implements clock offsets, but does it support continuous time
> scaling?

No. The main reason is that it's very difficult to do with the current
time-keeping machinery within the kernel. Some people also want the ability to
freeze the current time, which is similarly difficult -- and in some
cases harder because then what should CLOCK_MONATONIC give you? There's also
the fact that there's currently no interface to set the "clock speed" or do any
of these things.

Making time go backwards I think would simply be impossible, due to how many
things in the kernel that interact with time probably make the (reasonable)
assumption that time goes forwards. Also CLOCK_MONATONIC would do the exact
opposite in such circumstances.

~~~
cperciva
You mean "CLOCK_MONOTONIC", not "CLOCK_MONATONIC". (I'm guessing this is a
misspelling, not a typo, since it appeared twice.)

And the simple answer is that if time stops then CLOCK_MONOTONIC always
returns the same time. This is perfectly fine given correct software;
CLOCK_MONOTONIC is guaranteed to _not go backwards_ but it is not guaranteed
to _always go forward_. One could imagine, for example, a system with a very
inaccurate clock where CLOCK_MONOTONIC simply counts days.
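
For code that consumes such a clock, the practical consequence is to compare readings with "greater than or equal", never "strictly greater". A small sketch in Python, using `time.monotonic()` as a stand-in for CLOCK_MONOTONIC (the helper name `elapsed_since` is made up for illustration):

```python
import time


def elapsed_since(start: float) -> float:
    """Seconds elapsed on the monotonic clock; may be 0.0 but never negative."""
    return time.monotonic() - start


start = time.monotonic()
readings = [time.monotonic() for _ in range(1000)]

# Consecutive readings are non-decreasing. Two of them may well be equal,
# so code that requires strict progress between readings is relying on
# something the clock does not promise.
assert all(a <= b for a, b in zip(readings, readings[1:]))
print(elapsed_since(start) >= 0.0)
```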

------
derefr
I wonder whether namespacing time would also result in those namespaces being
able to have separate "clocks" (time backends? time schedulers?) that progress
at different rates, or for different reasons.

Being able to put a process into a time namespace with a deterministic "clock"
would obviate a large benefit of
[http://www.zerovm.org/](http://www.zerovm.org/).

Also, having "clock slew" be a matter of perspective—with processes that _can_
handle leap seconds seeing them happen instantaneously; and processes that
_can't_ handle leap-seconds, seeing slewed time—would be nice. Then you could
have different system facilities that care about _monotonic_ time, vs. _synced
to calendar_ time, vs. _one second per second_ time, all having that kind of
time available to them as "the time", rather than through different APIs.

~~~
kbenson
> Also, having "clock slew" be a matter of perspective—with processes that can
> handle leap seconds seeing them happen instantaneously; and processes that
> can't handle leap-seconds, seeing slewed time—would be nice.

I imagine there might be some really interesting (for meanings of interesting
that include _shoot me now_ ) and hard to track down bugs as you deal with
inconsistent clocks not just across systems within a network, but processes
within a single system.

~~~
derefr
> I imagine there might be some really interesting (for meanings of
> interesting that include shoot me now) and hard to track down bugs as you
> deal with inconsistent clocks not just across systems within a network, but
> processes within a single system.

I feel like the "safe assumption" that the other end of a given IPC channel
(or even inter-thread communication channel) _is_ on the same machine, is
responsible for the vast majority of failures we see in e.g. Jepsen testing of
databases.

After all, in sufficiently-large computers (i.e. HPC clusters that pretend to
be one "computer"), you've got NUMA zones that are light-microseconds away
from one another, where even _threads of the same process_ can literally end
up needing vector clocks to linearize events between themselves.

It probably wouldn't be too bad a thing if things like the Linux base-system
used only internal IPC mechanisms that exposed this unreliability (like e.g.
Erlang does with "unreliable async message passing" as its IPC primitive),
forcing each component to deal with the fact that its peers _may or may not_
be netsplit away from it.

Even if that scenario will only come up if you're writing code to get your GPS
position from a Dyson sphere of 10-mile-deep Matryoshka brains.

~~~
kbenson
I bet that assumption is responsible for a large number of problems. I just
_also_ think it's correct enough most of the time and relied on enough that if it
all of a sudden _often_ wasn't true, we'd see our carefully crafted
applications for what they really are, a pile of assumptions that sometimes
have little relation to reality.

------
theamk
I personally miss core pattern namespacing. I would love to give some of my
containers a custom coredump handler, but this is impossible.

And in general, a sysctl settings namespace would be really useful. Sure,
sometimes it makes no sense to namespace a setting, but
net.ipv4.tcp_congestion_control for example? I'd love to be able to change it
without modifying the code.
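
For reference, a small sketch of how to inspect the values in question from inside a process or container. It assumes the usual Linux layout where a dotted sysctl name maps to a path under `/proc/sys`; the helper names `sysctl_path` and `read_sysctl` are made up for illustration, and the sketch only reads whatever value is visible to the caller without saying anything about whether that value is namespaced.

```python
from pathlib import Path
from typing import Optional

PROC_SYS = Path("/proc/sys")


def sysctl_path(name: str) -> Path:
    """Map a dotted sysctl name to its file under /proc/sys.

    Simplification: assumes no path component itself contains a dot.
    """
    return PROC_SYS.joinpath(*name.split("."))


def read_sysctl(name: str) -> Optional[str]:
    """Return the current value, or None if it cannot be read here."""
    try:
        return sysctl_path(name).read_text().strip()
    except OSError:
        return None


for name in ("kernel.core_pattern", "net.ipv4.tcp_congestion_control"):
    print(name, "=", read_sysctl(name))
```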

------
vxNsr
meta: This is from 2017.

Super interesting though, the keyring thing especially seems to have broader
implications...

------
tyingq
Syslog seems to be on the proposal list as well.

------
lalaithion
Why is this the case? No one has bothered to do it? It would break backwards
compatibility? Linus thinks it's a bad idea?

~~~
emmelaich
Probably merely because it's hard to do and no one has sufficient motivation.

I can think of one good use case -- y2k style problems.

Also sometimes apps are tied to external events like legislation. It would be
good to set the time forward for testing.

You can sort of do this with LD_PRELOAD but it can get hairy.

Also see @wmf's comment above.

~~~
briffle
If there is one thing Y2K taught us, it's to ignore any worry about the 2038
problem until 2036, then make a HUGE deal out of it.

[https://en.wikipedia.org/wiki/Year_2038_problem](https://en.wikipedia.org/wiki/Year_2038_problem)

~~~
cyphar
Linux and glibc have been working on 2038 problems for at least the past
decade.

~~~
pjmlp
There are plenty of other POSIX platforms out there.

------
Sharlin
I’m not sure that people who think ”containers are just like VMs” should have
any business working with containers.

------
timeattack
You can't change the time in a container, but it's possible to change the
timezone files.

By generating fake timezones, it is possible to change the time in a container.
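
A rough illustration of the effect in Python. Note the swap: instead of generating a fake zoneinfo file, this sketch uses a made-up POSIX `TZ` string (`FAKE-3`, a fictional zone three hours ahead of UTC), which shifts the displayed local time the same way. The helper name `local_hour_at_epoch` is hypothetical, and `time.tzset()` is only available on Unix.

```python
import os
import time


def local_hour_at_epoch(tz: str, epoch: float = 0.0) -> int:
    """Return the local hour for a fixed epoch timestamp under the given TZ."""
    old = os.environ.get("TZ")
    os.environ["TZ"] = tz
    time.tzset()
    try:
        return time.localtime(epoch).tm_hour
    finally:
        # Restore the previous timezone setting for the rest of the process.
        if old is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = old
        time.tzset()


# Same underlying timestamp (epoch 0), different wall-clock reading.
print(local_hour_at_epoch("UTC0"), local_hour_at_epoch("FAKE-3"))
```

Only the conversion to local time changes here; the epoch value passed in is the same in both calls.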

~~~
cyphar
This doesn't change what gettimeofday(2) gives you (and actually you can't
even use ptrace easily to fake the time of day because gettimeofday(2) isn't a
real syscall -- it's actually implemented as a read from the vDSO page the
kernel maps into every process).
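
One way to see the mapping being described is to look for the `[vdso]` entry in `/proc/self/maps`. The sketch below is Linux-specific and only locates the mapping; it does not show which calls are served from it, which varies by architecture and kernel. The function name `find_vdso` and the `SAMPLE` line are made up for illustration.

```python
from pathlib import Path
from typing import Optional, Tuple


def find_vdso(maps_text: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the [vdso] mapping in /proc/<pid>/maps text."""
    for line in maps_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "[vdso]":
            start, end = fields[0].split("-")
            return int(start, 16), int(end, 16)
    return None


# Illustrative line in the /proc/<pid>/maps format; the addresses are invented.
SAMPLE = "7ffd4a5f2000-7ffd4a5f4000 r-xp 00000000 00:00 0    [vdso]\n"
print(find_vdso(SAMPLE))

# On Linux, inspect the current process; elsewhere the file does not exist.
maps = Path("/proc/self/maps")
if maps.exists():
    print(find_vdso(maps.read_text()))
```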

