
Torvalds: Standards need to be questioned - l1k
https://lkml.org/lkml/2018/6/5/769
======
tptacek
He's probably right, but he expresses his ideas so poorly that he turns
straightforward points into drama, I think intentionally. He doesn't just not
give a shit about standards (probably healthy), but also what any of his
audience thinks about his ideas (probably not so much).

At this point, his flailing supposed anger is really just schtick. He deploys
it so casually you just assume it isn't serious. Ironically, he could convey
his contempt for standards, or for people who adhere to them slavishly, far
more effectively if he simply wrote in a civil tone, rather than continuing to
try to affect his Andrew Dice Clay of Programming persona.

~~~
__jal
I don't understand why people feel the need to go Kremlinologist on his tone.

He swears a lot. Who the fuck cares? His audience isn't HN or The Register or
anyone else not doing kernel development.

The apparent desire for bland, well-scrubbed and boring is not universal. I
haven't heard of a sudden increase in demand for talk therapy for kernel devs.
Lots of engineers swear a lot, and that's unlikely to change anytime soon.

Now, if you'd like to discuss actual abusive managerial behavior, we could
have something to talk about.

~~~
zaarn
I think James Mickens expresses it correctly:

"""

When you debug a distributed system or an OS kernel, you do it Texas-style.
You gather some mean, stoic people, people who have seen things die, and you
get some primitive tools, like a compass and a rucksack and a stick that’s
pointed on one end, and you walk into the wilderness and you look for trouble,
possibly while using chewing tobacco.

"""

Kernel developers tend to, in my experience, be a specific type of person.
They are writing software that will possibly save someone's life today.
Billions of dollars change hands using their code every second.

The average programmer on HN does not write code that will see more than a
couple thousand users at once. Even people at Google usually don't write code
as important as the Linux kernel (they write on top of the kernel, usually).

\---

There is also some narrative that Linus is being abusive, and I kinda disagree.
Linus doesn't take a dump on people who don't know better, he takes a dump on
people who should definitely know better and who are long-time kernel
contributors in important positions. Because these people count. (Plus, he's
Finnish.)

~~~
Footkerchief
Link for those interested in reading the rest:
[https://www.usenix.org/system/files/1311_05-08_mickens.pdf](https://www.usenix.org/system/files/1311_05-08_mickens.pdf)

------
pavanky
Before anyone flips out, he actually merged the code:
[https://lkml.org/lkml/2018/6/5/774](https://lkml.org/lkml/2018/6/5/774)

    
    
        Side note: I've merged it, and it's going through my build tests, 
        so it's really not that I hate the code.
    
        But I really find that kind of one-sided rationale that ignores reality unacceptable.
    
        And I find it dangerous, because it *sounds* so "obviously correct" to people who don't know any better. 
        If you don't know that gcc explicitly says that you should use unions to do type punning to avoid aliasing issues, 
        you might believe that union type punning is a bad thing from that commit message.
    
        So it's dangerously misleading, because lots of people have a dangerous reverence for paper over reality.
    
        In programming, "Appeal to Standards" should be considered a potential logical fallacy. 
        Standards have their place, but they definitely have their caveats too.

~~~
peterwwillis
> And I find it dangerous, because it _sounds_ so "obviously correct" to
> people who don't know any better.

> So it's dangerously misleading, because lots of people have a dangerous
> reverence for paper over reality.

It's the same thing everywhere. You see it as "famous person said X" or "the
standard is X" or "the industry does X" or "X is popular". Then you talk shit
about X, because you actually know it's not right, and people dump on you
because you're a heretic, or not famous, or not an authority, or don't have "a
piece of paper" to quote from.

Everybody just goes with what sounds right rather than what is proven right.

~~~
adrianratnapala
Which is why it is nice to have famous but intemperate people like Torvalds to
keep things in balance.

------
dragontamer
I find it ironic that C++ has the correct solution to this problem. If you
need type-punning, use reinterpret_cast<foo>(bar). Done and done. Ironic,
because Linus's weapon of choice (C... or more specifically, GCC's particular
implementation of C) would require far more expertise to use correctly.

Let's break it down.

In C, type-punning is NOT part of the language. It's technically "undefined
behavior". There's an expectation that when you do:

    
    
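        short foo[2] = {1, 2};                         /* assumes 16-bit short, 32-bit int */
        *(int*)foo = 0x12345678;                       /* store through an int lvalue: aliasing violation */
        assert(foo[0] == 0x5678 && foo[1] == 0x1234);  /* expected on little-endian only */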
    

A little-endian machine will pass this assert. But this isn't guaranteed by
the C standard! This is "undefined behavior". The C-standard allows a C
compiler to assume that foo[0] and foo[1] are still == to 1 and 2
respectively. Which would cause the assert to fail. In GCC -O2 or -O3, this
may happen, depending on how registers get mapped to the variables. (The
canonical memory location changes, but should registers be updated when
optimizations are enabled??)

When an optimizer can assume, and when it can't assume, "aliasing" is very
much an undefined behavior within the C language.

\-------------

In effect, Linus knows GCC inside and out. GCC guarantees that unions will
ALWAYS work for this aliasing problem. But this requires knowledge above and
beyond the C Standard.
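
For anyone who hasn't seen the union idiom, here is a minimal sketch of it. It assumes a 32-bit IEEE 754 `float` and relies on GCC's documented behavior for reads through a union member; `float_bits` is a hypothetical name for illustration, not something from the kernel:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: float is a 32-bit IEEE 754 single, as on mainstream targets. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical helper: return the bit pattern of a float. */
static uint32_t float_bits(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;     /* store through one member ...          */
    return pun.u;  /* ... and read back through the other   */
}
```

The key point is that both accesses go through the union object itself, not through a pointer cast to some other type.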

Linus may be "hating" and "criticizing" the standard in this case. And I guess
there's certainly a gap. But the general expectation that everyone knows the
intricacies of GCC to properly understand the kernel code is misplaced IMO,
and goes back to "Angry Linus yells at random dev unnecessarily" territory for
me.

And instead of simply explaining this VERY simple fact (although super-
obscure) that GCC makes unions safe against aliasing issues at the compiler
level... Linus yells at the dev. Not fair IMO.

~~~
bigcheesegs
Don't use reinterpret_cast for type-punning in c++, you'll end up with UB. You
should instead use memcpy.

~~~
kazinator
Nonsense. Aliasing unlike-typed objects via _memcpy_ is equivalent to pointer
aliasing. _memcpy_ has void pointer arguments. You're relying on the addresses
of the source and destination objects being converted to void pointer.

All type punning is in the hands of the implementation. The language
definition provides the syntax for it, which has the virtue that all code
which attempts to do type punning expresses it in the same manner. However,
the language leaves it up to implementations to define whether type punning
works, and with what caveats and restrictions.

~~~
tedunangst
Except using memcpy specifically does not violate aliasing rules, while
aliases do.

~~~
kazinator
It absolutely does. E.g. you can't _memcpy_ a uint64 to a double and expect
well-defined behavior.

There is some hand-waving in the definition of _memcpy_ so that copying
compatible objects is well-defined.

~~~
comex
> E.g. you can't memcpy a uint64 to a double and expect well-defined behavior.

Yes, you can. The behavior is implementation-defined but not undefined. (Well,
it can trigger undefined behavior if the value corresponds to a signaling NaN,
or if the implementation uses a nonstandard, non-IEEE format for doubles that
has other "trap representations". But it's not otherwise undefined.)

The basis for this is that the aliasing rule has an explicit exception for
reading or writing to objects using char pointers, i.e. byte-by-byte,
regardless of the object's type. This exception is in both the C standard:

[https://port70.net/~nsz/c/c11/n1570.html#6.5p7](https://port70.net/~nsz/c/c11/n1570.html#6.5p7)

and the C++ standard:

[http://eel.is/c++draft/expr.prop#basic.lval-11.8](http://eel.is/c++draft/expr.prop#basic.lval-11.8)

The memcpy function is defined as copying characters, so the exception applies
to it too.

Both standards also explicitly define that objects (at least of POD types)
have byte representations and those representations are implementation-defined
(as opposed to triggering undefined behavior if you depend on them). For C:

> Except for bit-fields, objects are composed of contiguous sequences of one
> or more bytes, the number, order, and encoding of which are either
> explicitly specified or implementation-defined.

[https://port70.net/~nsz/c/c11/n1570.html#6.2.6.1](https://port70.net/~nsz/c/c11/n1570.html#6.2.6.1)

C++:

[http://eel.is/c++draft/basic.types](http://eel.is/c++draft/basic.types)
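
A minimal sketch of the uint64-to-double case under discussion, assuming a 64-bit IEEE 754 `double` whose byte order matches the integer byte order (true on mainstream targets); the helper names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double is a 64-bit IEEE 754 value. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits wide");

/* Hypothetical helper: reinterpret a 64-bit pattern as a double. */
static double double_from_bits(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);   /* byte-wise copy, no double lvalue touches a uint64_t */
    return d;
}

/* Hypothetical helper: the reverse direction. */
static uint64_t bits_from_double(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

The resulting value depends on the implementation's representation of `double`, which is the "implementation-defined but not undefined" part.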

~~~
kazinator
"access" refers to reading there. Not reading or writing. Objects may be
treated as arrays of character type to the extent that their value may be
examined that way.

If you memcpy a uint64_t to a double, the implementation is not required to
notice that the double variable's value has changed; a subsequent access to
that variable can continue to refer to a register. It's not a matter of what
bit pattern was stored there.

~~~
comex
No, "access" is defined as reading or writing:

> 3.1
> 1 access
> 〈execution-time action〉 to read or modify the value of an object

Thus, the requirement is symmetrical. The compiler must consider any char
write as potentially aliasing a subsequent read (or write) of any type, unless
it can prove non-aliasing without depending on type. And similarly, it must
consider a write of any type as potentially aliasing a subsequent char
read/write.
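
A small sketch of the write direction, assuming 8-bit bytes; `fill_bytes` and `fill_then_read` are hypothetical names. The stores go through an `unsigned char` pointer, and the later `uint32_t` read has to observe them:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 8-bit bytes, so a uint32_t is exactly four of them. */
static_assert(CHAR_BIT == 8, "example assumes 8-bit bytes");

/* Hypothetical helper: overwrite every byte of *word through a char pointer. */
static void fill_bytes(uint32_t *word, unsigned char value)
{
    unsigned char *p = (unsigned char *)word;   /* character-type access is permitted */
    for (size_t i = 0; i < sizeof *word; i++)
        p[i] = value;
}

/* Hypothetical helper: write bytes, then read the object back as a uint32_t. */
static uint32_t fill_then_read(unsigned char value)
{
    uint32_t word = 0;
    fill_bytes(&word, value);
    return word;   /* the compiler must assume the char stores may have changed word */
}
```

Filling every byte with the same value keeps the result independent of endianness.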

------
simonbyrne
Linus has long railed against aspects of the C standard, and how they've
been interpreted by compiler writers [0].

I'm not much of a C programmer, but the C aliasing rules are incredibly
confusing, and seem to be interpreted differently by GCC and Clang. e.g. GCC
allows type punning via unions[1], but Clang does not [2].

[0]
[https://www.cl.cam.ac.uk/~srk31/research/papers/kell17some-p...](https://www.cl.cam.ac.uk/~srk31/research/papers/kell17some-preprint.pdf)

[1] [https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Typ...](https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Type-punning)

[2]
[https://bugs.llvm.org/show_bug.cgi?id=31928](https://bugs.llvm.org/show_bug.cgi?id=31928)

~~~
ajross
That was just a clang bug, and as I read the comments they fixed it quietly
without exactly admitting as much. That kind of local bitsmithery on floating
point values is a pervasive idiom, clearly explained by the syntax of the
language, and the compiler had no business getting smart with it.

------
allengeorge
So much swearing to make three points:

1. What's in practice conflicts with what the standard prescribes

2. The reason it conflicts is because the standard is misguided and everyone
knows that; for reliable code-generation you have to take the approach in the
kernel

3. He disagrees with the rationale for the change, and wants a better reason
for it

~~~
icelancer
"So much swearing..."

Yes, yes, that's Linus. Most of us have learned to get past that.

~~~
dragontamer
> Most of us have learned to get past that.

Have we really? The post is over 100+ points right now. People seem to LIKE
this swearing thing Linus does.

I don't think it's healthy for the general programming community. The tone that
was written here is certainly not acceptable behavior on YCombinator in
general. But we put up with Linus because... well... Linus is an elevated
programmer and we all respect him. For better or for worse.

~~~
zaarn
I think it gets points because Linus' Rants are usually good breeding ground
for some discussion on various things.

Also don't forget to contrast the occasional (about ~1/month) rant with the
thousands of emails where he's level headed and polite.

------
camgunz
Strict aliasing shouldn't have been set as the default [1]. It was a huge
mistake that instantly broke maybe all C programs everywhere. The standard
also provided no guidance on how to work around the problem. `memcpy` is
insufficient because it's a copy and that's a huge performance issue. Swapping
through unions is UB. Casting about through `void * ` and `char * ` is gross,
dangerous, and often runs afoul of alignment problems. It's a mess, and it has
been for 20 years. Linus is right to be pissed.

[1]:
[https://blog.regehr.org/archives/1307](https://blog.regehr.org/archives/1307)

~~~
FartyMcFarter
> `memcpy` is insufficient because it's a copy and that's a huge performance
> issue.

Compilers (at least clang and gcc) are smart enough to see through those
memcpy's. For example:

[https://godbolt.org/g/5L5PT5](https://godbolt.org/g/5L5PT5)

~~~
camgunz
For stack allocated scalars sure. What about stuff on the heap or arrays and
structs?

~~~
comex
Generally speaking, if a memcpy's size argument is a constant expression equal
to the size of a native integer type, it will be transformed to a native load
from the source pointer followed by a store to the destination pointer. After
that, the load or store operation (or both) may itself be optimized away if
the pointer points to a local variable. Thus, for example, given something
like:

    
    
        #include <string.h>  /* memcpy */
        void write32(int *ptr, int val) {
            memcpy(ptr, &val, 4);  /* constant size 4: assumes a 4-byte int */
        }
    

…the load is removed entirely, and the store is kept but becomes a native
store:

    
    
        write32:
           mov DWORD PTR [rdi], esi
           ret
    

Godbolt link: [https://godbolt.org/g/HYv24F](https://godbolt.org/g/HYv24F)
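
The same pattern applies to data sitting in an array or heap buffer. A minimal sketch, with hypothetical helper names `load_u32`, `store_u32`, and `roundtrip_unaligned`; whether each call really compiles down to a single load or store depends on the compiler, target, and optimization level, as with the Godbolt output above:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a uint32_t from any byte address, aligned or not. */
static uint32_t load_u32(const unsigned char *src)
{
    uint32_t v;
    memcpy(&v, src, sizeof v);   /* constant size, so a candidate for a native load */
    return v;
}

/* Hypothetical helper: write a uint32_t to any byte address. */
static void store_u32(unsigned char *dst, uint32_t v)
{
    memcpy(dst, &v, sizeof v);   /* constant size, so a candidate for a native store */
}

/* Hypothetical helper: round-trip a value through a misaligned slot in a buffer. */
static uint32_t roundtrip_unaligned(uint32_t v)
{
    unsigned char buf[8] = {0};
    store_u32(buf + 1, v);                 /* offset 1 is deliberately misaligned */
    assert(buf[0] == 0 && buf[5] == 0);    /* neighbouring bytes are untouched */
    return load_u32(buf + 1);
}
```

Unlike casting `buf + 1` to `uint32_t *`, this has no alignment or aliasing problem, because the only accesses to the buffer are byte copies.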

------
devnonymous
I read the comments here before I read the post and was honestly expecting a
full out swear fest. Instead, I read Linus being passionately expressive about
something he clearly has strong opinions about... With a few swear words and
over exaggerated metaphors thrown in.

I'm not saying that the language is not not-nice but it's not like he was
being a bully or arsehole to someone in particular nor was he being over the
top abusive. The delivery shouldn't be the take away here.

------
digi_owl
Reading it fully, it seems to line up with Torvalds' attitudes towards
things like userspace-facing API stability.

Standards are standards, reality is reality, and when the two conflict the
only way to maintain long term sanity is for the standard to be amended to fit
reality.

The alternative is seen up and down userspace, where we have piles upon piles
of workarounds, and whatnot, to deal with APIs that change and break between
releases because someone suddenly decided to re-read a standard spec like the
devil reads the bible.

------
twoodfin
And he committed the patch anyway...

[https://lkml.org/lkml/2018/6/5/774](https://lkml.org/lkml/2018/6/5/774)

~~~
s-shellfish
It's a reasonable response, imho. The community wants to go in a given
direction based on directed guidance, a standard. The community that follows
the standard blindly lacks the intellectual tools to question the standard.
The community needs the standard at the present moment but also needs directed
guidance to be able to learn how to question the standard. It doesn't seem
like hypocrisy to me, it just seems like he's annoyed at the general problem,
aware that an individual mind does not have the capacity to solve the problem
alone, aware that a collaborative effort is needed, and otherwise, venting,
because problems like these are honestly, incredibly frustrating, especially
when computers basically have to all be able to 'think' in ways that are more
or less, equivalent, yet, humans don't think perfectly identically.

~~~
gkya
> The community that follows the standard blindly lacks the intellectual tools
> to question the standard.

Yeah Linus is the sole person who has some "intellectual tools" in a community
of thousands of people.

~~~
s-shellfish
That's not what I said.

------
infogulch
Torvalds just doesn't give a shit about appeal to authority arguments.

~~~
jdoliner
Authorities often don't.

------
AceyMan
Every time I've gotten sucked into these <cough> notable LKML messages from
Linus I'm not taken aback in the least.

To my ear, his language comes across as that of a drill sergeant:
intentionally loud, pejorative, and foul for reasons of effectiveness, in
addition to reinforcing the hierarchy he sits atop (as do the tirades of any
prototypical drill sergeant).

My 0,02$.

NB: my parent cussed like a sailor, certainly influencing my PoV.

------
chomp
Commit in question:
[https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux...](https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=dp-4.18-rc1&id=63dcc7090137a893322432e156d66be3ce104615)

------
mark-r
Is there any part of the standards process that is geared to the needs of the
users of the compiler? Or is it all determined by the compiler writers
themselves? I'm wondering why this kind of push back is necessary.

~~~
hshehehjdjdjd
The people on the standards committee, at least for C++, tend to be heavy
users of the language.

~~~
Gibbon1
My theory is there are two kinds of programs.

Type one: where all the side effects are hidden behind an OS call.

Type two: Where side effects are primary and unavoidable.

People on the standards committee universally write type one code (compilers
and the like). Where the Linux kernel is type two.

------
busterarm
I can't imagine what it's like to be in Linus' position. After all these years
he still hasn't found Linux's own Junio Hamano to turn over maintainer-ship
to.

------
alexandercrohde
I wonder if there's a way to put all this engineering thought into
reducing/simplifying standards and optimizing compilers so that individuals
don't need a Linus-level knowledge.

In my opinion, the knowledge-cost of a system is a major downside that is
often ignored (probably the biggest downside of unix, vi, git).

------
davesque
Honestly, I read a few sentences, began stumbling over the expletives, then
decided I didn't care what the issue was or what his opinion is. Though he
acts like everything that isn't done precisely as he would have done it in
hindsight is "utter garbage", there are probably all kinds of historical and
logistical reasons that things were done a certain way and that doesn't make
the people involved "f*cking morons." And, to save everyone the trouble, the
usual refrain of "he's actually right" is meaningless to me.

~~~
gkya
Well the guy is just full of himself, and the community around him continues
on to fill him up with himself every day. In other news *BSDs are mature, nice
OSes with nicer communities around them (even De Raadt is better than Linus).
Unfortunately I don't have time to try FreeBSD again to see if I can get
suspend/resume working. But if I used a desktop I wouldn't think one bit and
go with it.

~~~
2trill2spill
> Unfortunately I don't have time to try FreeBSD again to see if I can get
> suspend/resume working. But if I used a desktop I wouldn't think one bit and
> go with it.

Suspend and Resume mostly work for me with the new drm-next-kmod drivers on an
XPS-13 running FreeBSD current, it's just a little slow sometimes[1].

[1]: [https://www.freshports.org/graphics/drm-next-kmod](https://www.freshports.org/graphics/drm-next-kmod)

~~~
gkya
Thanks, made note of this!

------
yedawg
Roasting other developers for getting it wrong is so much worse than just
saying "the standard doesn't apply here. Try this instead"

------
thelastidiot
He is so full of it. It wouldn't hurt to say it nicely in 3 sentences.

------
Bromskloss
I would rather that he made a standard wherein he specifies how he thinks the
language should behave, and then followed that standard slavishly. One must
follow _some_ standard slavishly!

------
jaequery
there is nothing new here.

