A regression is the kernel not giving the same result with the same user space (iu.edu)
408 points by signa11 on Nov 2, 2017 | 385 comments

Mercurial, made by another (in my opinion) much more well-spoken kernel hacker, is what really introduced me to the concept that you do not break interface for downstream users, no matter how wrongly you may think they are using the interface.

It’s an attitude that is difficult to convey because software developers always want to have the freedom to “improve” their own software, even at the possible cost of breaking something for some users (and worse, even telling users that they shouldn’t have been doing that in the first place).

I keep going back to this blog post which I wish more people agreed with (Steve Losh is another person influenced by Mercurial):


Rich Hickey has a wonderful talk about it.

"Spec-ulation" - https://www.youtube.com/watch?v=oyLBGkS5ICk

Which I fully agree with: basically there are no major, minor, or patch-level changes. In reality every change is a possible breaking change, because we don't control what downstream is doing with our software.

> there are no major, minor, patch level changes, in reality every change is a possible breaking change

I noticed this when recently writing a spec that defines backwards-compatibility. There are surprisingly few things you can do when you want to stay backwards-compatible.

It's much like an immutable data store. You get one operation: insert.

With full backward compatibility, you agglomerate features, and changes are implemented as optional, with the default being the old behavior.

You can add layers that depend on existing code, but you cannot make existing code depend on those new layers.
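A minimal sketch of "changes are optional, with the default being the old behavior", in Python. The function `format_size` and its `binary` flag are made up for illustration; the only point is that code written before the flag existed keeps getting exactly what it got before.

```python
def format_size(num_bytes, binary=False):
    """Format a byte count for display.

    `binary` is the hypothetical newly added option. Its default selects the
    original decimal behaviour, so existing callers see no change.
    """
    base = 1024 if binary else 1000
    units = ["KiB", "MiB", "GiB"] if binary else ["kB", "MB", "GB"]
    if num_bytes < base:
        return f"{num_bytes} B"
    value = float(num_bytes)
    unit = "B"
    for unit in units:
        value /= base
        if value < base:
            break
    return f"{value:.1f} {unit}"


# Old call sites keep the old output; only new call sites opt in.
old_style = format_size(1500)
new_style = format_size(2048, binary=True)
```

The new behaviour is an insert, not an update: nothing that existed before depends on the new flag.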

Oh well, he should try a language with types ...

/me runs ...

What are you trying to say? There are already a lot of explanations for why Clojure is dynamically typed and how it fills its role nicely that way.

I know, I know. We've been having some interesting discussions about Clojure, Haskell and dynamic and static typing on these stories



My comment was intended to be a wry one referencing those very interesting discussions and hinting that if you have types then you can more easily make breaking changes internally that aren't visible externally.

Dubious. The maintenance argument is weak. https://www.youtube.com/watch?v=2V1FtfBDsLU&feature=youtu.be...

I'm not sure what you mean. I thought it was self-evident that if your interface is an abstract type then you can change the implementation without breaking consumers. If you don't think it's true then please can you explain how? I thought the Clojure argument is not that types don't provide beneficial encapsulation, but that it's not worth paying the cost for them. Again if you know any different I'd be pleased to hear.

Apples and Oranges. We're talking about changing the interface. Did you watch the video segment?

> We're talking about changing the interface.

Who's "we"? I'm talking about changing the implementation.

> Did you watch the video segment?

I watched the bit where he said if you add a constructor to a sum type you have to change everywhere it's pattern matched. True. I'd love to see Clojure's solution where you don't have to change any existing code!

EDIT: Oh wait, I think it's even worse than that. I think he's talking about product types. In which case you should use projection functions and indeed you need to make no changes to old code!

All my favorite languages are typed and I do agree with him.

Really? How so? If you have abstract types then you can easily make internal changes without making those changes visible externally.

As a simple example, suppose I swap out one container type with another one in the internal code. Both implement the same interface/traits, so the change isn't visible to users. But the time complexity of the container operations is different on the new type, and the user was depending upon the time characteristics of the old container. E.g., deletions are now O(n) instead of O(1). I've violated a user expectation without breaking type safety.
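A rough Python sketch of that scenario. The class names `SetBacked` and `ListBacked` are invented for illustration. Both expose the same methods and return the same results, so no interface check or test of outputs can tell them apart, yet deletion goes from O(1) on average to O(n).

```python
class SetBacked:
    """Old internals: a hash set, so discard() is O(1) on average."""

    def __init__(self):
        self._items = set()

    def add(self, item):
        self._items.add(item)

    def discard(self, item):
        self._items.discard(item)

    def __contains__(self, item):
        return item in self._items

    def __len__(self):
        return len(self._items)


class ListBacked:
    """New internals: a list. Same methods, but discard() now scans: O(n)."""

    def __init__(self):
        self._items = []

    def add(self, item):
        if item not in self._items:
            self._items.append(item)

    def discard(self, item):
        if item in self._items:
            self._items.remove(item)

    def __contains__(self, item):
        return item in self._items

    def __len__(self):
        return len(self._items)


def exercise(container):
    """Run the same operations against either implementation."""
    for i in range(5):
        container.add(i)
    container.discard(3)
    return len(container), 3 in container, 4 in container
```

`exercise` gives identical answers for both classes, which is exactly why the regression only shows up for a user with a large container and a latency budget.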

Nice counterexample!

Some examples,

1 - Not all changes can be kept internal, for example adding a new case to an ADT

2 - Internal changes can still alter the semantics of the API, even if from the outside everything still looks the same from the type system point of view

Yes, both are true. 2 is a particularly good example.

> there are no major, minor, patch level changes, in reality every change is a possible breaking change

It is "possible" that my blowing my nose will disturb butterflies in China.

Internal to the organization that produces the software, they are highly meaningful. Extended to the user base that reads the documentation, they remain equally meaningful.

Downstream that resists information and can not be controlled is on its own. Good luck.

I agree that platforms should stay backward compatible as much as they can. That said, if upstream changes, sometimes you've got to change too.

There should be a mechanism for releasing breaking new major versions. And it is this:

1) Release a new version for all your DIRECT dependents. Ideally they should be subscribed to your updates.

2) After they have had ample time to test it, they will release an update for THEIR direct dependents.

At no time should C update A if C depends on B and B depends on A. In fact, ideally, if C depends on B1, ..., Bn then C will NOT upgrade A until all B's have upgraded. C should push the B's to do it and supply patches.

This means version pinning for all dependencies, my friends!!

The alternative is "dependency hell" or making multiple copies of A to live inside each B.

If some B's are taking their sweet time then it may be better for some C to submit a patch to B's repo, which even if not accepted can be pulled by other C's. The question is who "owns" or "maintains" a B project if they don't react to changes of some of their dependencies. The answer, I think, should be - fork the B if it's not being actively maintained.

Edit: I think HN needs a "reason for downvote" field, perhaps seen only by the person being downvoted. As it is, many downvotes convey very little useful information.

If upstream changes, pick a better upstream.

So if a large platform like facebook changes, simply stop supporting facebook? Wouldn't that kind of make your library very limited?

>Mercurial, made by another (in my opinion) much more well-spoken kernel hacker, is what really introduced me to the concept that you do not break interface for downstream users, no matter how wrongly you may think they are using the interface.

It matters less, IMHO, the further up the stack you get.

Being right at the bottom of every stack it matters most of all for the kernel, which is why Linus needs to be such a stickler. The amount of damage you can do by breaking userspace is vastly greater than the damage you do removing an hg command.

For various reasons I would not go back to Java, but that was wonderful, your very old code still worked with newer versions, nothing broke during upgrades (even when behind deprecation one version).

We do enjoy breaking changes every now and then. :)


Of course, they are never at the level of a Python 3 release.

Yes, but compared to e.g. Scala breaking changes in Java are heaven, thanks :-)

There's a great talk about backwards compatibility in Java by Brian Goetz: https://www.youtube.com/watch?v=2y5Pv4yN0b0

Watched, great talk, thanks!

Go is really good about this too. It's great.

It's not as strong as the kernel. Go does not break APIs but may break undocumented behavior.

> you do not break interface for downstream users, no matter how wrongly you may think they are using the interface

How does Linux manage to never break its API? For example in the event of a refactoring.

I mean, is the answer really that hard? If the "refactor" would break the interface, you can't do it. Maybe you deprecate the old method and make a new one.

How do they handle bug fixing though? What if one application wants a bug fixed, but another application is depending on the bug inadvertently?

That is a situation they work towards never happening, by never leaving a behavior change in the wild long enough for such a situation to develop.

If you look closely at the kernel code it is easy to find two or more syscalls that have names and behaviors that are very similar, but where one is recommended above all the others. This is because the older syscall(s) were found to be flawed.

But rather than break existing software, they introduce a new syscall that fixes said flaws and leave the old one intact and recommend that in the future software use the new syscall.

The bug they fix is the bug that keeps the API contract intact. The first application gets its bug fixed, and the second application breaks because it was relying on a bug that didn't match the API contract. You program to an API, not to an implementation. The second application did the latter, and will therefore pay the price.

That's not Linus' position. The "API contract" for the kernel is "we don't break userspace code," not "we don't break userspace code that follows the specs." He would bounce (and flame) the fix you're suggesting.

That sounds insane. Is it written somewhere besides a sporadic Linus rant? (e.g. somewhere here https://www.kernel.org/doc/html/latest/ etc.)

In other words, the ABI you have is the ABI that got used, not the ABI you thought you were building.

Well, sometimes, but the Windows philosophy is probably to introduce a shim so that an app relying on the broken functionality still works.

If you can afford to do that, it's nice to do it indeed. It can get messy and hairy quite quickly though.

It seems to me a lot of it boils down to "who gets impacted". If I have a bug in my kernel and a lot of apps are somehow relying on it, I need to be mindful of that and perhaps swallow the purity of the contract in order to provide a pragmatic fix.

Sure, it's not a strategy without costs and Microsoft goes to an extreme. But there's probably a middle way in there; if you're aware that tons of programs now depend on the buggy behavior you probably want to deal with it. But if you're working on Windows or the Linux kernel the tolerance for breaks is probably much lower.

Introduce a new API.

You trivialize the broken usage.

Certain parts of the API that were once able to respond with a value now just spit out '0'. That means whatever application used them just continues to work even though that part is deprecated.

> software developers always want to have the freedom to “improve” their own software

Developers can still have that freedom in most[1] situations. When you have a new idea for a better/faster implementation that is, unfortunately, incompatible with the existing interface, you can usually provide compatibility with a shim that implements the old interface using the new interface.

Sometimes this is little more than some #define renames and a handful of very-thin wrapper functions. Occasionally it will be a bit heavier, which might make the legacy interface slower, but this should be balanced (approximately) by the faster underlying implementation. If the compatibility shim is significantly slower (or has other unavoidable runtime issues), an upgrade that is slower for legacy uses but still works is always better than breaking existing code.
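As a sketch of the thin-wrapper case, here is what such a shim might look like in Python. All names (`Point`, `distance`, `dist_xy`) are hypothetical; the assumption is that the library moved from a four-argument call to one taking point objects and wants old callers to keep working.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


def distance(a: Point, b: Point) -> float:
    """New interface: operates on Point objects."""
    return ((a.x - b.x) ** 2 + (a.y - b.y) ** 2) ** 0.5


def dist_xy(x1, y1, x2, y2):
    """Legacy interface, kept as a shim that forwards to the new one.

    It pays for two extra Point allocations per call, but old code keeps working.
    """
    return distance(Point(x1, y1), Point(x2, y2))
```

The legacy entry point carries no logic of its own, so the two interfaces cannot drift apart.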

For an example and opinionated discussion, see JWZ's post[2] about his OpenGL 1.3 -> OpenGL ES 1.1 header.

[1] Exceptions are cases where there is a fundamentally unavoidable cost to providing the interface (e.g. where N is the total number of interface entry points, something in the critical path unavoidably MUST take >O(N) time or space, or N is limited externally to a small value ("we only have silicon for at most N instructions"))

edit: [1b] Another exception would be when the legacy interface itself is a serious security problem. If your legacy interface unfortunately forgot to include a way to supply credentials and simply accepts writes from anyone, a prompt fix that breaks that API is necessary and appropriate. Next time you're designing an interface, be sure to remember that security has to be baked in from the very beginning!

[2] https://www.jwz.org/blog/2012/06/i-have-ported-xscreensaver-... (you might want to copy/paste that URL if your browser betrays where you came from with a Referer header)

> Sometimes this is little more than some #define renames and a handful of very-thin wrapper functions.

That's more or less the thesis of Steve's blog post too.

There are cases where a major change to a software project caused a handful of minor regressions but added new functionality, vastly improved performance in the majority of cases, etc.

In these cases, a Zero Regressions policy is a bit silly. You can flip things around and look at the new code as baseline and from that perspective the old code has many more regressions relative to it.

There seems to be a pretty stark divide between these mindsets.

Why not simply bake these contracts and tests into the actual software APIs so they are impossible to break?

That way the app would be "backward compatible certified" if it used this layer. The layer would only run during deployment.

I guess an approximation is the Passing indicator on Travis or GitHub.

Some contracts are so complicated that we need 100-page RFCs to express them. What makes you think we can express these in a regular type system? (Unless you use Idris or Coq or something like that.)

I was about to post that it was funny this URL popped up when I read the Lobste.rs discussion until I saw your username :)

If only compiler developers had this attitude.

I think Rust has done this quite well, managing to have the best of both worlds. Unstable features are only available on the nightly build and need specifically enabling (and their interface may change). Once a feature is considered stable it is carefully maintained. They do have some breaking changes, but they're not common and for unusual edge cases:


This explains the Rust approach:


Rust also uses a sizable fraction of all publicly-available Rust code as a regression suite for compiler updates. This allows them to check edge cases that are "breaking in theory" to find out if there's actually any code that would be broken, or if the edge case had literally not yet been encountered.

By sizable fraction, he is referring to crates.io, Rust's public package repository.

They literally run all tests against every crate published to the package ecosystem. That is amazing.

Makes you wonder why no-one else came up with that idea before. I mean, it is pretty revolutionary, but also really obvious in hindsight.

I'm curious why this is seen as a new idea. The capability to pull it off is somewhat new. In particular, at such a large scale.

But the idea is far from it. Commercial compilers probably had this more than open source ones in the past. Though, oddly, it was probably a weakness in some respects, since it almost certainly slowed feature development/releases.


Rust also has yet to develop a stable ABI. I know they’re hard at work on so many things, so I don’t say this as some snide remark, but until this is the case and Rust supports proper dynamic linking there are whole domains of tasks I’m not comfortable using it for.

Well, Rust has a stable ABI in the same sense that C++ has a stable ABI: you get a stable ABI if you use an extern "C" and repr(C) layer at the linkage level, or you use the same version of the compiler to compile all your code. It's not ideal for some use cases (though I think not having a stable ABI is the right decision at this stage), but it's not worse than C++.

I can assure you we’re not actively working on a stable ABI.

This is a thing I've been wondering about. Is dynamic linking (in the way that C/C++ programs use it all the time) not a usecase that Rust wants to cover?

You can do dynamic linking, you just can't do it across compiler versions. This is how all of the Linux distros are doing it, for example.

It's a spectrum:

Rust: recompile the world when you update the compiler

C++: recompile the world when you change ABIs or when the ABI changes (My understanding is that MSVC++ changes the ABI every ~2 years? I could be wrong.)

C: never recompile the world

It's not that we don't want the benefits of a stable ABI, but it's a monumental task, and there are more important things for now.

Don't forget that you can get a stable ABI today by using the C ABI. You just lose out on the fancier features Rust has.

> C++: recompile the world when you change ABIs or when the ABI changes (My understanding is that MSVC++ changes the ABI every ~2 years? I could be wrong.)

This is only half true: every new release, the ABI of the runtime library changes. But the Visual C++ compiler ABI hasn't changed in decades.

If you design your own library ABIs so they don't pass std::foobars around, you can link C++ libraries built with VC++6 with ones built with the latest version, and the whole thing is going to work fine - the trick is that multiple versions of the runtime DLLs can coexist in the same process.

GCC/libstdc++ hasn't changed its ABI in more than a decade, since release 3.3 or 3.4; they even did incompatible changes to std::string in GCC 5 without breaking ABI:


Great, thank you for the clarification.

Starting with VS2015, Microsoft promises a more stable ABI.

Unfortunate, but expected. On that note, would there be any breakage loading multiple Rust .so's with differing compiler/stdlib versions into the same process? This is one of my main concerns about the lack of ABI stability and the static-linking-first design of Rust. With C++ at the very least I know Red Hat isn't going to break the ABI within a major release, but the Rust 1.21 packages are already in epel-testing.

If you want to ensure that there's no breakage, the only way is to use the same version of the compiler on all the Rust code.

Do they not? It certainly seems to be a guiding principle of C# compiler developers, who have almost always opted to preserve unintended quirks of the compiler/spec rather than "fix" them.

gcc and clang are _fairly_ good for this, and even MSVC has been getting better.

... just recompile

Minor point, but he puts Python in with software that "almost never breaks" on update. Really? Am I missing something? Maybe he just hasn't tried running Python 3 yet...

You're missing that Python 3 is clearly labeled as a breaking change? He is talking about changes that are not labeled as breaking changes but do in fact break things.

One major breaking release in how many years, clearly announced as such and with parallel support for the old version for a long time (so you didn't just update python and it broke). Fits quite well under "almost never breaks".

Or have there been issues with minor releases? I haven't come across any I can remember, but I might be wrong. Maybe some things were deprecated and then removed?

Python 3 is a similar language with a similar name to Python 2. It shouldn’t be considered an update, nor applied as one.

I get the idea, but this is user hostile. If it shouldn't be seen as an upgrade, it should have had a new name. :( Especially if it is going to be a forced migration someday? (I'm assuming python2 support will eventually stop?)

Yet a huge number of Python modules are both valid Python 2 code and Python 3 code.

Sure, they're similar enough that a single file can be valid in both with less effort than most language pairs. That doesn't mean it's an update, any more than C is an upgrade to Lisp because I can make a file that's valid in both.

Yet the Python 3 executable has the same name as the Python 2 executable...

Only on Arch and Windows... All other Linux distributions have python3, and the macOS binary is also python3.

No they don't. At least one of them has a number after it.

I got both Python 2 and Python 3 installed on my machine, from the official distribution, and both have "python.exe", no numbers.

If you're talking about Python 2 -> 3, I think you're misunderstanding his point.

Can't edit orig comment...yes I misunderstood the point about explicit breaking changes. I recently watched Rich Hickey's "spec-ulation" talk referenced elsewhere in this thread, which takes an extreme view (tl;dr SemVer sucks and we shouldn't have breaking changes ever), so that's where my mind was. Thanks for the clarity, all.

He says Python packages, many of which bend over backwards to maintain backwards compatibility. (I wish Django were one of them)

Linus's followup post later on in the thread explains his position in more detail and is a pretty good read:



> Behavioral changes happen, and maybe we don't even support some feature any more. There's a number of fields in /proc/<pid>/stat that are printed out as zeroes, simply because they don't even exist in the kernel any more, or because showing them was a mistake (typically an information leak). But the numbers got replaced by zeroes, so that the code that used to parse the fields still works. The user might not see everything they used to see, and so behavior is clearly different, but things still _work_, even if they might no longer show sensitive (or no longer relevant) information.
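To illustrate why printing zeroes keeps old parsers alive, here is a toy Python sketch. The five-field line layout below is made up for the example and is not the real /proc/<pid>/stat format; the assumption is only that consumers read fields by position.

```python
def parse_stat(line):
    """Toy positional parser for a made-up line: 'pid name state retired utime'."""
    fields = line.split()
    return {
        "pid": int(fields[0]),
        "name": fields[1],
        "state": fields[2],
        "utime": int(fields[4]),  # position matters: this is the fifth field
    }


old_line = "42 (demo) S 1234 77"   # every field still populated
zeroed_line = "42 (demo) S 0 77"   # retired field printed as 0, layout unchanged
dropped_line = "42 (demo) S 77"    # retired field removed, later fields shift left

still_works = parse_stat(zeroed_line)["utime"] == parse_stat(old_line)["utime"]

try:
    parse_stat(dropped_line)
    dropped_parses = True
except IndexError:
    dropped_parses = False
```

With the zeroed line the parser still finds `utime` where it expects it; with the field dropped, the same unmodified parser fails outright.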

I'm not sure I prefer the described result. If my application depends on some of those zero'd fields, it seems like that has potential to cause serious debugging problems, as opposed to just causing an error on the parse.

> If my application depends on some of those zero'd fields, it seems like that has potential to cause serious debugging problems, as opposed to just causing an error on the parse.

If your app didn't handle the 'zero' case (assuming zero was a "valid" value for said field), then the app was poorly written and would've been broken in the first place. However, if the app _did_ handle the 'zero' case, then the app continued to work! The end user would've been none the wiser about the change (mostly), and some time later, the app could be upgraded to ignore those known fields.

Causing a parse error would mean the app stopped working altogether. Much worse a result for the end user.

There are many cases where continuing with invalid data is much worse than a parse error. Perhaps that's less likely in this case (hopefully your credit card payment doesn't depend on the values in /proc/<pid>/stat), but as a rule, silently continuing with meaningless data is scary.

Yea that's what I was thinking. If the behavior changes, what "works" entirely depends on the context of that specific application. I can understand mitigating complete failures as much as possible, but trying to trick programs into thinking that they're running the same as they were before seems.. well, scary as hell. I'm honestly not sure what's worse - breaking the application, or making it think everything is fine.

Not when the application was supposed to handle the invalid data anyway before.

Because you had to handle the zero case before right?

The idea is that in the worst case you get an error message, not an altogether crash.

If your application relies on these sort of interfaces with the kernel, then presumably you're paying attention when you upgrade the kernel.

Zero could be a perfectly "valid" value and still break your application if changing the function to always return zero breaks an implicit contract. Suppose the previous behavior of the function returned something you could use as an ID for some internal representation in a database that you could look up and write to. Before, the ID you got back depended on the input, but now it doesn't. And you can't tell you're not getting real IDs back anymore, so you just happily write to the database. Now you're corrupting your database because the implicit behavioral contract you were relying on was broken - I'd MUCH rather the program just crash than for it to continue "working" in a way that irrecoverably corrupts data.

> then the app was poorly written and would've been broken in the first place.

If you present this argument to Linus, you will certainly get yelled at.

Not breaking the user space means not breaking the user space even when it's broken. A couple years back glibc broke Adobe Flash because Flash used memcpy wrong. Linus was quite vocal about it.

I guess that depends on whether or not zeros were valid in the first instance. My post assumed they were. I agree with your point if zero would be outside of the valid range in the first instance.

It sounds like a chance for logical errors to go undetected before they cause issues somewhere totally unrelated.

Having had to fix a number of production issues where a change happened XX days ago but the code just silently kept going rather than breaking, I’m sorta of the mind to agree with you. I’d almost rather it break hard, so then I know it’s broke and can fix it. Very nuanced situation though and as such definitely not one size fits all.

That's fine when it's code that you wrote, but it would be really annoying to me as a user if htop (for example) started crashing on a kernel upgrade (instead of one the columns suddenly being zero).

Completely and totally agree! Why I said it was a nuanced situation and not one size fits all.

Isn't that just 'fail fast' behavior vs. silent data corruption?

Perhaps, but what if the app developer is long gone, and I am a user hoping to keep using my favorite program? Apparently the original Raymond Chen article is gone but I think this quote captures it: https://news.ycombinator.com/item?id=14202707

Raymond Chen's URLs have changed but the original articles [0][1] are still available.



This is also quoted in Joel Spolsky's article:


I love the 'Raymond Chen Camp's way of looking at things. xx

I really wish all those people chasing the "desktop Linux" mirage while breaking APIs at whim would internalize this.

If they want desktop Linux to happen, they need to stop chasing eyecandy and stabilize the plumbing behavior, no matter how much that thought offends their developer pride.

That's really not a problem of the kernel, though, is it?

Also, I think bigger challenges to desktop Linux are the degree of configurability, drivers (maybe that's better), and the fact that the GUI is never a first-class method of using the app but instead a hastily-written wrapper on top of a command line tool.

If your code had actually broken with the change to a zero (for example it didn't just report 0 of some resource in use in the log, but spawned an unlimited amount of threads until it crashed or something) then you could have reported it as a kernel regression, and it would have been fixed in another way.

True but there is more to it. Linus’s ambition is to also not break the unknown app in the basement, behind the leopard.

Well, it only returns numbers, so padding retired calls with zeros shouldn't really break anything other than giving the wrong data. The only other practical alternative is to return 0 bytes and most likely crash the application expecting a result. How many old school apps making invalid stat calls are engineered enough to handle odd exceptions like this? I'd say probably few to none.

Someone needs to codify this shit for when he leaves. It's surprising that he still needs to scream at people after two decades of kernel work. I don't have a lot of faith that whoever comes after him will be as proud of the kernel as he is given they just inherited most of it.

Succession is a weird concept with Linux, but the fact that it has always been more stable than the other OSes is, I think, the main reason it has been so successful in the internet age.

An example of a failed succession is Tim Cook; the guy is the complete opposite of Jobs. Just take a look at the apple website nowadays. A product guru would not sell three generations of iphones that are all cannibalizing each other and lack any real differences.

We will hopefully have a new kernel written in Rust by someone new.

God, I hope not. Ground-up rewrites (when they even succeed) typically lose features, performance, and security, since they don't have the benefit of 20+ years of correcting mistakes. Plus you lose all your contributors.

However, Rust is linker compatible with C, and can be runtimeless, so rewriting the kernel piecemeal over time instead could be great. ;-)

This is what I meant. Any ground up rewrite will be reinventing the wheel and hence too expensive.

I don't think the Cook/Jobs comparison is fair.

It's always been well understood that the same people who are typically good at starting companies aren't typically the best people to run them once they achieve a certain scale.

How are "three generations of iphones" "cannibalizing each other and lack any real differences" ? Care to elaborate?

John Johansen will survive. Linus has made it incredibly damn clear what "no user space regressions ever" means. See e.g. the similar rant from 2012: https://lkml.org/lkml/2012/12/23/75 And it is his kernel, so it is his rules. John Johansen admitted he messed up (http://lkml.iu.edu/hypermail/linux/kernel/1710.3/02539.html) and promised to handle things better in the future, so all is well.

> http://lkml.iu.edu/hypermail/linux/kernel/1710.3/02539.html

Wow. Now that's how you do an apology. If anything my confidence in this guy is increased.

Simple. Contrite. To the point. No excuses, just solutions.

Yeah, this is a _very_ professional response to a (typical for Linus tbh) very emotional (but correct) post that could be easy to react emotionally to.

Fair point, but I'm coming round to the idea that emotional 'anger' might be a very effective tool to use when in charge of things. It seems well applied in this case.

The problem is that everyone can be an asshole, but most of us are not so consistently correct on a subject as Linus is.

It can be, absolutely. Emotions are an intensifying tool.

There are ways to use that language without being a prick about it though. Linus does not have that ability.

Great links. To make Linus's life easier I've made a website at http://www.firstruleofkerneldevelopment.com/ with those links so he can just reply with: "See www.firstruleofkerneldevelopment.com" instead of posting a rant.

He is never going to do that unless he owns the domain. He is not going to point to websites that could potentially change and show ads, malware or whatever.

If that is the case and if he's interested I would be happy to transfer the domain to him.

Was there a "first rule" before the 2012 rant, or was it invented in that email thread?

Linus has said earlier that this rule has been in place pretty much since day 1 (in a google+ thread here: https://plus.google.com/115250422803614415116/posts/hMT5kW8L...), although he's made it much more explicit than it was at first.

> Linus has made it incredibly damn clear what "no user space regressions ever" means

out of curiosity, how does a policy like this translate to microkernel OSes (e.g. Redox) where most everything runs in userspace?

Microkernels still need syscalls for usermode to interface with, so it is just as important. Kernels exist to serve userspace, not the other way around. Though, given the relative immaturity of Redox I assume this is less important compared to Linux.

i think Redox explicitly tries to limit the syscall exposure as well relative to linux. (dozens vs hundreds)


I've never programmed a microkernel OS (and I've only done minimal Linux kernel programming), but I would assume it's even _more_ important, as a single change would impact almost every component of the system.

Does it need to? This is the rule for Linux development. Other kernels need not follow it if it doesn’t make sense.

Yes, exactly. Linus does things his way. Others (OpenBSD and maybe other BSDs for example) view the kernel + userland as a complete release and you upgrade everything together.


Let's not join in with rantings of our own?


> John Johansen will survive.

We really need to stop excusing Linus's behaviour. Just because we may happen to agree with what he is saying doesn't mean we also have to agree with how he is saying it.

What behavior? Linus's comments were carefully directed at the bad behavior that needed to be fixed. I don't see any personal attacks. Linus is a manager, and part of being a manager includes putting your foot down when someone is causing problems. Ideally, the situation should be handled directly and unambiguously ("The thing you're doing is incorrect and causing problems. Stop doing that.") before the problems become worse.

> trying to shift the regressions somewhere else is bogus SHIT.

That's accurate and specifically points to the unacceptable behavior that needs to be fixed. Linus's "WE DON'T BREAK USERSPACE. EVER." policy is well known, documented, and mentioned frequently. It's reasonable to expect anyone submitting patches to understand this, which makes trying to shift the blame elsewhere... well... "bogus SHIT", regardless of the specific words you use to describe it.

> I will stop pulling apparmor requests until the people involved understand how kernel development is done.

Note that Linus is specifically indicating that this is fixable. Given John Johansen's very commendable apology (see link in grandparent comment), it looks like the message was understood. I suspect this will work out much better in the future.


edit: Diluting language to make it "nicer" can remove important information. See: "How a plan becomes policy" https://funnyshit.com.au/the_plan.html

"Seriously, it's the kind of garbage that makes me think your opinion and your code cannot be relied on" is, quite bluntly, a personal attack, directly commenting on an entire developer's code history and knowledge based on one single issue. This could be worded a lot better. The profanity is also unnecessary.

The rule of thumb I've seen in typical management-oriented business books (e.g., books and training like Crucial Conversations: https://www.vitalsmarts.com/crucial-conversations-training/ ) is to be "persuasive, not abrasive," as that site says. The issue with being abrasive is that one way people might react is by instinctively attacking back: no matter how technically correct you are, f-bombs and personal attacks might tempt others to throw counterpoint f-bombs and personal attacks back. Another is that repeated behavior along these lines might drive away some people who would rather not see profanity and anger in their mailbox.

Everyone loses their temper occasionally, and John Johansen reacted in the way you should to someone who loses their temper.

Unfortunately, the Linux community as a whole has a reputation in some circles for being overly "hostile" and "toxic". Linus himself has this reputation. How much this reputation dents Linux contribution and usage isn't known, but I doubt anyone can say that this is a positive.

Linus isn't a manager, he's in charge. Persuading people to move in the same direction is a critical management skill but fundamentally not the job of someone actually steering things, and it's entirely common for the person in that role to just bluntly do the decision loop thing and let someone else put a happy face on it.

If you're driven away from having the job of transforming specific concrete direction into action for a large important organization then you should probably find a different job. Not everyone needs to put up with that, although if you have ambitions as a manager of people it will probably be career limiting.

The only strange thing about the Linux kernel is how it's all done in public instead of face to face.

> Unfortunately, the Linux community as a whole has a reputation in some circles for being overly "hostile" and "toxic". Linus himself has this reputation. How much this reputation dents Linux contribution and usage isn't known, but I doubt anyone can say that this is a positive.

Except that it is still being used, unlike other "nice" projects. BeOS, for example.

To be fair, BeOS was killed by Microsoft threatening OEMs into refusing to carry their product. Linux only escaped because it wasn't owned by a corporation that needed revenue to survive.

It is possible to be technically brilliant and wildly succeed in your project, and usually do overall good when it comes to project management... and yet have certain personality flaws that may hinder your management skills.

What I know is that I've never read advice geared for managers which states that the route to success comes from being an abrasive boss.

That's because people love being an uninspiring middle. It is safe. It is sufficiently pleasant. It is secure. And it definitely sells lots of books.

> quite bluntly, a personal attack

I don't see how this is a personal attack. I call the quoted comment a plainly-stated critique of a developer's portfolio. If you have repeatedly demonstrated a problem following basic requirements (ref: "denying the regression now for three weeks"), that's a good reason to presume problems might exist (i.e. "your code cannot be relied upon"). Linus tends to write this type of "blunt" letter only after other ("nicer") attempts have repeatedly failed, and a pattern of problematic behavior exists.

Accurately describing someone's reputation based on their actual work and behavior is not a personal attack. Without false claims - aka actual slander or libel - you earn your own reputation which other people can and will include in their opinions about you.

> The profanity is also unnecessary.

I don't see any profanity in the quoted comment.

> the Linux community

...is another matter entirely, and a discussion for another thread.

The profanity is in the original post ("Stop gthis f*cking idiocy already!", "bogus SHIT").

From my viewpoint, the problem with the personal attack is the overreaching scope and the hostile wording, not necessarily the critique. It is easy to be firm about laying down the rules of development without getting into the realm of "your entire code and opinion".

There is absolutely nothing wrong with his later post on this in my opinion ( http://lkml.iu.edu/hypermail/linux/kernel/1710.3/02487.html ) and I don't think he's any less firm about the development rules here.

How about you stop trying to throw wrenches into what he's doing? It's not a hard concept: don't break user space under any circumstances, ever; if you do, it's a bug. If you cannot understand and know this by heart, don't even fire up your editor, you are not wanted in Linux kernel development. Fork it and PROVE he is not doing it the correct way by spending weeks and months and years arguing with people who simply will not stop arguing, because you can't tell them to fuck off.

This is the Sphinx asking you an easy question, and if you don't simply give the correct answer she helpfully provided in advance, you get burned to ash. You don't get to stand there arguing, and you certainly don't pass through. Instead of complaining about it while someone is still doing it, learn to become so principled where it matters for the time when he can't do it anymore. That's infinitely more important than not swearing if you're older than 6. Have you looked around the world you're living in? Plenty of nuts people with plenty of vested interests. Plenty of weak people feeding them. If "not hurting feelings" is a priority to you, you are a hostage to those who will fake feelings until the cows come home. In turn you do not get to hold those hostage who see through it.

How much of what people use, how much profit does Linux enable? But oh no, the bad man is swearing. Unbelievable. Uninstall everything that has traces of it in it, but spare me your advice on what I need to "stop excusing". There is no aggregate by the way, "we" doesn't have to agree with anything because it doesn't exist. I have my stance, you don't even propose yours.

He has previously been unnecessarily insulting, which I don't excuse, but in this case someone caused a problem, and after 3 weeks still does not understand how serious a problem it is. He's addressing it bluntly because that is what is required. He's not being unreasonable.

If it was anyone else we'd be calling them an utter asshole and then citing them as reasons why Silicon Valley has a reputation for <insert bad behaviour here>.

Linus is still a dick and it isn't funny, refreshing or cool.

I've called Linus an utter asshole many times before, I have no problem calling him an asshole when he is one. I've made big career changes because I was sick of certain toxic people in open source projects. I'm not defending Assholes In OpenSource(TM). I just don't think this is an instance. He isn't personally insulting the person, the person was defending a choice that anyone exposed to kernel development should know does not meet fundamental expectations. After 3 weeks, that requires some bluntness.

There's a thicker-than-fine line between being hostile to new contributors and being unkind, and tolerating bad practices from people who ought to know better.

You are not your code.

And behavior can be changed.

If someone criticizes your code, taking it as a personal attack is YOUR problem. If your behavior is criticized, it means you can improve: acting like a fool may not mean you're a fool. It's on you to disprove it.

Nope. This is a valid case to call someone out. Not just for Linus, but for anyone.

Yep. Not by cussing at them or raging like a toddler.

If this was some Joe Schmo at a random company he'd be fired for treating co-workers like that. Since it's Linus it's hero worship.

And if an employee pushed for 3 weeks ignoring the company's leader's best-known rule for not screwing over customers - what does that say about the employee? How would you deal with it if they haven't listened to anyone else? It's like you're not even reading anyone's responses to understand why they disagree with you...

Not sure where you're working but in every US company I've ever worked for management behaved like Linus and worse and it wasn't them getting fired.

Unfortunately you've gotten a run of bad luck with some bad companies.

I agree with how he's saying it. He's being honest. He's letting you know where he stands and what the rule is, without any kind of hidden message or room for interpretation. The other guy? Not being so honest.

There's an implicit idea around our culture that the more uncontrolled emotional content that someone communicates, especially anger, the more 'honest' they are - I think because it's intimidating, leaving less room for debate; it makes reasoning very difficult, leaving even less room for a rational approach; its message is one of being uncompromising, which also cuts off any thinking or discussion. By eliminating thinking and discussion, it's simple, and I don't mean that in a good way, but that makes it appealing to people too lazy to deal with the necessary complexity of situations and who are not committed to getting the best outcomes. But it's not honest at all and it's not hard to see how it leads to bad results.

Acting out is actually dishonest and it's mostly about a hidden message. It's a failure to recognize and deal with your own frustrations which you then take out on others, usually when an opportunity presents itself to rationalize your behavior - someone does something you perceive as wrong, so they become a justifiable target for the anger that's already there. What I read in the email's message is, 'I'm out of control, I'm not able or willing to manage my emotions, so you better watch out'. That is a poor management and communication technique.

IMHO, a leader or manager should hold their leadership and management work product to the same standard that they hold their subordinates' technical work product. If they want the highest standards for the kernel code, they need to have the highest standards for their communication and other conduct.

A great challenge of being the boss is that there is nobody to hold you to high standards; you have to do it yourself and that's not something people are good at. As they say: Power corrupts, and absolute power corrupts absolutely. Clearly, this message could have been communicated much more effectively - while being truly honest and without hidden messages or room for interpretation - if the communicator chose to make an effort.

Finally, if this form of communication is effective, do others communicate to Linus in this manner? Does he ask them to?

I will never understand why this has been downvoted.

It's a cliche hacker news comment that adds nothing to the conversation:

- Speaks in platitudes about the topic in a way that shows a thin understanding of the context and only sounds smart to people that also aren't familiar with the context.

- Armchair psychiatry used to advance a moralizing message.

- All over the place while deep in a comment thread, seemingly more interested in hearing themselves talk than carrying on the conversation.

As the author of the comment in question, I think some of the parent comment is fair criticism - that my comment is general and too superficial, without a strong basis in fact and it doesn't contribute much hard knowledge; it's all analysis. Also, it's long.

I considered those things myself before I posted it. However, I didn't see how to overcome them; some issues don't lend themselves to strong factual bases, and some are much more nuanced than can be addressed in a few words - issues such as leadership - and yet they still are important enough to discuss. The problems of the world don't always conveniently fit our model of how we prefer to address them. How else could these issues be addressed?

> It's a cliche hacker news comment

This I strongly disagree with; these issues are rarely addressed with much serious thought and that's one reason I posted it.

Just to make sure: I found it insightful, valuable, refreshing, and the right length.

Thank you; much appreciated. To be clear, I was responding to another commenter's criticism.

Conversely, I see an insightful take on communication and hierarchy.

Is this like an alt account to issue apologia for unpopular comments? You literally went through the comment tree and rehashed your personal opinion on the comment over and over again.

No. May I offer proof? I’m not sure what the best way would be. Suggestions?

It's a maze of generic crap that no one wants to wade through?

I enjoyed reading it, and grasped meaning in it. What would you call your comment?



We've asked you not to post like this before, and it seems as though you're going to continue to violate the guidelines, so we've banned the account.

We're happy to unban accounts if you email us at hn@ycombinator.com and we believe you'll post civilly and substantively in the future.

Are you able to provide a patch for the aforementioned elementary grade mistakes to the open source kernel project without introducing any kind of regression in user space?


But they're elementary bugs you could have just fixed for yourself. You clearly were trying to do something and the mistakes that were made were easily spotted, I'm confused as to why you just didn't do it at the very least for yourself?

>That's obviously not happening with his attitude.

Seems he's making a lot of things happen with his attitude as evidenced by the thread that was linked.

"But they're elementary bugs you could have just fixed for yourself"

I'm not selfish enough to want something to work for me and only me, thank you. That's why I submitted a bug report.

And that ends up just providing another reason why Linux is a crap OS - everything is done entirely backwards as if everyone on the team learned their entire education the same way RPN is written.

I'm willing to bet my entire life savings that Linus understands the meaning of the word regression in this context better than you.


As it's used almost everywhere in coding circles, regression means a patch that made things worse than they were before. No implication of the worse state being there before.
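A rough sketch of that definition, in Python for brevity: a change counts as a regression when the new version no longer gives the result the old version gave for the same input. The names `find_regressions`, `old_version` and `new_version` are made up for illustration; nothing here comes from the kernel or from any real test suite.

```python
from typing import Callable, Iterable, List


def find_regressions(
    old: Callable[[int], int],
    new: Callable[[int], int],
    inputs: Iterable[int],
) -> List[int]:
    """Return the inputs for which `new` no longer gives the result `old` gave."""
    return [x for x in inputs if new(x) != old(x)]


# Hypothetical "before" behavior of some interface.
def old_version(x: int) -> int:
    return x * 2


# Hypothetical "after" behavior: the patch changed the result for negative inputs.
def new_version(x: int) -> int:
    return x * 2 if x >= 0 else 0


regressions = find_regressions(old_version, new_version, [-1, 0, 3])
print(regressions)
```

Note that the check only compares against the old behavior; it doesn't ask whether the old behavior was "right", which is the point being made in the thread.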

No, what we need to do is stop listening to fragile cry babies. If you ever work for a real company with real deadlines you'll get a lot worse than what he gave.

And he only responded this way after 3 weeks of unacceptable behavior. If he blew up out of nowhere after the very first report of a potential problem before it was even confirmed, that would be a different story (though that's something that probably happens in real companies every day).

> If you ever work for a real company with real deadlines you'll get a lot worse than what he gave.

I think you must have worked for some horribly unprofessional companies. I've worked places with tight deadlines before, but berating your colleagues or employees with profanity is just ridiculous. I would walk out on the spot if anyone spoke to me the way Linus did in that email.

>I think you must have worked for some horribly unprofessional companies.

I have for sure, but the worst of those companies was better than Fortune 5 (if I said exactly, you'd know which company), so while primitive they obviously manage to get results that coddling hasn't.

EDIT: And as for walking out, if I were your manager and you broke customer systems and made excuses for 3 weeks about it I wouldn't be yelling at you. I'd be firing you.

Firing somebody for lousy performance & making excuses is an entirely correct and professional response. Berating and profanely insulting them is not.

That sounds like a horrendously selfish attitude.

The problem was fixed; nobody was harmed; lessons were learned. The sum total of effect of this "Linus tirade" seems overwhelmingly positive. But just because he didn't conform to your arbitrary standards, you write him off?

Linus is more humanist in this case than your policy apparently allows.

You have no idea who was harmed. Treating other human beings with this kind of flagrant disrespect is childish and unprofessional. It's ridiculous how many people are making excuses for his behavior here.

As an experiment, let's see if this comment gets flagged...

You stupid, mouthbreathing motherfucker. How can you sit there and say "nobody was harmed"? I was harmed by having to read that word-vomit, and I've been further harmed having to respond to your boundless idiocy. You have no idea how many people have been put off working on open source projects because they'd rather contribute nothing than have to interact with human shitstains like yourself or Linus.

How's this for a positive comment? Jesus Christ I hope I never have to work alongside a worthless rat bastard like you. Your family should be ashamed to have raised such a sack of shit.

[Look how clear and direct I've been! Isn't it great!]

Actually, it's good you present this.

A major difference between your comment here and Linus' rant is that Linus didn't deliver personal attacks. Notice how all his vitriol is aimed at the code and not the person:

"Stop gthis (sic) f*cking idiocy", not "You are an idiot"

"You trying to shift regressions somewhere else is bogus SH*T", not "You are bogus sh*t"

Your comment on the other hand is mainly full of empty name-calling. Besides the fact that it accomplishes nothing, it's a personal attack instead of a clear message that "XYZ behaviour is Bad!"

That said, you are indeed indirectly evidencing that a good many people have trouble seeing a difference.

What is the point of your experiment? Insults like this are routinely flagged on HN. If it weren't, I suspect it would only be because it hadn't been seen by enough members or the mods.

My point is that tons of people in this thread seem to think this style of communication is really great because it's emphatic and gets the point across clearly or something.

I'm trying to highlight that maybe it's not so great if you're on the receiving end of it.

I'm sure a bunch of hypocrites will flag me down, though.

I've noticed others mention that it's context dependent: what's acceptable in some communities is not acceptable in others. Your comment, if meant in earnest, while it may be acceptable when uttered by Linus on the kernel mailing list (I'm extrapolating here based on the few comments I've perused in this thread), is not acceptable on HN. Given the different conditions, I don't think it would produce anything meaningful, whether it gets flagged or not. And given that you've prefaced it with "As an experiment, let's see if this comment gets flagged...", I suspect some that would otherwise flag it won't because it's clear it's not meant in earnest.

Yeah, obviously comments on this shitty message board should have a much higher standard for civility and professionalism than the development process of a major operating system. /s

I've seen messages where I thought he went over the top (personal attacks), but this seemed mostly focused on the issue at hand -- didn't bother me. (Not that it would matter if it did.)

> We really need to stop excusing Linus's behaviour. Just because we may happen to agree with what he is saying doesn't mean we also have to agree with how he is saying it.

I couldn't agree more.

More than ten years ago I was involved in the Monotone project. It was a VCS which was an ancestor to Git and a contemporary of Mercurial.

That project really prioritised good conduct, and having a well functioning community. It was a huge awakening for me. Insisting, gently, upon respectful and kind behaviour made it a project I'm proud to have been a part of.

I'm still grateful to graydon, pcwalton and all the other folks that made it such a great project to be involved with. In free software we need more leaders like that, who set the right example.

How would you rate the downstream impact of delivery failures in this project?

Respect and kindness are great virtues when you're shipping widgets. If your product failing means that end-users are going to physically suffer and/or die, I don't think they're going to care one bit about those virtues or respond in kind though.

I'm not saying you can't run a big project without kindness and respect, but I'm saying that's way down the list of importance vs not harming your users. Whatever it takes to get that chief goal accomplished needs to be done. Linus' approach has clearly been working for decades now...and it especially is in this instance.

Agreed but I don't think that it's going to happen.

We tend to excuse people for being bratty assholes when they're smart and correct. Which Linus was here.

Doesn't make it right but it's not just a Linus problem (I mean, generally), it's a people problem.

> We really need to stop excusing Linus's behaviour. Just because we may happen to agree with what he is saying doesn't mean we also have to agree with how he is saying it.

We need more behavior like this. The reason why the modern software tend to be a pile of garbage is because it is managed like popularity contest of the project managers instead of people who want to get shit done.

Disagree. It seems like Linus’ acerbic and direct means of communicating is the only thing keeping garbage code out of the kernel. He’s also made his views on “why not be nicer” well known. People need to get over it - his project, his rules.

Plenty of other projects out there to hack on if getting yelled at when you disengage your brain is too much. Most of which don’t have the obsession with not breaking stuff for the end user...

> acerbic and direct

Isn't this just swearing?

> the only thing keeping garbage code out of the kernel.

I'm not sure I buy that. It's possible, sure, but on the other hand: the fact that the other way (i.e. communicating in a nice way, without all the swearing and direct attacks, which is imo not the same as direct communication) hasn't been tried doesn't prove it won't work. E.g. I know repos where one maintainer is kind yet gets things done, and another one seems to have borrowed his communication style from Linus and gets less done, because people just give up, sick of being yelled at in an impolite manner. Is it that hard to be respectful, even when disagreeing?

This isn’t a “disagreement” talk in the op. This is a “you screwed up and then lied about it and now I have to work around your bad faith” talk. Different purposes.

Do people who don't break users and lie about it get yelled at? Why are impolite words worse than actual harmful actions?

Though I don't disagree with Linus' words, he could convey the same thoughts without swearing. I don't swear and I'm very capable of correcting people when they mess up.

Why is not swearing a virtue?

Matt Mackall, another cranky kernel hacker, showed me a lot about how it's possible to say "no" to many eager changes that would have broken Mercurial and not be yelling obscenities about it.

Linus is no role model. His imprecations are not what make Linux work.

You can have lots of theories about what would or wouldn't make Linux work, but the facts on the ground are that Linux is wildly successful, and Linus behaves as he does. The latter has been an integral part of the former.

Correlation, however, is not causation. You can't say that Linux is wildly successful because Linus behaves as he does.

It's not even correlation. Correlation is:

  (if A then B) AND (if not A then not B)
Every day, the rooster crows and the sun rises; that's not correlation. Periodically, Linus vocalizes in this way and Linux releases.
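For what it's worth, here is a small sketch that takes the formula above literally, reading "if X then Y" as material implication (that reading is an assumption; the comment may have meant something looser). Under that reading the formula is true exactly when A and B have the same truth value. The function name `quoted_formula` is made up for illustration.

```python
from itertools import product


def quoted_formula(a: bool, b: bool) -> bool:
    """(if A then B) AND (if not A then not B), read as material implication."""
    a_implies_b = (not a) or b
    not_a_implies_not_b = a or (not b)
    return a_implies_b and not_a_implies_not_b


# Enumerate all four combinations of A and B.
table = {(a, b): quoted_formula(a, b) for a, b in product([False, True], repeat=2)}
for (a, b), value in table.items():
    print(f"A={a!s:5} B={b!s:5} formula={value}")
```

Only the rows where A and B agree come out true, so the formula as written demands perfect agreement between the two, which is stricter than what "correlation" usually means in statistics.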

> (if A then B)

That's causation, surely.

> Every day, the rooster crows and the sun rises; that's not correlation.

Why not? That's literally (if A then B) according to your definition.

Linus created and then named Linux after himself. He runs it and has for decades.

And we're trying to deny that his behavior is even correlated with Linux' success?

Just because Linus is effective at communicating what he wants doesn't mean his way is the best way. Why wouldn't it be even better if he alienated contributors less?

Because high quality kernel code is more important than not alienating contributors.

If someone prefers softer conduct, they are free to fork the kernel and start their own efforts.

> If someone prefers softer conduct

The big issue I see people reacting to is profanity. You can be as hard on someone, given the nature of the infraction, without using profane language or denigrating someone.

If they don't comply, threaten exactly what Linus threatened, no more commits will be accepted.

While I don't disagree with Linus' right to speak as he does, he can be just as effective without the type of language he uses.

Some people don't care about "foul language", some others clearly do...

How was the contributor in this case "denigrated"? There's a difference between calling someone stupid and calling their actions stupid. I've seen plenty of the latter from Linus, but never the former.

In my culture we use what you'd call profanity as punctuation marks.

I feel it is imperialist for people from one culture to want people from other cultures to adhere to their way of communicating.

I understand where you're coming from. It's important to take into account the cultures and expectations of each participant if one's goal is to communicate effectively. And that goes both ways. Both the speaker and the listener need to take this into account. If one's dogmatic, refusing any change on their own part to accommodate the listener, that gets in the way of the goal of communicating effectively, just as much as expecting a speaker to conform to one's own standards as a listener does. It really needs to be looked at from each person's perspective.

And part of that is understanding when those perspectives are valuable and when they are irrelevant noise. I'd wager (but have no way of proving) that the quality of the Linux kernel has not been negatively impacted by a lack of people who get upset over word choice.

Culture ownership is a thing, and Linus and team were there first. Newcomers need to adapt, or go elsewhere. They certainly don't need to provide this constant distraction.

> quality of the Linux kernel has not been negatively impacted

I don't think that would be a good argument to make from either perspective. If people who choose to leave the community don't contribute you can't really measure lost contribution.

That not being the crux of the issue, we should focus on the following.

Just because a product is successful does not mean that the abuse that happens during development of that product is good or okay. And I think that's really the gist of what people are saying.

Profanity, like any other piece of language, is a tool.

It exists and is perpetuated in a culture's lex lingua precisely because it is an effective tool.

Expletives like this cut right through rational and emotional filters to trigger exactly the kind of response Linus was looking for -- after three weeks of arguing, the guy finally realized his mistake.

I don't see the problem with profanity, different strokes for different folks.

Profanity by itself isn't the problem. I don't have a problem with shit piss cunt fuck either. It's telling a contributor that their contributions are worthless, cannot be fixed, and other ad-hominem angry insults that just fosters a macho, jockish atmosphere in Linux that has demonstrably alienated capable contributors. I'm sorry that Sarah Sharp is no longer maintaining the USB modules because of Linus's attitude, because I've never seen anyone else who knows as much about USB as she does.

Let's put this in a vacuum (since I'm not too familiar with the whole saga over time and I'd like to see an honest response.)

> It's telling a contributor that their contributions are worthless, cannot be fixed, and other ad-hominem angry insults that just fosters a macho, jockish atmosphere in Linux that has demonstrably alienated capable contributors.

I'll quote bits one at a time.

> It's telling a contributor that their contributions are worthless, cannot be fixed

What if they are? How do you handle the situation?

> and other ad-hominem angry insults

I would argue that saying some patch is worthless and cannot be fixed isn't ad-hominem.

But I'd like to get your view, since it seems you've read a lot on the topic and care about the OSS community.

(Edit: As an aside, my personal policy in this regard is temperance in my reaction, and this is especially easy when written.)

> It's telling a contributor that their contributions are worthless, cannot be fixed, and other ad-hominem angry insults that just fosters a macho, jockish atmosphere in Linux that has demonstrably alienated capable contributors.

Linux is a project of such a massive scale that Linus has to accept kernel commits from circles of trust in his developer community. If someone demonstrates that they don't understand the fundamental goals of the project, and their contributions are accepted largely based on trust, then yes, their contributions are effectively worthless.

They have to be moved to a further out circle and have their work filtered up through others, and you have to second guess every piece of work that they've contributed.

I think you're addressing the first half of his objection to Linus' behaviour, not the second half though.

I agree with you completely, but I can see how this can be accomplished, i.e. removal of trust and correction (public at that) without the need to demolish the person him/herself.

I think we can both agree that it's a fine line though.

Personally, "losing your temper" (i.e. something you have to apologize for later) is something that should be relegated to children. Adults should exhibit self-control in this regard.

You can get the "job" done without this. Some have to work harder at it than others. But saying that it's just okay to denigrate others is not correct.

He has lost his temper needlessly in the past and acknowledged it as something to work on and from my limited perspective it looks like he has.

I don't see this particular rant in question to be a personal attack though. Personally, I don't see a macho, jockish atmosphere to the Linux kernel community either. I'm also not particularly concerned whether an environment is cultivated where anyone feels they can contribute. People contribute to the kernel out of need as well as want.

To directly address the part of the comment about otherwise capable contributors (and the Sarah Sharp example): There's multiple kinds of capability. Technical capability isn't the be-all, end-all in this industry. In fact, I think most of us would say that soft-skills are what differentiate high levels of success in this career. If a company's management culture disagrees with you, then you aren't capable of contributing in that atmosphere.

It's not a matter of either party being to blame but a matter of fit between two parties working towards a mutual goal. I don't think it's remotely fair to say that it's solely Linus's responsibility to facilitate a culture where anyone can feel comfortable contributing. Especially with a project as fundamentally necessary where people will have to contribute, regardless of the culture. If someone's incompatibility with the culture trumps their need to work in it, then yes, they should part ways and that should be perfectly okay.

The same applies to personal relationships, marriages, etc. I think it's good to step back from things from time to time and realize that, ultimately, you're dealing with people and they're not always going to value what you value and they're not always going to give you what you want.

Linus is BDFL. That's all you need to know in four letters. Why are people trying to give ultimatums to God? Seems pointless.

Why do we have to choose one over the other? Why can't we have high quality kernel code AND not alienating contributors?

Because enforcing politeness and civility in large groups is impossible without top down leadership enforcing it, which Linus doesn’t appear to want to do. He has chosen code over comfort as the de facto leader of the project.

That's not an argument for why Linus's way is the best way to lead a project, or why we can't have civility and high quality. You just restated what he has chosen to do.

> high quality kernel code is more important than not alienating contributors

That's not the choice; there are other, more effective ways to produce quality code; they are well known and well tested. I would expect that Linux loses more quality code, from contributors who have better things to do than deal with unprofessional behavior, than it gains.

> I would expect that Linux loses more quality code, from contributors who have better things to do than deal with unprofessional behavior, than it gains.

The evidence doesn't bear that out. I would expect more contributions from experienced developers who don't care about the words used in kernel mailing list discussions.

Everyone excused Johansen's behavior, which broke Linux for users, until Linus stepped in and told Johansen to stop making excuses for hurting users. Johansen apologized for his mistake. Sniping at the team -- the people doing actual work -- from the sidelines is inappropriate.

It's sad that often the only thing that stops a developer from spinning out more yarns of rationalization is Linus coming in and barking the first rule of kernel development at them.

It's like Fight Club, except instead of fighting all the characters keep asking Brad Pitt's character why they can't perform a one-man show on Broadway about fight club.


> basic definition of 'regression'


According to Wikipedia: A software regression is a software bug that makes a feature stop functioning as intended after a certain event (for example, a system upgrade, system patching or a change to daylight saving time).

Seems like the two definitions are similar. There's no requirement that the defect be one that was seen earlier, fixed, and then reappears.

> It's not a regression unless something was broken first, then fixed, then the SAME ISSUE comes back in a later code change.

No, it's when a change results in unintentional behavior that breaks previously supported scenarios. You might want to double-check your definition. I think you're actually referring to a recurring issue.

Different people have different definitions of things all the time, and yet we get on with our lives.

nope, that would be a re-regression :)

You've posted your incorrect interpretation of 'regression' twice. Please stop. Breaking something that used to work is a regression.

As much as I think the general principle of "no userspace visible change" is great, I can't make up my mind on whom I agree with the most in this specific instance.

My understanding is that some distros (e.g. OpenSUSE) ship with AppArmor in a whitelist mode where things have to be explicitly permitted by policy, otherwise they're denied by default. AppArmor introduced a new class of actions that wasn't there before, and with these restrictive policies it's causing daemons to crash (because it gets auto-denied).

The whole point of these policies is to not expose new attack surface if it's not needed by the software. Things are going to break whenever AppArmor starts supporting more checks, unless policies get pinned to a specific version of AppArmor. But AppArmor doesn't currently support this (they refer to it as "features ABI" in the thread).
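
To make the pinning idea concrete, here is a minimal Python sketch. Everything in it is hypothetical (the `pinned_classes` field, the class names, the `is_allowed` helper); it is not AppArmor's actual data model, only an illustration of how a default-deny policy could avoid breaking when the kernel learns to mediate a new class of action.

```python
# Hypothetical model of default-deny mediation with a pinned feature set.
# None of these names come from AppArmor itself.

def is_allowed(policy, op_class, op):
    """Return True if the operation should be permitted."""
    # Classes the policy was not written against are not mediated at all,
    # so a newer kernel behaves like the one the policy was developed on.
    if op_class not in policy["pinned_classes"]:
        return True
    # Within a known class, anything not explicitly listed is denied.
    return op in policy["rules"].get(op_class, set())


policy = {
    "pinned_classes": {"file", "network"},
    "rules": {"file": {"read"}, "network": {"connect"}},
}

print(is_allowed(policy, "file", "read"))    # listed in the whitelist
print(is_allowed(policy, "file", "write"))   # known class, not listed
print(is_allowed(policy, "mount", "mount"))  # class newer than the policy
```

Without the pinning check, the third call would fall through to the whitelist lookup and be denied, which is the breakage described above.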


The principle here is not technical, it is cultural. It exists to prevent the community from having to debate this class of pseudo-issue. There is reasonable logic on both sides and the choice of one or the other is somewhat arbitrary. Being part of the Linux kernel community means doing it in one arbitrary way and not the other.

The convention provides organizational efficiency. This is an example of Torvalds saying exactly what everyone knew he would say about the technical issue. Knowing exactly what Torvalds would say means core developers don't have to waste his time asking him, feature developers don't have to waste the core developers' time asking them, and those with limited commit rights don't have to waste time debating with people submitting outside patches.

In the end, having a clear principle pushes people who are predisposed to argue that their nonconforming point of view is valid or better toward the periphery of the kernel community, because the behavior is deemed unproductive. The community operates because there is trust that people won't waste each other's time. That's the violation here...e.g. "three weeks".

It's interesting to see how it works with SELinux. The Fedora distribution upgrades to a new kernel version often, so after a few months you start seeing lines like these in the kernel log:

  [   14.586798] SELinux:  Permission getrlimit in class process not defined in policy.
  [   14.586887] SELinux:  Class infiniband_pkey not defined in policy.
  [   14.586888] SELinux:  Class infiniband_endport not defined in policy.
  [   14.586888] SELinux: the above unknown classes and permissions will be allowed

That is, the kernel part of SELinux supports more checks, but since the policy is old and doesn't know about them, they are allowed by default.
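
A toy Python model of that behaviour, for illustration only. The dictionaries, the `allow_unknown` flag and the permission names inside each class are assumptions made for the sketch, not SELinux's real structures; the class names are borrowed from the log lines above.

```python
# Toy model: a kernel that knows more classes/permissions than the loaded policy.

def undefined_in_policy(kernel_perms, policy_perms):
    """Map each class to the permissions the kernel knows but the policy does not."""
    missing = {}
    for cls, perms in kernel_perms.items():
        unknown = perms - policy_perms.get(cls, set())
        if unknown:
            missing[cls] = unknown
    return missing


def check(cls, perm, policy_perms, allowed, allow_unknown=True):
    """Decide one access request."""
    if perm not in policy_perms.get(cls, set()):
        # The policy predates this check: fall back to the configured default.
        return allow_unknown
    return (cls, perm) in allowed


kernel_perms = {"process": {"fork", "getrlimit"}, "infiniband_pkey": {"access"}}
policy_perms = {"process": {"fork"}}   # older policy
allowed = {("process", "fork")}

print(undefined_in_policy(kernel_perms, policy_perms))
print(check("process", "getrlimit", policy_perms, allowed))
```

With `allow_unknown=True` the old policy keeps working on the new kernel; flipping it to `False` gives the stricter behaviour at the cost of breaking userspace that was never told about the new check.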

That's not good from a security point of view but I guess it does avoid breakages.

No, it is neutral from a security point of view: previously you couldn't configure the new thing at all. Now you upgrade to a kernel where you can configure it, but you didn't configure it the last time you configured everything (because you couldn't), so it defaults to behaving as if you had allowed it. When you come back later and think about this again, you enable the new configuration with the right whitelist, but until then things work just as well as before.

>That's not good from a security point

It is. You get the security you explicitly ask for, no more, no less. If you want more, you get to know what new is there, and enable it.

It's not clear to me either. From John's replies it also seemed to be an OpenSUSE bug.

On the other hand, if it's bad enough it certainly makes sense to treat it as a kernel bug. The difficulty lies in finding where to draw the line.

It's laid out further in the thread [1]. The key quote comes from Thorsten:

> All that afaics doesn't matter. If a new kernel breaks things for people (that especially includes people that do not update their userland) then it's a kernel regression, even if the root of the problem is in usersland. Linus (CCed) said that often enough

[1] http://lkml.iu.edu/hypermail/linux/kernel/1710.3/02487.html

Does that mean the kernel is not allowed to change version number anymore because I could write an app that segfaults if it sees Linux >= 4.14?

Another example is what happened when Linux moved to 3.0; some programs expected a 2.x version, or even 2.6.x, these programs were clearly buggy, as they should check that the version is greater than 2.x, however, the bugs were already there, and people didn’t want to recompile their binaries, and they might not even be able to do that. It would be stupid for Linux to report 2.6.x, when in fact it’s 3.x, but that’s exactly what they did. They added an option so the kernel would report a 2.6.x version, so the users would have the option to keep running these old buggy binaries.
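
For illustration, the kind of check that broke on 3.0 and a more forgiving alternative might look like this in Python. The function names and the `minimum` parameter are made up for the sketch; the affected programs were of course not necessarily written this way.

```python
def looks_supported_buggy(release):
    # Bug: assumes the release string always starts with "2.6.".
    return release.startswith("2.6.")


def version_tuple(release):
    """Parse the leading numeric components of a string like '4.14.0-rc3'."""
    parts = []
    for piece in release.split("-")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def looks_supported(release, minimum=(2, 6)):
    # Compare numerically, so any later version is accepted.
    return version_tuple(release) >= minimum


print(looks_supported_buggy("3.0.0"))  # rejects a newer kernel
print(looks_supported("3.0.0"))        # accepts it
```

In a real program the release string would typically come from something like `os.uname().release`.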

From: https://felipec.wordpress.com/2013/10/07/the-linux-way/

The kernel version is a value not a field.

You're welcome to assume forever that there will be a version field; if it ever becomes the case that this simply makes no sense anymore, then a sensible dummy value will be placed in it [0].

[0] - http://lkml.iu.edu/hypermail/linux/kernel/1710.3/02487.html

    "There's a number of fields in /proc/<pid>/stat that are printed out as zeroes, simply because they don't even *exist* in
    the kernel any more, or because showing them was a mistake (typically an information leak). But the numbers got replaced
    by zeroes, so that the code that used to parse the fields still works."
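
A small Python sketch of why zero-filling matters to a positional parser. The sample lines are invented and shortened, and which field is "retired" is chosen arbitrarily for the example; only the general shape (pid, parenthesised command name, then space-separated values) follows /proc/<pid>/stat.

```python
def parse_stat(line):
    """Split a /proc/<pid>/stat-style line into positional fields."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so split around the last ')'.
    head, _, tail = line.rpartition(")")
    pid, _, comm = head.partition(" (")
    return [pid, comm] + tail.split()


original = "1234 (my daemon) S 1 1234 1234 0 -1 4194560 77"
zeroed = "1234 (my daemon) S 1 1234 1234 0 -1 0 77"   # retired field printed as 0
removed = "1234 (my daemon) S 1 1234 1234 0 -1 77"    # retired field dropped

WANTED = 9  # an old program reads the field at this fixed position

print(parse_stat(original)[WANTED])
print(parse_stat(zeroed)[WANTED])
print(len(parse_stat(removed)))  # one field short, so index 9 no longer exists
```

With the zero placeholder the old program still finds its value at the same index; with the field dropped, every later index shifts by one and the lookup fails or silently reads the wrong value.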

When the Opera browser updated from 9.60 to 10.0, they found that some sites stopped working because they blocked user-agents with low versions of Opera... and they checked versions by looking at only the first digit after "Opera." So version 10 looked to them like version 1.

Opera "fixed" the problem by having version 10 report itself as version 9.80 in the user-agent string.
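
A sketch of the sort of sniffing that would produce this failure, in Python. The user-agent strings are simplified and the function names are invented; the sites in question presumably did something equivalent in their own code.

```python
import re


def major_version_buggy(user_agent):
    # Bug: captures only a single digit after "Opera/".
    match = re.search(r"Opera/(\d)", user_agent)
    return int(match.group(1)) if match else None


def major_version(user_agent):
    # Captures the whole run of digits before the dot.
    match = re.search(r"Opera/(\d+)", user_agent)
    return int(match.group(1)) if match else None


print(major_version_buggy("Opera/10.00"))  # sees version 1
print(major_version("Opera/10.00"))        # sees version 10
print(major_version_buggy("Opera/9.80"))   # the workaround keeps old sniffers happy
```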

Because time is a flat circle, the same thing happened when Microsoft was planning the successor to Windows 8. Too many programs saw "Windows 9" and thought the OS was Windows 95 or 98, and tried to use outdated versions of APIs (or refused to launch). So Windows skipped a version and that's why the current version is called Windows 10.

So to answer your question: There is precedent for that scenario.

You're just being disingenuous, what does that help?

It's clearly an error on the part of the kernel if the owner of said kernel has specific expectations of behaviour on upgrade with respect to user spaces and those are not being met.

It is just as disingenuous as saying "details don't matter". bonzini's comment was saying a line needs to be drawn, and I agree with that. It's pretty clear the line needs to not include "breaking on version number updates", less clear what it also needs to not include.

No, in fact, it is pretty obvious, at least to people doing kernel dev.

The contract in this case is that there will always be a version number, and anyone who expects it not to change is an idiot or just being obtuse.

Well, I do kernel dev. I am the maintainer of KVM. :)

> anyone who expects it not to change is an idiot or just being obtuse

If this was the case, Microsoft would have never had to skip from Windows 8 to Windows 10 (see below in another comment: "starts with Windows 9" was used in the wild to detect Windows 95/98).

Eh? The version number changed. MSFT just changed to something not completely predictable in advance. Or am I missing your point?

They _had_ to change it to "10" because there was existing code in the wild that misbehaved for a hypothetical Windows 9.

I still don't get it; either I'm remarkably dense today, or there's a disconnect. I wrote:

> anyone who expects it not to change

And you appear to be using an example of a version-change curiosity as a counterexample. It is not a counterexample, because the version number in question changed from 8 to 10. In my original formulation, anyone who claimed to worry that the number would stay 8 forever was being an idiot or being obtuse.

I think that holds up. It was not part of my comment, but someone who expected it to be 9 would not be an idiot or obtuse, because that would have been a reasonable guess, absent additional information. They were expecting it to change but were surprised by an exceptional circumstance.

But this is getting a bit silly; there may be someone out there who thinks version numbers should be immutable across versions, but I bet they're pretty lonely. It was an example picked up while making a wider point.

Windows contains code like "If SimCity is running, allow it to use memory after it's been freed"[1]

So yes, if you had a significantly deployed app that segfaults because the version number >= 4.14, Linux probably would seriously consider workarounds for your app.

1: https://www.joelonsoftware.com/2004/06/13/how-microsoft-lost...

Also, Microsoft skipped releasing Windows 9 because so much code out there thinks anything labeled "Windows 9..." is of the Windows 95/98 lineage.

Damn it, you can still install Windows 10 in 32-bit form on a modern x86 CPU and expect win16 binaries to work.

The only reason there is a problem with the 64-bit form is that AMD made the 64-bit mode and the 16-bit mode mutually exclusive. You can jump from 64-bit to 32-bit mode, or from 32 to 16, but not from 64 to 16.

No, BS contrived scripts are allowed to regress.

And the other key quote from John: "It is a userspace configuration issue. Your userspace is set up to basically do policy development".

The problem is that apparently this is true for all of OpenSUSE, Debian and Ubuntu, and that's when a userspace bug becomes in practice a kernel regression.
