
Linux follows an inverse form of Conway's Law - bhjs2
https://medium.com/@aserg.ufmg/does-conways-law-apply-to-linux-6acf23c1ef15
======
tytso
Conway's Law absolutely applies to Linux. The trick is to remember that the
communication pattern that Linux is optimized to reflect is "git tree pulls".
Over time, things have been factored to minimize the amount of cross-tree
merge conflicts. That way we can decentralize the development effort, and
worry much less about conflicts when Linus has to pull from a hundred-odd git
trees (many of which have sub-trees that were merged together by the subsystem
maintainers before Linus then pulled them into his tree). But _that_ is the
primary communication pattern which is critical, and we have absolutely
optimized how the code has been laid out to minimize communication
difficulties --- when defined in terms of merge conflicts.

------
sunir
I don't think the author makes a very convincing point here; Linux is
frequently cited as a prime example of Conway's Law in practice for a reason.

The reason the driver subsystem is architected as pluggable modules
("drivers") is to support the extremely wide array of organizations that have
to build into it.

The reason why Linux is broken down into subsystems is to support the
"specialists" who work on only one system at a time.

The reason Linux is a monolithic kernel that has a large degree of
complication internally (vs. a microkernel) is because Linus is strong enough
to make it happen.

I mean, the logical error is right in the title. The author inverted cause and
effect.

~~~
Doradus
It seems absurd that an article claims to disprove a hypothesis linking
structure to communication patterns without even the slightest mention of
trying to observe those patterns. But they go even further: they claim to have
determined the direction of causality between two variables without even
having measured one of them!

------
smitherfield
Misleading y-axis strikes again:
[https://cdn-images-1.medium.com/max/800/1*MiF0VmKW19IC3OlVDI...](https://cdn-images-1.medium.com/max/800/1*MiF0VmKW19IC3OlVDISgZA.png)

~~~
cyphar
It's not just a misleading y-axis, it's completely the wrong kind of graph for
showing that kind of data. Should've used a stacked bar graph...

~~~
theemathas
I would have gone with a pie graph of the average over the years.

~~~
cyphar
That would only be useful if you didn't care about historical data. Also,
plotting averages like that can be _exceptionally_ misleading because you
don't get an insight into the distribution of the data.
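
A minimal sketch of the point about averages, using invented numbers (the two
series below are hypothetical and are not taken from the article). Both have
the same mean, so a chart of the average would draw them identically, yet one
is flat and the other climbs steadily:

```python
from statistics import mean, pstdev

# Hypothetical yearly "specialist" percentages, invented for illustration only.
steady = [50, 50, 50, 50, 50]
trending = [30, 40, 50, 60, 70]

for name, series in (("steady", steady), ("trending", trending)):
    # The mean is identical for both series; only the spread tells them apart.
    print(name, "mean:", mean(series), "spread:", round(pstdev(series), 1))
```

The spread (or the per-year values themselves) is what an averaged chart
throws away.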

------
scribu
What the article is suggesting is that the Linux architecture wasn't affected
by organisational pressures that closed-source systems face.

That is to say that subsystems were defined solely based on technical
considerations, which is how it should be if the goal is sound engineering.

Not sure what to make of the ratio between "specialists" and "generalists". A
comparison to ratios from other projects would provide some helpful context.

~~~
lomnakkus
> That is to say that subsystems were defined solely based on technical
> considerations, which is how it should be if the goal is sound engineering.

I think that's too idealistic. As another sibling poster pointed out, it's
more like a democracy... with all the attendant upsides and _downsides_.
(Also, I'm _sure_ there's a lot of interpersonal politics involved, even if
it's all over email.)

Don't get me wrong, the Linux kernel works surprisingly well and I rely on it
for almost everything (including my livelihood), but if you _really_ look at
it, some of the subsystems are _shockingly_ bad.

I think a good example is containers/namespaces which have a ridiculously bad
security record. (See "user namespaces".) Again, I'm sure the people working
on these things had the best of intentions, are _very_ competent generally,
and were hampered by the "never break userspace" rule. However, if
containers/namespaces were truly _designed_ , it could have been done a _lot_
better. (See "Zones" on Solaris et al.)

~~~
X86BSD
I agree with you. But I think you're being generous when saying _some_ of the
subsystems and architecture are shockingly bad. Some of them are not even
designed. If you look at them it's like "This isn't even useable!"

The problem with namespaces and all the "container" tooling built on top of
them is, as you noted, security. They were never designed to be secure from
the start. It was not meant to be. And it was always puzzling to me why someone
would build containers on them, knowing they weren't secure.

As you also note, Solaris got it right with zones and FreeBSD got it right with
jails; there were plenty of options to use and build on. But as is the typical
Linux story, they had to do things their own way, and poorly.

I still have no idea what Linux land is going to do for a filesystem. Since
it's pretty clear ZFS is kept at arm's length. Btrfs seems like the horse they
are hitching to? Filesystems are hard. Which is why they can't seem to write a
solid, reliable one and keep starting over. I think it's a big issue for them
and is only going to get worse as data continues to explode exponentially.

I wish ZFS had been BSD licensed. And I wish Linux had simply adopted it as
well. But those ships have sailed. Where to now?

~~~
lomnakkus
Agreed, I probably was being a bit too generous, but I usually try to be as
"diplomatic" as possible :).

I don't know much about the BSDs, but my impression was that jails don't
really offer nearly enough to rival Zones. Zones is really "containers done
right"[1].

Definitely agree on ZFS. I'm currently using ZoL on my Ubuntu servers, but I'm
not going to try using it on anything else. (FWIW, it's working marvellously
so far.)

Personally, my current best hope for a good Linux FS at the moment is
Bcachefs[2]. It's based on the proven Bcache. It's mostly done by a single
very talented programmer/designer who _knows_ when he's out of his depth[3].

[1] Not "virtualization" per se, but as close to it as makes no difference, if
that makes sense?

[2] [https://www.patreon.com/bcachefs](https://www.patreon.com/bcachefs)

[3]
[https://bcache.evilpiepirate.org/Encryption/](https://bcache.evilpiepirate.org/Encryption/)
(the initial paragraph; AFAIR there's a lot more out there about the process,
but I didn't stash any links and that's all I could find in 30 seconds :).

~~~
X86BSD
Zones was the logical conclusion to the FreeBSD jail work. They added the full
feature set that jails now have, each zone getting its own IP stack, for
instance. Jails didn't have that in the beginning. They were written by one guy
to simply offer a basic but secure container for running multiple web servers
in isolation from the host. Had he had the resources of Sun Microsystems, I'm
sure he would have had all the features zones had from the start. But one man
can only do so much with finite resources. And the main design goal was
security and isolation, which was accomplished. Separate network stacks were
added later.

I'm glad to hear you're using it. I really can't imagine using anything else.
Open source or commercial. Nothing is as compelling or reliable and easy to
use. Me personally, I am forever grateful to Sun for open sourcing it. I remember
pre-ZFS days, with volume managers, and how hard it was to grow a ufs/ffs
anything, just an absolute nightmare. And backups? Ugh. Tapes. Hoping every
bit was copied, but never really knowing for sure. Just ugh. The pain.

I've heard bcachefs talked about too. And as you originally stated, I think
these folks writing btrfs and bcachefs are talented and are doing the best
they can, but again, FS's are hard. Look at all the garbage we've been stuck
with from all the major commercial vendors until ZFS. Everything Microsoft
puked up, or Apple, even the major UNIX vendors sucked at making file systems
and volume management. Veritas? I'd rather stab my eyes with soldering irons
than use any of that stuff ever again. Seriously ZFS compared to all of them
is night and day. I'll never be able to thank the developers enough for
helping me out of the pit of filesystem and volume management hell. I'll be
interested to see how bcachefs ends up. But mostly I think everyone will be
trying to play catch up to ZFS and more than likely doing it poorly. But I
applaud their efforts to try!

Thanks for the links btw, I always enjoy reading up on various tech.

~~~
lomnakkus
FWIW, I think there's a huge difference in approach (and humility[1]) between
the Btrfs and Bcachefs developers. I'm sponsoring the latter.

[1] I tried to make this point, perhaps badly.

------
davidst
The article makes a good point but I wonder if it is an incomplete
explanation. What would happen if Linus Torvalds walked away and there was no
single leader to guide (or "dictate", depending on your point of view) its
development? Would it begin to fragment and exhibit signs of Conway's Law?

I believe the answer is, yes, it would. While Linus is a stubborn and
opinionated leader ("Benevolent Dictator For Life") it is those qualities,
coupled with his extremely high standards, that have preserved the coherence
of Linux's system architecture all this time.

~~~
pmoriarty
I wonder if there are any large, successful open source projects that are
leaderless and function well without a social hierarchy?

If non-hierarchical social structures are really more effective, such examples
should be easy to find, no?

On the other hand, maybe their absence only indicates that online communities
simply tend to mirror the social structures of offline communities, or that
they're just mostly made up of people who prefer hierarchies.

~~~
dTal
Extremely relevant:

[https://en.wikipedia.org/wiki/The_Tyranny_of_Structurelessne...](https://en.wikipedia.org/wiki/The_Tyranny_of_Structurelessness)

tl;dr - hierarchies will form whether you want them to or not. If you refuse
to endorse an official structure, you'll simply get an unofficial (and more
often than not, unaccountable) one instead.

------
jacques_chester
I've elsewhere seen described the "Inverse Conway Manoeuvre": make the org fit
the emerging architecture.

We do this at work. It mostly works, modulo "Distributed Systems Are Hard".

~~~
mpweiher
I have also done this at work with (what I believe was) some success, w/o the
distributed systems aspect.

~~~
robotresearcher
Unless you are a company of one person, you have a distributed system.

~~~
mpweiher
No, not really. Now the total system is distributed, but the parts I tried
this on weren't. They were Mac and iOS clients with shared code. I moved the
architecture from two separate apps that have some shared libs towards a
headless app that is basically the same and has two UIs (also with shared
code) sitting on top.

~~~
robotresearcher
So in what way did you change your org to reflect the software structure, re.
the grandparent comment?

My reply was about orgs, not your software. Since orgs are made of people,
they are distributed systems.

------
ryanmarsh
There is an unsubstantiated ocean between the "Therefore" beginning the last
paragraph and the paragraphs before it. If anything, the author's data points
lead me to draw the opposite conclusion.

------
superlopuh
The statistics seem to be obviously incorrect: there is no correction for the
distribution of the number of files that a contributor might author or have a
significant effect on. Since most contributors will have made only a small
number of contributions, this is a large bias.

The graph that would ultimately support the point of the article would show
the difference between the observed data and a simulation of a uniform
distribution of contributions by the authors, and would have a full 0-100%
axis for scale, as opposed to the 35-65% presented in this article.
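
A rough sketch of the kind of baseline I mean, under stated assumptions:
`specialist_share`, the subsystem sizes and the files-per-author list are all
made up for illustration, and "specialist" is taken to mean an author whose
files all sit in one subsystem. Files are assigned to subsystems at random,
weighted by subsystem size, so any specialists that appear are purely an
artifact of how few files most people author:

```python
import random

def specialist_share(files_per_author, subsystem_sizes, trials=200, seed=0):
    """Average share of authors whose files all land in a single subsystem
    when each file is placed at random, weighted by subsystem size."""
    rng = random.Random(seed)
    names = list(subsystem_sizes)
    weights = [subsystem_sizes[n] for n in names]
    total = 0.0
    for _ in range(trials):
        specialists = 0
        for n_files in files_per_author:
            picks = rng.choices(names, weights=weights, k=n_files)
            if len(set(picks)) == 1:  # every file fell in the same subsystem
                specialists += 1
        total += specialists / len(files_per_author)
    return total / trials

# Hypothetical inputs: relative subsystem sizes and a long-tailed author list.
sizes = {"drivers": 60, "arch": 20, "core": 10, "fs": 10}
authors = [1] * 70 + [2] * 20 + [20] * 10
baseline = specialist_share(authors, sizes)
print(f"specialist share with no real specialization: {baseline:.0%}")
```

With these invented inputs, the 70 single-file authors count as specialists no
matter what, which is exactly the bias. The interesting number is how far the
observed share sits above this kind of baseline.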

------
dorfsmay
> the Degree-of-Authorship (DOA) measure to define the authors of each file in
> a system

But in source control, author is typically defined as the first contributor to
a file, which doesn't always reflect the person who contributed the most
content to the file.

~~~
cyphar
The paper they linked to has a much better description of how they figure out
file authors, and how they avoid issues with one-time contributors.

------
notalaser
In my experience, while the statistics that the article quotes are obviously
correct, the reasons have very little to do with the architecture, and they
very much mimic the way that the "community" works. Linux's architecture has
very little to do with why communication (and contributions) are the way they
are. In fact, the architecture is largely designed precisely so that it can
_withstand_ the sort of organizational pressure that the Linux kernel faces.
See, for example, the recent(-ish) rejection of AMD's drivers: they got
rejected because they included a HAL, which -- based on previous experience
-- is usually a bad idea in an open system, as it tends to depend on highly
organization-specific knowledge, and the volume and difficulty of maintenance
work makes it difficult to manage by a non-committed community once the main
owner drops it for greener pa$ture$.

The very separation that the article draws "core" vs "drivers" is actually
highly representative of how the Linux community is structured. Most of the
core work (including the driver subsystem's backbone) is done by long-term
contributors who actually work on the Linux kernel full time. Most drivers
actually come from occasional contributors.

Driver contributions are "specialized" for the same reasons why they're
specialized on pretty much any non-hobby operating system, namely:

1. The expertise required to write complex drivers mainly exists within the
organization that sells the hardware. Needless to say, these people are
largely paid -- by the hardware manufacturers! -- to write drivers, not
contribute to what the article calls core subsystems. There are exceptions
("trivial" devices, such as simple EEPROMs in drivers/misc, are written by
people outside the organizations that sold them), but otherwise drivers are
mostly one-organization shows. In fact, for some hardware devices,
"generalists" don't even have access to the sort of documentation required to
write the drivers in the first place. (sauce: wrote Linux drivers for devices
that you and me can't get datasheets for. $manufacturer doesn't even bother to
talk to you if you aren't Really Big (TM))

2. Furthermore, there really are subsystems in the kernel that are largely a
one-company show and are very obvious examples of Conway's law. IIO drivers,
for instance, while started by Jonathan Cameron who, IIRC, is really an
independent developer, are largely Intel's and Analog Devices' -- to such a
degree that, even though they follow the same coding conventions, if you've
worked there enough, you can tell who wrote a given snippet. Same goes for
most of the graphics drivers. Most of Infiniband used to be IBM, I think. If
you dig down in the drivers subsystems, you'll see even funnier examples (my
favourite example is the ChipIdea USB controllers; a few years ago, support for
USB slave mode on some Broadcom SoCs broke down because Freescale pretty much
took over de facto ownership of the drivers, and some of their changesets
worked fine on their ARM cores, but broke on Broadcom's funky MIPS-based
cores).

Also, this is very weird to me:

> Adherence to Conway's Lay _(sic!)_ is often mentioned as one of the benefits
> of microservices architecture.

Back in My Day (TM), adherence to Conway's Law was usually considered a
negative trait, summarized by the mantra that, in the absence of proper
technical leadership, an organization of N teams tasked with writing a
compiler is going to produce an N-pass compiler.

Of course, this _is_ a most negative example, but are we really, seriously
considering that adherence to Conway's law is a positive thing today? That
it's actually a good idea for the architecture of a software system to reflect
the "architecture" of the team that created it, rather than, you know, _the
architecture that's actually best for what it's meant to do_?

~~~
regularfry
Adherence to Conway's Law _is_ regarded as a good thing, but not that way
round. We want to build the teams to match the architecture of the software
we're building, not vice versa. If you don't cut it that way, you're always
going to be swimming upstream.

------
RandyRanderson
[https://en.wikipedia.org/wiki/Conway%27s_law](https://en.wikipedia.org/wiki/Conway%27s_law)

------
JacksCracked
Could this be related to the fact that all communication between Linux devs is
done by email?

