Say the cost in time/money/resources of coordinating between any two individuals is on average x dollars. Let's call this figure pair coordination cost.
The pair coordination cost between 3 individuals, ignoring overhead, is 3x dollars, as there are three possible pairs that need to coordinate, i.e., there are three connections in a network with three nodes.
The pair coordination cost between 4 individuals, ignoring overhead, is 6x dollars. At 5 people, the pair coordination cost is 10x dollars.
At n people, the pair coordination cost is (n(n-1)/2)x dollars, which is asymptotically proportional to n²x as n grows.[a]
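As a rough sketch of that formula (the per-pair cost x here is just an illustrative number, say $100):

    # Minimal sketch of the pair coordination cost model above.
    # x (the cost per coordinating pair) is a hypothetical figure for illustration.

    def pair_coordination_cost(n: int, x: float) -> float:
        """Total cost when every pair among n individuals must coordinate."""
        return n * (n - 1) / 2 * x

    x = 100.0  # assumed dollars per coordinating pair
    for n in (3, 4, 5, 10, 50):
        print(f"{n:>3} people -> {pair_coordination_cost(n, x):>9,.0f} dollars")
    # 3 -> 300, 4 -> 600, 5 -> 1,000, 10 -> 4,500, 50 -> 122,500:
    # doubling the headcount roughly quadruples the cost.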
When coordination costs are high, as with software development, it's not surprising that small teams outperform large teams.
Large software development teams generally cannot accomplish much -- unless they break into smaller, highly autonomous teams with well-separated responsibilities.
This applies to other high-coordination-cost endeavors, such as building a successful company from scratch.
This is, incidentally, the formalization of my intuition about bureaucracies and general organizational efficiency[0] - basically, organizational inefficiency comes almost entirely from coordination costs. This is why government organizations are inefficient (way too many stakeholders), why big companies tend to be about as bureaucratic as governments, and why small governmental projects are much less efficient than small private projects (because "small" government ops still have lots of non-obvious stakeholders that come from accountability requirements).
Government organizations also have an obligation to make many of their services available to everyone, without discrimination, whereas companies can choose to ignore portions of the market that cost more to service than the value gained (unless it's a protected class).
This is why large teams are bureaucratic. Spending 2 hours a day filling out forms is a net time savings over sitting down for an average of 10 minutes each with 13+ people (13 × 10 minutes is already more than 2 hours).
Now the data is organized, and anybody who needs it can get it on a pull basis. Some teams will have someone whose primary job is to triage the information added and put the important stuff out in a daily update (making it "push" rather than "pull").
Hierarchy can also help; if you have mostly unrelated projects, then you have smaller groups that are more efficient.
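To make the hierarchy point concrete, here's a minimal sketch under the same pairwise model, assuming n people split evenly into g autonomous groups with one representative per group handling cross-group coordination (a simplification, not a claim about any particular org):

    # Rough sketch of the hierarchy point, using the pairwise model from above.

    def pairs(k: int) -> int:
        return k * (k - 1) // 2

    def flat_pairs(n: int) -> int:
        return pairs(n)                     # everyone coordinates with everyone

    def grouped_pairs(n: int, g: int) -> int:
        within = g * pairs(n // g)          # coordination inside each group
        across = pairs(g)                   # one representative per group
        return within + across

    n = 60
    print("one big team: ", flat_pairs(n), "pairs")        # 1770
    print("6 teams of 10:", grouped_pairs(n, 6), "pairs")  # 6*45 + 15 = 285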
The Linux kernel operates on a combination of these techniques: the mailing list is non-pairwise, there's a significant amount of non-code information that goes along with any PR, and Linus doesn't even look at it until the subsystem maintainer has.
I think some of the most successful big companies are ones that have figured out how to significantly reduce coordination costs. For example, Amazon's extreme insistence on small teams and coordinating through interfaces[0] seems like something that might make individual teams less productive, but cause coordination costs to scale linearly instead of quadratically. Similarly, Facebook has a ton of emphasis on decentralization.
This leads me to believe that at large companies, you should heavily prioritize decentralization and low coordination costs at the expense of almost anything else. One reason companies fail to do this is that they try to avoid redundancy: having two teams work on the same thing seems like a waste. But if you need to add a communication node to avoid redundancy, that's often not worth it.
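A back-of-the-envelope sketch of the linear-vs-quadratic claim, assuming each team coordinates only through a small fixed number of published interfaces (k is a made-up constant) rather than directly with every other team:

    # Back-of-the-envelope comparison for the linear-vs-quadratic claim.

    def full_mesh_links(teams: int) -> int:
        return teams * (teams - 1) // 2     # every pair of teams coordinates

    def interface_links(teams: int, k: int = 3) -> int:
        return teams * k                    # each team talks to ~k interfaces

    for teams in (10, 100, 1000):
        print(teams, full_mesh_links(teams), interface_links(teams))
    # 10 -> 45 vs 30, 100 -> 4950 vs 300, 1000 -> 499500 vs 3000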
I think this is why microservices "work" (when they work) despite being a suboptimal approach technically. It's a fairly straightforward way to loosely couple team deliverables.
Yep, sociotechnical theory uses the same insight to design work so that people have complex tasks in simple organizations. Unfortunately most companies insist on doing the opposite and they create complex organizations with simple tasks.
It's sometimes interesting to pose a little thought experiment to somebody about what they think they could get done if they had 10 world-class developers and practically unlimited resources - certainly the ability to hire or acquire almost anybody or anything within some degree of reason. What could they get done in 10 years?
Now multiply that by ~2,000. That's, roughly, the sort of resources Google has had. Kind of makes you wonder. Do we dramatically overestimate our abilities, or does the efficiency of companies like Google truly diminish as much as this little thought experiment would seem to suggest?
That thought experiment was why I joined Google after 4 years of working in startups. The communication cost overhead the grandparent mentioned is why I left Google after 5 years.
While there, I observed a lot of projects where I was literally working with a room full of world-class people - folks who had given TED talks, folks who had started major open-source projects, folks who had written the "Bible" of their particular subfield of computing - and the final design we came up with was worse than what any one person in the room, working on their own, could've come up with. Good designs tend to be both controversial and coherent: they take a position, and even though not everyone agrees with that position, they stick to it, because self-consistency has benefits that are often intangible but highly valued by users. When you design by compromise, you sand off all the most innovative (= hard to communicate) parts of the design and end up with only the bare minimum that everyone can agree on.
It's interesting that when you put a bunch of average people in a group, have them independently make a prediction, and then average the predictions, you end up with a result more accurate than any one participant's prediction. When you put a bunch of really smart people in a group, have them cooperate to make a design, and look at the design, you end up with a design that's worse than any one expert's original design.
Yes, the joy of a great design is in the delicate balance it strikes between simplicity and conflicting needs. There's plenty of anguish in finding the balance even in the head of a single person.
The space of solutions to a design problem is really high-dimensional, and the quality of the design has many local maxima. So from that point of view it's not too surprising that averaging the features of several designs puts you at a point in the space which is not a local maximum. For a more apples-to-apples comparison with the averaging of predictions, it might be useful to consider predicting the value of a random variable drawn from a multimodal distribution. In this case it's highly likely that different people will focus their predictions on different modes, and averaging several predictions will be worse than taking a single one in isolation.
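A tiny simulation of that multimodal-prediction point; the bimodal distribution, the five predictors, and the "within 1.0" hit criterion are illustrative assumptions, not anything stated above:

    # Tiny simulation of the multimodal-prediction point.
    import random

    random.seed(0)
    MODES = (0.0, 10.0)   # two equally likely modes ("good designs")
    NOISE = 0.5           # spread around each mode
    TRIALS = 10_000

    def draw() -> float:
        """Sample an outcome (or a prediction) near a randomly chosen mode."""
        return random.gauss(random.choice(MODES), NOISE)

    hits_single = hits_average = 0
    for _ in range(TRIALS):
        truth = draw()
        predictions = [draw() for _ in range(5)]
        average = sum(predictions) / len(predictions)
        hits_single += abs(predictions[0] - truth) < 1.0
        hits_average += abs(average - truth) < 1.0

    print("single prediction within 1.0 of outcome :", hits_single / TRIALS)   # roughly 0.4
    print("averaged prediction within 1.0 of outcome:", hits_average / TRIALS)  # only a few percent
    # The average lands between the modes, a value that almost never occurs,
    # much like a design-by-compromise sitting between two local maxima.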
I'd say both. Programming/CS is particularly prone to practitioners overestimating their ability because SW is always incrementally "easy". From my own experience working across SW, HW, and some mechanical design, it's always the SW part where I get overconfident, and I've been at it for almost 30 years.
This doesn't reflect my experience working in large teams. Usually there's some amount of cost devoted to managing the overhead of a larger group of individuals (e.g., a dedicated manager, processes, etc.).
Poorly managed projects (usually) collapse if the cost becomes higher than the benefit.
This is why vertically-sliced product teams work - they minimise the communication and coordination overhead associated with Doing a Thing(TM), while layered or function-specific teams make everything take so much longer - particularly as the formal documentation and communication overhead grows enormously when different teams are accountable for different aspects of the same system. When a small team owns an independent(-as-possible) vertical slice, accountability is shared and the need for well-documented communication is largely mitigated.
--
[a] This is Metcalfe's law, but for a network's pair coordination cost instead of value: https://en.wikipedia.org/wiki/Metcalfe%27s_law