Hacker News
Software Projects and Heroes: Lessons Learned from GitHub Projects (arxiv.org)
337 points by gmolau on April 28, 2019 | 134 comments

Everyone who's interpreting this in the context of full-time work / the traditional notion of "10x programmer" (i.e. ~every toplevel commenter) should read the abstract:

> A "hero" project is one where 80% or more of the contributions are made by the 20% of the developers... We identify the heroes developer communities in 1100+ open source GitHub projects.

(emphasis mine)

Of course nearly every Github project is going to have heroes--we call them "maintainers." This study is totally unsurprising and totally non-generalizable to professional teams working on proprietary software.

The use of the term "heroes" in the abstract is massively misleading and tendentious, IMO.

Of course most of the commits are by a small number of developers: Nobody else cares about the project enough to commit.

The interesting claim of the paper, though, is that restricting the number of developers increases code quality. Having skimmed the paper, I'm not really convinced. I see a lot of posturing and quotations from other literature, but very little in the way of data analysis. I'd be extremely happy to be proven wrong, but I don't think this paper has done the analysis to back up its claims.

Ok, we've added "Github" to the title above.

Edit: and maybe we can make the phrasing a bit less tendentious. Bad sign when paper titles need debaiting, but that's the trend.

This was my first submission here, but is it really helpful to change the title of a published research paper? Just because some commenters didn't like it, and at the cost of this discussion now being harder to find when people google the paper?

The site guidelines say: "Please use the original title, unless it is misleading or linkbait" (https://news.ycombinator.com/newsguidelines.html). Being the title of a published research paper doesn't immunize against those things, unfortunately, and based on the comments posted here, that title was arguably both.

We know from long experience why it's helpful to make such a change: if we don't, people will keep complaining about and bickering about the title. Editing is the only way to soothe title fever. https://hn.algolia.com/?sort=byDate&dateRange=all&type=comme...

This thread still comes up if I google the paper's original title, and is the #1 result if I google "software project heroes".

p.s. Welcome to HN!

Not trying to start an argument over words, but that seems a bit unfair to the authors. Their title promised they would analyze 1100 projects, and that is what they did. Being accused of misleading or even linkbaiting their readers by (let's be honest) armchair scientists imho doesn't reflect well on this platform.

The number of projects analyzed was not the issue, though "1100+" does add to the impression of the title being hotrodded a bit. The criticisms HN users made of the title, that I heard, were first that the paper looks only at open-source projects on Github, which is not representative of software projects in general, while the title refers to software projects in general. Second, "Why X needs Y" (where Y is a sensational word like "heroes") is a linkbait trope, easily recognized by anyone who spends time around internet articles, as well as an overselling of the finding. An analysis of commit rates on Github does not explain why, it shows that—which is great, and which we edited the title to represent more clearly.

On HN, an article does not get a pass to use these tricks just because it's a paper on Arxiv. The internet game is to inflate titles to oversell the body, and the HN game is to rejig them back to scale, so there isn't a disappointing delta between what the title promises and the body delivers. We did that in two ways: by adding Github, and by replacing the "Why" bit with a simple mention of the relationship explored by the article. I'm sure it is possible to come up with something better, i.e. more accurate and neutral, and if anybody suggests better we can change it again. But if a paper is going to play the internet game—which, dismayingly, we're seeing more of, as academics increasingly have to fight for clicks in their own right—then HN is going to play the HN game. Your authors are expert in their domain and the HN community is expert in its.

I wouldn't say this change is unfair to the authors. The purpose is to steer the thread toward a substantive discussion of their work, just what any good scientist wants, and away from title nitpicks and complaints, which is boring and no one wants. I can tell you from years of experience that once these complaints start appearing, they continue unless and until the title is edited to bring down the inflammation. We can't fight it, we can only heed it—it's a force of nature. Why would that be? I think it's that people come to HN for relief from the onslaught of bait and manipulation that plagues them on most of the internet. The implicit contract here is that the front page will be a bit calmer, more neutral, more accurate. HN doesn't do that perfectly by any means, but if it's noticeably quieter and more bookish than what is typical elsewhere, that's the main thing.

You can call HN users armchair scientists if you like, but I think you'd be surprised by how many working scientists participate in this community, not to mention many who have had scientific careers before moving into industry. They are humble and don't flaunt their credentials, but when something relevant to their work comes up they mention it, and it always surprises me how many there are. It seems a good sign for the health of the community.

As a casual reader of HN, thank you for your efforts.

Just because it's true doesn't mean it isn't misleading. Consider if they had the same title but only analyzed projects with a single contributor or pet projects.

By only including open source projects they've limited the ability to generalize their conclusions, which they themselves allude to in their "threats to validity" section. Changing the title simply better reflects the content of the article, which the researchers didn't do (sometimes this is due to external pressures such as university PR departments, but I doubt that is the case here).

"The 80/20 rule applied to Github projects"?

>> This study is totally unsurprising and totally non-generalizable to professional teams working on proprietary software.

I agree that it's unsurprising but disagree that it's non-generalizable.

Companies need to break large projects into small subproject teams and one or two people on those teams need to be deeply involved in the subject matter of the subproject.

Also, companies tend to treat developers as disposable entities. If you treat your engineers like crap (and don't reward them with company shares), then your company will never be able to keep any 10x engineers. Your company's products will always be mediocre, buggy, slow to develop (you will need a lot more tests)...

The "disposable entities" part reminds me of sprints where everything is broken down into 2-3 week chunks. A deliverable is expected in that time. It can be a tiny part or bigger part but must be a complete "thing" with unit tests and what not.

It tracks with the idea that programming is like factory work.

They provide a lot of references in the article to other research that make similar claims from both proprietary and much larger open source projects. At the very least the "Hero model" seems to be as prevalent outside open source as it is in open source.

The purpose of the paper was to test and infer from a larger dataset, which open source provides more easily than proprietary software.

That being said, it really seems like the research was highly guided by an agenda, and I'm not sure how much one can take from this research.

Have you ever watched a git graph visualization? Some committers are different: they usually start with a small module, which is accepted and starts a whole chain reaction of follow-up rewrites and the sudden appearance of previously unfinished additions.

> A "hero" project is one where 80% or more of the contributions are made by the 20% of the developers.

If that 20% leave, would the other 80% contribute equally, or would the 20/80 ratio be preserved?

This seems like a feedback loop. Once a developer has written more code than the others, they are in a better position to write even more. If that is the case, "hero developer" is a floating position that depends heavily on your skill, how long you have been working on a project, and other circumstances.

> Organizations should reflect on better ways to find and retain more of these software heroes.

I had this discussion recently. If you create a team of "heroes", they all become way slower than when they were in different teams.

My personal conclusion, and the way of working that I try to implement at work, is that small independent projects allow developers to be fast and reliable. If your project is correctly divided, anyone can contribute (at their level of expertise). If you have a big, complex project, then the old-timers who originally wrote the code are the only ones who can work with it efficiently, and only if no other developers make big changes (which reduces the ability of other developers to contribute, independently of their level of skill).

So, before going in search of "heroes" think if your projects are correctly divided. And think about the risk that you create when a "hero developer" is needed and leaves.

I recently left a company organized in this way. It was quite disheartening: small changes would break the code in multiple ways, hundreds of public functions, no interfaces. The architect was seen as a ‘super’ developer...

Perhaps an ideal scenario would be to have a single 'super' developer have relatively free rein, under the condition that she can also explain the code/architecture to others effectively.

Yeah, definitely not an ideal scenario.

A good developer should write code that is easy for both computers and humans to understand. Unfortunately most firms seem to have an 'As long as it works, it's good!' policy.

And how many of those were untested? Most I imagine.

Who designs the project and "correctly" divides it? Some hero?

Someone who has that job, like an architect, technical director, or CTO, with the help of the people who need to work with the code. No heroes, no working extra hours, no crunch time, just people with different functions.

The "hero" phrase refers to the devs doing 80% of the commits.
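For anyone who wants to check a repo against that definition, here is a minimal sketch. It assumes you already have a commits-per-developer mapping (for example, tallied from `git shortlog -sn`); the function name, the sample data, and the choice to round the 20% group up to at least one developer are my own assumptions, not the paper's method.

```python
def is_hero_project(commits_per_dev, top_percent=20, threshold_percent=80):
    """Return True if the top `top_percent` of developers made at least
    `threshold_percent` of all commits (the 80/20 "hero" definition)."""
    counts = sorted(commits_per_dev.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return False
    # Assumption: round the hero group up so it always has at least one developer.
    n_heroes = max(1, (len(counts) * top_percent + 99) // 100)
    top_commits = sum(counts[:n_heroes])
    # Integer comparison avoids floating-point edge cases at exactly 80%.
    return top_commits * 100 >= threshold_percent * total


# Hypothetical projects, made up for illustration only.
maintainer_driven = {"alice": 85, "bob": 5, "carol": 4, "dave": 3, "erin": 3}
evenly_spread = {"alice": 20, "bob": 20, "carol": 20, "dave": 20, "erin": 20}

print(is_hero_project(maintainer_driven))  # True: 1 of 5 devs made 85% of commits
print(is_hero_project(evenly_spread))      # False: the top dev made only 20%
```

Note this counts commits only, so contributions like reviews or issue triage would not show up.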

Have you found projects where work is spread thinly across many developers successful?

Having worked on many, many software projects myself, this is 100% accurate. One good programmer who is intensely focused and interested in a project is worth 20 - 50 (or more) programmers who are just code monkeying it without really giving a shit. This is why a small, passionate startup team can consistently build better products than large corporations. Much better products.

There is no substitute for caring, and people consistently, dramatically underestimate the difference in thought that happens in people who care and people who don't. Anytime I'm asked why someone should listen to me and not someone else I tell them "because I care". It is really all it takes to separate from the incompetent, so few understand what caring does.

Absolutely. I'm sure there are many reasons for this, but a big one I always see is attention to detail. People who really care are obsessed with getting all the details right, and hate bugginess, bad UX, and other issues typical of crappy software. Big projects with no real owner may have lots of programmers, but none of them care so much that they want to get in the really deep weeds to make the thing a truly amazing software experience.

And that is because there is always more to care about, just as true love has no end.

As someone who has generated 99% of the code- not counting StackOverflow- for a project, it honestly is a double-edged sword- I can't rely on anyone else to fix anything, but when I break things the impact is rather minimal.

> "This is why a small, passionate startup team can consistently build better products than large corporations."

I am curious if there is a way to test this scientifically; large teams make some very good products.

What if you have three groups: One with a small software startup team that, by some criteria, is judged super passionate about the product. Another from a corporation where at least one lead programmer is super passionate about the product (usually the "owner" in agile talk), and another corporate project with a shit ton of programmers, but no clear passionate programmer in love with the project.

I find it silly they designate these as heroes. Seems simpler to say that most projects are small enough that a single person can do the majority of the work. Not shockingly, most smaller projects don't fail that heavily.

Add in the extra work that collaboration necessarily brings to the table, and it isn't that shocking that things both slow down and get more error prone. At a certain size, though, this is worth the cost of admission.

More, it isn't like we call all projects done by fewer individuals heroes. We have authors, not "writing heroes."

And the author joke is meant to point out that the vast majority of what you read has a single person's name on it, but without the editors and other workers involved, that book would never have come about.

Yes, but that brings into question how large a project needs to be for the hero model not to be feasible anymore. They surveyed over 1,100 projects and I would guess a significant portion of them would be counted as "large". I found their raw data on GitHub here, https://github.com/ai-se/Git_miner, although it is in some weird .xlsx format so I can't verify if my guess is true.

.xlsx is a Microsoft Excel file. I bet you can open it with Google Docs, LibreOffice, or something similar.

You can also change the extension to .zip, unzip it, and read the individual XML files from the xl/worksheets/ subdirectory. However, because of the horrible schema in those XML files, going that way is probably a really bad idea. It's way better to use the software that the parent mentioned.
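If you just want to peek inside without renaming anything, Python's standard `zipfile` module opens an .xlsx directly, since the file is an ordinary ZIP archive. A minimal sketch follows; the helper name is my own, and the archive here is a tiny stand-in built in memory (a real workbook has more parts), so nothing below reflects the actual contents of the Git_miner data.

```python
import io
import zipfile


def worksheet_parts(xlsx_file):
    """Return the names of the worksheet XML members in an .xlsx archive.

    `xlsx_file` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(xlsx_file) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.startswith("xl/worksheets/") and name.endswith(".xml")
        )


# Stand-in archive so the sketch runs without a real workbook on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr("xl/workbook.xml", "<workbook/>")
    archive.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")
buffer.seek(0)

print(worksheet_parts(buffer))  # ['xl/worksheets/sheet1.xml']
```

This only lists the sheet files. Actually decoding cell values from that XML is the painful part, which is why a spreadsheet program is the easier route.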

The "hero" phrase is fine. It just refers to the devs doing 80% of the work.

This is really fascinating.

If true, it suggests that big companies with huge teams that try to force the entire team into equal contribution metrics are creating stress that might not help the team, nor lead to fewer bugs (big surprise).

I've never been on a team which was set up to truly recognize the value of different forms of contribution, like writing tests, or removing code, or simplifying it, or other things which would not count towards the hero metric presented here. As an industry we just don't think about alternative narratives to "let's make sure everyone is an equal contributor..."

Yeah that’s a very interesting observation, and sort of corroborates my own anecdata. More specifically, when projects are in their infancy, and there’s a lot of risk associated with working on them, I feel that management tends to stay away, which is how these heroes originate. Then, after it’s obvious that the project will succeed and its merits are overwhelming, you see this shift in project governance/responsibility towards a more corporate mode. However, like you said, that’s probably counterproductive and just creates stress and tension. If anything, I think sometimes management has an uncontrollable tendency to position itself strategically in these gatekeeper roles, whereas in open source they don’t get to do that simply by virtue of being someone’s boss, so you don’t see that same trend, as the paper argues.

Not sure if that makes sense, I quickly typed this on my phone.

That is the untalented who are interested in the rewards of success moving to gain what is possible to get from talent. They have none, so they have to see it manifested and then they capture it for their benefit. They have no other access to the good stuff. All industries, all human endeavors, work this way. All power structures, all groups.

Code Reviews are also devalued, to the point of most developers I worked with considering it a chore.

That's a great point. The cost of doing that is never calculated and the impact is so positive.

> let's make sure everyone is an equal contributor

that's also because they don't want to pay the "hero" 10x than the other developers

It's the second time I mention Fred Brooks on HN today, but he dedicates a chapter in Mythical Man-Month that agrees with the assertion of the article:

> Much as a surgical team during surgery is led by one surgeon performing the most critical work, while directing the team to assist with less critical parts, it seems reasonable to have a "good" programmer develop critical system components while the rest of a team provides what is needed at the right time.

Other team members perform other tasks, and some of those even administrative (!), but all of them supporting the "vision" of the surgeon.

IMO this is a healthier way of communicating roles, avoiding fights and training juniors. It's better than the "everyone is replaceable" mindset companies have these days.

Surgery is a dangerous comparison to make. Most surgeons get to their seniority by doing a lot of surgery. Most of which are near identical to each other.

Programmers? We, for some reason pride ourselves on doing something new every time around, it seems. I've argued before with my leadership that, if we really wanted something done better/faster, we shouldn't be promoting the folks that did it out of the position for the next go around. Rather, we should have them do it again. And then again. And then again. Not only would they be learning from their experiences, but benefiting much more directly.

> I've argued before with my leadership that, if we really wanted something done better/faster, we shouldn't be promoting the folks that did it out of the position for the next go around

This is similar to what Fred Brooks suggests in the "Surgical Team" chapter of "Mythical Man Month", although he's a bit harsh, I'll grant that:

> if a 200-man project has 25 managers who are the most competent and experienced programmers, fire the 175 troops and put the managers back to programming.

EDIT: We were also talking about it yesterday here at HN: https://news.ycombinator.com/item?id=19763562

And don't forget to fire those other managers who never wrote a single line of code.

Or anyone who thinks in terms like 'programmers' and 'resources'.

Brooks's idea was even more radical:

Put non-coders to work under the chief programmer. Let them help instead of control.

That makes total sense to me. Having engineers in charge of the challenges that require engineering.

> We, for some reason pride ourselves on doing something new every time around, it seems.

You can't copy and paste a successful surgery from one patient to another.

> I've argued before with my leadership that, if we really wanted something done better/faster, we shouldn't be promoting the folks that did it out of the position for the next go around.

I'm confused by this - why are you (re)building the same thing multiple times?

Why are you not building on what was done before, regardless of who did it? It seems to me that either the people choosing to start over are making the choice irresponsibly, or the previous work that got someone promoted actually wasn't that great (not extensible/reusable/documented/etc). In both cases, I'd be asking management some tough questions about what they've done and/or allowed to happen.

The ideas are built on. The code? Our industry is constantly modernizing things in the name of technical debt. Often, the call gets made to make a new offering to replace the mistakes of the old.

And I'm, frankly, calling bs if you are claiming this as not something that happens regularly. Just look at all of the articles that constantly crop up of the form "why we left technology x for y".

I don't pride my team on doing something new when there is a tried and true way of doing it. Only when there is a need for a new way that the tried and true way doesn't cover and is a bottleneck.

From my experiences, this attitude is in a large minority.

I'm tempted to say this isn't bad, either. It is amazing what people can do retreading topics.

I agree, but time and place.

Completely agree. Thomas Pornin told me he wrote 10 TLS implementations before writing BearSSL. And it shows.

Fred Brooks agrees with that. That's what he says in The Mythical Man-Month, referring to prototypes:

"Plan to Throw One Away"

What if the surgery analogy doesn't compare favorably with an entire project? How about a code review comment or an email thread?

Microsoft has an idea like this - with the same surgical analogy. Sometime during the MS Word initial development process they tried it out - and the "surgeons" loved the idea, but everyone else hated it - especially as the half fleshed out ideas from the surgeons turned out poor or wrong.

I think the skill differences between individual contributors are never as great as you think.

cf "Software Heroes" book - has a wireframe cowboy hat in the front cover. v good

Joel Spolsky talks a little about this: https://www.joelonsoftware.com/2000/10/04/painless-functiona...

Microsoft's idea was very different from what Brooks describes in his book.

Microsoft approach was more about having senior programmers telling juniors what and how to do their job, aka micromanagement.

Brooks is about having programmers actually coding (and documenting), and having other people handle the other important tasks, such as clerical non-programming work, developing tools, language research, version management, adversarial testing, helping with documentation, etc. Some of the tasks he describes are obsolete these days, but the book is from the early 70s.

According to him, doing that "relieves programmers of clerical chores, systematizes and ensures proper performance of those oft neglected chores, and enhances the team's most valuable asset — its work-product".

Except that you're usually building knowledge specific to the organization in which you're working. IMO, you should avoid monopolizing vision and knowledge. I see this as different to the surgical team, where each patient is a different "project".

Communication and knowledge should be considered part of the product. You often times still need these kinds of developers to deliver even this aspect. Basically someone who groks the project so thoroughly that they can synthesize disparate pieces of information into something holistic and absorbable.

Yeah, because at a certain level it's not monopolizing knowledge, it's just not letting people without ability act as though their lack of ability is inconsequential and their agendas matter. In surgery they don't have lack of ability or agendas.

Building good documentation and sharing knowledge is crucial for such teams to work, according to Brooks. Brooks is all about (good) communication. The "Surgical Team" chapter comes right after the "Communication" chapter in MMM.

And I don't think that having other members of the team supporting the vision of the leader is monopolizing the vision... it's just being focused on a single goal.

There are trade-offs to everything - generally I haven't seen the best results from attempts to make people interchangeable, to put it mildly.

Let me chime in. I've been consulting on crypto(graphic) projects for years and systems that are developed by one person are way way clearer and simple than systems developed by multiple developers. Things like TLS become a mess once multiple developers start collaborating. It's weird. My theory is that one dev working on a project can refactor it many times during development, whereas this would be an insane thing to do when collaborating.

> systems that are developed by one person are way way clearer and simple than systems developed by multiple developers

Brooks called this "conceptual integrity":


Yes. After many years I have exactly the same experience that you do.

Programming intent and vision is hard to communicate, so sharing code is always a compromise. It's much better to work with smaller units of code that communicate with each other using simple, clear and testable APIs.

Yep. That's what makes Golang so awesome. One person can own one package.

What’s special about Go packages, as opposed to packages in other languages?

Why would major refactoring be insane when collaborating? We do this regularly in my team in early stages of projects and only when really required once projects become stable. But these projects are usually not about crypto, so I might miss the difference here.

You can (should) also give leadership to juniors on appropriately sized projects. That's how you make better programmers.

> also give leadership to juniors

You don't give leadership - you assign ownership, and hope leadership evolves from those responsibilities. Leadership is nurtured by helping the junior navigate issues, holding a vision for them long enough until they can realize it on their own.

There's a reason good leadership is so rare and appreciated, and it's not because it grows on trees.

Simon Sinek has a great presentation in regards to that topic if someone is interested.

Famous "Do you love your wife?" question and more.


You should give leadership/ownership/responsibility of smaller components of the larger vision to juniors or people new to the team.

Most companies want more business value; they don't care for better programmers.

Next time said companies complain about a talent shortage, they would do well to remember TANSTAAFL.

Because I had to look it up: TANSTAAFL = There ain't no such thing as a free lunch


Have that ingrained in my mind since reading Abrash forever ago.

You mean Heinlein?

Never read Heinlein.

I know it is reification, but most companies end in bankruptcy.

In all seriousness I have noticed that "business value tunnel vision" is one of the surest ways to run a company into the ground in many if not most fields and Wall Street keeps pushing it anyway to nobody's benefit. The fundamentals of the field are what you have to care about or else you will have a really bad time.

Yes, and investing in employees pays dividends.

Many businesses are poorly invested and put themselves at a disadvantage.

This sounds like a question for management. I suspect the healthy attitude is to not even care what a good managerial approach is: simply show up, do whatever the easiest thing is that lines up with management priorities, and leave, managing the rest of your career outside this. Unfortunately I feel like some managers become enraged if you “check out” of doing their managerial duties for them.

there's one issue that large companies face, after two/three years all the juniors in the "surgery team" model start looking at position of seniority within or outside the company, causing high concentrated turnover in such built teams.

the surgical team is great in a crunch but long term you need to stagger your turnover to keep competency on the project high.

this isn't as relevant for teams that build projects the way they did in the MMM era; today companies tend to build service software that has maybe the same shelf life but gets continuously improved instead of being dropped whole and left in maintenance mode.

That sounds like a secondary management problem: having other projects lined up. Although that isn't new - one or two years is already an effectively encouraged time to look for new jobs, because raises don't keep up and changing jobs is the current de facto way to raise salaries. Seems like a needless expense and hassle to all involved, and yet oddly it seems strongly preferred.

> after two/three years all the juniors in the "surgery team" model start looking at position of seniority within or outside the company, causing high concentrated turnover in such built teams

Is that a problem?

I mean, the same thing happens with literal surgeons/doctors and residency, where their early years of career eventually end and they expect a mid-level position in the organization.

It sounds like a healthy industry that juniors/apprentices work under journeymen and masters at a large organization early in their career, then move up or branch out after a few years.

I'm not sure why companies expect to be able to cost-optimize away a functioning economy, but remain in business.

while moving on is natural for the person, to a software company it's a costly proposition and quite different from surgical teams, because while humans are all more or less similar, projects are wildly different, so there's less interchangeability.

also, while surgeons tend to specialize in one of the many fields, programmers that rise in a "surgical team" tend to become of the "full stack" variety

> not sure why companies expect to be able to cost-optimize away a functioning economy

I'm with you here. I'm not stating this is the optimal way to handle the issue, just reporting the patterns I saw at the many workplaces of all kinds I've been in.

Maybe I've just been unfortunate/fortunate, but in 6 years as a software developer on all kinds of projects, I've never once worked on a project where even two people share the major contributions to a single repo. Either I run away with one part of the project or someone else does while I work on some other part. Usually split between frontend/backend/devops (webstuff) or app/libraries/tooling. We obviously plan and discuss how the different parts of the projects work together, and we do make contributions to the other repos, but these are usually minor.

I don't necessarily see it as a bad thing. It just feels the most natural, reduces friction and the need to communicate minute details. Ownership is also important. The downside of this is that some part of the project might fall apart if someone leaves, because no one else can take over.

> The downside of this is that some part of the project might fall apart if someone leaves, because no one else can take over.

This is the reason, i think, corporate world HATES with passion such devs. Even tho they need them to meet deadlines.

It also gives a bit too much power in negotiating to the employee which is also seen as "bad" in corpo.

> This is the reason, i think, corporate world HATES with passion such devs.

Which is the right attitude, imho. Even if that person loves the company and would never leave - what if a bus runs them over?

On a large scale, consistency is more important than excellence. You need to be able to plan: "if that guy stays, the project will be done in 4 weeks, but if he leaves, it will take four years" just won't work, because you cannot rely on it. It's much better to be able to say "it will be done in 12-14 weeks" with high confidence. Sure, your whole dev team might get run over and spoil your plans, but that's really a case of force majeure.

Okay, so let's be optimistic and say a team of 4 can do in 12 weeks what that one dev could do in 4 weeks. 48 dev-weeks vs 4 dev-weeks. Do you think it's reasonable for a project to cost 10x as much just for that precious consistency? Not to mention that longer projects with larger teams have higher variability in outcome, with most projects overshooting by 100% or more. Is this really the right attitude then?
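To spell out the arithmetic in that comparison: the sketch below just multiplies headcount by calendar time, using the numbers from the comment above. The helper name is mine, and it assumes every dev-week costs the same, which is a simplification.

```python
def dev_weeks(team_size, weeks):
    """Total effort in dev-weeks: people multiplied by calendar weeks."""
    return team_size * weeks


team_effort = dev_weeks(team_size=4, weeks=12)  # 48 dev-weeks
solo_effort = dev_weeks(team_size=1, weeks=4)   # 4 dev-weeks
cost_ratio = team_effort / solo_effort

print(team_effort, solo_effort, cost_ratio)  # 48 4 12.0
```

So the team option comes out at 12x the effort, which is in the same ballpark as the roughly 10x figure being argued about.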

> Do you think it's reasonable for a project to cost 10x as much just for that precious consistency?

Somewhat, yes. You want to be able to plan. Inefficiency is often the price for being able to plan. Since it's often not only about this one project, a higher price is just fine - you may have to coordinate with marketing, legal, sales, manufacturing etc, then the total price will often be so large that the 10x of the software project doesn't really matter, but if it falls flat and gets everything delayed, the damage may be huge (think global launch date missed).

Not to mention, if it's well written and documented, it's not impossible or even necessarily all that hard for another talented dev to ramp up and take over...

What they hate is losing power. They want power that they can only get by using the power inside others. Hate is just the energy to justify oppressing and using other minds for things theirs can't do. Also known as capitalism. What they hate is not owning and controlling their own little world.

I like non-ownership style development. Don't get me wrong, I also enjoy ownership :-) I will guess, though, that the feature size that you work on is fairly large. For example, if someone wants something done, you go away for a week or so and do it. This is the style of programming that I grew up with so I'm quite familiar with it.

I changed to a different style of programming where we break the "features" into much smaller parts called "stories" (yes, I'm one of those -- XP developer). I try to make stories about the size of a day... If you've been doing the week (or more) kind of stuff, it seems impossible to organise work at this level of granularity. In reality, it is usually possible to break things up into very small vertical slices. However, it is admittedly quite difficult.

The small slices really enable non-ownership on the team. There are a host of very nice benefits for non-ownership. It's actually a bit hard to explain, unfortunately.

First, non-ownership tends to force you into conflict early. This doesn't sound great at first (and for some teams, it is definitely not going to work). However, it's usually better if people on the team sort out how they want to work early so that they can develop a good rapport instead of silently begrudging each other. Everybody has strengths and weaknesses and by being able to work closely together you can capitalise on that. I actually only resort to ownership if it becomes obvious that some people can't work together.

Swapping people in and out of code also results in them having to get up to speed in it. If the code is easy to understand, then people are happy and get up to speed quickly. If not... then ideally you want people to refactor the code. Umm... see the part about "conflict early" LOL. After you have a good rapport going, you'll have programmers refactoring each other's code and you'll have them be thankful for the result (honestly!). Yeah... I know.... You'd probably like what I've been smoking ;-) But it's true.

This constant vetting of code and refactoring leads to a much simpler code base. The final result is that you are able to maintain your production over long periods of time. On the best projects I've been on, we've actually improved our perceived productivity as we've added more functionality.

This seems odd, because it's the opposite of what you'd expect -- more functionality means more lines of code which means more complexity which means slower going. However, if you think of modern OSes, languages and frameworks, we are building on huge piles of code. But the reality is that these underlying systems undergo massive churn because of the amount of use they get. In the end we get software we can build on that is stable and simple to use -- the more underlying code, the faster we can build new systems.

In my opinion, one of the most important aspects of getting to this seeming utopia is to avoid code ownership. It definitely doesn't always work and it's definitely not good for some people and some teams. But when it works, it's pretty damn amazing.

I hope that provides a bit of counterpoint to your experiences. I still like ownership, though... That's why I have my own private projects ;-) I'm pretty loath to share them with anyone!

> A "hero" project is one where 80% or more of the contributions are made by the 20% of the developers.

Where does this definition come from? It sounds quite typical to me. Even in large projects like the Linux kernel, if you break them down into comprehensible units and subsystems, I would expect you'd find essentially the same thing.

The vast majority of projects likely meet the "hero project" threshold defined above, whether they're good or bad. I don't know of any movement to specifically prevent having primary contributors; the main interest from the corporate world is ensuring that primary contributors can be cycled as necessary.
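To make the quoted threshold concrete, here is a minimal sketch of how one might test a project against it. It assumes "contributions" means commits and that the "top 20%" is rounded down to a whole number of developers (at least one); the paper may count both differently. The function name `is_hero_project` and its parameters are made up for illustration.

```python
import math
from collections import Counter


def is_hero_project(commit_authors, top_share=0.20, contribution_share=0.80):
    """Return True if the top `top_share` of developers made at least
    `contribution_share` of all commits.

    `commit_authors` is a list with one author name per commit
    (an assumed input format, not something taken from the paper).
    """
    counts = Counter(commit_authors)
    if not counts:
        return False
    # Size of the "top 20%" group, rounded down but never below one developer.
    n_top = max(1, math.floor(len(counts) * top_share))
    top_commits = sum(c for _, c in counts.most_common(n_top))
    return top_commits / sum(counts.values()) >= contribution_share


# Ten developers: two of them account for 90 of 98 commits.
skewed = ["a"] * 80 + ["b"] * 10 + list("cdefghij")
# Ten developers with ten commits each.
even = [name for name in "abcdefghij" for _ in range(10)]

print(is_hero_project(skewed))
print(is_hero_project(even))
```

With a long tail of one-commit contributors, as is common on GitHub, almost any maintained project would pass this check, which is the point being made above.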

This doesn't seem like news in any way. I don't want to be unduly dismissive as I haven't read the paper, but I think I'm catching a distinctive whiff of "publish or perish" in the vicinity...

This is a poor argument attacking a straw man. Each of the cited papers discussing "hero" projects is set in the context of medium to large organisations, while open source GitHub projects are frequently self-contained tools or libraries. Equating the two is a monumental and unsupported jump.

I'm glad I'm not the only one who thinks this is nonsense. I wrote a fairly long comment, which you (might) find elsewhere in the discussion, then started reading the other comments and was starting to wonder what other people were seeing that I wasn't.

Short version: the OSS versus every other type of software project comparison seems very apples and oranges.

This isn't an argument, it's a description. I suspect you didn't read the paper.

I wonder _why_ hero projects are more bug-free. I could probably write lots of terrible but bug-free code on a personal project, because I know where all the flaws are and avoid them. If I invited more developers to work on this codebase, of course there would be more bugs, because the code quality is so bad. If I wrote good abstractions, types, and documentation, theoretically others contributing to the code would write fewer bugs too.

> and documentation

That's the gist of it. Abstractions are always imperfect in some way and leave the door open to interpretation. If team members don't communicate each module's intent efficiently, code responsibilities get mixed and entropy rises dramatically, causing more bugs per line of code written.

imho an important figure on large projects, one that is largely ignored or conflated with "the architects", is the oracle: the person people ask where a new feature or piece of code should go. Not in terms of application structure or component design, but at a lower level, at the level of modules and interfaces.

"Hero" is a bad name for it, but one or two people on the team need to _really_ understand what it is that needs to be done, and the management needs to understand that those people _understand_, and support them. At least 80% of any given team either don't understand what needs to be done or don't give a shit, or both. Leaving such folks to their own devices is a very wasteful and counterproductive exercise.

This topic has caught my attention before. I'm aware of dozens of projects where this "hero developer" phenomenon occurred, and it's really interesting (some I worked on, some I heard about from other people).

I ask myself why that happens. No doubt variation in developers' talents is a factor. But I think there are some social forces operating there as well.

I have become the hero of four projects after the previous hero had left. I have also joined projects with pre-established heroes.

Although I've been able to be a reasonable hero whenever I ended up in that position, the established heroes I worked with at other times were condescending towards me and the rest of the team. There has always been a tendency to patronize and to be a plain jerk. That was annoying.

I knew I could contribute more and maybe even help to herorize the project. But, simply put, the minimal social conditions for that to happen weren't there.

Maybe the concentration of heroes would dilute a bit if social conditions improved.

This is the 80/20 rule, except applied to the proportional contributions made by a group of developers on a given project, rather than how time is allocated relative to problem solving related to software development...

You know, a similar argument could be made for history... that 80% of societal progress was created by 20% of the individuals living in that society...

"The real hero of programming is the one who writes negative code." - Doug McIlroy

"[T]he meaning of negative code is taken to be similar to the famous Apple developer team anecdote (i.e., when a change in a program source makes the number of lines of code decrease ('negative' code), while its overall quality, readability or speed improves)." - http://en.wikipedia.org/wiki/Douglas_McIlroy

"When the Lisa team was pushing to finalize their software in 1982, project managers started requiring programmers to submit weekly forms reporting on the number of lines of code they had written. Bill Atkinson thought that was silly. For the week in which he had rewritten QuickDraw's region calculation routines to be six times faster and 2000 lines shorter, he put "-2000" on the form. After a few more weeks the managers stopped asking him to fill out the form, and he gladly complied." - http://www.computerhistory.org/highlights/macpaint/

"One of my most productive days was throwing away 1000 lines of code." - Ken Thompson

"Measuring programming progress by lines of code is like measuring aircraft building progress by weight." - Pete Kirkham

> A "hero" project is one where 80% or more of the contributions are made by the 20% of the developers. ... We identify the heroes developer communities in 1100+ open source GitHub projects.

My scepticism has been piqued to the extent that I call b######t.

Open source projects. Projects contributed to by people in their spare time, or perhaps with some corporate backing with developers on the payroll, and often with quite different governance models.

Should we therefore be surprised at the widely varying levels of contribution? Should we be surprised that those who do more work on the projects tend to make contributions resulting in fewer bugs, perhaps because they have a much better understanding of the projects due to their greater familiarity with them?

Whatever: OSS projects with contributions coming in from all manner of sources outside the core team aren't necessarily terribly similar to having a team of 6-8 working on a project full time within a company. The only similarity is that a relatively small core team does most (or all) of the work.

Colour me not even slightly surprised. My blunt conclusion is that the idea software projects "need" so-called "hero" developers is extremely questionable based on the data presented[1].

[1] Granted I've done no more than skim the paper but this comment, from page 3, amused me: "Most prior researchers deprecate heroism in software projects." Really? Do they now?

You're calling bullshit on a paper defining the terms that it is using in the paper?

The headline's been changed since I submitted.

I was calling bullshit on the fact that it appeared to be drawing general conclusions about all software projects based on a survey of open source software projects, which often operate under different conditions to those in commercial organisations.

The abstract and title of the paper are somewhat baity, as others have pointed out, but the headline here now reflects the fact that the paper only represents OSS projects hosted in GitHub.

It would be interesting indeed to apply the same study to commercial products.

The result itself is unsurprising, but I'm curious about what caused their low bug rate ("code quality" in their terms). They haven't looked at the actual code, but I guess that code consistency plays a big role in reducing potential bugs, and it's easily achieved by having fewer developers in a team. Also, this does not contradict sayings like "a lot of eyeballs make software better", because I believe eyeballs mainly contribute to bug detection, not bug prevention.

Edit: After posting this, I dug into this problem a bit more. I think code consistency is less important between modules (in a large project). But this requires the premise that these modules are independent enough and don't need much communication with each other. This line of thought is pointed out by this SO question. https://softwareengineering.stackexchange.com/questions/1906...

There is certainly some survivorship bias involved. The filter (50+ weeks of activity, 8+ contributors, etc) only selects somewhat successful software projects.

Big professional software development also emphasizes reliable planning, requirements engineering, and documented processes. In contrast, most open source projects begin in a cowboy style with a single person starting to code without requirements and processes and often not even much of a plan. This lightweight process implies a lot more risk and indeed most of these projects fail. That is fine. The free software world rather goes with this evolutionary approach.

I wonder if there is any company which works in a similar way. Start lots of projects with a fraction of the resources. Be fine with most projects failing. I guess not, because this approach is hardly compatible with career development. Google's 20% rule might be considered a step in that direction.

I believe the conclusion is basically true, but I also feel that sampling from the Github project showcase list induces a selection bias that is going to support it. The analysis would be more convincing if the authors had gained access to some enterprise codebases whose contribution graphs were more evenly distributed than their open source counterparts.

In larger projects, each module or discrete functionality works like what's being described in the article. There will always be a handful of programmers, most likely coordinating other programmers, who intimately know that module. That is OK, and on many occasions development and intervention, from concept to implementation and then test and release to production, can be sped up. But it must be taken into account that it also lowers the so-called truck factor (I like to call it the lottery factor: a developer wins the lottery, becomes a multimillionaire overnight and leaves the project, orphaning all the parts of which he had exclusive knowledge).

I guess then the goal is to find balance between having heroes and increasing the lottery factor of the team.
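As a rough illustration of the trade-off, here is a minimal sketch of one simplified way to put a number on the lottery factor: the smallest number of departures that would leave some module with nobody who knows it. The module-to-experts mapping, the names in it, and both function names are hypothetical, and published truck-factor measures are more involved than this.

```python
def lottery_factor(module_experts):
    """Smallest number of departures that orphans at least one module.

    `module_experts` maps a module name to the set of developers who
    know it well (an assumed, hand-maintained input). Under this
    simplified definition the answer is the size of the smallest
    expert set.
    """
    if not module_experts:
        return 0
    return min(len(devs) for devs in module_experts.values())


def orphaned_modules(module_experts, departed):
    """Modules whose every expert is in `departed`, sorted by name."""
    gone = set(departed)
    return sorted(m for m, devs in module_experts.items() if set(devs) <= gone)


# Hypothetical team: alice is the only one who knows the parser.
experts = {
    "parser": {"alice"},
    "ui": {"alice", "bob"},
    "db": {"bob", "carol"},
}

print(lottery_factor(experts))
print(orphaned_modules(experts, {"alice"}))
```

Adding a second expert to the least-covered module raises the number, which is one way to read "finding balance" here.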

It's obvious that the projects with the lowest bug counts are maintained by one person with a vision. Projects that accept any drive-by pull request without serious scrutiny inevitably suffer code quality and consistency issues. It is often more important to know what features to exclude from a project than to know what to include. The "heroes" know the history of the project and are responsible for a consistent design.

Apart from the questionable choice of data to analyze (open source has its own specifics), the study proves only that heroes are good in hero projects. Perfect.

Probably tangential, but I think this ties in nicely with the study about the effectiveness of Reddit deleting toxic subs (1). Leadership does in fact matter. But at the same time, all politics are local.

(1) https://news.ycombinator.com/item?id=19770237

Very interesting. But that probably doesn't translate to professional projects, where people have more incentive to contribute and core contributors have the obligation to onboard and mentor new hires...

"...In 2018, Agrawal...studied...661 open source and 171 in-house proprietary projects. In that sample, over 89% of all projects were hero-based..." In theory, a "professional" project would want efficiency, with all developers contributing fairly equally; however, the data does not support that. Wide variances in developer productivity are well documented. I would think open source projects also have to work to onboard and mentor new developers, or they would eventually die out. You might say that professional projects are possibly larger, but the study indicated size didn't matter: "...the bug injection distributions of heroes and nonheroes is barely changed after stratifying the data according to project size. Hence, when discussing the external validity of these conclusions, we need not explore issues of team size and hero prevalence." I don't see any data that suggests this study would not generalize to professional projects.

If you think that "why" someone contributes is meaningful, I'm curious how you think that professional projects (whatever that means) and whatever arbitrary metric is being used to encourage incentive would give a different result.

For instance, on a minuscule open source project of mine, I wrote all the code and another developer has (kindly) stepped up and managed upgrades to one of the dependencies, bumping the version number when it's time to do so.

If he and I were employed together to work on this project, a workload distribution of me writing and maintaining the entire codebase (minus one line) and him coming by every couple months to update a version number would not be a very fair or realistic division of labor.

The open source development process is very different from anything else. None of my projects at work have a long tail of contributors submitting comment typo fixes.

It'd be much more fair if the project at your job started out as a proof-of-concept that you wrote, while the other dev worked on other things.

Once your proof-of-concept grew into something management wanted, you'd probably still be writing a lot of the code, and the other developer would be contributing, but in the realm of 20% of the codebase, at least to start (and according to this article, that wouldn't change).

You're right, and I don't doubt the value of studying open source projects or that it's the best that can be done given the data available, but there are limitations on how much light this data can shed on the great mass of code being written within organizations.

Maybe the results do apply to the situation you describe, I'd be curious to know that.

I'd say two things. The first is that if you're not paid, you don't have much incentive to contribute, so it's no surprise that most open source projects are maintained by a handful of (great indeed) highly motivated developers. The second is that it's widely known that developers are not flexible, so if they're in charge of their baby, they won't make much effort to encourage outside contributions. In a corporate environment, by contrast, managers and other devs should make sure this doesn't become an obstacle to contributions.

That's not very precise but I think people can relate to what I said.

I can't particularly relate. There are things that I am paid to do and there are things that I want to do... sometimes those two things overlap, sometimes they don't. Money is one way to maybe produce incentive... it's not guaranteed and it's not the only way.

I can say for sure that the things that I want to do will always be of better quality on average than the things that I am paid to do.

I can certainly generalize that humans are not flexible, but I don't think that's limited to programmers.

Indeed. I don't think it's wise to dismiss the corporate aversion to 'heroes'. While it's certainly possible to make an outsized contribution without getting burned out or devaluing other contributors, that's often not what happens.

devaluing other contributors

Maybe this isn't what you meant, but why is it that human beings see someone they know producing a lot of value as diminishing the value produced by others, especially in technical matters?

Because people compare, and if someone is doing a lot more than everyone else it makes everyone look bad.

Why? Because the MBA will see a metric elevated and use that to define a norm. Now they only want to hire 10x performers. They don't know how to find them so they invent this complex hiring process that filters out so many good candidates they have no chance of ever finding that 10x developer. They settle on the fake 10x who speaks well and presents all of the right signals. The dev comes in and struggles and blames the code/past developer/way something is done and claims a rewrite is needed in a new and trendy language. This prompts a huge rewrite that takes place over a year, at the end of which it's determined that nothing works and that project gets cancelled. By this point that developer has been promoted to your manager.

Isn't this paper saying exactly the opposite, though? That it does happen, and quite often?

What is the opposite of a hero project? In my 10 years of programming experience, I haven't seen a single project that doesn't have a BDFL-style person in charge...

Did we really need a study to come to this conclusion?

Heroes are called maintainers, or creators.

This is every project. This is what this study concludes. Does the world not understand this?

All good teams revolve around stars. Talented brains can create more value than two hands can, and the organization exists to help them win. Nothing new.

I feel this is the Pareto principle at play, which is a very common phenomenon.

This is likely also one of the core reasons why microservices are so successful these days. They allow having many small hero developers who create replaceable components fast.
