Bus Number – The GitHub plugin my coworkers asked me not to write (scannedinavian.com)
277 points by todsacerdoti 29 days ago | 117 comments



This is one of the features of https://codescene.com/

It looks for knowledge islands and relates those to frequently modified code, to identify hotspots, or areas of high risk due to low knowledge distribution in areas of high change.
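
To make the idea concrete, here is a minimal Python sketch of "knowledge islands crossed with change frequency". It is not how CodeScene computes anything; the input shape (a list of `(author, paths)` commits), the function name `risky_hotspots`, and the `min_changes` cutoff are all assumptions made for illustration.

```python
from collections import Counter


def risky_hotspots(commits, min_changes=3):
    """Return frequently changed files that only one person has touched.

    commits: iterable of (author, [paths]) tuples (hypothetical input shape).
    Result: list of (path, sole_author, change_count), most changed first.
    """
    changes = Counter()   # path -> number of commits touching it
    authors = {}          # path -> set of authors who touched it
    for author, paths in commits:
        for path in paths:
            changes[path] += 1
            authors.setdefault(path, set()).add(author)

    islands = [
        (path, next(iter(authors[path])), count)
        for path, count in changes.items()
        if count >= min_changes and len(authors[path]) == 1
    ]
    # Sort by change count (descending), then path for a stable order.
    return sorted(islands, key=lambda row: (-row[2], row[0]))
```

A real tool would presumably weight recency and the size of each contribution; this version only counts commits.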

Another use is if someone hands in their notice you can easily see all the code that only they know, so your handover planning is mapped out easily.

I’ve never thought of it being used maliciously, it’s for visibility. It would be a shitty manager that would use it that way and if they’re already shitty then this tool won’t change that.


>I’ve never thought of it being used maliciously, it’s for visibility. It would be a shitty manager that would use it that way and if they’re already shitty then this tool won’t change that.

You are a member of the intelligence community of a country, let's call it Tussia, which has been locked out of the leading kernel for military hardware in the world. Let's call that Kinux.

You know that the guy down the office has started a project to fork that kernel for your country's own internal usage. You're an overachiever and want a promotion before he gets one. You call acquisitions for 8 female agents with special training for intimacy with nerds, you also make a backup call for 8 doses of polonium in case the agents aren't successful.

In case you think the above is fiction I know a CEO of a unicorn startup who got the first part of the treatment when he was looking for seed funding.


Yes, honestly that's so glaringly obvious that the author of the tool really ought to stop dismissing the criticism out of hand and take it seriously.

He's built a tool that generates hitlists for any competitor to use.


Did it work?


To protect the guilty, and any future books I might write on the subject, I'll leave it to your imagination.


Don't forget the movie...


You'd basically be writing a treatise on the Peter Principle, anyway ..


> I’ve never thought of it being used maliciously, it’s for visibility. It would be a shitty manager that would use it that way and if they’re already shitty then this tool won’t change that.

I've had three jobs where Pluralsight Flow was introduced. At two of them, the managers immediately started using the metrics for feedback, performance reviews, employment decisions. At the third, the developers saw this coming a mile away and refused to engage with or evaluate the tool.

Unfortunately, the absurd pricing of these tools means that people who approve them have to get some sort of ROI. Since they don't have a good way to measure productivity/output/knowledge silos, they instead turn to "Well Jose had less PRs this week..."


Excellent points about visibility, as long as you can keep it in that domain.

But this always lurks in the near shadows:

>>I’ve never thought of it being used maliciously, it’s for visibility. It would be a shitty manager that would use it that way

Therein lies the problem, on both sides. It would just become another arms race, as the developers would use it to identify and move into target project areas/components to get themselves on the list of un-fireable workers. Ideally, the workers would work together to ensure that the truck_factor was zero, i.e., none of them could be fired.

Of course all of this rapidly becomes a (nearly) complete waste of time, proving the blogger's friends' original point: >>"My coworkers said it would immediately hit Goodhart’s Law."


An outside firm might use this to help a company conduct layoffs. Manager could be amazing


Amazon has these numbers easily accessible as reports on their code systems runnable at any manager level, and many other ways to inspect what the team is doing and the risks you might have. I find them useful.

Bus factor is one way to think of it. Another is it lets you spot silos, or engineers who aren't working with others, or places where you can't as easily move engineers around (so you can fix that).

Some developers fear fungibility, they think that that one system only they know is job security. I see it the other way, I see that as a technical risk, but also a thing that might be keeping a great engineer from working on more important projects. Or the way to work on something else when you get fed up with that one system you hate.


> Some developers fear fungibility, they think that that one system only they know is job security. I see it the other way, I see that as a technical risk, but also a thing that might be keeping a great engineer from working on more important projects. Or the way to work on something else when you get fed up with that one system you hate.

I don't fear fungibility - if a place doesn't want me there I don't want to be there either.

I dislike the idea of fungibility because it creates huge overhead and avoids using talented people to their full potential, precisely because they are not fungible.

In some places that's warranted, but in others the process overhead is a far greater risk to your project's success than the bus factor.


I know a dev who was denied a transfer and promotion because he couldn’t easily be backfilled.

Spoiler: he left the company within 3 months.


Each department in my company can designate someone as a "critical man" so they can't change teams. However, those people usually get the highest possible ratings and raises. I've only seen it used maybe one time.


This is the other reason I want people to be somewhat fungible... so management can't shoehorn them. Really the ability to move is a good thing for both the company and the employee. Moving doesn't HAVE to happen, but it's really much better when everyone has options.


> Some developers [...] think that that one system only they know is job security

"You can't be promoted if you can't be replaced."


But a promotion also means change, maybe people are happy where they are? After all, you're most productive in a familiar codebase. Then again, if a codebase is "done", productivity is less important and often work dries up. There's plenty of people that are happy with working 10 but getting paid for 40 hours a week.


This. I am not a manager, but I LOVE writing code and solving problems. I would NOT love being a manager.


Some companies have parallel career progressions for individual contributors and managers. For example, Amazon has a distinct career progression for software development engineers (SDE): SDE I, SDE II, Senior SDE, Principal SDE, Senior Principal SDE. Sr Principal is a VP level position, but has no direct reports. It’s an advisory role.


I've noticed that, while these tracks are still technical, you will still find yourself in more and more meetings and writing less code. I think everyone needs to come to terms with whether or not they want a promotion regardless of the track.


Senior Principal engineer here (not at Amazon). Yeah :-(


At Amazon, if you don't get promoted, you get pipped, eventually.


Only if you aren't at a terminal level already, and you are easy to replace. When I was there one could find plenty of SDE IIIs (and a smattering of SDE IIs) who had found nice quiet niches of the codebase to hide out in for 5 years or more...


Who needs promotion? I just want more money


Perhaps another framing of the same idea is that you can’t work on other (bigger and better) things if you can’t hand your work off to someone else.


Article author here: I've always said I want to be able to go on vacation and not get a call.

I like your framing better!


I don’t want to frame it positively. In my experience, people who hoard knowledge and/or relationships (as with other teams or customers) are toxic.

I hate all sorts of gatekeeping behavior but I especially despise having to wheedle information or effort out of someone like that. They almost invariably have an ego that is out of proportion to their contributions.

Side note: Occasionally I have seen people who take on the thankless task of being a maintainer of something no one else wants- those people tend to welcome others and share as much as others are willing to listen to, so being the single person who knows something doesn’t make you a gatekeeper.

It’s the gatekeepers that I hate.


honestly every system I saw someone corner themselves in didn't take very long to have someone else figure out. you just toss some smart motivated folks at it. paging them every time something breaks also seems to help the knowledge transfer. I know it motivated me to get in there and do a gut rehab.


> Some developers fear fungibility, they think that that one system only they know is job security

And this is pragmatic for an employee whose primary goal is not optimizing Amazon's income.


I've spent my entire career trying to make myself and every other cog-head as replaceable as possible. Part of this is because that's the job in large parts of digitalisation, but another big part is because it's annoying to deal with knowledge silos. One of the reasons we use Typescript for a lot of our back-end stuff is because it's the one language that our small team has to know. This means our front-end dev can go on a vacation and actually disconnect because other people can cover. It also means it's less of a pain when someone changes jobs.

It has never once been an issue and I think fungibility is a part of any healthy system. I spent a few years on the other side of the table, and one of the first lessons you learn in management is that everyone is replaceable and that's solely a matter of cost. Because of this, high levels of knowledge may also work against you as management will likely work on reducing the risk you present. Another thing you'll learn is also how random layoffs can be, especially if they happen for economic reasons.

That being said. I avoid working at places with these silly metrics. The more red tape you put in the way of good work the less likely I'll want to work with you. There are a lot of reasons for this but the primary one is that these sort of things tend to create work cultures which just aren't good for productivity and quality as people "game the metrics" instead of doing good work.


At the end of the day the engineer has a valuable skill and you're the one telling them they need to be more replaceable. You're the enemy. Of course you're going to think the tools that let you dehumanise your employees at scale are good things.


Hard disagree. I think a huge part of our job as engineers is to build systems that can outlive us and comfortably change hands (without the next team cursing the former).

Maybe this is born from spending so many years in Amazon (with its high turnover and near-quarterly re-org shuffling), but what's getting called "replaceable" here I'd call "writing maintainable software."

The goal is to get knowledge out of your head and into the codebase so everyone can reap the benefits. Knowledge hoarding is lame.


> Why does gnu parallel use all the cores when we tell it to use only 8? We only saw eight git clone processes at a time, but a large number of git index-pack processes that maxed out all 32 cores on my laptop. I’m guessing git’s index-pack is a forked subprocess and allows parallel to start another git clone?

Parallel is running 8 git clone jobs at once (as asked for) and each git clone is starting as many index-pack threads as it wants. Temporarily setting pack.threads to 1 (via git config) would help here.


This is an increasingly common problem: two layers, both trying to utilise the entire machine by parallelising by the number of CPUs/cores, and so the inner layer gets N² threads/processes. Being quadratic growth, this means that the problem becomes worse the more CPUs you have. With 32 cores, 32² = 1024, though in practice Parallel was told 8 so it’d cap out at only 256 index-pack processes. But that’s a lot of memory needed to support that, especially given that it’s doing no good.

The solution is for only one of the two layers to parallelise.
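
The arithmetic above can be sketched in a few lines of Python. The helper names are made up for illustration, and the budget split is a variation on the advice rather than something stated in the comment: it assumes the outer layer is able to tell the inner layer how many threads to use. When the outer layer already runs one job per core, the split collapses to a single inner thread, which is the "only one layer parallelises" case.

```python
def worst_case_threads(outer_jobs, inner_threads):
    """Total inner threads if every outer job fans out independently."""
    return outer_jobs * inner_threads


def inner_budget(cpus, outer_jobs):
    """Threads each outer job may use so the total stays within `cpus`.

    Hypothetical helper: integer division, but never less than one thread.
    """
    return max(1, cpus // outer_jobs)
```

For the numbers in this thread, 8 outer jobs on a 32-core machine would leave a budget of 4 inner threads per job instead of 32.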

—⁂—

Given that you talked about pack.threads, here’s its description from `man git-config`:

> Specifies the number of threads to spawn when searching for best delta matches. This requires that git-pack-objects(1) be compiled with pthreads otherwise this option is ignored with a warning. This is meant to reduce packing time on multiprocessor machines. The required amount of memory for the delta search window is however multiplied by the number of threads. Specifying 0 will cause Git to auto-detect the number of CPUs and set the number of threads accordingly.


>The solution is for only one of the two layers to parallelise

If you have a common scheduling API, you can manage this much more elegantly. For example make(1) can control the concurrency level across recursive invocations by using a "job server" https://www.gnu.org/software/make/manual/html_node/Job-Slots...

With this, you can have, for example, 3 top level subprocesses, each spawning multiple threads of their own, but never exceeding the CPU count.

Alternatively, parallel could make its subprocesses think that they're running on a 1-core machine, although this may have some subtle side effects.
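
As a rough illustration of the shared job-slot idea, here is a Python sketch in which every layer draws from one pool of slots. It is only an in-process analogy: GNU make's job server coordinates separate processes, while this uses a semaphore between threads. The class and function names (`JobSlots`, `outer_job`, `run_all`) are hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class JobSlots:
    """A shared pool of job slots; leaf work must hold a slot to run."""

    def __init__(self, slots):
        self._sem = threading.BoundedSemaphore(slots)
        self._lock = threading.Lock()
        self.running = 0
        self.peak = 0  # highest number of leaf tasks seen running at once

    def run(self, fn, *args):
        with self._sem:  # blocks until a slot is free
            with self._lock:
                self.running += 1
                self.peak = max(self.peak, self.running)
            try:
                return fn(*args)
            finally:
                with self._lock:
                    self.running -= 1


def outer_job(slots, n_inner):
    # Each outer job fans out freely, but only slot holders do real work.
    with ThreadPoolExecutor(max_workers=n_inner) as pool:
        futures = [pool.submit(slots.run, time.sleep, 0.01) for _ in range(n_inner)]
        return [f.result() for f in futures]


def run_all(total_slots, n_outer, n_inner):
    """Run n_outer jobs, each with n_inner workers; return peak concurrency."""
    slots = JobSlots(total_slots)
    with ThreadPoolExecutor(max_workers=n_outer) as pool:
        list(pool.map(lambda _: outer_job(slots, n_inner), range(n_outer)))
    return slots.peak
```

In this sketch the outer jobs never hold a slot themselves, so they cannot starve their own workers; only the leaf tasks compete for the pool.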


Reminds me of a Zig proposal I saw recently to make the std.Thread.Pool API respect the Linux jobserver and macOS dispatcher automatically out of the box:

https://github.com/ziglang/zig/issues/20274


Those are some really interesting proposals. I agree that a lot of code ignores resource cleanup, especially when it comes to driver-userspace communication. For example, driver authors just implement a netlink/ioctl system which manages persistent state in kernel space, even though they could bind the state to a device file descriptor, which automatically gets cleaned up upon process exit (but still can be handed to another process, if really needed)


The solution is a global scheduler (the one thing that Kubernetes does well). It would be aware of what every program needs, not just in terms of CPU, but also memory, devices, time etc. and split the available resources fairly.


> Temporarily setting pack.threads to 1 (via git config)

Or just use `git -c pack.threads=1 clone`: https://git-scm.com/docs/git#Documentation/git.txt--cltnameg...
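
For anyone driving the clones from a script instead of GNU parallel, a minimal Python sketch of the same idea follows. The `git -c pack.threads=1 clone` form comes from the comment above; the helper names, the default of 8 concurrent jobs, and the `(url, dest)` input shape are assumptions for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def clone_command(url, dest, pack_threads=1):
    """Build a clone command with pack.threads set for this invocation only."""
    return ["git", "-c", f"pack.threads={pack_threads}", "clone", url, dest]


def clone_all(repos, jobs=8, pack_threads=1):
    """Clone (url, dest) pairs, at most `jobs` at a time.

    Returns the exit code of each clone, in input order.
    """
    def run(repo):
        url, dest = repo
        result = subprocess.run(clone_command(url, dest, pack_threads), check=False)
        return result.returncode

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run, repos))
```

Passing the setting with `-c` keeps it scoped to the one command, so nothing in the user's git config needs to be changed and restored.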


Thank you, didn't know this. I'll add it to the article.


I don't like this take. This post is for any engineering leader.

The Bus Factor is how hard your team would suffer if you - or anyone else on the team - got hit by a bus.

Ideal Bus Factor for all team members is 0. This might sound counter-intuitive at first, almost like "make everybody expendable", but it's quite the opposite and kind of the point.

Teams should be good enough that they are a) autonomous and b) there are no mysteries. In the ideal state, everyone understands how everything works. New employees should hit the ground running and be able to produce value immediately. Departing employees should feel comfort in knowing that there are no unknowns.

An ideal team with 0 BF across the board is desirable. It means that team members are fungible. It means that every single team member can fill in the gaps if someone is ill, or on vacation, or actually leaves or is removed.

More importantly, a 0 BF is a reflection of simplicity. The software, its build/test/deployment pipelines, documentation, and support, should all be cohesive and coherent. Siloing information in team members is bad, everyone should be able to build and deploy.

0 BF is a healthy metric, but it is absolutely 100% not measured in email rate, commit rate, PR rate, lines of code, timeliness, GitHub heatmaps, etc. Those metrics indicate nothing at all. Quite the opposite. They are harmful, awful metrics.

Measuring people by these metrics is just monkeys on typewriters. More startups need to hear this.


> The Bus Factor is how hard your team would suffer if you - or anyone else on the team - got hit by a bus.

> Ideal Bus Factor for all team members is 0. This might sound counter-intuitive at first, almost like "make everybody expendable", but it's quite the opposite and kind of the point.

I've always heard bus factor described in the inverse fashion, as in "how many people would need to get hit by a bus for the project not to be able to continue", with the optimal number being the same as the number of people on the team. It sounds like the idea is the same, but I'm surprised to find out that the number people use to convey the concept isn't always the same.


A number between 0 and 1 can easily scale to whatever company you have. A number that is different depending on how many are on your team is harder to compare between teams. I guess it depends on if you ask a programmer, administrator or mathematician what a logical system would be :-)


For sure! The simplicity of having the same ideal bus factor for all sizes definitely appeals to me, but maybe due to familiarity, I think I prefer the well-defined units from the version I cited. It's a bit of a https://xkcd.com/883/ situation in terms of how bad your imagination is for what "maximum suffering" would be.


> Teams should be good enough that they are a) autonomous and b) there are no mysteries. In the ideal state, everyone understands how everything works. New employees should hit the ground running and be able to produce value immediately. Departing employees should feel comfort in knowing that there are no unknowns.

I've worked on projects where we had engineers who were one of a countable handful of people in the world with their particular skillset.

The bus factor was most certainly 1 at that point.

> More importantly, a 0 BF is a reflection of simplicity. The software, its build/test/deployment pipelines, documentation, and support, should all be cohesive and coherent.

For projects which push the frontiers of what is possible, simplicity isn't an option. (Granted these are a small % of overall software projects that exist!) When something has never been done before, you aren't worried about keeping the code As Simple As Possible, you are worried about how the hell you can even do this particular thing.

I'm not saying the code should be low quality! However sometimes doing hard stuff involves complex code, and maybe a couple generations later people have figured out design patterns so the hard stuff can have less complex code, but that may be a decade down the line!


In the wise words of Grug:

> grug understand all programmer platonists at some level wish music of spheres perfection in code. but danger is here, world is ugly and gronky many times and so also must code be

https://grugbrain.dev/


If you only build things that are so simple that anyone can understand the code on day one and you don't need any domain knowledge, then what is your value proposition?

If the most complex thing you can build is a todo-app, then I think you don't produce much value to society.


Being able to write code that is able to be understood by someone new to the project is a skill set. It is certainly one that is not universal. And it is certainly one that should be admired. Solving hard problems in the simplest way, with clear information about why/how it works is one of the most important skills of a developer, imo.


Not day one, but anyone should be able to follow the code using a debugger and understand it. If you write spaghetti code segments, it's high time to change them.


If you ask society what has helped them the most, you will be surprised to learn how many claim the todo-list (paper or app in whatever time frame) is their main way of actually getting anything done.


Um, I would argue that what has helped society the most is agriculture, sewage systems, healthcare, electricity and heating, etc. All of which are technological innovations.

Also, how many variations of a to-do app does the world need?


Dental care and air conditioning (more generally, refrigeration) are probably my top 2.


Agreed. People who are good at their jobs and confident in what they do actively try to make their Bus Factor as low as possible. If you have a high Bus Factor that means your employer keeps you because of what you have done in the past, not your potential to do great stuff in the future.


> Ideal Bus Factor for all team members is 0.

The military thinks that way. They expect to lose people and keep going.


"You see, killbots have a preset kill limit. Knowing their weakness, I sent wave after wave of my own men at them until they reached their limit and shut down." - Captain Zap Brannigan


The military operates that way with 99% of their personnel, who are grunts, expected to only ever follow orders, to never think for themselves. They're expendable cannon fodder - think of them as pieces of hardware in a software company. But with the 1% at the very top (basically just generals), I'd say the bus factor comes into play, same as in any other organisation - certain individuals have all the knowledge of certain domains, and if enough of those individuals are taken out, the wheels grind to a halt. That's why targeted assassinations happen to the top brass.


Sure, if you manage to assassinate the entire command chain, things will go pear shaped.

I dare say you could likely assassinate half the command chain, and the military will still manage to get where they need to be, when they need to be there. Military command chains have levels of redundancy that civilian organisations wouldn't dream of.

As a concrete example, it's estimated that the British lost ~40% of their officers in the Battle of Albuera, and they still managed to repel Napoleon's forces.


Not really sure what the point is meant to be here. I’m not sure if this is news to anyone.


From the original paper the key argument is:

> Our estimation relies on a coverage assumption: a system will face serious delays or will be likely discontinued if its current set of authors covers less than 50% of the current set of files in the system

An author of a file is defined as a user who has made significant (based on pre-calculated weights) contributions to a file
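
Under that coverage assumption, a truck factor estimate can be sketched in Python roughly as below. This is not the paper's implementation: the weighted authorship step is skipped (the `file_authors` mapping is assumed to be given), and removing the author who covers the most files at each step is a greedy choice made for this sketch.

```python
def truck_factor(file_authors, threshold=0.5):
    """Count authors removed before file coverage drops below `threshold`.

    file_authors: dict mapping file path -> set of author names
                  (hypothetical input; authorship is assumed precomputed).
    """
    total = len(file_authors)
    if total == 0:
        return 0

    remaining = {path: set(names) for path, names in file_authors.items()}
    removed = 0
    while True:
        # A file is covered while at least one of its authors remains.
        covered = sum(1 for names in remaining.values() if names)
        if covered / total < threshold:
            return removed

        counts = {}
        for names in remaining.values():
            for name in names:
                counts[name] = counts.get(name, 0) + 1
        if not counts:  # nobody left to remove
            return removed

        # Greedy step: drop the author covering the most files
        # (ties broken alphabetically so results are deterministic).
        top = max(sorted(counts), key=counts.get)
        for names in remaining.values():
            names.discard(top)
        removed += 1
```

For example, if one person is the only author of three out of four files, removing that person alone drops coverage to 25%, so the estimate is 1.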


Thanks, this is the key info I was looking for here.


On the one hand, I would be very surprised if this sort of dashboard metric didn't already exist in some enterprise software somewhere. My management at a previous company literally asked me if we could do a daily report on who sent and received the most email in our department.

I refused to do it because I didn't like where I saw it going, but another coworker did. Unsurprisingly, the person who got the most email and sent the most email was the Sysadmin whose email account was set as sender for all the automated emails from the various servers, as he would literally email himself hundreds of alert emails a day, on top of all the crap newsletters and digests he subscribed to.

On the other hand, all of your coworkers asking you to not do something that could potentially impact their jobs, and you do it anyway as a hobby project? Sounds like kind of a jerk move.


I didn't write it in 2015 when my coworkers asked me not to.

I do want to see if open source software I use has spread the knowledge to increase survival.

I refused the jerk move!


Nice!


Kudos to you. Bullshit metric.


> As you go up the career ladder, developers should do more review and less hands on keyboard.

A common misunderstanding in tech companies, I think. You don't want to exchange great developers for mediocre managers.


Yep, we have a tech lead who has great code writing abilities. But he has terrible leadership, feedback, etc skills. He would be much better as a senior dev instead of a tech lead. I can't imagine how miserable his team would be if he was a manager.


If you think that code review is “management” then that’s very very concerning.


Well, it certainly isn't "development". It's a hoop we jump through so that management can tick a box labelled "all commits were reviewed by another engineer" on the SOC2 audit...


That’s a really weird way to view code review.

Everywhere I’ve been part of development, code review is part of ensuring code quality, catching bugs, and avoiding silos.


Just going to second this. Good code reviews (not just typo nitpicking) can be a great way to simplify down code, and spread knowledge horizontally across the org. Not to mention catching bugs.

Unit tests aren't a substitute because unit tests check that the success paths are good. That's a good start, but it's not the same as verifying all the possible ways code could go wrong in a complex system, and one of the cheapest ways to spot those problems is with people familiar with the complex system looking at new code.

Code review gives you the double benefit of building more people who understand the whole system, and having the code looked at by people who understand the whole system.


If a non-trivial percentage of your code review feedback is about code quality and bugs, you are severely underinvesting in autoformatters/linters/strong type systems/testing/continuous-integration. It's not a cost-effective use of (expensive) software engineers to have them scanning every PR on the off-chance they notice a typo.

I'll grant you they can help break down silos, but the question you should be asking, is why your codebase is so convoluted that silos are developing in the first place?


The sad irony of this is that it's still the wrong question.

When startups have to do layoffs, the question isn't "who can I fire and keep the existing business going" it is instead "who is the team I want building the next version of the product quickly enough to not go out of business"

But every fork in the road is just exactly that, and not choosing a path quickly enough has been the death of many


How is this ironic?


CPAN has been tracking the Bus Factor for a long time now. For example, https://metacpan.org/pod/Moose shows a Bus Factor of 5 in the left column info.


Also, we did document how we arrive at the number: https://www.olafalders.com/2021/06/30/cpan-bus-factor/


We like to call it the lottery factor. Would the project keep going if someone won the lottery and took off to an off-grid tropical island.

Less morbid that way.


People who win the lottery give their 2 weeks notice, and you can call them on the phone afterwards if you really have to.

People hit by a bus disappear immediately.

It's just not the same thing.


Once the big tech employers started having security march employees out of the office the moment they handed in their notice, a lot of folks stopped giving them two weeks notice.

Also, it's not guaranteed that your management will actually tell you if they did - one employer asked me not to tell my team I was leaving until the last day for morale...


Being marched out isn't universal. A household company [left out of FAANG, but still huge] let me actually do a hand off during that time


A good employee will want to hand off his projects and take questions, but you need to be ready if he can't.


Ye using sudden death as a euphemism for changing employer leaves a bad taste.

It is like when people complain about people using sports metaphors. I say that at least it is not military metaphors.

Also, there is usually no handover anyways. So the suddenness factor is not that important.


> Ye using sudden death as a euphemism for changing employer leaves a bad taste.

Well, see, we can't just straight-up tell employees that we're not going to give them promotions or raises, so they'll have to jump ship in a year... That'd be a disaster for morale!


I'd argue that it is easier to get hit by a bus, so you have to factor that in.


True, it's about 1000 times more likely to get hit by a bus, but only 5% of those are fatal. So it's not that far off.

And still less morbid. :)


I tend to phrase it as hit by a bus OR jump on a bus to another company.


Don't know anyone who won a lottery big, or got hit by a bus, but know several people who quit cause of crypto riches.


It's just gallows humour.

To me "lottery factor" seems overly po-faced and pious.

I prefer bus factor because... memento mori.


From TFA: >In 2015 or so, my employer had layoffs. One of them was the only contributor to part of the codebase that made money for our company.

Maybe I missed it, but I didn't see the author mention what came of this. I'm very curious: did someone else take it over, or did the employer go down in flames?


I imagine the employer limps on with the current codebase and/or is spending beaucoup bucks to rewrite the codebase while still using the old one.


Probably, but I'm always hoping to hear a story about a company laying off a highly valuable employee(s) (usually for cost-cutting reasons) and then going bankrupt as a result.


I really don’t like the term "bus factor" because I lost someone in a bus accident.

On the other hand, millions of lives have literally been lost on various hills on various battlefields, yet no one seems to mind when someone says, "this is the hill I’m willing to die on." :)


These days the "politically correct" phrasing is:

"Lottery winner"

Someone wins the lottery.

(I am old - I start with the poor bus driver scenario - but then try to inject the new phrasing into the conversation - as typically I am the only technical person on my team and they begin to rely on me for EVERYTHING)


i worked a software job that sometimes involved literally going to the busyard to debug on-board equipment. people at that job used the phrase attrition risk rather than bus factor because the latter was just a little too "real".


Not only is this a metric that doesn’t make any sense, simply floating the concept that this metric corresponds to anything in real life is harmful.

Teams are social constructs, and you simply cannot apply an algo to some observable code metric and get any kind of proper result. People leave, others step up, or don’t. End of story.

The problem with these kinds of metrics is that even if people know the numbers are ridiculously off, some still think that the idea they convey is correct, i.e. that if x people were to leave, the project would stall. That premise is simply not true.

Don’t even think about this stuff. It’s stupid. If you want to know more about these risks, talk to the team. People on the team know if anyone is irreplaceable.


If you wanted to view this from a darker side it could also be titled:

How to generate a hit list to hurt a global economy


True! The original 2015 article I found on Linux Weekly News had comments that worried about that https://lwn.net/Articles/651384/

I think Linux is safe because everyone uses it.


There's a very quick fix to this, which also radically improves the strength and potency of any organization:

Do not ever accept management directives from someone who couldn't do your job.

Yes, you heard me. If your manager cannot do what they are asking you to do, fire them immediately.

This has the added bonus of ensuring that the Peter Principle doesn't become a major mountain instead of a mole-hill.

This does not mean that the manager has to do your job. It does mean that making sure your manager knows how to do your job is your responsibility and something that you, yourself as a worker-cog in a large machine, actually do control.

I have shipped software all over the planet in all kinds of markets for all kinds of users. The most successful project is composed of individuals who have great deals of trust in each others' ability to perform, and who share the load in ways adequate to the task. In the most successfully managed projects, those managers who I knew could do my job, but didn't (because I did it), were absolutely the best to work for .. whereas those who had no idea how to wrangle a single line of code, yet gave themselves the altitude required to be a 'manager' were, across the board, a catastrophe.

BTW, if you feel 'seen' by this comment - i.e., you are a manager who feels a bit imposter-ish - don't worry, this is the Peter Principle at work, and you can easily fight this by better communication with the folks whose work you cannot do.


The Codescene product provides an in-depth analysis of knowledge distribution over time:

https://codescene.io/docs/guides/social/knowledge-distributi...

I recall seeing the Linux kernel repo analysis as a showcase, but I can't find it anymore.


The title and the first paragraph mention a plugin, but the article is not about a plugin. It is about trying to replicate some old results, and they've failed at that since they didn't 100% respect the method of the original authors.


Post Author here! mclare posted the other half with visualizations: https://mclare.blog/posts/the-bus-factor/

Check it out!


It would be cool if you guys produced a graph of truck factor vs number of GitHub stars.

It could roughly model the importance per team member or the net impact it would have on the "stargazers" if the maintainers were hit by a truck.


Could happen?

mclare published the other half of the blog post with nice graphs a few hours later: https://mclare.blog/posts/the-bus-factor/

I'd like to build a tree of open source dependencies and find out which parts need the most knowledge spreading.
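
A rough sketch of what that could look like, assuming you already have a dependency graph and a truck factor per package from somewhere. The package names, the numbers, and the helper names (`transitive_deps`, `riskiest_first`) are all made up for illustration; nothing here comes from the blog post or its tooling.

```python
from collections import deque

def transitive_deps(graph, root):
    """Return every package reachable from root (excluding root), breadth-first."""
    seen = {root}
    order = []
    queue = deque(graph.get(root, []))
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue  # already visited via another path (or a cycle)
        seen.add(pkg)
        order.append(pkg)
        queue.extend(graph.get(pkg, []))
    return order

def riskiest_first(graph, truck_factor, root):
    """Sort root's transitive dependencies by truck factor, lowest first.

    Packages with no known truck factor sort last (an assumption; you
    might prefer to flag them instead). Ties are broken by name.
    """
    deps = transitive_deps(graph, root)
    return sorted(deps, key=lambda p: (truck_factor.get(p, float("inf")), p))

# Hypothetical data: who depends on what, and a truck factor for each package.
DEPS = {
    "my-app": ["web-framework", "json-lib"],
    "web-framework": ["http-core", "json-lib"],
    "http-core": ["tls-shim"],
}
TRUCK = {"web-framework": 5, "json-lib": 1, "http-core": 2, "tls-shim": 1}

if __name__ == "__main__":
    for pkg in riskiest_first(DEPS, TRUCK, "my-app"):
        print(pkg, TRUCK.get(pkg, "unknown"))
```

The packages at the top of that list would be the first candidates for knowledge spreading. Weighting by how many things depend on each package would probably be the next refinement.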


Wonder how many managers are thinking about the bus/lottery numbers of all the open source projects their developers are relying on, and then doing something about it.


This is what I was most interested in with this project (especially in light of https://opensourcepledge.com/). We didn't run it against any new libraries, so it would be interesting to see the state of the most starred libraries.


The Nebraska problem really is terrifying, I wonder how to get it in front of the eyes of CEOs and the like.

PS: I suggested to LWN https://lwn.net/ that they might want to contact you about the Linux numbers.


Step 0. Document everything in an organized manner, create runbooks, and forbid knowledge hoarding and manual processes only (so-and-so) knows.


While it is the platonic ideal for operating an existing service, this is also a great way to kill your velocity if you are an early-stage project which needs to deliver results yesterday, and/or pivot on a dime, in order to stay funded...


Or kill your velocity by playing with creaky, fragile hacks only you understand that duplicate code and are a huge mess. I've seen countless wannabe startups that stupidly accumulated so much technical debt with a "let's worry about that tomorrow" techbro attitude that they had a relative velocity of zero when asked to make minor changes.


Building for rapid iteration is a skill like any other, and a lot of engineers who work in big companies never learned it.

Blindly staffing one's early stage startup with BigTech engineers is a pretty good way to tank a startup.


so where is the "plugin" mentioned?

I want to see for funsies what my projects look like, but I don't want to run some random crusty Java.


"Bus" and "Truck" terminologies invite you to imagine your coworkers being crushed to death. This isn't great.

I've always preferred the term "lottery factor". If this employee won the lottery (almost certainly leaving their day job behind), would you be able to survive their sudden departure?


Over 30-ish years of work, I’ve lost about 0 coworkers without warning or transition to lotteries or similar events, and more than a few (whether permanently or for an extended period) to unexpected sudden death, injury (traffic related being particularly common), or other unheralded misfortune (law enforcement involvement, on a short out of country trip and then encountered visa issues, whatever.)

“Bus factor” is a more realistic reflection of the threat model, and it resonates more than “lottery factor” with the experiences that anyone who has been working more than a short time probably has.


Put a positive spin on it. We don’t talk about coworkers getting hit by a bus, we talk about them winning the lottery and quitting on the spot due to their newfound fortune.

Same discussion but it’s less morbid, and doesn’t end up sounding like you’re prioritizing the health of your project above the literal lives of your employees.


The implications are different. There has to be a high euro sum you can offer to get someone who won the lottery to still brief a colleague. You can throw money at a dead person all you want but that will not have the same result. It is a sudden end in knowledge that can not be restored with time or money (like running out of lottery funds)


Also, the idea that the only reason people work on the project is that they need the money surely seems less appealing than that the only thing stopping them from working on it is death? Or are we now just okay with saying the quiet part out loud and acknowledging the exploitative nature of the economic system?


> Also, the idea that the only reason people work on the project is that they need the money

No one has presented this idea. You’re arguing that we should stop paying employees, because they will obviously continue to work unless they are only working because they need the money?

Saying that if someone wins the lottery they’ll likely leave to pursue the opportunities that bag of money will present to them is not the same as saying the only thing keeping them at the job is money.


You don’t need to plan for people leaving the company if you have an emergency reserve big enough to literally compete with the lottery payout-wise. If someone leaves you can just hire a couple of hundred people to replace them and still save money.



