
This was basically my life with building Ansible for three years. Granted, I loved many aspects of it, but it was really hard and it took a major toll. (clarification: this post has absolutely nothing to do with why I am not working on it anymore)

I've been considering a bit of a blog post on this (particularly the unknown parts of wide scale OSS projects), but basically open source projects get harder as you have more contributors.

Time to review code properly (and to be overly friendly and helpful in doing so) can often completely eat into your ability to write and architect code properly, which strips out the ability to do the design that needs to be done. (And as you're doing this, taking maybe 15 minutes per patch, the odds of someone fixing the submission are, I would guess, about 25%, and you might need to do a couple dozen of these a day.)
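To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It reads "a couple dozen" as 24 patches, which is an assumption; the 15 minutes and the 25% are the guesses from the paragraph above, and `review_load` is just an illustrative helper.

```python
# Back-of-the-envelope review load, using the rough guesses above.
MINUTES_PER_REVIEW = 15   # "maybe 15 minutes per patch"
REVIEWS_PER_DAY = 24      # assumption: "a couple dozen" read as 24
FIX_RATE = 0.25           # guessed odds that the submitter fixes the patch


def review_load(reviews=REVIEWS_PER_DAY, minutes=MINUTES_PER_REVIEW, fix_rate=FIX_RATE):
    """Return (hours spent reviewing per day, patches expected to get fixed)."""
    hours = reviews * minutes / 60
    expected_fixed = reviews * fix_rate
    return hours, expected_fixed


hours, fixed = review_load()
print(f"{hours:.1f} hours of review per day, ~{fixed:.0f} patches likely to be fixed")
```

Under those guesses, that is about six hours a day of review for roughly six patches that come back fixed.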

People can get annoyed at filtering out of decisions unless you overcommunicate at a rate that is probably 5-10x what is expected in normal business conversation. (I was already communicating at a rate that was probably 3-5x what most developers do on the lists, and generally got crucified for it, with myths building up about my character, etc.)

Often decisions have no right answer, there's a good and a bad, and either decision will irk someone, and you'll get negative blog posts about what you are doing in either direction. You optimize for the thing that will help the most people, and the one guy who doesn't have his obscure itch scratched will assume you are deliberately ignoring him. You eliminate a problem user so you can concentrate on 1000 people and doing real work, and then others jump all over you after the micro-incident (that they only saw part of).

GitHub makes some things harder -- the issue tracker doesn't have issue templates, so you have to overcommunicate the need to fill out proper templates, and people also throw code at projects prior to asking what the code should be. I love GitHub for the OSS explosion it has enabled, but it is a chaotic way of managing a project as that project gets big.

It also makes it very obvious when a project is buried behind a lot of incomplete contributions, but similarly doesn't provide good tools to sort and manage large numbers of incoming tickets. It makes it very obvious when those ticket numbers build up, and arguments happen on closed tickets.

Twitter is probably one of the worst things, as there's a lot of passive aggressiveness on it. Twitter is an "argument machine". It doesn't provide context, but it does provide a place to rally a mob with pitchforks over the slightest perceived offense that often should not have been an offense.

Often you don't want people leaving trash on your lawn, but if you edit out the offensive things, the same people claim you are censoring them.

My advice is don't try to acquire users too fast. The "Go" project recently said something like "we had to open source this once it was a certain way along". My conditioning from Red Hat was "do everything in the open", but that was not really something Red Hat always did, as they had a lot of projects with low contribution numbers.

My ultimate feeling is that good projects have good central direction. Contributions can be good, and making a project around a base that encourages lots of shallow contributions can be a very successful strategy, but if you do that, you end up doing a lot of custodial work and can't always hit the goals you want to hit.

While it wasn't true in the last 5 years, now I feel the code is more important than the contribution process, and focusing on that allows the users to get a better experience. Users matter a lot, and folks trying to contribute matter a lot, but I don't like the way the inherent focus on contribution turns the creator of a thing into a project manager and a PR manager, and takes away their ability to innovate on the thing.

Being able to work on code is great, but I'd still want to see contribution structured around a mailing list. Strongly encourage talking about code/ideas before submission, but most people will not read it and will submit directly anyway.

I think part of my problem was the barrier to contribution was really low (and that was great) because it was pretty modular at the smaller ends, and we quickly got overwhelmed. I like to thrash very complex codebases for not being contributor friendly, but the breathing room would have been nice.

I guess there's no clear answer - holding things longer before open sourcing them might help. Making sure you have very high coding standards helps. But eventually you're just going to have that very large number of people.

Almost everyone (95%) is awesome; it's just that something being so open exposes you to everyone who might not be. And even those people are probably awesome: the nature of low-bandwidth communication on the internet probably just exposes you to misunderstandings, and you end up stressing out over things vs being the friends you normally would.

Ignore what you can - it's a problem when others don't understand this and bug you about every single comment and interaction, and judge you on it. The ratio of complaints to thank you's is not always worth the pay at times, so just make sure you're doing it because you care about it, and find the best way to make it work for you, even if that's moving something to redmine and bitbucket :)

I think parent referred more to contentious issues (e.g. sexism, systemd hate) rather than uncontrollable growth of a project. That said, excellent post, thank you!

The opposite, of course, are projects with too few contributors that accept any patch out of desperation, be it reasonable or not. (ZFS on Linux comes to mind: it's a super nice community and Brian Behlendorf does a great job as project lead, but sometimes features and patches creep in where I wonder why nobody dared to say "no".)

The Linux kernel community solved growth by delegating responsibility to subsystem maintainers. Such a hierarchical model is not supported by GitHub. Also, the kernel community's process of submitting and discussing patches on mailing lists, while somewhat arcane, raises the barrier of entry and keeps at least a portion of the Twitter mob out.

Yeah we'd had contentious issues on the various feature lines (systemd-analogous), absolutely no issues (that I was made aware of) on the -ism lines. The points about various feature items totally hit home though :)

> I think part of my problem was the barrier to contribution was really low

Interesting. I was just the other day wondering if a github mirror of the OpenBSD ports tree existed. In my searching, I found this thread[1] in which Ted Unangst sort of alluded to the same idea:

> github is all about social coding and they have a point. But many of the things they enable are considered antisocial in the OpenBSD development process.

[1] http://openbsd-archive.7691.n7.nabble.com/OpenBSD-on-GitHub-...

> (...) I'd still want to see contribution structured around a mailing list. Strongly encourage talking about code/ideas before submission, but most people will not read it and will submit directly anyway.

Django enforces this strongly. Anything other than trivial pull requests will be ignored. Contributors are strongly advised to start a discussion on the dev mailing list before working on code, and for anything big there's now a formal process for 'Django Enhancement Proposals' (obviously modelled on Python's PEPs).

I didn't know you wrote Ansible, but I just want to say that it is really great. Love that it can be bootstrapped from nothing and that you can bring up another machine that is identical to what you currently have with just a few lines of code.

To not derail this entirely: they say that there are two kinds of languages, those nobody uses and those people complain about. That is also true of projects in general, so while it may not feel that way, people complain because they care (maybe too much and maybe about the wrong things).

We're thinking about adding issue templates to GitLab Enterprise Edition and GitLab.com https://gitlab.com/gitlab-org/gitlab-ce/issues/1378#note_161...

You got some great points there, but I don't think contributing a change without asking first is bad, even if it misses the project goals. I sometimes need a feature, and posting it on mailing lists, etc. takes too long (considering the contribution is only going to be a few lines, ~100). If it misses the goal of the project, I don't really care. It has fulfilled the licensing agreement (to post code back upstream). I don't care because I'm just going to use my fork, and I would have done that anyways if my feature was declined on the mailing list. If the author finds it useful and wants it back upstream, then I'm happy to have helped.

The great thing about open source is that you can basically do whatever you want with the software, and GitHub makes that easy by letting you just fork the repo and hack together the features you want (and post them back upstream).

Otherwise, great summary!! Really appreciate it

There is a sense of pressure to merge PRs, though. When you have a lot of open PRs it starts to discourage further contributions. And if you reject a lot of PRs, or if you are too strict about it (i.e. nit-picking their PRs) it can also create a negative atmosphere that discourages further contributions.

Also... when you deal with dependencies in a package manager (like npm dependencies), it is less than ideal to rely on a GitHub fork rather than the module itself.

Personally, I try to merge PRs as often as possible. I aim to keep my open PR count at 0.

If a PR is not quite right (which is often the case), I will clone their fork locally, and edit their commit(s) using git rebase and then merge it.

That way the user is still acknowledged as a contributor on GitHub (and they can see both of our names/profile pictures next to the commit message). Then I make a comment explaining why I edited their commit(s) on the pull request itself or the issue page. That keeps everyone happy and encourages further contribution.

It's not always practical to do this, but it feels natural in many cases.
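For anyone curious, the clone-rebase-merge workflow described above looks roughly like the following. This is only a sketch: the fork URL, the branch names, and the `fixup_commands` helper are all hypothetical, and it prints the git commands rather than running them (the interactive rebase needs a human at the editor anyway).

```python
import shlex


def fixup_commands(fork_url, pr_branch, base="master", local="pr-fixup"):
    """Build the git commands for: fetch a contributor's branch, edit it, merge it."""
    return [
        # Fetch the contributor's branch straight from their fork.
        ["git", "fetch", fork_url, pr_branch],
        # Put the fetched commits on a local working branch.
        ["git", "checkout", "-b", local, "FETCH_HEAD"],
        # Edit, reword, or squash their commits on top of the base branch.
        # Rebase keeps the original author; you become the committer.
        ["git", "rebase", "-i", base],
        # Fast-forward the base branch to the cleaned-up commits.
        ["git", "checkout", base],
        ["git", "merge", "--ff-only", local],
    ]


# Hypothetical fork URL and branch name, for illustration only.
for cmd in fixup_commands("https://example.com/contributor/project.git", "fix-typo"):
    print(shlex.join(cmd))
```

Because the author field survives the rebase, the contributor still gets credit for the commit even though the final version was edited by the maintainer.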

> There is a sense of pressure to merge PRs

So quickly outline the reason why it's not ready for merge, and if the contributor doesn't respond in a reasonable window you can reject the PR with a clear conscience. If someone else takes an interest and offers to clean up the code, you can reopen the PR.

When I first started using Ansible, the fact that you replied directly to many of my Google Group posts was what really sold me on it. So, from a user's perspective, the time you spent on that was well spent. Thanks for that! Ansible has saved me a ton of time spent on tedious and annoying tasks.

Personally, I think the balance between giving good user support and spending enough time on actual development work is really really hard to figure out. But one way projects could help it work better is to make it very clear how users should get support, and how much they can expect. And I have yet to see any project do that. If it was clear, then you could politely point newbies who just don't know better at the support doc. People like me, who spend a lot of time making sure our requests won't waste time, would have solid guidance on how to do that. And jerks who are just wasting time could be politely pointed at the doc and then ignored when they don't follow it.

I didn't know you were moving on. Thanks for Ansible (and Cobbler).
