Maintainers Don't Scale (ffwll.ch)
161 points by diegocg 68 days ago | 119 comments



Here's an idea, somewhat tangential - modernize the freaking dev tools. It frustrates me that these projects still use mailing lists and have no sane UI to track issues, submit PRs like Github, or maybe a Gitter channel for ad-hoc chat/questions.

It would bring a lot more attention to what's going on and let the community start policing bad behavior by maintainers.

He even mentions how contributors will often not modify their PRs according to maintainer feedback and view code review as a rubber stamp. I rarely see this happening on Github - because PRs are a product feature. They've already spent time designing the feature just right to convey what the contract of a PR is, so the Linux kernel maintainers don't have to. The README would also have a giant "Contributors" heading outlining the guidelines that everyone can see.

This is really a UI problem. But projects from Linux/Apache consistently refuse to acknowledge its importance.


> It frustrates me that these projects still use mailing lists and have no sane UI to track issues, submit PRs like Github, or maybe a Gitter channel for ad-hoc chat/questions.

I'm not a kernel dev, but I have to ask: what's wrong with mailing lists?

Unlike your examples of GitHub and Gitter, they're unencumbered FOSS; this is very important for the Linux kernel, since Linus et al. invented git specifically to remove BitKeeper's proprietary sword of Damocles from hanging over the kernel. GitHub is basically BitKeeper 2.0.

UI-wise, each dev is free to access their mail however they like. It also has advantages of being federated, accessible offline, easy to write bots for, etc.

Presumably some devs have nice UI and automation scripts for their particular workflows; it would certainly be nice if links to such things were collected somewhere, but I don't see any dichotomy between using email and having a nice UI. I certainly find my email UI (Emacs + Mu4e) far nicer than any Web site I've ever seen.

As for GitHub pull requests, I've never actually seen the appeal. Once I've cloned a repo and made a change, why do I then have to "fork" the repo, add a new remote to my clone, push the changes then open a pull request; when I could instead run `git send-email`?

I agree that bug tracking seems to be a bit lacking. I've dabbled with things like bugseverywhere, but so far nothing's managed to stick :(


> I'm not a kernel dev, but I have to ask: what's wrong with mailing lists?

No way to vote on comments, no formatting or embedded images (which work with all clients, including the web UI), no way to edit comments after you've posted them, no easy way to subscribe/unsubscribe to just one individual thread... need I go on?

> Unlike your examples of GitHub and Gitter, they're unencumbered FOSS

So use GitLab and RocketChat then.

> It also has advantages of being federated, accessible offline, easy to write bots for, etc.

Fair point. I think RocketChat is working on federation, and GitLab and RocketChat both have well documented APIs, but for the most part you're correct that email is superior in these points.

> I don't see any dichotomy between using email and having a nice UI. I certainly find my email UI (Emacs + Mu4e) far nicer than any Web site I've ever seen.

That may be true for you, but not every new developer is going to come into your project with that kind of setup. They'll be reading your plaintext, hard-wrapped emails on Outlook or Gmail while struggling to set up filters so they only have to read emails from the mailing list about threads or topics they're actually interested in. A web UI ensures everyone has a great user experience right out of the gate.

> As for GitHub pull requests, I've never actually seen the appeal.

All the advantages mentioned in my response to "what's wrong with mailing lists" above, plus inline code review, immediate feedback from CIs, linters, and code coverage analysis tools, and integration with the issue tracking system.

> Once I've cloned a repo and made a change, why do I then have to "fork" the repo, add a new remote to my clone, push the changes then open a pull request; when I could instead run `git send-email`?

Personally I've never found the process of creating a fork to be any sort of inconvenience. Maybe you'd be interested in the `hub` CLI though: https://hub.github.com/


> No way to vote on comments

Good. Speak your piece if you have something to contribute. It's easy to game upvotes; oh look lots of upvotes... oh it's a pointless fluff comment or a flame. Above all, that stuff adds complexity to the system, and no one has shown the value is real versus perceived.

> No way to edit comments

People should consider what they're about to share. Editing is like starting a company and figuring out your business model after you "reach scale". Having to think before you say something is not a bad idea.

> So use Gitlab and rocketchat then

Yes, make everyone else move over to these new things! Because needing an email address isn't enough. We need accounts with every service there is! And we need more UIs distracting us with notification bubbles!

And let's get all these devs to learn all these new tools. Rather than focusing on producing useful output. Sounds good.


Speaking as someone who has gotten a small number of patches into the kernel (via email, naturally), the implication that sending patches by email requires no additional accounts or tools or learning is a false one.

Sending patches by email requires an email setup that works. You have, loosely, three options: you can send it in your normal client, you can set up `git send-email` to submit over SMTP, or you can set up `git imap-send` to get it into your normal client's drafts folder. If you choose the first one, your client has to preserve spaces, tabs, linebreaks, and everything else byte-for-byte intact. If you choose the second two, git has to be able to auth to your email server, and nothing else along the way can break the byte-for-byte integrity of your email. Each of these is tricky in various ways; see https://www.kernel.org/doc/Documentation/process/email-clien... for a discussion. (It turns out that email between humans does not depend on byte-for-byte integrity of whitespace.)

That's a lot of reading and fiddling for devs who are busy trying to produce useful output!

I recall that if your current email address is behind Microsoft Exchange, you simply can't use it for patch submission - the Exchange outgoing SMTP server does something weird with spaces, even if you don't use Outlook and connect directly to it with git send-email.

Now, certainly that might be Exchange's fault, but it doesn't change the fact that you'll need to get a new account to submit patches.


GMail also turns tabs into spaces when sending plaintext emails. I shouldn't have to set up a mail client to contribute.

Emailing patches is just as hard as making pull requests.
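To make the whitespace point concrete, here is a minimal sketch of why a tab-to-space rewrite is fatal for a patch. It assumes a deliberately strict, single-hunk matcher (the `apply_hunk` and `mangle` helpers are hypothetical, not how git itself is implemented) and a mail client that expands each tab to eight spaces.

```python
# Hypothetical one-file example: the source uses a tab for indentation.
SOURCE = "int main(void)\n{\n\treturn 1;\n}\n"

# A single hunk covering the whole file, in unified-diff line format.
PATCH_HUNK = [
    " int main(void)",
    " {",
    "-\treturn 1;",
    "+\treturn 0;",
    " }",
]


def apply_hunk(source, hunk):
    """Apply a whole-file hunk, requiring an exact match on every old line."""
    old = [line[1:] for line in hunk if line[0] in " -"]
    new = [line[1:] for line in hunk if line[0] in " +"]
    if source.splitlines() != old:
        raise ValueError("hunk does not match source")
    return "\n".join(new) + "\n"


def mangle(hunk):
    """Simulate a mail client that rewrites tabs as eight spaces (assumption)."""
    return [line.replace("\t", " " * 8) for line in hunk]


if __name__ == "__main__":
    print(repr(apply_hunk(SOURCE, PATCH_HUNK)))
    try:
        apply_hunk(SOURCE, mangle(PATCH_HUNK))
    except ValueError as exc:
        print("mangled patch rejected:", exc)
```

The intact hunk applies cleanly; the mangled one is rejected because its context lines no longer match the file.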


> preserve spaces, tabs, linebreaks, and everything else byte-for-byte intact

Hmm, sounds exactly like a use-case for Base64 and/or MIME attachments. Perhaps it's worth pursuing this, if there's no existing support?
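As a rough sketch of what that could look like, the following uses Python's standard `email` package to attach a patch as a Base64-encoded MIME part and read it back. The addresses, subject, and filename are made up for illustration, and `application/octet-stream` is my own choice so that the library treats the patch as opaque bytes rather than text.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Hypothetical patch with a tab and trailing spaces that must survive intact.
PATCH = (
    b"--- a/main.c\n+++ b/main.c\n@@ -1,4 +1,4 @@\n"
    b" int main(void)\n {\n-\treturn 1;   \n+\treturn 0;\n }\n"
)


def build_message(patch):
    """Build a mail with the patch as a Base64-encoded binary attachment."""
    msg = EmailMessage()
    msg["From"] = "dev@example.org"       # placeholder address
    msg["To"] = "list@example.org"        # placeholder address
    msg["Subject"] = "[PATCH] return 0 from main"
    msg.set_content("Patch attached.\n")
    # Bytes content defaults to the base64 transfer encoding.
    msg.add_attachment(
        patch,
        maintype="application",
        subtype="octet-stream",
        filename="0001-return-0.patch",
    )
    return msg.as_bytes()


def extract_patch(raw):
    """Return the bytes of the first attachment in a raw message."""
    msg = message_from_bytes(raw, policy=policy.default)
    for part in msg.iter_attachments():
        return part.get_content()
    raise ValueError("no attachment found")


if __name__ == "__main__":
    wire = build_message(PATCH)
    print(extract_patch(wire) == PATCH)
```

The trade-off is that the patch is no longer readable inline in the message body.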


Yeah, but the intent of sending patches by email is that you do code-review by hitting reply and commenting inline. If you have to import the patch into your local git repo, or build tooling for seeing it in your email client, you're strictly worse off than with a web-based code review system.


> If you have to import the patch into your local git repo, or build tooling for seeing it in your email client, you're strictly worse off than with a web-based code review system.

That's either an argument against encoding patches, or for reading email in Emacs (M-x base64-decode-region). I don't suppose many would take the latter stance ;)


> If you choose the first one, your client has to preserve spaces, tabs, linebreaks, and everything else byte-for-byte intact.

An email client which doesn't do that is broken; why would one use a primary email client which is broken?

(Yeah, it's a rhetorical question: I'm using an app client these days like everyone else, but I hate how broken it is)


Sometimes you don't have a choice and all you get is your school's or employer's email service, which can be arbitrarily crappy. Services that are crappy enough to cause problems include Gmail and Exchange, which are probably used by 90% of schools and employers. (Gmail SMTP only saves you if you can auth to Gmail SMTP; if G Suite is set up with some SSO and with app-specific passwords disabled, this can rapidly get annoying and potentially impossible.)

In which case, yes, you can set up a personal email address, but that just reduces it to the previously unsolved burden of having to set up another account just to contribute to a project.


> A web UI ensures everyone has a great user experience right out of the gate.

That may be true for you, but it's in no way as universal as you seem to think.

I have a productive workflow (that gives me enough free time to comment on HN :-) that is effective for the several things I work on during the day (a couple of development projects, some management, etc). Hopping over to some web silo doesn't speed things up for me.

The real reason people stick to the current model is because it works. A common baseline lets people use the tools that work for them rather than trying to cram everyone into the same procrustean tools space.


Yeah, I realize a lot of devs already have their own setups that work extremely well for them. Unfortunately anytime you transition to a new tool you're obviously going to break _someone's_ workflow.

That said, switching to something like GitHub doesn't necessarily mean developers all _have_ to use exactly the same workflow. For example, if you're more comfortable in the command line, there are tools like [hub][1] and [git-gitlab][2] available which let you perform common tasks in GitHub/GitLab using the CLI.

I just think that long-term it's better to have a good baseline user experience that devs can then extend with their own personal tools if they choose than to have a not-so-great one that _requires_ you to extend it with your own tools before it's anywhere near as good as what a well-designed Web UI provides out-of-the-box; even if in the short-term that means breaking some devs' workflows.

Obviously that doesn't mean you shouldn't try to maintain compatibility with developers' existing workflows wherever possible though. For example, GitLab and GitHub both let you reply to issues by email, and RocketChat has an IRC plugin that lets you connect RocketChat rooms to an IRC channel.

[1]: https://github.com/github/hub

[2]: https://github.com/numa08/git-gitlab


I feel like you're addressing a nonissue. If you want to start your own project, you can set the standards for development process, coding standards, language, etc. If I don't like those choices it's no big deal and no slur on you: I'll just work on what I want and folks who agree will be delighted to work with you.

But your note feels like you're saying "why don't the kernel devs switch to Ruby?" Of course when I say it that way it sounds absurd, but really it's simply an equivalent.


By that reasoning though, isn't almost every problem in open source projects a "non-issue", because you can always just fork and do things your way?

The point is that reducing the friction required to contribute to an open source project is a great way to attract new contributors. Using more user-friendly dev tools is a great way to do that, and is therefore something existing maintainers would be wise to consider doing.


> By that reasoning though, isn't almost every problem in open source projects a "non-issue", because you can always just fork and do things your way?

Yes. Because that's true. Either you have a good point and manage to convince people in the project, or you have to fork.

Writing irrelevant comments on some news site is not going to convince anyone to spend a combined hundreds to thousands of man-hours work (at the very least) to move kernel development to GitHub or whatever platform you like.

Also notice how you are ranting about tooling in a project you're not contributing to, while the submission here is about something entirely different -- and written by a contributor.


> Writing irrelevant comments on some news site is not going to convince anyone to spend a combined hundreds to thousands of man-hours work (at the very least) to move kernel development to GitHub or whatever platform you like.

So you're saying we shouldn't discuss this at all? Just shut up and stop trying to convince anyone to adopt a viewpoint I believe would be beneficial to open source projects in the future?

IMO open discussions like this are a good and healthy thing for any community of developers (such as the one on Hacker News) to have. It's a great way to share ideas with each other and collectively come up with ways to improve the open source community. It's not something that should be discouraged.


> Yeah, I realize a lot of devs already have their own setups that work extremely well for them. Unfortunately anytime you transition to a new tool you're obviously going to break _someone's_ workflow.

So don't.

Do you have a single compelling issue why they should switch?


Yes. All of the benefits I mentioned in my previous posts further up in this thread. It all essentially boils down to making it as easy as possible for new potential contributors to contribute.


New potential contributors to a kernel that are deterred by not having a ruby-on-rails web GUI wouldn't have contributed anything anyway. (IOW there are none that fit your description).


No kidding. I'd say a web interface sets a ceiling for how good the interface can be. And 99% of the time it's an awfully low ceiling.


...As opposed to, what, a native application? A TUI?

I mean, you can do a lot of amazing stuff in a TUI, but at this point I'm fairly sure that I only like them out of Stockholm syndrome.


As opposed to a mailing list based interface which allows everyone to read and respond with the tooling of their choice.


> No way to vote on comments, no formatting or embedded images (which work with all clients, including the web UI), no way to edit comments after you've posted them, no easy way to subscribe/unsubscribe to just one individual thread... need I go on?

Last point aside, these are all in favor of mailing lists, you know.


What? Why? All of those are extremely useful features I find myself missing whenever I'm forced to use a platform which doesn't have them.


It's not for your sake, it's for the sake of the maintainers that have to deal with your code.


Why would allowing developers (maintainers included) to post, vote on, and edit formatted markdown comments make it harder for maintainers to deal with my code?


Because voting isn't important and plaintext is easier to deal with than markdown. Plaintext works in a terminal and with embedded patches and inline replies and so on. You don't grok it, that doesn't make it worse.


> Because voting isn't important

Voting is absolutely important. It makes it easier to gauge the community's reaction to issues, PRs, and comments, and cuts down on "me too" type comments which might otherwise clutter issues.

> plaintext is easier to deal with than markdown

Markdown _is_ plain text. I've been writing all my comments up to this point in Markdown. Did you notice?

> Plaintext works in a terminal and with embedded patches and inline replies and so on.

So does Markdown. The difference is that Markdown can _also_ be rendered as pretty, fully-formatted text in a web browser.


You have to understand that you have a severely different outlook on FOSS maintainership than these projects do. You simply don't grok it. It's not like the maintainers here are unaware of the alternatives.


Is it possible to get the plaintext of comments from GitHub or the other platforms you prefer?


Yes, using the API you can get the raw plaintext in both Github and Gitlab, at least.
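For GitHub, a minimal sketch might look like the following. It assumes the REST endpoint for issue comments and the `body` and `user.login` fields as I understand them from GitHub's documentation; the owner, repository, and issue number are placeholders, and the sample payload is canned so that nothing here touches the network.

```python
import json


def comments_url(owner, repo, issue_number):
    """Build the issue-comments URL (endpoint shape assumed from GitHub's REST docs)."""
    return (
        "https://api.github.com/repos/"
        f"{owner}/{repo}/issues/{issue_number}/comments"
    )


def raw_bodies(payload):
    """Extract (author, raw comment text) pairs from a JSON response string."""
    return [(c["user"]["login"], c["body"]) for c in json.loads(payload)]


# Canned, hypothetical response so the sketch runs offline.
SAMPLE = json.dumps(
    [{"user": {"login": "alice"}, "body": "Markdown _is_ plain text."}]
)

if __name__ == "__main__":
    print(comments_url("example-owner", "example-repo", 1))
    for author, body in raw_bodies(SAMPLE):
        print(f"{author}: {body}")
```

The `body` field is the unrendered source of the comment, so it can be piped into whatever plaintext tooling you prefer.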


"A web UI ensures everyone has a great user experience right out of the gate."

I have a few blind users who would heartily disagree with this point considering most 'modern' web UIs totally break any sort of reader usability.


Yeah sorry, I should have phrased that as "a well-designed web UI". It's certainly not impossible to create a modern web UI which works well with screen readers.


You're missing the core point: mailing lists are simple, they require no vendor or stack lock-in, and their quirks are well-known.

For a project that needs to last more than 5-10 years, using really boring (fossilized?) communications archive tech is a feature, not a bug.


Using fossilized communications tech, while ignoring the conveniences embraced by nearly the entire rest of the FOSS community, optimizes for the comfort of the Current Developers, while making it less convenient for New Developers to contribute.

New contributions require a person to forego the conveniences which the fossilized tools don't support well, and that friction _might_ contribute to fewer smart people wanting to contribute to your project. Consider it as the OSS world's equivalent of a sales funnel. The more I need to deviate from the way I normally work, the less likely I am to want to make that contribution.

For a project where the overloading of maintainers is an issue, it might be worth considering the value of making it easier for other people to contribute. Sure, smart people who want to contribute _can_ adapt to the way your project wants them to work, but not all of them will want to jump through that extra hoop.


> Using fossilized communications tech, while ignoring the conveniences embraced by nearly the entire rest of the FOSS community, optimizes for the comfort of the Current Developers, while making it less convenient for New Developers to contribute.

I think I'd rather not have my kernel code written by someone who is afraid to use an email client.

Email is not, FWIW, fossilized. It's far superior to GitHub or Slack, because it's decentralised and truly free.


Could you show your work? You're valuing "decentralised and truly free" as significantly more valuable than (far superior to) the sum total of GitHub/Slack features. Could you list out the GitHub/Slack features that you considered (and which ones you didn't consider) and your heuristic for their value and how they all combine to be much, much less valuable than decentralized/free?


Email:

- multiple clients on multiple platforms

- multiple clients on emacs (e.g. Gnus, mu4e, NotMuch)

- does not require a browser, ever

- does not require executing untrusted code on my computer, ever

- open protocols

- multiple servers on multiple platforms

- federated

Github, Slack et al.:

- may have a decent text-only client (e.g. there's a Slack mode for emacs, and there are some GitHub clients out there, which may or may not be convenient)

- strongly encourage and sometimes mandate use of a browser with execution of untrusted code allowed

- protocols controlled by a single organisation, with no possibility to add functionality on the server side

- servers controlled by a single organisation

- no federation

- fundamentally proprietary

What I value: freedom; TUI clients; emacs clients; open protocols; being able to run my own service.


You listed only positives for Email and only negatives for Github etc. but you implicitly made the claim that decentralization and freedom are more valuable than all of slack and Github's features combined. Or are you saying that the value of everything except your "what I value" list is zero, so all of github/slack's objectively useful features are weighted to zero in the calculation?


> you implicitly made the claim that decentralization and freedom are more valuable than all of slack and Github's features combined.

I can't speak for others here, but personally I treat freedom as my first priority: it doesn't matter what features a platform offers if it doesn't offer freedom, because I will not use it if it doesn't offer freedom, therefore I'll never see any of those purported features.


Forget "afraid". Developers have better things to spend their time on than jumping through hoops to learn the obscure workflow of an old community.

If I have the choice of a) contributing to the latest hippest JS browser framework right away, or b) spending several hours setting up a custom email-based work environment, then contributing to the driver for my bleeding-edge Skylake system, I know what I'm going to choose - despite having way more experience with kernel hacking (in a corporate, GitHub-Enterprise-using environment).


> Developers have better things to spend their time on than jumping through hoops to learn the obscure workflow of an old community.

Again, I'd rather have my kernel code written by someone who has respect for others, and doesn't view spending a little time learning how to be productive in a group as 'jumping through hoops to learn the obscure workflow of an old community.'

> If I have the choice of a) contributing to the latest hippest JS browser framework right away, or b) spending several hours setting up a custom email-based work environment, then contributing to the driver for my bleeding-edge Skylake system, I know what I'm going to choose - despite having way more experience with kernel hacking (in a corporate, GitHub-Enterprise-using environment).

That's a-okay by me.


That's true, but the same respect-for-others is valuable in the other direction.

Over time, the number of people who are _willing_ to learn your conventions, versus those who _want_ to use those conventions, may change. You may have several colleagues (and many more who have not yet contributed) who are willing to use email, but would prefer a web-based collaboration platform.

Consider that there are more ways to "be productive in a group" than e-mail -- there's a reason many of us in our work lives have transitioned to using Github or similar web-based repo-collaboration tools, rather than e-mail. Rejecting those out of hand as "not productive" seems a bit strange.


> That's true, but the same respect-for-others is valuable in the other direction.

Very true! And certainly not all change is bad, and not all new things are worse than their older analogues.

> Consider that there are more ways to "be productive in a group" than e-mail -- there's a reason many of us in our work lives have transitioned to using Github or similar web-based repo-collaboration tools, rather than e-mail.

True enough, but I wonder how much of that is not because they are fundamentally better but because email has gotten popular. I can't easily review source code via email because I have to wade through meeting invites, corporate communications and recruiter spam.

Honestly, though, GitHub doesn't really give me a whole lot, other than centralised issue tracking and a data store. I never use its git UI; I just use Magit locally. I've actually been thinking about how a team could use commands to manipulate text files in a git repo to manage history for that repo — then one would never need to leave one's editor to manage issues. With a central file server open to users, merges and PRs could just be operations on the repo itself.

My idea's not fully fleshed out, but I think it'd be awesome, and more productive than using GitHub via a web browser.


And I'd prefer to have my kernel code written by a community that cares about its long-term health and efficiency rather than respect.


> Developers have better things to spend their time on than jumping through hoops to learn the obscure workflow of an old community.

Strongly disagree. As a general observation, people who don't have the patience to learn community norms generally don't have the patience to follow coding styles or write tests or properly document their code.


> Developers have better things to spend their time on than jumping through hoops to learn the obscure workflow of an old community

Do developers really think that way?

I know if I'm going to invest 50+ hours of my time on an activity then an hour or two to get set up is no big deal.


> spending several hours setting up a custom email-based work environment

You...do realize that a patch email and subscribing to a mailing list isn't exactly something that should take "several hours", right?


Simple is not easy.

Mailing lists are a nightmare for newcomers. That's why they progressively disappear.


Maybe that's a feature? If something as simple and straightforward as a mailing list is a "nightmare" for someone, I certainly wouldn't want kernel patches from that person. If they don't have the patience/tenacity/knowledge/whatever to deal with email, then they sure as heck don't have those qualities in sufficient quantities (yet) to produce more signal than noise in an open source project.

In this sense, having a developer mailing list is like FizzBuzz for potential contributors- if they can't handle the easy stuff, they definitely won't be able to effectively contribute as software developers to a nontrivial project.

Another nice thing about email is that in practice, it's a slightly more formalized form of communication. Lots of people write terrible emails, but lots more people type terribly written messages into little boxes on web sites. The nature of the medium (in my experience at least) is such that people are much less likely to treat email like chat than they are to treat web forums like chat, and they're much more likely to make an effort to communicate effectively, which is absolutely critical for successful distributed development.

Finally, I think people forget that it's not necessarily a net good for the users of an open source project to have the most frictionless mechanism possible for interacting with or sending patches to the developers of those projects. People who are new to open source, especially if they're new to software development in general, sometimes have this ideal that more community is always better than less community, and more communication is always better than less communication, and more patches from the community are always better than fewer patches from the community, but I don't think this is true at all. It's very dependent on how the project is run and the goals of its developers. It can be a net good for the whole project to have some hoops that people have to jump through before they can start pestering the developers with patches, and email is about the most trivial hoop there is. Lots of projects intentionally erect many more barriers than that, and that's not necessarily a bad thing. The goal of most open source projects is to produce high-quality shared software, not to be a social experiment with the lowest possible bar for involvement.


> No way to vote on comments, no formatting or embedded images (which work with all clients, including the web UI), no way to edit comments after you've posted them

Can you cite any study evidence that suggests that these are helpful features in this context (or even that they aren't massively harmful)? Comment voting in particular appears to be an especially strong source of toxicity, and online forums that feature it prominently have a pretty reliable history of going down in flames after a few years.


... he wrote, on a forum that's existed for 10 years now, with few problems with toxicity.


> Linus et al. invented git specifically to remove BitKeeper's proprietary sword of Damocles from hanging over the kernel.

Didn't he invent it after the sword had fallen and the BitKeeper people withdrew their permission to use it? I get the impression Torvalds is fine with proprietary tools if they're significantly better than the free alternatives.


This is correct. He wrote git after the BitKeeper relationship went sour. He was fine with it being proprietary software. BitKeeper only became a problem for Linus after the BitKeeper developer took away free licenses for Linux kernel developers, after getting into a kerfuffle with (I believe) Andrew Tridgell over the terms of that license.


> Didn't he invent it after the sword had fallen and the BitKeeper people withdrew their permission to use it?

Yes, perhaps I should have been more specific that the sword (BitKeeper as gatekeeper) was the problem, rather than the "hanging above" (i.e. the threat rather than the actuality).


> what's wrong with mailing lists?

* They have no relation to code. It's difficult to discuss patches.

* I don't want to make my email public (mailing lists are public).

* The interface for threaded conversations is almost always terrible.

* It's trivial to get your concerns ignored. This is much harder with pull requests (but still possible).

> UI-wise, each dev is free to access their mail however they like.

If they all suck equally, I'm not sure this is a good thing. I certainly don't want to waste my own time coding around the deficiencies of a medium that wasn't built for holding complex conversations.

> As for GitHub pull requests, I've never actually seen the appeal.

Well, they make it super easy to leave comments on code and iterate before accepting. I've literally never seen this happen quickly via a mailing list: it's either poor quality when it's accepted, or it takes forever to merge.

Most of all, though, if I need to figure out how to generate and mail a patch, I'm not going to even bother looking up the mailing list: I know it's not worth my time. This is exactly the type of problem git is good at solving; why resist?


You mostly seem to attack GitHub because it's proprietary.

GP's argument can be applied to GitLab (eg: more "friendly" for new devs), and is FLOSS, and can be self-hosted. Interaction via email is also possible.


> Once I've cloned a repo and made a change, why do I then have to "fork" the repo, add a new remote to my clone, push the changes then open a pull request;

Because no one wants to see or deal with your cruft, or that of the thousands of other people also contributing to a large project. Keep it in your own private space until it's accepted as being actually part of the project.


I'm not sure what you're talking about here. I don't have commit access to their repository; I just have my local clone. In normal usage, should I want to try fixing a bug or making some other improvement, I'd simply hack on the local repository until I felt satisfied with the change, then mail the diff to some maintainer for review. What "cruft" is there for anyone to see?

With the pull-request model, there's a lot of Github-specific folderol necessary to accomplish the same task. Perhaps the pull request process offers some significant benefit for the maintainers, which I haven't been able to see, but whatever it is, it must be weighed against the fact that the complexity of the pull-request process creates a barrier to entry. I can think of two situations where I had enough interest in contributing to a project that I downloaded the code and diagnosed the issue that was bugging me, but gave up when it started to look like it was going to take an hour of extra github hoop-jumping and potentially days of review and testing process just to share my one-line fix.

If I'd already been a participant in those projects, dealing with the process might not have been such an issue; but as an outsider, the pull request work flow is a pretty clear "get serious about joining our project, or go away" message. So, I went away. Perhaps those projects will fix those bugs themselves, eventually.


> Keep it in your own private space until it's accepted as being actually part of the project.

I don't understand; is a clone on my laptop not my own private space?


I know this is not a popular statement on Hacker News, but this development model existed for dozens of years before Github, and it will persist long after Github is gone.

The entire concept of a 'pull request' is a very recent fad, one with many drawbacks -- most of which have been enumerated by the very development teams you're insisting should adopt PRs. The largest single problem with PRs is the implication that we all have git repositories with a shared authentication mechanism. It's very difficult to enforce this sort of thing in a developer environment as diverse as these large projects have.

Sending patches via SMTP is arguably the most inclusive possible development model and I really value its availability in projects to which I contribute.


> The largest single problem with PRs is the implication that we all have git repositories with a shared authentication mechanism. It's very difficult to enforce this sort of thing in a developer environment as diverse as these large projects have.

Why? Can't you just set up a GitLab instance and require that PRs should be submitted through that system?

> Sending patches via SMTP is arguably the most inclusive possible development model and I really value its availability in projects to which I contribute.

So just keep that option available for those who want it. Set up a bot that opens PRs automatically for any patches it receives by email. Doesn't seem like that'd be terribly difficult.


> Why? Can't you just set up a GitLab instance and require that PRs should be submitted through that system?

This requires the contributor to have (yet another) account. Requiring that account to be on your gitlab installation isn't particularly better than requiring that account to be on Github.com.

> So just keep that option available for those who want it. Set up a bot that opens PRs automatically for any patches it receives by email. Doesn't seem like that'd be terribly difficult.

What would they be pulling from, exactly? When you send a patch, you're just sending the changed code. There's no 'other repo' to pull from. I suppose you could allow the system to create dedicated branches for incoming email, and then the PR could come from that branch, but this is computationally expensive (possibly ruinous, depending on the complexity of the repository and the size of the diff), and could easily be abused by sending boatloads of spurious diffs.

Introducing this web interface requirement simply takes way more engineering than people assume, and even suggesting it presupposes the existing system doesn't work (it does) and that there is some inherent superiority about using a web interface to do the job (there isn't).


> This requires the contributor to have (yet another) account.

That's true, but I'd say the amount of time involved in setting up a GitHub/GitLab account is pretty insignificant. Probably comparable to the amount it takes to subscribe to a mailing list.

> What would they be pulling from, exactly?

Have the script dump the code onto a new repo or branch and pull from that.

> but this is computationally expensive (possibly ruinous, depending on the complexity of the repository and the size of the diff)

Huh? A branch in git is literally just one file on the disk, and the diff has to be downloaded anyway regardless of whether you're sending it by email or by PR.

> and that there is some inherent superiority about using a web interface to do the job (there isn't)

There is: it's more user-friendly, and thus makes it easier for people to contribute to your project. When the success of a project depends on people being willing to devote their free time to work on it, it's important to make that experience as frictionless as possible.


> That's true, but I'd say the amount of time involved in setting up a GitHub/GitLab account is pretty insignificant. Probably comparable to the amount it takes to subscribe to a mailing list.

Yes, and now you have yet another remote internet account to manage, to worry about platform compromises, your government commandeering, etc etc. And you don't necessarily have to subscribe to a mailing list to send mail to it.

> Huh? A branch in git is literally just one file on the disk, and the diff has to be downloaded anyway regardless of whether you're sending it by email or by PR.

I host large git repositories. Creating and managing that file is not computationally trivial on an extremely large and extremely active repository.

> There is: it's more user-friendly, and thus makes it easier for people to contribute to your project. When the success of a project depends on people being willing to devote their free time to work on it, it's important to make that experience as frictionless as possible.

It might be more user-friendly _to you_, but to someone whose internet connection is a 2G GPRS connection, on a computer whose operation comprises a significant portion of the user's monthly power budget, email is infinitely more user friendly than a web app. Your speculation regarding its essentiality to the success of the project is a little off-target, considering we're discussing the kernel of the operating system that runs on the majority of mobile devices, servers, and supercomputers worldwide ... and it doesn't do pull requests.


Not terribly difficult to describe, sure.


GitLab has a well documented API: https://docs.gitlab.com/ee/api/merge_requests.html#create-mr

You could probably do the whole thing with like 50-80 lines of bash.
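To make the idea a bit more concrete, here is a rough Python sketch (rather than bash) of the last step of such a bot: turning an emailed patch into a request against the "create MR" endpoint linked above. It only builds the request and does not send it. The host `gitlab.example.com`, project ID `42`, the `email/...` branch naming and the `master` target are all made-up assumptions, and the sketch assumes the bot has already applied the patch and pushed that branch.

```python
import json
import urllib.parse
import urllib.request
from email import message_from_string

# Hypothetical values for illustration; none of these come from the thread.
GITLAB_URL = "https://gitlab.example.com"
PROJECT_ID = "42"


def build_merge_request(raw_email: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the API request that opens a merge request."""
    msg = message_from_string(raw_email)
    # Patch subjects usually look like "[PATCH] Fix typo"; drop the prefix.
    title = msg["Subject"].split("]", 1)[-1].strip()
    # Assumption: the bot already applied the patch and pushed this branch.
    branch = "email/" + "-".join(title.lower().split())
    payload = {"source_branch": branch, "target_branch": "master", "title": title}
    project = urllib.parse.quote(PROJECT_ID, safe="")
    return urllib.request.Request(
        f"{GITLAB_URL}/api/v4/projects/{project}/merge_requests",
        data=json.dumps(payload).encode("utf-8"),
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
        method="POST",
    )


raw = "From: contributor@example.com\nSubject: [PATCH] Fix typo in README\n\n(diff goes here)\n"
req = build_merge_request(raw, token="not-a-real-token")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually submit it. The hard parts argued about upthread (applying untrusted patches, rate limiting, spam) are exactly what this sketch leaves out.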


That is a great idea. If we need to add features to GitLab to do this please let us know.


GitLab already has a built-in way to reply to comments by email, right? I wonder how difficult it'd be to make "submit PR by email" part of GitLab itself.

Personally I probably wouldn't use that feature, as I'm already very comfortable with the Web UI, but judging from some of the other comments in this thread it sounds like some people might...


Submit patch by email to create MR is a feature I think we're open to if people indicate they would actually use it. I have the feeling that even if we built it, the kernel maintainers still wouldn't use it. But I'll gladly be proven wrong.


I can't believe the CEO of Gitlab has time to read all these threads. Do you use an HN bot yourself to see Gitlab mentions?


I must admit I read HN quite a lot :/

I found out about GitLab via Hacker News and the company got started after the kind reaction to https://news.ycombinator.com/item?id=4428278


We also have a "mentions-of-gitlab" channel in our chat that will alert us of any mentions of GitLab on any major place in the internet.


I'm ignoring that for now because it is overwhelming. Luckily we have a team of community advocates who respond to that.


Do you run a GitLab instance? It requires someone to maintain this. What if the server goes down?


If you're unwilling to invest the time to host it yourself, you can always use GitHub or GitLab.com instead. Hosting your own instance is only important if you're the type who wants to have full control of the server hosting your repositories.


It's only the most inclusive possible development model if you narrowly define "inclusive" as "least technical friction for contribution".

Most people have a different notion of what inclusivity means, most notably having a welcoming UX with clear rubrics for contributions so that contributions have the best chance of actually being accepted.

When people complain about something being difficult for humans, promoting something that's easier on a technical level is turning a deaf ear.


No, I mean inclusive, because it does not require the contributor to identify in any way, it does not require the user to have mastered the UX du jour, and it does not require the user to even have a full copy of the git repository.

All you have to do is send your code, on whatever device you want, using open, standardized protocols.


I consider myself a git novice, so please explain if I am wrong. But isn't a Pull Request basically: "I cloned your project and made a bunch of new commits, please pull from [url to repo with .git folder in it] to merge those changes in to your project"?


Contributing by patches is even simpler, since you don't even need a place to host your repo. Instead you literally do "git format-patch some_base_ref" and get a bunch of files you mail someone.

It doesn't get any simpler...
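For anyone who hasn't seen the patch-by-mail flow, this is roughly what "a diff in an email" amounts to. The sketch below uses only the Python standard library, not git itself: it diffs two versions of a file and wraps the result in a plain-text message. Real `git format-patch` output also carries commit metadata and a diffstat, so treat this as an approximation; the addresses and file contents are invented.

```python
import difflib
from email.message import EmailMessage


def patch_email(old: str, new: str, path: str, sender: str, to: str, subject: str) -> EmailMessage:
    """Wrap a unified diff of one file in a plain-text email."""
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to
    msg["Subject"] = f"[PATCH] {subject}"
    msg.set_content("".join(diff))
    return msg


msg = patch_email(
    old="hello wrold\n",
    new="hello world\n",
    path="README",
    sender="contributor@example.com",
    to="maintainer@example.com",
    subject="Fix typo in README",
)
print(msg)
```

Nothing here needs an account, a hosted repository, or even a full clone on the sending side, which is the point being made about inclusivity.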


Yeah, that's exactly what it is. Most Web UIs that implement pull requests though only support Pull Requests between repositories hosted on their platform. You can't just provide a url to an arbitrary git repo to pull from. In principle though, there's no reason they _couldn't_ implement a feature which would allow that.


There are tools that get used by maintainers. For example, patchwork is used by many subsystem trees to track patches and the discussion around them. And those developers and maintainers who fix bugs (and who are getting paid for it, as opposed to bugs that get fixed for free), do generally use some bug tracking system, but it tends to be one specific for the production kernel tree or product tree. Yes, that means that there is some duplication of effort, but the fact of the matter is that people don't use the upstream kernel tree directly for any product or production service, so having a bugzilla tree for that only makes sense if you have a large team of people who are exclusively working on upstream --- and often times, that's not the case.



4 and a half years is a lot of time. Taking that comment at face value means accepting the assumption that Github's PR UI hasn't changed in that entire time.


You very clearly didn't read the actual linked comment.

> github throws away all the relevant information, like having even a valid email address for the person asking me to pull.

GitHub explicitly decided to eschew valid email addresses: https://help.github.com/articles/keeping-your-email-address-...

He explains in a later comment: https://github.com/torvalds/linux/pull/17#issuecomment-56599...

> real explanation, proper email addresses, proper shortlog, and proper diffstat.

> no sane word-wrap of the long description you type: github commit messages tend to be (if they have any description at all) one long unreadable line.

I don't know if github supports editing like that but by default it does one long line.

Related pain point: when I want to type out something in a fenced code block, it's a pain to have to switch back and forth. I do wish the editor forced a monospace font.


Don't write your commit messages on Github... It doesn't reformat them if you push with git's standard wrapping.
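If you do end up with a message that is one long line, re-wrapping it before committing is a few lines of code. A minimal Python sketch, assuming the common 72-column convention (the thread also mentions 80) and a hypothetical helper name; it reflows every body paragraph, so it would mangle indented lists or code in the message:

```python
import textwrap


def wrap_commit_message(message: str, width: int = 72) -> str:
    """Re-wrap each body paragraph to `width` columns, leaving the subject alone."""
    subject, _, body = message.partition("\n")
    # Paragraphs are separated by blank lines; skip any empty ones.
    paragraphs = [p for p in body.strip().split("\n\n") if p.strip()]
    wrapped = [textwrap.fill(" ".join(p.split()), width=width) for p in paragraphs]
    return "\n\n".join([subject.strip()] + wrapped)


long_message = (
    "Fix off-by-one in ring buffer\n\n"
    "The read index was advanced before the bounds check, so the last slot "
    "was never returned to callers and the buffer reported itself as empty "
    "one element early."
)
print(wrap_commit_message(long_message))
```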


> GitHub explicitly decided to eschew valid email addresses

No, that's a choice individual users make. As you can see from that help article, they have to explicitly set their git client to use an invalid email address if they want that. GitHub cannot and does not alter the email addresses contained in pushed commits. (Doing so would change the commit hash, making it an entirely different commit.)

> He explains in a later comment

Seems like the only thing missing is a way to wrap commit messages to 80 chars. Everything else is the fault of the person sending the PR, not GitHub. Also it only applies to commits _created_ from the GitHub Web UI. I don't know about you, but I only do that for extremely trivial changes.
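The bit about the commit hash is easy to check for yourself. Git names a commit by the SHA-1 of a short header plus the commit's text, and the author and committer lines (email included) are part of that text. A small Python sketch with made-up names and a fixed timestamp; the tree ID is git's well-known empty tree:

```python
import hashlib


def git_commit_id(tree: str, author: str, message: str) -> str:
    """Hash a parentless commit object the way git does: SHA-1 over header + body."""
    # Fixed timestamp and timezone so the example is reproducible.
    stamp = "1500000000 +0000"
    body = (
        f"tree {tree}\n"
        f"author {author} {stamp}\n"
        f"committer {author} {stamp}\n"
        f"\n{message}\n"
    ).encode("utf-8")
    header = f"commit {len(body)}\0".encode("utf-8")
    return hashlib.sha1(header + body).hexdigest()


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # git's empty tree object
real = git_commit_id(EMPTY_TREE, "Jane Doe <jane@example.com>", "Fix typo")
masked = git_commit_id(EMPTY_TREE, "Jane Doe <jane@users.noreply.example.com>", "Fix typo")
print(real)
print(masked)
print(real != masked)  # True: same change, different email, different commit
```

So a host that rewrote addresses on push would be handing back different commits than the ones you pushed.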


As an OSS maintainer that is currently struggling with this I can tell you the solutions are not that simple:

- We use several mailing lists (like a forum, it's good for async communication)

- We have several IRC channels for immediate feedback

- We don't have Gitter/Slack/Discord/etc because more avenues of communication actually make us scale less

- reviews are not rubber stamps

- we use github to host our repos and as a ticketing system

- we have developer docs (sometimes they do lag or are incomplete)

We still have much larger backlog than we want and plenty of people mad/exasperated on our scale and turnaround for versions and features.

I don't think UI is an issue as much as other factors, many other things will affect the project.

As for the communication channels, a good community actually offloads work from maintainers. As users get to know/use the project and pay attention to the channels, they end up doing a lot of support and freeing the maintainers to work on the code. To get to this point the maintainers have to have put in the work and have helped the community gain the knowledge needed. That said, it is not a license for maintainers to check out, just a good way to share the burden.

Github has a huge community around it, but its ticketing system is limited and not great for big projects. They are working hard to fix this, releasing many updates and features this last year that have made our lives easier, but it is still an issue (see "open letters" from last year for details).

Much time is spent dealing with insufficient data from users/contributors or just plain unreasonable requests: "soo this does not work in AIX 4.1 (current is 7.2?) with custom built python 2.4 from which you removed core libraries and a non POSIX custom built grep? no .. we don't have tests for this". A good ticketing system can help, but it will not solve the issue (skynet might, but that might be too drastic).

A factor is maintainer 'control': it is hard to let others 'raise your baby', especially as their vision might diverge from yours. Even w/o sparking flamewars there is a constant time sink in discussion of ideas, feature set and roadmaps. Design by committee or popularity does not normally lead to good results either and IME sparks more flamewars than it avoids.

Another is code quality, most projects start w/o a big test infrastructure (one guy really doesn't need much), once you have hundreds of contributors you struggle to get proper linting, testing, access to resources, etc.

No one writes perfect code, no one can perfectly review code, more code == more bugs == slower turnaround.

Reviews are a bottleneck .. yet they are the primary guarantee for code and interface quality. Rubber stamps will create more work for you in the end as you'll spend multiple cycles fixing bad code. Tests can only help you so much, so much of the burden falls on the maintainers.

Tests are code too! People forget this and put little/no review into tests or have many trivial tests (check that setters/getters actually work...) to get 100% code coverage. Code coverage is not a goal, code quality is. The point here is that tests are also a lot of work, especially getting them 'right' for your project; what this means is also fodder for flame wars. Not having good testing also creates more work as more bugs will get into the code.

Most contributors won't maintain their code, many in OSS will contribute code and then disappear, any problem with it has to be dealt with by either the maintainers or someone else in the community that might also 'contribute and evaporate'. As code bases grow and features are added, the maintainers increase their burden. This is alleviated by increasing maintainer numbers, but that is always trailing the burden by a lot.

Add too many maintainers and it gets to a point where adding more creates more burden: coordination, design decisions, knowledge sharing, etc ... team management in general.

...

I'm going to stop now, I could probably write many pages on this but I think the above is enough to show that this is not easily solved by using a few new tools or UI, docs do help, but that is another maintainer burden.

Maintainers normally just use what 'works for them' and most will look to improve the process. We all get many suggestions, it is impractical to try them all and the 'latest trend' dies down faster than the time it takes one to shift the workflow to try it. The few times I've been able to do so, it is either not an improvement or just not enough of one to justify the effort of switching the workflow.

If someone has a way to make managing big|popular OSS projects simple and seamless ... LET ME KNOW!


This is the correct response IMHO. It's a social problem and all this yakking about tools and web interfaces versus email must be coming from armchair experts who have never tried to be a maintainer.

I encourage them to try it and learn what the real difficulties are. More maintainers are needed almost everywhere, just choose a project and dive in. And please, remember your health and the real world too! They will be easy to forget.

Lowering the cost of entry for contributors is no longer a problem. I doubt it was ever a problem.

I also doubt computers have changed the organizational problems very much. The same complaints (with some translation) could be made for any nonprofit, and many businesses too.


>"If someone has a way to make managing big|popular OSS projects simple and seamless ... LET ME KNOW!"

I've had ideas about this, but just to be upfront I've not tried them in the real world.

Basically I think if code is designed around specifications it's possible to automate a lot of what goes into software maintenance.

Let's imagine a scenario. I'm designing a piece of software. The first thing I do is to define specifications on how the system sits together (i.e. the function interfaces, the general architecture). I then put into place tests that can validate that these requirements have been met. If you're familiar with functional languages essentially you're looking at a bunch of type signatures and a map of how they can interact.

Next, I put in place a CI system. This will run tests every time someone submits code to the main repository, as well as run linters, code style checks and performance checks. Commit access is completely open. If the code matches the expected style, doesn't cause the related tests to fail, and doesn't cause performance regressions it's accepted into the codebase. If it doesn't it's removed.

With this approach, development discussion between people can be focused on altering the specs to permit refactoring and/or new features to be added.

Any thoughts?
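As a rough illustration of the accept/reject rule described above, here is a minimal Python sketch. Everything in it is hypothetical: the `CheckResults` fields stand in for whatever the linter, test runner and benchmark would report, and the 5% tolerance is an arbitrary choice.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CheckResults:
    """Outcome of the automated checks for one submission (hypothetical fields)."""
    style_ok: bool
    tests_ok: bool
    runtime_ms: float


def gate(results: CheckResults, baseline_ms: float, tolerance: float = 0.05) -> Tuple[bool, List[str]]:
    """Accept only if style and tests pass and runtime stays within tolerance of baseline."""
    reasons = []
    if not results.style_ok:
        reasons.append("style check failed")
    if not results.tests_ok:
        reasons.append("tests failed")
    if results.runtime_ms > baseline_ms * (1 + tolerance):
        reasons.append("performance regression")
    return (not reasons, reasons)


print(gate(CheckResults(True, True, 101.0), baseline_ms=100.0))
# (True, [])
print(gate(CheckResults(True, False, 120.0), baseline_ms=100.0))
# (False, ['tests failed', 'performance regression'])
```

Note that the gate can only be as good as the checks feeding it; it has no opinion on whether a change is wanted in the first place.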


This is similar to the typical workflow I see in most GitHub projects I've worked on. Once you submit a PR, various tools will come along, lint your code, run the tests, check your test coverage, etc and give you a report. Then a project maintainer comes along and manually reviews your code to check for various style issues and things that can't be checked automatically (like whether they even want this feature in the first place) and if everything looks good they click merge.

> Commit access is completely open. [...] With this approach, development discussion between people can be focused on altering the specs to permit refactoring and/or new features to be added.

Not gonna happen. Why? Because: http://www.commitstrip.com/en/2016/08/25/a-very-comprehensiv...


>"Not gonna happen. Why? Because: http://www.commitstrip.com/en/2016/08/25/a-very-comprehensiv...

That's a misunderstanding. Most specs aren't fully fleshed out code, they're the bare bones of describing what something needs to do.

Think of it as if you're designing electronics out of integrated circuits. You can design a schematic just by knowing the inputs and outputs that the integrated circuits need to provide. The actual implementation of the integrated circuits is abstracted away.

It's the same with the relationship with specs and code. Specs are meant to be at a higher abstraction level than the code they describe. You can use code to write specs (and there are advantages to doing so), but the idea with the specs is to define something which is universally true, rather than get into the detail of the work done to meet this specification.

An obvious example of this is design by contract:

https://www.eiffel.com/values/design-by-contract/introductio...

Whilst code contracts aren't necessary in all languages (this is a good starting point for looking into why: http://blog.ploeh.dk/2016/02/10/types-properties-software-de... ), I believe they do offer benefits to software written in most imperative-style languages.
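For readers who haven't used contracts, a minimal sketch in Python may help. This is not Eiffel's mechanism, just a hand-rolled decorator (the names `contract` and `ContractError` are invented here) that checks a precondition before the call and a postcondition after it. The contract for `largest` says what must be true of the result without saying how to compute it.

```python
import functools


class ContractError(Exception):
    """Raised when a precondition or postcondition does not hold."""


def contract(pre=None, post=None):
    """Attach an optional precondition and postcondition to a function."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise ContractError(f"precondition of {fn.__name__} violated")
            result = fn(*args, **kwargs)
            if post is not None and not post(result, *args, **kwargs):
                raise ContractError(f"postcondition of {fn.__name__} violated")
            return result
        return wrapper
    return decorate


@contract(
    pre=lambda xs: len(xs) > 0,
    post=lambda result, xs: result in xs and all(result >= x for x in xs),
)
def largest(xs):
    # Any implementation is acceptable as long as the contract holds.
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
    return best


print(largest([3, 9, 2]))  # 9
```

Calling `largest([])` raises `ContractError` before the body runs, and a buggy implementation that returned the wrong element would be caught by the postcondition.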


> That's a misunderstanding. Most specs aren't fully fleshed out code, they're the bare bones of describing what something needs to do.

Right, the problem though is that you seemed to be suggesting that the spec would be so detailed and complete that a script checking that the code submitted in the PR matches the spec would be good enough to decide all on its own, with no human intervention, whether or not the code can be merged:

> Commit access is completely open. If the code matches the expected style, doesn't cause the related tests to fail, and doesn't cause performance regressions it's accepted into the codebase. If it doesn't it's removed.

A spec that detailed would basically have to _be_ the code itself.


>"A spec that detailed would basically have to _be_ the code itself."

No it wouldn't. The specs are applied as part of the test suite. Please read up on code contracts to see how this is possible.

If you didn't like the link I shared before, here's a video:

https://m.youtube.com/watch?v=YQJMe0Eahyg


> Commit access is completely open. If the code matches the expected style, doesn't cause the related tests to fail, and doesn't cause performance regressions it's accepted into the codebase. If it doesn't it's removed.

It seems to me that any spec. that went into sufficient detail to allow this would be more or less writing the project in another language, meaning there's no actual benefit. I can't imagine that a spec. that doesn't get into that level of detail would be sufficient to prevent all malicious/unwanted commits (say, subtly weakening cryptofunctions or leaking user data over side-channels).


>"It seems to me that any spec. that went into sufficient detail to allow this would be more or less writing the project in another language, meaning there's no actual benefit."

The benefit is in designing code at a higher level of abstraction, one that can be easily reasoned about. It is possible to design code at a high enough level where the specs and the code are one and the same, but most languages haven't got the type system sophistication of something like Idris or Haskell, which is a key component of pulling off this feat. The vast majority of code is written in languages that do not lend themselves to code as specification. Code contracts and other complementary techniques (such as automated test generation) can go a long way to counteract those shortcomings.

Crypto functions are a special case. In this instance you won't save time by defining specifications as the requirements on algorithmic correctness are much higher than average. However, in this case the ideal would be formal verification, and that still requires a specification, it's just likely to be more verbose than you'll want for day to day code checking.

Lastly, consider the alternative if you don't use specifications. At this point the burden of performing code checks is with humans. With a large, fast-moving codebase it's unreasonable to expect any one individual to understand all the parts that constitute the whole at a sufficient level to stop new bugs creeping into code. It happens on every project, no known exceptions. With that in mind, why wouldn't you want to put in a framework to help catch bugs automatically? New bugs will still occur, but with a well specced program this should be at a vastly reduced rate.


This doesn't deal with the design of code and the long term effects of choosing the wrong API.

It also doesn't handle new features, at all, because the tests have not been written. Even if you enforced a requirement for tests with coverage, the tests could (and usually would) still be wrong.

Trying to take conscious, adaptive, intelligent response out of the loop is a mistake.


>"This doesn't deal with the design of code and the long term effects of choosing the wrong API."

Yes it does. With this approach, any change to the design requires a change to the specs to be made first. I already indicated the specs could evolve over time to allow for refactoring and new features.

>"It also doesn't handle new features, at all, because the tests have not been written. Even if you enforced a requirement for tests with coverage, the tests could (and usually would) still be wrong.

Trying to take conscious, adaptive, intelligent response out of the loop is a mistake."

You've misunderstood, as I stated before, discussions still happen, they're just focused on the specs.


I agree with Ajedi32 on this. Sufficiently detailed specs will be indistinguishable from code. So reviewing code and its behavior is more efficient than having another language to review.

It may depend on the domain, however. I find that a lot of misunderstandings arise in these discussions because people assume the problems are the same in all kinds of computing. That's not so.

I work in viz/UI (interactive charting) so there aren't great libraries for testing or writing specs. All the features interact in unpredictable ways.

Your process may work better where the API is purely functional, where inputs and outputs and side effects are better defined.

I'm happy for you if you've applied this successfully!


I very much agree.

Much of this is in the area of management (and thus social - building a culture around a project), and has relatively little to do with tooling per se.


Totally agreed!!!


Great example of what you're talking about: https://github.com/FFmpeg/FFmpeg/pull/153


yes, many times we need to work 'around' the tools as they do not do what we want and cannot be disabled to avoid confusion:

https://github.com/ansible/ansible/releases


The title is slightly misleading - the author mostly critiques the maintainer model used by Linux, not the broader model of "maintainer" used in OSS.

Speaking from personal experience as a somewhat new maintainer for a large project (we were #3 in PRs for all of GitHub last year[1]), modern tooling and aggressive automation decrease the amount of busywork substantially. I also have a relatively large amount of independence in my decisions as a maintainer, but this hasn't translated into an absence of oversight or fractured quality standards.

[1]: https://blog.jessfraz.com/post/analyzing-github-pull-request...


Isn't the time needed for triage and review - since most PRs will be package updates - much lower than in most other projects?


First it should be understood that even in the LK there are different maintainer models, depending on the subsystem. Many of the driver maintainers don't even have public git trees, much less public postings reviewing people's code.

So, with any large project, many of the subsystems are really dysfunctional. Overworked maintainers are just a symptom of someone not being able to delegate (might even be the maintainership itself if they want to code more than review). Frequently though, it all seems really territorial, with the maintainers themselves bikeshedding over function naming, hunk placement in a patch series, comment wording, the list goes on. No wonder these "maintainers" are overworked: instead of acting as architectural reviewers, or even bug reviewers, they force themselves and the people "just trying to scratch an itch" to jump through hoops for patch revision after revision. Others are more passive, and simply don't look at patches they disagree with, even if they add a significant piece of functionality. So, there is a wasteland of patches that never make it, only to be rewritten a couple years later by the maintainer, or a frequent contributor.

Frequently a lot of it boils down to what is effectively territorial responses. How dare someone come in and move the tree I've been pissing on for the last two years.

Frankly, I wish that more of the maintainers actually acted like Linus, who is pretty clear about whether he is going to take a patch set, and back when he actually did more than take pull requests at face value, would himself fixup any minor issues he saw in the patches as he committed them rather than waiting for the submitter to figure out what was wrong and repost a whole series with some minor tweak.


>"Frankly, I wish that more of the maintainers actually acted like Linus"

Linus has a number of maintainers he trusts, and has indicated before he doesn't need to question code that has been approved by these maintainers. So for most of Linux development he's delegated work to others. This can be a good approach, but it's a luxury to have other people who can do most of the code quality work for you, it's not an approach you can rely on for all projects.


I'm not sure what your point is. He built that trust by accepting patches from people without too much hassle. That was the difference between Linux and the BSDs and one of the reasons why we are not running [386|net|free|open]bsd these days.

This means people got experience by writing code, and having it merged, bugs and all, until they gained enough experience to be "trusted". Today, there is a mindset common among maintainers that new developers should have to jump through lots of hoops rather than being aided by the maintainers. Linus is/was famous for bitching about something, and taking it anyway. Today, that is incredibly rare. Most of the core maintainers had it easy: they didn't have to set up complex SCMs, figure out how to split their patches into bisect-able chunks, read a whole bunch of howtos, guess at the coding variation accepted by the maintainer (no, checkpatch is frequently not sufficient), and on and on. They simply had to run diff, pipe it to a file and get it on the list somehow. Frequently they would get bitched out, but it was rare to submit a patch more than twice. These days you can find bikeshedders on many of the mailing lists complaining about function naming in a patch that has a double digit version number.

Consider what happened to my first kernel patch. I submitted it and Linus pointed out what he didn't like, while simultaneously correcting it, verifying with me that he didn't break anything and then merged it. What happens today is a nightmare by comparison, and yes, that includes me because I'm infrequent enough, and my patches rarely land in the same subsystem, that no one really recognizes or "trusts" me.

There is another whole discussion about whether "trust" even belongs in the lexicon of an engineer. The old saying is "trust but verify", where the verify part is the most important.


This is a pretty one sided view of the problem. I don't see many arguments apart from a) "our model avoids burnout" and b) "it is easier to apply wholesale patches".

Regarding a) the opposite can be the case (for the maintainer!). Regarding b) the question is (as always) whether the patches are actually necessary or just some activity.

The problem is far more complex than this blog post acknowledges.


Well that explains the crapification of the Linux graphics stack...


Can anyone read this? It is white on white.


Foreground is #666666, background is #FDFDFD. Gpick tells me the contrast is 56.1%. That should be quite legible in most circumstances, though personally I would try to shoot for 70+ -- especially when you have absolutely no constraints.

If it's not visible for you, then it is likely because there is something wrong with your set up. Could be insufficient backlight. Could be improper font rendering. Could be your browser is screwing up the CSS. Could be lots of things.

I always get in trouble with this kind of discussion because the usual response is "But this is a stock system. I shouldn't have to adjust it to make up for crappy web designers." And I sympathise with this sentiment (and especially in this case where there is no particular reason for going with a low contrast presentation), but... I'd really rather people complain that their devices are broken/misconfigured-by-default. There is no reason this website should appear illegible, even if I don't completely agree with their colour choices.


> Gpick tells me the contrast is 56.1%.

Which fails WCAG AAA.

It's body text. There's no reason for this.

> I'd really rather people complain that their devices are broken/misconfigured-by-default.

Eyes are expensive to fix. Someday I'll get the surgery. In the meantime I'll bitch about people's shitty design choices.


As I said, I knew I would get in trouble ;-) FWIW, I have a vision problem and have difficulty seeing things that have low contrast. In fact, I use a 24 point font and obsess with colours because if I don't, I get ocular migraines (things that are difficult to read literally make me go blind). So, I'm not insensitive to your argument.

Saying that 56.1% fails WCAG AAA is not a terribly convincing argument, because that is the highest level of accessibility. If you want to have a convincing argument, then why not point out that it also fails to meet the 3:1 contrast ratio of the lowest recommended contrast level for people with healthy vision?

Like I said, I'm not against helping web designers make accessible web pages, but if you literally can't see something with a 56% contrast ratio, then it's because your device is set up improperly (or you have vision problems that you already know about). The frustrating thing for me is that people tolerate these completely broken by default systems and complain to web developers that they don't have a 7:1 contrast ratio in their web pages.
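For anyone who wants to check a colour pair against the ratios being thrown around here, below is a minimal sketch of the WCAG 2.x contrast-ratio calculation (relative luminance of each sRGB colour, then (lighter + 0.05) / (darker + 0.05)). The function names are my own, not from any library, and this is not necessarily the same measure as the percentage Gpick reports.

```python
def _linearize(channel_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a colour given as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio between two colours, always >= 1."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    # Colours quoted earlier in this thread for the linked page.
    ratio = contrast_ratio("#666666", "#FDFDFD")
    print(f"contrast ratio: {ratio:.2f}:1")
    # 3:1 and 7:1 are the thresholds mentioned in this thread;
    # 4.5:1 is the WCAG AA threshold for normal-size text.
    for threshold in (3.0, 4.5, 7.0):
        verdict = "meets" if ratio >= threshold else "fails"
        print(f"  {verdict} {threshold}:1")
```

Running it prints the ratio for the page's colours and how it compares with each threshold, which is more useful to hand to a web designer than "white on white".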

What that means is that people like me have to spend our lives configuring our blasted machines for situations where colour contrasts are lower due to design constraints (rather than trendy, boneheaded decisions). I want a machine I can use every day and I specifically don't care if some random blog writer decides to make their content unavailable to me.

So I will repeat: If you cannot see that text, fix your computer and complain to whoever set it up. If you also want to help the web designer make accessible choices, then at least tell them what they should be aiming for rather than complaining that the text is "white on white", which is completely untrue.

Sorry for the rant, but it's a bit of a big deal for me.


Barely, not sure if it's just the rendering on my tablet, but the font's way too light for the background.


I can read it


If it's white on white then fix your gamma.


When you fix your gamma, it's light gray on white.



