When bots get reported to us by people using GitHub our support folks reach out to the bot account owner and encourage them to build a GitHub service instead. As a service, the same functionality would still be available to everyone using GitHub, but it would be opt-in instead.
A few months ago we heard from some developers of service integrations that beyond the existing API features, it would be handy to be able to provide a form of "status" for commits. We added the commit status API in September to accommodate that. We're always open to feedback on how the API and service integrations can improve.
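For anyone who hasn't tried it, reporting a status is one POST per commit. A rough sketch of building that request (endpoint shape per the commit status API docs; the names `build_status_request` and the description text are just illustrative, and the actual HTTP send with your OAuth token is omitted):

```python
import json

API_ROOT = "https://api.github.com"

def build_status_request(owner, repo, sha, state, description=""):
    # Commit statuses attach to a specific SHA:
    #   POST /repos/:owner/:repo/statuses/:sha
    # Valid states per the docs: pending, success, error, failure.
    assert state in {"pending", "success", "error", "failure"}
    url = "%s/repos/%s/%s/statuses/%s" % (API_ROOT, owner, repo, sha)
    body = json.dumps({"state": state, "description": description})
    return url, body
```

A CI service would call this once when a build starts ("pending") and again when it finishes ("success" or "failure").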
The point is, GitHub services are a much better way to build integrations on GitHub.
Why not establish an opt-out convention similar to robots.txt? The idea is that people who want to opt-out would create a ".robots" file in their repo, with "none" in it. Any bot that doesn't respect the .robots file is hellbanned.
The problem with opt-in is that people won't use it unless they (a) know it's available, (b) know how to get it, and (c) actually go get it. So people don't really do that. But establishing an opt-out convention like this solves the problem entirely, and it's simple.
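A well-behaved bot could honor this hypothetical convention with a couple of lines before ever opening a pull request. A minimal sketch (the function name is made up; assume the caller has already fetched the repo's .robots file contents, or None if the file doesn't exist):

```python
def may_contact(robots_text):
    """Opt-out check for the hypothetical .robots convention: a repo is
    fair game unless its .robots file contains the word "none".
    robots_text is the file's contents, or None if no file exists."""
    if robots_text is None:
        return True  # no file: the owner has not opted out
    return "none" not in robots_text.lower().split()
```

Any bot that skips this check and spams anyway is exactly the kind you'd hellban.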
If no one finds or wants to use your service without it being forced on them, it cannot be that great a service.
If it's about discovery of bots, then come up with some meaningful way to make them discoverable (presumably better than the Apple App Store zing) without people needing to be exposed to them by default.
How about a header in email that the sender can include that forces the email to the top of your inbox and makes it undeletable? You can opt out by having an email address of firstname.lastname@example.org all the time.
How small-minded. There are many of us who want to be discovered by bots. We just keep quiet because people like you have such strong opinions and aren't afraid to be mean about them.
Wouldn't it be ironic if your opinion was in the minority?
I think there is room in every community for automated users, if the tools for managing misbehaving users are strong enough. GitHub, however, has not been known for strong moderation tools, so perhaps a temporary opt-in policy could be used until those tools improve?
This "add a file to your repo" pattern is starting to get annoying; there is even a project for it, called Filefile.
Also I don't see the problem with opt-in. If your bot/plugin/service is worth its salt you'll be putting some effort into marketing it (which includes explaining how to use it).
The alternative is Github marketing the opt-out, which seems a bit strange if they're launching a feature and advising users how to not use it so they don't get spammed.
It could be opt-in (initially at least), i.e. a .robots with contents "all", "typo" or "whitespace" etc would allow bots of the given type (an empty or missing file would imply "none").
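That opt-in variant is easy to sketch too (again entirely hypothetical; "all", "typo", and "whitespace" are just example type names, and per the suggestion an empty or missing file means nothing is allowed):

```python
def allowed_bot_types(robots_text):
    """Parse a hypothetical opt-in .robots file into the set of bot
    types the owner permits. An empty or missing file implies "none"."""
    if not robots_text:
        return set()
    tokens = {t.strip().lower() for t in robots_text.split() if t.strip()}
    return tokens - {"none"}

def permits(robots_text, bot_type):
    """True if a bot of the given type may act on this repo."""
    types = allowed_bot_types(robots_text)
    return "all" in types or bot_type in types
```

The default-deny direction is the whole point: a repo owner has to write something before any bot may touch the repo.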
You don't think it would be appropriate to warn the bot-owner about this first? Hellbanning must always be a last resort, not something you throw around as standard procedure for even the smallest misdemeanours.
The recent spread of hellbanning on internet forums really is a plague. I've been hellbanned on several sites for no obvious reason at all. On reddit, for example, it was because their spam bot detected that I posted two posts containing the same link within too short a time -.-
I don't see any of those things as "problems". IMO, it should be opt-in, on a per-project basis. There could be another checkbox in each project's "Settings" tab. The default should be that people are left alone.
We're not opposed to bots or services. We encourage it, and use one ourselves. The key is making it opt-in so it doesn't bother people that don't want it.
Travis CI is a popular addon, but they don't have a bot that runs tests and tries to get you to set up their service. They just focus on providing a badass service that you _want_ to set up.
Edit: You 'opt' in to a bot one of two ways:
1. You add their GitHub Service to your repository (see the Service Hooks tab of your Repository Settings). This is how Travis CI started out.
2. You set up an OAuth token with the service. Travis does this now, and provides a single button to enable CI builds for one of my repositories.
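Either way you end up with a hook attached to the repo. For the first route, a hedged sketch of what creating one through the repo hooks API might look like (assuming the endpoint accepts a service name plus a service-specific config dict, which is roughly what the Service Hooks tab drives; `build_hook_request` and the config keys are illustrative):

```python
import json

def build_hook_request(owner, repo, service, config):
    # Hooks live under the repo:
    #   POST /repos/:owner/:repo/hooks
    # `service` names the integration and `config` carries whatever
    # that integration needs (e.g. Travis wants your Travis token).
    url = "https://api.github.com/repos/%s/%s/hooks" % (owner, repo)
    body = json.dumps({"name": service, "config": config, "active": True})
    return url, body
```

The point of both routes is the same: the repo owner takes an explicit action, so nothing happens to repos whose owners never asked.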
What about splitting them off into a separate interaction lane to eliminate the noise they create elsewhere? A "Bot pull requests" tab. Make it passive, so it does not trigger notifications, emails or other active communication, but is available and streamlined.
Good thing he likened it to something we can all relate to.
My personal server returns a 410 to robots.txt requests.
If someone has a good track record of useful pull requests, would you mind if they contributed to your project?
Would you care if it was really easy for them to write that helpful code because they've crafted the ultimate development environment that practically writes the code for them? So why do you care if the editor actually writes all the code for them?
That's essentially what's happening when someone writes a bot and it makes a pull request.
Sure, it sucks if there are unhelpful bots or people spamming up a storm of pull requests. But the solution to this problem is not to ban all bots or all people - it's to develop a system that filters the helpful "entities" from the unhelpful ones. This might be hard in some fields like politics and education, but in software development this is tractable, right now.
I sincerely hope that this is what actually happens. This is one of the first steps towards a world where common vulnerabilities are a thing of the past because whenever one is committed, it is noticed and fixed by the "army of robots". When an API is deprecated, projects can be automatically transitioned to the new version by a helpful bot. Where slow code can be automatically analyzed and replaced.
There are details to be figured out, an ecosystem to be constructed, perhaps more granular rating systems to be made for code producing entities (human or bot). Because it's "easier" for a bot to send a pull request, the standard of helpfulness could perhaps be higher. Communication channels need to be built between coding entities, and spam detection will become more important. But simple blocking and a cumbersome opt-in system is not a good solution.
This might be a stopgap until better systems are built, but it is not something we should be content with.
1. real people (they can make regular pull requests)
2. bots advertising a code/assets improvement service (these should never take the form of pull requests; you should see them as ads, with the opportunity to disable them, and GitHub could try to get some revenue by taxing the guys advertising through this)
3. smart "code bots" that could actually do what you say: maybe at first start by doing code reviews, then static code analysis, then even start refactoring your code or writing new code, who knows... but you would have these in a different tab, like "robot pull requests", at least until we have human-level general AI :) ...for the same reason that you have different play/work spaces for adults, children and animals (you don't want your son and your neighbors' pets running around your office or bumping into you in the smoking lounge of a strip club!)
EDIT+: What the bot owner did in this case was to advertise without paying the guy on whose land he placed the billboard (and on whose land he himself stays without paying rent), except that it's much more intrusive than a regular billboard you can ignore!
The gp's solution seems more adaptive and open to the unknown.
1. Pull requests or issues filed by real human beings for non-advertising purposes (using the equivalent of a "spam filter" for them)
2. Any other stuff! - I want this labeled as "something else", regardless of whether it's useful or spammy, real bots or "human-bots" sending me ads.
The future the gp suggests is great, and I want it, but for now I want a clear distinction between "ham" and "spam", and it's probably better to separate "really human-made content that's not advertising" and call everything else "possibly spam". If the need arises, they can start filtering the real spam. For now I just want everything that doesn't come directly from a human labeled as "bot pull requests" or "bot issues" or anything else, but labeled!
You might log in one day and find that your repo has pull requests from fifteen image optimizing bots, thirty-eight prettifying bots for different languages, four .gitignore patching bots, seven <!DOCTYPE inserting bots, eight JS semicolon removers, nine JS semicolon inserters, twenty-four subtly broken MySQL query sanitizers, and seventy-nine bots fighting over the character encoding of the readme file.
These slippery slope arguments are a bit silly. If you're running an open-source project, you can either accept PRs or not, and if you're accepting them, you can review the code and approve it or not approve it. A PR from a bot is the same as a PR from anyone else, it's either helpful or not helpful. It's not currently a problem, and it's too early to speculate about worst-case future scenarios.
You're missing a key point here. The difference between PRs from bots and people is that there is a balance of effort on the part of the person giving the PR that bots do not have. PRs take time and effort to evaluate on the part of the repo owner. A PR from a person has a higher chance of being a meaningful change as the person had to spend their own time and effort to offer it. There is also a consideration of the nature of PRs from bots. They will necessarily be of a certain class of actions and in the vast majority of cases be low effort, low impact changes. Having these compete with PRs from actual people is not a good direction for github.
Also, the slippery slope retort is getting tired. We can and should use our reasoning skills and historical precedent to evaluate likely uses of new rules (you'd be negligent if you didn't). In the case of bots, the possibility of repo owners being spammed with low-value changes is enough to disallow it.
Humans can easily look at the existing pull requests and see if their work is redundant. Bots can't. And as hackinthebochs said, human-submitted pull requests involve effort which limits them.
It's not really a slippery slope argument. It's more an application of what we can see having happened with bots in the past. Email spam for legitimate offers is, after all, just about as annoying as email spam for scams.
Is that what they said about SMTP and spam?
It would also make GitHub a much more intimidating place to submit code. I don't want those things for my smaller projects.
The account in question has been closed now, but it submitted thousands of pull requests to random projects to advertise their build service.
Even if it was the best build service in the world, as a hacker I wouldn't personally use it.
That was also a troll/spam bot that had no affiliation to Travis CI. Their GitHub integration has always been classy.
Either they insert something in your pictures that you don't realize is there, or just good ol' spam (like Travis4all), or a backdoor that looks like a fix…
If you're not on a Mac, the individual tools are still quite usable. Here are some of them.
I'm not sure he'd have written that he "likes the idea" and would have failed to mention Service Hooks if he did.
"Wouldn't it be a great idea if there was a 'hook' mechanism you could opt into that provides a way to add additional functionality to their site from third parties?"
Or maybe not. I don't know, text doesn't convey emotion and body language.
I would love for github to make bots something that you can subscribe to on a "bot subscription page". I think they can be incredibly useful so long as they aren't promiscuous, unwelcome and frequent enough to be seen as spam. You should be able to handle these the same way you handle permissions for third-party apps on Facebook or Twitter. The subscription page could also provide bot ratings and suggest bots that are likely to be useful for your project.
This approach would also create a way where these apps could be useful for private repos as well.
What if there was a community-vote that turned a bot and a particular version of said bot from Opt-Out (app style) to Opt-In (bot style)?
I, for one, welcome our bot-coding overlords that clean up my code and optimize it on each commit. Might save me a lot of time and a lot of power and thought... if it's peer reviewed, like all open source software.
I prefer when people have to specifically say 'yes, I want my dead body to go to waste instead of saving lives'.
Nuuton is currently crawling the web. The plans include crawling GitHub (actually, GitHub has a specific and exclusive crawler built for it). Is that permitted? If so, what are the rules? If not, to whom may I speak regarding it? I know DuckDuckGo does it, but I don't know if they are crawling your site or just using what the Bing index currently has.
Perhaps now that they've taken money, they aren't as interested in tackling new problems. Perhaps that's reasonable, since they'll need a lot of that money to hire and keep operations folks who can keep the site up.
Do they say "No Thanks" to themselves?
Maybe the title should read: Google Says "No Thanks" to Other Bots