
GitHub Says ‘No Thanks’ to Bots — Even if They’re Nice  - cyphersanctus
http://www.wired.com/wiredenterprise/2012/12/github-bots/
======
briandoll
I'm quoted in the article and I wanted to clear one thing up that was missing
from it.

When bots get reported to us by people using GitHub, our support folks reach
out to the bot account owner and encourage them to build a GitHub service[1]
instead. As a service, the same functionality would still be available to
everyone using GitHub, but it would be opt-in instead.

A few months ago we heard from some developers of service integrations that
beyond the existing API features, it would be handy to be able to provide a
form of "status" for commits. We added the commit status API [2] in September
to accommodate that. We're always open to feedback on how the API and service
integrations can improve.

The point is, GitHub services are a much better way to build integrations on
GitHub.

[1] <https://github.com/github/github-services> [2]
<https://github.com/blog/1227-commit-status-api>
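For anyone curious what using the commit status API looks like in practice, here is a rough sketch using only Python's standard library. The owner, repo, SHA, and token are placeholders, and the field names follow the API documented at the link above:

```python
import json
import urllib.request

# Hypothetical placeholders -- substitute your own repo, commit, and token.
OWNER, REPO, SHA = "octocat", "example-repo", "deadbeef"
TOKEN = "your-oauth-token"

def build_status_request(state, description, target_url):
    """Build a POST request that sets a commit's status.

    `state` is one of: pending, success, error, failure.
    """
    payload = {
        "state": state,
        "description": description,
        "target_url": target_url,
    }
    url = f"https://api.github.com/repos/{OWNER}/{REPO}/statuses/{SHA}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"token {TOKEN}"},
        method="POST",
    )

req = build_status_request("success", "All tests passed",
                           "https://ci.example.com/build/1")
print(req.full_url)
```

A CI service would send one of these with `state=pending` when a build starts and another with the final result when it finishes.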

~~~
sillysaurus
Well, devs are always going to create bots. They're a fact of life.

Why not establish an opt-out convention similar to robots.txt? The idea is
that people who want to opt-out would create a ".robots" file in their repo,
with "none" in it. Any bot that doesn't respect the .robots file is
hellbanned.

The problem with opt-in is that people won't use it unless they (a) know it's
available, (b) know how to get it, and (c) actually go get it. So people don't
really do that. But establishing an opt-out convention like this solves the
problem entirely, and it's simple.
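The convention proposed above is hypothetical, but a well-behaved bot honoring it might look something like this sketch. The `.robots` file name and the `"none"` sentinel come from the comment above; `fetch_file` is a stand-in for a real repo-contents lookup:

```python
def bot_allowed(fetch_file):
    """Return True if the repo permits bots under the proposed convention.

    `fetch_file(path)` returns the file's contents as a string, or None
    if the file doesn't exist (a stand-in for a real repo API call).
    """
    contents = fetch_file(".robots")
    if contents is None:
        return True  # no .robots file: bots allowed by default (opt-out)
    return contents.strip().lower() != "none"

# Simulated repos, represented as path -> contents dicts:
opted_out = {".robots": "none"}
no_file = {}

print(bot_allowed(opted_out.get))  # False
print(bot_allowed(no_file.get))    # True
```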

~~~
thwarted
Opt-out sucks because you are forced to deal with it. Opt-in makes the most
sense because there are no hurdles to jump in order to _not_ be bothered.

If no one finds or wants to use your service without it being forced on them,
it cannot be that great of a service.

~~~
zem
you could have a single opt-in to bots-as-a-whole (as a github account setting
rather than per-repo), and then blacklist individual bots if they end up being
spammy. that seems like a nice balance between "do nothing by default" and
frictionless adoption of the feature.

~~~
thwarted
While this is a fine idea, why can't that be "opt-out of bots-as-a-whole (as
a github account setting rather than per-repo), and then whitelist individual
bots if you want to find out whether they provide some utility"? That seems
like the proper balance between "do nothing by default" and the least friction
required to use bots, without requiring those who don't want to be involved at
all to do anything.

If it's about discovery of bots, then come up with some meaningful way to make
them discoverable (presumably better than the Apple App Store _zing_ ) without
people needing to be exposed to them by default.

~~~
zem
whitelisting individual bots is not low-friction. ideally i want the bots to
discover me, not vice versa.

~~~
thwarted
You do, but no one else does.

How about a header in email that the sender can include that forces the email
to the top of your inbox and makes it undeleteable? You can opt-out by having
an email address of user+optout@example.com all the time.

~~~
sillysaurus
_You do, but no one else does._

How small-minded. There are many of us who want to be discovered by bots. We
just keep quiet because people like you have such strong opinions and aren't
afraid to be mean about them.

Wouldn't it be ironic if your opinion was in the minority?

~~~
thwarted
The people who want opt-in are not the bad guys here, and neither are the
people who want opt-out. The bad guys are those who poisoned the well (a
tragedy of the commons) by abusing the system. Unfortunately, there is a
greater chance of abuse than there is of utility, and it's more difficult for
everyone to manage opt-out on an individual-occurrence basis than it is to
manage opt-in centrally by _trying_ to ban bots across the board. And neither
approach is actually that successful, despite individual successes in some
communities.

------
technoweenie
We got a lot of angry feedback about the whitespace bot that was roaming
GitHub for a while. We tried to sit back and let people deal with it
themselves (e.g. send feedback/patches to the bot owner).

We're not opposed to bots or services. We encourage them, and use one
ourselves. The key is making it opt-in so it doesn't bother people who don't
want it.

Travis CI is a popular addon, but they don't have a bot that runs tests and
tries to get you to set up their service. They just focus on providing a
badass service that you _want_ to set up.

Edit: You 'opt' in to a bot one of two ways:

1\. You add their GitHub Service to your repository (see the Service Hooks tab
of your Repository Settings). This is how Travis CI started out.

2\. You set up an OAuth token with the service. Travis does this now, and
provides a single button to enable CI builds for one of my repositories.

~~~
jayferd
I distinctly remember a Travis bot sending me like 4 pull requests that added
a `.travis.yml` file...

~~~
stock_toaster
That was a troll bot that someone not affiliated with Travis wrote, as I
recall.

------
StavrosK
> But here was a pull request from a GitBot. Bots don’t debate. “It’s like the
> first time you see a self-driving car on the road,” Michaels-Ober says.

Good thing he likened it to something we can all relate to.

------
calpaterson
In case any GitHub people are reading this: you also have an annoying approach
to web-crawling "robots". Your /robots.txt is based on a whitelist of user
agents, with a human-readable comment telling robot authors where to request
to be whitelisted. Using robots.txt to admit only whitelisted robots (like
Google and Bing) is against the spirit of the convention. This practice
encourages robot authors to ignore robots.txt entirely and will eventually
reduce the utility of the whole convention. Please stop doing this!
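To illustrate the complaint: the conventional use of robots.txt is to disallow specific paths or agents, whereas a whitelist-style file blocks everyone not explicitly named. A sketch of the two styles (with made-up agent names, not GitHub's actual file):

```
# Conventional opt-out style: anyone may crawl, except these paths
User-agent: *
Disallow: /private/

# Whitelist style: only named agents may crawl; everyone else is barred
User-agent: ExampleApprovedBot
Disallow:

User-agent: *
Disallow: /
```

An empty `Disallow:` means "nothing is disallowed" for that agent, which is how the whitelist style admits its approved crawlers.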

~~~
sbierwagen
Robots.txt is a suicide note.

<http://www.archiveteam.org/index.php?title=Robots.txt>

My personal server returns a 410 to robots.txt requests.

~~~
tomjen3
I have no clue as to why the author of that shit is as angry as he is, but I
have zero interest in his opinion until such time as he learns to show me the
issues, and not just blindly assume that anybody who is not as enlightened as
him is a blind idiot.

~~~
sbierwagen
Okay.

------
avivo
Git bots may not be that impressive right now. But imagine a future where an
incredibly knowledgeable "programmer" is working with you on every project,
doing lots of the busy work, and even code reviewing every commit _you_ push.
Except that programmer is a bot. This future is possible - but we need to
encourage it, and not shut down the precursor at the first "sign of life".

If someone has a good track record of useful pull requests, would you mind if
they contributed to your project? Would you care if it was really easy for
them to write that helpful code because they've crafted the ultimate
development environment that practically writes the code for them? So why do
you care if the editor actually writes _all_ the code for them?

That's essentially what's happening when someone writes a bot and it makes a
pull request.

Sure, it sucks if there are unhelpful bots or _people_ spamming up a storm of
pull requests. But the solution to this problem is not to ban all bots or all
people - it's to develop a system that filters the helpful "entities" from the
unhelpful ones. This might be hard in some fields like politics and education,
but in software development this is tractable, right now.

I sincerely hope that this is what actually happens. This is one of the first
steps towards a world where common vulnerabilities are a thing of the past
because whenever one is committed, it is noticed and fixed by the "army of
robots". When an API is deprecated, projects can be automatically transitioned
to the new version by a helpful bot. Where slow code can be automatically
analyzed and replaced.

There are details to be figured out, an ecosystem to be constructed, perhaps
more granular rating systems to be made for code producing entities (human or
bot). Because it's "easier" for a bot to send a pull request, the standard of
helpfulness could perhaps be higher. Communication channels need to be built
between coding entities, and spam detection will become more important. But
simple blocking and a cumbersome opt-in system is not a good solution.

This might be a stopgap until better systems are built, but it is not
something we should be content with.

~~~
nnq
You need to _clearly distinguish between_ :

1\. real people (they can make regular pull requests)

2\. bots advertising a code/assets improvement service (this _should never be
in the form of pull requests_ ; you should treat these as _ads_ , you should
have the opportunity to disable them, and GitHub could try to get some revenue
by charging the guys advertising through this)

3\. smart "code bots" that could actually do what you say: maybe at first
start by doing code reviews, then static code analysis, then even start
refactoring your code or writing new code, who knows... but _you would have
these in a different tab, like "robot pull requests"_ , at least until we
have human-level general AI :) ...for the same reason that you have different
play/work-spaces for adults, children, and animals (you don't want your son
and your neighbors' pets running around your office or bumping into you in the
smoking lounge of a strip club!).

EDIT+: What the bot owner did in this case was to advertise without paying the
guy on whose land he placed the billboard (and on whose land he himself stays
without paying rent), except that it's much more intrusive than a regular
billboard you can ignore!

~~~
gbog
Those categories are artificial. What about a bot that finds patches to send
but has a human review them? And what about an army of humans sending spam
PRs, the way they create fake accounts on Facebook?

The GP's solution seems more adaptive and open to the unknown.

~~~
nnq
(3) is artificial, at least for now. But I will always want to see the
difference between:

1\. Pull requests or issues filed by real human beings for non-advertising
purposes (using the equivalent of a "spam filter" for them)

2\. Any other stuff! - I want this labeled as "something else", regardless of
whether it's useful or spammy, from real bots or "human-bots" sending me ads.

What the GP suggests is a great future, and I want it, but for now I want a
clear distinction between "ham" and "spam", and for now it's probably better
to separate out "really human-made content that's not advertising" and call
everything else "possibly spam". If the need arises, they can start filtering
the real spam. For now I just want everything that doesn't come directly from
a human labeled as "bot pull requests" or "bot issues" or anything else, but
labeled!

------
philfreo
A bot which does lossless compression on images in open source projects and
only submits a pull request (with all the relevant details) if there was a > X
percent filesize savings? That's not spam, that's just helpful...
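Such a bot's core decision (open a PR only above some savings threshold) is simple to state. A minimal sketch, where the threshold and file sizes are of course made up:

```python
def worth_a_pull_request(original_bytes, compressed_bytes, min_savings_pct=5.0):
    """Decide whether a lossless recompression justifies a pull request.

    Only propose the change if it saves more than `min_savings_pct`
    percent of the original file size.
    """
    if original_bytes <= 0 or compressed_bytes >= original_bytes:
        return False  # no file, or recompression didn't actually shrink it
    savings_pct = 100.0 * (original_bytes - compressed_bytes) / original_bytes
    return savings_pct > min_savings_pct

print(worth_a_pull_request(100_000, 80_000))  # True: 20% saved
print(worth_a_pull_request(100_000, 99_000))  # False: only 1% saved
```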

~~~
mistercow
Potentially, yes, but what if the idea catches on and you have swarms of
overlapping bots submitting pull requests? And what about bots that are well-
intended, but dubiously helpful?

You might log in one day and find that your repo has pull requests from
fifteen image optimizing bots, thirty-eight prettifying bots for different
languages, four .gitignore patching bots, seven <!DOCTYPE inserting bots,
eight JS semicolon removers, nine JS semicolon inserters, twenty-four subtly
broken MySQL query sanitizers, and seventy-nine bots fighting over the
character encoding of the readme file.

~~~
viscanti
What about the well-intentioned but dubiously helpful PR from someone who just
doesn't know what they're doing? What if that were to catch on and you have
swarms of overlapping non-programmers submitting PRs?

These slippery slope arguments are a bit silly. If you're running an open-
source project, you can either accept PRs or not, and if you're accepting
them, you can review the code and approve it or not approve it. A PR from a
bot is the same as a PR from anyone else, it's either helpful or not helpful.
It's not currently a problem, and it's too early to speculate about worst-case
future scenarios.

~~~
mistercow
>What if that were to catch on and you have swarms of overlapping non-
programmers submitting PRs?

Humans can easily look at the existing pull requests and see if their work is
redundant. Bots can't. And as hackinthebochs said, human-submitted pull
requests involve effort which limits them.

It's not really a slippery slope argument. It's more an application of what we
can see having happened with bots in the past. Email spam for legitimate
offers is, after all, just about as annoying as email spam for scams.

~~~
thomaslangston
Why can't bots look at existing pull requests?

~~~
mistercow
They _can_ , but it's unrealistic to think that they will do so one percent as
intelligently as a human contributor.

------
atsaloli
I'd like to optimise my images. (The images on my website.) I looked at
<https://github.com/imageoptimiser> but didn't see which tool would do that,
or any way to contact the author. Is there an image optimisation tool in there
somewhere?

~~~
nwh
If you're on a Mac, ImageOptim is the perfect tool that combines a bunch of
open projects to crush down images. I use it daily, it's an incredible (free)
tool.

<http://imageoptim.com/>

If you're not on a Mac, the individual tools are still quite usable. Here are
some of them.

<http://advsys.net/ken/utils.htm> (pngOUT)

<http://optipng.sourceforge.net/>

<http://pmt.sourceforge.net/pngcrush/>

<https://github.com/kud/jpegrescan>

<http://freecode.com/projects/jpegoptim>

<http://www.lcdf.org/gifsicle/>

~~~
kawsper
Wouldn't that project also benefit from jpegtran?

~~~
nwh
ImageOptim includes it, I just couldn't remember the full name at the time.

------
lazyjones
I wouldn't mind bots that fix spelling mistakes in comments or even actual
bugs in code. But why not let github projects be configured to allow certain
kinds of bots?

~~~
jevinskie
Consider bots as "plugins" that you activate on a per-repo basis. I like this
idea!

~~~
masklinn
Isn't that exactly what Service Hooks or OAuth-authed systems provide?

~~~
uxp
I think that's what jevinskie was implying: this is a non-issue since service
hooks are already established, so there is no reason bots should be allowed;
they should plug into the correct API.

~~~
masklinn
> I think that's what jevinskie was implying

I'm not sure he'd have written that he "likes the idea" and would have failed
to mention Service Hooks if he did.

~~~
uxp
I read his comment as either sarcasm or passive aggressiveness.

"Wouldn't it be a great idea if there was a 'hook' mechanism you could opt
into that provides a way to add additional functionality to their site from
third parties?"

Or maybe not. I don't know, text doesn't convey emotion and body language.

------
malandrew
Right now they can be an annoyance, but this is something that could easily
become a great feature of GitHub, the same way the @mention and #hashtag
innovations came from the Twitter community.

I would love for github to make bots something that you can subscribe to on a
"bot subscription page". I think they can be incredibly useful so long as they
aren't promiscuous, unwelcome and frequent enough to be seen as spam. You
should be able to handle these the same way you handle permissions for third-
party apps on Facebook or Twitter. The subscription page could also provide
bot ratings and suggest bots that are likely to be useful for your project.

This approach would also make these apps useful for private repos as well.

------
toobulkeh
Sounds like a debate between opt-in and opt-out. Why not both? Do an AB test
of a Bot vs. a Service. In some cases, opt-in is good (see: organ donors), in
other cases it's bad (see: Internet Explorer).

What if there was a community-vote that turned a bot and a particular version
of said bot from Opt-Out (app style) to Opt-In (bot style)?

I, for one, welcome our bot-coding overlords that clean up my code and
optimize it on each commit. Might save me a lot of time and a lot of power and
thought... if it's peer reviewed, like all open source software.

~~~
zalew
> opt-in is good (see: organ donors)

I prefer when people have to specifically say 'yes, I want my dead body to go
to waste instead of saving lives'.

~~~
tomjen3
Hey, when you put in price-controls don't complain about a lack of supply.

------
tocomment
I personally would use gists a lot more if they were indexed by google. As it
is I feel like I'm putting code down a black hole when I create a gist.

~~~
jQueryIsAwesome
Good idea; but maybe only the ones with a title and description, to index the
ones that have a clear purpose and not some random code with no indication of
how to use it or what it's for.

------
orangethirty
Question to the Github team:

Nuuton is currently crawling the web. The plans include crawling GitHub
(actually, GitHub has a specific and exclusive crawler built for it). Is that
permitted? If so, what are the rules? If not, to whom may I speak regarding
it? I know DuckDuckGo does it, but I don't know if they are crawling your site
or just using what the Bing index currently has.

~~~
jgeralnik
Not connected to github, but look at <https://github.com/robots.txt>,
specifically the first 2 lines.

~~~
tomjen3
So yes, but only if you change your bot name.

------
lukeholder
I do think bots can be a great part of software development. I love the likes
of Travis CI and Code Climate integrating with GitHub - GitHub just needs to
build a better app to deal with them. I assume private repos don't have bots
bothering them, but maybe they want to allow some? Checkboxes for the types of
bot services you would like to allow per project?

~~~
technoweenie
We have GitHub Services: <https://github.com/github/github-services>. Anyone
can submit one. We'll probably accept it as long as the code is decent, is
tested and documented, and is for a stable service. If you're running some
custom build on a personal hosting account, use the web hooks. You can attach
web hooks or services to any of these events:
<http://developer.github.com/v3/activity/events/types/>
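As a rough illustration of the web-hook route: a hook is created by POSTing a payload like the one below to the repo's hooks endpoint. The URL and event list here are placeholder examples; the field names follow the v3 hooks API:

```python
import json

def build_hook_payload(url, events):
    """Payload for POST /repos/:owner/:repo/hooks -- a plain web hook
    that delivers the given event types as JSON to `url`."""
    return {
        "name": "web",           # "web" means a generic web hook
        "active": True,
        "events": list(events),  # e.g. ["push", "pull_request"]
        "config": {
            "url": url,
            "content_type": "json",
        },
    }

payload = build_hook_payload("https://ci.example.com/github-hook",
                             ["push", "pull_request"])
print(json.dumps(payload, indent=2))
```

A service like a CI system would register such a hook once, then receive a JSON delivery at its URL each time one of the listed events fires.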

------
badgar
I've been annoyed by GitHub bots and enjoyed their contributions. IMO, GitHub
could/should have taken this opportunity to solve a problem and (once again!)
change how people code for the better through collaboration.

Perhaps now that they've taken money, they aren't as interested in tackling
new problems. Perhaps that's reasonable, since they'll need a lot of that
money to hire and keep operations folks who can keep the site up.

------
rsyncinside
I heard that Google is a "bot".

Do they say "No Thanks" to themselves?

Maybe the title should read: Google Says "No Thanks" to Other Bots

~~~
caf
The title has nothing to do with Google.

