Hacker News

A bot which does lossless compression on images in open source projects and only submits a pull request (with all the relevant details) if there was a > X percent filesize savings? That's not spam, that's just helpful...
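To make the "> X percent" rule concrete, here is a minimal sketch of the gating check. The function names and the 10 percent default are assumptions for illustration only; the comment above does not specify a threshold or an implementation.

```python
def percent_saved(original_size: int, optimized_size: int) -> float:
    """Return the percentage of bytes saved by optimization."""
    if original_size <= 0:
        return 0.0
    return 100.0 * (original_size - optimized_size) / original_size


def should_open_pr(sizes, min_percent: float = 10.0) -> bool:
    """Decide whether the total savings justify a pull request.

    sizes is a list of (original_bytes, optimized_bytes) pairs, one per image.
    min_percent is the hypothetical "X" threshold.
    """
    total_before = sum(before for before, _ in sizes)
    total_after = sum(after for _, after in sizes)
    return percent_saved(total_before, total_after) > min_percent
```

With this shape, a repository whose images only shrink by a few percent never sees a pull request at all.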


Potentially, yes, but what if the idea catches on and you have swarms of overlapping bots submitting pull requests? And what about bots that are well-intended, but dubiously helpful?

You might log in one day and find that your repo has pull requests from fifteen image optimizing bots, thirty-eight prettifying bots for different languages, four .gitignore patching bots, seven <!DOCTYPE inserting bots, eight JS semicolon removers, nine JS semicolon inserters, twenty-four subtly broken MySQL query sanitizers, and seventy-nine bots fighting over the character encoding of the readme file.


What about the well-intentioned but dubiously helpful PR from someone who just doesn't know what they're doing? What if that were to catch on and you have swarms of overlapping non-programmers submitting PRs?

These slippery slope arguments are a bit silly. If you're running an open-source project, you can either accept PRs or not, and if you're accepting them, you can review the code and approve it or not approve it. A PR from a bot is the same as a PR from anyone else, it's either helpful or not helpful. It's not currently a problem, and it's too early to speculate about worst-case future scenarios.


>A PR from a bot is the same as a PR from anyone else, it's either helpful or not helpful.

You're missing a key point here. The difference between PRs from bots and people is that there is a balance of effort on the part of the person giving the PR that bots do not have. PRs take time and effort to evaluate on the part of the repo owner. A PR from a person has a higher chance of being a meaningful change as the person had to spend their own time and effort to offer it. There is also a consideration of the nature of PRs from bots. They will necessarily be of a certain class of actions and in the vast majority of cases be low effort, low impact changes. Having these compete with PRs from actual people is not a good direction for github.

Also, the slippery slope retort is getting tired. We can and should use our reasoning skills and historical precedent to evaluate likely usages of new rules (you'd be negligent if you didn't). In the case of bots, the possibility of having repo owners spammed with low value changes is enough to disallow it.


>What if that were to catch on and you have swarms of overlapping non-programmers submitting PRs?

Humans can easily look at the existing pull requests and see if their work is redundant. Bots can't. And as hackinthebochs said, human-submitted pull requests involve effort which limits them.

It's not really a slippery slope argument. It's more an application of what we can see having happened with bots in the past. Email spam for legitimate offers is, after all, just about as annoying as email spam for scams.


Why can't bots look at existing pull requests?


They can, but it's unrealistic to think that they will do so one percent as intelligently as a human contributor.
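For what it's worth, the mechanical part of that check is simple. Below is a rough sketch, assuming the bot has already fetched the list of files touched by each open pull request; how it fetches them is left out, and `is_redundant` is a hypothetical name. It only catches overlapping file paths, which is exactly the kind of shallow comparison being criticized here.

```python
def is_redundant(candidate_paths, open_pr_paths) -> bool:
    """Return True if any open pull request already touches one of our files.

    candidate_paths: paths the bot wants to change.
    open_pr_paths: one list of changed paths per open pull request.
    """
    candidate = set(candidate_paths)
    return any(candidate & set(paths) for paths in open_pr_paths)
```

A human would also notice a pull request that solves the same problem in different files; this sketch would not.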


They can, but it's unrealistic to think that they will do so one percent as intelligently as a human.


> It's not currently a problem, and it's too early to speculate about worst-case future scenarios.

Is that what they said about SMTP and spam?


A "mark as spam" button would work as long as it could reflect the user's preference toward bots in general.


As the article says, I wouldn't want hundreds of them. Imagine uploading a simple website, only to have your issue tracker harassed by JSLint bots, HTML validators and pull requests correcting your indentation, replacing <b>s with <em>s, etc.


Honestly? That sounds amazing (as long as there's a way to manage the noise).


So some sort of "plugin" interface where you can opt in to certain bots on a per-repo basis? I could really get behind that! 100% automated and unsolicited? No thank you!


Hey, check the Service Hooks tab of your Repository Settings.


It would be really helpful if each had a short summary when you hover over them, as it's an awful lot of names which don't mean much. I came across Travis from elsewhere but never would have guessed it was CI from this list.


yeah, +1. I actually cloned the repo so I could peek at the source for the hooks to all these services I'd never heard of. Was easier (lazier) than googling.

https://github.com/github/github-services/tree/master/servic...


Exactly. I think bots are exactly the wrong shape for this task. What would be better would be a framework for automatically processing the files in a git repository and then generating a commit. GitHub could have an interface for uploading them and for applying them to repositories. You could then apply them yourself, or someone else could fork, apply, and then submit a pull request.


I'd much rather use those tools at my own discretion than have them clutter an area of the site where I expect to see feedback from my human users.

It would also make GitHub a much more intimidating place to submit code. I don't want those things for my smaller projects.


It's not always helpful. Consider code with a similar purpose on github (transforming images, compressing them) with an accompanying image test suite, where modifying these images will break tests.


The problem is that you will also get these: https://github.com/ajaxorg/node-github/pull/45

The account in question has been closed now, but it submitted thousands of pull requests to random projects to advertise their build service.


That was also a troll/spam bot that had no affiliation to Travis CI. Their GitHub integration has always been classy.


I'm pretty sure with a community like GitHub, those advertisements would turn around and hurt their service, instead of help it.

Even if it was the best build service in the world, as a hacker I wouldn't personally use it.


As technoweenie said:

That was also a troll/spam bot that had no affiliation to Travis CI. Their GitHub integration has always been classy.


Not if you don't want your images compressed, e.g. if they are test cases for a face recognition algorithm or indeed for an image compression service. Not everyone is using GitHub for the same stuff, you know.


That seems a pretty rare circumstance. I doubt a single pull request will cause that much bother; just deny it. Then hopefully the bot is smart enough not to resend. If it isn't, block the bot's user from communicating with you again.
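"Smart enough not to resend" could be as little as the bot keeping a record of where it has been turned down. A minimal in-memory sketch follows; the `DeclineLog` class is hypothetical, and a real bot would need to persist this somewhere.

```python
class DeclineLog:
    """Remembers repositories that closed the bot's pull request unmerged."""

    def __init__(self):
        self._declined = set()

    def record_decline(self, repo: str) -> None:
        # Repository names are compared case-insensitively.
        self._declined.add(repo.lower())

    def may_submit(self, repo: str) -> bool:
        return repo.lower() not in self._declined
```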


Actually I think the instances where this bot would be useful are the rare ones. Most images on GitHub are probably for the project's logo or gh-pages branch. It's simply not important that those be compressed, and getting pull requests on things that are not core to the project's purpose is distracting.


Yes. A bot could try to identify what kind of project it was (are there HTML and CSS files, etc.). That might be an interesting project in itself: trying to classify GitHub projects into libraries, web sites, documentation, standalone apps, etc.
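A crude first pass at that classification might just count file extensions. This is only a sketch: the extension sets, the category names, and the 30 and 50 percent cutoffs are all made-up assumptions, not anything measured against real repositories.

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical extension groups; a real classifier would need far more signals.
WEB_EXTENSIONS = {".html", ".htm", ".css"}
DOC_EXTENSIONS = {".md", ".rst", ".txt"}


def classify_project(paths) -> str:
    """Guess a project type from the file paths in a repository."""
    counts = Counter(PurePosixPath(p).suffix.lower() for p in paths)
    total = sum(counts.values())
    if total == 0:
        return "unknown"
    web = sum(counts[ext] for ext in WEB_EXTENSIONS)
    docs = sum(counts[ext] for ext in DOC_EXTENSIONS)
    if web / total >= 0.3:
        return "web site"
    if docs / total >= 0.5:
        return "documentation"
    return "library or app"
```

Telling a library from a standalone app would take more than file extensions, which is probably where it gets interesting.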


Those images don't sound big enough to trigger the bot. So we're back to it being usually useful.


Explain to me how a better png is going to interfere with facial recognition? Are you suggesting some kind of algorithm that finds faces based on Huffman trees or related compression metadata? It sounds unimaginably fragile.


That wasn't a very good example. But you never want to modify any test cases.
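A bot could at least try to honor that by skipping anything that lives under a test-looking directory. A minimal sketch, assuming a hypothetical list of directory names; projects that keep fixtures elsewhere would still get hit.

```python
from pathlib import PurePosixPath

# Hypothetical directory names that usually hold test fixtures.
TEST_DIR_NAMES = {"test", "tests", "spec", "fixtures", "testdata"}


def is_test_asset(path: str) -> bool:
    """Return True if any parent directory of the file looks like a test directory."""
    parent_dirs = PurePosixPath(path).parts[:-1]
    return any(part.lower() in TEST_DIR_NAMES for part in parent_dirs)
```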


That's certainly helpful but there's no doubt that someone would use a GitBot for nefarious purposes disguised as something useful.

Either they insert something in your pictures that you don't realize is there, or just good ol' spam (like Travis4all), or a backdoor that looks like a fix…



