DigitalOcean's Hacktoberfest Is Hurting Open Source (domenic.me)
816 points by domenicd on Sept 30, 2020 | 386 comments



Oh, this is really unlucky. I like Hacktoberfest and always get my T-shirt. Perhaps opt-in would be a great idea.

I can see why this happens, though. I've noticed that a whole bunch of projects have `good-first-issue` being something like "Re-architect module loading system" while most commits are like "correct typo". Like, jeez, man.

The participants are probably just pattern-matching against the commits available.

EDIT: Decided to go look at the spam that OSM got (a project close to my heart) and what the hell, man, look at this diff

<removed>

      * Tom Hughes [@tomhughes](https://github.com/tomhughes/)
      * Andy Allan [@gravitystorm](https://github.com/gravitystorm/)
    +
    +
    + Made with Love
This is just awful! I really feel for the maintainers. This user is just adding nonsense to a bunch of places.

EDIT again: Whoops, guys, I didn't mean to cause more spam to the project. Removed the diff link. Jesus Christ, I ended up becoming the villain I was complaining about by linking it.


Sometimes it feels like software development has devolved into a sea of posturing and marketing oneself. It's faintly depressing to see reminders of that trend which are as stark as this.

It's a fitting allegory, though. This contest has used free T-shirts to solicit open-source contributions in the same way that the industry has used high salaries to solicit creative and impactful contributions.

Now imagine that you're trying to fill a position, and these pull requests are analogous to candidate interviews. You might start to have some sympathy for people who believe that we need some sort of professional certification for the trade.

It's unfortunate considering the democratizing promise of low-cost computing, but how else can you effectively deal with this "market for lemons" caused by large swaths of people acting in blatant bad faith?

    When seeming is taken for being, being becomes seeming.
    When nothing is taken for something, something becomes nothing.


> Sometimes it feels like software development has devolved into a sea of posturing and marketing oneself.

I've been saying it for a long time, but the reason that this and other problems (like high developer burnout) seem like especially bad problems that the world of "software development" is facing is primarily because they're especially bad in the GitHub culture (and as a consequence of that culture), and the developers who are experiencing the worst of it are part of that community. Ditch the 'Hub, its userbase, and what is considered "best practice" there, and then many of these problems get dialed back a lot.

Much like follower counts on other social media sites, GitHub's contribution graph and profile timeline should have never been public. They should have been neat features of your personal dashboard that you alone are able to see when you're signed in—providing some form of encouragement à la the Seinfeld hack and to help you manage your work—but not for others' eyes. The gamification of "social" leads to degenerative behavioral patterns.


> Ditch the 'Hub, its userbase, and what is considered "best practice" there, and then many of these problems get dialed back a lot.

This sounds like "Without GitHub you will get less spam", which is probably true, but I think the reason is not "github is bad", it's: Less people will find your project.

Maybe that's a worthwhile trade-off, but it's very different from "all will be better without Github"


Less people finding your project could be an improvement if the people you lose are the ones that just create the kind of spam contributions mentioned in the article. You don't need GitHub for useful software to become popular, it is just one channel. And you definitely don't need the gamified bullshit like total stars on your profile.


The spam problem was caused by the gamification. It probably would occur with any platform.


> Less people will find your project

That is false. Firstly, even if your project is not hosted on github, clones of it will appear on github anyway.

Secondly, planting yourself in the middle of a vast ocean of garbage is not a good strategy for being found. You might be thinking of the Github of twelve years ago.


> Maybe that's a worthwhile trade-off, but it's very different from "all will be better without Github"

Making up quotes is not cool; those aren't my words, and that's not my position, so I'm not going to be gulled into defending it or kept from calling attention to what amounts to a sleight of hand here, even if it wasn't intentional.

(And this really chafes, because after I wrote what I meant, I even revised it to pre-empt[1] getting sucked into a discussion where someone responds to the wrong reading—specifically trying to avoid things like this. But when people don't even respect the constraint of sticking to others' actual words and instead conjure up other words that make for a more convenient world[2] to operate in, then there's almost nothing that can be done.)

> This sounds like "Without GitHub you will get less spam"

Well, it shouldn't; that's reductive.

If the bad stuff that arises from GitHub, its culture, and its practices were proportionate to its size, that would be one thing. (But also not itself a good reason not to consider ditching it—just like it's not obviously true that it would be a good idea to use Windows because the risk of malware is rational given its size as a target.) What's bad about GitHub, though, might in fact be disproportionate to its size—and in some cases, especially with respect to the practices that get promoted in that world, are things that are bad irrespective of GitHub's size.

1. https://pchiusano.github.io/2014-10-11/defensive-writing.htm...

2. https://wiki.lesswrong.com/wiki/Least_convenient_possible_wo...


I have no skin in this game, but I find that "all will be better without Github" is a reasonable short approximation of your "Ditch the 'Hub, its userbase, and what is considered "best practice" there, and then many of these problems get dialed back a lot.".

You have explained your criticism of GitHub, and I agree that it should have done things differently from the beginning. Still, your proposed solution for users is literally to "Ditch the 'Hub", promising that "many of these problems get dialed back a lot". It's really not a far stretch to "all will be better without Github".


> Still, your proposed solution for users is literally to "Ditch the 'Hub", promising that "many of these problems get dialed back a lot".

This is accurate. Let's let that be the place we stay.

> I find that "all will be better without Github" is a reasonable short approximation [...]

Well I don't, and it's my position, isn't it? It's not accurate. I don't think that "all will be better without GitHub"—and what's more is that I practice the "without GitHub" part; I have the firsthand experience to be able to say it's not true, so I wouldn't try to tell anyone that it is—and I didn't. I'm responsible for my own ideas, not ones imagined upon me.

Moreover, if I argue that A and B are not equivalent and that I prefer to deal with A in its original form and not B, and you argue that they are equivalent, it's not rational for either party to insist that we deal with B in place of A. So let's not.


The effect of those things on github is minuscule compared to that of medium and twitter. I never even looked at anyone else's github profile. Most of the time it's medium and twitter that take me to their projects and I only evaluate the project with the context of who they are on twitter or what they've written on medium.


I've seen quite a few people market themselves by saying how many stars they have on github, and some even started putting something like "If you find this useful, please star it" in their documentation. It's not quite "like, share & subscribe" yet, but it's on the way. Any public metric will be optimized for, I guess.


Starring a project increases visibility because anyone who follows the person who starred it will see it too. It is a question of marketing.


Why should malware writers be the only ones whose code goes viral, huh? /s


> Most of the time it's medium and twitter that take me to their projects and I only evaluate the project with the context of who they are on twitter or what they've written on medium.

Please revisit my original comment. When I wrote it, I put some effort into qualifying things to make it clear that I'm not talking about just what happens on GitHub on the site. I referred to its culture. The things you just described are part of that culture, and very notable elements of it.


I agree but calling that "github culture" is unfair. It's a culture that emerged independently from github and like I said those metrics on github should not be the target for criticism. I found them pretty useless but some metrics can be good in a team context. I don't think they are a major part of the problem, let alone part of the cause.


> I don't think


it all started when people blogged about their programming tricks way back, and hacker news should be part of the problem before github is. Github at least has functions other than a social hub.


I’d argue getting a shirt for some PRs is better than the status quo, where you get nothing for a PR.

Which just goes to show how bad the status quo is.


> where you get nothing for a PR

You already got payment up front: software that the author(s) have made available to you for free.

You get payment by the author spending time to review your changes.

You also get payment afterward: free maintenance for your pet feature. (Not guaranteed of course but generally the case.)


“But we obtain the puzzling result that, when rewarded, volunteers work less. These findings are in line with a large literature in social psychology emphasizing that external rewards can undermine the intrinsic motivation for an activity.”

Be very careful assuming that a payment motivates open source developers. If you offered to help me do something for an hour for whatever internal motivation you might have, and afterwards I offer you $5 for your time, you would likely be demotivated.


> If you offered to help me do something for an hour for whatever internal motivation you might have, and afterwards I offer you $5 for your time, you would likely be demotivated.

On the other hand if you offered to buy them a beer or a coffee it would probably be very motivating. I'm not sure why this is. Maybe cash just feels lazy and impersonal, so the amount being offered has to be big enough to counter that feeling.


It's the implicit conversation with the beer-buyer that matters. You are offering your time (which is at least valuable to you) along with the beer. If you just buy a beer and leave instantly, it's gonna be worse than $5


> It's the implicit conversation with the beer-buyer that matters.

That is certainly a plus, but I don't think that's all of it. If I was working on something for a friend that they had no idea how to help with, and at some point they dropped off a snack or drink as a thank you, I would be happy about that. If they offered me a $100 bill I would be offended because they're not my boss, and my time is worth more than that.

I want to say that a small token of appreciation feels better than having my value quantified to an insulting amount. However, it may be even more basic than that. Food is a powerful reward, there is a reason it is used to train animals. It also could be that introducing money makes something feel like an obligation.


Offering $5 feels like it's ascribing a low value to the time/effort to help. Offering a coffee or beer feels like a better gesture/token of appreciation. It's not about the value then but about the gesture.

For anyone who is working, people are "giving" them $5 all the time. But they probably don't have people buying them coffee/beer/lunch all the time.


That quote might be misleading:

"Volunteer work is an increasingly large, yet ill-understood sector of the economy. We show that monetary rewards undermine the intrinsic motivation of volunteers."

-- https://ideas.repec.org/p/zur/iewwpx/007.html

The earlier sentence makes it clear they are talking about monetary rewards specifically, not any kind of reward. A t-shirt might notionally have a $ value, but it is not a monetary reward. Plus, the nature of a branded t-shirt has an obvious team-participation / prestige value.

If rewards demotivate people, we should also avoid positive recognition, or praise, which is a form of reward; of course this is unintuitive, so I assume monetary rewards are a special case.


It is tricky.

We all hear the stories of the person who saves a company a million dollars, and then gets given a coffee cup (or an attaboy) for recognition.

It gets even trickier when the giver and the receiver have wildly different incomes.

Generally I find money to be a terrible proxy for what I actually value, but other options are worse proxies!


I actually would be very motivated. From the perspective of a contributor the worst thing that can happen is that their patch gets rejected or ignored. If you get paid for something it means someone actually wants your contribution. It's less likely to be ignored or rejected outright.


http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.511....

Here is the work that you cited, if I am not wrong.


That's missing the point a bit.

Good PRs are valuable, but require lots of work. Spam is not valuable, but it also does not require basically any work.

Hacktoberfest rewards both good PRs and spam equally. With those incentives, what are people logically going to produce?


On the other hand, what kind of people are going to logically participate in an event where the reward is a Hacktober branded T-shirt? What are their likely motivations?


> what kind of people are going to logically participate in an event where the reward is a Hacktober branded T-shirt?

Well, the answer is clearly "a lot of people," unless you can think of an alternate explanation for the increase in spam during Hacktober.

Why they want a damn T-Shirt so much is a totally valid question, though. I don't understand that either.


I submitted very few PRs to open source projects. I usually submit bug reports.

In both cases I get that what I care about gets into the project. It's enough for me.


My pet theory is that 'marketing oneself' trend is a sign of oversaturation (and forthcoming commoditisation).

This happened a few years back in UI design when designers started to put more work into presentation of projects than projects themselves.


> When seeming is taken for being, being becomes seeming

This quote can be applied to organizations on the wane.


> devolved into a sea of posturing and marketing oneself

Don't think it's devolved at all; this has been the norm from the '80s onwards (perhaps earlier).

Look at how fractured open source is today and the sort of egos that come with it everywhere you look. While the points made in this article are valid, it's great that someone is incentivising people to interact with various projects rather than do their own thing and climb blindly up the same treacherous mountains others climbed long ago.

Hey, also why not rewrite it in rust :)

Being a contrarian is easy, fixing these well acknowledged problems is hard.


I think they should completely reverse their approach: instead of instigating people to create senseless PRs, DigitalOcean should use the GitHub API to find contributors who made meaningful first contributions to existing projects over the last year, and send them T-shirts... in October?
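A rough sketch of the selection step in that proposal, assuming the merged-PR records have already been fetched somehow. The `MergedPR` type, the `min_lines` threshold, and the one-year window are all hypothetical: the comment does not say how "meaningful" would be measured, and no real GitHub API call is shown here.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class MergedPR:
    # Hypothetical record shape; not a GitHub API response.
    author: str
    repo: str
    merged_on: date
    lines_changed: int


def first_time_contributors(prs, today, min_lines=10, window_days=365):
    """Return authors whose first merged PR to some repo landed in the window.

    min_lines is only a stand-in for "meaningful"; diff size is a weak
    signal and a real system would need something better.
    """
    cutoff = today - timedelta(days=window_days)
    first_pr = {}
    # Walk PRs oldest first so setdefault keeps the earliest per (author, repo).
    for pr in sorted(prs, key=lambda p: p.merged_on):
        first_pr.setdefault((pr.author, pr.repo), pr)
    return sorted({
        pr.author
        for pr in first_pr.values()
        if pr.merged_on >= cutoff and pr.lines_changed >= min_lines
    })
```

Because the reward is computed after the fact from merged work, there is nothing to gain from opening throwaway PRs in October.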


I think a major part of this is about pushing devs who haven't contributed before to contribute such as new cs students.


That may be the intention, but what's happening is basically vandalism: https://github.com/search?o=desc&q=is%3Apr+%22improve+docs%2...

~1000 single line/word pull requests in the last 3 hours, almost all worthless rubbish. The scale of the problem is pretty severe.

To pick a single example: https://github.com/Geng-WD/websiteTest

A little personal Java project, unchanged in 3 years, where a new "contributor" has submitted a pull request where they add a comment with their own name in one commit, and then remove it and replace it with "Awesome coding website" in a second commit. 19 hours earlier another new "contributor" submitted a pull request to add a completely irrelevant mock gym web page (and they've done the same thing to a bunch of other randomly chosen repos).

If these people had to demonstrate valuable contributions over a longer span of time, I don't think any of this nonsense would be happening. There's no reason new CS students can't be respectful and put a little effort into doing something worthwhile, and they'll learn a lot more than from this mindless spamming.


After looking at a few dozen accounts and their PR histories, the spammers seem to be just searching for repositories that have the word "website" in their description and picking a random set of 5 to send random requests to.


This is exactly the way it had been demonstrated in that video.


That would mean DigitalOcean has to do work. It's probably easier for them if the open source maintainers have to do the work.


but this would take time and imagination and is so much harder.


You may have missed it, but the article screenshots a similar example where the spammer added "### Great Work" to the front of the README.


I’ve heard of people spamming readme commits to get their github graph green... I guess if an employer is tricked by something like this they are kinda asking for what they get.


That’s a failure of imagination and evidence of having never read the Git manual pages, because the GitHub contribution graph respects commit time (which you absolutely can override), and it’s a trivial hour-long project to write messages on it. I had greenscale pixel rendering of images on the contribution graph working in about an hour, most of which was figuring out the heuristics for quantity in each cell (hint: it’s not as complex as you think).

I can’t overstate the simplicity of doing it, and I’d be nervous about someone taking the other approach as indicative of their technical depth. Then again, they’re already spamming READMEs so it’s not as if it was a strong signal to start with.
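For the curious, a minimal sketch of the idea, not the commenter's actual code. It assumes the graph has one row per weekday starting on Sunday, and the `commits_per_level` mapping from intensity to commit count is a made-up placeholder for the heuristics mentioned above. `GIT_AUTHOR_DATE` and `GIT_COMMITTER_DATE` are the standard Git environment variables for overriding commit timestamps.

```python
from datetime import date, timedelta


def commit_plan(pattern, first_sunday, commits_per_level=3):
    """Map a 7-row pixel pattern onto calendar days.

    pattern: 7 strings of equal length, one per weekday (Sunday first,
             an assumption about the graph layout); each character is a
             digit 0-4 giving the desired intensity of that cell.
    Returns a list of (day, number_of_commits) for non-empty cells.
    """
    if len(pattern) != 7 or len({len(row) for row in pattern}) != 1:
        raise ValueError("pattern must be 7 rows of equal length")
    plan = []
    for week in range(len(pattern[0])):
        for weekday in range(7):
            level = int(pattern[weekday][week])
            if level:
                day = first_sunday + timedelta(weeks=week, days=weekday)
                # Hypothetical mapping: more commits for a darker cell.
                plan.append((day, level * commits_per_level))
    return plan


def git_commands(plan):
    """Render the plan as shell commands that create backdated empty commits."""
    lines = []
    for day, count in plan:
        stamp = f"{day.isoformat()}T12:00:00"
        for i in range(count):
            lines.append(
                f'GIT_AUTHOR_DATE="{stamp}" GIT_COMMITTER_DATE="{stamp}" '
                f'git commit --allow-empty -m "pixel {day.isoformat()} #{i + 1}"'
            )
    return lines
```

Run the generated commands in a throwaway repository; nothing here touches your real projects.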


It's pretty easy, but if I'm looking at your Github to look at your work I can't find anything if you've clobbered your true history with a message.

I don't ever see that as a negative signal, but I do see it as a positive signal if I can just read your code, so if you write good code in public and you hide it, I can't find it.

Of course, whether you care is up to you, but if I find solid code there I'm going to recommend skipping technical evaluation if you're considering working with me.


All you need is a throwaway repo where you create all those commits to control the commit chart.

It doesn't hide your work at all (commit frequency is an awful metric of valuable work, some people do a lot of "fix typo" commits).


Any employer that looks at an applicants activity graph in any serious way only deserves the kind of people they will get.

Every employer worth working for will at least look at some actual code you've written, both in your own projects and in contributions to others.

And even then an in-person interview should be able to offset any github activity or lack thereof.


I once thought about making a script that automated git actions which would show you as active, but not actually do anything. Turns out there's too much real work to do to make it!


I came across exactly this the other day: https://github.com/aranair/go-kiasu



There are many saas for this already: https://www.gitgardener.com/


Wth is that? Bots? Trolls? Scammers wanting to start a t-shirt business?


People resell them on ebay.


Oh you're right! I was reading on my phone and bucketed that picture as a link to the other blog posts like the carousel at the bottom and mentally skipped it. My mistake!


Sorry about that carousel. I want to get rid of Disqus entirely (when I started the blog in 2012 they were not so icky). But, then I think about related yak-shaving, like moving off of Jekyll...


I’ve been down that rabbit hole myself. My blog is still running Jekyll but my comments are now on a self-hosted install of Commento. It works pretty well. For me the main feature is that it’s not Disqus, but another nice benefit is that comments support Markdown now!


It's all good. The error was mine. Disqus was, indeed, a great product back then. I didn't realize they added all this stuff now.


So when you're throwing a party and suddenly have guests trashing the neighbor's front yard, do you stop the party or let the party go on for another month because it was so good last year "and we'll change something next year, in the mean time can you build a fence"?

I'll not advocate calling DigitalOcean on the phone, but tagging the CEO on the complaints on twitter might be more effective than tagging the community manager who professes he is not listened to.


Why would you stop the party, rather than eject those guests?


They have been asked to, but say they can't?


In this case the bad guests are the repo spammers.


Well, they are incentivizing them. Just to get a notion of the scale: https://github.com/search?q=amazing+project&type=issues


This is strange, I wonder why they are targeting this repo? I have labeled some issues 'hacktoberfest' and have gotten very well meaning contributions. Here is an example https://github.com/earthly/earthly/issues/334


How did linking to the diff cause spam?


Within seconds of doing so, people started adding reactions to it and then someone 'suggested changes' that were essentially a commit revert. It's the community saying the changes were undesired, but those things nevertheless do notify maintainers. I only realized then that I was creating one of those situations where people pile on a few hundred comments onto a PR that are thin me-toos onto one side or the other.


WOW! I have had a wildly different experience. Really sorry to see that it has caused you so much stress.

I run engineering at Operation Code https://operationcode.org/ https://github.com/operationcode/

We've been massive fans of Hacktoberfest for the last 3 years because it has brought a MINIMUM 300% increase in quality pull requests compared to even the next best month of the year.

I even put my own money on the line to double down on the incentives with extra prizes in exchange for resolving multiple issues. I've made friends and long-term coding partners from the event as well.

I hope they never end Hacktoberfest, but I think they should definitely offer the ability for you to signal/flag that you're not interested in participating as a repository.


I feel the same way, and have a similar experience. Me and many of my friends and co-workers are always excited for October and make meaningful contributions.

I understand there are negative consequences, but I also anecdotally believe that Hacktoberfest is a net positive for open source.

That said, last year many repositories popped up with the sole purpose of letting people make garbage PRs to hit the minimum. I have a hard time understanding why someone who is a developer and wants the shirt is comfortable doing this, when all you have to do to really earn it is make meaningful improvements to someone's `README.md`.


> I think they should definitely offer the ability for you to signal/flag that you're not interested in participating as a repository.

not good enough. it needs to be opt-in. why is a random private company generating even more work for open source maintainers?


From DO's policy statement on quality:

https://hacktoberfest.digitalocean.com/details#quality

> There's a seven-day review window for all pull requests before they count toward completing the challenge. Once a participant has submitted four eligible pull requests (ready-to-review, not drafts), the review window begins. This period gives maintainers time to identify and label spammy pull requests as invalid. If the pull requests are not marked as invalid within that window, they will allow the user to complete the Hacktoberfest challenge. If any of the pull requests are labeled as invalid, the user will return to the pending state until they have four eligible pull requests, at which point the review period will start again.

So all the spammer needs are four projects with maintainers who are too busy IRL to flag spam posted to their repos?

Are we all now to be unpaid conscripts of DO's marketing department?
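One way to read the quoted rule, as a small sketch. This is an interpretation, not DigitalOcean's implementation: the `PullRequest` type and `challenge_status` function are hypothetical, and the window is assumed to start when the fourth still-eligible PR was submitted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

REVIEW_WINDOW = timedelta(days=7)
REQUIRED_PRS = 4


@dataclass(frozen=True)
class PullRequest:
    # Hypothetical record; only the two fields the quoted rule depends on.
    submitted_at: datetime
    labeled_invalid: bool = False


def challenge_status(prs, now):
    """Return "pending", "in_review" or "complete" for one participant."""
    eligible = sorted(
        (pr for pr in prs if not pr.labeled_invalid),
        key=lambda pr: pr.submitted_at,
    )
    if len(eligible) < REQUIRED_PRS:
        return "pending"
    # Assumption: the window (re)starts at the fourth eligible submission.
    window_start = eligible[REQUIRED_PRS - 1].submitted_at
    if now - window_start >= REVIEW_WINDOW:
        return "complete"
    return "in_review"
```

Note the default: a PR that nobody looks at is never labeled invalid, so silence from a busy maintainer counts as approval.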


Sure, I’m fine with that!


Meh, this is the classic situation where one individual (either a person or a company) starts speaking for everybody (or everybody in a category).

And of course one size does not fit all.


Holy crap

https://github.com/MattIPv4/hacktoberfest-data#diving-in-pul...

> "Of the 483,127 PRs submitted during Hacktoberfest, only 23,299 (4.82%) were identified as spam"

That is insanely high noise for hacktoberfest, especially when tagging spam "correctly" takes a non-insignificant amount of effort from the maintainers.

I was ready to rant about this post but … no, wow, this is very much warranted.


That's only if the maintainers knew to even flag them as spam or invalid, and didn't just close them. I'd fully expect more of the latter than tagging.

Here you go, you're an open source maintainer already in an often thankless role, take some extra work, with a side dollop of extra work labelling the crap extra work you're getting.


I saw one small unknown repo which had four pages full of PRs which just add "awesome" to the README, and more of that noise. The repo's owner hasn't been active on GitHub in a while, so those are four pages that likely never get reported. And this is just the first day.


An important missing datapoint (as noted elsewhere on this thread) could be how many PRs were merged as a result of the 2019 Hacktoberfest; that'd help get a sense of the positive value contributed. I'm surprised it's not mentioned in the repository you link to.


Anecdotally, I'm getting a lot of obvious spam on my repos that are >4 years old and less spam on other repos. Also, my "org account" (from before orgs were a thing / fully featured) got more spam than my personal account (which I don't use often these days but does at least book a commit or two most months).

So, the spammers are probably intentionally targeting repos where folks aren't likely to bother marking as spam.

On a separate note, I do not understand why people care so much about mid-quality t-shirts...


> I do not understand why people care so much about mid-quality t-shirts...

Most likely a mix of "it's free", "it's easy", and "it looks nice on a CV". Perfect storm for a lot of students and juniors to spend an hour of their time on, without bothering to spend the extra two to actually make the contributions useful.


I found the T-Shirt really nice and comfortable. Sure, it's not the highest grade, but it's a damn nice thing to get for free!


Looks like that repo was made pretty much when the last hacktoberfest ended, which would be too early to tell IMO. But yes indeed, good datapoint to look at a year after (especially since PRs that take a longer time to merge are, I would think, likely to be higher quality PRs with a higher rate of dev conversion into contributor)


Ah, thanks, yep - that makes sense :)


> especially when tagging spam "correctly" takes a non-insignificant amount of effort from the maintainers.

No kidding. GitHub requires you to wait several minutes (they don't say how much time exactly, but in my experience it's definitely > 2 min) between reporting something as spam. So you can't just go through your spam PRs in the morning and report them easily, you need to leave the browser tabs open and come back from time to time to submit spam reports. Not reporting is much easier, so the real figures are certainly higher.

EDIT: Ah, they don't even mean reporting spam to GitHub. Maintainers need to "opt in" to Hacktoberfest's own rules and change their own PR labeling system according to Hacktoberfest's wishes. What a pile of nonsense.


Include the full sentence.

> Of the 483,127 PRs submitted during Hacktoberfest, only 23,299 (4.82%) were identified as spam, with 19,587 (84.07%) of those being in a repository that the Hacktoberfest team excluded from the competition for not following the shared values and 3,712 (15.93%) being labeled as "invalid" by project maintainers.
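The figures in that sentence are internally consistent, which is easy to check. The last number below (maintainer-labeled spam as a share of all PRs) is derived here and does not appear in the quoted text.

```python
total_prs = 483_127
spam_prs = 23_299
in_excluded_repos = 19_587
labeled_invalid = 3_712


def pct(part, whole):
    """Percentage rounded to two decimals, matching the quoted precision."""
    return round(100 * part / whole, 2)


spam_share = pct(spam_prs, total_prs)              # 4.82
excluded_share = pct(in_excluded_repos, spam_prs)  # 84.07
invalid_share = pct(labeled_invalid, spam_prs)     # 15.93
# Derived, not quoted: share of all PRs that maintainers labeled "invalid".
invalid_share_of_all = pct(labeled_invalid, total_prs)  # 0.77
```

So the 4.82% headline rests on roughly 0.77% of PRs being labeled by maintainers; anything closed without the label is not counted at all.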

Spam submissions were sent to spam repositories; that has been known for years.

From the article, they act like it's their burden alone:

> Their solution, per their FAQ, is to put the burden solely on the shoulders of maintainers.

But, as a contributor, I see plenty of links for me all over Hacktoberfest to report repositories that are also trying to skirt the system.


One of the repos that I manage actively participates in Hacktoberfest and I'm finding out about this "invalid" label counting now, via this article.

No, I didn't read the FAQ, but I imagine neither did a lot of maintainers, especially those that don't participate. The undercounting must be massive.


I mean this is subjective of course, but I think we can agree most reasonable people would say less than 5% is not "insanely high"?


This is unfortunate, and I agree with other commenters that Hacktoberfest should be opt-in.

I had a great experience with Hacktoberfest last year. I tagged a few issues with Hacktoberfest and got a nice PR from someone showing me how to configure my Vue project for unit testing.[0] It was a non-trivial PR and a useful contribution.

[0] https://github.com/mtlynch/whatgotdone/issues/279


It seems that the person responsible for the project proposed this to be opt-in [1], but it never got anywhere internally.

[1] https://twitter.com/MattIPv4/status/1311391325443555328


If they can't make it opt-in, at the very least they should:

1. Only count PRs that contain something like "#Hacktoberfest" in the GitHub comment accompanying the PR. This would make it easier for maintainers to weed out the spam or at least understand where it's coming from and what term to search for when they encounter this unprepared. Also, it would give "visibility" to the event, so it should even fly with management!

2. Only count merged PRs. Apparently DigitalOcean have a "reason" for not doing this, because some projects don't use the PR merging feature directly. I think they should reward users who educated themselves on this point and only opened PRs on projects that do merge them.
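Both rules together amount to a very small filter. A sketch, with a hypothetical `PullRequest` type and a case-insensitive match on the tag (the proposal does not specify either):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PullRequest:
    # Hypothetical record holding only what the two rules need.
    body: str
    merged: bool


def counts_toward_challenge(pr, tag="#hacktoberfest"):
    """Rule 1: the PR text mentions the tag. Rule 2: the PR was merged."""
    return pr.merged and tag in pr.body.lower()


def eligible_count(prs):
    """Number of a participant's PRs that satisfy both rules."""
    return sum(1 for pr in prs if counts_toward_challenge(pr))
```

The tag requirement also gives maintainers a single string to search for when the PRs start arriving.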


1. Should be the case for sure.

2. I think the other argument you could make against that is not everyone has the same time to merge. If it takes two weeks to get merged, I basically have to have the PR in by the second week of October.


DO could just check again for merged PRs whenever they want.


Why wouldn't they be able to make it opt in?


> I had a great experience with Hacktoberfest last year.

Yeah same here, but as a contributor. I decided I wanted to make my PRs count and decided on a targeted effort of taking one man's tiling WM for Windows[1] and fixing enough of the stability issues and race conditions I found to the point where I should be able to use it myself.

Not a single one of those PRs was “cheap”, and the resulting improvements in quality and stability have led that WM to seemingly have more users and do better than it used to.

Yeah I got a t-shirt. But I also got a tiling WM which is actually usable the times I'm forced to use Windows too, which I kinda value more.

So clearly win-win, as far as both I and that project owner are concerned.

[1] https://github.com/rickbutton/workspacer


It seems like some of the spam might have been automated. From this comment at least one spammer seems to do a regex for "website" in the repo's name.

https://github.com/promcon/website/pull/158#issuecomment-701...

Some people were saying this could also be used to detect repositories that have "auto-merging" in order to add vulnerabilities to them later, perhaps using Hacktoberfest as a cover for more nefarious activities. That's strange, I haven't heard of projects that automatically merge certain PRs from arbitrary accounts.


I've seen a repo where anybody who committed a change (via a merged PR) got added as a contributor to the project automatically. That could be a target.


I would love to see more talk from companies about how to foster meaningful contribution instead of focusing on measurable contribution.

I recently read "Working in Public" which was great, I recommend it. One interesting observation that was made: The perceived pipeline of user => casual contributor => active contributor => maintainer...is a lie. In the book they argue (convincingly) that you do not convert someone from casually contributing to actively contributing, it's instead that active contributors also make casual contributions.

What does that mean in this context? This company is operating under the assumption that they are helping by getting more people into the pipeline. In reality, what we need are active contributors who are invested in projects, not fly-by-night-i-want-a-shirt contributors.

For context I maintain https://www.CodeTriage.com which is a community of about 55,000 devs interested in open-source.


A substantial fraction of "serious" OSS is now developed by paid, fulltime contributors paid by big vendors (I work for such a company). A casual contributor has a much higher bar to overcome in terms of understanding the codebase and being able to create a useful contribution. But a typo is easy to fix and probably won't spend weeks in pull request ping pong hell.

Many repos have a sharp bifurcation between tiny PRs by passers-by and big chunky ones by the fulltimers. The space between is a desert.


> But a typo is easy to fix and probably won't spend weeks in pull request ping pong hell.

Yeah. How the maintainer(s) of a repo respond to trivial typo fixes (real ones, not spam) is also a good way to test things / check for any weird attitudes.

Most of the time, PRs are accepted easily and swiftly. But sometimes (rarely), the response is strange or off-putting.

There is the occasional PR that just sits there forever without being looked at too, which just shows the repo at that URL is dead / unmaintained. Also good to know before putting much time/effort in. ;)


> The space between is a desert.

Who’s going to put in all the time to learn to understand a repo, only to make a small change. At that point you might as well keep going.


Larger open source projects tend to fare better; especially well-documented projects with many small components. I can think of projects such as Django and Wine, both of which very often get small-to-medium contributions from drive-by devs who want to fix "that itch".


That just means many people will never start.


Yes, precisely.


> A substantial fraction of "serious" OSS is now developed by paid, fulltime contributors paid by big vendors

I disagree.


Could you elaborate? I would like to understand.


While I exclusively use Github, I was wondering why CodeTriage only supports Github. Why not Gitlab/Bitbucket? I was looking into the issues and found this: https://github.com/codetriage/CodeTriage/issues/613 but was wondering if it's true even today?

As Github has been acquired by Microsoft, I would love to see other alternatives being supported by more projects/sites.


When I find something annoying in an open source tool that I use or missing some functionality that would be helpful for me, then I consider contributing, but only to get the things I need and nothing more. Once I get the PR through, I don't have any more interest in contributing unless I face another issue that bothers me enough.


> Finally, and most importantly, we can remember that this is how DigitalOcean treats the open source maintainer community, and stay away from their products going forward. Although we’ve enjoyed using them for hosting the WHATWG standards organization, this kind of behavior is not something we want to support, so we’re starting to investigate alternatives.

> Another promising route would be if GitHub would cut off DigitalOcean’s API access

I am pretty sure DigitalOcean is not doing this in bad faith or trying to damage the open source community, but the author seems to be out for blood over what seems to be an oversight on the part of DigitalOcean, suggesting that this is how DigitalOcean treats the open source community and that one should boycott their products.


DigitalOcean isn't intentionally damaging the open source community - they are just indifferent to the harm they're causing. They are fully aware of the spam problem and refuse to take measures to fix it because it would be too much work for them: https://twitter.com/MattIPv4/status/1311391587793014784

That's not an "oversight".


Potentially the issue may have more traction now: https://twitter.com/MattIPv4/status/1311421301291200512


I don't think that indicates any more traction. He already indicated in another tweet that he doesn't hold any power regarding the event: https://twitter.com/MattIPv4/status/1311395478244818945


Agreed... the author is being premature to blow this up.

The contest seems to be in good faith but had unintended consequences. Regrettable, but it happens. Give them a chance to fix it before trying to ruin their day over it.


It was a problem last year too. Even the person running it for DO suggested making it opt in.[1]

[1] https://twitter.com/MattIPv4/status/1311391325443555328


> the author seems to be out for blood

Sometimes a really over-the-top reaction is the only way to get the attention of a large organization.


I'm appalled by this behaviour but also bemused. What's the motivation for spamming repositories just to get a t-shirt? I mean, are the t-shirts really that good?


Judging by the names in the screenshots and linked github profiles it sounds like a lot of spammers are Indian students.

Thousands of students are all trying to get a low effort contribution in to have an extra line of "experience" on their resume and a T-Shirt from a western Silicon Valley company as signaling? Judging by the poor quality of the contributions, and the fact it's on GitHub, maybe folks studying IT?

That type of academic spam isn't new sadly [0].

[0] https://academia.stackexchange.com/questions/41687/what-is-b...


I am an Indian, and I've studied at an Indian university for my undergraduate degree, and I've witnessed spam like this everywhere. Not hacktoberfest, but I've been a part of it myself for some of the other things. Wish I had behaved better.

In most of these cases, it usually starts with someone really talented doing it with all the good intentions, and everyone else wanting to get in on that 'swag' and appear just as 'talented' and 'unique' amongst their peers. The freebies are mostly for 'show off'.

A few hours ago this link showed extremely low effort LeetCode/Programming Challenge problems and solutions aggregation repos created by us Indians (disproportionally more than any other country at the time I checked) with very silly open issues created & marked as 'hacktoberfest2020' (looks like some of them are gone now)[1].

But, from what I heard from my uni recently, things are improving and this year folks are trying their best to form groups to focus on meaningful contribution over spam.

Even after all that, unfortunately for us though, we'll still have bad actors, probably at the same percentage as any other country, but amplified due to our population and hyper-fixation with an unreal perception about most things we do.

[1] https://github.com/search?p=1&q=label%3Ahacktoberfest+state%....


Why do Indian IT students seem disproportionately represented when it comes to things like this? It seems like there is a strong culture of "getting ahead at all costs". This is seen in shitty YouTube tutorials, blog posts, open source submissions, all the way down the very bottom of the barrel with call centre scammers. Care to explain for us?


you're surprised motivated people, who lack privilege and dont know any better, resort to what they know and have been taught to do? who live on much less than you do and therefore can live off much lower value economic activity?

is it fair to place the blame squarely on them or is it better to recognize that the global system we are complicit in has created this tremendous waste of human potential?


Your comment seems to exude a kind of inadvertent tone suggesting "the fault is not with them, but in their lack of civilization". In other words, it's likely racist.

By GDP per capita, India is where the US was at in the 1950s. Would we excuse this behavior in the 1950s West just because "they lack privilege, don't know any better, resort to what they know and have been taught to do..."? Of course not, because relative poverty is no excuse for unethical behavior.


The way I read the GP post it’s more “the way economics are structured strongly incentivized them to engage in this behavior.” which is more a statement about how pressure and incentives work on humans than anything racist. I doubt that people in the US would react differently to the same incentives.

Comparing the raw GDP of the US in the 1950s to India now is glossing over a lot of points. For example, GDP per capita does not capture the (in)equality of wealth distribution in a country. Or how that GDP ranks on a scale: The US in the 1950s was in a high, probably even the top position (I didn’t check exactly which) - India with the same GDP now is definitely not. That makes a huge difference in perception.

It’s easy to dismiss the status value of a brand name piece of clothing or any token that elevates your status if you’re already high up on the ladder.

None of that excuses the behavior in the sense that it makes it “ok”. But it contributes to the explanation of why such behavior clusters in specific communities.

For me, that I’m ahead, it’s easy to look down and say that this is unethical behavior, but it’s important to keep in mind that I’m applying my ethics from a privileged vantage point - and likely you’re doing so as well.


> By GDP per capita, India is where the US was at in the 1950s.

In the 1950s the US's GDP per capita was the highest. Is that the case for India today?

> Would we excuse this behavior in the 1950s West

I mean the 1950s West was no bastion of ethical behavior. Wasn't that when the cigarette industry in the US started its decades-long campaign of misinformation, obfuscation, and false advertising to cover up the harmfulness of their products? And this was flagrantly unethical behavior by some very privileged people. This is without even getting into how women or minorities were treated.

> relative poverty is no excuse for unethical behavior.

One man's "unethical behavior" is another man's "playing by the letter of the rules, not the spirit". Spamming PRs for a free T-shirt is no way comparable to call center scams.


quite the contrary, i was calling it out in the person i was responding to. but now you have accused me of being racist, i am unable to continue the conversation. have a nice day.


I am still divided on your opinion about it being economical, but it was definitely not racist.

Identifying a privilege gap is not racist. India is definitely not in the same place as 1950s America, which was in its post-WWII boom, sometimes called the Golden Age of America.

Then again, we do have some cultural issues to overcome, and yes, a lot of the cultural issues are a result of a traumatic period spent under the boot of the colonizers.

That being said, blame is not useful, the people responsible are dead, and hopefully we can mature as a culture. I see lots to be hopeful about, but lots to be fearful about too. The transition of power from colonials to our republic was botched, and now we're stuck with a broken political machine, and the powerful are trying to break it further.


I was very deliberate in avoiding calling you racist. I said explicitly and deliberately that your comment (not "you") seemed (not "certainly was") inadvertently (not "intentionally") racist.


Lack of ethics or interest in the subject, luring to the ways of gaming the system - I'd say this reputation is well deserved, as someone who studies here.


> the global system we are complicit in has created this tremendous waste of human potential?

Thanks for writing -- I like that way of thinking about spammers etc -- that they could have been doing something meaningful instead, if the world and society made more sense


> Thousands of students are all trying to get a low effort contribution in to have an extra line of "experience" on their resume and a T-Shirt from a western Silicon Valley company as signaling? Judging by the poor quality of the contributions, and the fact it's on GitHub, maybe folks studying IT?

Well, not exactly. There are a lot of people in India who are very passionate about open source and who actively try to get others to participate. So they organise events to teach others how to fork a repo and submit PRs. Batch-mates are either forced or join after seeing the enthusiasm. Unfortunately, these are not moderated and things go downhill soon, to where a PR is sent for the sake of it.


I've had multiple examples of "spam" from Indian and Chinese students who scraped an email from GitHub repos, wanting endorsement or to fill questionnaires or somesuch.


It's probably less effort to get this t-shirt than it is to get achievement badges or hats or whatever for many video games, and this nets you an actual physical achievement badge.


What you're saying makes perfect sense - no complaint or disagreement from me - but I'm still struggling to wrap my mind around this because when all's said and done we're still talking about a vendor t-shirt, and how cool can that really be?

If it does motivate people and they like them, then fine: I guess it's just different strokes for different folks.


> different strokes for different folks.

It pretty much boils down to this. Not only are vendor t-shirts actually a physical item but they are also often uniquely designed; many vendors only provide certain designs for certain events. So they really become a badge of honor and can gain in emotional value: "been there, got the t-shirt..."


People actually pay extra money for t-shirts that have specific brand names on them and entire businesses are built around that. It shouldn’t be surprising to you that there are people that want particular shirts.


True, but Ralph Lauren's a lot more stylish than Digital Ocean (not that I wear a lot of either of them!).


For a nerd? Probably not.


It's kind of like that episode of Pinky and the Brain where they finally are able to take over the Earth by offering everyone who moves to a papier-mâché copy of the Earth a free t-shirt. People love free t-shirts.


Ever visit a booth at a conference or trade-show where they give out t-shirts? Pandemonium!


I've worked a bunch of them back in the day and they are indeed a feeding frenzy. Can be a lot of fun though.


The shirt is a form of signaling and once you wear it in the right circles you’re in as one of the cool kids.


You might wait, what, a 10 person line to get a free t-shirt? Making 5 “corrected typo” commits is probably equivalent to that effort and looks like 5 is enough to get past a spam tag.


I really wouldn't. I'm not a minimalist by any means but I've been decluttering for a while and am certainly long past the point where I'll tolerate having possessions that I don't want. Maybe this is a privileged point of view but, as with swag in general, most vendor t-shirts are honestly just tat.


> Maybe this is a privileged point of view

It is. T-shirts are wear items and getting a free one that looks decent is one less that you’ll have to buy.


You would be surprised by the stuff people do for free swag. I don’t get it either, but I’ve seen first hand how hard people go for that sort of stuff.

It’s a shame it has led to people spamming repos, however.


There are a ton of automated trivial pr spammers out there that aren’t even in it for a t-shirt. Maybe it’s vanity or just wanting to see the world burn ¯\_(ツ)_/¯


Maybe some people feel that these t-shirts signal street cred?


Definitely. I've always wanted a cool Hacktober Fest t-shirt like the ones my co-workers have. But I certainly wouldn't spam open source projects just to get one.


You never did anything stupid and superfluous as a youth?


In my experience the average American goes into a leave nothing behind mode whenever baited with Free.

Just look at the kind of mayhem Black Friday precipitates, and that's not even free - just a promise of exceptional discounts.

The comedian Doug Stanhope has an amusing bit about free healthcare and how wastefully it would be consumed by Americans conditioned to maximally abuse anything offered free of charge.

Edit: I don't mean to suggest Americans have a monopoly on this sort of behavior, it's just the country/culture I'm by far most familiar with.


The spammers appear to be overwhelmingly Indian, based on the usernames in the screenshots. Which is somehow unsurprising to me (there are 1.5 billion of them, after all).


Non-Americans on the other hand hate free stuff.


For some context it's worth quoting directly from the published statistics available at (1). Although if this is based on manually tagging something as spam it is probably an understatement.

    Of the 483,127 PRs submitted during Hacktoberfest, only 23,299 (4.82%) were identified as spam, with 19,587 (84.07%) of those being in a repository that the Hacktoberfest team excluded from the competition for not following the shared values and 3,712 (15.93%) being labeled as "invalid" by project maintainers.
1. https://github.com/MattIPv4/hacktoberfest-data
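
As a quick sanity check on the quoted figures, the percentages can be recomputed from the raw counts. This is only an arithmetic sketch: the numbers are copied from the quote above, and the variable names are mine.

```python
# Counts copied from the quoted statistics; names are illustrative.
total_prs = 483_127
spam_prs = 23_299
excluded_repo_prs = 19_587   # spam PRs in repositories excluded by the team
invalid_label_prs = 3_712    # spam PRs labeled "invalid" by maintainers


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to 2 decimals."""
    return round(100 * part / whole, 2)


spam_share_of_all = pct(spam_prs, total_prs)
excluded_share_of_spam = pct(excluded_repo_prs, spam_prs)
invalid_share_of_spam = pct(invalid_label_prs, spam_prs)

print(spam_share_of_all, excluded_share_of_spam, invalid_share_of_spam)
```

Note that the two sub-percentages are shares of the spam total, not of all PRs, and the two sub-counts add up exactly to the spam total.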


They literally checked for a label with text "invalid" and that's it. The OP, for example, used the label "spam" so it doesn't count. Simply closing the PR without merging or commenting doesn't count. Any other text label doesn't count.

So yeah, I suspect it's massively undercounting.


Their FAQ (linked from the submitted article) says:

>[...] please give them an `invalid` or `spam` label and close them. Pull requests that contain a label with the word `invalid` or `spam` won’t be counted toward Hacktoberfest.


Since project maintainers don't have to opt in to Hacktoberfest, there's no reason for them to know that the FAQ exists. Most maintainers are unaware of what's going on and will just close the spammy PRs without tagging them.


Oh yes, I wasn't disagreeing about that. I said as much in an older thread about another group of people attempting this same thing: https://news.ycombinator.com/item?id=23968008


The code in the linked repo with the stats is literally:

>const totalInvalidLabelPRs = await db.collection('pull_requests').find({'labels.name': 'invalid'}).count();

They also mention the label "invalid" multiple times and never the label "spam." So even if they count "spam" for making entries invalid for a reward their stats do not seem to take that into account.
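
To illustrate why the choice of filter matters, here is a small sketch comparing three ways of counting "spam" over the same data. The records are invented for the example and only loosely mirror the `labels.name` field in the query quoted above; none of this is taken from the linked repo.

```python
# Hypothetical sample records; not real Hacktoberfest data.
prs = [
    {"id": 1, "labels": [{"name": "invalid"}], "state": "closed", "merged": False},
    {"id": 2, "labels": [{"name": "spam"}], "state": "closed", "merged": False},
    {"id": 3, "labels": [], "state": "closed", "merged": False},
    {"id": 4, "labels": [{"name": "hacktoberfest"}], "state": "closed", "merged": True},
]


def has_exact_invalid(pr) -> bool:
    """Mirrors an exact match on the label name 'invalid'."""
    return any(label["name"] == "invalid" for label in pr["labels"])


def has_invalid_or_spam(pr) -> bool:
    """Looser rule: any label containing the word 'invalid' or 'spam'."""
    return any(
        word in label["name"].lower()
        for label in pr["labels"]
        for word in ("invalid", "spam")
    )


def closed_unmerged(pr) -> bool:
    """Broadest rule: closed without being merged, regardless of labels."""
    return pr["state"] == "closed" and not pr["merged"]


def count(predicate, records) -> int:
    return sum(1 for pr in records if predicate(pr))
```

On this toy data the three rules give 1, 2 and 3 respectively, which is the undercounting concern in miniature: the unlabeled closed PR is invisible to both label-based rules.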


I believe this is a change between 2019 and 2020.


Ah. Nonetheless given this is analysis over historical data I would label any PR that hasn't been merged in over the last year as effectively spam. The fact that none of their stats seem to include how many PRs were actually merged in is rather concerning.


It would seem like a much better idea to say that only PRs that are explicitly confirmed by the project maintainers as being valid will be counted.


Then you'll get people spamming maintainers to please hurry up and review/confirm their important PRs, and abuse if they don't.


That's easier to deal with, because the maintainers can just reject the PRs and not have to worry that the spammer will get goodies from Digital Ocean if they don't mark it as "invalid" or "spam" or some other magic tag. And as a maintainer, high on my list of "reasons to reject" would be "bugging me about a PR from a random person I've never had a submission from before and who has no other track record of contributions to open source projects".


Right, that relies on the maintainers knowing that they are "expected" to do the extra work to tag those with special tags, otherwise closed PRs count as "good".


"Only" 24k spam PRs. Face palm


Is "only 4.8% spam PRs" better? The absolute number means nothing.

In a sea of 5B PRs, 24k would look impossibly good.


Does "only 24k MRs had maintainers that bothered to properly mark them as spam per hacktoberfest's guide" sound so "impossibly good" as well?


The absolute number definitely does mean something! It's work created for maintainers. Checking and tagging a PR takes non-zero time. Creating 24,000 invalid PRs is a massive burden on the open source community.


How about data on what % were merged?


Given that there's been a year since last year's event this should be a fairly clean piece of data at this point. Even longer or more involved valid PRs should have been merged in by now.


That's an insane amount of spam. The percentage is irrelevant, look at the real number: we're talking tens of thousands of instances of undue burden on the maintainers of open source packages.

And the number of "nice PRs" is essentially irrelevant here: this is not a zero sum game, a thousand good PRs don't cancel out a project getting flooded with bad PRs.

If your event can't prevent substantial abuse of the community you pretend to do this for, you should stop your event and figure out how to do better.


Wow, such an apples-to-oranges comparison. I scanned the page but didn't see anything better, so we should assume that 84.07% of all pull requests submitted during Hacktoberfest are to repositories excluded for not following the shared values. That implies that of the 483,127 PRs, 76,962 are to qualifying repositories, and in fact that 15.93% of ALL valid PRs are spam.


Why would they exclude a repo and then not exclude all PRs from that repo?


This is exactly my point. They listed the top-line, total number of all PRs without any filtering, then when they calculated the "spam ratio" they filtered out "invalid repos" as well as "spam PRs". It's a misleading statistic.


Wow. Just a moment ago I received a PR to slightly modify a readme, in a way that seemed unusual (no insertion of links or anything, but odd punctuation choices). I couldn't understand why someone would send it, and then saw this post.


To be fair I always send PRs that correct very small things in documentation such as typos and punctuation. But I do that all year, not only during Hacktoberfest.


That's totally normal I would say. Once you have a repo with a few thousand stars there will also be occasional spam requests from what I guess are people that want to be able to pin the repo to their profile. Can't imagine dealing with the spam that the post is describing.


Yeah, we just got a "Fix a typo in README" PR that added the words "Fix a typo in the README" to the README, and we're a pretty niche open source repository.


I'm sorry actually to see that most of the names in the screenshot are people from India. Hacktoberfest to some degree has turned into a madfest with most college students here. Rather than actually contributing to open source, many new repos pop up during these times where fellow college students raise a PR for nothing.

It's the T-shirt that's the primary reason but also the flaunting on social media as if I'm some kind of certified open source contributor.

PS: I've also been part of Hacktoberfest launch events where some people literally created their first PR.


> PS: I've also been part of Hacktoberfest launch events where some people literally created their first PR.

If a reasonable (heh!) percentage of those people continue on to create meaningful further PRs, then it's probably a success for that piece of things. ;)


This is especially stupid of the spamming participants. If you so desire a t-shirt and don’t want to make any meaningful contributions, just make your own BS repo and make your own BS pull requests.

I intended to make meaningful contributions last year and accidentally hit the quota just by making PRs to my own projects.


I think PRs for own repos are not allowed. Unless they create a fake account to contribute to.


From https://hacktoberfest.digitalocean.com/faq:

> Do pull requests made on my own repositories count?

> Yes, but we strongly encourage you to make quality contributions to other repositories.


Personal repos have worked in the past, as long as they're public.


The first user I looked at from the repository mentioned in the article did all of the above. Spam PRs in someone else’s repo, spam PRs in their own (fake) repo, spam PRs in a sock puppet’s (also fake) repo. The only thing they haven’t tried is writing a legitimate PR.


Yes they are, I did contributions to my own projects and they counted. At least last year.


Works fine, yes


It seems it will be necessary for DO to put more of a burden on potential t-shirt recipients to prove that they are making valid PRs and acting in good faith.

A first step would be to only allow contributions to selected projects that have first approved to be included in Hacktoberfest.


Yes, especially since they don't seem to be willing to dedicate more to policing this - apparently everything related to it is handled by one employee, who's obviously limited in what they can do: https://twitter.com/MattIPv4/status/1311392743885869057


Maybe accounts eligible to get the t-shirts should be only the ones that have a certain amount of past contributions. Like at least a year old? Maybe that would be unfair to recent contributors, but there's always next year. Just like you can't perform some actions on Stack Overflow before you achieve a minimum reputation score.
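
A rough sketch of what such an account-age gate could look like, assuming the only input is the account creation date and the event start date. The one-year threshold is the figure floated above; the function and constant names are made up for illustration.

```python
from datetime import date

MIN_ACCOUNT_AGE_DAYS = 365  # "at least a year old", as suggested above


def account_is_eligible(created_on: date, event_start: date) -> bool:
    """True if the account existed for the minimum time before the event."""
    age_in_days = (event_start - created_on).days
    return age_in_days >= MIN_ACCOUNT_AGE_DAYS


# Example: an account created in June 2020 would have to wait for next year.
print(account_is_eligible(date(2020, 6, 1), date(2020, 10, 1)))
```

A check like this says nothing about the quality of past contributions; it only raises the cost of creating throwaway accounts for the event.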


Are Hacktoberfest contributions checked by an API now? Could you not require the maintainers to tag the PR "Hacktoberfest contributed" and only that would get you a tee? People would know not to submit junk because the maintainers wouldn't give them the label for junk.


That would be an order of magnitude more maintainer effort than flagging spam is right now. Not to mention that the number of maintainers being aware of this event is probably quite low.


How is it more effort to label legitimate PRs than to label spam PRs? Are you saying legitimate PRs are an order of magnitude more than spammy PRs? Because it sure doesn't look like it.


> Are you saying legitimate PRs are an order of magnitude more than spammy PRs? Because it sure doesn't look like it.

The numbers quoted elsethread look like that to me. Not necessarily the full 10x difference, of course, but choosing just between these two systems, it appears clear to me that if every PR had to be tagged by the maintainer to count for eligibility, there would be more work for maintainers and/or significantly fewer people eligible for T-shirts, because few maintainers are actually aware of Hacktoberfest.


DO knows that that would instantly kill their marketing campaign since no project in their right mind would willingly participate.


Why do you say that? Several people here said they have been helped by it. Seems they would.


If all the spam commits are suddenly focused on only participating repos, that will quickly change.


Maybe, maybe not. It's hard to really know without trying it and measuring.


Loads of projects, particularly recent ones, would gladly welcome any help and increase in visibility.


I dream of the day I can walk into a job interview and brag about how many PRs my Open Source projects have to reject per day. Maybe that's just my Nobody Privilege talking though.


This is called opt-in and is suggested in the article.


Or requiring that open PRs must have +N lines in non-text/markdown/config/dependency files
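
For what it's worth, a threshold like that could be sketched as below. The excluded suffixes, the excluded file names and the value of N are all placeholders I picked for the example, and the input is assumed to be a simple mapping from file path to number of added lines.

```python
from pathlib import PurePosixPath
from typing import Mapping

# Placeholder lists: which files count as text/markdown/config/dependency.
EXCLUDED_SUFFIXES = {".md", ".txt", ".rst", ".json", ".yml", ".yaml", ".toml", ".lock"}
EXCLUDED_NAMES = {"Gemfile", "Pipfile"}
MIN_ADDED_LINES = 10  # the "N" in "+N lines"; arbitrary for this sketch


def is_code_file(path: str) -> bool:
    """True unless the file looks like docs, config or a dependency list."""
    p = PurePosixPath(path)
    return p.suffix.lower() not in EXCLUDED_SUFFIXES and p.name not in EXCLUDED_NAMES


def added_code_lines(added_by_file: Mapping[str, int]) -> int:
    """Sum added lines over files that pass the is_code_file filter."""
    return sum(n for path, n in added_by_file.items() if is_code_file(path))


def meets_threshold(added_by_file: Mapping[str, int], minimum: int = MIN_ADDED_LINES) -> bool:
    return added_code_lines(added_by_file) >= minimum
```

A rule like this only measures volume, so it would also filter out genuine documentation fixes and would not stop someone padding a source file with blank lines.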


How about improving compilation time a thousandfold: https://github.com/torvalds/linux/pull/447


Huh, does he think he's funny? Spamming the Linux kernel git seems like a good way to be considered a major wanker. And doing it from a non-throwaway Github account?


Linux kernel doesn’t use GitHub so it’s not really spamming actual devs.


I love that this PR has 2 commits and the second one is just adding a newline to follow convention.


That would be way too easy to game.


Docs have a lot of value - and bad docs are a huge barrier to others who might contribute. No reason to ban doc improvements.


Google's Summer of Code was a little annoying too. You'd get this wave of Indians, where GSoC is extremely popular, asking you what they could do for you if they knew some C++. It was a lot of work to deal with their applications and shepherd them along a project and it usually yielded little in the end. We wanted new contributors, but at best we'd get a sort of working idea over a summer.

I know for some other projects GSoC worked out well. I'm sure people will pipe up telling us how we're doing it wrong if we couldn't get good results from GSoC candidates, but after a couple of years I was tired of being involved with it and got cynical about it.


WRT GSoC: we're constantly tweaking the program to make it more relevant and less time wasting for projects. Early on, we got people doing extremely negative things to 'succeed' in the summer of code.

It's reasonable to bow out if it's not working for your project. Maybe check us out every few years to see if we've addressed your problem :-)


Oh, Octave is still doing GSoC. I just haven't been involved with it in a while. I guess it's still working for someone else, glad to hear it's gotten better.


I'm sorry it didn't work out for you. For you, was a "good result" just a good project, or a contributor with continued involvement?

Coming from the other side, as someone who was a GSoC student, but whose involvement with open source ultimately dropped off over the years, I think one of the problems here is the timing. GSoC students are typically in their third or fourth years of undergraduate study, which are followed by internships and on-campus data structures & algorithms interviews which require a lot of preparation. Then for the first couple of years in the industry, most haven't sorted out their work life balance sufficiently to want to code in their free time. It's only recently (2-3 years after I graduated) that I started feeling like I had enough time to get back into open source.

I think the fact that so many Indians apply is due to a combination of two factors: 1. there are a lot of Indian CS undergrads; 2. internships are extremely competitive, so GSoC is perceived to be an alternative (which it really is not).


I'm sympathetic, but I'd be interested to know whether there's also an increase in non-spam contributions. We probably need to wait a while to find out, since it's reasonable to expect the spammy t-shirt-seeking PRs to be front-loaded and the substantive PRs (if any materialize) to take some time.

edit: and it's worth saying, sometimes a newbie's first PR is pretty indistinguishable from spam. It would be ironic if one of the results of this project was teaching a bunch of young programmers that they're not needed or wanted in FOSS.


Last year, in addition to usual work by long-term contributors, my project got four useful bugfixes (all from the same person and taken from our recommended first contributions list), a few trivial typo fixes, and no spam.

I've seen people say they benefit from Hacktoberfest and some people say they get a lot of spam, and it's hard to know which outnumbers which, but I don't think anyone should be saying with confidence that it's a pure negative, and I think DigitalOcean's suggested fixes (disallow new accounts, disallow people who've gotten too many contributions marked as spam) are probably the right direction to go.


I was at a wonderful Hacktober event last year (thanks, Setlog and FOSS-AG!) and think some pretty good stuff came out of it. The incentive to submit low quality PRs became quite obvious quite quickly however.

My gut says that Hacktober probably spawns some productive contributions, but most of them would likely have been submitted anyway.


From my perspective last year as a first time participant in other people's open source projects, I don't think I would've made any of the contributions I did without Hacktoberfest.

Projects tagged with the Hacktoberfest tag tended to signal either projects that were both active and had low barriers of entry for newbies, or weird mechanical turk-esque spam. While the latter is unfortunate, the former isn't nearly as easy to find as it should be the other 11 months of the year.


Last year, that was not the case.


I'm sure there's a bump in non-spam contributions, as I have submitted them and know others who have too, but I suspect it is a small fraction compared to the spammy T-shirt seekers, as numbers have grown over the years.

I myself have submitted small PRs during Hacktoberfest, but they were still meaningful corrections, and they came in addition to at least 5 significant contributions. We do need better signalling. I tend to pick projects I already follow or that have been tagged for Hacktoberfest.


I get where you're coming from but I think you're missing the point. The issue seems to be that even if there are good contributions coming in their benefit is disproportionately outweighed by the work and disruption caused by the large volume of spammy PR submissions.


As an open source maintainer, I'd gladly sift through a mountain of spammy PRs (heck closing 4 per hour, as called out in the article, is almost zero trouble), if it means even a handful of real significant progress and issues fixed and potential future maintainers.


+1. I feel the same way. If one out of 20 drive-by contributors stick around and become regular, that would be a real win for me. (I'm currently maintaining a project with 20k GitHub stars and we have four regular contributors.)


That's a big if.


Well this thing has been running for a few years now so I'm sure someone has that data



A snapshot of one year's participation in that specific period isn't too relevant to what we are discussing, because it doesn't track sustained future contributions by those same users.


That's a good point. Wonder if they'd be open to adding something like that...?

Maybe suggest that in an issue on that stats tracking repo?


In theory they are. In practise the lack of a test dataset - and the lack of access to their dataset - means it's virtually impossible for a third party to make any significant contribution to the data processing code.

Such an effort would have to start with them volunteering a test dataset and/or schema.

I have asked - https://github.com/MattIPv4/hacktoberfest-data/issues/5


I'm not missing that point, I'm asking whether it's true. The right way to answer that would probably be to find out how many of the new submitters from previous years went on to continue to become valuable contributors.


I think I understand the intensity of emotion here. Open source is a really high trust community, more so than a lot of real-world spaces. Yet, it's adjacent to some other areas where poorly tuned incentives cause bad behavior. Spamming for free t-shirts is a relatively harmless manifestation. It's just some attention, though attention is our most valuable resource. I'm also reminded of cases where people take over undermaintained plugins to insert malicious behavior; it's the same kind of thing, just farther along on the badness scale.

I'd love it if we're able to preserve the high trust nature of open source. I also wouldn't be surprised if it starts eroding. If that's the case, this kind of thing is the tip of the spear, and in that light it makes sense to get pretty upset about it.


What can we do?

My most fervent hope is that DigitalOcean will see the harm they are doing to the open source community, and put an end to Hacktoberfest. I hope they can do it as soon as possible, before October becomes another lowpoint in the hell-year that is 2020. In 2021, they could consider relaunching it as an opt-in project, where maintainers consent on a per-repository basis to deal with such t-shirt–incentivized contributors.

It seems like what could be done that's better for all involved, since there are reportedly (here in the comments) some repo maintainers that really like the program, would be to:

- Immediately suspend it while attempting to contact all the repo maintainers that are on the list

- Explain what's going on, apologize, and give them the option at that point to opt in if they see benefit; otherwise they can do nothing or decline, and they won't be included

- Note on the Hacktoberfest project page the temporary suspension for maybe a week while they get info back on who still wants to be included (and maybe some other repos volunteer, who knows).

To me that seems like a sane way to handle this (as opposed to the somewhat hyperbolic statements and suggestions in the article).


I think what would be quite effective is to simply stop giving away t-shirts. Or give them to everyone who registers. But I've seen how long engineers will wait in line for a free shirt at a conference and it is astonishing, so it's not a surprise to see what they are willing to do for a t-shirt here.


Step 1: Set up a website offering these shirts for $5 + S&H at some domain that seems like it's for DO outreach work.

Step 2: Set up a system that creates an account and automates some pull request.

Step 3: Tie the two together and drop ship the shirts to the person who paid for it through your site.

Step 4: Profit a small/moderate amount and have a repo you're the primary dev on that looks really attractive as a proof of work, resourcefulness, and willingness to ignore ethical questions to a lot of Silicon Valley startups.


You joke, but there are automated bots already submitting pull requests to try and get a massive number of shirts. Once they start shipping they end up on eBay and the like.


I joke, but I'm also fully aware that the best jokes have a healthy dose of truthfulness...

Regardless of whether they actually monetize the shirts much, I wouldn't put it past someone to use that as an interesting thing to offer up in an interview, depending on the company and how they perceive the interviewers. :/


For an idea of the magnitude of this problem, just do a search for recently-created pull requests with the text "improve docs": https://github.com/pulls?q=is%3Apr+%22improve+docs%22

By my count, the rate of these PRs has increased from about 20/hour (averaged over the past month) to about 200/hour (in the last 12 hours), with the vast majority of the recent ones being worthless spam.


That link didn't work for me, but this does: https://github.com/search?o=desc&q=is%3Apr+%22improve+docs%2...

"Update readme" is similarly terrifying: https://github.com/search?o=desc&p=2&q=is%3Aopen+is%3Apr+%22...


Wow, that is just terrible. Not just the wasted hours of maintainers. But many of those PRs set off CI builds, leading to more wasted energy and potential carbon emissions.



They do not tackle the “opt in” issue though. I wish they said something like one of these:

* next year will be opt in because ...

* we are considering making next year opt in ...

* we considered making next year opt in but we discarded the option because ...


Seems like they've taken it seriously to me.


No, this doesn't solve the problem for anyone. They need to cancel it immediately, or make it opt-in, again immediately.


I've been a developer for 5 years. One of my 2020 goals, aided by being stuck at home because of a certain pandemic, is to make the leap to being a contributor in an open source project I care about.

One of my coworkers shared Hacktoberfest details and I got really fired up! I looked through repositories I could reasonably contribute bug fixes or light features to. Got myself familiar with the codebases, PR process, Hacktoberfest guidelines (that are very clear about spammy contributions).

Then reading this and seeing some of the bogus contributions myself (some by contributors who coincidentally share my name!), I don't know how to feel about this. Maybe keep up my laziness streak and punt my contributions to November (and reward myself with nerdy apparel!)? Or take this as a fun opportunity to redeem my name?


No need to be shy! Quality work is quality work, no matter when it is submitted, and earlier is usually better! If DO is going to give out free t-shirts, there's no reason to purposefully deprive yourself of them just because someone wrote a blog post.

The Julia Language [0] gets some spammy issues/pull requests as well (not only during October) and while we have the benefit of dozens of maintainers such that it's bearable, I definitely sympathize with the issues OP is dealing with. Opt-in could be a good idea, although as usual, the issue is scaling and verification.

[0] https://github.com/JuliaLang/julia


Nah, kick the laziness and get in there and do something! Good contributions stay good contributions.


Ha, that's a pretty good point! I doubt maintainers will see good PRs as spam, even if they do have to be sifted out of a puddle of nonsense.



Why do they all use the word amazing? Is there some chat group where someone pointed out that using "amazing website" was the easy way to go?


LOL, wtf is this shit?

Luckily no one has found mine :D

Edit: LOL, those PRs are cancer.


Hm. I did my first Hacktoberfest last year, and it was fun. I only remember 2 of them.

One was something like build the worst implementation possible of aspects of the .Net framework, but while a joke project, it's been around for a few years, and you actually had to make something that _worked_ and in a reasonable amount of time. It was a fun challenge.

The other one was a bit of a lark, but it led to me and the maintainer having some discussions, working out some code, and then them taking about 2/3 of the PR just because there were constraints that were immutable for them, and neither of us could come up with a viable workaround, and we parted friends.

This is something that should be fun/interesting, and presumably, adding to the open source community. The T-shirt is a cool idea, but I think I ended up doing 7 or 8 of them just because I had gotten into the mode of "I'll just skim through the list of open projects and provide some real help while I have some free time."

Maybe it's time to make it opt-in. Register your projects with DO and Hacktoberfest, and those will be the only ones that get counted. Assumption being though that if you sign up your projects, you're going to stay up to speed on PRs and merge or mark as spam in a reasonable amount of time.


I tend to take a different approach. I'll hop on one of the Discord servers I'm on, and ask if anyone is working on anything and needs help.

I like incremental games so I usually end up helping out people who are new to making them add things like save systems, or sprucing up their CSS so everything aligns better.

Sure it's not as grand as contributing to something used by a ton of people, but I'm not sure I'm good enough to be able to do that.


Why doesn't DigitalOcean require that the PR gets accepted to get a T-Shirt?


Merging a PR requires effort from the maintainer.

I frequently perceive a sense of entitlement from drive by PR contributors, as if they are giving a gift to the maintainer, when in reality, it often takes more time for the maintainer to test, review, and give feedback on contributions than if they’d done it themselves.

I imagine the people spamming repositories for a T-shirt are the same people who will harass maintainers to “just merge it already.”


I don’t see a conflict with this requirement. Maintainers should not feel pressure to speed through review or acceptance.

This means not all os projects are likely to yield you a t-shirt, but it does mean your contributions must be good and relevant to have a chance.

There is some technique to reviewing a project and knowing what is likely to be pulled based on its context.

That ought to cut down on it.

Fwiw, DO should be doing something for the projects that accept a PR from this “fest” either t-shirts or a donation to a foss advocacy org or similar.


This is one reason why I generally open an issue first. Let's talk about my problem first. Maybe I add a PR / branch / patch as a PoC. Maybe it gets closed with "No" a few months later.

However, I don't see how that works with a hackathon mindset.


>I frequently perceive a sense of entitlement from drive by PR contributors, as if they are giving a gift to the maintainer, when in reality, it often takes more time for the maintainer to test, review, and give feedback on contributions than if they’d done it themselves.

If that's what you think then why accept pull requests at all on your project?


Well for one, last time I checked GH doesn’t allow this in their UI.

And on balance, I love open source!

It’s just one of those unfortunate things where it’s hard and slow to be considerate while being easy and cheap to be inconsiderate.


They claim they don't want to exclude projects that don't merge through the PR feature: https://twitter.com/MattIPv4/status/1311388583136301059

They probably should validate a somewhat matching commit with the same e-mail address ending up in a branch or something. Few if any projects modify those.


Not a good reason. What about people who don't use GitHub at all? Why are they excluding them then?


All projects _not_ using GitHub are already excluded.


That's my point :) Arguing that because this would exclude some small number of projects does not make sense if from the get go you exclude a large number of open source projects.


Is there a logical fallacy where you appeal to only the rarest of circumstances? You want your t-shirt? Find some repo that uses PRs normally.


I'd love to know the name of it if there is! It seems to happen quite a bit.


Just a piece of anecdata in regards to this, last year in my first Hacktoberfest I made a PR for an open issue on a project that got no response from the maintainer until a week later, where he said something along the lines of "Oh sorry, I missed this one, I'll get to it soon", and then he just never reviewed it or anything else on the project ever again.


That's a perfectly normal Github experience, so it wouldn't invalidate the metric.


Do more than the bare minimum for your tee then. Hedge your bets.


Most of the PRs I make outside of work and personal projects are to obscure libraries I found useful, which usually have a maintainer who awakens from their slumber once every few lunar cycles.


Ha! Thanks for the actually out-loud laugh.

(I maintain a handful of small libraries that meet your description, and only recently "awakened from slumber" to release a new version of https://github.com/hunterloftis/throng after five years)


I'd take that a step further and require that the PR fixes an existing issue opened by a different user. Go ahead and fix a typo too if you want, but that alone won't get you a t-shirt.
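
For what it's worth, that rule is easy to state as a check. Here is a rough sketch, assuming a made-up data model: the `Issue` and `PullRequest` classes and their fields are hypothetical, not anything GitHub or DigitalOcean actually exposes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Issue:
    number: int
    author: str  # username of whoever opened the issue


@dataclass
class PullRequest:
    author: str
    fixes: Optional[Issue] = None  # issue this PR claims to close, if any


def counts_toward_shirt(pr: PullRequest) -> bool:
    """True only if the PR closes an existing issue opened by someone else."""
    if pr.fixes is None:
        # Drive-by changes (typo fixes and the like) are welcome but don't count.
        return False
    return pr.fixes.author != pr.author
```

The hard part is not the check itself but trusting the link between a PR and the issue it says it fixes, which this sketch simply takes as given.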


This seems like an eminently sensible suggestion and I'm perplexed as to why Digital Ocean wouldn't already have chosen to implement it.


I have a feeling there would be some problems with people submitting valid and helpful PRs but they aren't merged in time to be counted.


Ding ding ding, we have a winner. ;-)


https://en.wikipedia.org/wiki/Goodhart%27s_law

>When a measure becomes a target, it ceases to be a good measure.


I like the DigitalOcean team and know some of them personally. The thing is that the author is spot on with his comments.

In the OpenFaaS community we've suffered every year from spam and low quality PRs that completely ignore the contribution guidelines. The worst part is that we cannot opt out.

I would love to see the team listening to maintainers and coming up with new ideas.


Apparently they do take opt-out requests by mail now: https://twitter.com/MattIPv4/status/1311390498888781824 (be nice, seems like that guy is the sole person asked to handle all this alone)


Which won't stop the spam unless the people spamming know ahead of time they won't get credit. Not sure if that's the case or if there's a mechanism for that.


True, assuming the spammers read a notice is way too optimistic.


I was curious what the average PR looked like, so I went to their project and opened one at random: https://github.com/whatwg/html/pull/5968/commits/5d8b75ef0a3...

Yeah, it's pretty bad.


And when their attempts to "contribute" to the WHATWG and Let's Encrypt documentation repos didn't go through, they resorted to creating a handful of dummy projects of their own, then making five pull requests on those repos from a new Github account.

At least that doesn't inconvenience open-source maintainers, I guess. It's clear that the spirit of the event has been lost, though.


That one seems gone, so I clicked on another, https://github.com/whatwg/html/pull/5984/commits/f056fd22b59...:

  - please let that site know of the problem instead. Thanks!</p>
  + please let that site know of the problem instead.We will try to better ourself Thanks!</p>
Wow.


This PR would need another PR to fix white spaces and punctuality.


- This PR would need another PR to fix white spaces and punctuality.

+ This PR would need another PR to fix white spaces and punctuation.

Do I get my free T-shirt now?


I submit this PR [0] for your enjoyment:

  +#...
---

[0]: https://github.com/OscarZhou/CSharpTraining/pull/1/commits/8...


That PR unarguably makes the repo worse. That’s really bad!


I know that it's only peanuts for Digital Ocean in the grand scheme of things, but this blatant disrespect for open source maintainers makes me seriously consider not topping up the credits for my personal servers next time and moving them elsewhere.


I understand where you're coming from, but I think it's a bit unfair to level the accusation that this is blatant disrespect until DO responds to this (or decides not to).

I can even understand the initial thinking behind the FAQ entry - on the surface, it seems like a decent solution that works in theory. But as the blog post highlights, it just drove an entirely new kind of negative behavior.

I'd personally wait to see how DO responds/adapts (or doesn't) before investing the time to move my servers.


I think them not putting any meaningful measures in place over the last few years is already a response.

One such measure could've been to make it opt-in, but as mentioned on Twitter that was proposed internally and rejected. That to me already clearly signals that they were aware of the problems they were causing for maintainers (maybe not the extent), but chose to prioritize their bottom line, which to me is very much "blatant disrespect".


To be fair, judging by organizer's reaction [0] it seems that the problem of spam this year is much more widespread than last year.

[0] https://twitter.com/MattIPv4/status/1311421301291200512


Classic Goodhart’s law: “When a measure becomes a target, it ceases to be a good measure."

I don’t know the solution for this. But sheer number of PRs/commits is obviously meaningless. We just don’t have a better (cheaper) proxy to latch on to.


I think people who get a `spam` or `invalid` tag on their PR should be instantly disqualified. That could work.


That is the case. The issue this year is the PRs have been essentially automated by some YouTuber telling people how to game the system


When I read about this, I expected a lot of unnecessary but positive PRs, or some grammar-nazi-ness, but actually it is really pure spam:

   https://github.com/phpmyadmin/website/pulls?q=is%3Apr+is%3Aclosed+label%3Aspam
The changes are not even positive contributions; they literally break the documentation and add useless, unwanted spam.


Holy crap those are bad PRs


This sucks, but I’m not surprised. I’ve been contributing to Hacktoberfests for 4? 5? years now, and the explosion in size from the first year could realistically only result in this. I was planning on taking part again this year, but now I’m not so sure. I can make sure my PRs are of a minimum quality, but seeing the ones they’re up against really puts a bad taste into the whole affair.


Has anyone (DigitalOcean, Github or others) run numbers on what percentage of casual users are converted into persistent open-source contributors while initially being incentivized by Hacktoberfest? Sure spam is and always will be a problem, but if that first number is significant (of course that's a big if) then it makes sense to encourage this effort by putting in some moderation overtime. As I mentioned elsewhere in the thread I'd very gladly sift through a mountain of spam if it meant getting another reliable maintainer or two for my project.


Probably 0. That number will grow significantly if you directly ask contributors during hacktoberfest if they want to become long term contributors.

There are four types of contributors. Project maintainers and team members who are invested in the project. Contributors who want a bug or feature implemented and will do it themselves. Those who want to contribute to opensource projects but can't decide on which project or issue to work on but hacktoberfest gives them an excuse to just pick any project and if you ask them directly they might stick. Finally there are those who only do it for external rewards.

The people you can reach only through hacktoberfest are the last two groups and only the third group might stay over the long run. The first two groups will always be there, even without hacktoberfest.


I would expect that percentage to be very, very low. In fact, among the people who are making spam PRs, such as adding "amazing project" to the README, I would not be surprised if that number was zero, or at least close to it.


I suspect a one off external gift is not going to get many long term committed maintainers. The two just require such different motivations and reasons for doing the activity. But I may be wrong.


I foresee maintainers automating the process of flagging _every_ pull request as spam for the event window, and communicating that decision to the actual community beforehand.


Maybe tell legitimate pull requesters to add a special keyword to not be tagged as spam. Something like "digital ocean is pants": http://news.bbc.co.uk/2/hi/uk_news/england/hereford/worcs/75...
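
Combining the two ideas (flag everything during the event, but let through PRs that carry the agreed keyword), the decision might look roughly like this. It is a sketch only: the dates and the function name are assumptions, and actually applying a label would need whatever bot or CI hook the project already uses.

```python
from datetime import date

# Assumed event window; adjust to whatever the organizers announce.
EVENT_START = date(2020, 10, 1)
EVENT_END = date(2020, 10, 31)

# Keyword suggested above, announced to regular contributors beforehand.
PASSPHRASE = "digital ocean is pants"


def should_flag_as_spam(opened_on: date, body: str) -> bool:
    """Flag any PR opened during the event unless its body has the keyword."""
    in_window = EVENT_START <= opened_on <= EVENT_END
    has_passphrase = PASSPHRASE in body.lower()
    return in_window and not has_passphrase
```

Of course this only works until the keyword shows up in a YouTube tutorial.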


Looking at the homepage[1] for Hacktoberfest this year, the organizers do appear to try to lead people towards projects that are looking for help.

It's possible that differences in the way the event is announced and explained may lead to different expectations and results.

And sure, some people are just going to spam, especially if there are incentives involved. Looking at a few of the pull requests linked in the post, some of them definitely are of questionable contribution value.

An ideal outcome should likely still incentivize participation: for some folks, this may be their first time contributing to open source at all, and there's a non-zero chance that could lead to massive learning opportunities for them, and future contributions to open source projects -- but yes, maintainer burden is a real problem to balance against too.

Providing opt-in/out for repositories is certainly one possible approach. What other techniques are available to manage large quantities of inbound communication and filter signal/noise?

[1] - https://hacktoberfest.digitalocean.com/


Have you reached out to DigitalOcean about this? It's been my experience that they're generally pretty receptive to feedback.


Yep. They seem sympathetic [1] but unwilling to actually stop hurting us [2].

[1]: https://twitter.com/MattIPv4/status/1311366041897971712

[2]: https://twitter.com/MattIPv4/status/1311395478244818945


Hey domenic - I work for DigitalOcean on the Community team.

We also noticed an uptick in spammy PR's and we are working on a bunch of immediate and long term changes to improve the situation for Open Source maintainers like you.

Here's an official post where we walk through currently proposed changes: https://hacktoberfest.digitalocean.com/hacktoberfest-update


The limit on the number of shirts might be the issue. Did you consider removing the speed component of this?

> This year, the first 70,000 participants who successfully complete the challenge will be eligible to receive a prize.

How about choosing 70,000 participants at random? Or any other criterion that doesn't encourage quick content-less contributions?
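
To make the random option concrete, here is a small sketch. The function name and the idea of having a plain list of finishers are assumptions for illustration; nothing here reflects how DigitalOcean actually tracks participants.

```python
import random


def pick_winners(finishers, prize_count, seed=None):
    """Pick prize winners uniformly at random instead of by completion order."""
    rng = random.Random(seed)  # seed only matters if the draw must be reproducible
    finishers = list(finishers)
    if len(finishers) <= prize_count:
        # Fewer finishers than prizes: everyone gets one.
        return finishers
    return rng.sample(finishers, prize_count)
```

With a draw like this, finishing on October 1st gives no advantage over finishing on October 31st, so the rush to be among the first 70,000 goes away.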


Hi Andrenotgiant - Could you not limit submissions to only labelled issues? That way it’s opt in and you reduce spammy PRs.

Many maintainers put “Hacktoberfest” labels on issues they’re happy for newcomers to work on.


We could change it to be opt-in only, but the participants who are creating spammy PR's have already shown they don't read rules, so I think they would continue to create the same PR's.

That's why, to deal with the immediate issue we're creating an obvious, even lower-effort route for the impatient participants to take (follow a guide to create 4 PR's on your own repo) - and longer term we'll make bigger changes (maybe including an opt-in only model) that solve the perverse incentive issues that seem to be driving this PR spam.


this tweet: https://twitter.com/MattIPv4/status/1311425661840613377

suggests to me that DO expected that maintainers would email DO about spammy users. is that the case?


We have a system to invalidate spammy PR's, but not spammy users.

The problem with most "spam flagging" solutions is:

1. They only kick in _after_ the PR is created and the maintainer's time is wasted.

2. In some cases they might actually cause more harm than good. A user is flagged for spamming, gets blocked, creates another account and spams some more... etc.

For that reason we are focusing our efforts on just re-routing these impatient users into guides that have them creating PR's on their own repos.

Long term we are definitely committed to updating the program to make sure it's delivering on the mission of getting people positively involved in open source.


Doing anything completely open on the internet invites spam and misery.

This program would be better off just sponsoring projects instead. Otherwise use opt-in repos and invites/approvals of people who want to work on them, or other rules like limiting new accounts and personal repos.

The first improvements this October should be to the hackathon itself.


I did notice in the rules that you can report the spam.

If a maintainer reports your pull request as spam or behavior not in line with the project’s code of conduct, you will be ineligible to participate.

https://hacktoberfest.digitalocean.com/


This reminds me of the constant "security reports" I get.

"Changing the email doesn't expire the session on your web app". Should it? The email isn't the login, why should the session expire? It should expire on password change, maybe username change (but even then, why?). It's just a bunch of spam templates basically from people who don't really even understand the reports they are making.

And then they ask for public recognition so they can get points on one of those public security leaderboards.


Got similar from the security department of my previous company. Every time a new gadget was found in Jackson that allowed RCE if you turned on the feature to instantiate arbitrary classes based on user input (which is clearly documented with warnings in Jackson and was linted against being enabled in our internal rules _anyway_), we would get a ticket that we had 24 hours to update to the new Jackson version which just added that to their blacklist of classes not to be instantiated.

"Prototype pollution" from transitive dependencies of our frontend build scripts in node was another one where we would get spam security issues, though thankfully without the same tight deadline.


Now imagine how much worse it would be if some company decided to "help" infosec by giving away a free t-shirt for filing 4 "security reports" whether or not they were valid.


This is for.. a free t-shirt?


Yes, weirdly enough. People make new accounts, create (or generate?) a bunch of crap PRs (e.g. adding random words to READMEs, shuffling whitespace around, ...) in repos they picked somehow, ... to get a free t-shirt in the end.

Initially, at least from what I remember, the "spam" mostly was purpose-made repos a la "make a PR in this repo to add your username to this list and get a Hacktoberfest point", which didn't drain others time. But now it spreads to random projects, since they try to block those purpose-made repos.

Haven't figured out yet which repos are picked why, it doesn't seem to be entirely random. Seen it on some projects I'm involved with, others not at all. As mentioned in the article, the HTML spec is hit every year. ...


Maybe they should either just give you a t-shirt on sign up or not give a t-shirt at all but prizes to the best contributions or something like that. I'm unconvinced that the prospect of getting a t-shirt would incentivize someone to put a lot of effort in a contribution. I feel like people who do put in the effort would do it regardless of the t-shirt.


I like their other prize for this year: plant a tree. They should remove tshirt and switch to that completely. It might reduce the spam.

Free t-shirts are also pretty controversial due to child labor often involved at some point in the manufacturing.


Issues of this kind exist everywhere, albeit at varying levels. I would like to draw an analogy with the research field. Researchers solve a very specific problem in the hope of getting a paper published. Mostly they don't care about how their research would fit in the bigger picture, and it is evident from the fact that most research ideas do not get adopted in commercial products (equivalent to PRs not getting accepted).


> To be clear, myself and my fellow maintainers did not ask for this. This is not an opt-in situation.

I was under the impression PRs only applied if they fixed an issue tagged as “Hacktoberfest”. Is that not the case anymore or am I missing something?

Edit: looks like the rules changed at some point and now it’s any repo. I wonder if they should stick with labelled issues only to resolve this problem?


I'm not a public repo maintainer. I have just my own personal repos, but I will say that for me Hacktoberfest has been only a positive. I think my first ever PR on someone else's repo was because I was spurred on by Hacktoberfest to dip my toes in. Since then, I've become more and more comfortable with git and Github.

I would be very sad to see Hacktoberfest end.


It's interesting to see the other side of the story. I won't dive much further, just wanted to thank OP for the clarity of the post, explaining the rationale and above all offering an array of solutions with various effects. That's the kind of articles I get richer from. Thanks

And take care, hope you don't drown in the PRs.


Perhaps the approach should be: Provide a GH username, submit actually useful code, have it reviewed & merged. Then, when the username is listed under some commit in master you can mail them their damn shirt.

Incentivizing spam should be criminalized over the next decade if we are to maintain our humanity.


Perhaps this would align everyone's incentives:

- Only honor PRs against repositories that have opted-in

- Only allow repositories that meet certain "notability" criteria to opt-in (to prevent the creation of "fake" repositories)

- Only honor PRs that are merged within a specified time-period

- If DO has the resources, volunteer some folks to filter/close spammy PRs on the participating repos

I maintain several open-source projects, and the spam would annoy me. That said, if the constraints above were applied to Hacktoberfest, I would opt-in my own projects. I think these constraints would do a reasonable job of disincentivizing people opening spammy PRs (because I simply wouldn't merge them), while bringing my projects to the attention of developers that are looking to make a contribution to open-source in good faith.
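
To make the first three constraints concrete, here is a minimal sketch of the check a prize tracker could run per PR. Everything in it is an assumption for illustration: the `Repo` and `PullRequest` types, the use of star count as the "notability" criterion, the `MIN_STARS` threshold, and the reading of "specified time-period" as a fixed October window. The fourth point (volunteers closing spam) is a staffing matter and is not modelled.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Repo:
    opted_in: bool  # maintainer explicitly enrolled the repository
    stars: int      # stand-in for "notability" (an assumption)


@dataclass
class PullRequest:
    repo: Repo
    merged_at: Optional[datetime]  # None if the PR was never merged


# Hypothetical thresholds, chosen only for illustration.
MIN_STARS = 25
WINDOW_START = datetime(2020, 10, 1, tzinfo=timezone.utc)
WINDOW_END = datetime(2020, 11, 1, tzinfo=timezone.utc)


def counts_toward_prize(pr: PullRequest) -> bool:
    """Return True only if the PR satisfies all three proposed constraints."""
    if not pr.repo.opted_in:
        return False  # constraint 1: repository must have opted in
    if pr.repo.stars < MIN_STARS:
        return False  # constraint 2: repository must be "notable"
    if pr.merged_at is None:
        return False  # constraint 3a: PR must actually be merged
    # constraint 3b: the merge must fall inside the event window
    return WINDOW_START <= pr.merged_at < WINDOW_END
```

Under these rules a spam PR that the maintainer simply leaves unmerged never counts, so the maintainer does not have to label or report anything.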


> There is no consent involved.

Isn't it possible to disable pull requests? I thought GitHub had that capability by now. It's unfortunate but if the abuse persists on GitHub I suppose it's always possible to go back to sending patches via email.


No, AFAIK you can disable Issues, Projects, etc. but not PRs.


You can temporarily prevent non-collaborators from opening PRs via settings>moderation>interaction limits.

However I agree, it is strange that you can disable features like issues but not PRs.


I was really looking forward to Hacktoberfest this year, because it's the first time that I'm also participating as a maintainer. But I've already got my first spam PR on my rather unknown 50-star repository.

Interestingly, this year one can choose between a t-shirt or planting a tree. In other words, everyone who chooses a t-shirt is now considered a person valuing some "useless stuff" over doing something good for the world, which looks like a moral trap from DigitalOcean's side. They should just drop the t-shirt option, which would be both more useful and would hopefully stop at least some of the spammers.


I got one of these (they changed the title of the project in the readme to that of one of their personal projects) earlier and had no idea what it was about until a GitHub support person told me it was just more Hacktoberfest spam (at which point I went up and learned what that is). And apparently it's my responsibility to clean it up? No thank you.

I love the idea, but maybe let me opt in or something instead of putting the burden on me to reduce your spam. It would be trivial to have projects put a "hacktoberfest" label on something if they want to participate, for example.


While it is easy to point the finger at DO (and they do share the blame), I think it is important to remember that in the end it's individual people with individual responsibility who are doing this abuse.


Here's a crazy idea: What if you had to actually get your PR merged to get a t-shirt?

That would dramatically reduce the incentives for spam, since a spam PR is very unlikely to be merged.


Now I don't feel so bad about making PRs to my own projects.


Would it be possible to use GitHub Actions to automatically flag every new pull request as spam if it's not from a previous contributor during the month of October?
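
A minimal sketch of the decision such a workflow would have to make, assuming it is handed the pull request object from GitHub's API or webhook payload. The `created_at` and `author_association` fields are part of that object; the function name `should_flag` and the choice of which associations count as "previous contributor" are assumptions, and the wiring that actually applies a label or closes the PR is left out.

```python
from datetime import datetime

# `author_association` values that indicate the author already has a
# relationship with the repository. Treating exactly these four as
# "previous contributor" is an assumption of this sketch.
KNOWN_ASSOCIATIONS = {"OWNER", "MEMBER", "COLLABORATOR", "CONTRIBUTOR"}


def should_flag(pull_request: dict) -> bool:
    """Flag PRs opened in October (UTC) by authors new to the repository."""
    created = datetime.strptime(pull_request["created_at"], "%Y-%m-%dT%H:%M:%SZ")
    if created.month != 10:
        return False  # outside October: never flag
    association = pull_request.get("author_association", "NONE")
    return association not in KNOWN_ASSOCIATIONS
```

Note that this flags every newcomer in October, including well-meaning ones, so it is better suited to adding a "needs triage" label than to closing PRs outright.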


I've got some useless PRs during the month of October. I archived the repo two days back since I'm not actively working on it anyway and who wants to deal with spam?

https://github.com/learnbyexample/Python_Basics/pulls?q=is%3...

I'll have to see if this prompts useless PRs to my other repos. Hope not.


This is rather unfortunate. If only the spammers realized they could simply post four PRs on their own repos. That would at least limit the problem somewhat.


I'm always surprised by the amount of effort a lot of people will put into winning so little. I mean, some people spam pull requests to get a free t-shirt?


Such a massive marketing fail. Hacktoberfest reminds me of an event called Oktoberfest where alcoholics from around the world join their ranks and collectively destroy their livers and promote destructive drug taking (alcohol is a drug). On second thought, given how much of a car crash this thing is, perhaps Hacktoberfest bears an appropriate name in the end. Damn DigitalOcean... get you . together!


> Hacktoberfest reminds me of an event called Oktoberfest

I'm pretty sure that's 100% intentional.

> where alcoholics from around the world join their ranks and collectively destroy their livers and promote destructive drug taking

Also called "having fun" and "taking a break from the ordinary". You know, things commonly considered recreation.

Of course some people are going to be over-doing it, but that applies to anything anywhere.


Can I just buy a t-shirt and plant a tree by doing so?

I have only little time these days but I like the design and would like to add to the good cause behind it.


Making the whole thing opt-in would be ideal, but failing that there ought to be a simple way to opt an entire repository out.


It seems like we could have a script that marks all low-value PRs during the 7-day window as spam, and automatically emails DigitalOcean about it.

And then run another script to try to find high-value/non-spam PRs and suggest those to the maintainers for a second look.
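
A rough sketch of what the two scripts might share, assuming "low-value" is approximated as "a tiny change that touches only documentation files". The `FileChange` type, the suffix list, and the `MAX_TRIVIAL_LINES` cutoff are all hypothetical, and the emailing step is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class FileChange:
    filename: str
    additions: int
    deletions: int


# Hypothetical cutoffs; a real project would tune these.
DOC_SUFFIXES = (".md", ".txt", ".rst")
MAX_TRIVIAL_LINES = 3


def looks_low_value(changes: List[FileChange]) -> bool:
    """Heuristic: an empty PR, or a tiny change touching only doc files."""
    if not changes:
        return True
    total = sum(c.additions + c.deletions for c in changes)
    docs_only = all(c.filename.lower().endswith(DOC_SUFFIXES) for c in changes)
    return docs_only and total <= MAX_TRIVIAL_LINES


def triage(prs: Dict[int, List[FileChange]]) -> Tuple[List[int], List[int]]:
    """Split PR numbers into (spam candidates, PRs worth a second look)."""
    spam = sorted(n for n, changes in prs.items() if looks_low_value(changes))
    second_look = sorted(n for n in prs if n not in spam)
    return spam, second_look
```

A heuristic this crude also catches genuine one-line typo fixes, so its output is a list of candidates for a human to check, not a verdict.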


High quality plain t-shirts are $6 in bulk.

Why would you save $6 to turn yourself into an unpaid walking billboard for someone else?

To me, wearing clothes with logos or names on them that depict a company or brand that you don’t personally own is the ultimate low-status move.


It seems that you can at least opt-out by sending them an email:

https://twitter.com/MattIPv4/status/1311366041897971712


Counterexample: on the mozilla-mobile/fenix and mozilla-mobile/android-components repositories we received dozens and dozens of great PRs last year. Hacktoberfest is something we always look forward to.


This is unfortunately so true. Last year I got away with "only" 4 or 5 spam PRs (and 0 legitimate ones). But this year I've already got 2 before my timezone reached October!

Digital Ocean, please stop this.



We got hit with someone that blindly updated a file path in a mirror only repo.

He threw a fit though in response about us "not building a community" in the mirror repo. Heh. Get fucked buddy.


Maybe projects should add to their code of conduct "No spelling or copy changes are acceptable during October"


I tagged a few issues last year and got 38 spam PRs.

This happened because DigitalOcean displayed my issues on their Hacktoberfest page.


Did you get any valuable PRs or only spam PRs?


No


A guy that works at Google complaining about minor PR spam and attacking a generally positive open source program is a bit rich given we can't visit a website or watch a video online without having ads shoved down our throats.

Anyway there is a simple solution... Archive the repo for the month of October, take a break from OSS, and chill out.


I am proud to admit I am the type of guy that will do simple PRs to fix documentation.


> DigitalOcean seems to be aware that they have a spam problem. Their solution, per their FAQ, is to put the burden solely on the shoulders of maintainers.

During such events, I think maintainers (of popular projects) should get some help with spam-filtering PRs.


Damn, that seems super annoying.


I feel it’s a bit unfair to blame Digital Ocean for the actions of individuals.


They set up the incentive structure, they're responsible for the effects


god, this is a really bad side effect :( please be mindful


I get that this is aggravating, but I think the hyperbolic hammering on the event that's presumably about promoting open source is misguided.

Opt-in could help. So could better access control tools from GitHub.

DO could make it so that users have to use a specific tag on the PRs; there are tons of ways maintainers could filter on that.

DO could switch the prizes to be something less likely to draw spam than a t-shirt would - like free cloud resources.

TLDR; in the spirit of software - let's iterate on this imperfect event instead of junking it outright.



This isn't the first iteration or second iteration or third iteration. Feedback was given and nothing has changed.

Please stop blaming the victims for not doing enough.


Indeed, DO could have done something about that at some point in the past 5 years.

(Although the free cloud resources idea sounds worse to me - those have actual value (spam, mining, ...), so there'd be a real incentive to try and automatically game this)


[flagged]


> i setup some droplets many years ago but they wont let me cancel.

Have you tried to log into your control panel and delete your droplets? This doesn't sound like a DigitalOcean problem on its face.


that's exactly the problem: i can't log in, and account recovery efforts have gone in circles


Fix the log in issue and delete your servers.

What would you expect Amazon Web Services to say if you were unable to log in on your AWS account and remove your EC2 instances?


[flagged]


Your server isn't going to delete itself. Seeing how you've responded here, this isn't DigitalOcean's problem, but solely yours.


it's not unheard of for an email address to be blacklisted by a transactional email provider or put on a suppression list. See if you can raise a ticket and troubleshoot why you can't get password reset emails.

I'm not sure why you're expecting a phone call though.


is the crankiness and frustration that obvious? perhaps if i were sapping thousands from your bank account you might be cranky too. i digress never meant to full on rant at hn, just wanted to post a PSA. good day to you good sir.


update: someone from DO reached out and helped to fix the problem. it was still an experience getting to this point... so big up to whoever was lurking on hn. :) all is well that ends well


> To be clear, myself and my fellow maintainers did not ask for this.

Oh yes you did: by using github.

You can self-host and nobody will bother you in a way that you can do little about.


I feel bad for the repository maintainer, but they take Digital Ocean's initiative in extremely bad faith during this article. People are frequently incentivized to do the wrong thing and, oh look, here we are. This situation could be resolved or at least improved upon by Hacktoberfest and a load of maintainers sitting down and talking things out.

This is a comms problem, not a "corporate-sponsored distributed denial of service attack against the open source maintainer community". The well-meaning frequently cause more problems than they solve, but it is better to have them on the inside of the tent pissing out than on the outside of the tent pissing in, it is said.


>extremely bad faith during this article

No, this article never even implies DO is doing this intentionally. The tone is annoyed, even aggrieved, but not really angry. The author, in fact, seems to be rightly applying Hanlon's Razor, and is constructively figuring out how to fix this unintended down-side to what should be a nice gesture by DO.


Perhaps, but it actively says things like “most importantly, we can remember that this is how DigitalOcean treats the open source maintainer community, and stay away from their products going forward”, which is not constructive.

I think DO should sort this out and there’s any number of decent options just in this HN thread, but this post is only going to help if it generates enough negative publicity on HN for DO to notice. In and of itself, it’s just another fed-up dev.



After how many years of the same pattern repeating is the complaint valid in your eyes?


I didn’t say the complaint wasn’t valid.


How do you square this opinion with https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...?, the youngest committer to the Linux kernel?


Have you read the pull requests? I did and I'm appalled that someone would waste maintainers' time like this. It's so pointless, and for what? A t-shirt?

15 spam PRs in last two hours https://github.com/phpmyadmin/website/pulls?q=is%3Apr

9 spam PRs in the last day https://github.com/whatwg/html/pulls?q=is%3Apr+is%3Aclosed+l...

Take a look at this user who has made 21 commits this year. 20 of the commits are from today and all of them are for valuable additions like:

* "made with love"

* "you will love it"

* " with "

* "Please do try we have made this for you"

* "That was amazing dud"

* "lovely i loved it"

* "and awesome"

* "and cool"

https://github.com/shibin37

EDIT: Formatting


Yeah, checking out the author seems to provide the needed understanding.


As completely unrelated, I would presume.


How then do you decide what is a PR just for the free shirt?


Your Linux example a) is by a long-term contributor having a bit of fun and b) is better quality than many of the spam commits in that it actually makes things a minuscule amount better. IMHO, actual typo fixes in the scope of Hacktoberfest are okay - they help a tiny bit and they don't hurt much. If someone learns how to make actual PRs through that, fine by me.

Compare that with clear spam, e.g. this PR, which adds a copy of someone's website to the HTML specification repo: https://github.com/whatwg/html/pull/5972/files There is no scenario in which that is even a potential improvement.


I don't see how this is a problem. This is what you signed up for.

I am also an open source maintainer, and would love for Digital Ocean to drive by my project.

Isn't this what we signed up for as open source developers?

Maybe I'm just lonely.


Everyone's a critic. Not all changes are one hundred percent good, but just saying "no. stop." isn't the correct answer.


This isn't some backseat driver telling DO how to run their event better. This is literally someone who is supposed to be benefited by the event (an open source maintainer) who is saying the event has negative value for them. They have skin in the game, they're doing this for free compared to the DO employees who are being paid to administer the event.


Why? Open source has worked perfectly fine before this and maintainers are saying it's a net negative. So why not stop? Because DO doesn't get a PR campaign anymore?


DigitalOcean's network is one of the worst on the internet with regard to abusive traffic (DDoS, spam email, origin points for hacking attempts). I know this term is as good as dead these days, but their SysOps are not good netizens. Abuse complaints never receive replies, their system images are insecure by default, and they encourage novice users to take extreme risks in order to sell more product. /End rant.


This reminds me a bit of this exchange (parent and child comment):

https://news.ycombinator.com/item?id=22407343

https://news.ycombinator.com/item?id=22408905


Well, my attempt to run zmap on DO quickly resulted in a banned account, so they do _something_


I have never received a response to abuse complaints about zombie machines hitting my network, not once. If you run a website or network you would do well to add the entire DO IP space to your firewall's blacklist.



