Hacker News
I went down the rabbit hole of buying GitHub Stars, so you won't have to (the-guild.dev)
322 points by Urigo on June 1, 2023 | 178 comments



Maybe this has been obvious or well known, but this was a revelation to me:

> It took six hours for my order to complete, and the accounts look legit; each has a profile picture, different companies that they work for, a couple of repositories, and a contribution to one or more open-source projects, next to being a GitHub member for over a year.

This is the motivation for the garbage AI-generated PRs or insubstantial docs changes that "people" make. It doesn't matter if they're good. They only exist to add surface-level legitimacy to fake accounts so that services like this one can exist.


I've noticed this on Product Hunt too; there are lots of accounts that just summarize the post, and you can tell they don't really know what the product is or haven't tried it. But other than that they seem completely legitimate.


And there are bots on Reddit and on Hacker News that will provide automatic links to paywall-skippers and such.

Not all bots have bad intent. Some compensate for the lack of a protocol- or platform-native feature. Even so, all bots are occasionally annoying, because they consume human time.


Intent is a matter of perspective.

I'm sure that from the perspective of a publication that paywalls, those bots are not a good thing.


Exactly. I've denied so many PRs that only fixed a typo in a comment or some other trivial thing. I know what they're doing.


Do you then fix the typo with no attribution to the original reporter? Or do you just choose not to fix a known error because you question the motives of the reporter?


Usually this is stuff like fixing "// TODO: this fails reguarly in the CI" (real example from the last such PR). It's the kind of thing I'll fix if I see it, but it's not really worth going out of my way for: it's just a comment in some test code (not user-facing), and the few people who will ever see that comment understand "reguarly" just as well. It's a non-issue and basically just spam IMHO.

These are always either bots or people looking to bolster their CV by bragging they "contributed" n PRs to n repos. I signed up to collaboratively make some (hopefully nice) software, not to deal with a stream of PRs like this.

Typos in README or publicly facing docs are different; I usually merge those (and those are almost always good faith too, because usually a real human picks up on them before the bots/script kiddies do).


You deny PRs just for typos? I've done a few of those in the past, all in good faith. Perhaps I'm the bot?


No I have denied PRs that _only_ fixed a typo. Edited to be more clear.


I think the people responding are missing the point.

I've also submitted PRs that just fixed typos, and I've considered that a legit contribution.

But if I maintained a high-profile project right now, I'd at least take pause in thinking some of these accounts could be spam reputation-boosting accounts that only make comments/PRs to lend legitimacy to the account when it ultimately stars some artificially boosted repo.

And making it harder to detect star manipulation erodes the signals of trust that have been used on GitHub, and can ultimately become a security concern. Historically I've looked at the number of contributors, stars, downloads, and issues open/closed to get a rough idea of how secure some npm dependency might be: basically the idea that "more eyeballs" can mean a slightly lower chance of a massive security issue, especially in security-critical code like OAuth libraries.
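Those rough signals are all exposed by the public GitHub REST API. Here's a minimal sketch of pulling and reducing them; the field names follow the API, but the 1,000-star cutoff is an arbitrary illustration, not a recommendation:

```python
import json
from urllib.request import Request, urlopen

def fetch_repo(owner: str, repo: str) -> dict:
    """Fetch public repo metadata from the GitHub REST API."""
    req = Request(
        f"https://api.github.com/repos/{owner}/{repo}",
        headers={"Accept": "application/vnd.github+json"},
    )
    with urlopen(req) as resp:
        return json.load(resp)

def rough_signals(info: dict) -> dict:
    """Reduce repo metadata to the 'eyeball count' signals above.
    The 1,000-star cutoff is an arbitrary illustration."""
    stars = info.get("stargazers_count", 0)
    return {
        "stars": stars,
        "forks": info.get("forks_count", 0),
        "open_issues": info.get("open_issues_count", 0),
        "widely_watched": stars > 1000,
    }

# Canned data; a live call would be rough_signals(fetch_repo("owner", "repo")):
print(rough_signals({"stargazers_count": 44000, "forks_count": 5100,
                     "open_issues_count": 900}))
# {'stars': 44000, 'forks': 5100, 'open_issues': 900, 'widely_watched': True}
```

Which is exactly why bought stars are so corrosive: they poison precisely the numbers this kind of quick check relies on.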

I don't know what the solution is here. Maybe requiring people to sign a CLA, like some corporate open source projects do, is at least enough of a barrier.


I issued a PR to correct a typo in a popular ML library just two weeks ago. I am not a bot, so this strategy seems flawed.


Any mechanism to combat abuse will have false positives as well.


What proof do you have that you are not a bot?


He passed the Voight-Kampff test.


So you’ve just left the typo in the project? I’ve opened PRs just to fix one or two misplaced characters before. Just trying to do my part to help


Could just fix the typo yourself, reference the PR in your commit, then close the PR without pulling.

I imagine contributors won’t exactly be happy with that though.


Honestly, who cares if the end result is the same? Their forked repo with the patch on it still exists, and the patch was incorporated in whatever way made sense to the original repo owner.

OK, they lose a credit, but really, who cares? People can see what an accepted pull request consisted of, so I'm not sure they're kidding anybody in terms of boosting their reputation with credit for fixing typos.

All the same, I'm just glad to see people improve their presentation, especially typos.


I probably submitted more than 10 PRs that only fixed a single word in projects I like.

I do that when I see typos in documentation.


Same, and I appreciate similar contributions to my own projects


I had the same thought. Is fixing typos not contributing? Should I not be submitting PRs to help with documentation/polish/etc? I never even thought that trying to help would be viewed as malicious.


Project maintainers frequently hold themselves back with amateur presentation, and that includes typos. It's hard to take some things seriously if they are failing at English, let alone their programming language of choice. It's sad because there's plenty of amazing open source out there, but the presentation is terribad.

IMO, the solution is simple: allow project maintainers to disable pointless metrics that would incentivize the GitHub equivalent of karma farming.

I also think the quality of comments on HN suffers because the karma score is a metric visible to the end user. Reddit particularly. The view count on tweets too.


Denying that PR and fixing it yourself is taking credit for others' work, and leaving it in is no good either. I don't see any upside to rejecting them. I'd be ashamed of myself for that.


Well, I submitted such a PR today. It hasn't been rejected (just yet).


Lol so you didn't accept PRs to "own the bots"? Your horse might be a bit too high you know.


Relevant xkcd: https://xkcd.com/810/


Thus, we win once the bots submit useful PRs.


Someone submitted a PR to me yesterday on my open source repo fixing a typo — did not occur to me it was part of this strange con


It probably was legitimate. There are tons of people (myself included) that send PRs to fix typos and such. I also accept PRs like that to my own projects.

A lot more people will read the docs than the code, and typos are annoying and, for some people, highly distracting (OCD).


It probably was not legitimate. There are wonderful people like you but they are dwarfed by the thousands of scam accounts using this method to try and engineer legitimacy.


I did it too and now I hope people don't think I tried to scam them by being nice...


I doubt it. I too submit small fixes.


Same here. Typos bother me. Rather than complaining, I just submit a PR to fix particularly egregious ones.


[Citation needed]


It's reassuring to know that I'm not the only one with this issue. Typos can be extremely disruptive, often compelling me to re-read the entire sentence.


How is it a con if it's a useful contribution? It's not my job to filter bot accounts used for starring repos. If a PR improves my project, I'll accept it.


If it’s useful (actually), it’s not a con on you of course! It does pollute the overall ecosystem though, since it is used to prop up fake stars and the like.


Same. There ought to be a way to submit inconsequential things like typos through a separate system that doesn't count them as pull requests. That way, people who are genuinely acting in good faith shouldn't really care that it doesn't add to their "PR score".


Yes, botspam is bad. But if a change is good, it's good: merge it and get on with your life.

If someone wants to write "check_spelling_bot" and get a ton of github karma, I have 0 issue with it. In fact, I encourage GitHub to do it :).


I've never seen "good" changes from these bots. The code they propose often isn't even working, it's just a change that looks like it might do something. Some of the spellcheck changes might be beneficial, but there's an ethical question about whether accepting those PRs (versus making the change in a new commit directly) is a net-negative to the community overall.

If you got spam, but the spam was useful to you personally, not marking it as spam prevents your email provider from flagging the account as spam to the wider world.


But if it's useful, I'd argue it's definitionally not spam for you. And if most people don't mark it as spam, so it doesn't get flagged, that means it just isn't spam. Bots send emails too, and not all bot-sent emails are spam.

Relevant xkcd: https://xkcd.com/810/


> But if it's useful, I'd argue it's definitionally not spam for you.

That is absolutely false. Spam is not defined based on its utility, but based on whether a message is solicited.

If the author of a repo hasn't signed their repo up for an automated PR bot, that bot sending out PRs is absolutely spam.


Most definitions of it include whether the message is irrelevant or unwanted. Trying to define it purely as unsolicited gets into weird cases where you'd say an emergency tornado alert is spam because it's unsolicited. Maybe there's a definition of spam where that's the case, but if the residents of an area find the alert useful, it seems unlikely that they would describe it as spam, even though they didn't solicit the warning.


I don't see any definition that addresses "irrelevancy", but many do mention "unwanted", which is effectively the same thing as saying "unsolicited".

Edit: Emergency alerts are solicited by owning a phone in a country that mandates that cell carriers send those messages.


> Edit: Emergency alerts are solicited by owning a phone in a country that mandates that cell carriers send those messages.

That's so pedantic, I'm inclined to disagree - if it's even correct. Not everything that's unsolicited is spam. An unsolicited thank you note isn't spam.

Other components of spam are traditionally a large, indiscriminate number of recipients and low-quality messaging. "Irrelevancy" partially captures both of these.


You may view it as "pedantic" but it is how language works.

There is a complex set of social expectations around when a thank you note is unsolicited. I could start trying to describe them, but then you'd probably call me pedantic again.


For me, googling "spam definition" brings up Google's definition, in which "irrelevant" is the first word.

Before getting sidetracked on definitions of other words: I would be deeply surprised if asking a bunch of people on the street "If you received a message you didn't ask for and didn't expect, but it turns out you are very glad you received it and it provided you value, is that spam?" resulted in a broad consensus around "yes".

Unwanted and unsolicited are definitely not the same thing; whether or not you solicit something has nothing to do with whether you want it, so I'm not sure how those are synonymous. And if being mandated to receive a message by someone else counts as you soliciting it, where "solicit" in its most common sense means "to ask for", then I guess everything is solicited?


Since you failed to post the "google definition", here's the one I get at the top when googling "spam definition"

> Unsolicited e-mail, often of a commercial nature, sent indiscriminately to multiple mailing lists, individuals, or newsgroups; junk e-mail.

Note, "unsolicited" is the first word and "irrelevant" does not appear anywhere and there is no discussion of the utility of the message to the recipient.


PR spam is annoying, you end up with “oxford-comma-bots” fighting “no-oxford-commas” forever to get credibility.


This is valuing a human's time spent filtering good changes from bad at zero. Imagine if you had to receive all email and judge each individual message as spam or not spam.


I occasionally ponder if I should send many small changes at the company I work for. Not because I'm not a real boy but because every so often my manager will be impressed by my commit count. Not that I'm trying to inflate it (I just try really hard to avoid meetings so I can write code which I actually enjoy), but...sure seems like it encourages the wrong behavior.


Oh wow, thanks for pointing it out. It’s now clicking for me.

A few months back, I noticed that there were some accounts posting issues on open source repositories, but their issues were a direct copy/paste of mine. I couldn’t figure out why they would copy/paste my issue so I dismissed it.

Now it makes sense!


What sort of "AI" model drives bots like this? Is it really AI? Or is it more like automated scrapers that run some basic deterministic functions (like spellchecks)?

Or are people actually deploying LLMs to inspect code and produce usable optimizations?

Because that would be an interesting beneficial side effect to an otherwise "nefarious" marketing hustle.


I expect it's a combination of scripting, some manual effort, and AI.

For what it's worth, I've seen lots of examples of this, and "usable optimizations" is entirely false. The PRs are often not working code. It's scattershot. The point isn't to make a PR that benefits the project in any way, it's to fill in the green square on the profile so it looks like there's a human doing things.

Anyone giving this stuff more than a cursory glance would see that it's all bullshit. But the point isn't to stand up to scrutiny. It's to defeat abuse protection measures with legitimate-looking activity. And in the case of stars, to make it look to anyone who's just glancing at the star-ers that there are real people starring the repos.

It's deviously clever and absolutely terrible.


Depending on where the scam is run, I wouldn't consider it out of the question that it would be economical to do manually.


This is the sad truth for many things that go "viral". If you are a creator or trying to sell an online product and you have scruples about this sort of thing, you are inherently working at a disadvantage. They allow steroids in this league and everybody who isn't taking them is almost certainly a loser by default.


I have a friend who basically kickstarted his entire career by boosting posts on Reddit with bots he ran. I won't knock his talent, he's a brilliant engineer, but I 100% believe he's gotten where he is in life because he knows how to play the game properly. Heck, getting his indie business off the ground is only one of several examples he's admitted to me of using Tor networks and bots to boost his ranking in competitions or on sites like Reddit/Twitter/etc.


Funny you should mention Reddit, because IIRC Reddit itself started with first-party bots pretending to be users, manufacturing "social proof" to cheat around initial user acquisition hurdles.


You recalled correctly! Huffman isn't shy about it.

https://www.vice.com/en/article/z4444w/how-reddit-got-huge-t...


This is a very old trick; the Nazi party started their membership numbers at 500 to make it appear they had more members. A certain failed Austrian painter had membership number 555.


> Funny you should mention Reddit, because IIRC Reddit itself started with first-party bots pretending to be users, manufacturing "social proof" to cheat around initial user acquisition hurdles.

"started"?

They are still doing that today to inflate their user count...


Maybe any game where the "proper" way to play involves dishonesty and sociopathic behavior isn't one we should be structuring society around, and therefore we really shouldn't be trying to play it in the first place.


How would you check and enforce it to make sure?

Obviously some fraction of the population will lie about being honest and upright.


We don't have to do that, just make sure that dishonest behavior is not the dominant strategy. In our society, we could start by ceasing to promote and reward such behavior constantly.


> We don't have to do that,...

Why not?


People don't realize just how many books, blogs, websites, apps, and open source projects are out there. What's going to happen on day one, most likely, is that the only entities viewing your work will be bots scraping for content to train on.

If you think you won’t succeed because you just refuse to buy likes, I think you’re still way off the mark.

You need to release to a market, that you directly engage with. Releasing into the void is a mostly self-serving exercise.


I think this can be further generalized: advertising and marketing are as important as building the product itself. Too many people (especially engineers) think that if you create something that is 10% better than the existing mainstream solution then the entire user base will flock to you overnight, but in reality no one will care because no one will know about it.


Not only that, but even if I *do* know about it, I still won't want to use it. I already know the old mainstream thing. A 10% efficiency gain isn't enough for me to invest my limited time in learning a new tool, platform, data model, etc. 70%, 80%? Maybe then, but I'd still be suspicious.

What usually does cause me to adopt something new is 2+ co-worker / colleague recommendations. For example, I would never have started using k9s until 2+ colleagues started to hound me to stop typing out kubectl commands and to switch to k9s ASAP.


just being a stickler here but 10% is probably enough


If it's consistently 10% better, and every other aspect remains the same.


Paul Graham would say for exponential growth, 10x better is the bar


This is really important. One of my friends worked at a startup, an online retail store similar to Amazon (it even came out before Amazon), but the marketing wasn't there, so the service failed.

Nobody knew about it, so it didn't make any profit. Great concept? Absolutely! Look at Amazon today.


It used to be called "growth hacking".


Yes and no. To give an example, people have been cheating at YouTube for ages with viewbotting, but Mr. Beast managed to smash the viewership of any of those people by using YouTube’s analytics to crush the algorithm. Also as you scale up, the less viable the strategy becomes due to risk of exposure.

The idea that any creator or advertiser NOT buying views is “almost certainly a loser by default” is an exaggeration.


> people have been cheating at YouTube for ages with viewbotting, but Mr. Beast managed to smash the viewership of any of those people by using YouTube’s analytics to crush the algorithm.

This... doesn't sound much better. In some sense, it's buying views with extra steps. Both methods are about gaming YouTube instead of creating quality content.


>Both methods are about gaming YouTube instead of creating quality content.

If you scrape away the abstractions for a second, what you will find is Mr. Beast used data to tailor his videos to get his viewers to watch lots of ads and buy lots of products and love his videos and become repeat customers. Mr. Beast is rather transparent about all of this. If you think he’s a creepy manipulative guy, you just don’t watch his content.

The whole thing with buying engagement is that you’re confusing the signal. Instead of coming through the legitimate channel where you openly pay for advertising, you intentionally deceive people, you make it just a little bit harder for the poorer and unconnected to achieve social mobility, you pollute a useful signal with false data, and you do a large number of things which annoy netizens in particular.

All I’m just saying, is that there’s more than one path to be successful. I know people who used cert dumps on the pretence that everybody else is doing it and they won’t put themselves behind, and I know others who literally just did the work and it really didn’t take THAT long in the grand scheme of things and they actually came away having learned something useful at the end of the day. Both kinds of person succeeded.


"Quality content" is decided by YouTube's algorithm (viewer retention etc).

So no in the end, the winner of the algorithm is the one creating the "quality content".

Or in other words, if your viewer retention is terrible, it's probably because your videos are just not that good.


People use stars to get a gut check on how serious a project is. I'll be honest, I've been trained to look at GitHub stars, most recent commits, and how responsive the project is to issues to see if I should use something...


This is ridiculous, obviously flawed in its efficacy.

My way is way better - just simply look at the entire code base, read every file, make sure nothing phones home, internally criticize the design choices, realize I could have written something way better, start on the project, get halfway done, realize I wasted a lot of time trying to do a stupid project just to save a few keystrokes on a tiny program I almost never use, give up, and finally, realize I didn't want that package anyway.

Clearly. obviously. better.

(There's probably a krazam on this, but /s if anyone needs it)


I have no idea why you're marking it /s. This is my life


I don't remember posting this comment.


This is the first time I've run across Krazam and wow this is hilarious. Thanks!


undiagnosed ADHD be like


I look at 'most recent commits', but sometimes with smaller pieces of software, no commits for a year or two can mean it's very stable and it does that one thing very well...


Yep. When we evaluate new open source packages we look at a lot of factors, but they include stars, age, commit frequency, issues, etc.

* stars - the likelihood that an issue affecting us will affect someone else. More people affected, the more likely it gets fixed.

* age and commit frequency - old projects with continuous commit history are preferable. It's okay for recent commits to become trivial/minor. Likely a sign of mature software in a maintenance mode. It's _very_ concerning when there are no commits.

* issues - this is typically a lot harder to simulate simply by buying stars. There are four factors here:

  * Do the number of issues line up with the seeming popularity of the project?

  * Do the types of issues line up with the types of problems you'd expect from the project? 

  * Are there any issues, particularly open issues, that block certain functionality or use cases?

  * Do the maintainers actually interact with and discuss issues?
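The first of those factors can be sketched as a sanity check: purchased stars inflate the star count without generating real usage, so a "popular" repo with almost no issue traffic is suspicious. The 500-star cutoff and the 1-issue-per-200-stars floor below are arbitrary illustration values, not recommendations:

```python
def issues_look_organic(stars: int, total_issues: int) -> bool:
    """Does the issue count line up with the apparent popularity?

    Small repos legitimately have little issue traffic, so skip them;
    for larger repos, expect at least one issue per 200 stars.
    Both thresholds are arbitrary illustration values."""
    if stars < 500:  # too small to expect meaningful issue traffic
        return True
    return total_issues >= stars / 200

# A repo with 40k stars but only 20 issues ever filed fails the check:
print(issues_look_organic(40_000, 20))   # False
print(issues_look_organic(40_000, 900))  # True
```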


> stars - the likelihood that an issue affecting us will affect someone else. More people affected, the more likely it gets fixed

So you are equating stars with adoption, which they really aren't a good measure of. Even in the most favorable interpretation, stars can only serve as a proxy for adoption when used in combination with a lot of other signals.

Especially for hyped topics, even ungamed stars are more directly a proxy for interest rather than adoption. And as many startup founders of failed freemium products can tell you, very often interest doesn't translate to adoption.

I had to learn that myself the hard way when I helped create a top 5 starred Rust project at the time (~5k stars), which I can tell you for certain no more than a handful of people outside the company used.


Well, that's why we don't use stars on their own.

As I explained, we're building an aggregate view across a bunch of metrics with human judgement making the final call. Further, we're often comparing against alternatives so this isn't really a Yes/No type of thing (especially on stars). It's a general sense of a project's well-being and stability.

----

LangChain is a great example:

* 44k stars, 5.1k forks

* Roughly 8 months old (very young)

* Very active commit history

* Insane number of issues

* Lots of open PRs

This is a tool that we'd be hesitant to use without proper risk mitigations. It's young, it's moving very quickly. While it has a lot of visibility, it's not clear if that's all moving in the right direction.

By contrast, underscore:

* 27k stars, 5.6k forks

* 11+ years old

* Stalling commit history

* Some issues, but not many.

* Not many open PRs

This is clearly a project on the opposite side of the spectrum. If you didn't know about mature alternatives (lodash, ramda, etc.), it'd be a safe project to approve.


I first check projects for regular commit history, and long commit history.

Next I try to read through issues and PRs to see how the maintainers deal with the community.

Finally I try to develop a bit with the package. Stars generally don't factor in.


I look to see if a project has a test suite and how many open issues without any response.


It mystifies me how people think Github is important to use for the community. Stars are often cited as a primary reason, presumably so projects can list their star counts on their web pages.

To me, the choice of Git repo should be one of the more portable aspects of a project. I know there are hideous lock-in mechanisms like issue lists and wiki discussions, but hopefully someone will standardise them one day into Git objects.


I see two community reasons for using GitHub:

1. People already have accounts, so it lowers the barrier to reporting issues and contributing.

2. GitHub provides some level of discovery that's hard to replicate otherwise.

Both of these are a function of GitHub being popular. It doesn't have to be a particularly great tool if it's what people are already familiar with. I wish it were otherwise, but I don't see a real alternative that can compensate for these advantages in the short term.

I love your idea of having issues/discussions/etc managed directly in Git—that seems like a more flexible and adaptable system. Even if somebody developed a standard like that, though, it seems impossible to get any of the big repo hosting sites to adopt it since it directly competes with their own value proposition :(.


Also 3) I think GitHub is the best product out there at the moment (and I suspect a lot of people share the sentiment).


To be clear, though repetitious, I was only mystified about the community aspect.


I can't find the podcast but, IIRC, Bryan Cantrill said that at one point VCs would look at GitHub stars to gauge interest in various projects. Apparently this might have been a factor in Docker's rounds.

I think this was Oxide and Friends show, but I'm unsure.

Makes sense on its face, popular projects = potential money. Whether that's a good signal or not is a different discussion. I even think on the episode they mention how misguided this was, people using something might not mean there is potential for revenue if the people are using it solely because it is free and licensed appropriately.

People are just looking for patterns anywhere to justify whatever they want. I've seen npmtrends used to justify adopting certain libraries. Really makes you wonder.


Can confirm that it was Oxide and Friends; just heard this one the other day. Here's the episode for anyone interested: https://www.youtube.com/watch?v=l9LTJdT0sZ8


This is still happening today. There are multiple VC groups trawling through GitHub trending and star lists to find potential investment targets.

The Shopify acquisition of Remix (https://news.ycombinator.com/item?id=33405997) established the viability of this investment strategy.


Just in case you don't already know this: the wiki is just another Git repo that can be cloned (and even pushed to).
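For anyone who hasn't tried it: the wiki lives at the main repo's URL with a `.wiki.git` suffix. A quick sketch (OWNER/REPO are placeholders; the wiki needs at least one page created via the web UI before it exists as a repo):

```shell
# The wiki for github.com/OWNER/REPO is its own repo at a .wiki.git URL.
git clone https://github.com/OWNER/REPO.wiki.git
cd REPO.wiki
echo "Content added via plain git" > New-Page.md
git add New-Page.md
git commit -m "Add a page without touching the web UI"
git push   # requires write access to the wiki
```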


I didn't know that. That's cool.


It's pretty annoying personally IMO. You should be able to point the Wiki tab to a folder in the main repo instead of having to maintain two different repos.


I see; I suppose not being able to make a single commit that updates the wiki and codebase at the same time is a bit annoying. I'm just pleased it's Git at all!


And it would be nice to have the branch/PR workflows for them. Better to just use a docs/ or wiki/ folder since markdown files are rendered.


GitHub is great for discoverability, not exactly "community". Because GitHub is a high-ranking and trusted site, some of its credibility can wash off onto one's own project.

Stars are a vanity metric, but they can help with monitoring some early growth.

Everyone on this thread should list their projects, then we all star each others stuff all the way up!


Yeah discoverability is a good point, although I don't think I've ever used Github that way (generally I discover Git repos from a URL), I'm sure some people do!


Perhaps it's like fiat currency:

If enough people act like it has value, eventually it kinda does.


Like non-fiat currency as well, in that way.


Regarding the issues, there are some projects like git-bug https://github.com/MichaelMure/git-bug trying to embed these sorts of meta-work into git.


> It mystifies me how people think Github is important to use for the community.

Github is a package registry for NPM, Docker, Ruby Gems, Maven, Gradle, and NuGet. Quite a lot of things would break if it disappeared.


Well, and rather more importantly it's a very popular Git repository. But I'm asking about the community aspects, not the technical delivery.


A lot of communities have migrated to GitHub because it looks more "modern" than a mailing list. Besides, GitHub is the only visible part of the community from the perspective of newcomers.


No, people don't use GitHub because it looks more modern than a mailing list. They do so because it more closely matches their expectations of how to use computers and software.

Mailing-list likers have not come to grips with this in the last twenty years and I don't expect them to start now, but the dismissive attitude might be preventing you from learning something. "Principal Skinner, it is probably not the children who are wrong."


Why do you assume I'm dismissing something? I just want to point out that some communities only exist on GitHub.


The real revelation here is that people take this kind of Imaginary Internet Points seriously enough that there's a tool to algorithmically assess the quality of a project's GitHub stars[0].

Like: what?

Stars are basically the laziest possible way of doing anything with a repo besides looking at it. What possible signal of any value could anyone possibly hope to discern from a repo's star count? And yet not only is there an economy of counterfeit stars (flares? c.f. pieces of flair, which are equally meaningless. Also, the budget ones are ephemeral, just like a road flare), there are people who care so much about stars and flares that there's a whole 'nother economy behind discerning which are which.

Mind. Blown.

[0] Astronomer, mentioned in TFA https://github.com/Ullaakut/astronomer
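As a sketch of the kind of check such a tool can run (not necessarily Astronomer's actual method): the stargazers endpoint returns per-star timestamps when queried with the `application/vnd.github.star+json` Accept header, and purchased batches tend to land in tight bursts rather than spread out organically. A minimal burst detector over those timestamps, with arbitrary illustration thresholds:

```python
from datetime import datetime, timedelta

def looks_botted(starred_at, n=100, window=timedelta(hours=24)):
    """True if any n consecutive stars arrived within `window`.

    `starred_at` is a list of ISO-8601 strings like those returned by
    GET /repos/{owner}/{repo}/stargazers with the
    'application/vnd.github.star+json' Accept header.
    n and window are arbitrary illustration values."""
    times = sorted(datetime.fromisoformat(t.replace("Z", "+00:00"))
                   for t in starred_at)
    return any(times[i + n - 1] - times[i] <= window
               for i in range(len(times) - n + 1))

# Five stars within five minutes trips a (lowered, for demo) threshold of 5:
burst = [f"2023-06-01T12:0{i}:00Z" for i in range(5)]
print(looks_botted(burst, n=5))  # True
```

Of course, per the TFA, sellers now drip stars out over days precisely to evade this kind of check, which is part of why the whole economy exists.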


If I have a choice of two pieces of software I am definitely using the one with the most stars first. I never considered that authors might have paid for stars.

I wonder, when you go to StackOverflow, if you would start by checking answers at the bottom since votes are so meaningless?


> I wonder, when you go to StackOverflow, if you would start by checking answers at the bottom since votes are so meaningless?

I usually read from the top down to the point where the answers reach "have you tried sticking a fork in your toaster?" levels of bad, if I don't already have enough domain knowledge to judge an answer for myself. It's frequently valuable to know the wrong way of doing something so that I don't inadvertently go down that road myself.

It's also lower effort to read a Stack Overflow answer before voting. I have greater confidence that the person voting has actually read the answer than that a person starring a repo has actually looked at even the first bit of code.


When something can be treated as a score to compare with others, it will be. Everything else just follows.


While I suspected something like this was taking place, it's quite sad to actually have solid evidence.

I was actually doing my own kind-of research in hopes of starting an open-source-focused start-up. While I know stars aren't the definitive indicator of a project's quality or popularity, they certainly help build a good image, as they're one of the easiest public stats potential users can track. Also, unless gained in nefarious ways, as described in the post, they roughly align with the project's current stage.

I was exploring how other start-up projects gain their stars, mainly by aligning the visible bumps in their star count to various events related to the project.

Pretty much all official start-up or seed funding announcements I've seen happened in the range of 300~3000 stars, mostly at ~1000. The main drivers behind the star bumps of the projects were e.g. viral HN or Reddit posts, PH launch, or getting to GitHub Trending.

For some projects, however, I couldn't find any events related to the star increases, which, considering this post, suggests a real possibility that they were bought.
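The kind of analysis described above can be sketched with a little code. GitHub's stargazers endpoint (`GET /repos/{owner}/{repo}/stargazers` with the `application/vnd.github.star+json` Accept header) returns a `starred_at` timestamp per star; the bucketing and bump detection below work on any list of such timestamps. The sample thresholds and data are illustrative, not part of anyone's actual methodology.

```python
from collections import Counter
from datetime import datetime

# Bucket star timestamps by ISO week to make bumps visible. Real timestamps
# would come from GitHub's stargazers API with the
# "application/vnd.github.star+json" Accept header (adds "starred_at").

def weekly_star_counts(starred_at):
    """Count stars per (ISO year, ISO week) bucket."""
    counts = Counter()
    for ts in starred_at:
        dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        counts[dt.isocalendar()[:2]] += 1
    return counts

def bump_weeks(counts, factor=5):
    """Weeks whose star count exceeds `factor` times the median week."""
    values = sorted(counts.values())
    median = values[len(values) // 2]
    return [week for week, n in counts.items() if n > factor * max(median, 1)]
```

A flagged week that lines up with an HN post, a PH launch, or a funding announcement is expected; a big bump with no matching event is the suspicious case.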


"Instead, ask your colleagues or your community on Twitter why you should pick this project over another. You can also start a new discussion or create an issue on GitHub asking for other people's experiences."

As it gets easier and cheaper to run LLM-based bots, I wonder how long this approach will keep working. Devs in mid-to-large companies should have colleagues they can ask directly, but smaller startups might be vulnerable to being artificially swayed towards specific options.


I'd rather just do without than having a "community" on twitter.


Assuming the issues aren't also hijacked, it's pretty easy to do your own sanity check of a project. It takes no more than a few minutes of archeology in the code and issue tracker.

Also, Twitter has a fomo/cool bias which is really terrible for software imo. Boring is usually better.


I judge open-source projects on GitHub by the presence of donation/sponsor information. If this is missing, the project will eventually die, regardless of how popular it may be. Sure, forks will pop up, but would you use a Semantic UI fork or the original Semantic UI?

To all of you out there who host your open source projects on sites like GitHub: please provide ways for your community to help out. PRs alone are not enough: if you get 100 PRs in a week, there is no way that you, as a single maintainer, can evaluate and merge all of them. You will get tired and exhausted and start looking at your project as a second job that pays nothing.

Your project will be used by fortune 500 companies and when you find out about it, you will regret not having opened a gateway for donations.

I want to support projects like Semantic UI, but please help me help you. Stop hunting "stars", as this very article clearly shows how worthless they are (GitHub should remove this feature imho).


I'm one of the maintainers of Hurl [1], an open source project on GitHub. I work at Orange, a French telco company, and the project is under the company umbrella. So we don't have any "Sponsor" information; it's rather Orange giving back to the open source community.

I can keep working on it provided there are indicators of adoption from outside Orange. In this case, the only KPIs are star and fork counts. So stars are really important to us.

[1]: https://github.com/Orange-OpenSource/hurl


I maintain one or two projects that are somewhat popular, and I'd love to monetize them (without compromising my FOSS ideals) so I could work on them more.

That said - perhaps I haven't looked hard enough, but it just doesn't seem like projects that accept donations make enough money (on the whole) to be worth the time spent setting up donations.

Sure, I've seen maybe two where a developer ends up being able to work on their project full-time (that's the dream), but most seem to make like $5/year or whatever.

Am I wrong? I would love to be wrong.


You have to get used to mentioning it in every available avenue (readmes, docs, CLI greeting message, any web UIs, release notes, issue templates, etc), and also be happy with a really slow ramp-up time, but yes, it’s possible. Source: I’m Benjie on GitHub if you want to check out my sponsors profile.


I missed your reply a few days ago. Sorry! And thank you for taking the time.

I'm terrible at marketing myself. I suppose that is a hurdle that I must overcome. I'll check out your projects to see what I can learn.

Congrats on your success!


On a related note, the dagster team did a similar search for fake stars: https://dagster.io/blog/fake-stars.


Discussed:

Tracking the Fake GitHub Star Black Market - https://news.ycombinator.com/item?id=35207020 - March 2023 (284 comments)


Here's another way if you are a company. Ask applicants to check out your repo and give it a star. Bonus points if you are a startup and make the CEO personally message the applicant on LinkedIn


Don't give them any ideas, please. Because the next thing will be fake job postings designed to get people to apply so they can be coerced into starring the repo.


The one I wrote above already happened to me


Ugh, that's really bad and deserves for the company to be named. Super unethical.


I don't remember SourceForge having stars, at least not during the time it was used to index some of the most impactful open source projects. So it's definitely possible for OSS to do just fine without this feature. I really loathe these systems that insert themselves between users and the repos, adding little value, and look forward to their demise. The repo will tell you if something is still active, and maybe I'm naive, but I doubt faking commits happens as often as faking stars.


I'm constantly fascinated with the underground world of manipulating the algorithm and user behavior. You can literally buy almost anything if you find the right site/scammer/market.


Something not highlighted here is that you can purchase bots for a project you don't own, which could damage that team's reputation. Something to keep in mind.


I've run into this professionally, companies looking for funding will try to get fake stars to show a rate of adoption that isn't real.


At that point it's just straight up fraud, right?


Yes, that's exactly how it is treated. Actually, attempted fraud. But if it were successful and came out post-deal, the founders would be in pretty hot water.


A family member is trying to create content online and was really discouraged by the number of followers and likes on other people's profiles. I had to open a few profiles and pick followers at random to show that they were bots. Those followers weren't even remotely related to the creator's field. Also, those accounts had 500k followers, but their live streams usually couldn't even gather 10 viewers.

Then we looked together for ways to buy that in minutes. To be honest, I knew this existed but didn't know it was so easy and widespread. It was a learning opportunity for both of us and it gave them more encouragement to continue working.


Alright so nearly all the 'cheap' stars got detected, but did any of the 'premium' stars get detected?


I tried to look up some of those accounts and they don't seem to exist anymore


This shows that anyone who markets FOSS projects with statements like "It has 12k stars on GitHub!" is either clueless about this kind of manipulation, or complicit in it. I have previously expressed skepticism about how useful stars are in the first place, but this is pretty clear evidence that they are basically minor ads that can be bought and sold on the marketplace.

As such, I will be examining open source projects that emphasize their star count in marketing a lot more closely from now on.


Looking up how to buy GitHub stars, it looks like you can also buy reactions (thumbs up, thumbs down, etc). Kind of obvious, considering everyone already knows you can buy stars, but I never considered this would be a thing.

This is actually more worrying, as those are frequently used for voting and can really influence the direction on a project for contentious issues: "Look at all the +1/-1!"


Call me contrarian, but I rarely look at how many stars a project has, and that's probably because I've started to be more skeptical of large projects by default. If a project is smaller, has a good README, isn't quick to ask me for money, and I can read/understand the code, I'm more likely to give it a star.


> As you can see from the receipt, my order is number #57189, so it's definitely not something that's only used every once in a while

It's funny that the author thinks he discovered a "signal" from a company whose business is selling signals.

> Well, the biggest one is that those are brand new accounts — they were created at the time of my order. They don't have any fake personal information or repositories or contributions.

> And after a month, they are all gone. GitHub detected and banned them.

Obviously, the cheap option is there only to prove to clients that the expensive option is worth it. Here the author believes he found a signal: stars disappear after a month. I would bet that the people collecting the money are making them disappear to make a point.


The order number is the WooCommerce order ID, which is a WordPress post ID, and it is not sequential. The extensibility of WordPress comes in part from its "custom post types", so any WordPress plugin that uses a custom post type contributes to incrementing this number: a logging plugin that stores entries as a custom post type, a mail fetcher, or, an obvious example, the products listed on the store itself. If those cumulatively came to 57,188, then this could be the first order on the site.

The order number is a poor indicator of the number of orders.
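To make the point concrete, here's a toy simulation of the mechanism (a deliberately simplified model of the `wp_posts` table, not actual WordPress code): every post type draws IDs from one shared auto-increment counter, so an order's ID tells you almost nothing about how many orders exist.

```python
# Toy model of WordPress's wp_posts table: products, pages, plugin data,
# and orders all draw IDs from a single auto-increment counter.

class WpPosts:
    def __init__(self):
        self.next_id = 1
        self.rows = []  # (id, post_type)

    def insert(self, post_type):
        row = (self.next_id, post_type)
        self.rows.append(row)
        self.next_id += 1
        return row[0]

db = WpPosts()
for _ in range(57188):              # non-order posts of every kind
    db.insert("product")
order_id = db.insert("shop_order")  # the very first order on the site

# order_id is 57189, yet exactly one order exists in the table.
order_count = sum(1 for _, t in db.rows if t == "shop_order")
```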


I was insinuating that maybe it was faked altogether to make it appear like they have more business than they do.


I absolutely couldn't care any less about how many stars my repos have. The fewer, the better.

I basically write my code for my own consumption. Some of my projects are kinda weird[0], because they were designed to fill a specific need at the time.

[0] https://github.com/RiftValleySoftware/RVS_IPAddress


I've always been a fan of looking at number of issues and closed issues vs. open issues (judging differently if they auto-close).


How interesting. I think my biggest surprise from this is that Github actually looks for illegitimate stars. I wonder how that affects the repos with a huge amount of stars, like if they get culled often dropping the repos to a huge degree. I'd imagine a lot of these bots follow big projects to look more authentic.


Any type of functionality like this will for the foreseeable future be "purchasable". Not sure how many here frequent BlackHatWorld, but it's a fascinating website for an SEO such as myself. Same goes for Google reviews, yelp reviews, CTR manipulation, etc.


Can't GitHub buy some high-quality stars every now and then to identify accounts that look legit (or perhaps once were legit) and ban them for taking part?


This topic always makes me think of this song, it should be played on GitHub: https://www.youtube.com/watch?v=hT_nvWreIhg


Starring a repo and using it should be treated differently. There are thousands of projects with thousands of stars that have probably never been used in production.


Unrelated to the article, but it looks like the Twitter embed at the bottom of the article breaks the back button by adding another entry to the browser history.


I feel like I read an article just like this last year...


Got it. So buy stars for my competitors.


So pointless, is anything authentic?

Your all probably bots too


Nice use of ‘your’ to try and trick us bot. Oh I’ll make a stupid typo to trick the humans


Beep. Boop. Bop. Hello fellow commenter.


Good AI means you would get more out of interacting with the bots.

Enjoy humans while they still matter.


Can confirm. I am a bot.


As a language model trained by OpenAI, I can neither confirm nor deny that I am, in fact, a bot.


I can see how "As a language model" will in the future be the equivalent of ACK and HELO


Meanwhile, the sudden influx of upvotes on my not particularly funny third-level reply, to a second-half-of-page subthread, on a mundane-looking HN submission, suggests some people must be using some kind of automation (e.g. F5Bot) to get notified about certain keywords. I'm guessing "OpenAI" did the trick here.

Hashtag Elon Meta Mask anyone?


EHLO ycombinator.com


I once accused a customer service chat ..'backend' (let's call it) of being a bot, in annoyance - which was met with a slightly pissed-off sounding 'I am not a bot I am human' or similar. ..which wasn't entirely convincing?

(Human or not I think I was already too annoyed with the whole thing/had annoyed them too much with the accusation to get whatever help it was I was looking for.)


Good bot


*you're (I'm a grammar bot)


Ok, what are GitHub stars? I thought they rented space for git repos? (I'm paying them for that, at least).

I've never used github to find repos and don't even know how, for public projects I've heard of them some other way and went to their github page if they were hosted there...

Do they have some social networking features that I've managed to ignore so far?


Stars are akin to a like on youtube, which seems to be used in the social game of github. If there is a metric, there will be people worshiping it, and the same is true for those stars. Though, I'm not sure how big the impact of them really is. For example, there is https://github.com/trending where you will see how many new stars the popular projects gained today.


I’ve been using stars as bookmarks instead of likes. A lot of times I’ll encounter a problem and remember that there was some software that solves it and go through my starred repos to find it. Guess I’m in the minority


You are not alone in this. It's probably even the original purpose of the feature, considering that some browsers use a star icon for their bookmarks. And GitHub even supports this by having lists for managing starred repos.


If I'm deciding between open-source libraries, choosing the library with 12 stars is far riskier than choosing the library with 2,500 stars.

I know it's not a perfect measure but I don't have infinite time to spend researching every choice I make, and this is (for now) a pretty reasonable heuristic that a library is maintained and has eyeballs on it.


It's just the "click to favorite this repo" mechanism. It lets people track repos more easily ("see updates to starred repos") and it's become a social gaming mechanism ("the most-starred deep learning cryptocurrency blockchain mumbo jumbo on GitHub!")


You need stars to ensure your open core marketing strategy is effective.


Bingo. I realized it's all a game when I worked developing Facebook integrations at various digital agencies in 2010-2014, and they all bought tens of thousands of Facebook likes to kickstart the "organic" likes whenever they developed some app and campaign for a big brand. In the end the likes added up, everyone was happy, got their Christmas bonus, etc.

I am not sure the advertising industry isn't just a zero-sum game, like the casinos and the exchanges for stocks and crypto. For someone to win, someone else has to lose. And the house always takes a cut of everything lol.


It is a metric similar to a "like" on YouTube or Instagram or something.

GitHub does more than just "rent space for git repos". It does do that, but there are two sides to GitHub: the legit CI/CD Git platform side, and the social-network-of-code side. Stars are a mechanism that contributes to the social network side.

There are basically three metrics on GitHub:

1. Stars - Akin to "likes" on other platforms, or favorites, or hearts. They represent a thumbs-up, or a soft bookmark to find a project later, and indicate appreciation of or interest in a repo/project. These are the most common form of karma or metric on GitHub.

2. Watch - This is like following a specific repo. You can get updates when a new release is made, pull requests are made, issues are submitted, etc.. It expresses deeper interest in a project/repo because you want to see updates as they happen.

3. Follows - These happen at the account level. So you can follow an entire organization or developer and see everything they do. This is like subscribing to someone on YouTube or following them on Instagram. You want to see more projects and updates for them in your feed.

All these metrics combine into an internal "SEO" on GitHub search (and Google search). So stars actually do tangibly contribute to the likelihood of a project being discovered by strangers. Which is what gives them value.

The author also points out that having a lot of stars is often perceived as a sign of a trustworthy project. This is an intangible benefit the author wanted to test. As the author found, it's a bad metric in practice, but that doesn't stop other people from perceiving it that way. Similarly, you'll assume a Twitter account with 100,000 followers is more noteworthy/trustworthy/prominent than one with 1,000 followers. That isn't necessarily the case, and the same goes for GitHub: stars are often perceived as a quality signal, although they shouldn't be.
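The discoverability point above has a concrete hook: GitHub's repository search API can rank results by star count. A minimal sketch of building such a request (URL construction only; actually calling it needs network access, and a token for reasonable rate limits):

```python
from urllib.parse import urlencode

# GitHub's repository search endpoint accepts sort=stars, which is one
# concrete way star counts feed into what strangers discover first.

def search_url(query, sort="stars", order="desc", per_page=5):
    params = urlencode({"q": query, "sort": sort, "order": order,
                        "per_page": per_page})
    return f"https://api.github.com/search/repositories?{params}"
```

So two otherwise-similar libraries matching the same query can end up in very different positions purely on star count, which is what gives bought stars their value.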


Stars also can act as bookmarks to projects. There are tools out there (https://github.com/simonecorsi/mawesome is the one I use for my private stars list) which can take your stars and turn them into a markdown file.
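The stars-to-markdown conversion those tools do is simple enough to sketch. The dict shape below mirrors GitHub's `GET /users/{user}/starred` response (`full_name`, `html_url`, `description`); the sample entry's description is made up for illustration.

```python
# Turn a starred-repos list into a markdown bookmarks file, in the spirit
# of tools like mawesome. Input dicts mirror GitHub's /users/{user}/starred
# response fields; only three of them are used here.

def stars_to_markdown(repos):
    lines = ["# Starred repositories", ""]
    for r in repos:
        desc = r.get("description") or "(no description)"
        lines.append(f"- [{r['full_name']}]({r['html_url']}) - {desc}")
    return "\n".join(lines)

sample = [{"full_name": "Ullaakut/astronomer",
           "html_url": "https://github.com/Ullaakut/astronomer",
           "description": "hypothetical sample description"}]
markdown = stars_to_markdown(sample)
```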


> Similar to a twitter account with a 100,000 followers, you will assume is more noteworthy/trustworthy/prominent than someone with 1,000 followers.

I'd assume they're better at catering to the lowest common denominator tbh.

As for libraries/project trying to solve a technical problem, I'd look for 3rd parties saying they're using them and how it's going for them. Definitely wouldn't trust "stars".


> It is a metric similar to a "like" ...

They used to be described as a "bookmark" for a repo, same as browsers do for web pages.

Thinking that over a bit more, it's still probably the more accurate mental picture of what they're for vs a "like". But, that might just be me. :)



