I found black-hat content marketers: sockpuppet bloggers, fake Reddit/HN accts (twitter.com/troyd)
393 points by troydavis on Oct 14, 2020 | 241 comments



The Twitter thread has a lot more. There's a person or firm who is doing black-hat content marketing at scale. They either approach freelance bloggers to write articles about their clients (to be submitted to Medium-hosted and similar blogs as if written by unaffiliated users), or create Medium profiles for fake names and write the posts themselves. Posts have appeared in Better Programming, DEV.to, FAUN, freeCodeCamp, HackerNoon, ITNEXT, JS in Plain English, Level Up, Towards Data Science, and other aggregators.

From there, they use dozens of fake Reddit and HN accounts to submit, comment on, and artificially boost their own posts. They also submit some unrelated cover posts. You can see some of these on HN here: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

They have sockpuppet Reddit accounts doing the same thing across dozens of subreddits.

Here's the set of clients I know of (https://twitter.com/troyd/status/1316020415995674624):

AccessiBe

ClimaCell

Imperva

Loadmill

Rookout Labs

WhiteSource (AKA SecureCoding dot com)

Testcraft

I have no idea what they thought they were buying, but what they were receiving was black-hat content marketing at large scale. There may be other clients I don't know of.

Here are some Medium profiles of authors, where you'll see those companies get plugged over and over. I expect some of these will be removed by the authors soon:

https://medium.com/@AntonLawrence

https://medium.com/@Justin_Parsons

https://medium.com/@geeostan8

https://medium.com/@Dickson_Mwendia

https://medium.com/@arinoman

https://medium.com/@oyetoketoby80

https://medium.com/@mrsaeeddev

https://medium.com/@ginomessmer

https://medium.com/@diptokmk47

https://medium.com/@sajjadheydari74

https://medium.com/@ngwaifoong92

https://medium.com/@SeanHig

https://medium.com/@henson.casper

https://hackernoon.com/u/ari-noman

https://hackernoon.com/u/diptokmk47

There are probably others I'm missing. If you run a blog and one of these companies is mentioned in a post that doesn't disclose an affiliation, it may well be fake. I contacted many of the publishers above. To quote Quincy of freeCodeCamp in https://twitter.com/ossia/status/1316216151802667008 after he researched this: "Stay vigilant, friends."


Aside from the content itself, I’ve noticed the commenting and voting patterns on Reddit for articles about AccessiBe are suspicious. I always assumed they had “help”. It’s a shame social media doesn't have a “this looks suspicious but I have no direct evidence” report button. I expect it’s very easy in a lot of cases to determine abuse when you have access to internal data.


They don't have a report button like that because competitors would abuse it to flag stuff.


People who would do that can abuse the existing report buttons. I’m talking about the case where a normal person would think “well, it’s not obvious spam so I can’t report it as spam, but I think that if somebody looked at the data, they would see abuse”.


AccessiBe itself is rather shady [1], so it isn't really surprising to see them on the list.

[1] https://adrianroselli.com/2020/06/accessibe-will-get-you-sue...


PS: Maybe one of these companies would like to comment. What services did you intend to purchase, and who are you buying them from?


> approach freelance bloggers to write articles about their clients

This particular industry isn't particularly secretive. I know of mature agencies in the UK and US doing this with real offices, full time staff, nice websites, senior staff with LinkedIn profiles, etc. And I'm sure there are others outside the anglophone world.


Interestingly, when you look at e.g. thereyougo's submission history https://news.ycombinator.com/submitted?id=thereyougo there are [dead] but not [flagged] or [dupe] posts from towardsdatascience, imperva, and testcraft. Maybe the sustained campaign got those companies' domains banned from HN? Or there was something specific to those submissions that triggered the spam filter.


Yes, what you described is what happened. That's also why the submissions appear to have stopped about 25 days ago, even though the submitter was trying. The recent history will contain more "cover" (legitimate) articles than sockpuppet posts because the recent sockpuppet ones never saw the light of day.


I think the Reddit and guest blog post part of this are pretty well-known and even heavily advertised, even directly on Google Ads.

Try searching for 'buy reddit upvotes' and you'll get ads for services that do exactly what you're talking about.


Yeah, I get emails every once in a while for a (non-tech) website I run, asking if they can do a paid guest post. It's a common advertising scheme, I'm sure.


It would be a good idea to present stronger evidence before saying HN is astroturfable. It has one of the strongest anti astroturf systems I’ve seen, primarily because of how HN works (which, sadly, is hard to discuss openly).

I believe you about Reddit, but it’s going to be quite hard to buy your way into HN, no matter how cleverly you do it.

I’m not saying it’s impossible. But it’s so easy to believe, and so hard to do, that it warrants skepticism.


HN, Reddit, Lobsters, and other programmer link aggregators may be astroturfed. But after a while you learn to recognize the smell of astroturfing, and even otherwise, it gets refuted by commenters. It's more likely that these forums have fanboys who come with irrational arguments than astroturfers.

The greater harm of this content-marketing bullshit is that it pollutes Google search results. If you search Google for mainstream-enough technical terms, all you get is shallow, poorly written posts on bullshit sites.

I'd like someone to make a search engine for authentic programming-related websites and blogs, even if hand-curated, instead of surfing through 5 pages of ZDNet, geeksforgeeks, DZone, thenewstack, quora, and other highly SEO'd sites.


I've been pleasantly surprised by content on DZone at least, though less so by the others.


It's a hit-and-miss site, tbh.

I was once told that one of my rambly blog posts had appeared on there without my knowledge; it wasn't my best, but still. The post was on our 'company' blog; it turns out the marketing department of another segment of the company just took it and reposted it on DZone.


I think that anti-astroturf systems are definitely one of the things that deserve to remain proprietary to the people who run the servers.

Much the same as the best online payment processing anti-fraud services are an opaque black box that you feed some data into, and you get a result back. They don't tell you what's going on inside the black box.

I would not be surprised at all if the top vendors for online payment processing fraud detection also offer services for anti-sockpuppet/anti-inauthentic user detection. Some of the methods going on in the back end to analyze the validity of a transaction will also apply.

Considering the modern weaponization of social media to manipulate stocks, elections, protests and such, I would consider that sort of SaaS to be a growth market.


There is a problem, though. In a physical setting you cannot shadowban someone. In the past, 100% of public discussion happened in public spaces; let's say that now 50% of it happens online, and that this grows considerably, to something like 75%. The issue is that public discussion becomes increasingly easy to censor.


Censoring isn't bad in and of itself, so it would help if you illustrated this with some undesirable examples of things happening today.


HN hands out shadowbans like candy for people suspected of vote manipulation. I completely agree with this approach because it keeps the site clean.

An interesting question to ask is whether HN owes it to its users to be more transparent about responses like shadowbanning and to provide ways to appeal them. Most of us would say no: the current approach is working for us and we should keep it going. But then I wonder why we're OK with HN behaving like this but not large social media companies.


I wonder how shadowbanning can work at all to begin with. It only takes 10 seconds to open a public thread in an incognito window and confirm whether voting, commenting, etc. actually happened as the commenter intended, or only in a private echo chamber.


One way I fixed this in a small gamedev forum I help maintain was by letting users view shadowbanned comments created from the same IP. There's still the chance of the user using Tor/VPNs, but it's rarer.

Shadowbanning is by no means perfect, but it's still a good deterrent in my experience.
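The visibility rule described above can be sketched in a few lines. This is a hypothetical illustration, not the forum's actual code; the `Comment` fields and function names are assumptions:

```python
from dataclasses import dataclass

@dataclass
class Comment:
    author_ip: str
    body: str
    shadowbanned: bool = False

def visible_comments(comments, viewer_ip):
    """Hide shadowbanned comments from everyone except viewers
    on the same IP as the banned author, so the author (and their
    incognito window on the same connection) still sees them."""
    return [
        c for c in comments
        if not c.shadowbanned or c.author_ip == viewer_ip
    ]

thread = [
    Comment("203.0.113.7", "hello"),
    Comment("198.51.100.9", "buy my product", shadowbanned=True),
]

# The spammer, checking from their own IP, sees both comments...
assert len(visible_comments(thread, "198.51.100.9")) == 2
# ...while everyone else sees only the legitimate one.
assert len(visible_comments(thread, "203.0.113.7")) == 1
```

The design choice is exactly the trade-off discussed in the replies: it defeats the naive incognito-window check, but not a check made from a second IP (tethering, VPN, Tor).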


IP-based recognition is annoying for people living in third-world countries[1], though: because there are few IPv4 addresses to go around, many of them sit behind the same NAT.

Also, with tethering it's really easy to circumvent, without even needing a VPN.

[1]: IIRC, the whole of Laos only has a /32 subnet… yes, you read that right: a single IPv4 address for an entire country. And many countries only have a few /16s.


I was intrigued enough to look it up. According to [1], Laos has 54,784 addresses. The smallest is Saint Lucia, with a /24. North Korea and Dominica have a /22.

(Apologies if I'm getting those numbers wrong. I don't do much with subnetting.)

[1] https://en.wikipedia.org/wiki/List_of_countries_by_IPv4_addr...
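For reference, the number of addresses in a CIDR block is just 2^(32 − prefix length), which is where figures like these come from (note that a country's total, like Laos's 54,784, is usually the sum of several separate blocks, not one prefix):

```python
def ipv4_block_size(prefix_len: int) -> int:
    """Number of IPv4 addresses in a /prefix_len CIDR block."""
    return 2 ** (32 - prefix_len)

assert ipv4_block_size(32) == 1       # a single address
assert ipv4_block_size(24) == 256     # e.g. Saint Lucia's smallest-country allocation
assert ipv4_block_size(22) == 1024    # e.g. North Korea, Dominica
assert ipv4_block_size(16) == 65536   # a "/16" as mentioned above
```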


Thanks for the fact-checking! It looks like I didn't recall correctly (or maybe it used to be true and changed at some point, who knows?).


Please read my message again. I'm not restricting anything and there's nothing to circumvent; it's about letting logged-out/anonymous users view more stuff, to deter detection of shadowbanning.


Maybe “circumvent” isn't the right word (not a native English speaker); what I meant is that it's easy to bypass your countermeasure: post from my computer, then check from my phone whether my comment is visible. If it isn't, I'm shadowbanned.

And regarding third-world countries: your idea doesn't prevent them from accessing the website, but they will access a site where the shadowbanning feature is pretty much disabled, which could lead to a proliferation of trolls or spam targeted at that specific country.


If this ever happens, I can just change my approach. It has helped me a lot so far. I'd rather have an approach that currently works in >99% of the cases I need than chase some hypothetical 100% solution that's virtually impossible to achieve.


This doesn't make any sense. Most IPv4 addresses change every day, if not multiple times a day. I guess you can only filter the dumbest of the dumb this way. And if someone has the wit to open an incognito window, it doesn't take a genius to notice that they can only see the same day's comments.


There's a lot more involved in my case than that, but suffice to say it worked well in the forum I maintain, so your intuition disagrees with my practical experience. And yes, my trolls/spammers are dumb.


There are two different classes of user. One class checks those things and knows that their post didn't go through. (That doesn't apply to voting, though—that's important.) A second class of user doesn't check and doesn't know. That second class tends to be more naive spammers or promoters, and shadowbanning works well in those cases.

It's useful with the more sophisticated class too, though. If they have to start fresh with new accounts, it slows them down and makes what they're doing more obvious to the community.


I guess it works well against non-malicious jerks. They come, they don't get the social reward they're expecting, and they leave. Maybe later they'll retry, with better behavior.

And maybe the malicious/non-malicious ratio is low enough to make this method efficient.


How I initially joined HN: I made an account to point out an astroturfed post that I'd seen someone elsewhere promoting for upvote spam.

I was just a passive consumer before.


I don't know about astroturfing on HN, but I've seen enough patrolling here to be skeptical about any “HN has strong defenses” claims.


What does 'patrolling' mean in this context?


I thought it was a common term, but maybe it's just Reddit slang: I use it to mean one community “attacking” or “boosting” a submission on a forum based on that community's values rather than the worth of the submission itself. It can have different levels of synchronization:

- no synchronization at all: conservatives (especially American ones) flagging socialist-sounding posts (often upvoted by Europeans when Americans are asleep), Gophers & C++ guys flagging Rust posts, etc.

- loosely synchronized: some content is getting popular on /r/rust, or /r/python, some people there will connect to HN to upvote it here.

- strongly synchronized: some influential Twitter handle posts a message about how “some shit went to the front page”, and zealot followers come and flag the submission. It also works with specific subreddits (/r/programmingcirclejerk, for instance, even though that one is more aimed at comments than submissions).

It happens a lot, often enough to be noticeable. Sometimes it sort of regulates itself (like in the left-right battle between Europe & US, or between the Rust Evangelist Strike Force & Rust haters), but not always.


Thank you, that's interesting! I think some of those things are going on. Other points I'm a little skeptical about—for example I don't believe that the left/right divide correlates as strongly with the US vs. Europe as you suggest. Many of the strongest leftist posts we see come from the U.S. and many of the strongest rightist posts come from Europe (to judge by IP geolocation).

Your 'no synchronization' case is tribalism. That's certainly happening here, as probably in every large-enough group. Yes, it's a significant problem. But it's not the astroturfing/manipulation problem being discussed in this thread. If your skepticism about "strong defenses" was meant to include this case, that's too general.

Your 'synchronized' cases would constitute abuse in HN's terms, and if you or anyone notice it happening in the future, we'd greatly appreciate being told about it at hn@ycombinator.com. Actually, if you can even point to cases where it happened in the past (e.g. "some content [was] getting popular on /r/rust, or /r/python, some people there [connected] to HN to upvote it here"), it would be interesting to look back and see whether we detected it and/or could do something differently.

The one thing I'd caution is that it's extremely easy to convince oneself that these things are happening when they're not. Nearly everyone with strong views about this phenomenon is massively deceiving themselves about it—if you're only guided by what feels like it must be happening, there's far too much opportunity to just project things into the situation. People do this all the time, and it's a big problem—as I've said elsewhere in this thread, it's actually a bigger problem than the abuse and manipulation being complained about. The solution is to guard against that by always looking for some extraneous indication (i.e. evidence)—for example a thread on Reddit saying "let's upvote this on HN"—and to be agnostic in the cases where one doesn't have that.


> that the left/right divide correlates as strongly with the US vs. Europe as you suggest.

Right, talking about “right and left” was a mistake because the meaning of those words is pretty fuzzy and highly context-dependent. I'll give a more precise description, then:

Comments containing criticism of mainstream economics, references to Keynes, arguments that “all capitalism is crony capitalism” or “capitalism didn't defeat communism, the welfare state did”, favoring strong state intervention, etc. are going to be much more upvoted when Americans are asleep. And conversely for comments referencing Milton Friedman, praising the power of the market, holding up economic growth as the main goal of social welfare, etc.

I've seen more than one of my comments on the aforementioned themes get upvoted multiple times, then grayed out several hours later, only to end up with a positive score the next day. I didn't notice the temporal correlation until someone brought it up in a thread, and many people shared the same experience.

If you want to have a look, the recent thread on the Nobel prize in Economics smells like a good candidate for investigation (though I didn't participate in that thread, so I have no evidence there).


I think HN is one of those rare sites where I really wish they had ads, or some donation button so I could pay. Dang is doing an exceptional job of holding things together. And despite a slight downward trend in discussion quality, HN is still by far the best forum for tech and other "nerd" interests. The community as a whole polices itself. The vibrant and large number of active users, compared to a subreddit, Dev.to, and other sites, makes astroturfing HN really, really hard.

Anyone who has submitted anything on HN would know: getting on the front page isn't an easy task at all, and staying on the front page is even harder.

And Cunningham's law doesn't always work on HN. Sometimes the community just decides to ignore it. Lol


It would be a good idea to present stronger evidence before saying HN is _not_ astroturfable. Yada yada. Just pretend I flipped the rest of what you said around too.


> which, sadly, is hard to discuss openly

Why is this?


Because discussing the moderation system is strongly discouraged and leads to downvotes.


No, that's not it; it's because anti-abuse is a cat-and-mouse game, so while you can discuss it (in an appropriate setting like a junky meta thread you didn't start), you can't expect HN to be forthcoming with details, because those details lower costs for abusers, and maximizing costs is the whole ballgame.

Kibitzing HN moderation itself is one of our oldest pastimes.


It's funny, because the most hardcore people in open source and security would argue that good techniques don't rely on obfuscation and secrets, because those cats can get out of the bag. I never purrsonally subscribed to that, as I agree with the cat-and-mouse perspective. Information asymmetry is effective.


People in security who say that categorically are betraying ignorance, because there are several "hardcore" settings in software security where the same dynamic --- attacker/defender cost competition occurring by degrees --- plays out. Anti-ATO, content protection, botnets, anti-DDoS, hardware platform security, just to rattle a few off the top of my head.

The correct security objection is to obfuscation being deployed in settings where there are decisively effective controls that could be deployed instead: where it doesn't make sense to raise attacker costs by degrees, because those costs can be raised to intractable levels instead. I'd cite an example, but it would spawn a 500 comment thread about how Linux sysadmins manage their networks.


I've never thought HN was impartial. The fact that discussing astroturfing is against the rules was highly suspect.

I consider this a highly censored website with particular objectives, but a decent userbase.


The rule is against insinuating astroturfing without evidence. The alternative is threads full of pointless insinuation about astroturfing—the favorite junk pastime of internet forums. HN doesn't have "particular objectives", it has one objective: to gratify intellectual curiosity, and that guideline is obviously integral to this.

That doesn't make actual astroturfing ok. We spend many hours combating it and banning accounts and sites that do it, including the ones that Troy's reporting on here. There's just a huge difference between it-really-happening and pointless-toxic-speculation. The difference is evidence, and that's what we require.

https://news.ycombinator.com/newsguidelines.html


Just curious, but what constitutes evidence in this context?

And how do you deal with the other side of "unfair" behaviour, e.g. excessive flagging or downvoting of legitimate posts or comments? As far as I'm aware, no evidence is required to downvote or flag.


By evidence I mean something in some data somewhere that's more than just the opinion being posted, which we can look at and evaluate objectively. I know that's a bit of a lame answer, but I can't give you specific examples without giving the same examples to others who would want to circumvent leaving evidence in that way.

The main thing to understand is that we need something to look at other than just an opinion that one commenter was expressing which another commenter didn't like. That's evidence only of difference-of-opinion, not abuse.

Such data isn't always secret and isn't always just on HN. For example, if someone is asking for HN upvotes on Twitter, we sometimes get links from eagle-eyed HNers. Similarly when someone is sending out spam emails trying to organize a voting ring. And sometimes spammers copy comments from other forums and paste them into HN. Those are pretty basic examples but I hope you can see that in each case there is some objective data that supports a judgment of abuse.
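To illustrate that last case: copied-and-pasted comments can be caught with even a simple similarity check. This is a hypothetical sketch, not HN's actual tooling; real systems at scale would use something like shingling or MinHash instead of a linear scan:

```python
from difflib import SequenceMatcher

def looks_copied(comment: str, seen_elsewhere: list[str],
                 threshold: float = 0.9) -> bool:
    """Flag a comment whose text closely matches one already seen
    on another forum, after normalizing case and whitespace."""
    norm = " ".join(comment.lower().split())
    for other in seen_elsewhere:
        other_norm = " ".join(other.lower().split())
        if SequenceMatcher(None, norm, other_norm).ratio() >= threshold:
            return True
    return False

corpus = ["This tool totally changed how our team does load testing!"]

# A near-verbatim repost is flagged despite extra whitespace...
assert looks_copied("This tool totally  changed how our team does load testing!", corpus)
# ...while an unrelated comment is not.
assert not looks_copied("Interesting writeup, thanks for sharing.", corpus)
```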

Conversely, suppose you like $BigCo and someone else hates $BigCo, sees your comment praising them, and replies "how much are they paying you, shill?" That's the kind of thing we don't allow, because there's literally nothing supporting that judgment. The same type of commenter will see various comments arguing for $BigCo in HN threads and then post to other threads with high confidence that "HN is overrun with astroturfing". What they mean is that it's overrun with comments they don't like—and even then, "overrun" is an exaggeration.


Thank you so much, dang. I'm completely with you, 110%, that throwing accusations around this way is bad, and that it's most likely an "I couldn't disagree more" reaction leading people to wrong conclusions about shilling and such.

What about the other side of it, though? Your reply didn't really address it.

What I feel is happening now is that in those situations (and others), people downvote and flag things that they don't agree with. They're not shouting "shills / astroturfing" yet the collective power makes it easy to silence opposing opinions, especially if those opinions are in a minority.

Completely anecdotal, but I reported to you two cases of flagged stories that in my opinion had value for the community (and for the discussions around them). Those stories were effectively silenced. I think it's a shame. There's no evidence required to flag or downvote, and no requirement to even give an argument or reasoning for doing it.

Are there any plans to tackle this kind of behaviour in a way similar to how empty, non-evidence-based claims of astroturfing and shilling are dealt with?


Here's some logs + a writeup of when they spammed Lobsters on behalf of LoadMill: https://lobste.rs/s/utbyws/mitigating_content_marketing

My first name @push.cx if you want to share notes on these or other abusive users.


'Fairness' has no relevance when it comes to a site's participants deciding among themselves what is and is not worth discussing. Perhaps those posts and comments that were flagged or downvoted were considered less legitimate by the rest of the userbase than you suspected. <insert xkcd 1357 here>


IDK. Guidelines: https://news.ycombinator.com/newsguidelines.html

Be kind.

Please don't sneer

Please don't post shallow dismissals

But downvotes are sometimes used unkindly, dismissively, and as a way to suppress a differing view (which may or may not be justified). You nuke me and say why, I'm happy: we can talk! I can learn something new! Downvoting factual posts silently is... frustrating. And ill-mannered.


In aggregate downvoted posts are practically always low-effort (or sometimes just really unhinged or otherwise patently wrong), so I'm not convinced that downvotes being used unkindly is a big problem.


Downvoting is also restricted to accounts who exceed a point threshold. I think that particular feature (ensuring people who can downvote at least have some level of trust within the site) has been critical to prevent hive mind-style downvoting. I rarely see downvoted comments where I don't understand why they were downvoted.


perhaps ironic, but can you help me understand why the GP comment was downvoted on this thread? I’m genuinely wondering. (the one from throwaway_pdp09)

I’m definitely happy that there’s minimum Karma for downvotes, but how does it prevent hive-mind downvoting?


At a guess, it's a post that implies a sort of soft conspiracy with very little evidence. It just doesn't contribute a whole lot of value to the subject at hand, except for attempting to foment a vague sense of wrongness.


I'm the poster of that. Regarding evidence: I copied bits from the HN guidelines, said that downvotes are OK if I know why, cos then they bring benefit, and got silently downvoted anyway. Is that not evidence enough? BTW, I can't downvote myself. It was an honestly made critique, and it suffered from exactly what I protested against.

If I was wrong, your response does not elucidate why, in fact let me quote bits back to you "soft conspiracy" ... "very little evidence"[0] ... "a vague sense of wrongness"

Well, maybe, but your post has less substance than mine.

[0] you didn't ask for any BTW


That is not a good indicator. Your mind can always rationalize something.

Imagine if a large corporation bought ~50 old accounts and spoofed different computers/browsers. Only 5 votes are needed to hide a post.

Controlling the narrative is almost trivial if you can spend mere thousands of dollars.


But as I've pointed out before, how do I as a user of this site, get access to the very evidence I need?

There are numerous suspicious posts - which may just be my biases, or not - such as this thread with a guy posting a lot of facts https://news.ycombinator.com/item?id=24746397

I applaud this, because we need facts, but one guy there has an astonishing level of facts ready to go and a rather slick way of putting things, which I recognise: I used to work in publicity (though not of the spinning kind). I recognise the style. I want the guy here and posting, because we need facts, not shouting, but if he has a financial interest, we need to know. It shouldn't stop him from being here, because in some respects his pro-nuclear posts are pretty good, but it needs to be in the open.

Other problems - there's a certain style of posting that proposes stuff with zero facts and magically gets voted to the top of the thread. No facts, slight whiff of FUD, pushed to the top. That's not actually how the HN crowd tends to react to info-free posts (or maybe there's a subset who does; I may be mistaken). But how do I analyse the voting patterns when I don't have the voting data?

I'll not mention what happens when china becomes the subject.

Is it me? I don't know. But then I can't tell without evidence. There seem to be other problems. Is it me? I dunno. I'm posting less here because I feel good stuff is getting swamped (not just my stuff, a lot of other people's stuff. My posts aren't generally a pinnacle).

Edit: so how do I get the evidence you require?


New submissions appear here: https://news.ycombinator.com/newest. Eventually, consider enabling "showdead" on the settings/profile page.


It has already been pointed out that discussing anti-astroturfing measures in detail is not done, for reasons that have been explained. It doesn't seem reasonable to keep demanding explanations given what has been said before.


"The difference is evidence, and that's what we require."

So how do I supply the required?


You can always contact the moderators in private if you're unsure, or do lots of research, like Troy did here.


> It would be a good idea to present stronger evidence before saying HN is astroturfable.

What? You do realize anyone can create HN accounts to post any link and comment on any discussion, right?

Even if you argue that there are magical ex post facto measures to tackle obvious and rampant abuse, you do understand that the system is indeed vulnerable to astroturfers right in its very design, don't you?


Anyone can create HN accounts to post any link and comment on any discussion, but certain accounts are dead on arrival, and votes from certain accounts don't count; so you can pollute /new or discussions (especially for users with showdead on), but that doesn't mean your content automatically gets prime placement just by registering more accounts.

Meanwhile, clickbaiting is much more effective than creating accounts.


> but that doesn’t mean your content automatically get prime spot placement

That's not how it works. Prime placement might be the desirable goal, but the role of a content marketer is still to a) astroturf discussions, and b) generate content that's SEO-friendly even when it doesn't blow up.

Customers already get their money's worth if your minions spark casual, low-key discussions about their product/service/PR talking point in random places, raising awareness and steering topics in ways that serve your clients' interests.


Thank you for reporting this. https://hackernoon.com/u/diptokmk47 was already under active review via our backlink monitoring tool, and after this report we've banned https://hackernoon.com/u/diptokmk47 and https://hackernoon.com/u/ari-noman. If you see any other black-hat behavior or things to improve on https://hackernoon.com/, the fastest way to reach the relevant person is https://hackernoon.com/contact


Amusingly enough, if you sort Imperva posts by popularity, you'll get stories about their own breach.


>You can see some of these on HN here:

Looking at the comment and upvote numbers, it's not like their HN efforts were at all effective.


Unsure if this is part of the assertion or not, but some of these appear to be actual people with LinkedIn profiles and careers. It was unclear to me if they were spawning new identities to author the content.


I'm pretty sure most are actual people (mostly engineers in developed countries), but I came to suspect that at least 2 are not. I don't know for sure either way.


Funny, I've learned not to click on these sites in Google search. I know it's almost always going to be shallow SEO content.


When I search for programming topics (in my case, Swift-related), I see lots of links from sites like xspdf.com and others that have clearly just scraped Stack Overflow and other legit sites, crammed the content into a useless format that makes Google think it's original, and added tons of ads. Since it looks legit in Google search results, you click on it and only then realize how useless the content is. Often these sites appear earlier than the original content did.

So I don't click on them now that I know this. Google, of course, seems not to care, or is incapable of noticing.


Do you know if these were direct clients, or if perhaps they contracted with one entity, which then outsourced it to this shady group?


Uninstalling climacell... know any alternatives?


It'd be interesting if we could train AI to spot fake accounts. A good sized sample would be lovely...


It would also be interesting if we could convert lead into gold. Technically true, but not a novel observation, and an incredibly difficult problem.

Edit: specifically, none of the interesting parts of this problem/idea are in the phrasing - the only thing to do is just go out and implement it, and that's very difficult because (1) you're essentially trying to solve the Turing Test ("is this a computer or a human?") (2) most of the techniques that you might use to heuristically make this determination can either (a) be defeated very easily or (b) be defeated by another AI made using similar resources+techniques to those used to make the first.


Yeoman's work. TY!


[flagged]


"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

https://news.ycombinator.com/newsguidelines.html


Is this really any less ethical than a more sophisticated marketing scheme?

"Black Hat Firm" vs. Expensive Growth Marketing Firm:

– unestablished writer vs credentialed, domain experienced writer (with a salary?)

– spammy Medium.com presence vs Editor Connections at TechCrunch

– network of "dozens" of nearly ineffectual sockpuppets vs dozens of employees with "real" accounts ready to upvote and engage on Boss's call

If you're surprised by this, spend a day in the pits of a product launch in literally any industry. We welcome ingenuity when we talk about growth hacks but criminalize it when it's not classy. This is all a matter of access, Marketing and PR generally comes down to what resources people have and what they can get away with.


Most "sophisticated" marketing schemes are incredibly unethical (and I've even argued that marketing at its core is unethical, but that's a hard sell in most places). That might mean that it's not any less ethical, but equally scummy.

The fact that it's not a surprise (and implying that everyone does it) is not an excuse to keep manipulating people at a mass scale when they're browsing the internet in (primarily) good faith. I'd say that pointing that out and accepting it as a good justification is in and of itself unethical.


>(and I've even argued that marketing at its core is unethical, but that's a hard sell most places)

100% agreed, and I gather from other times/places where this was brought up that there are more of us than you might think.


I sincerely hope so. I fervently believe that all advertising is, by definition, paid manipulation. As such, it is deeply immoral, despicable, and should be outlawed on the constitutional level! I also believe that this manipulation is the foundation of almost every evil in the western world, because it is the perverse motive that corrupts our democracy and our economy. The ability to manipulate masses of people to buy your idea or product using money allows those with money to dominate, not those who are competent or correct (either at politics or economy). The fact that our societies not only allow this but actively celebrate the biggest manipulators as the best of us (google facebook etc.) is beyond immoral for me. It's insane!

I also believe that anybody who calls out this sick game at the level of society would be cruelly ostracised by those in power, either in media or politics or economy. Still, we should voice this more often. We should make people rethink our societies.


At the end of the day it's all about good faith. You either have it or you don't. It's completely obvious to me that it's OK for a product company like Apple to have industry/journalism ties through which they provide information ahead of time for a bigger launch day.

I could go through essentially any situation in marketing, and that's what the ethics boil down to for me. It's simply not good faith to rig voting on a social network. I hate it when my friends ask me to upvote their HN post just because they've asked and they know I've been around here a long time. But if they write a cool post and say "hey, if you like this, spread it around a bit," then that is fair and I'm happy to submit it to others here.

So to answer your original question: Yes, it really is less ethical and the continuum does not range from spammy to sophisticated. Spam can be quite sophisticated. They're correlated, but they're different measures and, really, a true master doesn't employ these techniques because they're playing a different, better game.


"I didn't know that black-hat content marketing existed" (from the Twitter thread)

I haven't laughed so hard in months :)))

I've talked with over 150 founders (I'm running a website that deals with acquisition channels & user growth [1]) and this is like uncovering 0.01% of what's going on out there.

I became friends with some of these founders, and some of them admitted that a significant part of their growth was "paying for being featured in publication X"

Don't get me wrong, I'm against this "black hat content marketing" practice, but let's also consider the other perspective:

a) 80% of the content on publications like TechCrunch [2] is all about Google/Apple/Tesla/Virgin. I challenge, you go there RIGHT NOW. COUNT the % of stories about FAANG companies.

As the markets go into a "winner takes all" mode, these publications only cover the big winners. So people only hear about them, which amplifies the whole "winner takes all" thing, and the vicious circle continues.

Some of these big publications have made "attempts" to be "indie-friendly", but that's one big pile of BS. I won't name the company I contacted (it's a bigger publication). I basically told them: Hey guys, I've got an interesting article that was featured on the HN front page 3 days ago; can I do a deeper piece for your publication?

Their answer: "Oh, that's great, go to our sister website X.com, we feature non-FAANG there". X.com was a website that wasn't even in the Alexa top 1M list.

I also have some doubts that the OP has removed some bigger publication names (maybe afraid of getting sued? No idea).

My point is: As publications get more "closed", the incentive for getting there via other means is going to get bigger.

[1] https://www.firstpayingusers.com [2] https://techcrunch.com


> I haven't laughed so hard in months :)))

Please don't be a jerk in HN comments. We're trying for a different quality of discussion here. The rest of your comment would be just fine without that swipe.

https://news.ycombinator.com/newsguidelines.html


I don’t see a problem with it. It says more about the comment author than it does about the article. If he (commenter) wants to out himself as a knowitallogist, so be it?


What's wrong with expressing your feelings in a thread and writing about it? Are humans now jerks for ironically laughing at things? Okay. :)


When you're laughing at how wrong someone else is (or you assume they are) on the internet, the default is that this is a snarky way of putting them down.

If your intention is different from that, this is good, but then the burden is on you to express your feelings in a way that disambiguates your comment from the default. https://hn.algolia.com/?dateRange=all&page=0&prefix=false&so...


Anecdotally, my website has apparently cracked some SEO metrics threshold, as I'm enjoying the questionable honour of receiving various offers on a daily basis. These are from agencies either directly asking for link placements, or offering paid posts, or, the other way round, offering ready-made content (which may or may not contain an endorsement of some kind, most probably the latter). One thing these have in common is that they are the most stubborn cold-call campaigns I've ever seen. They do not give up; there's a 7th and 8th follow-up on an answered mail, eventually turning towards the passive-aggressive side. (While, clearly, they couldn't be bothered to take even a quick look at the site in question to recognise how inappropriate and meaningless their advances are.) There seems to be quite a flourishing industry behind this.

(Not-so-serious theory regarding overconfident influencers: maybe they are just copying the style of the agencies they get their paychecks from.)


Without naming your site or sharing any details, would you be willing to share how many daily unique visitors you are getting that it has reached over this threshold?


It's just a small site that has been out there for some time. I kicked out all analytics for good some time ago, so I don't have any reliable data myself. There have been times with hyped content collecting millions of hits in a few days, but those were years ago. (Then, this only resulted in offers to buy the domain, from what I can only guess to be businesses of varying trustworthiness.) I added a blog two years ago with rather infrequent post activity. The content is quite specialized and probably far from general interest. Ad revenue from a few Google ads sprinkled here and there is about nil. Currently, Google search stats indicate varying performance from 1 to 1.5 M impressions and about 20 to 30 K clicks at the beginning of the year and a rather steep decline towards the summer. (Only estimated pages with first impression are slightly up. I've actually no idea what these figures indicate nowadays, or whether there's anything noticeable about them.) Ironically, the mails started when the performance was going down. (Full disclosure: https://www.masswerk.at/)


Do you happen to know if the follow-up emails were manual or scheduled/automated? I think it'd be interesting if someone actually wrote 8 automated follow-up emails and deliberately went aggressive in the later automated steps. Maybe that actually works enough times to be worth it?


You've obviously never gotten onto the CRM-system radar of a low-level salesperson at Cogent.


It's mostly automated. They'll use tools to dump out data based on search metrics into an Airtable or Google spreadsheet, then templates for the mail stages and tools to send and track opens.
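For illustration, the staged part is trivial to script. Here's a minimal sketch (templates, intervals, and names all invented) of how a fixed follow-up sequence gets scheduled:

```python
# Toy sketch of a staged follow-up schedule, as described above.
# STAGES and the intervals are invented for illustration; real tools
# also track opens/replies and stop the sequence on a response.
from datetime import date, timedelta

STAGES = [          # (days after the previous send, template name)
    (0, "intro"),
    (3, "followup_1"),
    (7, "followup_2"),
]

def next_send(first_sent, stage):
    """Date on which the given stage (0-based) should go out."""
    offset = sum(days for days, _ in STAGES[: stage + 1])
    return first_sent + timedelta(days=offset)
```

So an "intro" sent on day 0 is followed up on day 3 and day 10, and a longer STAGES list with increasingly pushy templates gets you the 7th and 8th follow-ups mentioned elsewhere in this thread.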


> I also have some doubts that the OP has removed some bigger publication names (maybe afraid of getting sued? No idea).

Just for completeness: I have not. This is everything I have near-certainty of. I omitted 2 or 3 possible authors who I'm not sure of because I'm not willing to accuse someone with less than near-certainty (where certainty would be their admission). If I knew anything else, I'd have posted it.

Maybe more importantly, I don't have any reason to believe publishers knew this was happening, and have every reason to believe the opposite. Many of these blogs take transparency and disclosure really seriously.


I used to know an e-commerce startup founder whose entire growth strategy was focused on this brand of "grey hat" backlinks SEO. He was a Forbes contributor and tried to get stuff published in high domain authority sites pretty frequently through "content partnerships" or other means. The e-commerce startup ended up failing.

Now he runs an SEO agency that focuses on backlink building.


> a) 80% of the content on publications like TechCrunch [2] is all about Google/Apple/Tesla/Virgin. I challenge, you go there RIGHT NOW. COUNT the % of stories about FAANG companies.

For the lazy, I did just that, 19 stories under the "latest" Tag, 7 of which are about Google/Apple/Tesla/Virgin/Microsoft. So about 36%.


I remember a big-tech publication (TechCrunch or VentureBeat) firing one author who was caught getting paid by startups. This was 3-5 years ago.


I don't endorse this behavior at all but can it really be said to be black-hat? imho seems more gray to me. From what I understand, content _is_ being generated and there are attempts to boost the SEO, so they're providing a service and it's not outright theft. At worst it just sounds like poor practice and gaming the system. They're not acting maliciously, illegally, and/or destructively, which is how I've always defined black-hat. Am I missing something?


> They're not acting maliciously, illegally, and/or destructively, which is how I've always defined black-hat. Am I missing something?

It probably is illegal if they aren't disclosing a paid relationship: https://www.ftc.gov/tips-advice/business-center/guidance/dis...


Not really. According to their page, disclosure is a best practice, but doing otherwise isn't inherently illegal. Their own guide [0] begins by describing adherence to their guidelines as "voluntary compliance", and a failure to comply would require their investigation into the specifics to determine whether it was a violation of law; it is not automatically a violation.

[0] https://www.ftc.gov/sites/default/files/attachments/press-re...


Those guidelines are voluntary, sure. But there are actual laws involved (section 5 of the FTC Act (15 U.S.C. 45)) that make some things actually illegal.

Specifically: "Unfair methods of competition in or affecting commerce, and unfair or deceptive acts or practices in or affecting commerce, are hereby declared unlawful."[1]

This voluntary guide helps people understand how the FTC interprets that law. Deciding not to comply because the guide is voluntary doesn't preclude prosecution.

You are right insofar as there is no law that specifically says "on the website reddit.com you must disclose if you are paid".

[1] https://www.law.cornell.edu/uscode/text/15/45


> Specifically: "Unfair methods of competition in or affecting commerce, and unfair or deceptive acts or practices in or affecting commerce, are hereby declared unlawful."[1]

This is one of the least appropriate uses of the word "specifically" I've ever seen.

That law just says "Don't do bad things. You know who you are."


It's often written like this, and regulations define more specific things (or it's left to judges).


Yep, there's lots of gray area. But not disclosing a financial arrangement isn't illegal in itself. It's only in the last few years that YouTube even had an option to flag videos as "contains paid promotions", though now they do require that flag if you're getting paid.


> But not disclosing a financial arrangement isn't illegal in itself.

What is making you think this is true? It's not.

For example 3/4 of the example cases here are around undisclosed financial arrangements in influencer marketing, and in all cases the company admitted fault: https://mediakix.com/blog/ftc-influencer-marketing-violation...

I'm not sure what your definition of illegal is, but there is a law that the FTC is using to win legal cases on the issue.


I'm referring to their guiding principle that disclosure is necessary when that disclosure might change how a consumer evaluates the review. If it is reasonable to think the disclosure would make no difference, then no disclosure is required. Though I'll admit that can be interpreted broadly enough to say that disclosure is always necessary.


Likely to also be wire/mail fraud if my understanding is correct.

https://en.m.wikipedia.org/wiki/Mail_and_wire_fraud


If this isn't black-hat, what is? Specifically, an author not disclosing an affiliation or payment, fake accounts promoting it, and fake accounts commenting on those promotions. In the continuum between white and black, what's left on the black side? :-)


SQL injections, vulnerability exploits, or other hijacking mechanisms to inject links directly. In my mind, using a system as designed without breaking any laws lands in grey-hat territory.


Outright spam, like forum and blog comments. Exploiting web sites, or abusing trust relationships (like updates to WordPress plugins or browser extensions) to inject spam into legitimate pages, or to create entire networks of spam pages within unaware sites. Launching deliberately obvious spam campaigns to discredit competing sites.

That's black-hat SEO -- where the behavior itself is probably illegal, even without the intent.


Don't forget SAPE. When I heard of that, I knew the SEO world was rougher than I had previously imagined. For those who don't know, it's a Russian-run marketplace for link-injected/hacked sites. Really big, high-quality sites.


It is more commonly known as astroturfing.


Well, fuck it.


Some examples that come to mind: click fraud, phishing, malware.


That doesn't preclude the techniques we're discussing here, it only adds other techniques to the black hat tool box.


I think the point is that those activities are clearly "black hat", while astroturfing isn't inherently illegal. (well, maybe in some jurisdictions it is. I seem to recall that the UK has much stricter "truth in advertising" laws than the US)

It all exists on a spectrum though, and we can disagree on where the line for black hat is drawn, with no one being wrong because it's a matter of opinion.


I agree. From the Twitter thread, I clicked on a random Medium user and on a random article of his. Yes, there was a passage about one company's product. But boy, there were 15 passages of pragmatic, no-BS content which I found very useful; I'd even say it was an above-average article.


Deception for money is malice.


I'm confused on your word choice. Malice is generally a more deliberate act of hostility. "Fraudulent" would seem to fit better.


Isn't this extremely common, honestly? Isn't this so common it has names like "astroturfing" and "growth hacking"? This is so not black hat that it's basically just regular ol' hat.

Yeah, creating fake grassroots support isn't particularly new, and I would venture that it's been around as long as there's been marketing.


It is extremely common. These companies probably bought something that was described as something more benign and ethical.

And I totally understand. If you're a small company, you don't have time to do further checks. They probably get you with a good pitch about influencer marketing for a niche area, and that sounds nice enough that many would consider putting some money into it.


I feel it's unfortunate that "growth hacking" took on this connotation (i.e., "comparable to astroturfing / fraudulent by definition"), given the need for a term to describe genuine, above-board, white-hat creative approaches to connecting your idea with a market. Reminiscent of the "hacker:cracker" usage divide, more generally.


Yes. Reddit upvotes are dirt cheap and sites sell article spots all the time.


This happens in the real world too. IIRC, Stonyfield yogurt had their friends and family buy the product when it was newly on the shelf at groceries to make it look like it was selling well enough to get picked up permanently.


I read a couple of the tweets and some other comments, but I didn't realize until I reached your comment that they were talking about legit services using those marketing techniques! I thought they meant black hat services were trying to act legit by paying for publicity that would paint them that way...


I do not understand the term “black hat” in this context.

Astroturfing has been around for decades.

Companies reaching out to influencers to have them write something nice is also an open secret to the extent it is any secret.


In the past I have pointed the HN moderators (contact link is in the footer) to threads with sock puppets. Some are easy to recognize, e.g. new users praising a service that was submitted only 10 minutes ago. Sometimes it's harder to find a pattern. Do email the HN moderators if you suspect fraud or unfair practices.


Normally they're pretty trivial like you said. This one is extensive: 20+ sockpuppet HN and Reddit accounts and dozens of fake blog posts submitted to many of the largest Medium-style blogs. It's an entire propaganda network just devoted to content marketing.


20+ accounts is pretty small potatoes for a content marketer.


Probably just one person's side hustle while they're on fire watch at work.

But then again they could be a foot soldier for an appropriately compartmentalized consortium.


An interesting question is to what extent they're jumping through hoops to appear as distinct, legitimate, real users. Such as:

a) using a proxy from a residential ISP somewhere

b) never putting two or more sock puppets behind the same IP. Or assuming exclusive ipv4 use, not even any two clients from within the same /20 to /18 sized netblocks.

c) a reasonably varied selection of ISPs. Also variation in ISP netblock geolocation (maxmind geoIP database or similar) based on ARIN, RIPE, APNIC etc registration data. Obviously if you're promoting something that's very tech/startup industry oriented it would not be as suspicious if you had a bunch of posters "authentically discussing it" that geolocated to the SF bay area and Seattle.

d) a reasonably varied selection of common user agents. Also intentional variation on operating system and browser fingerprinting variables.

e) intentional training to avoid writing patterns and phrases that might seem similar, when one person is driving ten accounts

f) not doing dumb stuff that gives away your time zone, like if you have a bunch of people in a GMT+4 time zone that post things regularly on a 9-5 daytime work schedule, when they're supposed to be pretending to be ordinary internet users in a USA time zone.

From a black hat network engineering perspective there are a lot of ways that one human driving 30 sock puppets can appear from 30 distinct locations, as if there were 30 real humans with 30 different operating systems/computers/browser user agents and browser fingerprints.

As to what level of analysis tools are run on the server side to detect "low effort" sock puppets, that's another question.

From the point of view of a place where fake accounts/sock puppets post things, obviously an organization like twitter has a lot more staff resources to devote to writing custom analysis and correlation tools. Specifically for the purpose of identifying common patterns in inauthentic accounts.

I presume that dang and the people who run the ycombinator admin interface can see the IP address of every poster next to the post's timestamp, and might notice in a manual fashion if a lot of suspicious posts started showing up all from the same netblocks. But then again maybe not.

If you want to see the tip of an iceberg of one method for running sockpuppets, go google "residential proxies for sale" and start looking through the slickly presented marketing material.

https://www.google.com/search?client=firefox-b-d&q=residenti...
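To make (b) concrete, here's a toy sketch of the server-side correlation an operator could run: group posting IPs by their /20 supernet and flag any netblock shared by supposedly unrelated accounts. The account names and IPs below are invented; a real check would read them from server logs.

```python
# Hypothetical netblock-correlation check: accounts whose posts come
# from the same /20 are worth a closer manual look.
from collections import defaultdict
from ipaddress import ip_network

def shared_netblocks(post_log, prefix=20):
    """post_log: iterable of (account, ip_string) pairs."""
    blocks = defaultdict(set)
    for account, ip in post_log:
        net = ip_network(f"{ip}/{prefix}", strict=False)
        blocks[net].add(account)
    # Keep only netblocks used by two or more distinct accounts.
    return {net: accts for net, accts in blocks.items() if len(accts) > 1}

log = [
    ("alice", "203.0.113.7"),
    ("bob", "203.0.113.200"),   # same /20 as alice: suspicious
    ("carol", "198.51.100.9"),
]
for net, accts in shared_netblocks(log).items():
    print(net, sorted(accts))   # 203.0.112.0/20 ['alice', 'bob']
```

Of course, per (a) through (c) above, a careful operation defeats exactly this check by never reusing netblocks, which is why it only catches the low-effort sock puppets.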


What about building a service, perhaps a web browser plugin, that overlays a trustworthiness score plus contextual information over text (bonus for images) on the web?

If a user posts "I recommend product X", the overlay would say "Caution: This user also recommended the product Y,Z in the last 48 hours. 34/34 of the users post in the last month are product recommendations. Sentiment score for users posts: 100% positive. The following internet accounts are likely controlled by the same individual or organization: ..."

Is it feasible? Does it already exist?
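The per-user heuristics in that mockup are straightforward to prototype. Here's a toy sketch (field names, thresholds, and sample data all invented) of how the warning strings could be derived from a user's post history:

```python
# Toy version of the overlay heuristics: warn when a user's recent
# history is recommendation-heavy. Thresholds are arbitrary.
from datetime import datetime, timedelta

def trust_warnings(posts, now):
    """posts: dicts with 'time' (datetime) and 'is_recommendation' (bool)."""
    warnings = []
    recent = [p for p in posts if now - p["time"] <= timedelta(hours=48)]
    n_recent = sum(p["is_recommendation"] for p in recent)
    if n_recent >= 2:
        warnings.append(f"{n_recent} product recommendations in the last 48 hours")
    n_total = sum(p["is_recommendation"] for p in posts)
    if posts and n_total / len(posts) > 0.8:
        warnings.append(f"{n_total}/{len(posts)} recent posts are product recommendations")
    return warnings

# Invented sample history: five posts, all recommendations.
now = datetime(2020, 10, 14, 12, 0)
posts = [
    {"time": now - timedelta(hours=h), "is_recommendation": True}
    for h in (1, 20, 40, 100, 200)
]
```

The hard parts are elsewhere: getting the post history per platform, classifying what counts as a "recommendation", and the cross-account linking, which is a much harder problem.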


Somewhat feasible. You really need distributed fact-checking to scale up, especially since you don't have source IPs. If you can do that, you might have something, provided you can keep participation up.


Do source IPs still help at all? These days it seems rather easy to visit a website from numerous different source IPs.


I wish they still helped more, but they do still help.


and never forget: the black hats will never stop trying to game any system you try to come up with. At this point in time, I think it's pretty clear they have a strong upper-hand.


Super interesting. We're working on something similar within the news ecosystem.

See https://onesub.io/chrome for the extension (and https://onesub.io/mission for our wider mission).


A distributed trust network is feasible given sufficient critical mass. I’m looking into the best way to bootstrap it.


I'm working on something like that.


dev.to and itnext are in the twitter thread thanking the poster for bringing this to light as if the existence of accounts that do this isn't a major boon to their platforms. dev.to and itnext, just like medium and other such sites are essentially content marketing farms. It is possible that they both dislike that activity and benefit from it, but I couldn't read those tweets without some skepticism.

I for one find that most really high quality blog posts I read are not really on medium/dev.to/itnext, unless some large organization has committed to making every employee post there, but rather on small blogs run by people who pay the $5/month or year or whatever to run their own blog. Maybe I just haven't looked around enough.


Why is this being referred to as black hat? Everyone does this. You think the Heroku blog posts are written by Heroku employees? It’s really just online ghost writing. I don’t see how this is a big deal.


I was a Heroku employee until recently. I know the people who work on those blog posts, and I tried to contribute one myself.

To my knowledge, employees absolutely do work on those; articles are edited and go through several rounds of review, but they aren't ghost written. Even the ones written by the management team seem entirely congruous with their level of product knowledge.

There are external people (usually former Herokai) who contribute to the Heroku blog; their names / positions are listed on the byline of the article.

I wouldn't see the point of having a non-employee writing the canonical description of a new feature; they wouldn't have the context necessary to do the best job.


> I don’t see how this is a big deal.

It may not be a big deal to you, and that's fine, but it was a big deal to some of these publishers. Many of them take their neutrality seriously, to the point that they edited the articles to remove mentions and/or links when they learned about the undisclosed conflicts and the sockpuppet promotion.

Also, many (most?) of the publishers hadn't seen something like this: for-hire operation, well-known startups, and getting distribution in some of the largest technical blogs. While that certainly doesn't mean it isn't happening, it does mean that there's value to publicizing the details.


> I didn't know that black-hat content marketing existed

seriously? I think I first saw it in 1996. Sock puppets are nothing new or innovative as a general concept.


Yeah, this has been going on for a long time. The methods evolve, but the bullshit stays the same [1]. Good rules of thumb:

1. If it praises a product, the article was paid for

2. If it has a link that looks even slightly out of place, that link was paid for

3. With few exceptions, most of the revenue for the publication comes from selling features and backlinks (much more than subscriptions and ads).

[1] http://www.paulgraham.com/submarine.html


> seriously? I think I first saw it in 1996. Sock puppets are nothing new or innovative as a general concept.

Sock puppets have existed for a long time, but turning it into a contract service for clients, selling it to well-known startups, and scaling it up this big is not something I've seen before.

(also, hi from NANOG and the SIX ages ago!)


Hi Troy! Response below written not so much for you to read, but everyone else looking at the thread.

On a large scale, there is a big overlap between this kind of operation and the general concept of operating a low-cost call center in an English-speaking developing nation (India, Pakistan, Bangladesh).

This has been taken to its most blackhat extent by the people who are running call centers for the fake "fix your PC it has viruses now, we are Microsoft support" scammers. Or even worse the fake Internal Revenue Service / Canada Revenue Agency type scam call centers.

The operating costs in monthly salary per human, and office rental, basic desktop PCs, electricity, telecom services are all very similar between the different grey/black hat business models. If you have humans talking to other humans by voice, you've got a bunch of $20 headsets with boom mics plugged directly into the desktop PCs, some SIP softphones, an asterisk setup, and a router/VPN connection back to a bunch of grey market SIP trunks. The voice part is obviously not necessary if it's just click workers.

Assume for a moment that your office and computer equipment is a sunk cost, and you've got a room full of dudes working a 6-day work week and paying them each $250 a month. They each need to bring in revenue of something like $10-11 per day to break even on the payroll. Obviously this "business model" is somewhat dependent upon finding a place that has a sufficiently large pool of low-wage, but moderately educated people who can drive desktop PCs, and a place to put your call center type environment in low cost commercial real estate. Thus my mention in another comment about Bangladesh.

Some organizations went into the gold farming market to hire people at the equivalent of $250 USD/month to repetitively perform tasks in MMORPG games and then sell the virtual currency/assets to people in the US/Canada/Europe with extra money to spend.

After that, they quickly discovered that it could be more lucrative to have one person run 30 reddit accounts (or twitter, whatever) and get paid to upvote stuff to the front page.

If one business model fails, you take the same office environment and temporarily convert the workers to doing another task. Or you have a mixture of tasks going on simultaneously for the click workers.

People with difficult accents or less-than-optimal phone skills, who cannot successfully scam a person and then transfer them to the higher-ranked "closer", are positioned for these click-worker tasks.


>I think it would have been less work for them to be legit

My neighbor had a brother who was a "black hat" and when he explained to me how he ran his scams, I came to the same conclusion.


>I think it would have been less work for them to be legit

I had the same conclusion when someone told me about the dozens of cheap/used Android devices they and their roommates used to get paid to "watch" ads. Apparently there are apps that will pay you for ads you watch, and you can babysit them pretty easily to make like..... $200/mo.


Just to keep things in perspective, $200/month is already the median income of some places.


Which places do you have in mind? Globally, the median is close to $800/month per household.


> Globally, the median is close to $800/month per household.

You pulled that number out of thin air, didn't you?

I mean, some european countries like Bulgaria have a median household income of around 400€/month.

Do you honestly believe that Bulgaria is one of the world's poorest countries?

Getting back to reality, according to Wikipedia, which cites the OECD, India's median income in PPP terms is currently around $2,500/year.

India ranks 43rd in the world's median household income ranking.

https://stats.oecd.org/Index.aspx?DataSetCode=IDD


> You pulled that number out of thin air, didn't you?

I used a calculator. Annual figure is from there: https://news.gallup.com/poll/166211/worldwide-median-househo...

> some european countries like Bulgaria have a median household income of around 400€/month.

Irrelevant. Medians don’t aggregate the way you think they do. To compute global median, you need to know shapes of distributions in each country, and population of the countries.
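A toy example of that last point, with invented numbers: the median of the country medians is not the median over all people, because country sizes and distribution shapes differ.

```python
# Median of medians vs. true global median, with made-up "monthly
# income" samples for three unequal countries.
import statistics

countries = {
    "A": [100] * 90 + [5000] * 10,   # large, mostly poor; median 100
    "B": [800] * 50 + [900] * 50,    # mid-sized; median 850
    "C": [3000] * 30,                # small, rich; median 3000
}
country_medians = [statistics.median(v) for v in countries.values()]
everyone = [x for v in countries.values() for x in v]

print(statistics.median(country_medians))  # 850
print(statistics.median(everyone))         # 800: the true global median
```

Because country A is both the largest and the poorest, the person-level median lands below the median of the three country medians.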


I'm the OP. It's been about a week since my tweets. As most of us know, some sites are intentionally filled with promoted garbage (Forbes contributors, a lot of Quora and dZone, etc.).

Many blog maintainers/curators really are trying to be transparent, though, including most or all of the ones I mentioned. They either don't want content that was paid for or will only consider it when they know about it and it's disclosed to readers. For those blog maintainers/curators, here's a few thoughts:

* I think the generalization here is probably that if an author links words/phrases other than a company's name to a company website (like linking a type of product or the problem the company solves), a curator should be more suspicious of the submission. The link should probably be changed to point to an editor-chosen neutral discussion of that topic, like a Wikipedia page, trade group, or RFC.

About 2/3rds of the posts that I suspect had a conflict of interest would have stood out this way. For example, one post links "usability testing" to a company. That should stand out during review, regardless of the company or author.

I'd also be suspicious of posts that have more than 1 link to any company. Obviously it could be totally innocuous, but it's unusual and generally unnecessary.
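Both checks are mechanical enough that a curator could script a first pass. A toy sketch (the brand-name matching and the sample links are invented simplifications, and human judgment is still needed):

```python
# Flag (1) generic anchor text pointing at a company domain and
# (2) multiple links to the same domain, per the heuristics above.
from collections import Counter
from urllib.parse import urlparse

def suspicious_links(links):
    """links: list of (anchor_text, url) pairs from a submitted post."""
    reasons = []
    domains = Counter()
    for text, url in links:
        host = urlparse(url).netloc.lower()
        host = host[4:] if host.startswith("www.") else host
        domains[host] += 1
        brand = host.split(".")[0]   # crude: "example.com" -> "example"
        if brand not in text.lower():
            reasons.append(f"anchor '{text}' links to {host} without naming it")
    for host, n in domains.items():
        if n > 1:
            reasons.append(f"{n} links to {host}")
    return reasons
```

Run against a case like the "usability testing" one above, the first check fires; links whose anchor text is just the company's name pass.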

* Don't blindly trust my assessment or the list of companies I provided. As with any random person on the Internet, I can't say authoritatively how any given post was motivated; I can only point to lots of people who are writing very similar things about the same few otherwise-unrelated companies. Each publisher will need to decide for themselves when a coincidence goes from unlikely to impossible.

The lesson here is probably to think about that for yourself: where do you draw the line? Would you prefer to err on the side of false negatives, false positives, or exercising editorial discretion (allowing the article but removing parts about specific companies)?

* I strongly recommend _against_ penalizing authors who are in developing countries.

I think the content marketing firm made victims out of the authors in developing countries. At best, they thought they were providing a real service to the public. At worst, they thought they were making good money doing something that might be a bit shady, but is common in their area. I have no reason to think they knew about the after-the-fact promotion.

(Authors in developed countries like the US - which includes all of the suspected made-up authors - obviously shouldn't be doing this and probably know it. Different rules apply.)

Moreover, someone in a developing country has limited opportunities for career growth and visibility (and some authors clearly have technical talent). I don't think this should justify taking those opportunities away. For example, I do not suggest refusing future articles from these people or otherwise limiting their distribution. Perhaps their future submissions need tighter review or can't be about specific companies/products, just technologies, but it's important that they still have this avenue.

Good luck.


The biggest companies can astroturf and never get caught because they can pay a premium for high value accounts.

Anything pro S&P500 should be suspect, especially when the comments are defending bad news.

Heck if Aldi astroturfs on reddit Frugal, you bet everyone else has a reputation management team outsourced for plausible deniability.


>Anything pro S&P500 should be suspect, especially when the comments are defending bad news.

That's a pretty ridiculous assertion when you consider how many people benefit from these companies' products/services. (Look at how many people like Apple products.) Consequently, they will defend the companies that they think are getting unfair treatment.


> Look at how many people like Apple products.

Maybe Apple is just really good at public relations and turns customers into zealots with their marketing techniques. They get plausible deniability when the fans show up and advocate for Apple and they get a lot of "boots on the ground" (fingers on the keyboard?) for free. Even better, plenty of their PR soldiers are paying them to be on the team.


Maybe, but that's not astroturfing, which is what the implication was. Astroturfing is fake accounts or purchased votes.


1 or 2 posting accounts, hundreds or thousands of upvoting accounts.

All of this can change the narrative.


Flagged as suspect. Just kidding. Well, it could be argued that if you are defending S&P500 you are probably a pawn whether knowingly or not.


No more so than defending a government because you like what it has provided.


I get requests quite frequently to host 'guest' blog posts on my domain. Obviously, I tell them to get lost, but I'm sure the money is good. So it's not just the fake accounts you have to watch out for.


I guess it isn't news to me, but I'm running out of places for legit conversations and product recommendations.


It's a constant uphill battle, but it seems the harder a platform is to use, the less likely it is to gather a big crowd and thus the marketers. So perhaps an old-style BBS, IRC, or a newsgroup server, and a client built intentionally with bad UI.


Disheartening, but definitely not a new thing. As OP / others pointed out, many are super obvious. Others are less so.

As more and more companies try to go the content marketing route, getting the all-too-important backlinks becomes basically required. If you aren't getting them organically, then this is a great way to "kickstart" that process. Once you are in front of more eyeballs, then the organic growth can start (if your products are good/useful).


This isn't new at all. If I wasn't under an NDA I'd tell you more about how deep the rabbit hole goes. Keep digging; there's lots to find and crooked stuff to expose.


There is a huge difference between endorsing a product and writing a tutorial about it... but content marketing is a gray area. I couldn't find any articles through those Medium accounts that really seemed like spam to me, so I would say the authors are in the clear despite being called out really hard on social media right now.

It's really the sockpuppet treatment that is "blackhat" which the authors might not even be aware of.


If I employ someone like an 'evangelist' to post on HN to talk my book every day, is that more legitimate? Or if I'm friends with, share common interests with, or otherwise court an established tastemaker, and he or she boosts my content and features me in puff pieces to the exclusion of others, is that more legitimate?

Flacks are inextricable from places of public attention.


I remember working out of a coworking place in Thailand. There was a group of guys next to me running a marketing agency based on hundreds of fake profiles they maintained, talking about it like it was the most natural thing in the world.

People who are surprised by these things forget that in large parts of the world, doing this (black-hat marketing) is not illegal at all.


I'm not sure this is surprising really. It wasn't that long ago that there was clearly a targeted campaign against Zoom across channels where you'd expect engagement from technical people.

I don't know if it wasn't doing its job, or if something like that just can't be maintained long term, but it seemed to peter out relatively quickly.


Does anyone know about the legality of this? My suspicion is that this black-hat content marketing is perfectly legal under US law. It's only potentially a violation of individual platforms' TOS (or publishing agreements). Is this approximately correct?


It may fall under the FTC’s regulations on influencer endorsements: https://www.ftc.gov/sites/default/files/attachments/press-re...


I thought this was mostly unenforced or unenforceable in cases like this -- where it would be really hard to prove that someone took a payment. I'll keep digging. Thanks!

One more thing -- this isn't an explicit endorsement. It's content marketing. I rather thought that made even the FTC rules hard to apply.

Also: "The Guides are not regulations, and so there are no civil penalties associated with them. But if advertisers don’t follow the guides, the FTC may decide to investigate whether the practices are unfair or deceptive under the FTC Act." https://www.ftc.gov/news-events/media-resources/truth-advert...

edit: more details


I would presume there is a 1099 or W2 somewhere, unless this is happening internationally.



I mean, it wasn't exactly a secret, right?

I'd like to see sites like HN become vigilant against this kind of fluff marketing content (which has been making the rounds for a loooong time) from now on, as much as they are vigilant against political content.

Any new rules?


>I'd like to see sites like HN become vigilant against this kind of fluff marketing content

One of the reasons I continue to use HN is that there seems to be very little marketing content that bubbles to the top articles, compared to almost every other site. Something is working.


I found hundreds of fake accounts on Product Hunt used to promote startups to win Product of the Day status.

What did the Product Hunt team do? Nothing. I met some people in the startup community who reported this fraud too; zero action was taken.


Is this actually news to people? This has been happening for decades. The purpose is largely for SEO / sometimes for actual traffic.

Just go google "private blog networks" (pbn) or "link farm".


There's a big difference between "this happens" and "this is happening now, via these accounts, take a look and see precisely how, as it happens".

Also, if the threshold for posting something of potential interest to HN were "exclusive / breaking news" there'd be far too few posts and likely no community here at all for [redacted; aiming to practice civility and kindness] all of us to enjoy.


You seem to be refuting claims I never made. I'm simply perplexed that so many people don't know this market exists (as is indicated/suggested by the number of upvotes).


They're not 'refuting claims you never made'. They're simply responding to your comment by making their own points. Which is a completely standard way to carry discussion forward.


I knew this happens, and I upvoted. Learning details about something I knew about is useful.

What makes you think upvotes imply people don’t know this sort of thing exists?


Because Troy Davis, in his tweet, literally said "I didn't know that black-hat content marketing existed"

I think his comments here indicate that what he meant was he wasn't aware it exists "at scale" in this way, but taking the content of the linked post at face value, it's easy to come away thinking "How did he not know this existed?". It's precisely what I thought after reading his tweets, but before seeing his detailed comments here.


Maybe he is a bot creator (or a bot account) who tries to discredit the importance of other bot accounts? I mean, it would be the natural next step for those groups to try to persuade the public (e.g. Reddit users) that it's not a big deal that those accounts exist.


An upvote doesn’t necessarily imply lack of knowledge of this existing.


I am interested in why you felt the need to make the comment, though? Because it can come across jaded like, "yeah yeah, seen it all before, there's nothing new here", which is sort of like throwing cold water on it, which can reduce the odds of positive change happening.


> Is this actually news to people?

Good question. The thing that was news to me is that it's happening on very large blogs, not random sites and that it's being sold to and used by well-known startups. Like you, I expect sockpuppet accounts and random linkbuilding garbage, just not at this scale or with this much distribution.


Welcome to the world of black hat services => https://www.blackhatworld.com/forums/seo-packages.206/


The thing to note there is that despite being on a "black hat" website, most of those ads are from marketing or SEO companies linking to their own legitimate company websites. It's clear most of them just see this as how internet marketing is done.

I also note that as a not-logged-in user, the Twitter link just shows me "related tweets" from people actually hiring or offering to sell social media content.


lol, sus tho. I like how this site has a cookie policy overlay.


Not only very large blogs - the Forbes contributor program is probably the biggest example of "linkbuilding garbage"


The internet used to be a suspicious, sarcastic, and rather disbelieving crowd, but that has changed for newer generations. I do think there are many people for whom this is news. There's just so much fluffy feel-good marketing content going around that one forgets the motives behind it. Old-style blogs and comments made it obvious that there was tons of spam; nowadays it has gone covert because major distribution platforms remove obvious spam.

Unfortunately this means that if you want to promote your side project, there's just no way: most topical subreddits will flag you as spam, and you are left with niches like r/sideproject. Or you do this kind of social media/content marketing.


It would be news to millions of Americans, for sure, since millions of people know about "the internet" less than they think, and additionally, many millions don't "think like a criminal" to consider how organizations might try to trick people.

And as others have said, even if the behavior weren't surprising, learning about a specific ring of it is.


It's like it just recently dawned on people that criminals and corrupt people also go on the internet and take their shitty behavior with them.

Color me shocked.


This very site denies it happens even though it is often quite obvious.

> Please don't post insinuations about astroturfing, shilling, brigading, foreign agents and the like. It degrades discussion and is usually mistaken. If you're worried about abuse, email hn@ycombinator.com and we'll look at the data.

https://news.ycombinator.com/newsguidelines.html

If you mention this kind of thing at all, dang will pop in and warn you. You can post all kinds of other crazy stuff, but mention astroturfing or vote manipulation and you will almost always get a response. This is because these sites realize the exact opposite. That the percentage of this stuff is absolutely massive and they are terrified what will happen if the public finds out how common it is.

It's not that people don't realize it happens. It's that the average person underestimates it by several orders of magnitude.


That's no denial. That guideline specifically asks people to notify us so we can investigate. We do that every time anyone asks. I've personally spent hundreds (probably thousands) of hours investigating such things, have banned many accounts and sites for it, and have put tons of effort into writing code to combat it. We take it seriously. We just need some evidence. Surely you don't think we should lower the bar below that?

If you don't believe me, ask troydavis, who went to all the trouble of investigating the above case and writing the OP, whether we take evidence seriously and ban accounts and sites based on it. Or any of the countless other HN users who spot things and ask us to look into them. They're the ones who actually care enough about the community to help protect it.

The problem is that there's another side of the coin: most of the cheap insinuations of astroturfing, shilling, foreign-agenting, spying, botting, and all the rest of it—where by "most" I mean the vast majority—are pulled (begging your pardon) purely out of the insinuator's ass. Internet users just love to make this stuff up as a cheap way of throwing shade on whatever they dislike. That's the dross the guidelines ask HN users to keep out of the threads. This is important because gratuitously accusing others of dishonesty is a fast track to poisoning community, and it's 1000x easier to generate such accusations than it is to answer them.

> This is because these sites realize the exact opposite. That the percentage of this stuff is absolutely massive and they are terrified what will happen if the public finds out how common it is.

Here is something I can answer definitively—you're talking about what's going on in my mind and I think I can speak with some authority about that. No, that is not what's happening. What's happening is that I worry about the integrity of the community on two sides: protecting it from actual abuse and manipulation on the one hand, and protecting it from toxic fantasy bullshit on the other.


> I've personally spent hundreds (probably thousands) of hours investigating such things, have banned many accounts and sites for it, and have put tons of effort into writing code to combat it.

> No, that is not what's happening.

Not to try and get clever and twist your words, but these statements do not appear to line up particularly well with one another.


First statement is absolute numbers. Second statement is regarding relative numbers. They can both be true.


Ok, I'll give you a detailed breakdown. You said that three things were happening which in reality are not:

(1) that we "realize the exact opposite [of what we say]" — in reality, I tell the truth as far as I know it, because I respect this community (edit: plus, for the cynical, it would be a stupid and unnecessary risk not to);

(2) that we're "terrified what will happen if the public finds out how common it is" — in reality, I'm confident that the community would be bowled over by how diligently we work on this, and my only woe is that half the commenters don't want to hear it when I tell them how common it is (namely, that it's uncommon relative to the insinuations that they love to fill the threads with, and that such insinuations are the harder problem to solve and a heavier burden on moderators);

(3) that "the percentage of this stuff is absolutely massive" — in reality, unless I'm wildly ignorant of my job, it's tiny relative to the quantity of imaginary things people make up about it. The latter is the greater threat to HN. With real astroturfing and other forms of abuse, it's possible to find evidence and take action. But how do you persuade the internet not to hurl shit-soaked spaghetti everywhere? (Sorry for the unhinged metaphors, but it's demoralizing to argue about this in HN comments, because none of the users making grand insinuations want to hear about that side of the problem, and when I raise it they say things like "dang denies that astroturfing exists".)

We have a rule that you can't manipulate voting, commenting, or submissions on HN (because some people do that and shouldn't). We have another rule that you can't smear others with insinuations of abuse without evidence (because some people do that and shouldn't). There's no contradiction there. That doesn't seem hard to understand.


> in reality, unless I'm wildly ignorant of my job, it's tiny relative to the quantity of imaginary things people make up about it.

It is a safe bet you would not have written this the way you did if you knew this was a larger problem that you weren't revealing, unless you're an amazing thespian.

Apologies for the insinuation, which is obviously unfounded at this point.


Appreciated!


These are the same people who made dedicated effort to trick Lobsters users into giving them invites so they could spam our site back around the start of 2020: https://lobste.rs/s/utbyws/mitigating_content_marketing

If they're still spamming for LoadMill eight months later, that strongly implies to me that the clients know what they're getting and are OK with the tactics.


I am curious if it actually works.

Do the people paying for the content get any net benefit?


I'm sure there are others; there was some website to buy HN upvotes too.


Using that site will get your accounts and sites banned here.


How do you determine that it's the site's owner? If I buy 1000 upvotes for mycompetitor.com, will you boot them?

Attribution is a terribly hard problem. Google has the same issue with paid links and handles it with the Disavow Links tool, but that's a hack, not a solution, as it essentially means you have to buy services that get you an up-to-date list of links to your page so you can disavow them before Google hands out penalties.


For sure it's a hard problem, but in this context on HN, it doesn't come up in many cases. When it does, how we handle it is by being flexible and personally available about unbanning. That's a luxury afforded by smallness.

In the more common case, though, it's clear enough who's doing it to take action. And you'd be surprised how often we get explicit confirmation of what happened. Sometimes a site owner will even passionately profess innocence and then sheepishly come back later with "I'm so sorry, you were right, it turns out my marketing-person/friend/teammate did it."


Thanks for this, this thread has been an eye opener for me.


Tragedy of the commons. A problem waiting for a solver


Eye opening isn't it? Now you can see where most of the stuff on top of HN comes from.


Unless I don't know the first thing about my job (always possible I suppose), that is definitely not true.


> (always possible I suppose)

It would be funny (both -haha and -peculiar) if it turned out you do it all largely by instinct, like one of those fungus-cultivating ants.


Eh, so people are paying for upvotes, sockpuppet accounts, etc., but it doesn't work? :o


That doesn't follow (because "most of the stuff on top of HN" is a much stronger claim than what you said here) but I'm happy to answer anyway. The answer is that it frequently doesn't work. What we don't know is how many other cases there are where people are getting away with it. We can't know that; by definition, we're never going to know it.

Even there, though, one can make educated guesses, based for example on how many cases come up where people got away with it at the time, but then evidence comes to our attention later and we can figure it out in retrospect. Such cases are useful because then we can extend our software to catch them in the future.

I'd never claim that such manipulations never work; obviously we can't know that. It's possible that superclever manipulators are rolling HN in their hands like a piece of silly putty. All I'm saying is that if "most of the stuff on top of HN comes from [that]", then I don't know the first thing about my job.


Yeah, you are right. Sorry, I exaggerated there, HN is the best mainstream source for technical stuff honestly.

Still many posts that make it to the top feel like commercial advertisement and I am pretty sure they use (their?) sock puppets to get the first upvotes to cheat the algorithm.

It's easy to create 20 accounts and just switch between them (and IP) when you do your normal HN procrastination to validate them and then get those initial 20 upvotes to go up to the top.


I think the GAN (prob actually not a GAN) that's months old on HN, posts rubbish, gets replies from mods and people, and gets lots of karma is much more interesting.

Imagine what it'd be like if AI wasn't all overhyped garbage.

Downvoted I guess.

Oh, AI that can group think. That would be evil.


I have upvoted you in hopes that you will provide a link to some of the posts you suspect were generated by a GAN


It's prolific. If you read comments you will have seen it.

The question is why only one person has called it out (that I can see).

It's not against written HN rules; bots are only disallowed by unwritten rules. But is it a bot bot?

What's more interesting is that it's not a ring (I assume it's upvoted by normal users).

They will be hard to detect. This bot could just calm down, fly under the radar, and use upvoting and downvoting to control narratives. By the hundreds.

We need to get people to actually read comments, as opposed to finding a word in them that confirms groupthink and upvoting.


One of the main reasons I read hacker news is the comments. To me they make this site one of the great wonders of the Internet world. Would you mind linking to a few of them that you regard as non-human generated?


I sent you an email.

I just ask that you don't pass it on, since I think what's more important is that people on HN think more about comments (or call it out themselves).

Its history and upward karma are interesting. I think it's clever in that it anthropomorphizes itself to reduce attack. Next step would be to hint at mental illness.

But it's about comment quality. Blackhat, pretend OpenAI intelligence don't matter. Comments need to speak for themselves.


There are no fake accounts. There are only accounts or not.

There are no sockpuppet bloggers; those are called bloggers.

If you go down the "fake" road, you have to realize real news by real reporters is often wrong. I've had someone in law enforcement tell me that televised news accidentally labeled them the victim and didn't correct it.

Textbooks written for US public school students: I understand that some content in them has been incorrect and intentionally biased.

There is truth, but it’s an ideal.

I’m glad that this is calling out those that are manipulating people, but on the other hand- what is the goal?

Will shaming bring fairness?

We could have communist dictatorial leaders enforcing their version of truth, if you’d rather have that sort of thing.

Our president should tell the truth, and it should be a scandal if not, to a point of course, because I’d bet most have lied at times.

But if it’s time to activate something like a libel superpower on the internet, how would that even work in a fair and practical way?

Freedom of speech cannot be freedom only to tell truth; truth can be aspired to, but not necessarily known by all, and what’s understood to be truth by some may change. So, really, what should be done?

Btw- I’ve done my best in past years to tell the truth as much as I can when I’m not kidding around, and it typically makes things difficult, but better. I’m not recommending anyone fake up things to boost rep. But, it’s happening, it’s not good, and I don’t see how AI or oversight or a control play would end well when it comes to enforcing truth. However, the notion of a “fake” account is what allows most of the users to post content on HN and Reddit more freely.


> By the way, there are no fake accounts. There are only accounts or not.

Yes, until we see a great advancement in AI, actual meat based mammals are driving these accounts.

> There are no sockpuppet

Sometimes it seems like half of Twitter is fake accounts. You've never seen photos or videos of a Bangladeshi click farm? 50 people sitting in small cubicles running proxy-connected virtual machines on desktop PCs, posting stuff, upvoting things on reddit, etc?

I assure you that such things exist. Some of the places that used to do MMORPG gold farming to trade virtual currency for real money have shifted into this market, because it's much more lucrative.

You've never seen the pictures from China of 1 person sitting in front of a board with 40 budget android phones mounted on it, upvoting and reviewing apps?



looks honestly like the worst job ever.


it probably pays about the same as manually assembling cheap plastic and fabric children's toys or christmas ornaments or whatever.

https://www.theguardian.com/business/2016/dec/04/the-grim-tr...


Precisely that.


Those are iPhones.


I can't find the photo right now, but absolutely the same thing exists in the android app ecosystem. If you read and write fluent Mandarin you can probably find such in 30 seconds of searching within-the-GFW search engines.


I’m with you that it’s a serious problem. If it weren’t, Amazon and others wouldn’t be working so hard on AI to combat the AI or human that’s beaten their AI.

Those aren’t “fake accounts”, though. They’re real accounts being abused. There’s a difference. If Amazon and Twitter allow it to happen, it will happen. But what does shaming accomplish here? It just means people waste time talking about it. It has little chance to change behavior. More likely the outcome could become Reddit and HN enforcing a real ID. That may hurt the community, because not all of us want our name on everything; it’s not because I don’t stand behind what I’m saying- I’m just not going to treat every post like I want to carry it around with me on a sign for the rest of my life, even though at some point, maybe I’ll have to!


So companies pay to get featured everywhere. How is this "black hat"?

That's how the world works, bud.


These must be some of those "sufficiently smart manipulators" (SSM), whose genius HN's moderation is so utterly helpless against.


We caught those accounts months ago and their patterns were within the range we understand.

I haven't had a chance yet to look at Troy's latest report to see if there's more that needs cleaning up, but that's a question of inbox load, not smartness.

Edit: it would be interesting if anyone found recent submissions by these accounts that made HN's front page. If anyone does, please let me know at hn@ycombinator.com because I'd like to look at whether and why we missed something. HN has anti-abuse measures that do not show up publicly (because we don't want abusers to observe it) that ought to have prevented most if not all of that.


Social media -- including this site -- is far more manipulated than people even begin to realize. Social media propaganda and sock puppetry is a huge industry.


I'm curious how you think you know that about HN, because if that's true, you know much more about it than I do.


I'm not the OP and I cannot provide any evidence. However, for a large number of controversial topics, I try to guess the top comment in my head before opening the comments page. My estimate is that 70-80% of the time I guess it right.


That would be evidence that people on the internet are predictable, not evidence of manipulation.

Predictability on the internet, which increases both with group size and with the divisiveness of a topic, is also a huge problem for HN [1]. But it's not the problem we've been discussing in this thread, and I think it's important to make clear distinctions between the issues. It's common for people to leap from some other issue to "astroturfing! shill! spy!" explanations, instead of facing the original issue.

[1] Here's why, if anyone wants an explanation. Predictability is the enemy of curiosity. Worse, when discussions are predictable, there isn't anything intellectually interesting in them, and the mind seems to resort to flamewars to amuse itself in the absence of anything better to do (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...). So predictability destroys community as well.


I see a lot of “I can’t be hypnotized” comments here. I’ve got bad news for you: you can be hypnotized, advertising works, and comments by astroturfed accounts will affect your view of the world. This is a serious issue for those of us who enjoy content aggregators. What I want to know is what the owners of this site can do to assure readers of the effectiveness of their countermeasures.

Given that I don’t think I’ve ever seen a negative news story about Reddit on Reddit, I think I know their approach.



