Hacker News
Reddit founders made hundreds of fake profiles so site looked popular (2012) (arstechnica.com)
139 points by qzervaas on June 11, 2023 | hide | past | favorite | 40 comments



Many small startups generate fake traffic, either for testing or for marketing purposes (e.g. cheating on the numbers).

I would be surprised to find any successful company that had no shenanigans in their origin story.

My own history includes a startup that shipped empty boxes to meet numbers, one that scraped thousands of emails from more popular websites to sell as its own traffic, and even one that forged stock certificates to secure funding. (The FBI ended that one.)


LinkedIn famously abused OAuth permissions to take over the email accounts of their users and send invites to their service to everyone in the contact list.

Please don’t normalize this. Just because it has many famous examples does not mean it should ever become socially acceptable. Fuck every company that has done this.


> abused oauth permissions to take over the email accounts of their users

Woah, never knew this. Do you have any reference? I cannot find anything about it.



Well, it definitely goes against the economic propaganda that the US fed the rest of the world: that you can become wealthy through hard and honest entrepreneurship.


I'd guess that this often happens when companies are failing. For every one that laughs about it years later with billions in market cap, there are 100 that failed.


> I would be surprised to find any successful company that had no shenanigans in their origin story.

And it's not just in tech. New restaurants hire actors as customers to appear busy and popular. Publishing houses and authors buy their own books to climb the best-seller lists. Clubs give out free tickets. Studios buy seats or entire theaters to make it seem like their movies are selling out. The older you get, the more the naive, idealized world of business recedes and the stark "fake it till you make it" world emerges. Everyone fakes it. But not everyone makes it. Taken to the extreme, we get Elizabeth Holmes and Theranos.


I've been considering using Reddit data to pre-seed the content in a successor to Reddit. Though I am unsure how that would stand legally.

As a side note, I created an alternative Reddit API[1], and Reddit didn't like that so much that they banned my 13-year-old Reddit account.

1 - https://api.reddiw.com


IANAL. For the US, users grant Reddit a license to use their content when they post it. The users still own that content. Reddit's license does not extend to your reuse of it[0], nor have the underlying users directly granted you permission, so it would not be legal (in the US) for you to reuse it like that.

[0] "you may not... license, sell, transfer, assign, distribute, host, or otherwise commercially exploit the Services or Content" https://www.redditinc.com/policies/user-agreement-september-...


Wouldn't that mean it would be down to the individual users who still own each bit of content to issue a DMCA takedown if they objected?

I imagine the number of such requests would be small.


Ah. The old “I did so much copyright violation it would be infeasible for everyone I took content from to enforce” defence. I see nothing that could go wrong.


Posting that you’re going to be “using Reddit data to pre-seed the content” may make it a bit harder to dodge Reddit in court.


Although prompting "write a comment replying to the text '<snip>' in the style of u/landfe" would yield something uncopyrightable…


I was chatting about this with some friends. If we had a million or so to spare, we could just fork Reddit: grab the latest open-source version of Reddit, pay the Pushshift guys for the most up-to-date dump they have, and load it in.

Make a system for claiming your old Reddit account. I'm guessing that if you try to use OAuth, Reddit will just ban you. So you need to get creative: probably make an extension that grabs the user's session ID from their cookies or something (or let people copy-paste it in if they are technical enough).
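The copy-paste variant of the account-claim flow described above could be sketched, very hypothetically, like this. The `reddit_session` cookie name and the `/api/me.json` endpoint are assumptions based on old Reddit; either could change or be blocked at any time, and the request building is kept separate purely so it can be inspected offline:

```python
# Hypothetical sketch: verify a pasted Reddit session cookie by asking
# Reddit who it authenticates as. Cookie name and endpoint are assumptions.
import json
import urllib.request


def build_claim_request(session_cookie: str) -> urllib.request.Request:
    """Build the identity-check request (separate so it can be tested offline)."""
    return urllib.request.Request(
        "https://www.reddit.com/api/me.json",
        headers={
            "Cookie": f"reddit_session={session_cookie}",
            "User-Agent": "account-claim-sketch/0.1 (hypothetical)",
        },
    )


def claim_account(session_cookie: str) -> str:
    """Return the username the cookie authenticates as, or raise on failure."""
    with urllib.request.urlopen(build_claim_request(session_cookie)) as resp:
        me = json.load(resp)
    name = me.get("data", {}).get("name")
    if not name:
        raise ValueError("cookie did not authenticate")
    return name
```

A real claim system would then link that verified username to a fresh account on the fork; nothing here dodges the ban risk the parent mentions.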

Fun to imagine but unfortunately probably won't happen.


No one will use it


Just launder it through an LLM, problem solved.


Indeed. Could call it something like the RedditCrawl corpus.


You don't even need Reddit with an LLM. I did some back-of-the-napkin token math, and you can fake a year of activity for a couple thousand dollars (varying with the number of users and comment length, of course). Hell, you can even make it look active in real time and respond to real users. As long as you give it some guidance about commenting style (i.e., not the default GPT eighth-grade-essay style), it's very hard to tell.
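The back-of-the-napkin token math the parent describes can be sketched like this. Every number below is an illustrative assumption (user count, posting rate, token budget per comment, and per-token price), not a figure from the comment; swap in current model pricing before trusting the total:

```python
# Rough cost estimate for faking a year of forum activity with an LLM.
# All figures are illustrative assumptions, not real quotes or prices.
USERS = 2000                  # fake accounts
COMMENTS_PER_USER_PER_DAY = 3
TOKENS_PER_COMMENT = 400      # short comment plus prompt/context overhead
PRICE_PER_1K_TOKENS = 0.002   # USD, assumed budget-model pricing

daily_tokens = USERS * COMMENTS_PER_USER_PER_DAY * TOKENS_PER_COMMENT
yearly_tokens = daily_tokens * 365
yearly_cost = yearly_tokens / 1000 * PRICE_PER_1K_TOKENS

print(f"{yearly_tokens:,} tokens/year ≈ ${yearly_cost:,.2f}")
```

With these particular assumptions the total lands around $1,750 a year, which is consistent with the "couple thousand dollars" ballpark; the cost scales linearly with each of the four inputs.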


Adversarial interoperability like this would be a great way to neutralise network lock-in effects and create a more level competitive playing field between social media companies. I think we should enshrine protections for this kind of thing.

There was a strong 2019 precedent in favour of allowing this kind of scraping of public content (from LinkedIn in that case): https://www.techdirt.com/2019/09/10/big-news-appeals-court-s...


> As a side note, I created an alternative Reddit API[1] and Reddit didn't like that so much they banned my 13 year old Reddit account.

"I broke Reddit's TOS deliberately and repeatedly and they banned me!" is another way to put it. But it doesn't sound as good and because of the current zeitgeist people will tend to side with you anyway. Perfect timing for you :)


Having first rephrased it all via ChatGPT.

Load up those liabilities.


Do you mean using ChatGPT this way would also be a liability?


Their eventual replacement will have to employ similar shenanigans, right? It's basically tradition among social media unicorns.

Except now, of course, one can build MUCH more interesting tools for pre-seeding communities. Should be fun!


They are doing the same now... Reddit will end the same way it began.


They don't need to; the incentives are aligned, just as they were for Twitter when it was a public company. Banning reasonably well-behaved spammers (the competent ones) would only serve to lower the IPO price.


I wasn't talking about spammers... I'm talking about content created by Reddit itself.


I dunno, as someone generally disillusioned by tech and its shady practices/ethics, this doesn't seem all that bad to me. It's a little gross but maybe a necessary evil in this day and age.


To me, it was gross when I first learned about it, and it remains gross today. Just because approximately everyone[0] seems to be doing it, doesn't mean it's right.

Now, I would have less of a problem with this if all the founders and CEOs would openly admit, "this is how business works", "we do what we must", or "we don't call someone a business shark for being nice and friendly to the fish". But they don't. Instead, they go on stage, or commission press releases, blog posts, and other content, talking about changing the world for the better, about excellence, moral virtues, diversity, inclusion, responsibility, sustainability.

It's all bullshit. Why should I believe they mean it, when they have a long history of lying to or bullshitting people, dating all the way back to the first days of their companies, and they're 100% fine with it?

--

[0] - At least in the group of companies that would be discussed on HN and similar places.


I don't expect CEOs to mean anything they say. But I do think this infraction is minor compared to everything else that goes on. Overall, I agree with you.


> Huffman explains the strategy in a video for Udacity, an online education service.

The Udacity web development course was my introduction to coding. It blew my mind. He was a good teacher.


So does every social network.


It's a solid technique. I did the exact same thing for a news site I started with comments on articles and the forum.

No one wants to hang around a 'dead' site, but post enough activity, post enough 'controversial' comments and people will want to join in, and then suddenly you have an actual ecosystem of real people driving traffic.


I don't see a problem at all. You need to bootstrap a UGC platform somehow.


Price Club #1 on Morena Blvd., San Diego, asked employees, friends, and family to park cars in the lot. It seemed to work. Of course, the financials weren't reporting average daily cars in the lot either.


This was something that was (mostly) praised in the start-up community up until now.

I guess every big-boy internet entity is losing a layer of shine in 2023.


This was well known


Dating apps do the same. It's kinda easy to spot a bot. I wonder what GPT-powered bots will do on such apps.


Well, dating sites and other sites that needed to bootstrap to critical mass used to do it all the time. YouTube filled their site with copyrighted clips.

With AI, it would be far easier to bootstrap a plausible-looking site full of “active users”, and unscrupulous startups might do that.

That's why we built this, which is at least the most ethical approach we could come up with: a service for any community to encourage discussion on ghost-town topics, while clearly disclosing its bots:

https://app.engageusers.ai

In the broader debate about generative AI, this seems to be one of the least harmful (it has responsible disclosures) and most helpful (https://xkcd.com/810/) approaches.

https://news.ycombinator.com/item?id=35779455

PS: feel free to use it if you own a Discourse forum, we would love to hear feedback


And all these years later the site is overrun with botters and fake profiles


Fake it till you make it!



