Twitter Followers Vanish Amid Inquiries into Fake Accounts (nytimes.com)
336 points by ganlad on Jan 31, 2018 | 167 comments



Twitter is in a dire situation. As a fun project, I wrote a Lua/Torch bot to search for certain tweets and like them based on sentiment analysis.

I realized that API query results were mostly news bots, retweet bots, corporate PR bots, social media aggregator platforms like Buffer, and just plain old spam bots.

How bad was it? After filtering 1,000 tweets per query, I barely found 10-20 real human users. That signal-to-noise ratio is dismal, and detrimental to the core product experience. Twitter seems compelled to maintain this fake activity to prop up the share price.
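For the curious, the filtering pass worked roughly like this in spirit. A minimal Python sketch (the real bot was Lua/Torch, and every field name and threshold below is invented for illustration, not Twitter's actual API schema):

```python
# Toy bot filter: score accounts on simple spam signals, keep likely humans.
# All thresholds and dict keys here are made up for illustration.

def likely_human(account: dict) -> bool:
    """Crude heuristic: reject accounts that look like bots or aggregators."""
    bot_keywords = ("bot", "news", "deals", "follow back", "rt to win")
    bio = account.get("bio", "").lower()
    if any(k in bio for k in bot_keywords):
        return False
    # Accounts tweeting hundreds of times a day are rarely humans.
    if account.get("tweets_per_day", 0) > 100:
        return False
    # Pure link-spammers: nearly every tweet contains a URL.
    if account.get("link_ratio", 0.0) > 0.9:
        return False
    return True

def filter_humans(accounts: list) -> list:
    return [a for a in accounts if likely_human(a)]
```

Even simple signals like these knock out the bulk of the noise; the hard residue is the accounts deliberately built to look human.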

BONUS: Guess who else is spamming their post feed: Tumblr. Tumblr didn't allow any adult content or keyword search; since Marissa Mayer took over she seems to have loosened that policy to fluff the numbers. Tumblr today is drowning in porn.


> BONUS: Guess who else is spamming their post feed: Tumblr. Tumblr didn't allow any adult content or keyword search; since Marissa Mayer took over she seems to have loosened that policy to fluff the numbers. Tumblr today is drowning in porn.

...what?

Tumblr was known for porn long before it sold to Yahoo. If anything, Tumblr started cracking down on blogs with adult content afterwards (for example, requiring users to log in before visiting them). There was a huge backlash from artists and bloggers with non-pornographic gay-themed content, because many of them were caught in the ripple effects from these changes.

I had friends who worked at Tumblr before 2012, and the running joke was that Tumblr was 50% porn and 30% pictures of cats. (Don't take those figures too seriously, but clearly porn was a large part of Tumblr long before the acquisition).

Also, Tumblr definitely did allow adult keywords in searches. I know this because of a rather unfortunate incident at a student hackathon my company sponsored, in which a student thought it would be a "funny" idea to search for a risque phrase when demonstrating his weekend hack (an aggregator of Tumblr posts).


The porn bots have only gotten worse, though. In 2012 it was at least mostly human-run porn blogs, and they didn't start following you if you were just a regular blog.


That's true and it needs to be dealt with, but OP is definitely speaking outside their domain of knowledge, because Tumblr was always drowning in porn. OP didn't specifically mention bots, which leads me to believe they are speaking of porn in general.

At boarding school many moons ago, we all used Tumblr to get our rocks off because most porn sites were blocked.


One of Tumblr's differentiators is allowing adult content. It's very hard to find places friendly to that when you're looking to distribute your artwork and get a following going.

Thankfully more and more websites are based out of other countries now and don't need to swim in the puritan current of the US.


Try hosting such artwork out of India.


I have spent several years working on a product (https://www.rapidcrowd.co) that cuts through the noise (bots, fake accounts, inactives) of Twitter to find real users that fit related topics - and your rough estimate of 20 real users in 1000 tweets isn't too far off.

For this reason, trending topics and keyword search are essentially hijacked features.

However, I believe there are many useful bots with organic followings in the millions, so I don't believe they should simply be removed from the ecosystem.

Instead, my suggestion would be a 'bots' account type. Some ideas:

- A robot version of the 'blue checkmark'. This would allow users to quickly identify a tweet as sent from a bot.

- This account type could be linked to a real owner's account, much like Twitter apps are. Accounts that are flagged and fail to register as a bot could be subject to deletion.

- Bots would automatically receive low ranking in search queries, and trending topics. Perhaps they would be completely delisted.

- Bots cannot follow other users.

- Bots cannot tweet at (@-mention) other users.

More extreme:

- Bots cannot tweet without some sort of spend. Maybe they can only tweet in some ratio to the real Likes they receive. This is a bit extreme, but would mitigate a lot of problems.
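To make the "bot account type" concrete, the rules could be sketched as a simple policy check. All field names and the like-to-tweet ratio below are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Account:
    is_registered_bot: bool  # opted into the proposed bot checkmark
    likes_received: int      # likes from real users
    tweets_sent: int

# Invented quota: a registered bot earns one tweet per 10 real likes.
LIKES_PER_TWEET = 10

def may_tweet(acct: Account) -> bool:
    if not acct.is_registered_bot:
        return True  # humans are unrestricted
    return acct.tweets_sent < acct.likes_received // LIKES_PER_TWEET

def search_rank_weight(acct: Account) -> float:
    # Registered bots are demoted (or delisted) from search and trending.
    return 0.0 if acct.is_registered_bot else 1.0
```

The point of the quota idea is that a bot's reach becomes proportional to demonstrated human interest, which starves spam-only accounts without deleting useful ones.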

I really believe a happy medium could be found. I currently think that a well-curated Twitter timeline is amazing, but as I stated, search results and trending topics are completely broken.


When I first joined Twitter a few years ago I tried the ‘search near me’ feature a few times.

Weather bots. For any city within 100 miles of where I am. Plus bots posting job listings. Plus companies posting those same job listings.

There was basically no signal to find, it was all noise. The few ‘legitimate’ ones I found were from local PD/FD.

I think they just need to ban all bots/automated postings. Or make them filterable and require a $100/mo account and $1/tweet. Something to discourage the absolute garbage.


But these bots exist because some people actually use Twitter as a newsfeed for that sort of thing. And being partly an RSS substitute is surely part of Twitter's business model these days. Not suggesting that many of these accounts weren't still absolute garbage, but the signal/noise ratio isn't always great with obviously human-run accounts either.

The other issue is that the distinction between a bot and a human isn't clear-cut: there are plenty of shades of grey between something which spits out badly-scraped listings all day and an actual human having noncommercial conversations with Twitter friends. It's more a matter of classifying what's bad behaviour (using the pornbot tactic of mass follow/unfollow to attract attention, even if you're a human marketer tweeting actual content, and even if you haven't written a script to do it) versus what's perfectly acceptable automation, like tweet schedulers. That classification could use some work.


>But these bots exist because some people actually use Twitter as a newsfeed for that sort of thing.

I would dispute that. I'd argue that the bots exist because spamming Twitter is free. If it costs me $0 and I get even a tiny benefit out of it, then it is to my advantage.

It's essentially the same problem as email spam.


We tweet the scores of our games automatically from our backend. There's no profit in it, but our members appreciate it based on follows and retweets. That's real people following and retweeting. I actually comb through and remove obviously fake accounts from following us.

I think this is a legitimate use of a bot. We even mention the host club because they want us to.

What's funny is that the account got squelched 3 times before we got a human at twitter to officially prevent us from getting flagged. So they do definitely have some measures in place to prevent spam accounts. I suspect it's become non-trivial to identify all the bad actors.


That sounds fair. Maybe it’s more of a volume issue. You get one or two ‘bot tweets’ per day without paying.

I know lots of people also use bots for cross posting from Instagram or something else. Or to post when they put a new article up on their site.

I’m sure you’d have to allow them to some degree. But there are some really noisy bots out there that need a fee attached to them.


I was signing up for your service until I saw the permissions your app wants. Permission to update my profile? See my DMs? It doesn't explain anywhere I could find why it needs these permissions.


We do not have any application functionality relating to modifying profiles.

We do request access so you can send Direct Messages from your dashboard. We are considering removing that functionality for the sake of privacy.

Unfortunately, the Twitter application permission model is not granular:

https://developer.twitter.com/en/docs/basics/authentication/...

We would prefer to just have write functionalities and not read, but this is not possible in their model.


How is what you are describing any different from sponsored tweets?


You don't have to follow a bot, so you will never see their tweets in your timeline. You can't remove a sponsored tweet from your timeline.


I haven't used an official Twitter client for a while, but you certainly used to be able to block the account making a sponsored tweet, which will remove it from your timeline.


A lot of Tumblr porn is actually very specifically (and astutely) curated. The few times I've visited, it's actually a much nicer experience than porn websites. If anything, I think Tumblr would be wise to lean into being, among other things, a space for the sharing of erotic content. Maybe there's some spam but it's pretty easy to find your way into non-spam on Tumblr in my experience.


Isn't that a bit thorny since most of the content is copyrighted...


My understanding of the Tumblr porn content is that they’re usually small clips and I believe under so many seconds doesn’t count under copyright infringement laws. However, nothing would be a problem if it was original content that had waivers and age of consent forms and such on file.


> I believe under so many seconds doesn’t count under copyright infringement laws

Careful here. This appears to be referring to the concept of fair use, but firstly there are multiple criteria used to judge fair use (https://en.wikipedia.org/wiki/Fair_use#U.S._fair_use_factors), secondly these criteria are (intentionally) subjective and up to a judge's interpretation, and thirdly fair use is US-only.


The term fair use is US-only but many nations have similar concepts or concepts that overlap with it. Australia's 'fair dealing' policy allows for the use of copyrighted material without seeking approval if its purpose is in satire, research, reviewing, media criticism, or news reporting, for example. What's notable there is that length or amount used aren't as important, which has some positive effects but also some important negative ones (it would not be possible to create Google in Australia because taking the summary snippets and image thumbnails has no legal justification). Interestingly in the last big debate over loosening Australian copyright law and adopting broader fair use, the American MPAA was the biggest funder of opposition efforts.


> thirdly fair use is US-only.

That seems fine. In the interest of their userbase, companies should aim for the least copyright-damaged user experience possible; this means picking a single country (ideally one with liberal copyright law) and ignoring copyright law in other countries they aren’t based out of. If countries want to force their censorship standards, they can at least be honest about it and block the website (rather than silently deflecting the responsibility of censorship to the website itself).


Not your problem when the content is user-uploaded.


tell that to piratebay


No.

They brought the heat on themselves when they decided to mock DMCA requests and cease and desists instead of accommodating them.

There's a reason 4chan, Reddit, Imgur, Tumblr, and their ilk all still exist in the age of copyright. None of them produce their own content; it's all user-submitted, and mostly in violation of some copyright or another.

Generally speaking, my understanding is that hosts aren't liable for user-uploaded content unless they curate or promote it (deletion notwithstanding). That changes their role to that of a content distributor/publisher instead of a mere platform. This is why Backpage's CEO got arrested for human trafficking whereas Craigslist's did not: when challenged, Craigslist shut down its prostitution ads, but Backpage actively reworded and posted them, and in doing so became their publisher.


The content uploaded to the pirate bay wasn't (isn't) copyrighted either.


Most of the content on porn sites is, though.


Most content on the internet, too. Every image meme and reupload of other images is basically also a copyright violation.


This doesn't mean a company Tumblr's size can ignore it. Reddit, Google, Facebook, et al all need to consider these things as they operate.

I'm not saying they can't skirt the laws a little bit and get away with it, all I'm saying is they need to have awareness.


Nah, they just need to be vigilant about DMCA requests.

Due to the Safe Harbor rules in US copyright laws they don't really need to remove copyrighted content proactively. There's tons of subreddits exclusively dedicated to piracy that they don't care about.


Everything you say, and everything you think, is a copyright infringement. Pay up!


I've yet to figure out why anyone would consume any kind of content on Twitter.

I've used it for a while, and my takeaway is that it's good for (and people use it for) spamming others about your projects or showing off. However, if you try to use it to get news or updates on anything, it is the least efficient, most stressful thing I've ever used.

I see Twitter as a good tool for outages, natural disasters, and protests. That's pretty much it.


> However, if you try to use it to get news or updates on anything it is the least efficient, most stressful thing I've ever used.

Depends heavily on the set of people you follow. I've found it to be a great source of news, and I typically see news show up there hours to days before I see it show up in places like HN.


For instance, if you follow sports, it's an extremely efficient and direct way of keeping up-to-date with teams and players.


Twitter is the most awesome, amazing source of news and updates I've found. For example because I follow Tavis Ormandy on Twitter I know that a vulnerability related to bittorrent will soon be released by Google's Project Zero.

That said, it took months to get my feed to where it is today. It's not easy for each person to find the mix of accounts that is best for them. My recommendation is, be fast to follow folks who look interesting, and fast to unfollow folks if they are boring or you don't like them. When you find folks you like, see who they retweet, reply to, follow, etc. and follow all those folks to see what you think.


Are there services out there which curate the available channels to ensure quality and content? Say I wanted to follow a particular sport or team: are there services which can set that up? Same goes for any subject.


That would be very subjective, but it probably exists. I am a heavy user of Twitter's Lists feature, and have columns in TweetDeck for friends, local, sports, politics, colleagues, etc., and I try to curate these for my interests as best as possible, i.e. weed out overly noisy tweeters.

I guess you can find other people's lists and follow them? E.g. a sports journalist's sports lists etc. I have not tried to do that so not sure if there are hurdles to overcome. And maybe someone can aggregate these public lists for others to find/follow?


> Say if I wanted to follow a particular sport or team, are there services which can set it up?

It's called ESPN.


ESPN only really covers one country. If you're in the US I'm sure it's fine, but for everyone else the coverage is not only almost nonexistent but often factually wrong when it does exist. I saw AFL coverage on ESPN where they did not understand the distinction between goals and points and left the wrong scores up for ages.


I use it to follow a relatively small (a few dozen) group of mostly computer scientists with a few cooks/chefs and basketball writers. At this level it takes maybe 1-2 minutes to read an entire day's feed, and I usually end up with a few interesting things to read that I might not have seen otherwise. I have also found it useful at conferences to follow what's happening.

I have no idea how people follow more than say a hundred people profitably.


Yes, no idea.

What I find awful is that HN and Reddit save me time by sorting out low-quality content, while on Twitter you're the one that has to work hard to do it (if you can), and Twitter itself just adds more noise in the meantime.


I use it to follow artists, local media personalities, local journos, etc. to get live information from them.

This is particularly relevant for hyper-local media which basically has zero media footprint outside of dead-tree newspapers and talk radio.

Basically: follow your city councilor on Twitter. Start from there.


The only reason I use twitter is because it's where the forum culture of the 2000s has migrated. All the good SA posters pretty much only use twitter now, so it's the only place for that sort of content.


What's "SA"?


SomethingAwful - the accounts they meant are probably guys like @arr, @livestock, @dogboner, @sexyfacts4u, @dril


Yup, thanks.


I use it mainly to keep up with streamers and fellow artists, along with a good amount of news from various aggregators. I also have interesting conversations on there occasionally.

Twitter is what you make it. I'm sincerely worried their VC-backed, top-heavy company is going to topple over and carry with it to oblivion a service that could probably run on 1/20th of its infrastructure.


For sports, it's the fastest way to get news - if you follow the right people.


I was using it to get notified every time my favorite columnist put up something new. Unfortunately, the publisher started requiring registration to read stuff.

But I see it as a way to keep track of what a columnist, journalist, or public figure is doing.


Twitter is quite excellent for some communities. For Javascript and politics, for example (and also javascript politics), Twitter has become the place where the news actually happens, instead of just being disseminated.


Or it could be that Twitter cares about fake account impact mainly in terms of user experience. If a fake bot likes a tweet from another fake bot, does any human actually care?

I used to have a much bigger problem on Twitter with fake accounts and bot likes than I do now, and despite my decades of willingness to bitch about spam, I have to concede that they've gotten better lately.

There's also a user benefit to letting bots run when they're not hurting anyone: you don't give hints to the spammers on how you're finding and nuking them. Indeed, if you identify them but don't block them, you can get a lot of data on what is spam. For example, on my mail server, I noticed I was getting a lot of dictionary attacks. So I took the hundred most common first names not in use on my domain and fed them all into the spam training system. That means odds are very good that my spam trainer will have seen a piece of spam before they try my actual account.
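For anyone who wants to try the mail-server trick described above, here is a rough sketch. The `trainer` object and its `train_spam` method are hypothetical stand-ins for whatever spam filter you actually run:

```python
# Honeypot trick: any mail addressed to a common first name that has no
# real mailbox on the domain is, by construction, spam. Feed it to the
# trainer so the filter sees a campaign before it hits a real account.
COMMON_FIRST_NAMES = {"james", "mary", "john", "patricia", "robert"}
REAL_MAILBOXES = {"alice", "bob"}

HONEYPOTS = COMMON_FIRST_NAMES - REAL_MAILBOXES

def classify_and_train(recipient_local_part: str, message: str, trainer) -> str:
    """Route mail; honeypot recipients get their mail trained as spam."""
    if recipient_local_part.lower() in HONEYPOTS:
        trainer.train_spam(message)  # hypothetical trainer API
        return "spam"
    return "deliver"
```

The same logic carries over to the Twitter case: identified-but-unblocked bots act as free labeled training data.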


You get to see a flood of spambots if you make the mistake of clicking any of the trending hashtags, but otherwise yeah it's a non-issue for the vast majority of Twitter usage.


> If a fake bot likes a tweet from another fake bot, does any human actually care?

Very nice!


What's crazy is that bots are not violating Twitter TOS. I reported a bunch of fake accounts I found and was told they aren't breaking any rules, even ones just posting spammy links.


Well, yeah. If they're just tweeting and not @-mentioning other users, they can't really "spam" anyone, since only their followers, who choose to see them, will see those messages.

Twitter does take a pretty dim view of DM and @reply spam though because it is genuinely annoying and not opt-in.


Is follow spam targeted? Because most of my interaction with obvious bots is follow-churn.


The trick with Twitter is that basically these bots are invisible unless you search hashtags or read the replies to a popular person who is spammed by them. It's not like anybody sane would follow these bots.

Twitter's model is basically "shouting at the universe" but if nobody listens it doesn't matter.


The problem is that investors do listen to these metrics, and nobody really knows how much noise is in the data being shared.

I 100% guarantee Twitter knows the exact ratio of real to unreal users and their content.

They're also trying to slowly crack down on bots.

A year ago they insta-locked your account if it looked like automated liking of content.

Now they're insta-locking if you look like a follow/unfollow bot.

It's a balancing act, and I feel like it's analogous to getting out of extreme debt. They're trying to replace bots as their real user base grows internally.
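A follow/unfollow bot is straightforward to flag in principle. A hypothetical detector, with invented window and threshold values, might look like:

```python
from collections import deque
import time

class ChurnDetector:
    """Flags accounts that unfollow most of their recent follows quickly.

    Window, threshold, and minimum-event values are illustrative only.
    """
    def __init__(self, window_secs=86400, churn_threshold=0.8, min_events=50):
        self.window = window_secs
        self.threshold = churn_threshold
        self.min_events = min_events
        self.events = deque()  # (timestamp, "follow" | "unfollow")

    def record(self, kind, now=None):
        now = time.time() if now is None else now
        self.events.append((now, kind))
        # Drop events that fell outside the sliding window.
        while self.events and self.events[0][0] < now - self.window:
            self.events.popleft()

    def looks_like_churn_bot(self):
        follows = sum(1 for _, k in self.events if k == "follow")
        unfollows = sum(1 for _, k in self.events if k == "unfollow")
        if follows + unfollows < self.min_events:
            return False  # not enough activity to judge
        return unfollows >= self.threshold * follows
```

The balancing act is all in the thresholds: set them too aggressively and you lock out humans, too loosely and the churn bots sail through.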


Using Twitter for search/discovery by hashtag or keyword is totally pointless.

I follow specific people that I've discovered outside of the network, or by referral, and as a result my feed is basically "all signal".

This seems like the only reasonable way to use Twitter now.


>After filtering 1,000 tweets per query, I barely found 10-20 real human users.

I am kind of surprised the number of human users is so high. I know a lot of bloggers of various sizes. To the best of my knowledge virtually all of them have hooked up one of the available services to post on their behalf. They either spend a little time scheduling out their tweets for the next week/month then forget about twitter until their schedule runs dry OR they set something up to randomly pick from some pool of blog posts and spam links.

Either way, they then essentially never go on Twitter again once things are up and running. The whole thing is full of bots talking to each other.


I occasionally fetch works from Japanese artists (SFW and not) on Twitter, since not all of them use pixiv, and as far as my browsing experience goes, their retweets, conversations, mentions, and so on all seem to point to other humans.

Likes, followers, and accounts yelling into some hashtags can be dominated by bots. But if you look at the feeds of individual people, and those they talk with, it's a totally different experience.

> Tumblr today is drowning in porn.

Tumblr has had NSFW for a long time now. If anything some of it may have migrated to patreon now that it is easier to monetize.


Isn't that kind of expected? The rate of real users pretty much matches the rate of real traffic on the web in general.

It would also be interesting to run a similar analysis on FB; I'd expect similar rates there.


I would love to get copies of your script if you wanted to email a link. I'm not so interested in bots but I am interested in social network analysis.

I don't know about Tumblr wrt porn - my impression is that there's a lot of sexual material there but that it's community-driven rather than commercial, partly due to the demographics of its user base.


> Tumblr today is drowning in porn.

'Drowning' in porn? In the sense that that's a bad thing?


It's so obvious that social platforms with business models based on number of ads/impressions like Twitter/FB are not incentivized enough to remove fake accounts, yet there seems to be very little public discussion or outrage about it. I agree with Mark Cuban here [1], they should do more to make sure each account has a real user behind it, even if it means less revenue. It's just the right thing to do.

[1] https://twitter.com/mcuban/status/957686987229618176


> It's just the right thing to do.

That sounds like a sweeping statement with little forethought when you're talking about a platform that enabled protests in dictatorial regimes.

Step back here: Fake users, sold by the thousands to give credibility and a megaphone to whoever shells out money: bad. Identity verification: A solution with many consequences to be weighed before jumping on the bandwagon.

And dictatorships aside, I have no desire to give Twitter my ID or passport.


I mean... we are the best of the best of the best, top of the class when it comes to technology and especially creative solutions enabled by technology. We have world-leading machine learning capabilities and hundreds of engineers working around the clock for money that, not that long ago, would have been completely unheard of. We have talent. We have time. We have money. We have passion. We have absolute shitloads of data on everyone, whether they use our platform or not.

We don't need a passport or state-issued ID to determine if someone is a real user. That's the laziest solution any tech company has ever come up with, and it's deliberately lazy.

Facebook and Twitter can already tell if you're a human or not; shit, Google lets you just click a button to tell them you're a human, and then they determine whether they believe you or not. All based on info they already have.

We're the best software engineers the world has ever seen on the cusp of an AI revolution... we don't need your passport. We just don't want to find out the truth, so we make it so hard no one will do it.


When I last checked my google ad profile (and this was a while ago; now it contains a lot less info probably because I opted out of a bunch of stuff), it was full of errors. Google did get my gender right but they put me in the wrong age range and a bunch of interests were out of whack. Good thing this was for ads, rather than a heuristic to ban me.

Best of the best, top of the class is not infallible. It's good enough to sell, because even something as low as 80% accuracy is fine for things like "Do you want to subscribe to weekly pop TV news?". It's not good enough when at the other end you get your account banned.

Google has, on this very site, built up a horrible reputation of using automated processes and having too many users to give those affected by those processes some good recovery. What you're describing is a recipe to replicate that.

99% is not good enough. 99.99% accuracy gets you 0.01% false positives. That's crazy low, right? It's also 100k users when you have 1bn users. Those fancy AI processes are nowhere near that.
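The arithmetic is easy to check:

```python
# False positives at "very accurate" classification rates, at Twitter scale.
users = 1_000_000_000
false_positive_rate = 0.0001  # i.e. 99.99% accuracy on real humans

wrongly_banned = int(users * false_positive_rate)
print(wrongly_banned)  # 100,000 real people wrongly locked out
```

And 99.99% is far beyond what bot classifiers achieve in practice; at a more realistic 99%, that figure grows to ten million.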


I have the same experience with ads all the time. Like, I get ads from Google to buy the Pixel XL 2 while I'm browsing the web on the Pixel XL 2 I recently bought from Google on the same account I'm currently signed in to Google with... Or Amazon will give me ads for books and things I recently purchased through Amazon.

I think a lot of their abilities are overstated.


It's not Google's fault. It's the people buying the ads who are too lazy or don't know how to do proper targeting.

I've met three people in my life who were employed primarily to place ads on the internet. All three were glorified secretarial drones who fell into the position because nobody else wanted to do it.


"Lazy"? It seems to me that the ad ecosystem is incredibly complicated (for example, not even Google understands how malvertising gets on Google's ad network).

Any person who knows how to effectively place Internet ads has such high-status skills that they don't have to actually do it.


> We have absolute shitloads of data on everyone, whether they use our platform or not.

We^W They may have shitloads of data, but I'm very skeptical we^W they can use this data in any meaningful way. At least, for now.

I don't know what's wrong, but despite all the data these giant ad companies are supposed to have, I never ever saw any relevant ads, even when I specifically wanted to see one. Unless I've already bought something; then, sure thing, I'm spammed with more of the same (which is, again, absolutely useless, since I've already made the purchase).

> We don't need a passport or state-issued ID to determine if someone is a real user.

I don't think we are even capable of coming to a consensus on what it means to be a "real user".

Am I? What about my alter ego, posting about something I don't feel like publicly associating with (like, porn)? What about a whistle-blowing throwaway account I may create if I learn something fishy? Or what about a "thoughts are my own but went through editorial" corporate representative persona account? And that's just the obvious cases.


You underestimate the difficulty and complexity of that problem.

I'm increasingly seeing this annoying trend of people hand waving about how a given problem can be solved with data/AI/ML/DL.


"I mean... we are the best of the best of the best, top of the class when it comes to technology and especially creative solutions enabled by technology. We have world-leading machine learning capabilities and hundreds of engineers working around the clock for money that, not that long ago, would have been completely unheard of. We have talent. We have time. We have money. We have passion. We have absolute shitloads of data on everyone, whether they use our platform or not."

- Now just imagine that speech pointed at the bad actors. That's why this is not as easy as you make it sound.


>We're the best software engineers the world has ever seen

Maybe some of you are, but people like me read HN too ;-)


As with my estimate of real-world influencers on HN ("a handful"), I'd suggest there's probably the same amount of "world best software engineers" on HN (and no, I'm definitely not including myself in that.)


I know you're not saying that anonymous social networks are /strictly/ better than verified, but to maybe round this thought out explicitly - I think it's potentially naive to think that governments won't be able to wield social networks even more effectively than "freedom fighters". Whenever they catch up. Arab Spring was round 1, but I feel like round 2+ may have already played out in state actors' favors. Who knows.

Relatedly: if you participate or "observe" any online communities of marginalized people, you may have noticed there are starting to be earnest conversations of whether or not "free speech" is actually more easily wielded to abuse them, than they can use it to have a voice.

Just food for thought. A lot of built-in assumptions are being tested right now.


Stepping back from the whole arab spring thing: Let's think about what identity verification actually means for the user.

Let's assume for a second Twitter has a magic wand that, without consequences, automatically deletes fake users used for marketing purposes.

Now what does identity verification actually do for the remainder of the users? In what way does it benefit them? In what ways does it inconvenience them? In what ways does it put them at risk?

Should there be a process in place, that is required of the entire userbase, just to fix the fake user problem? A problem that most users aren't even sort of aware of (otherwise the NYT article wouldn't have been as successful as it was).

We do a lot of reactionary things as a society that follow this exact pattern: Put massive processes in place to remove tiny bits of risk. Governments do that and justify it all the time, cf. the "Terrorism" or "For the children" memes. You often read people's complaints here on the ridiculous security theater that the TSA is, how it doesn't solve anything and inconveniences everyone for something that happens to people less often than winning the lottery.

This, is that. The pattern is: You have a tiny problem, you fix it with something which impacts your entire userbase because you didn't stop to think whether there are more targeted solutions, or even whether it's worth it.

[Note: I argue lower down that AI and statistics aren't the correct fix for this either... gotta keep digging. Maybe the "correct fix" is a mix of identity verification and statistical analysis.]


> I think it's potentially naive to think that governments won't be able to wield social networks even more effectively than "freedom fighters".

That exact thing is happening. The Reply All podcast did an excellent piece on the dominant political party in Mexico using armies of people on Twitter - not bots, but people - to drown out news they didn't like, and eventually, to harass people. Show, with full transcript: https://gimletmedia.com/episode/112-the-prophet/


Enabled protests in dictatorial regimes, sure. But let’s not overstate social media’s positive impact: nearly a decade after the Arab Spring, few governments outside of Tunisia look markedly different.

The ability of state actors to use social media to spread propaganda should also be considered in our tally of social media impact. The jury is still out on whether they are a net good.


This isn't an argument about whether social media is a net positive. This is about people, and whether Twitter protects its users at risk, or builds a system which just begs for government to request their real names.


It is an argument about whether lack of verification on these platforms is worth preserving. That is not proved convincingly by the instances of protests they allow, since this same practice also allows propaganda.


Would it be possible to do a 1-time verification, and then throw away the data? Twitter needs to see a driver's license or passport once, not keep a scan of it.


That seems fraught with peril also. Does Twitter do any check of the data? If not, you've probably just dramatically expanded the market for fake IDs. If so, can governments monitor such verification attempts and associate them with new accounts? If you're living under a hostile regime would you trust Twitter with your ID regardless of assertions of confidentiality?

And there are many people who lack a driver's license and passport. Are they disenfranchised?


> not keep a scan of it

What prevents me from using the same id in bot accounts then? Verification implies they have to keep some form of personal identification.


That's a different problem though. What if you could do identification that didn't involve giving someone your passport?

Authentication adds a potential cost-of-reputation to social media content that you post, which in turn may raise the value of the content. If there were a safe, secure way to do it, then why not?


Oh great, so now we’d be tied down by chains of conformity. How’s that for a surprising change of meaning to the phrase “Global Village”!


Could you explain how you'd be tied down by chains of conformity?


Speaking as an ex-Facebook growth employee, fake accounts actually hurt growth and are actively sought out and removed by a dedicated team. Think about it--if a user receives a bunch of fake friend requests, it's a bad experience. This is one of the (many) reasons MySpace died--because of the onslaught of porn-promoting accounts that they never cleaned up until it was too late.


Facebook certainly has been more diligent about stomping out fake accounts than most other services, with Twitter being social media's problem child.

A telling anecdote: A security researcher friend of mine found a somewhat small botnet of Twitter accounts (~7,000). He reported it to Twitter; a few months passed and he noticed Twitter hadn't done anything. So he turned it over to a journalist, who eventually poked someone at Twitter and... poof, all 7,000+ accounts were gone six hours later.


WTF? What can account for that? I'm not cynical enough to believe that they _encourage_ botnets.

Maybe simply no dedicated people or team for the problem?


I set up a fake Facebook profile a long time ago as a prank, added everyone at my college as friends, and people I know still say they get notifications about his birthday. And people still reach out to wish him a happy birthday.

I have no idea what credentials I used to create it so I can't delete it, but it still exists (unlike the person it's pretending to be).


I think the signature of those kinds of fake accounts is different than that of most of the malicious accounts, though.

You creating a fake college account and forgetting about it probably passes as human enough, there's no concerted effort or agenda to that account besides existing and adding friends at your college.

I think what makes bots and fake accounts generally detectable is consistently pushing certain messages in ways that exceed normal human behavior, as well as showing patterns across many fake accounts.

It's hard to see a pattern in one fake account, but easy to spot it across many.
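That cross-account check can be sketched in a few lines of Python. This is a toy illustration, not anything Twitter actually uses: the creation dates and the three-account cluster threshold are made up for the example.

```python
from collections import Counter
from datetime import date

def suspicious_creation_clusters(creation_dates, min_cluster=3):
    """Group accounts by creation date and flag any date shared by an
    implausibly large number of accounts. One account created on a given
    day means nothing; fifty created the same day is a pattern."""
    counts = Counter(creation_dates)
    return {d: n for d, n in counts.items() if n >= min_cluster}

# Synthetic example: five accounts minted on the same day, plus two
# organic-looking ones created years apart.
accounts = [
    date(2017, 3, 1), date(2017, 3, 1), date(2017, 3, 1),
    date(2017, 3, 1), date(2017, 3, 1),
    date(2014, 6, 9), date(2016, 11, 20),
]
print(suspicious_creation_clusters(accounts))
# {datetime.date(2017, 3, 1): 5}
```

Real detection systems look at far more signals (posting cadence, shared phrasing, follow graphs), but the core idea is the same: the signal only appears when you aggregate across many accounts.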


Twitter is full of those porn promoting accounts. Typically this is an account with some female name and a profile pic of a person in a state of undress following lots of popular accounts.

Twitter doesn't even take them down if you complain about them. Very annoying.


Not just porn. My account with ~150 followers was, out of the blue and within a week, followed by a dozen different "startups". Every single one of them follows tens of thousands of people in the hopes of getting followed back. It's pathetic.


I'm not sure that's quite the same thing. That seems to be an example, at least somewhat, of using the platform as intended.

If your profile is public, and they decide to follow it, in part, to make you aware of their existence, perhaps that's a feature not a bug, given that the main use for the thing is to connect with accounts and follow them and discover content.


I see random accounts occasionally following me, and it's clear that they're just casting a very wide follow net to see who will follow back, presumably to get their readership up. To me, that's spam, no question about it. Even if it's targeted. (Targeted spam is still spam.)

I want people to follow me because they're interested in what I post, not because they're looking for me to follow them. Perhaps I'm asking too much of the platform, but... it is what it is.
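The "wide follow net" shape described above is easy to express as a heuristic. The 10:1 ratio and the minimum-following cutoff below are invented thresholds for illustration, not figures from any real spam filter:

```python
def looks_like_follow_spam(following: int, followers: int,
                           min_following: int = 5000,
                           ratio: float = 10.0) -> bool:
    """Flag accounts that follow huge numbers of people relative to how
    many follow them back -- the classic follow-back fishing shape."""
    if following < min_following:
        return False  # too small to be casting a "wide net"
    return following / max(followers, 1) >= ratio

print(looks_like_follow_spam(following=48000, followers=900))   # True
print(looks_like_follow_spam(following=300, followers=40000))   # False
```

A ratio alone would misfire on new users who naturally follow more people than follow them, which is why even this toy version needs the minimum-following floor.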


I have had a Twitter account for I think close to a decade that I never use. I’ve never tweeted or followed anyone. Last time I logged in which was probably over 2 years ago, I had dozens of followers, all following nothing!


Yeah, that’s probably accounts managed by some random shady tool that follows tons of people in the hope of getting follow-backs, then mass-unfollows everyone.


> Think about it--if a user receives a bunch of fake friend requests, it's a bad experience.

Except that Facebook knows who is real and who is fake, just like my mail provider knows who sends spam and who doesn't.

Facebook can display ads to fake users and still make the campaigner pay for the impression, or make a group pay for reaching more users, fakes included.

Facebook can easily filter requests from fake profiles, so no, fake users do not necessarily worsen the experience for real ones.


"Facebook can easily filter requests from fake profiles" "Except that Facebook knows who is real and who fake."

You say that so definitively. Have you worked on a product with millions of new users a week? It's an extremely hard, constantly shifting problem.

"Facebook can display ads to fake users and still make the campaigner pay for the impression"

Facebook's entire business relies on user and advertiser trust. Why would they sacrifice that for some short-term growth that would inevitably kill the business by eroding trust?


While I don't pretend to have been party to internal FB conversations when certain decisions were made, as an advertiser, FB has definitely done some things to significantly erode that trust over the years.

For starters, encouraging advertisers to spend money to build up an audience with the assumption they could continue reaching that audience much like email, only to throttle organic reach to zero was pretty bad.

There have been other things, such as being extremely...generous...with how some ad metrics are defined and what defaults are presented. Even as an experienced advertiser who knows to look for those things, the lengths to which some of it is buried are astounding, to the point of it being hard to trust that it wasn't intentional. And the recent lawsuits around such things show I'm not alone in that feeling.


> You say that so definitively. Have you worked on a product with millions of new users a week? It's an extremely hard, constantly shifting problem.

Facebook has hundreds of engineers and enough data to match patterns against. Facebook can largely identify who's who.

> Facebook's entire business relies on user and advertise trust. Why would they sacrifice that for some short term growth that would inevitably kill the business by eroding trust?

Yes, the infamous "they trust me, dumb fucks"?


"Facebook has hundreds of engineers and enough data to do match patterns against. Facebook can largely identify who's who." I know, I used to work there. I was asking you. Pattern matching isn't black-and-white, as you alluded to in your previous comment.

"Yes, the infamous 'they trust me, dumb fucks'?" You didn't do or say stupid things when you were 19? You don't believe in giving second chances, let alone to a teenager?


Weirdly enough, and probably unrelated, I've had at least 4 fake friend requests from “attractive” women in the last 2 weeks on FB. Mostly these accounts disappear within hours or minutes. I never had that in my x years of being on FB.


It's worse than not being incentivized to remove fake accounts - they are _actively_ incentivized to be selective about which fake accounts they "detect".

How many of David Brock's bots have been removed from Twitter?


But why does it matter to a general user if there are real or fake users behind twitter accounts? It makes no difference to me whether I have fake followers or not, as I'm not going to be forced to see anything they're posting (or not posting). I can just ignore them and follow who I want.

It seems like it only matters from an investment stance.


I care because I get a notification when someone follows me, and I want my notifications to be a source of data with a high SN ratio. I want to notice that a someone has followed me so I can click through and decide if I want to follow back. If it's a bot account, I've just wasted my time, because I never want to follow bots.


The only practical way to do that is to take a credit card and run a test transaction against it. If Twitter had tried that it would have died in the nest. People wouldn't take the risk for an unknown benefit. A large portion of their current tweeters still wouldn't, or couldn't provide such proof.


> The only practical way to do that is to take a credit card and run a test transaction against it.

Doesn't really work everywhere in the world.


Ah, to solve the social media bot problem, we must first simply uplift the world to a unified economic system.


They just need to actually open up verification to everyone, instead of saying they will while in reality keeping the same requirements, which are so subjective.


Is a brand considered a "real name and real person"?


Yes, at least according to the Supreme Court of The United States.


In most jurisdictions, a brand is considered a "legal person". (Which is hilarious when you register your company, as the forms kind of treat the company as if it were a person.)


I don't see why not. A brand is owned by a company. That company has a registered name and real people working the social media accounts.



Why? Same applies.


There is no ReactJS company. There is a company behind ReactJS but that's irrelevant since there's also a company behind those millions of fake users.

What about my pet open source projects, shall I shut down their account because they don't have a company backing them?


I think you're arguing a point that no one upthread made. The proposal was to make sure that a real person was behind the account, in good faith - not necessarily that person's personal account. So your accounts and the ones you linked to upthread are fine. The "good faith" part means that there can't be a hard-and-fast rule for which Twitter accounts are okay. Human judgement will be required to sort out good-faith users (including brands or open source projects) from bad-faith users (paying people to run accounts that just inflate other accounts' numbers).


"It is difficult to get a man to understand something, when his salary depends on his not understanding it."

-Upton Sinclair


Bots will only increase as AI improves, and throwaways are important. They should definitely be honest about their numbers, though.


False dichotomy. Spam and impersonation are bad. Anonymity, pseudonymity, and tweet interfaces to bots that provide useful services are all hugely beneficial.

Twitter’s main strength is its low barrier to entry, for end users and developers alike.


> .. FB are not incentivized enough to remove fake accounts

What makes you think Facebook isn't prioritizing this? I'm not questioning whether you're right about this, just curious about what made you draw this conclusion.


"Hey, $MAJOR_CORP, advertise with us, you'll reach millions of your target market! Just pay $X."

"So how many of them are real and how many of them are fake accounts?"

"We work very hard to make sure Facebook is free of fake accounts."

"That's not the answer to my question.".


It's interesting you'd say that, considering that Facebook publishes an estimate of the percentage of fake accounts in its quarterly public filings.

Disclaimer - I work for Facebook.


Take this hearsay as you like.

Someone I am intimate with left Facebook in the last month, in anger at the way senior management up to and including MZ first dismissed the severity of the problem of their platform being used to push propaganda during the election cycle.

And then stonewalled and foot-dragged efforts to correct this. By the account I got, the only incentive for action has been unflattering attention at the national press level.

This from someone with a personal relationship with Z.

Make of it what you will. The poison is real.


> Someone I am intimate with left Facebook in the last month, in anger at the way senior management up to and including MZ first dismissed the severity of the problem of their platform being used to push propaganda during the election cycle.

Intelligent and reasonable people can disagree - strongly - as to the extent of the problem, whether it is a problem at all, or whether the cure is worse than the disease.

This also has little to do with fake user accounts.


Honestly, once you become that rich and powerful, I can imagine it's pretty difficult to not live inside a bubble. They probably don't understand how serious some problems are because they're not normal users. Also, it's probably hard to not let it go to your head and think that you always know what's best for the common rabble.


Propaganda is separate from fake users though.

I haven't seen a claim that fake users played a large role in spreading propaganda or fake news. As far as I understand it the issue lies in what people were sharing and what Facebook identifies as trending news.

That's a pretty different issue from Twitter's fake accounts.


I've seen stuff debunked by Snopes etc., so I link to Snopes to show my friend that what he shared was fake news, only to see him share the same link the next week. He admitted he hadn't read the article, but it agreed with his political views, so it must be true.

I've also seen fake tweets with bad spelling and other tells that make it obvious they were made by one of the fake-tweet generator sites out there, then shared on Facebook as an uploaded image. If they don't link to the original tweet on Twitter, consider it a fake tweet.


Your criterion for determining a fake tweet doesn't work if someone regularly tweets things and then deletes them. The combination of this behavior and abundant fake tweets allows people who like and who loathe a particular person on Twitter to build up opposing views of what is real.


Yes it is hard to determine fake tweets if someone keeps deleting their tweets.


There seems to be a discussion happening on Twitter with regard to authoritarian/oppressive governments subpoenaing Twitter to get a user's identity for voicing an opinion that isn't in line with the regime's. How do you solve this while giving every account an identity?


We have to be careful about botshaming people who have too many fake followers: it's pretty easy to buy your enemy a bunch of fake followers just to discredit them.

Fun story, years ago in my office, before buying followers was well known, folks would prank each other by buying fake followers for our co-workers. They'd wake up and be so happy and surprised and then have to spend the weekend manually blocking each one.

At the time it seemed pretty harmless, but now it is definitely a threat to someone's credibility.


Also fake SEO stuff, like making an illegal web farm copied from your competitor's site and using a spambot to post their URL everywhere so Google penalizes them in web rank.

I was on Uncyclopedia when someone did that to get them removed from Google.


https://news.ycombinator.com/item?id=13726214

"Link to this post in three years and ponder in its prescients... Web 3.0 will be born in the death of the heavily botted social networks."


What a profoundly insightful comment thread that is. Thanks for linking.


I have a (maybe) interesting twitter related anecdote. 5-6 weeks back I went to the twitter website and I was greeted with a login screen. I couldn't remember my password and didn't feel like finding it so I went off to look at other sites. That happened a few times, until I went back and POOF, I am automatically logged back in again. I've never seen a site un-invalidate an auth token before. Cool beans.


I had something similarly weird with amazon.

I got a new iPhone and on iOS Safari I needed to sign in to all my accounts.

Except for some reason Amazon recognised me with one-click enabled. I never used one-click to buy anything before and accidentally bought a kindle book while browsing the site.

More strangely, when I went to turn off one-click in my settings I was forced to log in.

So I could one-click buy without explicitly authenticating, but needed to authenticate to disable it. Very strange and/or shady.

Btw - is there an easy way to cancel an accidental one-click buy? In my case it was a local author I wanted to support anyway, so I’ll keep the purchase. But surprised it’s so easy to accidentally purchase something from the mobile site if you swipe to scroll on the wrong place.


FYI Kindle purchases can only be made via one-click. I hated when Amazon forced me to enable it to get a copy of Traction.


On a somewhat related note: I read the original NY Times article that laid out the case for the fake followers. The combination of good writing, investigative journalism and compelling presentation actually made me feel like they are producing content worth paying for. I am now subscribing to the New York Times.


Don't take anything you read in the New York Times about twitter bots too seriously. They do publish articles that look superficially well researched but which are nonsense or actually deceptive:

https://blog.plan99.net/did-russian-bots-impact-brexit-ad66f...

The problem is that some phrases that appear on the surface to have one meaning have been grabbed and redefined by particular political groups, almost used as code words. "Bot" and especially "Russian Twitter Bot" for example isn't used by Twitter or others in the way you'd always expect:

https://www.projectveritas.com/2018/01/11/undercover-video-t...

"Just go to a random [Trump] tweet, and just look at the followers," Singh says. "They'll be like guns, God, America, like, and with the American flag and like the cross. Who says that? Who talks like that? It's for sure a bot."

The idea that Twitter bots can change society in fundamental ways is one that seems to obsess journalists, who all seem to spend half their day on Twitter anyway, but I've yet to see evidence that it's true.


Okay, who puts time on the Y axis of a graph? Honestly.


We scroll down and that becomes an "animation while scrolling". While it's an unusual orientation, I believe it is the right one. It's not a time axis - it's "time since the user was created".

The graph isn't "this is how many followers the user had at this point in time" but rather "right now, here are all the followers this user has."

A point is "the Xth follower had a join date of Y", and thus the X axis is in effect a time axis (though not a linear one).

From an eye-scanning view, the horizontal bands are easier to follow and notice than vertical ones - and that is part of the goal of the graphic (to emphasize those bands).


thaaaaaaank you


I'm having trouble understanding the follower visualizations. Does the X axis represent her followers in the order they started following her and the Y axis is the date they (the follower) joined Twitter?


The original article suffered from the same flaw, and they never really explained it.


Correct.

It's a kind of weird visualization, but the patterns which emerge are pretty striking (and obviously artificial).


I don't get the justification for labeling the initial block as organic growth, unless they're claiming that only horizontal stratification is a sign of bot activity. Vertical stratification should also be a sign of bot activity, and as they point out, many of the supposed bots were in the vertically stratified groups.

For those not familiar with the graphs: they show date followed on x and date created on y. Where there's horizontal stratification, the NYT noticed, it means the follower accounts that all began following the account at the same time were also created around the same time, indicating a batch of bots. Where there's vertical stratification, there's a shift in the rate at which accounts are following the account. When vertical stratification is followed by horizontal stratification, it's an indication that both sets are bots - for example, one scenario is that instead of providing a mix of bots created at different times, someone got lazy and just grabbed a list of bots all created at the same time. However, vertical stratification could also just indicate that the person did something good or bad that changed the rate of acquiring new followers, so it isn't clear-cut. That being said, I'm not sure that justifies labeling the initial section, which lacks horizontal stratification, as organic.
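The vertical-stratification signal - a sudden burst in the rate of new follows - can be sketched as a simple rate check. This is a toy version with synthetic data; the one-day bucket and 5x spike factor are arbitrary choices for illustration:

```python
from datetime import date, timedelta

def follow_rate_spikes(follow_dates, spike_factor=5.0):
    """Return the days on which the number of new followers exceeds
    spike_factor times the average daily rate -- the 'vertical bands'
    in an NYT-style follower plot."""
    if not follow_dates:
        return []
    per_day = {}
    for d in follow_dates:
        per_day[d] = per_day.get(d, 0) + 1
    span_days = (max(follow_dates) - min(follow_dates)).days + 1
    avg_per_day = len(follow_dates) / span_days
    return sorted(d for d, n in per_day.items()
                  if n >= spike_factor * avg_per_day)

# Synthetic: one new follower a day for 20 days, then 50 in a single day.
dates = [date(2018, 1, 1) + timedelta(days=i) for i in range(20)]
dates += [date(2018, 1, 25)] * 50
print(follow_rate_spikes(dates))
# [datetime.date(2018, 1, 25)]
```

As the comment above notes, a spike alone is ambiguous - it could be a bought batch of bots or a legitimate viral moment - which is why the NYT pairs it with the creation-date (horizontal) signal.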


What I don't really understand about this is: why is anybody concerned with fake accounts and followers, outside of the ad companies? (Disregard for now the possible public-opinion-influencing use of huge fake account networks; I don't even see that argument against bot accounts in this article.) As I see it, both the bot sellers/maintainers and the celebrities profit from this. Only the ad companies lose, once their business partners realize they're paying for fake visibility and stop handing over money. That in turn would eliminate the sponsorship of accounts and thus the buying of fake followers. So I guess I don't really get why this industry exists in the first place...


We’re probably 10-20 years away from the internet and especially instant messaging becoming a public utility. The product itself is immensely useful but so far hasn’t been monetized except by lock in to a bigger platform or by turning the user into a product.

We’ve had AIM, ICQ, MSN, Yahoo Messenger and many others that can be seen in the list of protocols supported by pidgin.im. Now we have Facebook Messenger, Whatsapp, Skype, Hangouts and the thing from Apple.

Or at least there should be an instant-messaging standard that everyone who wants to sell to public institutions has to follow.


I am not sure why this is a problem, but I don't really use Twitter. Yes, fake accounts exist. It's only Twitter's problem; I'm not sure why this is something the NY Times would care about.


To name but a few reasons off the bat:

1. Twitter is selling content feeds to TV news networks. With enough bots out there your tweet may very well end up on TV.

2. It counts for SEO. I once met a guy in 2010-ish who was into casino SEO. He was running a network of ~150k FB/Twitter accounts to promote articles that quoted the oddball news outfits that quoted his clients' press releases.

3. Some people actually read what's going on on Twitter. In particular journalists and swaths of opinion leaders. See any late night show, really, for ample Twitter coverage.

4. Some people with tons of followers occasionally retweet garbage memes on Twitter, including racist videos tweeted by white supremacist UK groups that turn out to be fake.


Because of advertising fraud, because of media manipulation, because of violations of campaign and fraud laws, because of foreign enemy powers influencing internal affairs of the U.S., UK, France, Germany, Ukraine, and other states.

The problem being that even if you yourself don't view or access Twitter, it is influencing, and largely for the worse, the world in which you live.


Perhaps the original article, which this is a follow-up to, could be enlightening: https://www.nytimes.com/interactive/2018/01/27/technology/so...

> two young siblings... earn a combined $100,000 a year as influencers, working with brands such as Amazon, Disney, Louis Vuitton and Nintendo. Arabella, who is 14, tweets under the name Amazing Arabella.

> But her Twitter account — and her brother’s — are boosted by thousands of retweets purchased by their mother and manager, Shadia Daho, according to Devumi records.

The idea that dubious businesses are possibly skimming money off big brands (or even more disturbingly, possibly with the knowledge of those brands) by using fake followers is interesting, and newsworthy. Add widespread identity theft into the mix, and it’s even more newsworthy.


If a fake account is using the name and likeness (pun intended?) of a real person (as many are, having harvested them from real accounts) then it becomes the impersonated's problem as well.


> not sure why this something the NY times would care about?

They're just looking for any blame they can shift for their culpability in helping throw the 2016 POTUS election to Trump.


Not sure why I got downvoted - it was an honest question. Why are the government and the NY Times looking into this?


What’s even worse is that advertisers end up paying a lot of money to advertise to these bots. In an early ad campaign with Twitter we quickly realized a lot of our spend on “engagements” was engagements with bots. We pulled the plug on Twitter spend really quick.

This has been a known problem for a while and Twitter has done little to fix it at scale. Given that they make their money on these paid “engagements” there are going to be a lot of people taking a real close look at this. Interesting days for Twitter ahead.


I don't know if it's only me, but I like it when the graph changes as you scroll down to give you more insight.


I tend to soft-block (block and then immediately unblock, it removes them as a follower) most new followers, because most are either brands or fake accounts.
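For anyone curious, a soft-block is just two API calls back to back. Here's a hedged sketch assuming a Tweepy v3-era client exposing `create_block`/`destroy_block`; the stub class below stands in for a real authenticated client so the example is self-contained:

```python
def soft_block(api, screen_name):
    """Block then immediately unblock: the block silently removes the
    account from your follower list, and the unblock leaves no trace.
    `api` is assumed to expose Tweepy v3-style create_block /
    destroy_block methods (an assumption, not a guaranteed interface)."""
    api.create_block(screen_name=screen_name)
    api.destroy_block(screen_name=screen_name)

# Stub standing in for a real authenticated Tweepy client, so the
# sketch can run without credentials or network access.
class FakeAPI:
    def __init__(self):
        self.calls = []
    def create_block(self, screen_name):
        self.calls.append(("block", screen_name))
    def destroy_block(self, screen_name):
        self.calls.append(("unblock", screen_name))

api = FakeAPI()
soft_block(api, "spam_account_123")
print(api.calls)
# [('block', 'spam_account_123'), ('unblock', 'spam_account_123')]
```

The unblocked account can technically re-follow you, which is why this only works against drive-by spam followers that never look back.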


One Indian actor threatened to quit Twitter because another actor had more followers than him.


I have completely ignored Twitter since some guy told me about it 10 years ago, and I have not once thought "wow, I wish I had paid more attention to that."


Just wait until you learn about Bitcoin


This isn't a matter of personal-access only:

https://news.ycombinator.com/item?id=16277524



