Twitter Still Can't Keep Up with Its Flood of Junk Accounts, Study Finds (wired.com)
124 points by seapunk 15 days ago | 91 comments

This is humiliating to Twitter. The fact that someone can create a better spam filter without having access to corporate private information (such as user interaction signals, which are normally incredibly powerful) suggests that they simply don't want to kill the spam:

> In fact, the paper's two researchers write that with a machine learning approach they developed themselves, they could identify abusive accounts in far greater volumes and faster than Twitter does—often flagging the accounts months before Twitter spotted and banned them.

I wouldn't consider it humiliating. And there's a dimension this doesn't talk about: false positives. Twitter has to make an actual decision to ban an account. When they get this wrong, they have unhappy users and I'm sure articles would appear here about the terrible work they are doing. These researchers pay none of the costs of false positives. Perhaps they have a 7% false positive rate and consider that pretty good. That amounts to about 12,000 legitimate users banned.

Some of Twitter's users are more equal than others when it comes to bans. I don't think they are concerned about blowback from false positives from spam accounts.


"Learn to code" was tweeted at me by a sketchy account. I reported it as abusive behavior as part of targeted harassment. Twitter suspended the account within 20 minutes.

Journalists if they tweet "learn to code" at you don't stay silent, take a moment to report it. https://t.co/RXgqqV2ptw

— Ben Popken (@bpopken) February 1, 2019

I wouldn't consider that a typical example given that the whole "learn to code" thing was a targeted harassment campaign. Not difficult to imagine that in that timeframe Twitter had prioritised investigating and banning users who were participating in it.

They didn't extend such protections to schoolchild Nick Sandmann

"#MAGAkids go screaming, hats first, into the woodchipper"

Probably because there was no coordinated harassment campaign, and he isn't even on Twitter. But sure, keep up the false equivalence.

Who was co-ordinating the tweeting of "Learn to code" at Ben Popken?

Perhaps I can put it another way

"I don’t believe that we can afford to take a neutral stance anymore. I don’t believe that we should optimize for neutrality." - Twitter CEO Jack Dorsey Feb 5 2019

4chan. I mean, the threads are all public, it seems exceptionally stupid to claim it wasn't happening.

What would you call this highly publicised and still present Tweet from CNN presenter, Kathy Griffin, I wonder [1]

> Name these kids. I want NAMES. Shame them. If you think these fuckers wouldn’t dox you in a heartbeat, think again.

[1] https://twitter.com/kathygriffin/status/1086927762634399744

I would call it a single abusive tweet. Not a coordinated campaign, not specifically targeted at any Twitter user.

This isn't complex.

Suggesting people tweet "learn to code" at journalists wasn't specifically targeting Ben Popken

And he got 1 response.

Not very effective.

It could also be that accounts that take part in harassment/harassment adjacent behavior may already have a few strikes against them for similar things.

Of course they don't want to kill the spam! Then they'd have to admit a huge fraction of their user accounts are fake.

For all the attention it gets, Twitter is a niche product with many fewer real human users than Instagram or even Snapchat.

Bingo. Fake accounts and bots Make Number Go Up And To The Right, so why on earth would they want to stop them?

Yep. They told their investors tall tales about engagement and growth, and would have to backtrack considerably if they were honest about the number of bots on their service.

Twitter has a symbiotic relationship with its spammers: It needs those accounts to drive the metrics their advertisers and shareholders crave.

No it doesn't. Spam doesn't necessarily generate revenue traffic, and it can drive actual consumers off the platform.

This is the same issue as the naive attempts at driving web page hits during the early web. "Look at how many pages people are looking at!"

Then people realized that people were going from page to page because they couldn't find their damned answer anywhere.

And so now more sophisticated metrics are used: time on page. Customer sat metrics — were people actually happy using your website?

While total weekly/monthly active user counts are important what is also important is churn — are people abandoning the platform because of spam, trolls, etc.

Who cares if you create a million fake accounts if you lose a half-million real human users?

I don’t doubt that it’s bad in the worst possible scenario: where spam accounts are highly visible to real users and cause them to disengage.

But in practice what we’ve seen on Twitter are accounts that are largely silo’d off from users, who act to “signal boost” certain tweets and hashtags. Most users never even see these accounts, and will never interact with them directly: but they will see content on their timelines boosted from spammers.

You are grossly overestimating the technological competency of the average Wall Street analyst and retail investor.

Twitter's existing spam filter also evidently has a lot of false positives. I created an account many years ago and used it pretty much just to browse around Twitter, follow people, etc., without posting anything. I dropped off for a couple years and when I came back my location had been changed to Russia (???) and the account was suspended. I saw no evidence that the account had been hacked or anything - there was no activity that I wasn't responsible for.

My appeal was denied without explanation. I can only assume I was wrapped up in some anti-spam measure.

EDIT: To address responses to this, I think it's extremely unlikely the account was hacked. To be clear, there was no account activity of any kind other than the location change. The password was long, complex, and one I did not use anywhere else.

How is having your location set to Russia not evidence of a hack?

Yeah, that's a pretty strong signal that your account got used for some nefarious shit.

There was literally no account activity other than this. What 'nefarious shit' could it have been used for, given there were no tweets, likes, retweets, follows, DMs, or anything of that nature?

EDIT: I should also add that I have a very strong password which is not reused anywhere, so I doubt very much someone guessed the password.

It began creating a history for that account based out of Russia. Which would be useful in the future should somebody in--oh, I don't know, Russia--want to use that account for nefarious shit.

So, at some point, someone guessed a complex password I used nowhere else, logged into my Twitter account with it, changed the location, and then did....nothing, because they wanted to one day use the account for something. And Twitter somehow figured all this out, determined the account would one day be used for nefarious ends, and suspended the account because...the location changed? Why? Even if the account was hacked (which I see no evidence for), your contention is that what, Twitter suspended my account for logging in from a new IP, but not until after a successful login and profile change?

I'm sorry, it seems a whole lot more likely Twitter used some heuristic to assume the account was a bot, set the location as a marker, and suspended it.

Twitter doesn't need some kind of stupid hack of setting a visible-to-you field to flag an account. Your account got hacked. This doesn't necessarily mean they guessed the password; there are other ways to take over a well-aged but apparently abandoned account.

Yeah, and the bots don't need to set the location to Russia either. In fact, doing so would pretty transparently work against the alleged goals of Russian bot activity on Twitter - if you self-identify as Russian, then you're not posing as an American. I would assume Twitter has separate heuristics for different types of "bots" and flagged it to help make it "clear" to other users that my account was actually a "Russian bot." I don't know.

But again, I don't see the point in taking over an account (please name these other ways of taking one over) and then doing absolutely nothing with it except changing the location. If I'm going to take over aged accounts, why wouldn't I do something with it? How did Twitter identify that the account was hacked, then? Why did they deny my appeal? Why didn't they just ask me to change my password?

Russian propaganda doesn't exclusively target the US. They also target their own citizens. Plus, we don't know that whoever took over your account was Russian, just that they were prepping it for activity in Russia. After an account is taken over they don't immediately start spamming and get themselves banned. They need to gather thousands of accounts before they launch attacks so the anti-spam bots don't shut them down. You can't effectively multiply a message with just a handful of accounts.

Of course at some point the guy who hacked your account fucked it up and blew a bunch of his accounts. At this point Twitter thinks you're just a bot account and doesn't care what you have to say.

Most common other ways to take over an account involve calling tech support and telling them you lost the password and the email account.

For what it's worth, after having this discussion, I logged back into the account. Going to "Apps and devices" shows nothing out of the ordinary.

My country has been reset again to Russia. I had fixed it when it happened the first time.

Yeah, I'm sorry, there's no way it's not Twitter doing this. I'm not sure why so many people here are dead set on the "it must be secret hackers" explanation.

I'm also pretty sure there's no 1-800 Twitter line to call to reset your password, and if Twitter support is giving random people from random emails account access, Twitter has a much bigger problem.

How do you know that content wasn't deleted when your account got suspended?

What type of content were you posting prior to not using the account?

I did not post any content. The only thing I ever did was follow people, mostly writers and friends.

I assumed Twitter did it. Why would someone log into my account and then do absolutely nothing other than set the location?

why would twitter do that?

They needed to show some success fighting the "russian bots", so they flagged some unused accounts and can claim success now?

Geoff Goldberg (@geoffgolberg) [1], who is very vocal about this on Twitter and has done a bunch of analysis about foreign bots, got his account suspended by foreign trolls flagging him.

Presidential Candidate, Kamala Harris (@KamalaHarris) account was inflated by millions of fake followers. There is an analysis of it here [2]

If those followers were paid for by the campaign, by law that should be reflected in its campaign spending (however, it would violate Twitter's rules and her account could be suspended). Or they were paid for by some PAC. Either way, if it's so easy to detect, it's obvious Twitter knows about these and is just not reacting to them, only removing some accounts for PR purposes.

[1] https://twitter.com/geoffgolberg

[2] https://twitter.com/likingonline/status/1092643779402620928

Don't discount the alternative possibility of an opponent buying followers. Someone pulled that on Roy Moore:


> It involved a scheme to link the Moore campaign to thousands of Russian accounts that suddenly began following the Republican candidate on Twitter, a development that drew national media attention.

> “We orchestrated an elaborate ‘false flag’ operation that planted the idea that the Moore campaign was amplified on social media by a Russian botnet,” the report says.

It's fairly common with Google AdSense, too - generate obvious click fraud on a competitor's ads, watch them get suspended.

This seems like a really difficult problem to tackle for Google, Twitter etc.

Other than a victim trying to make a strong case that they are not responsible, are there any elegant solutions to this problem?

why not tie twitter accounts to some strong real world identification? One cent payment to twitter, national id or something of that sort?

Just push the bar up for people. The flipside is that people who want to stay anonymous or people with privacy concerns will stay off the platform, but I think fundamentally there is a price to be paid if you want an authentic community.

Because Twitter would be exposed as 95% throwaway/inactive/bot accounts.

“One drawback to the Iowa researchers' method was its rate of false positives: They admit that about six percent of the apps their detection method flags as malicious are in fact benign. But they argue that false positive rate is low enough that Twitter could assign human staffers to review their algorithm's results and catch mistakes.”

If I understand right, out of 460k apps 170k were malicious. That should lead to quite a lot of manual reviews, if you don’t trust the algorithm. Also I’m not sure if this would be a task where humans are any better.

At these numbers 6% is quite a lot in absolute terms. That would mean a lot of legitimate apps getting blocked (unless I’m missing something here).
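
For a rough sense of scale, the arithmetic in this comment can be sketched as below. The inputs are the figures quoted in the thread; treating 170k as the number of apps the method flagged is an assumption, and the variable names are only illustrative.

```python
# Back-of-the-envelope estimate using the figures quoted in this thread.
total_apps = 460_000      # apps examined, per the comment above
flagged_apps = 170_000    # assumed to be the number flagged as malicious
benign_share = 0.06       # share of flagged apps that are in fact benign

# Benign apps that would be blocked if every flag were acted on.
wrongly_flagged = round(flagged_apps * benign_share)
correctly_flagged = flagged_apps - wrongly_flagged

print(f"flagged share of all apps: {flagged_apps / total_apps:.0%}")
print(f"benign apps flagged:       {wrongly_flagged:,}")
print(f"malicious apps flagged:    {correctly_flagged:,}")
```

Under these assumptions roughly ten thousand benign apps would be caught, which is the "absolute terms" concern raised above.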

It’s simply not worth the time and money to pay someone to review every app created everywhere. Even with a 6% false positive rate, those app developers will appeal if it’s worthwhile (which will probably be less than the full 6%), and then they can be manually reviewed, and the false positive rate brought down after some more iterations.

6% is 6%. The absolute number is meaningless. It’s just a crappy twitter integration that no one really cares about. It’s not a human life.

What is Twitter other than a way for celebrities and consumer brands to advertise to their customers? I suppose with paid-for follows and retweets from click farms, you can position yourself as an 'influencer' and get a gig pitching slim tea, or testosterone gel or something out of a late night TV infomercial.

The whole thing feels highly commercialized and yet, at the same time, very gauche, like stepping into a neighbourhood where the only stores sell payday loans, bail bonds and liquor. I follow ~12-15 people in my industry who are knowledgeable about stuff, but honestly, that makes Twitter extremely boring.

I'm starting to realize that the drama and beefs (often manufactured) is what keeps ordinary people coming to a medium where they aren't even the intended customer.

I use twitter to follow people (usually scientists or public intellectuals) that tweet solid ideas or share essays with solid ideas. I only follow accounts with high signal to noise ratio. Noise is empty tweets that have no substance or buzzfeed like articles.

Here is a prime example of an account and tweet that has lots of substance https://twitter.com/michael_nielsen/status/10618244705564672...

He also has lists of accounts on his profile that have high signal to noise ratio. https://twitter.com/michael_nielsen/status/10810700446483988...

Personally I would love to have the feature of following accounts for their self-written tweets but not retweets. Definitely will help with signal to noise. But that goes against Twitter's business model, so I doubt it will happen.

I use it to follow and post news, as well as what academics are interested in posting/sharing.

Microblogging is great, and Twitter remains the least worst solution for this.

The authors conveniently forget to mention the false positive rate with their approach.

Time for everyone to move to the Fediverse! Preferably small instances for your irl social circle. I know it probably won't happen that easily, but a man can dream.

I tried it out for about a month. In principle, I think it's a great idea. In practice, no one in my IRL circle is interested in using it when we can just message each other. The lack of use and discoverability means it's difficult to meet new people on it. At least that was my experience.

Hot take: the Fediverse won’t catch on unless it gets a killer app of its own. Every federated project I’ve seen is a clone of something else.

I think it may take off when blogs offer ActivityPub streams and you can follow them and it shows up in your feed. The technology is already there, people just have to add it.

The killer app will be when journalists, public institutions, and community orgs are contributing content to the fediverse on instances run by their employers.

Those are the last sections of society that usually adopt new tech though.

The only spam in my Twitter feed is “people you follow also follow.” I’ve been doing a lot of blocking.

Twitter should charge ~$1 for every new account. Since Twitter is no longer a growing product, they are way past the stage where a nominal charge for new accounts would hurt their KPIs.

I just moved, and in order to file a change of address via the USPS website I was able to give them a credit card and was charged a buck or two as some sort of identity verification process. To your point, certainly something reasonable must be available to Twitter, __if__ they were truly interested in cleaning up their act.

Seems like most spammers / scammers are finding out passwords of old accounts instead of creating new ones (which are a pain to create due to the phone number verification and the disposable phone numbers ban).

Twitter should probably scan their user passwords for obvious / most-used ones and require their users to change their passwords.

Assuming they aren't using plain text, or simple hashes, how would Twitter do that? Even a standard salted password against that many accounts is tens of millions of cpu hours. Using something like bcrypt would be 10,000x slower than that!

Honestly, the easiest thing to do would be to force a password change on any user that hadn't logged in in X months.

Since they have the salts, they could just take a common passwords list, generate the hashes associated with all of them, then force any users with matching hashes to change their passwords.

You're misunderstanding how this would work. This is effectively what a hacker would do to brute-force the hashes. You have to generate a hash for each password in your common list, for EACH salt in your user list.

326 million Twitter users x each password in your list.

Really comes down to which hashing function they chose, and how fast it does that calculation.

Something like bcrypt is roughly 100ms per hash, which is a LONG time when you're crunching billions of them.
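
A minimal sketch of why per-user salts force the work to scale as users times candidate passwords. It assumes each record stores a salt and a hash; PBKDF2 from Python's standard library stands in for bcrypt, and the password list, iteration count, and helper names are hypothetical.

```python
import hashlib
import hmac
import os

COMMON_PASSWORDS = ["123456", "password", "qwerty"]  # hypothetical tiny list


def hash_password(password, salt, iterations=1_000):
    # PBKDF2-HMAC-SHA256 is a stand-in here; it is not bcrypt.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def find_weak_accounts(users):
    """users maps username -> (salt, stored_hash). Returns (weak names, hashes computed)."""
    weak, hashes_computed = [], 0
    for name, (salt, stored) in users.items():
        # Each salt is different, so no hash can be reused across users:
        # every candidate has to be re-hashed for every account.
        for candidate in COMMON_PASSWORDS:
            hashes_computed += 1
            if hmac.compare_digest(hash_password(candidate, salt), stored):
                weak.append(name)
                break
    return weak, hashes_computed


def make_user(password):
    salt = os.urandom(16)
    return salt, hash_password(password, salt)


users = {
    "alice": make_user("qwerty"),
    "bob": make_user("correct horse battery staple"),
}
weak, hashes_computed = find_weak_accounts(users)
print(weak, hashes_computed)
```

An account with a strong password costs the full list of candidate hashes, which is the worst case the comment describes.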

Do it lazily at auth time and use some recency and/or fingerprint heuristic to force either immediate block or password reset through email.
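
One way to read "do it lazily at auth time" is sketched below: at login the plaintext is already in hand, so checking it against a common-password set costs one set lookup rather than thousands of hashes. The record layout, the `known_device` flag standing in for the fingerprint heuristic, and the return values are all assumptions made for illustration; PBKDF2 again stands in for whatever hash is actually used.

```python
import hashlib
import hmac
import os

COMMON_PASSWORDS = {"123456", "password", "qwerty"}  # hypothetical list


def hash_password(password, salt, iterations=1_000):
    # Standard-library stand-in for the real password hash.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def check_login(record, submitted_password, known_device):
    """Return 'denied', 'ok', 'reset_by_email', or 'blocked'."""
    candidate = hash_password(submitted_password, record["salt"])
    if not hmac.compare_digest(candidate, record["hash"]):
        return "denied"
    # The password is correct, and we hold the plaintext, so the weak-password
    # check is a single set lookup.
    if submitted_password in COMMON_PASSWORDS:
        # Hypothetical heuristic: a weak password from an unfamiliar device
        # is blocked outright, otherwise a reset is forced through email.
        return "reset_by_email" if known_device else "blocked"
    return "ok"


salt = os.urandom(16)
record = {"salt": salt, "hash": hash_password("qwerty", salt)}

result_known = check_login(record, "qwerty", known_device=True)
result_unknown = check_login(record, "qwerty", known_device=False)
result_wrong = check_login(record, "wrong guess", known_device=True)
print(result_known, result_unknown, result_wrong)
```

This only covers accounts that someone actually logs into; dormant accounts never reach this code path.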

For 10k passwords and 100ms per hash it is 90 hours of single threaded execution time.

There is no excuse not to do it, especially you need to do it only once per user (until password change).

>For 10k passwords and 100ms per hash it is 90 hours of single threaded execution time.

That's 90 hours per account, multiply that by 326 million accounts and it's over a billion CPU days.

You can reduce it a lot (no need to spend time calculating the password for users who log in regularly, just wait for them to give it to you) but it's still a massive scale. Especially since the biggest risk comes from the millions of rarely used accounts.

Correct. Forgot to multiply by 10^6

I'm getting 90 million hours of single threaded execution time for 326 million users with 10k passwords each, and 100ms per hash.
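
The estimate can be reproduced with a few lines of arithmetic. The inputs (326 million users, 10k candidate passwords, 100ms per hash) are the figures used in this thread; everything else is plain unit conversion.

```python
# Worked arithmetic for the offline scan discussed above.
users = 326_000_000          # account count quoted in the thread
candidates_per_user = 10_000 # size of the common-password list
seconds_per_hash = 0.1       # roughly 100ms per bcrypt-style hash

total_hashes = users * candidates_per_user
total_hours = total_hashes * seconds_per_hash / 3600
total_days = total_hours / 24
per_account_minutes = candidates_per_user * seconds_per_hash / 60

print(f"total:       {total_hours / 1e6:.1f} million single-threaded hours")
print(f"             {total_days / 1e6:.1f} million CPU days")
print(f"per account: {per_account_minutes:.1f} minutes")
```

With these inputs the total comes to about 90 million single-threaded hours, and the per-account cost is on the order of minutes rather than hours.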

As somebody else commented, a quicker solution is to just force a password reset based on the number of months since last login. Not having logged in for over a year is a good rule of thumb.
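
A minimal sketch of that rule of thumb, assuming each account stores a last-login timestamp. The function name, the 365-day default, and the sample dates are illustrative only.

```python
from datetime import datetime, timedelta


def needs_forced_reset(last_login, now, max_idle_days=365):
    """True if the account has been idle longer than the allowed window."""
    return now - last_login > timedelta(days=max_idle_days)


now = datetime(2019, 2, 20)
dormant = needs_forced_reset(datetime(2017, 6, 1), now)   # idle for well over a year
active = needs_forced_reset(datetime(2019, 1, 15), now)   # logged in last month
print(dormant, active)
```

This needs no hashing at all, since it never looks at the password.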

Right, but I was replying to OP. "Twitter should probably scan their user passwords for obvious / most-used ones and require their users to change their passwords."

I mean, the account stealers can do it, and not Twitter itself? Which one has the most resources?

Account stealers aren't brute forcing passwords. They are trying common passwords that dum dums often use, or circumventing by other means like email.

And then why couldn't Twitter itself do it?

Read my responses above.


Could you please avoid commenting like this and post civilly and substantively?


Force every single user to change their password right now. When a new password is entered, check it against common passwords, and reject bad inputs as needed.

This would cause friction and they would lose users, which would piss off investors. This isn't a technical issue.

Considering the degree to which most Twitter users are trash, I wouldn't mind losing a few of those. The rest will understand the need for security.

Yeah, but you're not an investor. Twitter's been struggling to make money for a long time. Do you think the investors would be happy to see a decline in active users because of "security" - a concept they barely understand?

I’m not an investor in Twitter because I don’t invest in companies that are unable to assume an appropriate security posture. The same reasoning leads one to conclude that dilettantes who do hold Twitter stock deserve to get burned when poor security practices come home to roost.

Here's an honest question: Why not just let Twitter users pay to get more followers?

The effect is two-fold. You're removing the grey market of fake bots for padding follower numbers, and also making a paper trail for accounts.

Can't keep up? Or doesn't really want to keep up? Not a month goes by where there's not a HN post on this subject (or very similar). If they're making a profit then there is - at least at this point in time - not enough incentive for them to change.

The KPI we need to know is: As bot accounts increase what is the churn of real accounts? Unless real accounts are falling, and it's because of bots / spam, then Twitter is unlikely to do much about the problem; because to them it's not really a problem.

Maybe they don't want to fix the problem? Keeps numbers inflated for quarterly reporting. Same reason why facebook hasn't closed a few of their holes, black hat ad dollars keep the investors happy.

There's a bit of "can't" and a bit of "don't want to"... they just make more money this way.

That's Justin Sun in action.

Disable the API.

That wouldn't stop them.

Twitter has a perverse incentive to allow them because it is directly related to their stock price

Yes, but only up to a point. DAU and MAU are a reflection of advertising audience size, and thus potential revenue. If Company A has 1M DAU and then grows it to 2M in a year, that’s good growth; revenue should double even with everything else staying the same. Its stock price should at least double, probably more than that to price in future growth.

Compare that to Company B that has 500k DAU and flat. Not so good.

Now what happens if it turns out that Company A’s DAUs were actually bots? Suddenly, the audience isn’t worth anything, because bots aren’t an advertiser-friendly audience. So now the stock should at least drop in accordance with the real DAU, and maybe even lower to price in lack of confidence in the future.

Right. It comes down to “can’t” or “won’t”. It makes little sense imo for Twitter to demolish their user stats.

The last big round of fake bot purges saw significant percentages of followers removed from prominent accounts like Obama’s.

If normal people don’t know it’s a problem - why would Twitter want to fix it? Add that to a real financial hit if they do. Not “can’t”, “won’t”.

Financial analysts should really look at the make-up of accounts.

Exactly, this is a non-technical issue.

I think that Twitter could but they care more about growth and err on the side of caution and less aggressive account validation, otherwise the new account sign ups would flatline and investors would be pissed. Capitalism.
