Twitter spam and motivation to report it (marco.org)
52 points by yan on July 18, 2011 | 25 comments


I work on the Trust & Safety team at Twitter and hate spam as much as everyone else. As with any large system the solution is never as simple as "implement this thing, problem solved." As Marco points out, what is or is not spam is a balancing act. Our head of Trust & Safety, Del Harvey, gave an interview to the Guardian earlier this year about this balance: http://www.guardian.co.uk/technology/2011/apr/07/twitter-int...

If you'd like to help us out, we have several open positions on the team:

Anti-Spam: https://twitter.com/job.html?jvi=oBPbVfwg,Job
Tools: https://twitter.com/job.html?jvi=oSbdVfwV,Job
Front End: https://twitter.com/job.html?jvi=owPbVfwb,Job


I get that Twitter want to be careful and not accidentally penalise legitimate users, but nearly all spammers I get mentioning me fit the following points:

  - 0 followers (ok, sure, if they had to they could start making spambots follow each other to avoid this)
  - Account only just created
  - Has tweeted the exact same message, with the same link at the end, to 10s or 100s of people, in a very short space of time.
Are there really any non-spam use cases that would fit those points? Can't these accounts just be automatically suspended?
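
As a rough illustration, the three points above could be combined into a single rule along these lines. This is only a sketch: the `Account` and `Tweet` records and the thresholds (`max_age_hours`, `min_duplicates`, `window_minutes`) are assumptions made up for the example, not anything Twitter is known to use.

```python
import re
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timedelta

MENTION = re.compile(r"@\w+")


@dataclass
class Tweet:
    text: str
    sent_at: datetime


@dataclass
class Account:
    # Hypothetical record; field names are illustrative only.
    followers: int
    created_at: datetime
    tweets: list = field(default_factory=list)


def normalize(text):
    """Drop @mentions so the same message sent to different people compares equal."""
    return " ".join(MENTION.sub("", text).split()).lower()


def looks_like_mention_spam(account, now, max_age_hours=24,
                            min_duplicates=10, window_minutes=60):
    """Return True only if all three signals from the list above are present."""
    # Point 1: no followers at all.
    if account.followers > 0:
        return False
    # Point 2: the account was only just created.
    if now - account.created_at > timedelta(hours=max_age_hours):
        return False
    # Point 3: the same message, sent as a mention, many times in a short window.
    cutoff = now - timedelta(minutes=window_minutes)
    recent = [normalize(t.text) for t in account.tweets
              if t.sent_at >= cutoff and MENTION.search(t.text)]
    if not recent:
        return False
    _, count = Counter(recent).most_common(1)[0]
    return count >= min_duplicates
```

Requiring all three signals at once keeps the rule narrow, which is the point of the question: an account has to be new, unfollowed, and repetitive before it is flagged.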


I wrote a free Twitter spam filter service, trained on thousands of accounts reported to @spam before the "report spam" button.

You'd enter your Twitter username, and asynchronously it'd grab your follower list and score each account for spamminess. Because of Twitter's rate limiting on the API, it could be a while before finishing the work if a lot of people were using the site at once. When done, it'd send you a @tweet with a link to get your results (and, if you want, click another button to block the spammy accounts).

Within an hour or two of people starting to use it, Twitter shut down the account. Tweeting the exact same message with a similar link at the end, requested by each and every user it tweeted to, was considered too spammy.

That was a year and a half ago. Nobody was reporting that account as spam, so they must have some basic heuristics already.


Question: why would I want a list of which tweets directed at me are spam? The problem I want solved is that I ever see them at all; I don't really care if Jimbo scored a 78%-probability-of-spam for linking DJ Kittens on YouTube.


It was a spam filter for your follower list. Some people just don't like having 300 spammers show up when someone clicks "followers" on their profile. Regardless, I wrote this comment to make the point about Twitter already having some automated spam systems, not to entice you to sign up for anything.


Oh, gotcha. I never really understood why having bad followers matters; it's one of the things you truly cannot control.

Anyway, it sounds pretty shitty that not only did they (ironically) shut your service down, but wouldn't even explain why they did it. :/


The 3rd point might be a legitimate notification service, like ifttt (e.g., I sign up for something that tweets a link to something to me whenever a page with no RSS updates; if a hundred people sign up for the same thing, it would look exactly like your third point.)


Easy solution - anyone who wants to run a twitter account like that is told that they can only send those notifications to people who follow them.


That seems like an edge case - the first time they trigger the auto-suspension, they'd just need to explain their service and get a 'not a spammer' flag added to their account.



I've noticed Twitter spam bots have started to pull random quotes/text off the internet. Funny to see a tweet of a Machiavelli quote followed by a cheap Viagra tweet.


I have a few bots of my own that do weather based alerts (by state). They don't @reply anyone; they just do normal tweets.

What's so frustrating for me is how often I get slapped for spamming yet these quite-obvious @mention spam bots continue to prosper (I can never get a response out of twitter as to why I keep getting blocked; they just remove the block without actually responding to anything I asked).

I get probably 10-20 @reply spam a day on my more active account and it all follows the same pattern:

* account is nothing but @replies with just a link

This has been going on for probably a month and I've reported every single account that's spammed me.

If twitter can't figure out how to auto-block this obvious spam, it doesn't give me much hope that they'll figure out how to take care of spam in general.


I've received mention spam in the past on Twitter, but last week it was used against me as a primitive DoS attack on my account (a DoS on my time). In a very short period of time after having tweeted something regarding the fallacy of vaccines and autism, I started receiving @mention spam from a user who was spamming all accounts that had RT'd my original tweet as well as me. Even if I blocked that account, a new one would pop up immediately after that and it effectively rendered my Mentions column in TweetDeck useless while this 'attack' was taking place.

It's nice to know that the accounts were quickly disabled but a more nefarious individual or group could have caused even more problems, and it does seem that Twitter should have some sort of prefiltering heuristics in place (if they don't already exist) to prevent this kind of abuse (new account / low # of tweets / low # of followers / consistent message 'n link being sent).


I'll still use the report as spam feature because it doubles as a block, which is the only way to make the tweet go away.


What if it's an engineering problem? What if, in order to distribute new tweets at the rate and volume that they do, Twitter can't run the type of analysis required to effectively curb spam?


They already analyse tweets in real-time, for example to find out what topics are trending, and what the current "top tweets" are.


That's done after the tweet has been posted though. What I'm wondering is if they have the ability to check tweets after they're submitted but before they're posted.


I don't think it's interesting or pertinent to do a tweet-by-tweet spam analysis. What would be useful is to automatically block an account when it shows certain patterns shortly after its creation (like following 1000+ people without having tweeted, or following no one but already having 50+ tweets with an @username and a link in them... these kinds of obvious indicators of spam activity).
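
A minimal sketch of those two account-level indicators might look like the following. The function name, the indicator labels, and the default thresholds (1000 follows, 50 tweets) are taken from the examples in this comment or invented for illustration; none of it reflects a real Twitter API.

```python
import re

LINK = re.compile(r"https?://\S+")
MENTION = re.compile(r"@\w+")


def spam_indicators(following_count, tweets, follow_limit=1000, tweet_limit=50):
    """Return the names of the account-level indicators that apply.

    `tweets` is a list of tweet texts. All names and thresholds are hypothetical.
    """
    indicators = []

    # Indicator 1: follows a large number of people without ever tweeting.
    if following_count >= follow_limit and not tweets:
        indicators.append("mass_follow_without_tweeting")

    # Indicator 2: follows no one, yet has many tweets that each contain
    # both an @mention and a link.
    mention_links = [t for t in tweets if MENTION.search(t) and LINK.search(t)]
    if following_count == 0 and len(mention_links) >= tweet_limit:
        indicators.append("mention_links_without_following")

    return indicators
```

Because this looks at the account as a whole rather than at each tweet, it would only need to run occasionally, for example after an account's first few dozen actions.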


It strikes me that posting a tweet and displaying it to another user's account could actually be two separate transactions.

Can't the spam analysis be performed on a just-in-time basis when a client refreshes its @replies?
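
To make the idea concrete, here is a small sketch of separating the write path from the read path. The `MentionTimeline` class and the `is_spam_account` callback are hypothetical names for this example; how Twitter actually stores and delivers mentions is not known from this thread.

```python
class MentionTimeline:
    """Stores mentions on write and filters them lazily on read."""

    def __init__(self, is_spam_account):
        # `is_spam_account` is an assumed classifier: sender name -> bool.
        self._is_spam_account = is_spam_account
        self._mentions = {}  # recipient -> list of (sender, text)
        self._verdicts = {}  # sender -> cached classifier result

    def post(self, sender, recipient, text):
        """Write path: store the mention and do no analysis at all."""
        self._mentions.setdefault(recipient, []).append((sender, text))

    def read(self, recipient):
        """Read path: classify each sender once, then hide spam senders."""
        visible = []
        for sender, text in self._mentions.get(recipient, []):
            if sender not in self._verdicts:
                self._verdicts[sender] = self._is_spam_account(sender)
            if not self._verdicts[sender]:
                visible.append((sender, text))
        return visible
```

The trade-off in this design is that posting stays cheap, while the cost of analysis is paid only for mentions that someone actually views, and only once per sender because the verdict is cached.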


"Twitter needs a far more aggressive, automated, proactive, heuristic-based anti-spam system. And if someone has trouble legitimately tweeting a link with no text to 100 people in a row who don’t follow them at precise 1-minute intervals, that’s just the price we’ll have to pay."

Actually, Twitter already has similar measures. If you try to send out a link too many times in a row, your account is disabled. And too many is not 100s of tweets, but around 10.

This is just anecdotal evidence, but there are clearly some measures in place to prevent spam. Why Twitter only targets some spam-posting methods and not others would be interesting to know.
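
For illustration, a limit of the kind described above (the same link sent too many times in a row) could be sketched as follows. The `RepeatedLinkGuard` class is a made-up name, and the default of 10 is only the anecdotal figure from this comment, not a documented Twitter limit.

```python
import re

LINK = re.compile(r"https?://\S+")


class RepeatedLinkGuard:
    """Tracks how many consecutive tweets from one account carry the same link."""

    def __init__(self, limit=10):
        self.limit = limit  # assumed threshold, based on the anecdote above
        self._last_link = None
        self._streak = 0

    def allow(self, text):
        """Return False once the same link has been sent more than `limit` times in a row."""
        match = LINK.search(text)
        link = match.group(0) if match else None
        if link is not None and link == self._last_link:
            self._streak += 1
        else:
            # A different link, or a tweet with no link, resets the streak.
            self._last_link = link
            self._streak = 1 if link is not None else 0
        return self._streak <= self.limit
```

A guard like this only catches one posting method, which may be part of why some kinds of spam get blocked quickly while others slip through.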


He's right: Twitter spam is a problem. Spam is one of the things that really hurt MySpace, and now it has moved big time to Twitter. I just wonder how many of the millions of new Twitter signups are spammers / spam bots.


I've written some spam filters for Twitter and, frankly, spam is pretty easy to spot in most cases. The bigger question is: why are you following them? If it's popping up in search, that's a different problem, I suppose. When I look back later at accounts that my spam filter flagged, I notice that Twitter does shut down a lot of them, so I do see the results of them tackling spam. I guess there is a slippery-slope problem of what is or isn't spam. A lot of stuff is borderline: RSS feeds to Twitter accounts? Bots that message people for various reasons (Twitter seems to have set up its own recommendation bot)?


Spam can show up in your @mentions feed if you tweet some keyword that a spambot is searching for.


Or often even without any (noticeable) keyword that they might respond to; presumably sometimes they just tweet random accounts.


A good thing would be to be able to subscribe to the spam lists of accounts we trust (and say whether or not we also trust the people they trust). I always block and flag as spam the accounts on Twitter and Identi.ca that are spam and follow me. If they keep track of this, I must have flagged something like a hundred accounts, maybe more. If we could just split this work among trusted people, we could avoid a lot of follow notifications caused by spam accounts.
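
A rough sketch of how such shared lists could be merged is below. The data layout (`trusts` and `blocklists` as plain dictionaries) and the `follow_transitive` flag are assumptions for the example; neither Twitter nor Identi.ca is known to offer this.

```python
def effective_blocklist(user, trusts, blocklists, follow_transitive=False):
    """Merge a user's own spam list with the lists of people they trust.

    trusts:     dict mapping a user to the set of users they trust
    blocklists: dict mapping a user to the set of accounts they flagged
    follow_transitive: if True, also include lists from people trusted by
        the user's trusted people (one extra hop only)
    """
    sources = {user} | set(trusts.get(user, ()))
    if follow_transitive:
        for friend in trusts.get(user, ()):
            sources |= set(trusts.get(friend, ()))

    blocked = set()
    for source in sources:
        blocked |= set(blocklists.get(source, ()))
    return blocked
```

Limiting the transitive option to a single hop is a deliberate choice in this sketch: it keeps one careless or malicious list far down the chain from silently blocking accounts for everyone.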



