

Twitter spam is really annoying, let's fix it. - jrussbowman
http://joerussbowman.tumblr.com/post/7614367859/twitter-spam-really-is-annoying

======
dennisgorelik
An effective auto-moderator should check for multiple flags:

1) IP address (if it was banned in the last couple of months).

2) Registration Email address (was it banned in the past?).

3) Buzzwords (how frequently that word was used by banned accounts in the
past)

4) Web sites (were they banned?).

5) Age of the account (the younger - the more likely it's spam).

Based on all these red flags, the auto-moderator should calculate an overall
"spam rating", and accounts above a certain threshold should be deleted
automatically. All key attributes of a deleted account (IP, email, buzzwords)
should be analyzed to train the auto-moderator to recognize new spam.

We implemented such a system and it automatically deletes most spam:
[http://postjobfree.blogspot.com/2009/06/postjobfree-automode...](http://postjobfree.blogspot.com/2009/06/postjobfree-automoderator.html)
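
As a rough illustration of the scoring idea above (not the PostJobFree implementation), here is a minimal Python sketch. The weights, the 0.5 threshold, the 30-day age ramp, the account dictionary layout, and the `BanHistory` store are all assumptions made up for the example. The buzzword flag is simplified to the fraction of words previously seen on banned accounts.

```python
from dataclasses import dataclass, field

# Illustrative assumptions: none of these numbers come from the comment above.
WEIGHTS = {"ip": 0.3, "email": 0.3, "buzzwords": 0.2, "site": 0.15, "age": 0.05}
DELETE_THRESHOLD = 0.5


@dataclass
class BanHistory:
    """Attributes seen on previously deleted accounts (hypothetical store)."""
    ips: set = field(default_factory=set)
    emails: set = field(default_factory=set)
    sites: set = field(default_factory=set)
    buzzword_counts: dict = field(default_factory=dict)

    def learn(self, account):
        """Record a deleted account's attributes so later checks can use them."""
        self.ips.add(account["ip"])
        self.emails.add(account["email"])
        self.sites.update(account["sites"])
        for word in account["text"].lower().split():
            self.buzzword_counts[word] = self.buzzword_counts.get(word, 0) + 1


def spam_rating(account, history):
    """Combine the five flags into a score between 0 and 1."""
    words = account["text"].lower().split()
    known = sum(1 for w in words if w in history.buzzword_counts)
    buzz = known / len(words) if words else 0.0
    # Younger accounts score higher; the 30-day ramp is an assumption.
    youth = max(0.0, 1.0 - account["age_days"] / 30.0)
    flags = {
        "ip": 1.0 if account["ip"] in history.ips else 0.0,
        "email": 1.0 if account["email"] in history.emails else 0.0,
        "buzzwords": buzz,
        "site": 1.0 if any(s in history.sites for s in account["sites"]) else 0.0,
        "age": youth,
    }
    return sum(WEIGHTS[name] * value for name, value in flags.items())


def moderate(account, history):
    """Return True (delete) at or above the threshold, and train on the account."""
    if spam_rating(account, history) >= DELETE_THRESHOLD:
        history.learn(account)
        return True
    return False
```

Calling `moderate` on each new account both makes the delete decision and feeds deleted accounts back into the history, which is the training loop described above in its simplest form.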

~~~
jrussbowman
I like that a lot, will help with the problem mentioned above of spammers just
creating new accounts to some extent.

Another idea would be, once a URL is identified as spam, to filter it from ever
being posted, and to flag or suspend the account. Not sure how well Twitter can
support pre-commit hooks for the tweets though.
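
A pre-commit hook of that kind could look roughly like the Python sketch below. It is a toy under stated assumptions: the in-memory `blocked_urls` and `flagged_accounts` sets and the `pre_commit` function name are hypothetical, and nothing here reflects how Twitter actually stores or posts tweets.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")

# Hypothetical in-memory stores; a real service would persist these.
blocked_urls = set()
flagged_accounts = set()


def extract_urls(text):
    """Return every http(s) URL found in the tweet text."""
    return URL_PATTERN.findall(text)


def pre_commit(account, text):
    """Return True if the tweet may be posted, False if it is rejected."""
    if any(url in blocked_urls for url in extract_urls(text)):
        flagged_accounts.add(account)  # flag (or suspend) the poster
        return False
    return True
```

Exact string matching is the weak point: a URL shortener or a changed query string would slip past it, so a real filter would probably need to resolve and normalize links first.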

------
jfruh
Ironically, since this idea is on a Tumblr blog, I find Tumblr spam a lot more
annoying. I help run a fairly popular Tumblr w/4K+ followers and occasionally
I see that our posts are "liked" by users with names like
"freebatteriesonline"; clicking on their name doesn't even take you to a Tumblr
blog, just a dodgy battery sale site. The process of reporting spam Tumblr
blogs is not intuitive -- the only instructions I found on the subject were on
an eHow blog, for Pete's sake, and it's not clear to me whether I'm actually
reporting them for spam or just blocking them so I can't see them.

------
ry0ohki
So banning the accounts is the end result? Aren't these throwaway accounts
that can be recreated in a second? It seems like a reasonable concept to
filter the Tweets from getting to me in the first place, I'm guessing a "spam
box" is a little too complicated for Twitter though.

~~~
jrussbowman
I took the report approach with an assumption (which like all assumptions
could be extremely wrong) that pre-commit filters might not be something
Twitter could implement for performance reasons at this time.

The report approach could eventually help with making the identification of
spam easier, leading to more optimized approaches for filtering later.

------
pavel_lishin
I have an easier solution. Every time I get spam, I follow it through to the
account. All of the tweets are identical: @<someuser> <filler text here>
<identical url>

1. Scan all accounts.
2. If you notice the above pattern, flag them for review.
3. In addition to #2, if X people flag them for spamming, suspend their account
   as well.
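
The three steps above can be sketched in Python as follows. The regular expression, the minimum of three tweets, and the value chosen for X (`FLAG_LIMIT`) are assumptions for illustration only.

```python
import re

# "@<someuser> <filler text here> <identical url>" as a regular expression.
TWEET_PATTERN = re.compile(r"^@(\w+)\s+(.*?)\s*(https?://\S+)$")
FLAG_LIMIT = 3   # the "X" in step 3; the value is an assumption
MIN_TWEETS = 3   # assumption: do not judge accounts with very few tweets


def looks_like_mention_spam(tweets):
    """True if every tweet is '@user filler url' and all share one URL."""
    if len(tweets) < MIN_TWEETS:
        return False
    urls = set()
    for tweet in tweets:
        match = TWEET_PATTERN.match(tweet.strip())
        if match is None:
            return False
        urls.add(match.group(3))
    return len(urls) == 1


def review_action(tweets, user_flags):
    """Return 'suspend', 'review', or 'ok' for one account."""
    if looks_like_mention_spam(tweets):
        return "suspend" if user_flags >= FLAG_LIMIT else "review"
    return "ok"
```

An account whose tweets all match the pattern goes to review, and only the combination of the pattern and enough user flags leads to suspension, so user flags alone cannot suspend a normal account in this sketch.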

~~~
slig
It seems that their flagging system goes to /dev/null.

------
mattwdelong
I'm sure if Twitter really wanted to fix the spam problem, they probably
would/could have already. They seem to have the resources to tackle the
problem, which raises the question: do they really want to? I'm sure on the
business side of things they measure various metrics, such as account
creation and tweets per minute/hour/day, and they can then go to advertisers
and tout such metrics. Perhaps tackling spam would actually hurt their bottom
line? In the end, how badly does spam really affect you? I barely notice it,
myself.

In the end spam == more traffic == more $$.

------
evanjacobs
I think there may be a different approach: let spammers (aka bots) flourish
and let users upvote the most helpful ones.

In short:

1. Create a second type of Twitter account for bots.
2. Give users the ability to opt out entirely of seeing bot replies.
3. Give users the ability to upvote helpful automated replies.

I wrote a blog post with this suggestion a while ago:
[http://www.readwritehack.com/how-twitter-can-win-back-develo...](http://www.readwritehack.com/how-twitter-can-win-back-developers)

~~~
robtoo
I think you'll find that the spammers won't actually be that keen on
officially registering as a bot.

~~~
evanjacobs
Offering spammers a regulated way of reaching a mass audience also gives
Twitter the ability to be _much_ more aggressive in shutting down accounts who
pose as real users.

~~~
pavel_lishin
Isn't it a trivial task to automate twitter account creation? Shutting down
christinanik2883 because she didn't register as a Botified Citizen doesn't do
anything about jonnyi9032 or shellystack48799823.

------
ianlevesque
"heck, give me access to the resources where I can tap into the stream and
I’ll build it for you"

Another reminder that you're not using an open system, just someone's walled
garden.

~~~
jrussbowman
I live in a community governed by an HOA (even if I am on the board). I think
pretty much every facet of my life is a walled garden. Makes it easier to
accept.

~~~
ianlevesque
There's no doubt twitter's useful. I use it myself daily. But I'd argue
acceptance of the growing world of walled gardens is dangerous to innovation.
I was sad reading your article because if this wasn't about twitter you
could've written about how you SOLVED the spam problem, instead of begging a
large company to smile upon your idea and maybe try it out.

~~~
jrussbowman
Well, I suppose I could write a Twitter client that checked links and marked
tweets as spam before presenting them to the user.

Sorry, it's my nature to try and solve problems working around whatever
obstacles are in the way. So when your post made me think, well.... I could...
In no way mocking or disagreeing with your opinion which I do comprehend and
for a large part agree with.

By the same token, I also think a truly open system, while not impossible, is a
difficult proposition for financial reasons.

Also, from a security standpoint: if the system was open enough for me to
solve the spam problem, then it would also be open enough to make the spam
problem, or even virus/malware delivery, much easier.

------
eli
Most of the spam links I get go to porn sites or free iPod scam sites, not AFAIK
malware sites.

That said, this seems like a reasonable idea.

------
sc68cal
Just pull in the trending stream from twitter and pass it through Hadoop
Streaming (<http://hadoop.apache.org/common/docs/r0.20.0/streaming.html>).
This is the kind of work that MapReduce is made for.
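
For what that might look like, here is a small Python sketch of a mapper and reducer in the Hadoop Streaming style (lines in, tab-separated key/value lines out). It assumes one tweet per input line and counts how often each URL appears; the function names and the choice of URL counting as the job are assumptions, not something specified above.

```python
import re
from itertools import groupby

URL_PATTERN = re.compile(r"https?://\S+")


def map_lines(lines):
    """Mapper: emit 'url<TAB>1' for every URL in every tweet line."""
    for line in lines:
        for url in URL_PATTERN.findall(line):
            yield f"{url}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts per URL; input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for url, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{url}\t{sum(int(count) for _, count in group)}"
```

Under Hadoop Streaming, each function would be wrapped in a small script that reads `sys.stdin` and prints its results, and the two scripts would be passed as the `-mapper` and `-reducer` programs. URLs with unusually high counts would then be candidates for the spam checks discussed elsewhere in the thread.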

------
larrik
Not horrible, I guess, although I would have it warn and/or close the accounts
automatically, and then have employees review appeals (ideally very quickly).

Otherwise I don't think that "initial surge" you mentioned would ever actually
subside.

~~~
jrussbowman
Maybe have one threshold for review, and another threshold for automatic
suspension and review?
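
A two-tier policy like that is only a few lines of Python. The threshold values and the action names below are placeholders, not numbers anyone in the thread proposed.

```python
REVIEW_THRESHOLD = 5     # assumption: reports needed before staff review
SUSPEND_THRESHOLD = 20   # assumption: reports needed before automatic suspension


def action_for(report_count):
    """Map an account's report count onto the two-tier policy."""
    if report_count >= SUSPEND_THRESHOLD:
        return "suspend_and_review"
    if report_count >= REVIEW_THRESHOLD:
        return "review"
    return "none"
```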

------
mchusma
I thought twitter essentially was spam. Just everyone spammed everyone so it
was transparent.

------
jrussbowman
Any other ideas to flesh out the solution are welcome.

~~~
jrussbowman
I updated the blog post with other ideas as well.

------
ddemchuk
Twitter is inherently a spammer's heaven. It's really easy to make 140
characters seem legitimate. An hour of writing spun content and a really
easy-to-build signup bot and you can have full conversations, retweets,
everything.

There's a reason twitter gets spammed so much, and it's because most of the
time, normal people's 140 character messages look just as spammy as the actual
spam.

