

The earthquake that killed Twitter? Spam makes the utility useless - seaicethoughts
http://seaicethoughts.blogspot.com/2011/06/earthquake-that-killed-twitter.html

======
hopeless
Twitter's inability to deal with spam accounts is quite bewildering. Tweeting
the same link over and over again? Spam. Follow hundreds and no one follows
back? Spam. A high percentage of blocks/spam reports? Spam. Every tweet
contains a link? Spam. Does anyone I follow, follow this person? Perhaps not
spam but not a good start either
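
As a rough illustration, the checks listed above could be combined into a single function. This is only a sketch: the `Account` fields, the function name `spam_signals`, and every threshold (200 followed accounts, 5% follow-back, 3 repeats, and so on) are assumptions made up for the example, not anything Twitter is known to use.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # All fields are hypothetical; they mirror the signals named in the comment.
    following: int
    followers: int
    tweets: list                   # recent tweet texts
    reports: int                   # blocks plus spam reports received
    followed_by_my_network: bool   # does anyone I follow also follow this account?


def spam_signals(acct):
    """Return the names of the heuristics that fire for this account."""
    signals = []
    links = [w for t in acct.tweets for w in t.split() if w.startswith("http")]
    # Tweeting the same link over and over (3+ times is an arbitrary cutoff).
    if links and max(links.count(link) for link in set(links)) >= 3:
        signals.append("repeated_link")
    # Follows hundreds and almost no one follows back.
    if acct.following >= 200 and acct.followers < acct.following * 0.05:
        signals.append("no_follow_back")
    # High share of blocks/spam reports relative to audience size.
    if acct.reports > max(acct.followers, 1) * 0.5:
        signals.append("heavily_reported")
    # Every tweet contains a link.
    if acct.tweets and all("http" in t for t in acct.tweets):
        signals.append("all_links")
    # Weak signal only: nobody I follow follows this account.
    if not acct.followed_by_my_network:
        signals.append("outside_my_network")
    return signals
```

Returning the list of signals rather than a yes/no answer leaves it to the caller to decide how many of them are enough to act on.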

~~~
yuvadam
I'm not sure the heuristics are that simple.

The signal-to-noise ratio in Twitter is already low as it is. Most tweets
contain links, lots of users are followers but are not followed.

Sure, combining several clever heuristics can reduce spam levels, and no doubt
Twitter must attack this aspect if it wants to stay relevant.

But can you put your finger on what exactly constitutes spam in a platform
that is so noisy?

~~~
hopeless
It would require more than 30secs thought but I'm sure there are some
heuristics which will work. I'm not even convinced the "report spam" button is
connected to any action on twitter's server. And why not use blocking as an
indicator? Surely if Person A @-replies several people, and they block A, then
just disable @-replies from A.

Has anyone tried using a Bayesian filter for twitter spam? These have been
very successful for email. In fact, I consider email spam a solved problem now
(thanks Gmail!)
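
For anyone curious what a Bayesian filter for tweets might look like, here is a minimal naive Bayes sketch with add-one smoothing. The class name, the whitespace tokenizer, and the idea of training on hand-labelled tweets are all assumptions for the example; a real filter would need far more data and better features.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Toy word-based naive Bayes classifier with two labels: 'spam' and 'ham'."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        # Requires at least one training example for each label.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Start from the log prior for this label.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in text.lower().split():
                # Add-one (Laplace) smoothing so unseen words do not zero the score.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
            log_scores[label] = score
        # Turn the two log scores into a normalized probability of spam.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])
```

Usage would be a handful of `train(text, "spam")` and `train(text, "ham")` calls followed by `spam_probability(new_tweet)`, with the cutoff for hiding a tweet left to the user.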

It's for things like this that I wish I could insert a proxy between my
twitter clients and twitter itself and build my own rules/spam filter.

~~~
lurker19
Sounds like you would prefer a communications medium not running a proprietary
protocol controlled by a single company whose business model relies on their
ability to make sure you cannot block unwanted content.

------
mtkd
I want to bias my Google SERPs, Twitter search etc. with my social graphs -
linkedin, twitter, facebook - like I do offline.

Sometimes I may still want to allow anonymous (to me) signals to influence
what I see, but generally I only want to see content/recommendations that
people 2 or 3 degrees of separation from me have given some positive signal
for.
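
A sketch of the "2 or 3 degrees of separation" idea, assuming the social graph is available as a plain dictionary of contacts and each search result carries a set of people who endorsed it. Both data shapes and the function names are hypothetical.

```python
from collections import deque


def within_degrees(graph, me, max_degrees):
    """Breadth-first search: everyone reachable from `me` in at most `max_degrees` hops."""
    distance = {me: 0}
    queue = deque([me])
    while queue:
        person = queue.popleft()
        if distance[person] == max_degrees:
            continue  # do not expand past the allowed distance
        for contact in graph.get(person, ()):
            if contact not in distance:
                distance[contact] = distance[person] + 1
                queue.append(contact)
    distance.pop(me)
    return set(distance)


def filter_results(results, graph, me, max_degrees=3):
    """Keep only results endorsed by someone inside my extended network."""
    trusted = within_degrees(graph, me, max_degrees)
    return [r for r in results if trusted & r["endorsed_by"]]
```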

Too much of my day is filtering noise.

~~~
markkat
We just need a knob. :) -I'm going to dial the internet back to 4 for awhile.

Our online personas treat us like those stupid parties where you have to wear
a "Hello, my name is:" sticker. IMHO, every good service is a fragment of your
life, and doesn't try to be the entirety of it. That's how the offline world
works, anyway.

------
ahrens
Until Twitter sorts this out themselves, there is a need for something like a
browser add-on that hides the following from hashtag searches: new accounts,
Retweets, Accounts with less than 10 followers, Accounts that often tweet
trending topics, Tweets with more than one trending hashtag
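
The hiding rules above could be sketched as one predicate. The `Tweet` fields, the 7-day cutoff for a "new" account, and the 50% cutoff for "often tweets trending topics" are assumptions for illustration; only the 10-follower limit and the "more than one trending hashtag" rule come from the list itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Tweet:
    # Hypothetical shape of a search result; not a real Twitter API object.
    text: str
    is_retweet: bool
    author_created: datetime
    author_followers: int
    author_trending_ratio: float  # share of the author's recent tweets using trending tags


def should_hide(tweet, trending, now, min_age=timedelta(days=7)):
    """Return True if the tweet matches any of the hiding rules."""
    tags = {w.strip(".,!?:;").lower() for w in tweet.text.split() if w.startswith("#")}
    return (
        now - tweet.author_created < min_age   # new account
        or tweet.is_retweet                    # retweet
        or tweet.author_followers < 10         # fewer than 10 followers
        or tweet.author_trending_ratio > 0.5   # often tweets trending topics
        or len(tags & trending) > 1            # more than one trending hashtag
    )
```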

------
laserhase
I noticed the same thing last year when I was stranded in Barcelona during the
Icelandic volcanic eruption. Then, people were looking for ridesharing, free
rooms, news, etc. under the hashtag #ashtag. Not as severe as an earthquake,
but a similar mechanism blocking twitter's utility.

Then the problem wasn't spam, but #ashtag retweet avalanches from some well-
followed celebrities. The most egregious example I remember: this guy Paulo
Coelho in Brazil, retweeted through pages and pages of search results (the
source: <https://twitter.com/#!/paulocoelho/status/12399786645>). I'm pretty
sure Justin Bieber said something too, so that was it for #ashtag.

This would probably have been easier to deal with than spam, because people
weren't actively trying to game them. Twitter just needed to aggregate some
information and make it blockable, e.g. "don't show me (re-)tweets with this
text anymore". As a sometimes-user, I still don't see a straightforward way to
do something like this in the clients I use. So now the spam angle is not a
surprise at all.

~~~
robtoo
I just add "-RT" to my twitter searches.

------
Tichy
Wouldn't the spammers be mostly new accounts, as the old ones would be blocked
already?

So I propose the heuristic: new account+trending topic => spam

~~~
semanticist
Professional spammers will just keep a 'stock' of created-but-unused accounts
and let them 'mature' before using them for spam.

~~~
Tichy
True - I guess no side can win this game in the long run :-(

------
tibbon
Things have gotten much worse on this. Years ago, I never got spam @ replies
on Twitter. Now just about anything I say triggers some shit about a free iPad
or ebook being messaged to me. I guess the target has gotten large enough that
the spammers have really gotten onboard heavily.

I wasn't in the very first batch of people using Twitter (sxsw 2006), but I
was on there shortly after, and now I barely care about it anymore. It's an
annoyance. A tool. Worse than email with few real benefits. At this point it's
just back to texting the 100 or so people who I really want to stay in touch
with instead of tweeting at them like I was doing for a few years.

------
jrockway
The problem is that Twitter has a little "list of topics that will be most
profitable to spam" on the side. It's not surprising, then, that those
topics get a lot of spam.

------
tatsuke95
I'm hearing a lot of simple, interesting solutions for the spam problem. But
Twitter has to be doing things like this, don't they?

The main issue is that any of these methods will hurt marginal users. If you
fall into the false-positive pool, you likely don't use Twitter much, and may
drop off entirely if your account gets flagged. Twitter can't afford to lose
those users, or it will seriously hinder their reach. How valuable is Twitter
if it contains a bunch of super users Tweeting at each other?

(this all beside the point that any of these methods will kill their "Tweets
per Day" and "Total Users" metrics)

------
utunga
@wordsontheweb: "Credit where credit's due, @twitter appears to have cleaned
out the #eqnz spam"
<http://twitter.com/#!/Wordsontheweb/status/80183699127275521>

I think she's right.

------
rmc
Why not have a PageRank for twitter accounts? Essentially if I retweet
something you say, that's the same as a link from one website to another
(other heuristics: person A replies/@mentions B, or person A follows B, or
person A follows B and vice-versa).
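
A minimal sketch of that idea: plain PageRank by power iteration, where an edge `(retweeter, author)` plays the role of a link from one site to another. The function name, the damping factor of 0.85, and the toy graph in the usage note are assumptions; this is the textbook algorithm, not anything Twitter or Google is known to run on tweets.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Rank accounts from (retweeter, author) pairs; ranks sum to 1."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every account gets a small baseline share of rank.
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # Accounts that retweet nobody spread their rank evenly over everyone.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank
```

On a toy graph where three accounts retweet `news`, `news` retweets one of them, and two spam accounts only retweet each other, `news` comes out on top. The two spam accounts gain nothing from each other, but they still outrank accounts that nobody retweets, so the retweet-farm concern is not solved by this alone.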

~~~
ahrens
It might work for a while but it would need to be more advanced to work. If
not, the perps will just set up accounts that retweet each other. It would
also have to measure whether the retweeting account has good karma as well; if
not, don't count it. But hey, let the arms race begin!

~~~
rmc
Google has the exact same problem. Surely one could borrow some of their
insights, e.g. to detect 'retweet farms'. If all the spammers retweet each
other, they'll still be on a PageRank of 1. No matter what you do, there'll be
people who try to hack it. Twitter (and all organisations in this problem) can
only keep trying to make things better.

------
utunga
Cross posting here in case it can help connect with someone that might be able
to help: The problem with the geographical filtering is that very, very few
tweets sent about the subject of the earthquake are geotagged, and as far as I
can tell that is what Twitter uses for the 'near:' filter.

For what it's worth, I set up this site <http://chchneeds.org.nz> immediately
after the last big earthquake, and it is still operating today. As a first
step to finding tweets relevant to chch earthquake and filtering out spam it
does also allow for filtering by address when people mention an actual address
in the tweet.

For example if they say 'at 23 Maidavale lane' or 'in Sydenham' it tags those
tweets as being in a particular suburb and in the region of canterbury. For
example: <http://chchneeds.org.nz/#!/loc/canterbury>
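
A rough sketch of how this kind of tagging could work, using two regular expressions and a tiny suburb lookup table. This is not the chchneeds code: the patterns, the `KNOWN_SUBURBS` table, and the function name are assumptions made up for the example.

```python
import re

# Hypothetical gazetteer mapping suburb names to their region.
KNOWN_SUBURBS = {"sydenham": "canterbury", "riccarton": "canterbury"}

# "at <number> <one to three words> <street type>", e.g. "at 23 Maidavale lane".
STREET_RE = re.compile(
    r"\bat\s+(\d+[a-z]?\s+(?:[A-Za-z]+\s+){1,3}?"
    r"(?:street|st|road|rd|lane|ln|avenue|ave|drive|dr))\b",
    re.IGNORECASE,
)
# "in <word>", checked against the gazetteer afterwards.
SUBURB_RE = re.compile(r"\bin\s+([A-Za-z]+)", re.IGNORECASE)


def tag_location(text):
    """Return any address, suburb, and region found in the tweet text."""
    tags = {}
    street = STREET_RE.search(text)
    if street:
        tags["address"] = street.group(1)
    for match in SUBURB_RE.finditer(text):
        name = match.group(1).lower()
        if name in KNOWN_SUBURBS:
            tags["suburb"] = name
            tags["region"] = KNOWN_SUBURBS[name]
            break
    return tags
```

Checking "in <word>" against a known list keeps phrases like "in traffic" from being tagged as places, at the cost of missing any suburb not in the table.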

I had grander ideas for this but it was all about people in canterbury
choosing to actively use the tags #offer and #need. When I realised that
eq.org.nz and their volunteers were doing a better job of filtering and
sorting (and getting their message out) than I was, I put my time into helping them.

At this time eq.org.nz has been shut down as it does use quite a lot of
volunteer hours to keep going, but everyone and anyone is welcome to try and
use the data from chchneeds in any way that they think may help. It's all
open data made publicly available of course. I'm @chchneeds on twitter if
you want to get in touch with me.

Also, for what it's worth, I set up a similar service for the Japanese
earthquake, but was overwhelmed by the amount of data, and I wasn't keeping up
so I had to shut it down. The new rate limiting rules from twitter don't help
much with this.

~~~
utunga
Just want to add it's really such a goddamn shame that spammers are doing this
- basically polluting the public resource of ad-hoc hash tags with their
probably automated choosing of popular hash tags. I do hope twitter finds a
way to deal with this.

------
username3
Twitter could add an option to fill reCAPTCHA with tweets. Updates posted with
some human verification could have higher weight.

