

Facebook's learning defense system against attacks [pdf] - Revisor
http://research.microsoft.com/en-us/projects/ldg/a10-stein.pdf

======
Revisor
As I run a forum and we assess all threats manually (a bunch of moderators
watching over the forum and reacting to flagged posts), this is an interesting
view of what the moderation looks like when it's automated.

What caught my eye:

 _Threats to the social graph can be tracked to three root causes. These are
compromised accounts, fake accounts, and creepers.

Our earlier phishing classifiers made heavy use of features on IP and
successive geodistance. Attackers have responded by using proxies and botnets
to log in to their compromised inventory. Malware is a tough problem because
the attacker is operating from the same machine as the legitimate user, so IP
does not provide signal. To combat malware, the most effective mechanism we
have discovered is to target the propagation vector using user feedback.
Attackers can also try to game user feedback features. That is combatted with
reporter reputation and rate limits.

Chain letter volume can explode when spread using the powerful viral channels
of Facebook. In the past, they have been observed to reach 1-5% of total user
communications in minutes.

Chain letters exploit social engineering to trick otherwise well-behaved
Facebook users into propagating the attack. As with other creeper attacks, the
best long-term answer is education. In the short-term other mechanisms can be
used against chain letters specifically. For example, fuzzy n-gram matching or
other forms of locality-sensitive hashing on text.

Like users, attacks use many different channels. For the system to be
effective it must share feedback and feature data across channels and
classifiers._
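
To make the "fuzzy n-gram matching" idea in the quote concrete, here is a minimal sketch of how near-duplicate chain letters could be flagged. It is not the paper's method: it uses plain Jaccard similarity over word 3-grams rather than a real locality-sensitive hashing scheme such as MinHash, and the function names and the 0.5 threshold are made up for illustration.

```python
import re


def shingles(text, n=3):
    """Return the set of word n-grams in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < n:
        # Short messages get a single shingle holding all their words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def looks_like_known_chain_letter(message, known_letters, threshold=0.5):
    """True if message shares enough n-grams with any known chain letter.

    The threshold is an illustrative assumption, not a tuned value.
    """
    msg = shingles(message)
    return any(jaccard(msg, shingles(k)) >= threshold for k in known_letters)


known = ["Forward this message to ten friends or your account will be deleted tonight"]
variant = "forward this message to ten friends or your account will be deleted tomorrow!!!"
unrelated = "Does anyone know a good pizza place near the station"

print(looks_like_known_chain_letter(variant, known))
print(looks_like_known_chain_letter(unrelated, known))
```

Comparing every message against every known letter does not scale; that is the gap hashing schemes like MinHash are meant to fill, by bucketing similar texts so that only likely matches are compared.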

~~~
Revisor
This is also interesting:

 _The decision about how and when to respond can depend on business or policy
considerations. For example, an action in one region might be more creepy or
undesirable than in another region. Another example would be applying a more
aggressive spam classifier to pages depending on their admin preferences.
Business logic or policies of this form do not belong in learned models and
would only damage their performance._
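
One way to picture that separation is a sketch where the learned model only produces a score and a separate policy table decides what to do with it. Everything here is hypothetical: the `Policy` class, the preference names, and the thresholds are illustrative assumptions, not anything described in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Business rule: the spam score at or above which content is blocked."""
    block_threshold: float


# Hypothetical policy table keyed by a page admin's preference.
DEFAULT_POLICY = Policy(block_threshold=0.9)
POLICIES = {
    "strict": Policy(block_threshold=0.6),
    "lenient": Policy(block_threshold=0.95),
}


def decide(spam_score, admin_preference=None):
    """Apply policy to a model score; the model itself never sees the policy."""
    policy = POLICIES.get(admin_preference, DEFAULT_POLICY)
    return "block" if spam_score >= policy.block_threshold else "allow"


print(decide(0.7, "strict"))
print(decide(0.7))
```

The same score of 0.7 leads to different actions depending on the preference, so a policy change needs no retraining and the classifier's training data stays free of business rules.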

