If a person registers as an American while located in the Philippines, for instance, Facebook could tighten its security checks and ask for more details.
Seems to me that ultimately Facebook needs these fake profiles so it can inflate its user base and its ad revenue. Isn't that what people call "growth hacking"?
Interesting article nevertheless. I don't condemn what these folks overseas are doing. It's not hacking or extortion; it's obviously some form of spam and against Facebook's TOS, but I'd rather see people working in these "digital sweatshops" than selling drugs, working in brick factories, or turning to prostitution.
It's more a matter of long-term vs. short-term incentives. Focusing on the short term might lead a fly-by-night startup to boost user counts unscrupulously, but FB is focused on long-term, sustainable growth.
You say it isn't true, then describe the scenario that is actually happening today, as described in this article: a lower-quality experience for users and advertisers. The problem exists because of misaligned incentives. Sure, if advertising and user growth plummet because of the problem, Facebook will be forced to do something. But it's happening now, and little is being done.
The spam accounts are not new. I worked for a relatively successful gaming company during the social gaming goldrush a few years ago. While doing some research into cheating in our game, we looked at trying to determine players who were using multiple accounts. We figured that close to half of the daily user accounts were fake. Half.
Hell, each of us had multiple accounts ourselves, since we did a lot of testing and market-research, and didn't want to inundate our real-life streams with spammy game updates.
Of course, when the value of your company rests on those DAU figures, no one is going to talk about it. You just report what the dashboard says.
A SIM card is not a national identity scheme. Also, caller ID is entirely spoofable - this is something of a problem with fake SWAT calls.
From the readysim site:
>> For military and law enforcement personnel engaged in covert or undercover activities, Ready SIM offers voice, text, and even email communications.
But that makes it easy for the terrorists too, to do covert operations and hide their identities.
Also, it probably only works _in_ the US, even though you can order it _from_ anywhere.
Even then, as with everything telecom-related, every operator does things slightly differently. They would need to reimplement it for every single operator, and even then I doubt it would be 100% accurate, especially once you take roaming into account.
I'm curious how they can do this so cheaply though. Maybe they're using a sketchy carrier that FB could flag for further review.
He says he's able to show ads that violate FB's terms of service by selectively showing a different ad to people from within FB's network, so that they don't cancel the account. If they do find out, everything is done via anonymous credit cards, so no big deal, just rinse and repeat.
I wonder if Facebook is working on such an update, one that would squash most of these click farms. I would think, given their resources, they would be. The fact that they haven't yet leads me to believe they consider this a necessary evil and willingly allow it to continue.
Judging from the anecdotal number of blatantly fake accounts I've reported to Facebook and seen nothing done about... I'd have to say they don't "care" all that much beyond maintaining plausible deniability. Accounts, likes, clicks, and views are all currency to Facebook, which it trades to prospective companies in exchange for real money.
Say FB gets 10 individual, separate reports of one specific account being fake. Let's assume this is some sort of unusual "spike." They then have an investigator manually check the account and verify whether it is real. If X people report it as fake, it's likely a fake account, so human intervention is required to confirm.
Then, stage two. After taking action, they note that those 10 individuals who reported this account matched the outcome of the human investigator. We could then infer that those accounts are:
A. Less likely to be fake themselves. And
B. More reliable signals of fake accounts.
So, next time around, perhaps the cumulative or "enhanced" scoring of those accounts' reports requires a lower threshold for human intervention.
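That feedback loop can be sketched in a few lines. Everything here is hypothetical - the function names, weight range, and threshold are made up for illustration, not anything Facebook actually uses:

```python
# Hypothetical sketch: weight fake-account reports by each reporter's
# past accuracy, and escalate to human review once the weighted sum of
# reports crosses a threshold. Reporters with a good track record
# count for more, so fewer of their reports are needed.

def reporter_weight(confirmed, total):
    """Reliability weight in [0.5, 2.0] from a reporter's history."""
    if total == 0:
        return 1.0  # reporters with no history get a neutral weight
    accuracy = confirmed / total
    return 0.5 + 1.5 * accuracy  # accurate reporters count up to 2x

def needs_human_review(reports, threshold=10.0):
    """reports: list of (confirmed, total) histories, one per reporter."""
    score = sum(reporter_weight(c, t) for c, t in reports)
    return score >= threshold

# Six reports from reporters who were right 90% of the time trip the
# threshold; six reports from unknown accounts do not.
trusted = [(9, 10)] * 6   # weighted sum = 6 * 1.85 = 11.1
unknown = [(0, 0)] * 6    # weighted sum = 6 * 1.0  = 6.0
print(needs_human_review(trusted))  # True
print(needs_human_review(unknown))  # False
```

The design choice is just a weighted vote: verified outcomes feed back into the weights, so the system gradually needs less human time per takedown.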
A cynical view would be: why are fake accounts bad for Facebook? These fake accounts help boost the number of users they report every quarter. These accounts don't use any server-side resources. They just sit around generating "likes", and with each "like" Facebook hears a cha-ching!
The argument presented in the article is that they do:
>These fake likes weren’t just an empty number. Whenever Second Floor Music posted content, Facebook’s algorithms placed it on the newsfeeds of a small, random sample of fans—the people who had liked Second Floor Music—and measured how many “engaged” with the content. High levels of engagement meant that the content was deemed interesting and redistributed to more fans for free—the main goal of most businesses that use social media is to reach this tipping point where content spreads virally. But the fake fans never engaged, depressing each post’s score and leaving it dead on arrival. The social media boost Bronstein had paid for never happened. Even worse, she now had thousands of fake fans who made it nearly impossible to reach her real fans. Bronstein struggled to get help from Facebook, reaching out repeatedly through help forums, but, in the end, she scrapped the original page and started again from scratch. Second Floor Music had effectively paid to ruin one of its flagship Facebook pages.
Sounds like a business opportunity. Fake fans who actually engage with content. They could even post fake comments and falsely claim they are planning to attend events. What a wonderful future we've built for ourselves.
We did not buy likes and I am not concerned with how many people like our page. I'm concerned with our engagement rate and reach -- which is (now) phenomenal. In the campaigns I have administered (after the incident that the article reported), I've created carefully targeted campaigns with high-quality copy. The New Republic article, however, discusses our first campaign, which was unfortunately low-quality and poorly targeted. Instead of learning that we hadn't created a good ad by seeing bad engagement, we learned by having our page overrun with fake likes who could not be removed. That's not a typical response for an ad buy, which is a real problem with Facebook's system.
Thanks for reading the story!
Rachel Bronstein, Web Editor, Second Floor Music
The problem with Facebook:
> From January 2013 to February 2014, a global team of researchers from the Max Planck Institute for Software Systems, Microsoft’s and AT&T’s research labs, as well as Boston and Northeastern Universities, conducted an experiment designed to determine just how often advertising campaigns resulted in likes from fake profiles. The researchers ran ten Facebook advertising campaigns, and when they analyzed the likes resulting from those campaigns, they found that 1,867 of the 2,767 likes—or about 67 percent—appeared to be illegitimate. After being informed of these suspicions, Facebook corroborated much of the team’s work by erasing 1,730 of the likes. Sympathetic researchers from a study run by the online marketing website Search Engine Journal have suggested that targeted Facebook advertisements can yield suspicious likes at a rate above 50 percent. In the fall of 2014, Professor Emiliano De Cristofaro of the University College of London presented research which found that even a page explicitly labeled as fake gained followers—the vast majority presumably bots.
> The bot buildup can even affect companies that aren’t advertising with Facebook, but are just passively hoping their pages gain real fans. In 2014, Harvard University’s Facebook fans were most engaged in Dhaka, Bangladesh. (They stated that they did not pay for likes.) A 2012 article in The New York Times suggested that as much as 70 percent of President Obama’s 19 million Twitter followers were fake. (His campaign denied buying followers.) Less prominent pages from across the world—from those belonging to the English metal band Red Seas Fire to international bloggers—have been spontaneously overwhelmed by bots that are attempting to mask their illicit activity by glomming on to real social media profiles.
Certainly social media profits short term from these schemes... as long as they don't get out of control and threaten the entire business model. So the cynic in me suggests Facebook and the like aren't actually interested in eliminating the scams entirely but rather keeping them managed within certain parameters while appearing to be trying to do everything they can to shut them down.
If this were my project to investigate, I'd want a random sampling of, say, new account registrations from within the past month or three. A 100-1,000 profile sample would be sufficient to get a strong estimate of the mean, based on scoring of accounts based on some sort of activity profile.
Remember: what's most critical in data analysis isn't the size of the sample but the selection of it. Facebook has direct access to its own data and could ensure that any such sample was in fact random. Further Monte Carlo re-sampling of a larger sample (e.g., subsets of the 1,000-account set) could reveal possible non-randomness.
Other identifiable parameters, especially clustering of registrations through proxies, would also be generally determinable.
Note that the decrease in uncertainty from increasing a sample size (and hence analysis costs) 10x is only about 3.2x - your standard error decreases with the square root of the sample size. For a normal distribution, a sample of only 30 is generally considered sufficient for "large-sample" methods. Most national opinion and political surveys are based on samples of a few hundred people. The real cost comes from the fact that once you've selected your sample, you must make repeated contact attempts to reach everyone identified for inclusion.
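The square-root relationship above is easy to verify numerically. A quick sketch, assuming the worst-case proportion of 50% fake accounts (the choice of p = 0.5 maximizes the standard error):

```python
# Standard error of an estimated proportion shrinks with sqrt(n), so a
# 10x larger sample only cuts uncertainty by sqrt(10) ~ 3.16x.
import math

def std_error(p, n):
    """Standard error of a proportion p estimated from a sample of n."""
    return math.sqrt(p * (1 - p) / n)

p = 0.5  # worst-case (maximum-variance) proportion of fake accounts
se_100 = std_error(p, 100)
se_1000 = std_error(p, 1000)

print(round(se_100, 4))            # 0.05   -> +/- ~10% at 95% confidence
print(round(se_1000, 4))           # 0.0158 -> +/- ~3%
print(round(se_100 / se_1000, 2))  # 3.16, i.e. sqrt(10)
```

This is why a 1,000-profile sample is plenty for estimating the overall fake-account rate: pushing to 10,000 buys surprisingly little extra precision.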
By contrast, self-selected surveys, and most particularly online surveys and popularity polls, are highly susceptible to sampling bias. Most often, they're not analysis tools but contact-selection or lead-generation tools. They may have limited utility in identifying issues of concern, but should not be relied on in ranking them absent further research.
The only thing I didn't catch was how they were to figure out which of the accounts in the sample were fake.
Fake accounts are almost certainly going to have characteristics that distinguish them from legitimate ones, and a quantitative, manual, or combined assessment could likely identify them. Points to consider:
* Photo images. Search for duplicates or analyze for manipulation.
* Correlation with other data sources. Real-name, Social Security, marketing, and numerous other databases exist that tend to point to more legitimate profiles.
* Direct contact. Set up events, meet-ups, special purchase offers, etc., or otherwise try to elicit direct action. Even statistically differential response rates are useful.
* Social graph. Real people _interact_ with other real people, and there's a web of trust or certification that can be imputed or determined.
* Network profiles. Identifying large numbers of accounts with similar or suspicious profiles _and_ originating from the same or adjacent network addresses (CIDR/BGP block), or from known non-residential space that's _not_ a generally-used proxy, would be a strong indicator. If proxy use continues to climb, that will become less useful (and I suspect it will).
Tests along these or similar lines could be used to generate more general scoring algorithms for identifying suspect accounts.
Once you've done the manual assessments, deriving heuristics for broader estimates is also possible. The key is having training data of known, classified accounts.
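The simplest version of such a heuristic is a weighted sum of the signals listed above. A sketch - the signal names and weights here are invented for illustration; in practice you'd fit them against the manually labeled training data:

```python
# Hypothetical scoring sketch: combine boolean per-account signals
# (drawn from the checks discussed above) into one suspicion score.

SIGNAL_WEIGHTS = {
    "duplicate_photo": 3.0,      # profile photo found elsewhere on the web
    "no_external_match": 1.5,    # no hit in outside data sources
    "no_event_response": 1.0,    # never responds to direct contact
    "isolated_graph": 2.0,       # no interaction with real accounts
    "suspicious_netblock": 2.5,  # clustered registrations, same CIDR block
}

def suspicion_score(signals):
    """signals: dict mapping signal name -> bool for one account."""
    return sum(w for name, w in SIGNAL_WEIGHTS.items() if signals.get(name))

def is_suspect(signals, threshold=4.0):
    return suspicion_score(signals) >= threshold

account = {"duplicate_photo": True, "isolated_graph": True}
print(suspicion_score(account))  # 5.0
print(is_suspect(account))       # True
```

With labeled accounts in hand, the hand-tuned weights could be replaced by logistic regression or any standard classifier; the structure stays the same.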
You'd be better off with something like TinEye or Google Image Search: a system that snarfs all the photos off the web, does this regularly, and timestamps where it first saw each photo. It wouldn't be 100% accurate, but it would likely work better than the cooperative solution you propose, and it's something that is well within Facebook's capabilities.
Even then... yeah, I think this sort of thing is going to be pretty much impossible to completely stamp out as long as you can sell fake profiles for enough money that you can pay an actual human to create them.
Also, images uploaded to Facebook aren't public by default, so TinEye etc. wouldn't have them.
Of course, I haven't fully thought out how this would work; maybe it would apply just to people who are new to FB? You'd also need a sensible image-recognition algorithm, or you'll get a lot of bad matches on cropped head shots. But I'm sure FB has staff for that sort of thing.
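For the duplicate-photo part, a perceptual hash is the usual trick. A minimal sketch of a difference hash ("dHash"), one common technique behind TinEye-style matching; it assumes the image has already been converted to grayscale and downscaled to 8 rows by 9 columns (real code would do that with an imaging library):

```python
# Difference hash ("dHash"): compare each pixel to its right neighbor,
# producing a 64-bit fingerprint that survives resizing and uniform
# brightness changes. Near-duplicate photos yield small Hamming distances.

def dhash(pixels):
    """pixels: 8x9 grid of grayscale values; returns a 64-bit int."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits

def hamming(a, b):
    """Number of differing bits; small distance = likely the same image."""
    return bin(a ^ b).count("1")

# A toy "image" and a uniformly brightened copy of it.
original = [[(r * 9 + c) % 256 for c in range(9)] for r in range(8)]
brightened = [[min(p + 40, 255) for p in row] for row in original]

# Brightening every pixel equally leaves the left/right comparisons,
# and hence the hash, unchanged.
print(hamming(dhash(original), dhash(brightened)))  # 0
```

A cropped or re-encoded copy would move a few bits, so in practice you'd flag pairs below some Hamming-distance cutoff rather than demand an exact hash match.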