Which is why I severely curtailed my FB use around that time.
None of that data is actually particularly scary by itself, although I wouldn't enjoy the thought of it all being out there.
What Cambridge Analytica was bragging about was their big data model promising to distill users' data to a psychographic profile.
There's this story about 30 "likes" (of books, artists, movies, etc.) being enough to predict people's sexual orientation with higher accuracy than the coworkers sharing a desk with someone.
The wide publicity for such "big data" methods in recent years may have contributed to the change in public opinion, and may be (part of) the reason Facebook is only now getting its 15 minutes of infamy.
In our circle around that time people were calling out apps that required these permissions and recommending that users not authorize them if it was not obvious why the app would require that kind of data. John Gruber comes to mind as one of those people. These calls for caution were not in the mainstream though.
Edit: I believe this API contained info like favorite tv show or band if those fields were filled out in your profile, but again I’m fairly sure that your own privacy settings could have limited this if you had them set to draconian levels. So in other words, yes this API was limited to publicly available data, more or less.
The worst, for me, was posting on behalf of other people and getting replies to your app like "I'm so glad to hear from my favorite grandson after so long!".
I feel all this discussion around CA is nonsense, because everybody knew this was possible and we are arguing about it only now that someone big took advantage. It's like leaving your door wide open and then finding that someone stole your stuff. On the other hand, a more limited API would be unusable for fair purposes.
At that time, whenever we released an app, users could log in to it using OAuth, which means they were presented with a list of privileges our app needed from them (e.g. friends list, photos, posts, etc.). Once a user had authorised the app, we could fetch all of this data.
This was how Facebook worked at the time. You can’t call it a data leak, because we explicitly asked the user for permission. You basically say, I want to use your app, here is my profile if you need it. I don’t really understand why people are so mad about their data privacy. If you publish your photos, list of friends, what you like, where you live and work, who you are married to, then it shouldn’t be a surprise that this data can be viewed not only by your neighbour but also by dodgy automated scripts. Once the data is fetched, you can only imagine what people can do with it. It’s not really Facebook’s fault. It’s people who think that when they publish things on the Internet, they’re safe and can only be viewed by other people.
Maybe Facebook’s only role should be to make people more aware of all of this, but is it in their interest? I don’t think so.
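For readers who never saw that consent flow from the developer side, here is a minimal sketch of how an app might build the authorization URL listing the privileges it wants. The endpoint, client ID, redirect URI and scope names are hypothetical placeholders, not Facebook's actual values; only the query parameter names (`client_id`, `redirect_uri`, `response_type`, `scope`) come from the general OAuth 2.0 flow.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical values for illustration only; not Facebook's real endpoint or scope names.
AUTH_ENDPOINT = "https://auth.example.com/dialog/oauth"
CLIENT_ID = "my-app-id"
REDIRECT_URI = "https://myapp.example.com/callback"


def build_authorization_url(scopes):
    """Return the URL the user is sent to in order to approve the requested scopes."""
    params = {
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "response_type": "code",
        # OAuth 2.0 defines scope as a space-delimited list; some providers use commas.
        "scope": " ".join(scopes),
    }
    return f"{AUTH_ENDPOINT}?{urlencode(params)}"


url = build_authorization_url(["friends_list", "photos", "posts"])
# Parse the URL back to show exactly which privileges the user is being asked for.
requested = parse_qs(urlparse(url).query)["scope"][0].split(" ")
```

Everything the app can later fetch is bounded by that `scope` list, which is why the wording of the consent screen mattered so much.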
I don't understand why I keep reading comments like this. One of the main issues is that your data could be leaked to an app developer even if just ONE of your friends installed said app. So even if you diligently made sure only your friends, or even particular friends, could see your stuff, it'd still be accessible to the app developer.
That is absolutely not something even a privacy conscious person would've expected, and absolutely enough to get mad about.
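To make the friend-of-installer point concrete, here is a toy sketch. It assumes a simplified model in which an app can read each installing user plus all of that user's friends; the names and the graph are made up, and the real API had more nuance than this.

```python
def reachable_profiles(friends, installers):
    """Profiles an app could read: each installer plus all of that installer's friends."""
    reachable = set()
    for user in installers:
        reachable.add(user)
        reachable.update(friends.get(user, set()))
    return reachable


# Toy friendship graph (hypothetical names); friendships are symmetric.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice"},
    "dave": {"bob"},
}

# Only bob installs the app.
exposed = reachable_profiles(friends, installers={"bob"})
# alice and dave are exposed although neither ever saw a permission dialog.
never_consented = exposed - {"bob"}
```

In this model a "Friends Only" setting does not help alice or dave, because bob is one of their friends.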
I stopped using Facebook back in 2011 (only used it to manage and test my apps) as I was really concerned about how easy it is to collect personal data.
But I guess for me, as a developer, it is easy to imagine how things work and when to get suspicious online.
On the other hand, it makes me really happy that Facebook privacy issues like this one with CA got so much attention, and that finally, hopefully, more people will understand how things work.
An app on Facebook is only an automated way to ask your friend to share data he/she has access to. You cannot both share data with your friend and expect him/her to not be able to share it with 3rd parties.
If you don't want your friends to be able to share your data you don't become friends with them on a social network and/or you don't share data with them.
You don't tell your acquaintances things you don't trust them to keep secret. And you can't expect them to keep secret things you share with every acquaintance.
Let’s see which prevails.
For the record, Tepix's point is supported by essentially anything published by any authority on the subject since the beginning of Facebook. As others further down on the page have pointed out, even the FTC complained about this in 2011:
'''Facebook represented that third-party apps that users' installed would have access only to user information that they needed to operate. In fact, the apps could access nearly all of users' personal data – data the apps didn't need. [...] Facebook told users they could restrict sharing of data to limited audiences – for example with "Friends Only." In fact, selecting "Friends Only" did not prevent their information from being shared with third-party applications their friends used.'''
I absolutely do not think the average user has an idea about the implications of that.
I have to say Google has been mildly better in that way with their OAuth system, but it really struck me one day when signing up for a passive service with my Google account: for some reason they requested unlimited access to read, write/send, and delete my emails from my account. Needless to say I backed away. I work in the field and know what this really means. I still get tripped up.
The average person 1) doesn’t take the time to consider what it means, they just want it to work; 2) even if they read the request, doesn’t understand what it’s actually saying they are doing (data harvesting); and 3) has little idea of the scope of the implications. “Oh it’s just a stupid farming game,” they think, without realizing the massive trade and profiling going on behind all of it.
The Facebook fiasco is the first time in recent memory where people have been reminded that their data is being taken and not only that — it’s being traded, bought, sold, compiled, refined and worse.
I watched an old 60 Minutes episode yesterday on Amazon circa 1999. Bezos was showing the reporter the recommendation engine. The reporter was clearly shook when it recommended a short list of books he’d actually bought recently outside of Amazon based on a couple of purchases on the platform. In 1999. They collected about a GB a day then.
I guess people got used to the idea of generally benign profiling, and the questioning stopped after a while.
At the time, signing into apps with Facebook meant you were not only giving the app access to your account, but also anyone clever enough to steal the token. In some cases, "clever" even meant anyone who had a basic understanding of sqlmap or other pentesting tools. In theory shady "analytics" firms could have hired a low level security researcher and had him use shodan and sqlmap all day to expand their databases.
Today it's pretty rare for apps to ask for intrusive permissions, and people tend to be a bit more wary of apps that do. Facebook has also made an effort to alert users when the permissions requested are more intrusive than the usual email address and profile picture - often requiring explicit agreement to these permissions.
Nonetheless, if Facebook's "audit" turns up apps that did a lot of suspicious queries, what stops them from saying "oh, we were hacked, someone took our tokens from our DB, we are conducting a full investigation"? Sure, it's still bad press, but it's probably better PR to look incompetent than creepy.
Facebook were required to do this by the FTC in 2011.
That Facebook now feels betrayed by CA because such data (generally available to service providers) has been used inappropriately shows either that they were complicit by enabling CA to do so and knew about it all along, or that they didn't know what they were doing at all. I'm not sure which is worse. A third party saying "trust me, I'll handle all the data responsibly" doesn't mean anything, because there is no oversight whatsoever. Additional clauses in contracts do not make Facebook a victim of contract breach. The product in itself is flawed, because it handles the data irresponsibly.
That is a gross oversimplification of the issue. There were controls in place to stop excessive data collection.
In fact, the only app in this situation that was allowed to "suck out" "vast amounts of data" was the Obama For America app. According to Carol Davidsen, Obama's former campaign director, "We ingested the entire U.S. social graph," despite the fact that fewer than 1 million people actively authorized the app to access their data. Approximately 99.5% of the hundreds of millions of people whose data Obama took, with Facebook's blessing (actively allowing it to bypass its data collection limits for apps), never knew about or authorized Obama to have or use their data.
So only one app was "actively encouraged" to suck out vast amounts of data in the history of the existence of the API. All the rest of them were subject to relatively strict controls, requiring months or years to collect even a small fraction of the data that the Obama app was allowed to collect. The API was not a data free-for-all, except in one unique case with the explicit authorization of Facebook.
"But thousands of other developers, including the makers of games such as FarmVille and the dating app Tinder, as well as political consultants from President Barack Obama’s 2012 presidential campaign, also siphoned huge amounts of data about users and their friends, developing deep understandings of people’s relationships and preferences."
Last I checked, Farmville was not associated with OFA. So you need to back up your assertion "All the rest of them were subject to relatively strict controls, requiring months or years to collect even a small fraction of the data that the Obama app was allowed to collect. " and explain why this also renders the entire dataset collected by CA to be completely harmless, in contrast to the current narrative. Because it's not that interesting if a relatively benign political organization got a little more data than another which used it to impose the will of a foreign enemy upon the US electorate.
It was somewhat possible to overcome this by spreading the collection out over a number of months or years, which is what I believe the Kogan app did. But even in that case, with the data collection spread out over a long period of time, they had nowhere near the data that the Obama campaign was allowed access to. Finally, the CA data was years old, while the Obama data was allowed to stay fresh right up to election day because they had no API limits.
So these apps with much larger install bases than Obama could have ever dreamed of had access to less data than OFA did because their ratio was not allowed to be as ridiculously asymmetric as Obama’s was. They still had access to large amounts of data, but only because of their massive authorized install bases (which in all of the cases you mention were far larger than the OFA install base). But none had the entire US social graph - with the exception of OFA.
It certainly looks suspicious, but I don't think you can rule out the profit motive. Perhaps in 2012 Facebook's position was, "If you pay us enough money we'll remove the API limits"
Considering that at least half of them - ~100 million people - would have consciously objected to helping Obama do anything, much less get elected, that should have been earth shatteringly scandalous and probably should have buried Facebook right then and there. But instead, the press celebrated this technique, right up until it helped create a result other than the one it wanted.
(1) Was Facebook biased in favor of the Democratic party?
(2) Should the press have been more critical of the way personal data was used for political purposes?
My answer to (1) is 'probably', but maybe Facebook would have been willing to "manually shut those alarms off and open the floodgates to the data" for anyone who gave them a million dollars?
Great, what's your source for that data?
> So these apps with much larger install bases than Obama could have ever dreamed of had access to less data than OFA did because their ratio was allowed to be so ridiculously asymmetric. They still had access to large amounts of data because of their massive authorized install bases. But none had the entire US social graph.
Nevertheless, they didn't seek to hoist a fascist leader apparently controlled by a foreign enemy on the US. It is perfectly fine that OFA and other organizations should not have this kind of access either. But this is whataboutism. Its truth or not does not mean the CA situation is not revealing of an enormous problem.
My source is that I can do math. The campaign admits it had “the entire US social graph”. With ~200 million US Facebook profiles, and less than 1 million people that actually authorized Obama’s app, we come out with a ratio of roughly 200:1.
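The arithmetic behind that ratio can be written out as below. Both inputs are the commenter's own round estimates from this thread, not verified figures, and "less than 1 million" is treated as exactly 1 million, which makes the result a lower bound on the ratio.

```python
us_profiles = 200_000_000   # "~200 million US Facebook profiles" (commenter's estimate)
authorized = 1_000_000      # upper bound: "less than 1 million" authorized the app

# Profiles collected per person who actually authorized the app.
ratio = us_profiles / authorized
# Share of collected profiles whose owners never authorized anything.
share_without_consent = 1 - authorized / us_profiles

print(f"{ratio:.0f}:1, {share_without_consent:.1%} never authorized")
```

This also matches the "approximately 99.5%" figure quoted earlier in the thread, since 1 - 1/200 = 0.995.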
> Its truth or not does not mean the CA situation is not revealing of an enormous problem.
It revealed an enormous problem when Obama did it (and worse, with Facebook’s approval). It only became an issue for the press, though, when they failed to manipulate the election in the way they wanted. I’m actually not an advocate of Trump - I’m just saying that this was either wrong or it wasn’t, regardless of who did it. In truth, they both did it. Obama just did it on a vastly larger scale.
"In 2011, Carol Davidsen, director of data integration and media analytics for Obama for America, built a database of every American voter using the same Facebook developer tool used by Cambridge, known as the social graph API. "
is ridiculous. There are millions of voters who have never had a Facebook account, and lots more back in 2011. They are handwaving, and this is not data. You have little basis to use qualitative, exaggerative terms like "vastly more" and you also don't know what kinds of access Cambridge Analytica may have had that differs from the data limits you experienced in your own development.
Obama took data from ~200 million people ("the entire US social graph" is unambiguous). The CA incident involves ~50 million profiles. I think a difference of 150 million is "vastly more". That's not exaggeration, that's math. It was wrong in either case, but one did it on a vastly larger scale.
You can sit here splitting hairs and arguing with the specific wording of my comments and the articles backing this up, or you can say "yes, it was wrong in principle" which is what any objective person would do. Democrats and Republicans alike do bad things sometimes. Are you actually trying to defend the actions of any party involved in any of this?
The problem with that statement is that you believe it was a catastrophic outcome. For the first time in history, media and new technology are able to instantly tell you if the president has so much as sneezed while the sitting president is a (highly controversial) Republican. Obama was considered a nightmare scenario for the American right, and if you listened to the right outlets, you heard every single one of Obama's missteps. Every poor decision, embarrassing association, or undelivered promise was constantly paraded as grounds for impeachment, but social media and what many consider "neutral" news sources have a noticeable progressive bent, and the Obama administration got the benefit of the doubt more often than not.
The Obama administration deported more undocumented immigrants than every past president minus HW Bush combined. The Obama administration killed over 3700 people (over 300 of whom were civilians) in drone strikes. In that regard, Obama reportedly said, "Turns out I’m really good at killing people. Didn’t know that was gonna be a strong suit of mine." Attorney General Eric Holder was the first US cabinet member to be held in contempt of Congress, and Obama exercised executive privilege to support Holder's decision to withhold documents pertaining to Fast and Furious.
Most US citizens who actively use social media don't know those things (among many other controversies) happened, but they sure as shit know that "Donald Trump gets two scoops of ice cream", and constantly beat the dead "covfefe" horse; you can swipe right on Snapchat or Twitter to that social media company's curated news page and instantly hear every single slip-up and hot take that Trump has. This notion of a "catastrophic outcome" largely exists because for the first time, news outlets are at odds with the administration in the age of mass media.
It is of course a rote and entirely predictable exercise to take my statement that "this is a catastrophe", turn it around and say, "well Obama was a catastrophe for the right". I'm here to say that this is a false equivalency and the right has lost its mind. The conservative view of Obama is equivalent to the liberal view of Bush; that's a lot more reasonable. But the right has not had their "liberal Trump", and it is perfectly fine that they never will.
That is your opinion, and you're entitled to it. But do you not detect even a whiff of hypocrisy in this whole thing? It should have been portrayed as the terrible thing that it is today back when Obama did it. That is what I have an issue with.
Therefore, the left has no incentive to fight the bias of the press, because it favors their side on nearly every issue. The ironic part is that by making the misuse of Facebook data acceptable (even celebrated) when Obama did it, the press may have inadvertently helped Trump get elected years later. They’re upset now, but they’re just trying to put the genie that they unleashed back in the bottle. We reap what we sow.
Does yelling at your computer make it work better? (Perhaps an increasingly poor example in the age where voice commands are becoming popular.) Or figuring out what's wrong and doing what needs to be done to fix it? Gotta work with the facts on the ground, not how you think it should be. And maybe by doing so, you can move what is towards what should be.
In the case of working for user data privacy, we're working against companies that are benefiting from the data they collect and their lobbyists, as well as people who aren't aware of the situation and often only see the benefits of the apps they're using. And those people cross many demographic and political lines. I don't think user privacy is a partisan issue. It's going to be an uphill battle. You think pointing out all of the differences and disagreements of those who don't agree with you, all of the inconsistencies you see in their positions is going to help? I sincerely don't think so.
The point isn't that pointing out hypocrisy is wrong, it's that it gets in the way of your goals.
If it's more important to you to point out those inconsistencies than improve the user privacy situation, I think you have your priorities wrong. If you think you can only remedy the user privacy situation by first correcting their inconsistencies, I think that's wrong-headed and doomed to failure on both counts. But you're certainly entitled to your opinion and free to disregard mine.
Then of course terrorists from roughly the same organization tried again on the same building eight years later and it was the largest attack in US history by orders of magnitude. Terrorism proceeded to become the key societal and political issue pretty much for a whole generation.
The problem was there for a long time, the hazard was there for a long time, but it's only when we get the catastrophic outcome; e.g. the Uber car actually killed someone, the plane actually crashed, the WTC was actually destroyed, it's largely rational that that is when lots of people care. There's nothing unusual or hypocritical about it.
You might not think the outcome of this election was catastrophic but the reporters who cover the government and the White House do. They know the norms and customs that have been in place for many decades and they see that they've been ripped to shreds in just a little over a year. They know this is a catastrophe and they are rightfully trying to shine a light on every possible thing that we can get at to both patch the situation and hopefully prevent it from happening again.
That of course doesn't mean that liberals, or what you ascribe to liberals, are correct. But there were still significant differences between the reelection of Obama, a sitting president who won with a smaller margin than expected, and Trump, an outsider not only to politics but to the Republican Party, who won by a slim margin targeting certain fringe issues and key states.
Further, the "only angry because it's President Trump" narrative ignores the fact that there has been a rising tide of concern and pushback on data collection and customer surveillance business models. The fact that Trump's campaign was involved may have been a catalyst, but a catalyst cannot operate in a vacuum, it requires very specific preconditions, ready and primed for that spark. And the longer it builds, the smaller the spark needs to be.
Facebook should never have facilitated this level of data collection for any campaign, but characterizing the current anger at Facebook as media-driven favoritism is Internet Research Agency level trolling.
Why? Because four years later, Facebook would let Cambridge Analytica do something similar without having a reason to particularly like them.
And even after learning about CA's breach of trust, they chose to mostly remain inactive, even when they learnt that the company was working for Ted Cruz and Trump, two candidates you probably wouldn't want to help if you're making business decisions based on political sympathy.
That’s far different than the Obama situation. They accessed ~4 times the number of profiles, and were allowed to keep the data current because Facebook voluntarily removed the API limits for them. Why would you question Obama’s own campaign director? As if it were a secret that Facebook is left leaning and wanted to help Obama.
Precisely because everyone knows/thinks Facebook was sympathetic towards Obama, I believe it's not unthinkable that someone used that fact as an explanation when they were asked and did not actually know the specific reason.
Ruling out money as a reason seems a bit presumptuous.
Google is your friend.
Here’s one to get you started...
“The campaign’s exhaustive use of Facebook triggered the site’s internal safeguards. “It was more like we blew through an alarm that their engineers hadn’t planned for or knew about,” said St. Clair, who had been working at a small firm in Chicago and joined the campaign at the suggestion of a friend. “They’d sigh and say, ‘You can do this as long as you stop doing it on Nov. 7.’””
There are also enough stories from FarmVille and far smaller apps and the access they had. I wonder if anyone remembers running into API rate limits in the 2010-2012 timeframe?
I remember being as surprised by the API access as many of the testimonials we now hear. I had initially implemented rate limiting on my side. After never running into any problems, I gradually reduced the delays to 0 without ever hitting any limits. But I only ever accessed maybe a few dozen GB of data, not the 100+ million accounts others apparently got.
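For anyone unfamiliar with what "rate limiting on my side" means, here is a minimal sketch of a client-side throttle. The `Throttle` class and `fetch_all` helper are hypothetical names made up for illustration, and the fetch function is a stand-in rather than a real Facebook API call.

```python
import time


class Throttle:
    """Enforce a minimum delay between successive calls (client-side rate limiting)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until at least min_interval seconds have passed since the last call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


def fetch_all(ids, fetch, throttle):
    """Call fetch(id) for each id, pausing between requests as the throttle dictates."""
    results = []
    for item_id in ids:
        throttle.wait()
        results.append(fetch(item_id))
    return results
```

Setting `min_interval` to 0 corresponds to the end state described above: requests go out back to back, and only a server-side limit, if there is one, would slow them down.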
It may also have been important that CA saved all the data with no expiration policy. The API limits may have been set based on the usage pattern the ToS prescribed, namely never caching data for more than 24 hours. If you save and reuse the data instead, you save a lot of duplicate API calls and get new data instead.
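The effect of ignoring a 24-hour cache rule can be sketched with a small cache that counts upstream calls. This only illustrates the speculation above; `TTLCache` is a hypothetical helper, and nothing here claims to describe how Facebook's limits were actually computed.

```python
import time


class TTLCache:
    """Cache fetched values; ttl=None means entries never expire."""

    def __init__(self, fetch, ttl, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._store = {}      # key -> (value, fetched_at)
        self.api_calls = 0    # number of times the upstream fetch was invoked

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None:
            value, fetched_at = entry
            # Serve from cache while the entry is fresh (or forever if ttl is None).
            if self._ttl is None or now - fetched_at < self._ttl:
                return value
        self.api_calls += 1
        value = self._fetch(key)
        self._store[key] = (value, now)
        return value
```

With `ttl=24 * 3600`, reading the same profile on two different days costs two API calls; with `ttl=None` it costs one, and the saved call can be spent on a profile not yet collected.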
I told Mark about this exact problem in 2005.
And I warned him about FTC liability if he ignored it.
After that, we stopped talking.
Do you feel even the slightest bit of embarrassment at the irony and hypocrisy of using this content, that you acquired without consent, to further your argument against the trustworthiness of this guy?
Why is Zuckerberg's nick 02? Who is 01?
Social mores are changing, becoming better developed.
The Internet, social networking, OAuth -- these are not exactly well-trodden subjects in humanity's past. It's not like we have decades or centuries of precedent to look back on.
The important thing is what FB does now.
In at least 90% of the cases people don't understand what the privacy policies or permissions mean or what they could be used for. People tend to trust others, in general. And many developers abuse that trust, especially when they're allowed to do it by design with the permissions they're given by the platforms.
When an app asks me for "Access to media" I only give that access expecting that maybe it needs that access for when I will open a media file with that app or to download or create a media file inside the media folder.
I do not expect the app to analyze my media for the type of content I have in there, and I do not expect the app to upload those files to its servers, or any other uses that developers may come up with for that particular permission.
Yet the permissions are set up in such a way that they allow much more than people expect them to allow.
Saying "well you shouldn't have given them access to media" or "you shouldn't be using the Internet or a smartphone" is really a nonsense type of comment to make. If it's a video player, of course I have to give it access to the media. That's why I need a video player. But I didn't intend to give it access to upload my media to its servers. That's what the platform developer allowed it to do, without me knowing or understanding that it can do that, not me "not caring."
This is just an example, but it can apply to phone permissions, contact permissions, and other types of permissions just as well.
I'm not sure that most people understand even now, after the CA story broke, what the specific issue was with CA, FB, and app permissions. CBS News characterized it just like any other data breach. Slate's Political Gabfest did the same. Most news articles near the top of Reddit's front page were also light on details. Channel Four's original report didn't even focus that heavily on the FB/app problem. Friends in my FB news feed similarly sound confused about what specifically happened. Everyone is outraged, but few seem to understand, even now.
Apps on Windows have access to pretty much everything. There are legitimate reasons for that. It could be abused.
When a Windows app misbehaves, abuses the trust the users have placed in it, we don't blame Windows. We blame the app.
Why then, in FB, in the same situation, have we mostly blamed FB?
If only someone would have used this hole to seed something like Diaspora to help break the critical mass problem for those kinds of projects.
It's like when Trump said he could shoot someone on the street and still win... It's when your supporters start to back off that you start giving in to demands, not before.
The fact is that the majority of users cannot be expected to look out for themselves. People hit install and then hit accept on whatever permissions request pops up. It is like agreeing to the TOS that no one reads.
I tried to download an alarm clock app on Android. It wanted access to virtually everything. Why do you need so much information for a fucking alarm clock? My analog alarm clock doesn't know my name but it still wakes me up each morning.
Platforms (Facebook, Mobile OS's, Desktop OS's) need to reject apps that request unnecessary permissions.
Part of the reason is pre-emptive cost cutting by these companies to remove human review from these apps.
Google also played a massive part in this with their strategy for growing the Android app store. (as did Facebook)
Favouring quantity over quality puts users at risk.
Android has similar behavior, but only for new-style apps (starting with 6.x, as someone corrected me in last year’s thread). Trillions of old-style apps still enjoy TOS-like god permissions, afaik.
Seems like this is picking up a lot of steam.
I have a feeling all this coverage is driven by quite a bit of schadenfreude from the traditional media.
This coverage is well deserved, but I am sad that people are only taking notice now.
The Syrian Civil War began on social media first. The journalists criticising Facebook nowadays used to campaign about how social media helps protesters organize. What if the Arab Spring wasn't an organic movement? Isn't it weird that some "experts" suddenly changed their minds about social media after Trump's election?
Also, why does no one even talk about Google? It's a much bigger weapon for manipulating facts if you consider the millions of people trusting its results for their questions. People ask Google if Brexit is good, people ask Google if Trump is doing well. And we don't even know how Google picks the best results. What if there are some SEO tricks shared with only a few companies?
I don't entirely blame them - it's driven by extremely perverse incentives, and alternatives haven't worked out (yet). But it's undeniably terrible for everyone, and IMO contributes to undermining their usefulness.
People who weren't developers or in marketing probably had the expectation that Facebook was a private walled garden where they were only sharing with their friends, but once one friend gave those permissions, many bad apps started to see how they could pull down the entire social graph. This has since changed with Graph API v2.0 in 2014, but it was exploited by nefarious groups for a time.
I think most of the permissions model was fine until bad apps and shady groups began using your data for targeting purposes beyond games, apps and ads. Once it was used for aims beyond harmless fun like games, especially targeted politics, that is when people got angry.
"I can't not ask for these permissions - even just for a basic login - facebook forces this information to be available to my systems. I'm not using it for anything, and I don't take much of the information I'm given, but to connect via Facebook, they require me to have access to this information". That became my standard-ish response, and it wasn't that surprising why many people got miffed, especially if I was just doing basic "login with facebook" stuff.
IIRC, FB has changed the minimum permissions a couple of times in the last several years (or, at least it's seemed like it - maybe names or presentation of the info has changed?)
In today's age, you need a phone number and e-mail. It's ok - they are decentralized. Don't let a centralized platform of Facebook's evil nature become necessary for you to live your life.
WSJ had reported several times about apps leaking FB IDs and about companies such as RapLeaf linking them to users. They were apparently combining Facebook data with some data from public sources to identify Facebook users, in 2011.
Zuckerberg in his statements so far has used the term "derivative" data a couple of times, as if the word derivative is significant. Does Facebook believe this somehow takes it outside the scope of what they are responsible for?