Hacker News
In NSA-intercepted data, those not targeted far outnumber the foreigners who are (washingtonpost.com)
181 points by bcn on July 6, 2014 | 26 comments



I'm sure I'm going to take a karma hit, but please tell me if I'm getting any of this wrong, since it doesn't seem to mesh with a lot of the Snowden reporting: Snowden grabs a significant amount of the NSA's actual reporting and collection, hands it to the Washington Post and they find that:

- The NSA is actively scrubbing the collection and removing the identities of Americans

- The most egregious privacy violation that WaPo could find was a set of love letters between an Australian government employee and her boyfriend who went off to Afghanistan to join the Taliban

- In the process of doing this, the NSA is pulling out information on "a secret overseas nuclear project, double-dealing by an ostensible ally, a military calamity that befell an unfriendly power, and the identities of aggressive intruders into U.S. computer networks"

- The WaPo estimates that around 900k people's communications are caught up in the NSA's "incidental" collection

I could see an argument made for future abuse, but this really seems to fly in the face of grand conspiracy theories that we've been seeing for the last year.


In many ways, this article is more important to the inside baseball of the NSA policy debate than to public understanding of the NSA program.

The Privacy and Civil Liberties Oversight Board has found the section 702 program to be reasonable in its tradeoffs between 4th Amendment guarantees to American citizens, global privacy rights, and national security gains.

The Section 702 program appears to work by either analysts or tech support personnel entering selectors into XkeyScore and similar systems (the Upstream program) that look at network packet flows. Analysts are expected to justify, even thinly, that a selector is "foreign".

The data returned from a selector is both returned to the requestor and moved into a long term archive accessible by many NSA personnel.

So we believe from Jacob Appelbaum's reporting that visits to https://www.torproject.org/ are tagged as valid selectors in XkeyScore. The reporting indicates that these selectors are re-tasked, so that once your identity is tagged your selectors are logged across the network.

We have independent corroboration that Tor and Tails are recommended tools among groups that threaten national security. The question is: is it worth the cost of having the journalists, human rights workers and technologists who also depend on these tools in the NSA long-term archive?

The main takeaways of Barton Gellman's article are

1. NSA tech support personnel have unaudited access to XkeyScore selectors and the raw take. Numerous officials have testified to the contrary. This expands the potential for misuse.

2. The minimization process is not terribly robust. There is an enormous amount of constitutionally protected communication in the NSA full content database.


I agree. I think that if I was in possession of 11,000 accounts worth of NSA profiles I could definitely dig up something juicier.

What we are seeing here is the other side of the story: the tracking of valid terrorists, the monitoring of overseas nuclear projects, military events, etc. I have no doubt that this article is a heavily redacted view of things, but any criticism or reform of the NSA has to accept the need for these kinds of actions.

Additionally, I hate how "the people" never hear about real intelligence that is of public concern (i.e. double-dealing by allies). What is the purpose of hiding this information from the public?


I suspect that both the government and media have disincentives for pushing out that kind of information. On the government side, you don't want to push out the fact that you're spying on them until the situation gets so big that it's worth potentially losing that source of information (e.g.: Iran/North Korea nuclear program, Chinese hackers, etc.) If your intelligence turns out to be bad, it could also backfire catastrophically (e.g.: Iraq WMDs).

On the media side, government intelligence successes aren't going to garner many eyeballs unless it's something huge. It would be similar to publishing a constant stream of minor military successes - if anything, they read more like propaganda. You're going to get a lot more readers by scaring the people into thinking that an Orwellian surveillance state has been constructed around them than telling them about something happening on the other side of the world that won't actually have much impact on the Average Joe's day-to-day life.

For the same reason, the government keeps falling back on terrorism as justification for spying since everyone was affected by 9-11 in one way or another instead of bringing up things like nuclear non-proliferation, trade agreements, intellectual property theft, etc.


Thanks for your comment, I think that it opens a good debate. Two thoughts on my part:

The first is a problem with the lack of transparency. The NSA, the Congress or the POTUS never told American citizens: hey, we are going to be reading your emails in order to keep you safe. Are you OK with that?

The second is a thought experiment. Would you feel as comfortable as you do with this, if instead of the US doing it, it was Russia, China or even North Korea? And now, to give it a different perspective, remember that the US is one of the countries that has killed the most people in the last 50 years outside of their own territory.


I'd answer "no" to both of your questions with some caveats.

I'm not okay with the lack of transparency, but I acknowledge that some degree of secrecy needs to be kept to maintain the effectiveness of intelligence operations. As an example, I remember seeing a snippet from an interview once where General Alexander even said they should have been more open with the public about the 215 program (I can't seem to find the link right now). We haven't heard much about the similar e-mail program that was discontinued back in 2011. If Snowden had stopped there, I don't think there would be nearly as much debate.

As a counterexample, those dragnet concerns don't apply to PRISM/Section 702, which is targeted exclusively against foreigners. I don't think there was any reason to notify every adversary in the world that our government could get access to their Gmail/Hotmail/Yahoo accounts. Snowden didn't open up a debate there, he unilaterally made a decision on behalf of the American people to reduce the effectiveness of our intelligence services.

As for the second thought experiment, no I'm not okay with that, either. That said, espionage is mankind's second oldest profession - they're not going to stop and it's silly to think that shutting down the NSA will somehow cause them to stop. Having an effective intelligence service targeting them is going to do more to keep them in check than voicing my disapproval online.


I think this flies in the face only in the conspiracy theory "they are watching us because they are evil".

I don't think they are. I think the dragnet did start with good intentions, does bring some useful intelligence and is kind of useful.

The problem is the collateral damage is just too high, and the debate should be around that.


We haven't actually been getting the data to make an informed decision on these, either. It took nearly a year to get any sort of numbers out of the government regarding how many people were being targeted[1] and they still haven't given any numbers on incidental collection. Meanwhile, most of the media has been claiming that the communications of millions or even billions of people are being swept up, and now the Washington Post is actually looking at the data they've had for the last year and saying the number is more likely to be somewhere around 900k worldwide. I'm not saying that's necessarily a great number, but that's orders of magnitude less than what we've been led to believe. Nor does the article go into much detail on what "incidental" really means versus "targeted" - are these people in contact with the actual targets, or completely unrelated? Is there a better way to protect the privacy of these people without compromising actual intelligence operations? If not, which side should we err on, and how far?

These are tough questions to answer - especially without hard facts. It's hard to have an honest debate when we the people are left trying to discern the actual facts somewhere between the secrecy and sensationalism.

[1] http://icontherecord.tumblr.com/transparency/odni_transparen...


The conspiracy theory is a bit more complex. The NSA had two programs for upstream collection prior to 9/11/2001: THINTHREAD and STELLARWIND.

THINTHREAD was designed by long-time NSA personnel to protect privacy while providing digital surveillance capability comparable to ECHELON's capability in the analogue world.

STELLARWIND was the full take and archive program.

The White House chose the lower privacy technology, forced the people involved in THINTHREAD into retirement and then prosecuted the whistleblowers.

Covered in depth here: http://www.pbs.org/wgbh/pages/frontline/united-states-of-sec...


I understood it as the NSA scrubbing U.S. identities only in their analysts' reports; they still collect the data and don't anonymize it in any way while in storage.


A funny line:

> Some of them border on the absurd, using titles that could apply to only one man. A “minimized U.S. president-elect” begins to appear in the files in early 2009, and references to the current “minimized U.S. president” appear 1,227 times in the following four years.

Barton Gellman clarified:

> Lotta questions on this. The 1200 references to “minimized US president” come when people talk about him in intercepted conversations.

https://twitter.com/bartongellman/status/485604791867817986

So this doesn't confirm that the NSA was reading Obama's mail, like Russ Tice has claimed. But it does belie NSA's claims that Snowden had no access to FISA material, and cast doubt on all their "stringent" security procedures to prevent misuse of intelligence material. If Snowden could sneak out with it, what else could be done without their knowledge?


Is it my bad understanding of the language or does the word "target" in the headline mislead? I read it as "don't worry, the NSA does only target a minority" while the text starts with "Ordinary Internet users (...) far outnumber legally targeted foreigners in the communications intercepted (...)"

I would have added "legally", so it would read "In NSA-intercepted data, those not legally targeted far outnumber the foreigners who are"


Yes, this is a very confusing headline, and incorrectly appears to be reassuring. The Guardian's headline is better: "NSA intercepts: ordinary internet users 'far outnumbered' legal targets".


Agreed. One would almost think they were trying to both publish and bury this story simultaneously, as the headline is practically unintelligible.


This is a stunning report...not least in how it continues to show the NSA's apparently shoddy IT...the PowerPoint presentations being taken are one thing, but hundreds of thousands of pieces of surveillance data are another, and that's just what the Post chose to sift through.

Still, the specifics of data extraction aren't clear...is the NSA mining a data stream as it pleases from Facebook? Or are the Facebook transcripts, as detailed in the closing anecdote, a result of a data request in an ongoing investigation of a previously identified suspect?...which is, purportedly, the same kind of request any law enforcement agency can make.


The intercepts referenced in the Post article date from 2009 to 2012, and during substantially all this time Facebook chose not to force SSL by default.

Facebook did not enable SSL by default until only about half a year before Edward Snowden went public: https://developers.facebook.com/blog/post/2012/11/14/platfor...

The absence of SSL and presence of plenty of fiber taps provides a pretty big big data stream to mine...


> This is a stunning report.

Significance buried with poor headline, also could have been broken up into multiple stories. This is def a top 5 Snowden revelation, with the added bonus of a denial from a top official.


At least the way it's presented it appears as "nothing to see here." The main story is the guy who wants to join the Taliban being monitored, including the conversations with his ex-partner who converted to Islam and who today understands that it was reasonable for such communication to be monitored.

Kind of an anticlimax for the claimed "a four-month investigation by The Washington Post."

How could the article have been written to make the readers see it as "one of the top five?"


So out of billions they only "target" (whatever that means) hundreds of millions of people? That doesn't make me feel a lot safer. The 9:1 thing is completely out of context and meaningless. Plus, we don't know exactly what a target is and what's a non-target (on which they could still gather data...just not, you know..."target it").


I have some methodological problems with this piece. They're making claims about the makeup of this data set, that 50% of the documents contain minimized references to US persons for instance, or that 90% of the account holders were not intelligence targets, but we don't actually know whether this data represents NSA collections as a whole. There are 160,000 documents over a four-year period. That seems far too low to be the sum total of all NSA collections, so it must be a sample. But is it a random sample or was this data selected by Snowden using some criteria? How do we know that Snowden didn't choose data from one particular program or using certain selectors and that that particular data tends to have more or less US persons or a higher or lower percentage of intelligence targets than NSA intercepts taken as a whole?

I also find the following claim problematic: >> Many of them were Americans. Nearly half of the surveillance files, a strikingly high proportion, contained names, e-mail addresses or other details that the NSA marked as belonging to U.S. citizens or residents. NSA analysts masked, or “minimized,” more than 65,000 such references to protect Americans’ privacy...

So there were 65,000 minimized references in 160,000 documents. But we also know that a "minimized reference" doesn't actually mean that a US person was the sender or the recipient - for instance, we know from Gellman that someone talking about President Obama would constitute a minimized reference. The first sentence, "Many of them were Americans", is not quantified, likely because the Post doesn't actually know how many participants in the intercepts were American.
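
To make the sampling worry concrete, here is a rough sketch in Python. Every number and program name in it is invented for illustration; none of them come from the article or from any real NSA program. It only shows that the share of targets you measure depends heavily on which slice of the collection your cache was drawn from.

```python
# Hypothetical figures, invented purely for illustration; none come from the article.
# Each entry stands for a made-up collection program:
# name -> (number of account holders, share of them who are intelligence targets)
programs = {
    "program_a": (8_000, 0.02),
    "program_b": (1_500, 0.30),
    "program_c": (500, 0.60),
}

def target_share(selection):
    """Share of targeted account holders across the selected programs."""
    holders = sum(programs[name][0] for name in selection)
    targets = sum(programs[name][0] * programs[name][1] for name in selection)
    return targets / holders

overall = target_share(programs)        # all programs pooled together
only_a = target_share(["program_a"])    # a cache drawn from one program only
only_c = target_share(["program_c"])    # a cache drawn from a different program

print(f"pooled: {overall:.1%}  only A: {only_a:.1%}  only C: {only_c:.1%}")
```

With these made-up inputs the pooled share of targets is about 9%, but a cache taken only from program A would show 2% and one taken only from program C would show 60%. Without knowing how the 160,000 documents were selected, the 90% figure could sit anywhere in that kind of range relative to the whole.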


> Is it a random sample or was this data selected by Snowden using some criteria?

Let's take one probable situation: Let's say Snowden took everything he could. That means there is a gathering of data made by an employee at the NSA which results in 90% non-intelligence targets. What was the employee working on?

- Increasing the relevance of data? Not probable.

- Working on a usual sample of data? Most probable.

- Targeting non-intelligence targets on purpose? Scary and illegal.


The infographic shows 556 intercepted videos out of 100,000 intercepted communications, the majority of which are text messages and email.

So why aren't the terrorist evildoers hiding their few KB of text communications within multi-GB YouTube videos of cats and video games?


USA claim AQ have been using steganography since the 90s:

http://usatoday30.usatoday.com/tech/news/2001-02-05-binladen...

Commercial companies then built tools to identify most common steg, and AQ adapted by writing their own algorithms - which turned out to be not very good:

http://edition.cnn.com/2012/04/30/world/al-qaeda-documents-f...

It still breaks down, because you need to pre-share the arrangement of how you will exchange information that is hidden in images.
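
A toy sketch of why the pre-shared arrangement matters, in Python. This is a textbook least-significant-bit scheme, not whatever the tools in the linked articles actually did; the function names and the `shared_seed` parameter are made up for the example, and Python's `random` module is not cryptographically secure.

```python
import random

def positions(shared_seed, carrier_len, n_bits):
    """Choose which carrier bytes hold payload bits.

    Both sides must derive the same list, so the seed has to be agreed beforehand.
    """
    rng = random.Random(shared_seed)
    return rng.sample(range(carrier_len), n_bits)

def embed(carrier: bytes, payload: bytes, shared_seed: int) -> bytes:
    """Hide payload bits in the least significant bits of selected carrier bytes."""
    bits = [(byte >> i) & 1 for byte in payload for i in range(8)]
    out = bytearray(carrier)
    for pos, bit in zip(positions(shared_seed, len(carrier), len(bits)), bits):
        out[pos] = (out[pos] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(out)

def extract(stego: bytes, n_bytes: int, shared_seed: int) -> bytes:
    """Read the hidden bits back; needs the same seed and the payload length."""
    pos = positions(shared_seed, len(stego), n_bytes * 8)
    bits = [stego[p] & 1 for p in pos]
    return bytes(sum(bits[i * 8 + j] << j for j in range(8)) for i in range(n_bytes))

carrier = bytes(range(256)) * 4  # stand-in for image pixel data
stego = embed(carrier, b"hi", shared_seed=1234)
recovered = extract(stego, 2, shared_seed=1234)
```

The receiver only gets the message back by already knowing the seed and the payload length, and both of those have to travel over some other channel first. That out-of-band exchange is exactly the weak point described above.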


That really puts into perspective how huge a threat to national security the Yo app is. It's only a matter of time until the NSA is intercepting Yos, or at least capturing Yo metadata.


I bet a Yo takeover is coming within a few months. Buying the company may cost less than hacking it...




