
Xkeyscorerules100.txt - peterkelly
http://daserste.ndr.de/panorama/xkeyscorerules100.txt
======
WestCoastJustin
Related to
[https://news.ycombinator.com/item?id=7983124](https://news.ycombinator.com/item?id=7983124)

Additionally, /. has a pretty good summary of what this is [1]. -- _" If you
search the web for communications security information, or read online tech
publications like Linux Journal or BoingBoing, you might be a terrorist. The
German publication Das Erste disclosed a crumb of alleged XKeyScore
configuration, with the vague suggestion of more source code to come, showing
that Tor directory servers and their users, and as usual the interested and
their neighbor's dogs due to overcapture, were flagged for closer monitoring.
Linux Journal, whose domain is part of a listed selector, has a few choice
words on their coveted award. Would it be irresponsible not to speculate
further?"_

[1] [http://yro.slashdot.org/story/14/07/03/1846215/nsa-
considers...](http://yro.slashdot.org/story/14/07/03/1846215/nsa-considers-
linux-journal-readers-tor-and-linux-users-extremists)

~~~
tlrobinson
Linux Journal seems out of place. They must have had an article about Tails?

~~~
MatthiasP
Yes they did[1], and a link to that article was published on an 'extremist'
forum, triggering the inclusion of the domain in the filter.

[1][http://www.linuxjournal.com/content/linux-distro-tales-
you-c...](http://www.linuxjournal.com/content/linux-distro-tales-you-can-
never-be-too-paranoid)

~~~
acqq
So if a link to an NYT article is published on the extremist forum, and your
job is to search for the extremists, you'd search for the NYT readers instead
of the readers of the forum?

Your explanation makes no sense, sorry.

------
Centigonal
/* _These variables define terms and websites relating to the TAILs (The
Amnesic Incognito Live System) software program, a comsec mechanism advocated
by extremists on extremist forums._ */

That's an interesting definition. It might cover legitimate extremists, but a
quick look at Wikipedia tells me that TAILs has also been used by some pretty
respectable Pulitzer Prize winners.

These comments provide an interesting and ultimately disheartening insight into
how the people designing these surveillance systems view privacy software
(and, by extension, privacy?).

~~~
GHFigs
That doesn't make sense to me. It's a definition that relates to the people
they're looking for. That it's explicitly _more_ specific than "all Tails
users" really doesn't seem disheartening to me.

That Tails is used by people other than extremists doesn't invalidate the
comment, by someone interested in "comsec mechanisms used by extremists", that
it is, in addition to anything else it may be, a "comsec mechanism advocated
by extremists in extremist forums".

It's not like "legitimate extremists" have some totally parallel universe of
software that's only used by them. It's fundamental to most of these tools
that they'll be used by different people in different ways, and that some of
those people will be by some standard or another "bad guys".

~~~
Centigonal
The description is not exclusive, I'm not arguing that. Your last point makes
a lot of sense; most people/groups who do nasty things do it using off-the-
shelf components. My issue is with the description painting TAILs in broad
strokes as comsec for extremists. Yes, they've got National Security in the
name, so they're looking at the software from a national security perspective.
There is a wide gap, however, between describing software and its dangerous
potential and describing software only in context of its dangerous potential.

If an analyst who hasn't heard of TAILs reads that description, it would sound
to them like the program is something that's passed around extremist forums in
much the same way malware toolkits are disseminated in warez forums, rather
than what it is, which is a Debian fork that routes things through Tor. I say
this because that was my first impression, which seemed off, leading me to
google, then to wikipedia, and then back here in a huff.

Now, some examples (in order of ascending silliness) of why describing
something in the context of one use case is harmful when many use cases exist:

* A lot of people use nmap to explore their home networks or as part of their jobs, potentially in the computer security industry. A lot of crackers also use nmap to case out potential targets. Calling nmap a "network scanning utility advocated by computer hackers" makes illegalizing nmap sound a lot more attractive than it actually would be, even if the statement is true.

* In the real world, certain products are systematically abused for less-than-kosher purposes. Still, we never refer to canned air as a household inhalant without mentioning its dusting use-case first. Potassium nitrate is fertilizer first, rocket fuel second, and only tangentially mentioned as an oxidizer for explosives. Other oxidizers, even the ones that are illegal for consumer sale, are written about the same way.

* _Reductio ad absurdum:_ There's a lot of general purpose software everyone uses. I wouldn't be wrong if I said "Microsoft Word is a text management tool used by terrorist groups to hatch evil plots" or "SMS is a communications technology used by insurgents to detonate bombs" or, extending the idiom, "The Quran is a book used by militant Islamist groups to justify killing and brutalizing civilians." These descriptions are all, however, deeply misleading.

~~~
GHFigs
_I say this because that was my first impression, which seemed off, leading me
to google, then to wikipedia, and then back here in a huff._

I'm not sure I understand why you think an NSA analyst (who, inexplicably, is
editing an XKEYSCORE rule file regarding Tor and Tails while being completely
ignorant about Tor and Tails) is incapable of doing this same kind of
information-gathering.

I'm not going to act like I think the NSA only hires the best and the
brightest, but your example presumes the existence of an analyst that's all
of: grossly undereducated for his duties, too mentally incompetent to be aware
of it, and so far on the literal-minded end of the autism spectrum that they
could be replaced by a shell script.

I believe any such analyst, if they existed, would have been promoted to
management before they could cause any serious harm.

------
frisco
I don't have a good reason to believe that this is real, but if it is, the most
surprising part to me is the "mapreduce" rule definition in there. As far as I
know the only group with a C++ mapreduce implementation called "mapreduce"
that also uses protocol buffers (the "proto:" block is protocol buffers) is
Google. This seems to say to me that the NSA is using a Google implementation
of map reduce. That can't possibly be right, can it?

~~~
jmillikin
They're not using the Google implementation; it looks like they rolled their
own.

~~~
frisco
Your profile says you work at Google... are you saying that the code snippet
here is inconsistent with Google's mapreduce?

~~~
nostrademons
I used to work at Google - this looks nothing like what a Google MapReduce
specification or code would look like.

~~~
elliotanderson
Not a Google employee - but the linked file looks more like a simplified DSL
for analysts.

------
api
In other news: the NSA developed a DSL with embedded C++. Is this the most
horrific revelation yet?

~~~
Phlarp
Would I be wrong to gather that they also built an in-house map reduce
implementation? What year is this code from? Most of the other documents have
been from 2007-2009. When did Google first implement map reduce?

~~~
acqq
Google's paper from 2004:

[http://static.googleusercontent.com/media/research.google.co...](http://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-
osdi04.pdf)

"We wrote the first version of the MapReduce library in February of 2003"

------
makomk
There's a nice bit in here that automatically collates a list of Tor bridge
nodes from snooped e-mails. The full list of bridge nodes isn't public, and
one of the ways the Tor project attempted to prevent someone from building a
complete list was by requiring people to use a valid GMail address to request
them, effectively piggy-backing on Google's account verification to stop
people from using a swathe of fake accounts to request nodes. Unfortunately,
that failed to take account of the fact that the NSA had completely
compromised Google's internal network.

~~~
sp332
Actually it was Britain's GCHQ that tapped Google's datacenter links and
shared the data with the NSA. I only mention this to remind people that it's
more than just one country and agency that's doing this.

------
jevinskie

        // START_DEFINITION
        /*
        The fingerprint identifies sessions visiting the Tor Project website from
        non-fvey countries.
        */
        fingerprint('anonymizer/tor/torpoject_visit')=http_host('www.torproject.org')
        and not(xff_cc('US' OR 'GB' OR 'CA' OR 'AU' OR 'NZ'));
        // END_DEFINITION
    

I was surprised to see that they actually tried to exclude Five Eyes
countries. The cynic in me wonders if there is a "bug" that neuters the
restriction.
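
For anyone who finds the rule syntax hard to read, here is a rough Python
sketch of the boolean logic as I read it. The function name, the argument names
and the handling of a missing country code are my own assumptions; nothing here
comes from the leaked file beyond the host name and the five country codes.

```python
from typing import Optional

# The five country codes listed in the rule above.
FIVE_EYES = frozenset({"US", "GB", "CA", "AU", "NZ"})


def matches_torproject_visit(http_host: str, xff_country: Optional[str]) -> bool:
    """Hypothetical re-statement of the fingerprint's predicate.

    http_host   -- the Host header of the session
    xff_country -- country code derived from X-Forwarded-For, or None if
                   unknown (how the real system treats "unknown" is not
                   stated in the file; here it simply counts as non-FVEY)
    """
    return http_host == "www.torproject.org" and xff_country not in FIVE_EYES


if __name__ == "__main__":
    print(matches_torproject_visit("www.torproject.org", "DE"))
    print(matches_torproject_visit("www.torproject.org", "US"))
```

Note that in this sketch a session with no resolvable country still matches,
which is exactly the kind of edge case the exclusion depends on.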

~~~
sp332
I'm wondering if "FVEY" meant something before it was rendered as "five eyes"?

~~~
Mandatum
The "Five Eyes" term has been in use since 2006. Strangely enough the original
Wikipedia document archived on the Wayback Machine seems to be unavailable,
however we do know it pointed to "USAUK_Community" whose relationship was
extended to include other allies which is now known as the "Five Eyes".

Before 2007 the number of searches for "FVEY" was near zero. The relationship
between these five nations wasn't named publicly until 2006 or so, before that
the relationship was there - just under another name.

However there are documents dated to 2001 available on mors.org which have
FVEY references: [http://www.mors.org/UserFiles/file/82nd-
Symposium/Form%20712...](http://www.mors.org/UserFiles/file/82nd-
Symposium/Form%20712%20A.pdf)

These are public so you can assume internal naming was used well before then.

------
nickporter
From the filename, I thought this was some kind of X window config file for
key remapping.

------
maximumoverload
OK, I am going to play devil's advocate.

\- The job of three-letter agencies is to find all the various enemies of the
state. That includes terrorists, but also various gangsters and officials of
sort-of-enemy states (like Russia). That's why they exist.

\- Following US national interests has higher priority than the rights of
citizens, especially in other countries.

\- People that the US has a reason to spy on will probably use Tor/Tails, or
at least try to find out something about it. It makes sense for the NSA to
filter for those people and focus its spying on them specifically.

\- Not all Tor-interested folks will be evil, but the percentage will be much
higher than on the internet at random. So it makes sense to focus on them,
just as it makes sense for local police to be present in a neighbourhood
that's known for higher crime.

So I understand why the NSA does this, and why they single out Tor-interested
folks.

~~~
chroem
>The job of three-letter agencies is to find all the various enemies of the
state.

Within the confines of the law and US constitution.

If all you care about is rooting out enemies of the state, then you're left
with organizations like the SS or Stasi.

~~~
maximumoverload
Does the US constitution matter with respect to non-US entities though?

Why should a US-based agency care about the rights of German citizens, for example?

~~~
chroem
While I think it _should_ apply to non-US citizens, the fact that it
technically doesn't is irrelevant as long as they're still willfully violating
the rights of US citizens too.

------
andy_ppp
Presumably reading hacker news also puts you on one of their lists.

Just remember should there ever be a problem between us, we know everything
about you.

------
xwintermutex
The file is dissected here: [http://blog.erratasec.com/2014/07/reading-
xkeyscore-rules-so...](http://blog.erratasec.com/2014/07/reading-xkeyscore-
rules-source.html)

~~~
DarkIye
lol, nobody cares. _flails arms in hysteria_

------
v64
Does anyone have any context or information as to what this is?

~~~
acqq
[https://news.ycombinator.com/item?id=7983124](https://news.ycombinator.com/item?id=7983124)

The article commented has the whole context.

For us programmers it's interesting, among other bigger issues, to see that
the rules contain pieces of C++ code.

~~~
GHFigs
_The article commented has the whole context._

It really doesn't. In spite of repeatedly claiming that people searching for
Tor are "monitored" or that users are "tracked", it's completely vague about
what those terms actually _mean_ and provides zero examples.

~~~
acqq
Did we read the same article? It actually has 5 pages and it attempts to
explain the different sections of the file.

[http://daserste.ndr.de/panorama/aktuell/NSA-targets-the-
priv...](http://daserste.ndr.de/panorama/aktuell/NSA-targets-the-privacy-
conscious,nsa230.html)

[http://daserste.ndr.de/panorama/aktuell/nsa230_page-2.html](http://daserste.ndr.de/panorama/aktuell/nsa230_page-2.html)

[http://daserste.ndr.de/panorama/aktuell/nsa230_page-3.html](http://daserste.ndr.de/panorama/aktuell/nsa230_page-3.html)

...

Maybe you've read only the first page and missed the remaining four?

~~~
GHFigs
Yes, I read it, and I stand by what I said. The article explains a set of
rules used to filter a set of information out of another set, but it does not
support claims about what is done with that set of information.

There is a lot of insinuation, but no example of any individual user of Tor or
reader of Linux Journal, etc. being monitored or tracked simply for doing so.

~~~
influx
It makes more sense to me that they actually use these in AND statements. For
example, uses TOR and searches for JIHAD could be traffic that would be
interesting. If I had to guess, the Linux Journal stuff was just something a
geek put in there during testing.
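
To make the AND idea concrete, here is a small Python sketch of combining
independent selectors into one stricter rule. Everything in it (the session
dict, the field names, the helper functions, the placeholder keyword) is
hypothetical and only illustrates the composition, not how XKeyScore actually
does it.

```python
from typing import Callable, Dict

# A "session" here is just a dict of extracted metadata (an assumption).
Session = Dict[str, str]
Predicate = Callable[[Session], bool]


def all_of(*predicates: Predicate) -> Predicate:
    """Return a predicate that is true only if every given predicate is."""
    def combined(session: Session) -> bool:
        return all(p(session) for p in predicates)
    return combined


def field_equals(field: str, value: str) -> Predicate:
    """Match sessions whose field is exactly the given value."""
    return lambda session: session.get(field) == value


def field_contains(field: str, needle: str) -> Predicate:
    """Match sessions whose field contains the needle, ignoring case."""
    return lambda session: needle.lower() in session.get(field, "").lower()


# Hypothetical combined rule: a Tor Project visit AND a given search term.
interesting = all_of(
    field_equals("http_host", "www.torproject.org"),
    field_contains("search_terms", "example-keyword"),
)
```

On its own, each selector matches a lot of ordinary traffic; only the
conjunction narrows it down, which is the point being made above.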

~~~
runiq
That's what I thought—these rules obviously aren't the entire pipeline, and
the results obtained from them may or may not be interesting in and of
themselves.

That said, using Tails probably _does_ increase my XKeyScore rating. Is there
anything published as to the scale of the rating? Something along the lines of
"Once you get a rating of (say) 500, we're gonna come and beat down your door,
wife and dog, and not necessarily in that order".

~~~
GHFigs
There's no indication that such a rating exists, or that any such
decisions are made based on automated rules. It's a tool for selecting some
traffic out of all traffic, not for replacing human analysis or decision-
making.

[http://en.wikipedia.org/wiki/XKEYSCORE](http://en.wikipedia.org/wiki/XKEYSCORE)

------
coderholic
Here are details for all of the IPs in that doc:

    
    
      $ curl -s http://daserste.ndr.de/panorama/xkeyscorerules100.txt | grep -Eo "([0-9]+\.?){4}" | xargs -I% curl -s http://ipinfo.io/%
      {
        "ip": "193.23.244.244",
        "hostname": "No Hostname",
        "city": null,
        "region": null,
        "country": "DE",
        "loc": "51.0000,9.0000",
        "org": "AS50472 Chaos Computer Club e.V."
      }{
        "ip": "194.109.206.212",
        "hostname": "tor.dizum.com",
        "city": null,
        "region": null,
        "country": "NL",
        "loc": "52.5000,5.7500",
        "org": "AS3265 XS4ALL Internet BV"
      }{
        "ip": "86.59.21.38",
        "hostname": "No Hostname",
        "city": null,
        "region": null,
        "country": "AT",
        "loc": "47.3333,13.3333",
        "org": "AS3248 Tele2 Telecommunication GmbH"
      }{
        "ip": "213.115.239.118",
        "hostname": "No Hostname",
        "city": null,
        "region": null,
        "country": "SE",
        "loc": "62.0000,15.0000",
        "org": "AS2119 Telenor Norge AS"
      }{
        "ip": "212.112.245.170",
        "hostname": "No Hostname",
        "city": null,
        "region": null,
        "country": "DE",
        "loc": "51.0000,9.0000",
        "org": "AS24900 QSC AG"
      }{
        "ip": "128.31.0.39",
        "hostname": "belegost.csail.mit.edu",
        "city": "Cambridge",
        "region": "Massachusetts",
        "country": "US",
        "loc": "42.3646,-71.1028",
        "org": "AS3 Massachusetts Institute of Technology",
        "postal": "02139"
      }{
        "ip": "216.224.124.114",
        "hostname": "No Hostname",
        "city": "Aptos",
        "region": "California",
        "country": "US",
        "loc": "37.0082,-121.8777",
        "org": "AS40231 Ethr.Net LLC",
        "postal": "95003"
      }{
        "ip": "208.83.223.34",
        "hostname": "No Hostname",
        "city": "San Francisco",
        "region": "California",
        "country": "US",
        "loc": "37.7749,-122.4194",
        "org": "AS40475 Applied Operations, LLC",
        "postal": "94159"
      }{
        "ip": "128.31.0.34",
        "hostname": "moria.csail.mit.edu",
        "city": "Cambridge",
        "region": "Massachusetts",
        "country": "US",
        "loc": "42.3646,-71.1028",
        "org": "AS3 Massachusetts Institute of Technology",
        "postal": "02139"
      }
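
One caveat on the one-liner: the pattern `([0-9]+\.?){4}` makes the dots
optional, so it also matches any run of four or more digits (a year, for
instance). A stricter extraction could look like the Python sketch below. The
function name and the sample text are mine, and it only handles IPv4.

```python
import ipaddress
import re
from typing import List

# Four dot-separated groups of 1-3 digits, not embedded in a longer
# dotted number. Range checking is left to the ipaddress module.
CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?!\d|\.\d)")


def extract_ipv4(text: str) -> List[str]:
    """Return the unique, valid IPv4 addresses in text, in order of appearance."""
    found: List[str] = []
    for match in CANDIDATE.finditer(text):
        try:
            address = str(ipaddress.IPv4Address(match.group()))
        except ipaddress.AddressValueError:
            continue  # e.g. an octet above 255
        if address not in found:
            found.append(address)
    return found


if __name__ == "__main__":
    sample = "dirs 193.23.244.244 and 128.31.0.39, from 2014, not 999.1.1.1"
    for ip in extract_ipv4(sample):
        print(ip)
```

The output of this could be fed to the same lookup service one address at a
time, as the shell version does.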

------
mschuster91
The problem I have with this selector source code is that it is incredibly
complex to execute.

How is the NSA able to do this in realtime for all their interception points?!

~~~
AlyssaRowan
XKeyScore runs nearline, not in-line - and it's distributed.

------
ogijaoijfawje
I believe this "source code" is made up, invented for the masses. In fact, the
more and more I see of these surveillance reports and reveals, the more I
believe this is all purposeful deception, and that while it may be true
they're doing all these things, they aren't leaks, but announcements.

On a board where most of us should be familiar with the concept of not
trusting user input, I think we should all take a step backward and treat
these "leaks" as just that: input from an untrusted source. This could all be
a fabrication we're buying into.

~~~
mschuster91
Bad enough that you're using a throwaway, but I see what you're trying to say.

I personally believe NDR/Spiegel to be reputable, trustworthy media. It
may be that we have another Hitler Diaries scandal here, but the other
Snowden/NSA material they have published so far has held up under examination,
with the NSA, iirc, even acknowledging some of the docs as actually valid.

~~~
ogijaoijfawje
I'm not disputing that the NSA stuff is real - of course they're doing it,
it's what the agency is expected to do.

I'm disputing whether these are leaks or announcements. The media might not
even know they're part of the plan. Is it that farfetched for the NSA to say
"Hey, we want to make this information public for some reason, and we're going
to do so by using a whistleblower. You're going to release all these documents
to a bunch of media sources and live abroad for a while and be hailed as a
hero"? There's a long history of leaking information, factual or otherwise,
for a number of purposes. There's potentially no end to the rabbit hole. Maybe
Snowden doesn't even know he's part of the plan.

It's my opinion that the NSA, some group, or some individual, is letting us
know they're watching closely. Maybe they're acclimating us to the idea of
being spied on, or worse, distracting us from or preparing us for something
else. I think the big picture has yet to be revealed.

~~~
sroerick
Something worth considering is the conditions necessary for having both
subjective and objective expectation of privacy. If nobody expects to be able
to avoid the NSA, nobody can expect to have subjective expectation of privacy.

NSA will certainly know everybody's reaction to the Snowden leaks.

I figure the absolute worst case scenario imaginable is that USGOV is
producing a list of sysadmins and privacy conscious individuals for
extermination. Most "coding" seems to be happening now several layers up from
linux systems. If the Government wanted, they could kill everyone who knew how
to actually use a computer. Then they could "Teach low income Americans to
program" and then just conveniently forget to teach them the full stack. Note,
I don't think this is happening, but it isn't unimaginable.

In addition, the narrative has been very pro CIA since the very beginning.
There have been a lot of people making the clear distinction between machine
intelligence and human intelligence, which is definitely a CIA line.

------
philip1209
So the score may not be IPv6 compatible?

------
dthunt
It probably would have been a lot smarter to post this somewhere inside the
continental US.

------
sp332
What language is this?

~~~
acqq
Obviously a custom language (which uses $ like _Bash_ or _Perl_) that
allows the programmer to include pieces of C++. Hints: _static
std::string_ and _boost_!

 _The_ Boost, for which Google writes:

[http://google-
styleguide.googlecode.com/svn/trunk/cppguide.x...](http://google-
styleguide.googlecode.com/svn/trunk/cppguide.xml)

"Cons: Some Boost libraries encourage coding practices which can hamper
readability, such as metaprogramming and other advanced template techniques,
and an excessively "functional" style of programming.

Decision: In order to maintain a high level of readability for all
contributors who might read and maintain code, we only allow an approved
subset of Boost features."

