
I call bullshit. Phone metadata has been saved since forever, yes, but it is stored at ISPs, not at government organisations. There are strict regulations regarding the privacy of voice data over the phone (VoIP does not count as such, though), and I don't think the secret service and military secret service (AIVD and MIVD) can do anything they like. They have more permissions, such as demanding passwords for encrypted files as long as it's not for your own conviction (while normally you have the right to remain silent), but it probably doesn't go that far. The keyword searches are probably not true.

It is, however, worth mentioning that we have this CIOT system, a publicly known system that provides automated access to the name and address details behind any given Dutch IP address. The system is updated with ISPs' data every morning and can be queried at will. ISPs, even the most privacy-aware one (XS4ALL), do not give statistics of how often their part of the database was queried (I asked them), but it has been made public that the database had a total of 2.6 million queries over 2010 and 2.9 million in 2009. That's roughly one query for every six citizens, for no apparent reason.

Tech details: The CIOT system is a centralized search dispatcher that queries systems provided by the individual ISPs. A government official can enter an IP address there, and within seconds all ISPs have been queried and one of them probably returns a match.
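To make the dispatcher idea concrete, here is a minimal sketch of a fan-out lookup in Python. It is not how CIOT is actually implemented: the `make_isp` and `dispatch` helpers, the ISP names, and the records (which use documentation-only IP ranges) are all hypothetical, and each ISP backend is modelled as a plain in-memory dictionary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Optional

# Hypothetical stand-in for an ISP-side system: IP address -> subscriber record or None.
IspLookup = Callable[[str], Optional[dict]]

def make_isp(records: Dict[str, dict]) -> IspLookup:
    """Build a fake ISP backend from an in-memory table (assumption, for illustration)."""
    return records.get

def dispatch(ip: str, isps: Dict[str, IspLookup]) -> Dict[str, dict]:
    """Send the same query to every ISP in parallel and keep only the matches."""
    with ThreadPoolExecutor(max_workers=len(isps)) as pool:
        futures = {name: pool.submit(lookup, ip) for name, lookup in isps.items()}
    # Leaving the "with" block waits for all lookups to finish.
    results = {name: future.result() for name, future in futures.items()}
    return {name: record for name, record in results.items() if record is not None}

isps = {
    "isp_a": make_isp({"192.0.2.10": {"name": "J. Example", "address": "Example Street 1"}}),
    "isp_b": make_isp({"198.51.100.7": {"name": "A. Sample", "address": "Sample Lane 2"}}),
}

print(dispatch("192.0.2.10", isps))
```

The point of the design is that the central system holds no subscriber data itself; it only forwards the query and collects whichever ISP answers.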




>I call bullshit. Phone metadata has been saved since forever, yes, but it is stored at ISPs, not at government organisations. There are strict regulations regarding the privacy of voice data over the phone (VoIP does not count as such, though), and I don't think the secret service and military secret service (AIVD and MIVD) can do anything they like. They have more permissions, such as demanding passwords for encrypted files as long as it's not for your own conviction (while normally you have the right to remain silent), but it probably doesn't go that far. The keyword searches are probably not true.

Yes, because secret services have been known to strictly follow the law, and not do anything without telling you first.


I think it's dangerous to go on record beforehand claiming something is "bullshit". If I've learned anything over the last few days, it's that reports like these should be taken seriously and no stone should be left unturned to find out the truth. We can't just assume intelligence operations can't; we need to know they can't. Let the House of Representatives prove it's nonsense. Also, thanks for adding the bit about CIOT.


more information on the CIOT

http://ripe58.ripe.net/content/presentations/ciot.pdf

it says 250k queries per month... kinda hard to get warrants for all of them i guess


Like I care how hard it is for them to get warrants. My question to them is why do you even need all this data in the first place?


I think that is the question we all would like an answer to, but we don't all draw your conclusion from it (that if they don't need it they won't collect it).

We have a public transport system here that functioned just fine using anonymous cards, and it got replaced by one that allows near perfect tracking of every individual using public transport. Why anybody would want to is a good question, but it is the system we've got and the data is being kept.


I agree this is a bullshit story. They only store conversations when they have an eavesdrop approval.

A lot of people underestimate the amount of storage it would take to store all voice data.


So let's estimate:

http://www.telegeography.com/press/press-releases/2012/01/09... says there were 438 billion international (because that's all the NSA collects, right?) calling minutes in 2011 (in the world... not just the Netherlands).

Aberdeen will sell you 1 PB of storage for $495k: http://www.aberdeeninc.com/abcatg/petarack.htm

A narrowband speech codec will encode calls in excellent quality (for the PSTN) at 12 kbps.

So that's 438 * 10^9 minutes * 60 seconds/minute * 12000 bits/second / (8 bits/byte * 10^15 bytes/petabyte) (using the lying hard drive manufacturers' definition of a petabyte) = 39.42 PB.

Or less than $20mln/year. Which of course is the quoted budget of PRISM.
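For anyone who wants to rerun the storage arithmetic with different inputs, here is a minimal Python sketch. The constants are the figures quoted above; the variable and function names are only illustrative.

```python
# Back-of-the-envelope storage estimate using the figures quoted in this comment.
CALL_MINUTES_2011 = 438e9        # international calling minutes (TeleGeography figure)
CODEC_BITS_PER_SECOND = 12_000   # narrowband speech codec at 12 kbps
BYTES_PER_PETABYTE = 10**15      # decimal petabyte, as used above
DOLLARS_PER_PETABYTE = 495_000   # quoted Aberdeen price for 1 PB

def storage_petabytes(minutes: float, bits_per_second: float) -> float:
    """Convert call minutes at a given bitrate into petabytes of audio."""
    total_bits = minutes * 60 * bits_per_second
    return total_bits / 8 / BYTES_PER_PETABYTE

petabytes = storage_petabytes(CALL_MINUTES_2011, CODEC_BITS_PER_SECOND)
cost = petabytes * DOLLARS_PER_PETABYTE
print(f"{petabytes:.2f} PB, about ${cost / 1e6:.1f}mln")
```

Swapping in a 64 kbps codec or a different price per petabyte only changes the constants at the top.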


You're not counting the bandwidth, CPU, facility and personnel charges required to pull this off; raw storage is a minor part of the cost.


I'm not actually trying to imply that this is what PRISM does (no one has made that claim). I'm just saying that on a government scale, the cost of storing all voice calls ever made forever is not even very expensive.

So let's add bandwidth: the most expensive estimate I've seen is $0.019/GB <http://blogs.howstuffworks.com/2011/04/07/what-does-a-gigaby.... Let's assume the original audio is captured using G.711 (64 kbps). So that's 438 * 10^9 minutes * 60 seconds/minute * 64000 bits/second / (8 bits/byte * 1024^3 bytes/GB) * $0.019/GB = $3.72mln.

Let's add CPU: A medium-sized, high-CPU AWS instance is $0.0024/minute <http://aws.amazon.com/ec2/pricing/>. A moderate laptop-class processor can encode and decode 150 channels/core in real time <http://www.ietf.org/mail-archive/web/rtcweb/current/msg05236.... So that's 438 * 10^9 call minutes * $0.0024/CPU minute / 150 call minutes/CPU minute = $7.01mln.

Facility: The NSA's Utah facility is projected to cost $1.5...$2bln <http://en.wikipedia.org/wiki/Utah_Data_Center> and will contain a 100,000 square foot data center <http://nsa.gov1.info/utah-data-center/>. A 42U rack is about 7 square feet. Let's assume a floor occupancy of 25%. That's $2bln/facility / 100000 ft^2/facility * (7/0.25) ft^2/PB * 40 PB = $22.4mln.

I don't have a good estimate of the personnel involved, but I doubt it'd require anything out of the ballpark of the other numbers here. You could have every rack maintained and operated by its own PhD-level researcher at less than $10mln/year including all overhead and benefits.

A single JSF F-35A has a $207.6mln procurement cost (excluding R&D costs, maintenance costs, and operating costs) <http://en.wikipedia.org/wiki/F-35_Lightning_II#Program_cost_....
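The bandwidth, CPU and facility estimates above can be rerun with a short Python sketch. All inputs are the figures quoted in this comment, with the data center taken as 100,000 square feet; summing the three items and setting the total against the F-35A price is just one illustrative way to compare them, and the variable names are made up for this example.

```python
CALL_MINUTES = 438e9  # international call minutes per year, as quoted earlier

# Bandwidth: capture at G.711 (64 kbps), $0.019 per GB with 1 GB = 1024^3 bytes.
capture_gb = CALL_MINUTES * 60 * 64_000 / 8 / 1024**3
bandwidth_cost = capture_gb * 0.019

# CPU: $0.0024 per CPU minute, 150 call minutes processed per CPU minute.
cpu_cost = CALL_MINUTES * 0.0024 / 150

# Facility: $2bln for 100,000 ft^2, 7 ft^2 per 1 PB rack at 25% floor occupancy, 40 PB.
facility_cost = 2e9 / 100_000 * (7 / 0.25) * 40

total = bandwidth_cost + cpu_cost + facility_cost
F35A_PROCUREMENT = 207.6e6  # quoted procurement cost of one F-35A

for label, value in [("bandwidth", bandwidth_cost), ("cpu", cpu_cost),
                     ("facility", facility_cost), ("total", total)]:
    print(f"{label:>9}: ${value / 1e6:.2f}mln")
print(f"share of one F-35A: {total / F35A_PROCUREMENT:.0%}")
```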


Commercial speech compression algorithms are hamstrung by the need to add only milliseconds of delay: they can only compress over a 'window' of tens of milliseconds. You can almost certainly do a much better job of compressing speech in batches of minutes or tens of minutes: there is much more redundancy to remove. So if the spooks wanted to store massive amounts of speech data, they may have invested in such algorithms.


Storing voice (audio) data is not what the article says. I'd imagine you transcribe the audio to text and search in that. Storing text is incredibly easy. Besides, you can throw away 99.9% of the data almost immediately.

I'm actually curious how much text data this would be per day; number of call minutes * average number of words per minute. I'd be surprised if that wouldn't fit in a reasonable cluster.
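A rough way to put a number on that, as a Python sketch. The call volume is the 438 billion international minutes per year quoted elsewhere in this thread; the speaking rate and bytes per word are assumptions picked for illustration, not measured values.

```python
CALL_MINUTES_PER_YEAR = 438e9   # figure quoted elsewhere in the thread
WORDS_PER_MINUTE = 150          # assumption: typical conversational speaking rate
BYTES_PER_WORD = 6              # assumption: average word plus a space, plain text

def transcript_bytes_per_day(minutes_per_year: float,
                             words_per_minute: float,
                             bytes_per_word: float) -> float:
    """Uncompressed transcript volume per day: call minutes * words per minute * bytes per word."""
    minutes_per_day = minutes_per_year / 365
    return minutes_per_day * words_per_minute * bytes_per_word

per_day = transcript_bytes_per_day(CALL_MINUTES_PER_YEAR, WORDS_PER_MINUTE, BYTES_PER_WORD)
print(f"{per_day / 1e12:.2f} TB of plain text per day")
```

Under these assumptions the result is on the order of a terabyte per day before compression, and that is for worldwide international calls rather than a single country.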


You underestimate the CPU power needed to do this. The Netherlands has a population of 16 million; by comparison, Google Voice has about 1.4 million users. That is an order of magnitude difference. On top of this, they only transcribe voicemail, not all calls. What is the ratio of calls to voicemail?

Transcribing all voice calls in the Netherlands to text could easily be two orders of magnitude more computationally demanding than what Google Voice does.


I'm sorry, but do we really think that machine transcription of millions of cell phone conversations is worth anything? How can anyone believe that after using Google Voice?


So you use a hybrid approach. The text transcription can be fed into programs that look for specific phrases, build up social networks, etc. Then, for anyone you decide you actually want to monitor, you keep the audio as well as the machine transcription.

The machine transcription remains incredibly valuable for broad surveillance even though it is highly imperfect.


True actually. Ironically, a call itself is much more expensive than storing it for 20 years would be.

They do have a lot of eavesdrop approvals though, or so I heard from a colleague. (But that still doesn't mean they capture all the calls.)


Also, all telecom companies must provide CIOT with the details of every new and updated registered customer buying an MSISDN. For postpaid, all details including address, full name and date of birth are given; for prepaid as well, but that can depend on how the customer purchased the SIM.

Anyhow, why does it matter that much? If you have something to hide, then I'd be sweating. If not, who really gives a sh*te if people are tapping into our digital lives.

Facebook, Google and the rest are just as bad as the governments. They are invading us with advertisements in all parts of our digital life.

If people are worried about it, turn your crap off.


>If you have something to hide, then I'd be sweating. If not, who really gives a sh*te if people are tapping into our digital lives.

Because your "something to hide" may be something that is currently legal / acceptable / non-embarrassing / etc., but becomes illegal / unacceptable / embarrassing / threatening to those in power / etc. in the future. And because governments have been known to collect data, nominally for legal reasons, and use it for political purposes, to threaten, harass and intimidate people based on their political affiliation.

All of that said, it really comes down to the principle of the thing. I've said - and will continue to say - plenty of things that could endanger me in some hypothetical future. I actually tend to be very public with most of my thoughts, rants, ramblings, and what-not, as I have an attitude of "If you don't like what I say, fuck you" directed at the government and pretty much everybody else. I have almost nothing to hide. BUT... not everybody has that attitude, and some people care more about keeping their "stuff" private. And even I want the option of keeping certain things private when the need arises. Just because I'm, say, 99% transparent (whatever that means) doesn't lessen the importance of that "1% secret". And that's the rub... everybody probably has at least "1%" of things that they do want to keep private/secret, now or in the future. And they should have the option to do that if they want.


The CIOT system cannot legally be queried "at will"; they have to have permission. Although the bar for this permission is incredibly low, I believe there have been some cases where it was denied.


Of course, some government services have blanket permission. Example 1: the (Dutch) IRS.



