Either the $20 million claim is wrong, in which case all the information on the slides is suspect; or it is correct, and the scope of PRISM is much smaller than is widely believed (and than the author of this article believes); or the author understands the scope correctly and his calculation is in error.
Based on the new slides from WaPo, PRISM collection feeds into several other, non-PRISM systems: the now-known MAINWAY (telephone metadata), MARINA (internet metadata), and NUCLEON (voice content).
We've said things like "this is probably happening" for a very long time. That is in no way a novel idea. What is novel, and why these discussions are happening so frequently now, is that we have evidence that a $20M/year program is actually running.
So when we're talking about things we have evidence for, let's please avoid throwing in conjecture.
Multiple replies below have already questioned the either-or choice you present. From what I know about government agency presentations to the higher-level authorities who set budgets, the likely claim on the slide is that the marginal cost of PRISM as such, in an environment where the NSA already runs other programs and the facilities behind them, is an insubstantial $20 million. Even on the more extravagant assumptions of the submitted article, that might well be a true claim for a PRISM program that gathers and analyzes quite a lot of data. It's especially likely if the NSA has low-cost in-house software development capabilities, as it surely does.
Additionally, the posting assumes that all the data is stored; that is a lot of cat videos. With decent preprocessing you can probably cut the data rate by a rather large factor (I would assume at least 100x, since you do not need to store warez or the NYT homepage). To make the opposite estimate, assume the system is CPU-bound: you need hardware that can process 120 GB/s. For roughly $10M you can buy a few thousand machines, and your PRISM software then needs to handle something like ~50 MB/s per machine (which may or may not be a reasonable data rate, depending on the sophistication of the algorithms and how much can be discarded very cheaply). A rough sketch of the arithmetic is below.
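Here's that back-of-envelope in code form; every number in it is an assumption from this comment (input rate, budget, per-machine cost, 100x preprocessing cut), not anything taken from the slides:

```java
// Back-of-envelope check of the numbers above. All figures are assumptions.
public class PrismEnvelope {
    public static void main(String[] args) {
        double rawRateGBps = 120.0;        // assumed CPU-bound input rate, GB/s
        double budgetUsd = 10_000_000;     // rough hardware budget
        double costPerMachineUsd = 4_000;  // assumed price of a commodity server

        double machines = budgetUsd / costPerMachineUsd;        // ~2500 machines
        double perMachineMBps = rawRateGBps * 1024 / machines;  // ~49 MB/s each

        System.out.printf("machines: %.0f%n", machines);
        System.out.printf("per-machine rate: %.0f MB/s%n", perMachineMBps);

        // If preprocessing cuts the stream 100x, the long-term storage rate:
        double storedGBps = rawRateGBps / 100;                  // 1.2 GB/s
        System.out.printf("stored after 100x cut: %.1f GB/s (~%.0f TB/day)%n",
                storedGBps, storedGBps * 86400 / 1024);
    }
}
```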
"worst case scenario" is emphasized in the article.
As the storage boxes in the article also have a nice CPU, the collected data can be indexed and then compressed, saving a lot of space.
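As a toy illustration of how far highly redundant captured text collapses, here's a run with java.util.zip's DEFLATE (the sample payload is obviously made up):

```java
// Compress a highly repetitive "captured" payload and report the ratio.
import java.util.zip.Deflater;

public class CompressDemo {
    public static void main(String[] args) {
        // 1000 copies of the same request line: pathological, but redundant
        // protocol traffic really does compress dramatically.
        byte[] input = "GET / HTTP/1.1\r\nHost: nytimes.com\r\n"
                .repeat(1000).getBytes();
        byte[] output = new byte[input.length]; // ample room for the result

        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        int compressed = deflater.deflate(output);
        deflater.end();

        System.out.printf("%d -> %d bytes (%.1fx smaller)%n",
                input.length, compressed, (double) input.length / compressed);
    }
}
```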
> Do you think that PRISM can be built using a different tech stack?
From https://en.wikipedia.org/wiki/Apache_Accumulo :
> Apache Accumulo is a sorted, distributed key/value store based on Google's BigTable design. It is a system built on top of Apache Hadoop, Apache ZooKeeper, and Apache Thrift. Written in Java, Accumulo has cell-level access labels and server-side programming mechanisms.
> Accumulo was created in 2008 by the National Security Agency and contributed to the Apache Foundation as an incubator project in September 2011.
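For the curious, here's a minimal sketch of what those cell-level access labels look like in Accumulo 1.x client code. The instance, user, table, and label names are invented for illustration:

```java
// Write one cell guarded by a visibility expression; only scanners whose
// authorizations satisfy "TS&SI" will ever see it.
import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Instance;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.security.ColumnVisibility;

public class VisibilityDemo {
    public static void main(String[] args) throws Exception {
        Instance inst = new ZooKeeperInstance("demo", "zk1:2181");
        Connector conn = inst.getConnector("analyst", new PasswordToken("secret"));

        BatchWriter writer = conn.createBatchWriter("records", new BatchWriterConfig());
        Mutation m = new Mutation("row-42");
        // The ColumnVisibility is stored with the cell itself, not the table.
        m.put("meta", "source", new ColumnVisibility("TS&SI"), "example-value");
        writer.addMutation(m);
        writer.close();
    }
}
```

That per-cell labeling, rather than per-table ACLs, is the part BigTable didn't have and is presumably why the NSA built it.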
It's much more effective as an employment/pension guarantee scheme than as an intelligence tool.
Reminds me of the quote from Yes Minister: "Something must be done. This is something. Therefore we must do it."
Here's how I would think about it if I were building this. The total hardware costs are $168M, and the total personnel costs are $4M. Say I pay $500K instead of $200K, and in doing so I get Jeffrey Dean instead of someone like me (I suspect I'd have to pay more than $500K for Jeffrey Dean, but bear with me). My personnel costs have more than doubled, but the efficiency of the system might be 5x or 10x better, because I'm merely quite good at my job and he's a total legend. That efficiency scales the total hardware cost, which dwarfs the personnel cost (rough math below). $500K starts to look pretty cheap at that scale.
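Plugging in the numbers (all of them this comment's assumptions: the $168M/$4M split from the article, a 2.5x salary bump, a 5x efficiency gain):

```java
// The tradeoff sketched above: pricier engineers shrink the hardware bill.
public class HiringMath {
    public static void main(String[] args) {
        double hardware = 168e6;   // article's hardware estimate
        double personnel = 4e6;    // article's personnel estimate

        double baseline = hardware + personnel;              // $172.0M
        // Pay 2.5x ($500K vs $200K) for a system that is 5x more efficient.
        double withLegend = hardware / 5 + personnel * 2.5;  // $43.6M

        System.out.printf("baseline: $%.1fM, with better engineers: $%.1fM%n",
                baseline / 1e6, withLegend / 1e6);
    }
}
```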
I wouldn't sell out my fellow man for anything less.
I would expect PRISM to at LEAST cost more than Instagram, no? :)