
Our three year saga to release 13M pages of CIA secrets - danso
https://www.muckrock.com/news/archives/2017/jan/19/three-year-saga-behind-CIA-release/
======
danso
This part made me laugh a bit:

> _The declaration also says that CIA cannot release these TIFF files in
> electronic form because they can be so easily altered by the mere act of a
> CIA FOIA analyst looking at them, and that the security measures they must
> take to remove this accidental metadata for an electronic release (involving
> editing each file separately by hand) would take 28 years and 1,200 CDs._

Aside from the issue of the CIA apparently not being able to develop (or
download) a batch metadata editing tool, a common excuse agencies give for
releasing data as PDFs or image files, instead of Excel/CSV, is that the
latter is too easy for the public to manipulate:

[http://www.nytimes.com/2010/04/13/business/13docpay.html](http://www.nytimes.com/2010/04/13/business/13docpay.html)

> _Among the four leading drug companies making physician payment disclosures,
> Mr. Coukell said, Eli Lilly, which was the first to disclose, presents data
> as an Adobe Flash image, which he said was impossible to download or to
> sort. “They’ve gone out their way, I think, to present it as a Flash
> document,” Mr. Coukell said._

> _Mr. Dunston said Obsidian had to retype all the Lilly data._

> _Carole Puls, a Lilly spokeswoman, said the company purposely made its
> report impossible to download "to protect the integrity of the data." Lilly
> was concerned someone could change numbers and create a false report outside
> the company’s Web site, Ms. Puls said._

~~~
STRML
That's pretty sad, especially for an agency that must have a certain amount of
technical competence for their day-to-day.

This could be easily solved by releasing all documents and a list of their
hashes simultaneously.
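
A rough sketch of what that could look like, assuming SHA-256 and a plain
directory of released files. The function names here are made up for
illustration; nothing in the article says the CIA or MuckRock does it this
way.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_altered(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the manifest entries that are missing or no longer match."""
    altered = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file() or sha256_of_file(target) != expected:
            altered.append(name)
    return altered
```

The publisher would run `build_manifest` once and post the result alongside
the files; anyone holding a copy could then run `find_altered` against it. A
hash list only proves a copy matches what was published, so the list itself
would still need to come from a trusted location.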

~~~
dgrealy
The CIA keeps secrets for a reason. All this armchair talk from people wanting
to get CIA material into the open is pure idiocy.

Stupid and dangerous. You know these are the people who protect you from all
manner of sabotage and manipulation from foreign nations, right?

~~~
rl3
Declassified CIA material holds immense historical value. Each document is
reviewed prior to release to ensure there's no harm done to national security.

~~~
dgrealy
Declassified is fine, and that's what this is about ... I just see too many
people with no respect or knowledge thinking that secrets are by nature some
sort of problem, and wanting classified things to be in the public domain.

~~~
ionised
Just classify everything then. Problem solved.

------
ariwilson
In the vast majority* of cases, agencies of the US government should have all
non-classified data OCRed, indexed, hosted on their site and additionally
available on BitTorrent. Put together a reasonable "Open Access Data" budget
and make it available under the Library of Congress.

* Thinking things like astronomical data or other petabyte+ data sets.

~~~
cs02rm0
Indeed, I couldn't agree more. Though I'd be happy for them to be coerced into
providing the petabyte+ datasets too. If they can afford to store them they
should serve them.

I believe the UK signed up along with the US as part of the G8 to an Open Data
Charter [1] in 2013. In theory it aims to ensure that _all government data be
published openly by default_ , but in practice, as far as I can work out it
has no teeth. You can point government departments to it, tell them they
should release data, ask why they wouldn't and just get stonewalled.

I pushed particularly hard to get data from speed cameras released
(without personal details; just time, location, speed, etc.). I even offered
to aggregate the data, normalise it, run the hosting, etc., all at my own cost.
Nothing. As far as I can tell there's no one to go to who has the power and
influence to make it happen.

    
    
      [1] https://www.gov.uk/government/publications/open-data-charter/g8-open-data-charter-and-technical-annex

------
kumarski
I'm surprised nobody has scraped 100% of this and done OCR to make it all
clearly documented and tagged.

Seems like a good grad school research project.

------
toomuchtodo
Thanks for your efforts Team Muckrock. Keep fighting the good FOIA
transparency fight!

------
bronz
This is a little unrelated, but does anyone know where I can find articles or
reports that delve into the substance of the Podesta emails rather than the
circumstances surrounding them? I've been looking all day and I can literally
count on one hand the number of articles that report on the contents of the
emails. And those don't even go very deep!

~~~
danso
Pretend you're someone who conducts a lot of daily life via email, e.g.
messages related to your day job, setting up appointments, chatting with
friends, etc, and imagine that someone has leaked 2 months of those daily
emails as a public download.

Now imagine you are someone else tasked with writing something that captures
"the substance" of those emails. You've basically been asked to capture "the
substance" of someone's life over a 2-month period. The reason you don't
see many "deep" articles about the content of the emails is that John
Podesta is someone who is interesting to the public for a specific role:
Clinton campaign chairman. So the contents of his emails are analyzed in the
context of the presidential race, not for any other kind of deep contextual
analysis.

------
bane
Is this available via torrent yet? Preferably OCR'd?

~~~
ComodoHacker
[https://archive.org/details/CIA-CREST](https://archive.org/details/CIA-CREST)

~~~
bane
beautiful, thank you

