
Working with the Clinton State Dept Email Dumps in R, Part 1 - grej
http://rud.is/projects/clinton_emails_01.html
======
anulman
Fun fact: the "disconnected mini-graphs" the author mentions toward the end
could also be attributed to Clinton being included via cc / bcc.

Source: I've built & reviewed email graphs from IMAP & POP dumps too
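
To illustrate the point, here is a minimal sketch that assumes the graph is built from From/To pairs only (I don't know whether the article's graph does this). The message dicts and the names in them are made up. A thread where Clinton is only cc'd then shows up as its own island, and it merges back in once cc edges are counted:

```python
from collections import defaultdict


def components(edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, found = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        found.append(comp)
    return found


def edges_from(messages, include_cc):
    """Turn messages into (sender, recipient) edges, optionally counting cc."""
    edges = []
    for msg in messages:
        recipients = list(msg["to"])
        if include_cc:
            recipients += msg.get("cc", [])
        edges += [(msg["from"], r) for r in recipients]
    return edges


# Hypothetical messages, not taken from the actual dump.
messages = [
    {"from": "alice", "to": ["clinton"]},
    {"from": "bob", "to": ["carol"], "cc": ["clinton"]},
]

without_cc = components(edges_from(messages, include_cc=False))
with_cc = components(edges_from(messages, include_cc=True))
```

Without cc edges, bob and carol form a separate mini-graph; with them, everything hangs off Clinton again.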

------
danso
For anyone who wants the raw text for themselves (though I leave the
parsing/sorting to you), I wrote up this Python process that uses Poppler's
pdftotext to extract from the PDFs:

[https://github.com/datahoarder/secretary-clinton-email-dump](https://github.com/datahoarder/secretary-clinton-email-dump)

Though it's pretty cool that there's an API on the WSJ site that you can use
to leverage the parsing they've done.
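
If you just want the general shape of the pdftotext step, a rough sketch looks like this. It is not the code from the repo above; the function names and directory arguments are placeholders, and it assumes Poppler's `pdftotext` is on your PATH:

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path, txt_path, keep_layout=True):
    """Build the argument list for Poppler's pdftotext CLI."""
    cmd = ["pdftotext"]
    if keep_layout:
        cmd.append("-layout")  # keep the physical page layout in the output
    cmd += [str(pdf_path), str(txt_path)]
    return cmd


def extract_all(pdf_dir, txt_dir):
    """Convert every PDF in pdf_dir into a .txt file in txt_dir."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install Poppler first")
    out = Path(txt_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        target = out / (pdf.stem + ".txt")
        subprocess.run(pdftotext_command(pdf, target), check=True)
        written.append(target)
    return written
```

Calling `extract_all("pdfs", "text")` would leave one text file per PDF; parsing headers and bodies out of those files is still up to you.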

~~~
toomuchtodo
This is the WSJ's email parser:
[https://github.com/wsjdata/clinton-email-cruncher](https://github.com/wsjdata/clinton-email-cruncher)

------
ck2
Yeah, so there has to be a pony under all that horsesh*t somewhere, so keep
digging, as the joke goes.

Why again is all email public record but telephone calls are not?

I'd sure like to see all the emails from all the senators of this country
analyzed.

------
irixusr
I read through them a bit. I'd like to see the frequency of "pls print".

Seems to be her only contribution to every conversation...
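
Counting it should be easy once the emails are plain text. A quick sketch, assuming one `.txt` file per email in some directory (the directory layout and function names are my own invention):

```python
import re
from pathlib import Path

# Case-insensitive, tolerant of extra whitespace between the two words.
PLS_PRINT = re.compile(r"\bpls\s+print\b", re.IGNORECASE)


def count_pls_print(text):
    """Count occurrences of 'pls print' in one email body."""
    return len(PLS_PRINT.findall(text))


def count_in_dir(txt_dir):
    """Map file name to hit count for every .txt file that has at least one hit."""
    counts = {}
    for path in sorted(Path(txt_dir).glob("*.txt")):
        hits = count_pls_print(path.read_text(errors="replace"))
        if hits:
            counts[path.name] = hits
    return counts
```

Summing the values of `count_in_dir(...)` gives the total; dividing by the number of files gives a rough per-email rate.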

------
Gratsby
Only 17 ads on that page. I think I'd have a word with my professor for
sending me to a site like that.

------
chflags
Search the emails via search engine, e.g., Google:
[https://www.google.ie/search?q=site:foia.state.gov+/searchap...](https://www.google.ie/search?q=site:foia.state.gov+/searchapp/DOCUMENTS/+your+search+here)
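
If you want to generate these search links for a list of terms, something like the following should do. The function name is made up, and the query simply mirrors the pattern in the link above:

```python
from urllib.parse import urlencode


def foia_search_url(terms):
    """Build a Google site-search URL restricted to the FOIA document path."""
    query = f"site:foia.state.gov /searchapp/DOCUMENTS/ {terms}"
    # urlencode percent-escapes ':' and '/' and turns spaces into '+'.
    return "https://www.google.ie/search?" + urlencode({"q": query})


url = foia_search_url("pls print")
```

The result is percent-encoded rather than hand-typed, but Google treats it the same way.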

------
pete00
Are there any surprising names in the list? Most of the high frequency ones
are to be expected.

------
packetized
Might be better to have the direct link, as the iframe is a bit quirky on
mobile:

[http://rud.is/projects/clinton_emails_01.html](http://rud.is/projects/clinton_emails_01.html)

~~~
grej
Agreed, though I don't have permission to change the link. I guess that
requires a mod.

~~~
broodbucket
Yes, it'd be too easy to abuse. Post a piece of important news, get it upvoted
to the top, change the link to something malicious.

