
Show HN: Tracking the FISA Court in real time with a bit of code - konklone
https://twitter.com/fisacourt
======
konklone
Background: the FISA Court only started publishing a public docket last month. It's at
[http://www.uscourts.gov/uscourts/courts/fisc/index.html](http://www.uscourts.gov/uscourts/courts/fisc/index.html),
and is a tiny flat page with links to scanned PDFs. They clearly update it by
hand, whenever something becomes public.

I wrote a Ruby script to watch the page every 5 minutes. When there's a
change, it texts me, emails me, and tweets as @FISACourt with a link to a diff
of the changes.
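
For anyone curious how the watching part can work, here is a minimal sketch of the polling and change detection described above. It is not the code from the repo: the names `fingerprint`, `changed?`, `watch`, and `POLL_INTERVAL` are made up for illustration, and it assumes that comparing a hash of the page body is enough to detect an update.

```ruby
require "digest"
require "net/http"
require "uri"

# Hypothetical names throughout; this is an illustration, not the repo's code.
POLL_INTERVAL = 5 * 60 # seconds, matching the 5-minute interval described above

# Fingerprint of the page body; a different digest means the page changed.
def fingerprint(body)
  Digest::SHA256.hexdigest(body)
end

# True when the body differs from the last fingerprint we saw.
# A nil fingerprint (first run) never counts as a change, so startup stays quiet.
def changed?(last_fingerprint, body)
  !last_fingerprint.nil? && last_fingerprint != fingerprint(body)
end

# Polls the URL forever, yielding the new body whenever it changes.
# The caller's block is where texting, emailing, and tweeting would go.
def watch(url, interval: POLL_INTERVAL)
  last = nil
  loop do
    body = Net::HTTP.get(URI(url))
    yield body if changed?(last, body)
    last = fingerprint(body)
    sleep interval
  end
end
```

Calling `watch(docket_url) { |body| notify(body) }` would start the loop, where `docket_url` and `notify` are placeholders for the docket address and whatever sends the alerts. Keeping the comparison in `changed?` separate from the network call makes it easy to test without hitting the site.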

When I get notified, I read the new documents and follow up the automatic
tweet with a hand-written one that explains the update, usually within just a
few minutes of the posting.

Simple, but it breaks news faster than the blogs and papers do. The code is
here: [https://github.com/konklone/fisa](https://github.com/konklone/fisa)

And I have some further explanation on my blog:
[http://konklone.com/post/following-the-fisa-court-the-
advanc...](http://konklone.com/post/following-the-fisa-court-the-advanced-
internet-way)

~~~
Amadou
If you aren't archiving every newly linked document, I'd like to suggest you
do so. You never know when a site like that is going to publish something that
they later retract, maybe even within minutes. It's good to have a copy for
cases like that.
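
A rough sketch of what I mean, assuming the documents are plain `href` links ending in `.pdf` and that a regex is good enough for such a small page. The helper names `pdf_links` and `new_links` and the example.gov URLs are made up for illustration.

```ruby
require "uri"

# Extract absolute URLs of linked PDFs from an HTML string.
# Assumes simple quoted href attributes; a real HTML parser would be more robust.
def pdf_links(html, base_url)
  html.scan(/href=["']([^"']+\.pdf)["']/i)
      .flatten
      .map { |href| URI.join(base_url, href).to_s }
      .uniq
end

# PDFs linked from the new version of the page that the old version lacked.
# These are the ones to download and archive right away.
def new_links(old_html, new_html, base_url)
  pdf_links(new_html, base_url) - pdf_links(old_html, base_url)
end

# Hypothetical page contents, for illustration only.
base = "http://example.gov/courts/index.html"
before = '<a href="docs/a.pdf">A</a>'
after = '<a href="docs/a.pdf">A</a> <a href="/docs/b.pdf">B</a>'
puts new_links(before, after, base)
```

Each URL this returns could then be fetched and saved with a timestamp, so a copy survives even if the link disappears later.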

~~~
konklone
Good call. I'll do that.

------
thinkcomp
Excellent work. I'm slowly adding entries on PlainSite...

[http://www.plainsite.org/courts/index.html?id=223](http://www.plainsite.org/courts/index.html?id=223)

[http://www.plainsite.org/flashlight/case.html?id=2487124](http://www.plainsite.org/flashlight/case.html?id=2487124)

~~~
konklone
Awesome! You might consider removing the years before 1978 on the FISA Court
page though:

[http://www.plainsite.org/courts/index.html?id=223](http://www.plainsite.org/courts/index.html?id=223)

------
klapinat0r
I guess we need to look for the buzzword "push" rather than "realtime"
nowadays. At least, that's closer to how I use "realtime", rather than 5-minute
mechanical polling.

"Sure, realtime-nazi, but what would you suggest?"

Personally I feel a delay of up to 30 seconds is acceptable, but I go more by
the underlying method of delivery (push/long poll for change vs. polling and
comparing for change). That's close to the definition here:
[https://en.wikipedia.org/wiki/Real-
time_web](https://en.wikipedia.org/wiki/Real-time_web)

> _receive information as soon as it is published by its authors, rather than
> requiring that they or their software check a source periodically_

For a consensus, see [http://stackoverflow.com/tags/real-
time/info](http://stackoverflow.com/tags/real-time/info) and/or
[http://stackoverflow.com/a/5286985](http://stackoverflow.com/a/5286985)

_Or am I wrong?_ What is a better label for what Wikipedia defines as
realtime-web?

~~~
konklone
This is why I put the crucial space between "real" and "time". I cannot hold
up to intensive buzzword scrutiny. :)

It's every 5 minutes because I don't want them to ban me. And in practice,
this is an acceptable delay.

~~~
klapinat0r
To clarify, I wasn't critiquing your approach, just the phrasing.

What you're doing is how it should be done for sources you don't have direct
access to.

I'd use 10 minutes though, as
[http://www.uscourts.gov/robots.txt](http://www.uscourts.gov/robots.txt)
points to a crawl delay of 10. Whether it's checked is the risk you take.

~~~
konklone
Right, now I remember - just double-checked, and Crawl-delay refers to
seconds, not minutes.

[http://en.wikipedia.org/wiki/Robots_exclusion_standard#Crawl...](http://en.wikipedia.org/wiki/Robots_exclusion_standard#Crawl-
delay_directive)
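
To make the seconds-vs-minutes point concrete, here is a small sketch that pulls the Crawl-delay value out of robots.txt text and compares it to a polling interval. The function name `crawl_delay_seconds` is made up, the sample robots.txt is illustrative rather than a copy of the real file, and the sketch ignores User-agent grouping, so it just returns the first Crawl-delay it finds.

```ruby
# Returns the first Crawl-delay value (in seconds) found in robots.txt text,
# or nil if there is none. Simplification: User-agent groups are ignored.
def crawl_delay_seconds(robots_txt)
  robots_txt.each_line do |line|
    line = line.sub(/#.*/, "").strip # drop comments and surrounding whitespace
    if (m = line.match(/\Acrawl-delay:\s*(\d+(?:\.\d+)?)\z/i))
      return m[1].to_f
    end
  end
  nil
end

# Illustrative robots.txt content, not the real file.
sample = "User-agent: *\nCrawl-delay: 10\n"

poll_interval = 5 * 60 # seconds
delay = crawl_delay_seconds(sample)
puts "polite enough" if delay.nil? || poll_interval >= delay
```

With a Crawl-delay of 10 seconds, a 300-second polling interval is well on the polite side.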

~~~
klapinat0r
Oh, my bad. Sorry. Thanks for correcting me.

------
pivnicek
Great idea! Watching the watchers is a valuable public service. Cheers!

------
unreal37
This is a smart idea. All the power to you.

