

PACER Deleting Old Cases; Time to Fix PACER - thinkcomp
https://www.techdirt.com/articles/20140821/07015128275/pacer-deleting-old-cases-time-to-fix-pacer.shtml

======
engined
PACER is definitely interesting, a bit antiquated, and to date, the data has
mostly resided in the hands of the big information companies (Lexis, Westlaw,
etc.).

I've been building a system/website to access, search and develop intelligent
analytics from PACER court information. We're tracking cases, attorneys,
parties, judges, as well as the actual case dockets. The data is a treasure
trove of information, and if anyone's interested, I'd be very happy to chat
more about it.

The site (a signup for now as I'm working out the kinks in the system) is
www.docketleads.com. Email me there or ping me here for more info.

~~~
declan
I worked on a similar project a decade ago written mostly in Perl with the
frontend in PHP (hey, it was 2004, folks!). Just checked and I still even have
the old courtbot.com domain I registered for the project.

I suspect you'll find pretty quickly that there's a limit to how far regular
expressions or similar techniques can take you if you want to normalize and
reference precedents and make sense of cases. That's why Lexis and Westlaw pay
actual attorneys considerable sums to summarize cases, and why they can still
command such princely subscription fees even in 2014. But analytics might be
interesting. A family member is a judge, and her judicial office keeps track
of how many cases she decides per month, how many reversals she receives, etc.
I don't know if those are made public -- certainly I'm not aware of any
project to do it across a large data set, and I wish you luck with it.

~~~
engined
You're definitely right about case/precedent information, but what we've found
is that there's a whole other world of info that can be neatly organized with
a lot of crunching, and a small bit of manual manipulation.

The big guys chasing this are focused almost entirely on lawyers, in
the context of providing them case analysis tools. We've found a bit of a
different niche which doesn't need as much fidelity/granularity to the
information, but needs it nonetheless.

In any case, I'd love to chat about your experience, even if a decade old. Can
I PM you?

~~~
declan
Sure, happy to chat! What you're doing seems interesting, especially if you're
not targeting the lawyer/case research market. My email address is in my HN
profile. Though I am working nonstop on [http://recent.io/](http://recent.io/)
right now. :)

------
r00fus
I wonder if Recap [1] would help in addressing the censorship/deletion issue.
Ultimately, the way we fund these programs is the root of the problem (and the
privatization of what is supposed to be public data).

[1] [https://www.recapthelaw.org](https://www.recapthelaw.org)

~~~
rayiner
I don't get the bit about "privatization." PACER is run by the judiciary.

~~~
DannyBee
The judiciary sees it as a profit center. Folks have offered to essentially
buy the data and make it entirely public. But they see too much profit from it.

~~~
rayiner
Of course nobody is "profiting." There are no shareholders getting dividends
or execs getting bonuses. They use it to fund the operations of the judiciary in
the face of a shortage of funding from Congress.

~~~
DannyBee
Except, of course, that PACER has various requirements that conflict with
this, and they make it as hard as possible in order to keep this profit (which
was 150 million in 2008) up.

For example written opinions that "set forth a reasoned explanation for a
court's decision" must be free of charge.

They make it as difficult as possible to access this, and do not allow any
sort of bulk download, because doing so would make PACER/courtweb less useful
as a pay service.

------
thrownaway2424
It would cost Google negligible money to host this data and the only people
who would be upset would be the rent-seeking jerks responsible for the current
PACER debacle.

And EDGAR after that.

~~~
RubberSoul
What do you dislike about EDGAR? I find EDGAR to be pretty good. It's easy to
search and completely free.

------
amha
Un-fucking-believable. PACER has always been awful (I've used it since about
2005), but this is a new low---this is ACTIVE awfulness.

I assume, based on the weird specificity of what they're removing, that the
PACER office is doing this at the request of the individual courts. Which just
sort of underscores how awful this is---that courts get to decide how public
their own opinions are.

~~~
thinkcomp
Not so. The AO forced the courts to do it according to two people at the
Second Circuit.

The most likely explanation is that as part of the "upgrade" of CM/ECF (the
write component of PACER) they needed to jettison old databases that used a
different schema. This is of course nonsense. They've likely spent over $100
million on this upgrade since 2007, though actual numbers are surprisingly
hard to come by. For that price they could have probably afforded a few coders
to convert the older databases over.

~~~
toomuchtodo
Is there any way to obtain this old data through a FOIA? Or do those requests
not apply to US courts?

~~~
declan
Alas, FOIA applies only to federal government agencies that are in the
executive branch. It doesn't apply to federal courts or the U.S. Congress.

~~~
toomuchtodo
Well that's depressing.

~~~
declan
Yep. Though Congress _could_ liberate all of PACER, retrospective and
prospective, if it chose -- one data dump to Carl Malamud would do it. The
appropriations bills are wending their way through the legislative process
right now (mostly out of committee), and that might be a vehicle to add a one-
line amendment. Would require a lot of work in the next month or two.

~~~
toomuchtodo
Who do I call or whose door do I bang on?

~~~
declan
I'd point you in the direction of Carl Malamud, Jim Harper at the Cato
Institute, and EFF, probably in that order. Jim's made it a project to
liberate government data; Carl's gone further and made it his life's work.

Inside Congress itself? Hmm. I'm spending my time working on
[http://recent.io/](http://recent.io/) and not paying close attention
nowadays. But if you're local to the SF south bay try Rep. Lofgren? I've done
some Q&As with her and found she's one of the smarter and better-informed
members of Congress on tech policy issues.

------
oneweirdtrick
The day that PACER gets fixed is the day judges stop using WordPerfect.

------
MWil
I'll say it again: Bonkers!

