
Michael Arrington vs Jennifer Allen (Case 2:13-cv-00810-JLR) Filed 05/07/13 - protomyth
http://www.scribd.com/doc/140241091/Arrington
======
jerrya
This document, like so many other legal documents, has been scanned in.

I assume this is how it was delivered to and archived by the court, and not
just a scribd artifact.

The document was almost certainly produced on a computer, but it is archived
as a scanned in document. The downloadable pdf is not searchable, the
downloadable txt is useless.

Scanning legal documents invites errors (this document is tilted) and makes
OCR and searching difficult for the average lay readers of legal documents
(in many cases, the plaintiffs and defendants themselves), who do not have
access to cheap, accurate OCR.

When can we expect legal documents archived by a court to be routinely
available to the public in a machine-readable, searchable form, with full text
behind the image that can be cut and pasted?

In my own experience, the ability to use simple command-line text searches to
hunt through collections of legal documents is invaluable.

~~~
wglb
Having been through the scanning of a large number of personal documents, I
had good success getting very accurate OCR on printed documents. Further, the
software that the scanner comes with can also OCR PDF documents quite nicely.

This also includes mis-registered stuff.

I am about to start playing with Tesseract OCR to see how it compares. It is
claimed to be very good.

~~~
jerrya
I have actually had good success with the IRIS software bundled with an HP
scanner.

But:

a) It is Windows-only.

b) It is a good (in the best sense) example of what is still crippleware.

c) It is not available to people who don't purchase a scanner, even though
PDFs and legal documents often come via email.

d) It is slow.

e) It makes every recipient of that document have to OCR it, as opposed to
just teaching/demanding the lawyers and courts to distribute them in their
original Word/GoogleDoc/WordPerfect/TextPDF form.

~~~
wglb
I am using the Fujitsu ScanSnap scanner which also has all the parts needed to
run very nicely on Mac.

I agree that there is a bit of expense; the time is quite reasonable.

Insofar as

> teaching/demanding the lawyers and courts

good luck with that.

------
bridanp
When the news people themselves become the story, at what point do we realize
as a society that we have no reason to concern ourselves with it? TechCrunch
used to be a great place for tech type news, but then this fellow (founder I
suppose) became their main story.

------
parfe
Sucks to require a lawsuit to settle a bad breakup. The filing actually
includes direct quotations that Arrington contends are defamation, which at
least means the suit potentially has substance (versus a scare-suit just to
stifle Allen's discussion of the relationship).

~~~
spindritf
There seems to be quite a bit of substance[1].

> just to stifle Allen's discussion of the relationship

She accused him of multiple rape. That's not a "discussion of the
relationship."

[1] http://uncrunched.com/2013/04/11/jennifer-allen-false-defamatory/

------
mnicole
I can't parse anything Ms. Allen writes. There needs to be some sort of
textspeak/Twitter babble/missing characters translation after the original
statement. Arrington's lawyer does a good job trying to pick up the pieces of
her ramblings, but still.

------
GigabyteCoin
Talk about airing one's dirty laundry.

------
benguild
And it escalates.

