
Transcribing the Phyllis Diller Gag File - reedk
https://transcription.si.edu/phyllis-diller-cards
======
revelation
It's basically all typewriter font and expertly scanned. What exactly is the
issue with using even basic OCR?

~~~
whodunser
The Smithsonian is transcribing other, much more difficult works, such as the
cursive lab notebook of a historic astrophysicist[0]. I am ridiculously
jealous that they are getting this sort of crowd-sourced help to clean data.

Compare to hampanda.com (from Deepgram, YC W16)

[0]
[https://transcription.si.edu/transcribe/8634/ECOFD](https://transcription.si.edu/transcribe/8634/ECOFD)

------
payne92
Interesting project, but there are many design problems with how it involves
users.

Use OCR first, then use humans to verify.

Next, present a task right up front that anyone can help with -- draw people
right in. Don't make users "look for work", and minimize or eliminate the need
for training.

For example: "If there's a date shown, enter it here ______" (with an option
for "no date").

Or, "Correct this text as it appears on the card: ________"

Or, "Is there an attribution/credit mentioned? If so, enter it here
____________________"

etc.
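
A minimal sketch of the "OCR first, then humans verify" flow, assuming the OCR text for a card has already been produced by some engine (none is called here). The `MicroTask` type, the `build_tasks` helper, and the date pattern are hypothetical names made up for illustration, not anything the Smithsonian site uses:

```python
import re
from typing import List, NamedTuple

# Assumed date formats (e.g. "3/14/1972" or "Jan/1965"); real cards may differ.
DATE_PATTERN = re.compile(r"\b(?:\d{1,2}/\d{1,2}/\d{2,4}|[A-Z][a-z]{2}/\d{4})\b")


class MicroTask(NamedTuple):
    prompt: str   # question shown to the volunteer
    prefill: str  # OCR guess to confirm or fix; empty if there is none


def build_tasks(ocr_text: str) -> List[MicroTask]:
    """Turn one card's raw OCR output into small verification tasks."""
    date_match = DATE_PATTERN.search(ocr_text)
    return [
        MicroTask("Correct this text as it appears on the card:",
                  ocr_text.strip()),
        MicroTask("If there's a date shown, enter it here (or 'no date'):",
                  date_match.group(0) if date_match else ""),
        MicroTask("Is there an attribution/credit mentioned? If so, enter it here:",
                  ""),
    ]
```

Each task arrives prefilled with the machine's guess, so a volunteer only has to confirm or fix it rather than type the card from scratch.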

------
rmason
If you're ever in Washington, D.C., you can view Bob Hope's joke file at the
Library of Congress, where there's a special exhibit on him.

Hope's career started in vaudeville, then moved to radio, the movies, and
finally TV. Interestingly, he did several movies with Phyllis Diller, and she
was on a lot of his TV specials.

[https://www.loc.gov/exhibits/bobhope/jokes.html](https://www.loc.gov/exhibits/bobhope/jokes.html)

------
skookumchuck
It is a nice peek at the hard work behind the scenes that goes into being a
successful comedian.

