

Ask HN: Rescuing a book collection? - ig1

In early 2001 there was a major piece of digitization work to scan and put online the Hockliffe Collection (a collection of about 1000 early children's books from 1685-1900). Sometime in the mid-2000s they seem to have lost their funding. Their website is now broken, the staff no longer work there, and the physical book collection has been merged with a bigger library. I don't know if the bigger library has a copy of the scanned works.

I wasn't involved in the project, but at the time I scraped their website for some literature analysis I was working on, so I've got a complete copy of all their scanned books.

What should I do with them? I'd be hesitant to post them online myself without direct permission from someone (although I'm guessing the books are all out of copyright), but I don't want this valuable collection just to be lost either.

Would someone like the Internet Archive accept them knowing that I didn't have any specific ownership of them?
======
Jun8
In 2009 Wikipedia ran into a similar problem with the National Portrait
Gallery:
[http://www.examiner.com/museum-in-san-francisco/internet-101...](http://www.examiner.com/museum-in-san-francisco/internet-101-who-owns-the-images?render=print).
Here's what Wikipedia has to say about the C&D letter:
[http://en.wikipedia.org/wiki/National_Portrait_Gallery_copyr...](http://en.wikipedia.org/wiki/National_Portrait_Gallery_copyright_conflicts).
Someone figured out how the gallery's image servers worked and downloaded the
originals (normally you were allowed to see only low-res copies). As of today,
they are still on Wikipedia
([http://commons.wikimedia.org/wiki/Category:National_Portrait...](http://commons.wikimedia.org/wiki/Category:National_Portrait_Gallery,_London)).

My recommendations are:

* Build a nice website, but don't put too much time into it (~a weekend perhaps) and put the images online

* Make sure you give proper reference as to original source of the images

* Try to contact someone from the Hockliffe Collection (but don't try _too_ hard); as rmah says, this is the polite thing to do. Document your effort on a blog for all to see.

* Don't feel squeamish about asking for donations to cover server expenses, etc. In fact, I would definitely go this route rather than putting ads on the page.

* If someone sends a C&D, contact the Wikipedia guys and see if they can help. Generally the threat of negative publicity is enough for institutions to back down.

They may still be able to shut you down; after all, you don't have the clout
of Wikipedia. But still, I think the utility you'll provide to the general
public _far_ outweighs any risk.

~~~
ig1
I could probably stick them up on S3 fairly cheaply, so I'm not really worried
about the cost, but I think hosting them myself would likely just be a
temporary solution. I'd rather hand them over to someone who could give them a
long-term home.

Maybe I should just seed it onto BitTorrent; at least that way I'll know I'm
not the only person who still has a copy of the book scans :-)
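
Whichever home the scans end up in, a checksum manifest would let the recipient confirm that nothing was lost or corrupted along the way. A minimal sketch in Python, assuming the scans sit in a local directory; the function names and the `hockliffe_scans` path in the usage note are hypothetical, not anything from the original project:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def write_manifest(root: Path, out_file: Path) -> None:
    """Write '<digest>  <relative path>' lines, the layout `sha256sum -c` reads."""
    manifest = build_manifest(root)
    lines = [f"{digest}  {name}" for name, digest in manifest.items()]
    out_file.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Something like `write_manifest(Path("hockliffe_scans"), Path("MANIFEST.sha256"))` would produce the file; keep the manifest outside the scanned directory so it does not end up listing itself on a later run.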

~~~
Jun8
I don't know about BitTorrent; to a lot of "normal" people it smacks of
piracy. I think S3 is a much better idea. Once you get some publicity and
traction, which I'm sure you will, you can then ask for volunteers to host
them.

I for one am looking forward to reading from these books to my son on my iPad!

~~~
cma
S3 supports BitTorrent...

~~~
cma
<http://en.wikipedia.org/wiki/Amazon_S3#Design>

------
rmah
Most (if not all) of the actual books should be out of copyright at this point
and thus in the public domain. This means, from a copyright perspective, there
is no longer any ownership of the books and you can do anything you want.

However, since the scanning was done recently, the scans may be copyrighted by
the Hockliffe Collection. The website certainly would be. It may be prudent to
speak to a copyright attorney about it. I think it would also be polite to
contact someone from the original foundation, if possible, to discuss this.

~~~
_delirium
> However, since the scanning was done recently, the scans may be copyrighted
> by the Hockliffe Collection.

In the U.S. at least, scans probably aren't subject to a separate copyright;
they're too direct and uncreative a copy, so they just inherit the copyright
of the original work. Wikimedia Commons advises people to operate as if scans
aren't separately copyrightable, anyway:
[http://commons.wikimedia.org/wiki/Commons:When_to_use_the_PD...](http://commons.wikimedia.org/wiki/Commons:When_to_use_the_PD-scan_tag)

------
iterationx
Maybe Project Gutenberg would like them... Also Amazon or Google Books might
be interested in them.

~~~
ig1
I've only got the images and not the text form. I tried running them through
ABBYY FineReader (the OCR package that PG normally use) a while back and
didn't get great results, presumably because of the age of the books. So I'm
guessing Project Gutenberg would be unlikely to be able to make use of them.

~~~
ars
Send it to Distributed Proofreaders <http://www.pgdp.net/> - they work with
Project Gutenberg to OCR books.

You can also sign up and help proofread OCR'd books (every OCR is manually
checked by humans).

