

[Ask HN] Any feasible way of managing large scanned image PDFs over 200MB? - Krshna

I have 400 PDF files, almost all of them over 200MB. They are scanned books, and all of them are image-only PDFs with no text layer. They can't be OCR'ed because no OCR software exists for the script used in the books. Almost every PDF reader on all three OSes, including the official Adobe Reader, chokes when loading these files or changing pages. So far the most responsive reader is Google Chrome's built-in one. Though it is not as snappy as it is with small PDFs, it is better than all the others. These PDFs have no metadata, so I want to write metadata to them and also annotate, comment, etc. Basically, I want to do everything that can be done with small PDF files. I tried Mendeley, but after taking 1 minute to load the first page it takes about 3 minutes to load 3 pages, and then it can't load anything. I have been a serious Mendeley user for quite a long time, but it fails here. Any suggestions for managing these large PDFs with an easy way to view them?<p>BTW: I also checked out some ebook readers, but they still need years of development before they can match paper books, not in terms of quality or environmental concerns but in rendering these large PDFs.
======
pwg
You left out the two most significant data points to help us help you: what
are the scan resolution and bit depth of the images inside these PDFs? If you
scanned these books at 600dpi in 32-bit color, then it is no wonder page
flipping is slow. If so, you should reduce the color depth to 1-bit (black &
white, assuming when you say "book" you really mean "book" as in black text on
white paper). Going to black and white would likely reduce your file sizes
substantially, while also speeding up page flips considerably.
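
To put rough numbers on that, here is a back-of-the-envelope sketch of how much
raw pixel data a viewer has to decode per page. The 6 x 9 inch page size is an
assumption for illustration, not something the poster stated, and the figures
are uncompressed sizes, so the actual bytes stored in the PDF will be smaller
and depend on the compression used.

```python
def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Return the uncompressed size in bytes of one scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical 6 x 9 inch book page scanned at 600dpi.
color = raw_page_bytes(6, 9, 600, 32)   # 32-bit color
bilevel = raw_page_bytes(6, 9, 600, 1)  # 1-bit black & white

print(f"32-bit color: {color / 1e6:.1f} MB per page")
print(f"1-bit b&w:    {bilevel / 1e6:.1f} MB per page")
print(f"ratio:        {color // bilevel}x")
```

Under those assumptions the 1-bit page carries 32 times less raw data than the
32-bit one, which is why the bit depth matters so much for page flip speed.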

------
bockris
You could try converting them to DjVu <http://djvu.org/>. That might help with
your size issue, but I have no idea whether the result will be readable on an
ebook reader.
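
For 400 files you would want to script the conversion. A minimal sketch,
assuming the separate pdf2djvu command-line tool is installed and on your PATH
(that tool is my assumption; the comment above names only the format). The
helper names and the example path are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def pdf2djvu_command(pdf_path):
    """Build the pdf2djvu command line for one PDF (output goes next to it)."""
    pdf_path = Path(pdf_path)
    out_path = pdf_path.with_suffix(".djvu")
    return ["pdf2djvu", "-o", str(out_path), str(pdf_path)]


def convert(pdf_path):
    """Run the conversion; raises if the tool is missing or exits non-zero."""
    if shutil.which("pdf2djvu") is None:
        raise RuntimeError("pdf2djvu not found on PATH")
    subprocess.run(pdf2djvu_command(pdf_path), check=True)


# Show the command that would be run for a hypothetical file.
print(pdf2djvu_command("scans/book.pdf"))
```

Calling convert() on each path from Path("scans").glob("*.pdf") would handle
the whole batch. Try it on one file first and compare size and legibility
before converting everything.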

