
Prepping for the Transfer of 25,000 Manuals - r721
http://ascii.textfiles.com/archives/4695
======
donpdonp
Context/Background link:
[http://ascii.textfiles.com/archives/4683](http://ascii.textfiles.com/archives/4683)

tl;dr: A technical manual reseller in Finksburg, MD is throwing out 25,000
high-quality paper manuals from as far back as the 1930s this week and Jason
Scott & friends are driving there with "$900 of banker's boxes" to save
whatever they can.

------
nadams
I know many people may be thinking "just throw it out". But you don't
understand - you may be faced with a system that is covered in one of those
manuals, only to find that some previous genius tossed that manual when they
cleaned out their office.

Or you may even come into possession of one of those systems and have no idea
how to use it. If Google hadn't crawled the manuals for certain products I
would have thrown away a number of electronic paperweights.

~~~
rasz_pl
Those are manuals for 40-80 year old technology, something a second-year EE
student could explain to you in their sleep. The only value they possess is
historical.

~~~
avian
This is incredibly naive. Many second year EE students today will not even
know what a vacuum tube is, much less be able to explain how a device using
them works.

Even if you're used to looking at discrete transistor circuits (and even that
is getting rare these days), a device with tubes can look like magic of the
highest order.

~~~
acdha
Seconded. Another factor is that looking at a device won't tell you why it was
designed the way it was. A good technical manual will be invaluable for
telling you things like valid ranges, service or environmental limits, etc.
which would otherwise need to be reverse-engineered. If you're trying to
replace an old system or studying the history, knowing what does and doesn't
matter can save a ton of time.

(This goes double if the manual was annotated by a good operator)

------
e0
Why does he say the Linear Book Scanner ([https://code.google.com/p/linear-
book-scanner/](https://code.google.com/p/linear-book-scanner/)) destroys
books? I thought the idea of the Linear Book Scanner was to automatically scan
books without cutting the binding.

Anyhow, cool project.

~~~
pmorici
The FAQ on the linear book scanner site says the following,

"Prototype 1 could scan the majority of books without damage, but may tear one
or two pages in some books. Out of 50 books tested, 45% had one or two of
their pages either torn or folded. This is a very early prototype and there
are many areas for improvement in the design."

~~~
m0dest
Look into commercial book scanners that use vacuums. This is a solved problem.
Kirtas is one of those manufacturers.

[https://www.youtube.com/watch?v=ds63ZBXFdLM](https://www.youtube.com/watch?v=ds63ZBXFdLM)
[http://www.kirtas.com/](http://www.kirtas.com/)

50 pages per hour with no damage to the books.

You're looking at $50k-$90k for the equipment plus $8k/year service contract,
though. So you need to figure out whether book scanning is something that the
Internet Archive is interested in, beyond this project.

~~~
pmorici
I was over there yesterday morning helping sort and it seemed like the vast
vast majority of the books were either ring binders or spiral bound and could
easily be taken out of the binding to be scanned in a standard scanner.

------
rebootthesystem
Having gone through the gut-wrenching task of choosing which data books to
throw out on more than one occasion I fully understand the value and sentiment
of this project.

As a young hobbyist and later engineer I learned TONS out of data books,
application guides and equipment manuals. I'd spend hours paging through data
books, learning about the various chips, going through the application notes,
building circuits, testing them and studying schematics when equipment
actually came with schematics.

Anyone who was "all in" in electronics did exactly the same.

To this day I've kept my National Linear Applications books and a few others.
eBooks have yet to capture the speed and convenience of holding a 500-page
book in your hands that you can page through and explore. Worse yet, they
can't match having five or six such books spread across your workbench as you
work on a design.

That said, having the ability to search books or, better yet, your entire
library, is useful. I don't buy programming books in paper form any more. And,
I still prefer PDF to any other eBook format. For me it tends to be a far
better experience across platforms.

This thread has made me think about the idea of digitizing my physical books.
I find myself thinking about this every few months. I have both engineering
and business books that will never be available in electronic form and I would
definitely like to preserve them and make them searchable.

Is there a service or a device one could use for this purpose? The linear book
scanner seems interesting yet apparently it is known to damage books. A
service could be interesting but it would have to be comparable to buying a
book, meaning, $20 per book or thereabouts, not $500 (or whatever). This would
mean they'd have to have a slick and low cost means to digitize books or
monetize the process in some form beyond charging for digitizing.

Building a scanner could be interesting, of course. I'm thinking about
bringing this up as a project for the FIRST FRC robotics team I mentor. You
never know what the kids might come up with.

Any resources on this front?

~~~
yourapostasy
When I looked into book scanning a few years ago, the Kirtas (mentioned
elsewhere in this thread) was, as far as I could tell from Net sources, the
reference method of fast, high-volume, non-destructive, high-quality
scanning, and it remains so. Even so, many libraries with Kirtas units still employ
someone to stand watch over the page turning and ensure only one page at a
time is flipped. Perfect page turning is apparently not a _completely_ solved
problem yet.

If procuring (and paying nearly $10K USD per year in maintenance fees) through
a hacker collective or maker space is infeasible in your area, then the
community at www.diybookscanner.org have a workable solution for a much
smaller subset of what the Kirtas units address, so you could look into that
as a modest workaround for the time being (though I wonder what results they
got for dewarping by simply taking pictures on all the sides of the scanning
target to synthetically construct a 3D volume, as perfect dewarping continues
to be an open and unsolved problem).

~~~
userbinator
_Even so, many libraries with Kirtas units still employ someone to stand watch
over the page turning and ensure only one page at a time is flipped_

Most books have page numbers; couldn't they use that along with OCR to detect
and retry skipped pages? Maybe even a state that shakes the pages more than
usual in an attempt to separate ones stuck together. It doesn't sound too
difficult to do (perhaps you'd have to tell it where the page number is),
given what the Kirtas machine costs.
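
A minimal sketch of that page-number check, assuming the OCR step has already
produced one page number per captured image (or `None` where it could not read
one). The function name, the `step` parameter and the input format are all
hypothetical; nothing here reflects how the Kirtas software actually works.

```python
from typing import List, Optional, Tuple


def find_skipped_pages(
    page_numbers: List[Optional[int]], step: int = 1
) -> List[Tuple[int, int, int]]:
    """Flag captures whose page number jumps further than expected.

    page_numbers: OCR'd page number per capture, None if unreadable.
    step: how much the page number should advance per capture
          (assumed 1 here; 2 if each capture only reads one side of a spread).
    Returns (capture_index, expected_number, actual_number) for each jump.
    """
    gaps = []
    last_idx = None
    last_num = None
    for idx, num in enumerate(page_numbers):
        if num is None:
            # Unreadable capture: skip it, but it still counts as one flip.
            continue
        if last_num is not None:
            expected = last_num + (idx - last_idx) * step
            if num > expected:
                gaps.append((idx, expected, num))
        last_idx, last_num = idx, num
    return gaps


if __name__ == "__main__":
    # Page 4 was never captured, so the capture at index 3 is flagged.
    print(find_skipped_pages([1, 2, 3, 5, 6]))
```

Unnumbered pages (plates, blank versos) come back as `None` and are simply
counted as one flip each, so they don't raise false alarms on their own.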

~~~
yourapostasy
The challenge seems to be that the OCR takes place in a post-processing phase
instead of in real time, and the desire is to catch the improper page flip
before putting away the book. Perhaps with one or more gigabit pipes, the
image processing can take place in the cloud in near real time.

The Kirtas units seem highly-regarded by conservators; they might have lots of
objections to even gentle shaking of their sometimes fragile charges. The
impression I get is that the slight vacuum employed by the Kirtas on pages is
the most handling that is accepted. There might be recent developments in
computer vision and robotic fingers which could see an improved robotic analog
to a human page flipper in the future.

My personal hunch is the popularization and (relative) mass adoption of the
slower, lower-tech open source book scanners will eventually outstrip the
dedicated scanning throughput of the high-end units, and put more digitized
content onto the Net, along with a legal fight over content "abandoned" by
publishers. When I digitize my content, it goes into my private collection,
but I sure wish publishers were more aggressive with digitization of the older
material, or lenient with letting that older material go into the public
domain if they aren't even chasing the long-long-long tail of that material
anymore.

------
bkrausz
If the sole limiting factor is manual labor, can he hire a bunch of people off
of Craigslist? Though I can't be there, I'd gladly donate for another person
with a truck to help.

------
monochromatic
> Why are these even worth anything or worth keeping, tidy your life, lighten
> up, etc. Either you really understand why 80 years of manuals, instructions
> and engineering notes related to 20th century electronics are of value both
> historically, aesthetically and culturally, or you don’t. To try to make the
> case would be a waste of time for both of us.

I'm not sure I do understand the motivation, but I don't think that I'm
_beyond understanding it._ Is it that some of these systems are still in
service? Is it just the history/archeology aspect?

~~~
engi_nerd
You would be surprised how often old technical manuals are useful. Examples
from my own work:

1) I was tasked with instrumenting the T58-GE-16 engines in a CH-46 [0]. So
what did I need to inform my sensor placement and selection? Some schematics
and technical manuals, all from the late 1960s, all undigitized.

2) I needed to reverse engineer an old test set. The documentation had been
lost to time. When I cracked it open, I saw lots of 5400 & 7400 series chips.
Now, this is kind of a trite example, because lots of working EEs still have
copies of the TTL Data Book at hand. But still, I needed to refer to that old
tome when working on this project.

3) When I worked at a NASA contractor, a primary piece of equipment failed. We
needed a replacement in a hurry. Fortunately, someone had kept the older
version of this system around. It dated from 1959 (!) but the manual was still
around, too. A quick read through that manual got us back in business.

Technology never dies [1]. But without the manuals to understand that
technology, things become much harder when you need to use that technology
again.

[0]:
[https://en.wikipedia.org/wiki/Boeing_Vertol_CH-46_Sea_Knight](https://en.wikipedia.org/wiki/Boeing_Vertol_CH-46_Sea_Knight)

[1]:
[http://www.npr.org/sections/krulwich/2011/02/04/133188723/to...](http://www.npr.org/sections/krulwich/2011/02/04/133188723/tools-
never-die-waddaya-mean-never)

~~~
phrogdriver
Thank you for 0), you may be able to guess why from my user name.

~~~
engi_nerd
I was working on instrumentation for testing an IR suppression system for the
Phrog's exhaust. Here [0] is the exact aircraft (BuNo 152578) sitting at the
Pax River NAS museum, with the IR suppression attached (and a nice big bundle
of our thermocouple wiring, too). From what I could find in the records, this
was either the 4th or 5th time the Navy had tried something like this, and the
results were less than promising.

[0]: [http://cdn-www.airliners.net/aviation-
photos/photos/3/0/8/16...](http://cdn-www.airliners.net/aviation-
photos/photos/3/0/8/1691803.jpg)

------
Animats
It's great that someone is doing this. The person involved is affiliated with
the Internet Archive, so they will know about their book scanning
capabilities.[1]

Once they have the books in storage, the next step is to take a picture of
each cover, and put those on line. With an inventory, people will be able to
ask for (and perhaps pay for) digitization.

[1]
[https://archive.org/details/partnerdocs](https://archive.org/details/partnerdocs)

------
verytrivial
Beyond the immediate rescue (which is by no means secured), perhaps someone at
the Society of American Archivists or the student membership thereof would be
interested? I'm seeing discussion here on HN regarding the worth of the
collection and how to handle it. SAA have mailing lists[1] and there appear to
be a couple of plausible Twitter handles[2].

[http://www2.archivists.org/initiatives/askanarchivist-day-
oc...](http://www2.archivists.org/initiatives/askanarchivist-day-october-1)
[https://www.google.co.uk/search?q=ssa+twitter+archivist](https://www.google.co.uk/search?q=ssa+twitter+archivist)

------
pmorici
Photos and updates on the progress available on Twitter...

[https://twitter.com/textfiles](https://twitter.com/textfiles)

------
userbinator
25k sounds like a lot, but from the pictures it looks like most of them are
not very thick and it'd be relatively easy to grab a whole stack of them at
once. From my estimation that is around the size of a small library.

If they can be removed from the shelves and boxed at an average rate of 5 per
second, that's 5,000 seconds or <1.5h at the most. Even after adding in trips
to the new storage location, packaging, unloading, etc., and considering it's
a trivially parallelisable task, it definitely seems doable to move the whole
collection of 25k within a few hours.

~~~
GauntletWizard
I doubt you'll get anywhere near 5 per second, even with a large number of
people working on it. A single person will likely need 5-10 seconds to grab a
manual from a shelf, place it in a banker's box, and move on to the next one.
You'd need 25-50 people to get that.

Still, even if it's 100 person-hours, it's an achievable goal; a dozen
volunteers over the course of a day can do it.
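
The two estimates in this subthread can be put side by side with a quick
back-of-the-envelope sketch. The function and parameter names are made up for
illustration; the inputs are just the figures quoted here (25,000 manuals,
5-10 seconds per grab, one manual or a stack of 25 per grab), and it ignores
boxing, carrying and driving time.

```python
def packing_estimate(manuals, seconds_per_grab, manuals_per_grab=1, workers=1):
    """Return (person_hours, wall_clock_hours) for pulling manuals off shelves."""
    grabs = manuals / manuals_per_grab
    person_hours = grabs * seconds_per_grab / 3600
    return person_hours, person_hours / workers


if __name__ == "__main__":
    # One manual at a time, 5 s and 10 s each, single person.
    print(round(packing_estimate(25_000, 5)[0], 1))       # 34.7
    print(round(packing_estimate(25_000, 10)[0], 1))      # 69.4
    # Stacks of 25 in 5 s, i.e. the "5 per second" figure.
    print(round(packing_estimate(25_000, 5, 25)[0], 2))   # 1.39
    # Slow case spread over a dozen volunteers.
    print(round(packing_estimate(25_000, 10, 1, 12)[1], 1))  # 5.8
```

Either way the shelf-clearing itself fits within a day for a dozen people;
the real spread between the two estimates comes down to how many manuals come
off the shelf per grab.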

~~~
userbinator
_A single person will likely need 5-10 seconds to grab a manual from a shelf,
place it in a banker's box, and move on to the next one._

If you look at the pictures like this one:

[http://ascii.textfiles.com/wp-
content/uploads/2015/08/IMG_69...](http://ascii.textfiles.com/wp-
content/uploads/2015/08/IMG_6993.jpg)

I could probably grab 25 or more of those at a time and set them in a box in 5
seconds, hence 5 per second. Getting the first ones out (because there is
little "gap" to stuff hands into on the shelf) will be slower, but once the
gap is made the whole pile easily comes out. This isn't about pulling one out
at a time, spending another few seconds inspecting it, and then putting it in
a box; it's about getting them off the shelves and out of the building ASAP.

~~~
raldi
Keep in mind, you also have to remove just one unique copy of each set, and
throw the rest away, being very careful not to accidentally throw out a
"duplicate" that is actually a similar-looking, but unique, manual.

~~~
acdha
Given the really hard time limit, wouldn't it make more sense to grab
everything as quickly as possible and then sort them over a couple months? That
requires more storage space but has the advantage of a bounded, easily-
calculated maximum time.

~~~
smackfu
Depends how many duplicates there are. If there are eight of each manual, then
you are talking about eight times as many boxes. One truckload is now eight
truckloads, one storage unit is now eight storage units. And now you have to
pay for discarding 7/8 of the manuals, which is not free.
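
As a rough sketch of that scaling, with every number hypothetical (nobody in
the thread knows the real duplicate count or how many manuals fit in a
banker's box):

```python
import math


def grab_everything_cost(unique_titles, copies_per_title, manuals_per_box):
    """Compare boxes needed with and without deduplicating on site.

    Returns (boxes_if_everything_taken, boxes_if_one_copy_kept,
             fraction_discarded_later).
    """
    total_manuals = unique_titles * copies_per_title
    boxes_all = math.ceil(total_manuals / manuals_per_box)
    boxes_unique = math.ceil(unique_titles / manuals_per_box)
    discard_fraction = (copies_per_title - 1) / copies_per_title
    return boxes_all, boxes_unique, discard_fraction


if __name__ == "__main__":
    # Assumed: 1,000 titles, 8 copies each, 25 manuals per box.
    print(grab_everything_cost(1_000, 8, 25))  # (320, 40, 0.875)
```

The box count scales linearly with the number of copies per title, which is
where the "eight truckloads" and the 7/8 discard figure come from.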

------
pmorici
Looks like the place is in Finksburg, Maryland, about 30 minutes from
Baltimore and an hour from DC.

