
> More than 100 boxes of the two men’s writings, correspondence, speeches and other items were contained in one of two modular buildings that burned to the ground at the Fountaingrove headquarters of Keysight Technologies. Keysight, the world’s largest electronics measurement company, traces its roots to HP and acquired the archives in 2014 when its business was split from Agilent Technologies - itself an HP spinoff.

https://www.pressdemocrat.com/article/business/hewlett-packa...

----

I would recommend folks upload material like this that they want archived to both archive.org and bitsavers.org

While a corp might have copyright, they don't have the right to have these cultural artifacts lost to the sands of time.




That's been a perennial topic on HN. My suggestion as to the expense of archiving it was that a student could be hired at minimum wage to photograph them, page by page, with a phone camera.

This suggestion was usually followed by a deluge of angry responses that archiving should be done properly by a trained archivist. Of course, that's expensive, and now the only thing archived is ashes.

I've personally archived a lot of family letters using this technique, and it's fast (several pages a minute) and plenty accurate. Just do it on the kitchen table in the sunlight.


> My suggestion as to the expense of archiving it was that a student could be hired at minimum wage to photograph them, page by page, with a phone camera.

I'm not sure you'd need a trained archivist, but my 15 year-old Brother multifunction laser allows you to scan around 20 pages in a minute through an automatic document feeder at 600dpi.

From a quick look it seems like modern, dedicated scanners in the sub-US$500 range (brand new) have even larger feeders and faster scan rates - being able to chew through a 50 page feeder in around a minute.


If you have such a device at hand, go ahead and use it.

I read the "minimum wage + phone camera" suggestion as "do whatever is prudent so stuff is archived _now_, in whatever quality; it doesn't even cost the earth."


I agree with you in principle - my comment was just that for anything except the smallest jobs, it would probably be more economical to get some dedicated hardware to assist the worker.


Yes, but starting from there will generate resistance because of the initial expense and lack of perceived value.

If you can get the role of archivist created then later on there can be an argument to increase the budget without so much risk of the project being cancelled altogether.


My sheet feeding scanner always jams when it has anything other than perfect copier paper in a stack. For folded, odd sizes, fragile, dirty, stapled, glued, torn, other paper, etc., it just doesn't work. Ya also have to constantly clean the platen with alcohol, or your scans get vertical lines on them.

(It's the same problem with a sheet feeding printer. If the printer paper isn't perfect, it jams.)

I agree that a dual side sheet feeder is awesome when it works.


Yeah, I have a ton of stuff I'd like to scan and have scanned some of it. But very little of it is in the category of just load in the hopper and you're done. And once you're looking at scanning hundreds, if not thousands, of pages of stuff and attaching metadata to it, it's pretty much a matter of lying down until the desire to digitize it goes away.

As a side note, I've been involved in a project to digitize the back issues of a student newspaper I used to work on. It's both very labor-intensive and expensive, given that the issues they only have bound copies of are a real pain to scan, even with an expensive large-format flatbed scanner.


The Fujitsu ScanSnap sheet-fed scanner we have is pretty forgiving about what will successfully run through it. It was about $550 with a full version of Acrobat.

I see the commercial version of these at drug stores and hospitals, so I'm guessing that the robust flexibility for scanning isn't unique to my experience.


I don't know. Our (I'm sure multi-K$) Xerox copier/scanners in the office are pretty temperamental if you don't give them fairly clean stacks of paper to scan--especially double-sided.


I'm using the Fujitsu ScanSnap sheet fed scanner :-)


> I'm not sure you'd need a trained archivist, but my 15 year-old Brother multifunction laser allows you to scan around 20 pages in a minute through an automatic document feeder at 600dpi.

Document feeders can and do eat documents.


> That's been a perennial topic on HN. My suggestion as to the expense of archiving it was that a student could be hired at minimum wage to photograph them, page by page, with a phone camera.

> This suggestion was usually followed by a deluge of angry responses that archiving should be done properly by a trained archivist. Of course, that's expensive, and now the only thing archived is ashes.

Isn't it likely that a student willing to work for minimum wage won't care enough to treat the material carefully, and will cause quite a bit of destruction or damaging disorganization?

You're also conflating archiving with digitization, when they're distinct activities.

IMHO, in most cases, it's also more likely that the original paper documents will survive in readable form than a mass of mediocre quality scans.


> a mass of mediocre quality scans

I challenge you to pick a random piece of paper, put it on the kitchen table, take a shot with your phone, and look at the result. Tell me it's not easily readable.

I also have an app on my phone that will OCR the jpg.


> I challenge you to pick a random piece of paper, put it on the kitchen table, take a shot with your phone, and look at the result. Tell me it's not easily readable.

That's a straw man. I never claimed that you can't take a readable photograph of a document with a phone.

The gist of my point is that your idea seems to be mainly about insurance against catastrophic destruction (which happens, but infrequently). You're very focused on the basic usability of the output of your proposed project, but you don't address 1) whether your digitization will survive long enough to do its job as insurance, or 2) the damage that your project could cause (e.g. poorly paid workers being careless and damaging or disorganizing things irreparably).

I've read a very little bit about archival science, but one of the basic things they emphasize is preserving the original organization, because important information can be encoded in it. That could easily get lost by minimum wage workers spilling documents on the floor, or rearranging things to make their job easier (e.g. when I used to scan receipts, I'd order them by width and rough length, because that would cause the fewest issues with the document feeder). Then you have issues with old fragile documents, accidentally tearing things out of binders, etc.



