
The Quest for a Universal Translator for Old, Obsolete Computer Files - Erlangolem
https://www.atlasobscura.com/articles/how-to-open-old-computer-files
======
xvilka
Reverse engineering is one route, as the LibreOffice team did for old "office"
file formats and various free CAD projects did for theirs. There are even
tools that help with reverse engineering file formats, like Kaitai [1].

[1] [https://kaitai.io](https://kaitai.io)
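
To give a feel for the declarative idea behind such tools, here is a minimal sketch in plain Python. It does not use Kaitai's own API or syntax. The file layout (a 4-byte magic followed by two little-endian 16-bit fields) and the names `HEADER_FIELDS` and `parse_header` are invented for illustration.

```python
import struct

# Hypothetical layout, invented for illustration: a 4-byte magic followed by
# two little-endian unsigned 16-bit fields.
HEADER_FIELDS = [
    ("magic", "4s"),
    ("version", "H"),
    ("record_count", "H"),
]


def parse_header(data: bytes) -> dict:
    """Decode the fields listed in HEADER_FIELDS from the start of data."""
    fmt = "<" + "".join(code for _, code in HEADER_FIELDS)
    size = struct.calcsize(fmt)
    if len(data) < size:
        raise ValueError(f"need {size} bytes, got {len(data)}")
    values = struct.unpack(fmt, data[:size])
    return {name: value for (name, _), value in zip(HEADER_FIELDS, values)}


# Build a sample header in memory instead of reading a real file.
sample = b"DOC1" + struct.pack("<HH", 3, 42)
header = parse_header(sample)
```

The point of describing the layout as data is that a wrong guess about a field can be corrected in one place while the decoding loop stays the same.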

~~~
speps
[https://kaitai-io.github.io](https://kaitai-io.github.io)

Your link gave me a warning that the certificate was only for github domains.
This one redirects to kaitai.io but doesn't warn.

~~~
zingmars
That's because it redirects to the http version of the page.

------
kwhitefoot
Was looking good until the penultimate paragraph where it says that this tool
will not be available to the general public. Would have been nice to have had
that up front so that I could have stopped reading right there.

------
vadimberman
If the physical media are indeed floppies and CDs, these guys will need to
take into account the high share of corrupted data. I know, I know, this is
not what the project is about, but without it, the outcome will be more
theoretical than practical.

I remember trying to read a decade-old stash of floppies a few years
ago. Too many were unusable. CDs are much better but also relatively
unreliable. My USB drives fared much better; even the 64 MB one with the
kangaroo logo still works.

~~~
8bitsrule
I haven't seen much discussion about how 20+ year-old CDs are doing. IIRC
about half of my self-burned CDs have failed after about 10 years. I wonder
what the situation is with factory-stamped -data- CDs. (I have yet to -hear- a
problem with my 30+ y.o. music CDs, but I suspect that some aren't 100%.)

~~~
jaclaz
Slightly off-topic (but not much) and as a single data point, IMHO part of
the issue with old "burned" CDs is not only the CD but also the "new"
readers (and their speed).

I still have a very old 1x SCSI CD drive (the kind that not many people will
remember, using a caddy-cartridge for the CD) on a computer dedicated to
"recovering" old CDs, and the success rate is much higher than on modern CD/DVD
readers. I would say 9 out of 10 with the SCSI drive vs. 6 out of 10 on a
modern CD/DVD drive.

~~~
8bitsrule
Very good point.

------
hprotagonist
I have a PDP-11/23 in my lab right now. It contains the data and code my
former advisor used to get his tenure-track job, as well as an interface to an
electrophysiology recording rig. Takes up a good part of the room.

I have no idea where to put this thing, but we virtualized the entire thing
years ago and spin up the VM when required.

~~~
ChickeNES
If you're looking to get rid of it and are in the Midwest I might be
interested... :P

~~~
hprotagonist
East coast. I've already asked several computing museums, whose answer was "we
have 5 in the basement already, but thanks!"

------
saagarjha
> Each new release tends to only support files created within the last two
> versions

Ugh, I hate software that does this. If you're creating an obscure proprietary
format for your files, the least you can do is support it.

------
tyingq
Laudable idea, but there's just so much variety of old platforms, file
formats, and backup formats. It's hard to guess what might be highest
priority. Maybe "Universal" just sounds a little strong to me.

Just covering every popular RISC/Unix platform would be daunting. Ever hear of
Pyramid OSx? It was once popular. That's skipping over mainframes (not just
IBM either, see
[https://en.m.wikipedia.org/wiki/BUNCH](https://en.m.wikipedia.org/wiki/BUNCH)),
mid-range (OS/400, MPE, VMS, Tandem), OS/2, BeOS, embedded platforms, and
much, much more.

~~~
neuromantik8086
There have been some efforts in the past to take care of at least one of the
problems that you enumerate. PRONOM [1], GDFR [2], and UDFR [3] sought (and
still strive in the case of PRONOM) to be more formalized versions of the
/etc/magic file, so that digital formats could be more readily classified
automatically.

Unfortunately UDFR and GDFR have fizzled out (a theme that occurs somewhat
often when projects have high ambitions and inadequate support). PRONOM is
still around, but has difficulty adding accurate information, since the
overlap between librarians and the technologists with the domain expertise
necessary for such a database to succeed is quite small. It would benefit
greatly from an infusion of engineers who wouldn't mind filling out forms with
corrections. :)

[1]
[https://www.nationalarchives.gov.uk/PRONOM/Default.aspx](https://www.nationalarchives.gov.uk/PRONOM/Default.aspx)

[2] [http://library.harvard.edu/preservation/digital-
preservation...](http://library.harvard.edu/preservation/digital-
preservation_gdfr.html)

[3] [http://www.udfr.org/](http://www.udfr.org/)
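
For anyone who hasn't looked at how /etc/magic-style identification works, here is a rough sketch of the idea: compare the leading bytes of a file against a table of known signatures. This is not how PRONOM or `file` is implemented; the `SIGNATURES` table and `identify` function are illustrative names, and the table only holds a handful of well-known signatures.

```python
# Each entry is (offset, signature bytes, label). Real registries hold far
# more entries and richer matching rules; this is only a toy table.
SIGNATURES = [
    (0, b"\x89PNG\r\n\x1a\n", "PNG image"),
    (0, b"%PDF-", "PDF document"),
    (0, b"PK\x03\x04", "ZIP archive"),
    (0, b"GIF87a", "GIF image"),
    (0, b"GIF89a", "GIF image"),
    (0, b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file"),
]


def identify(data: bytes) -> str:
    """Return the label of the first matching signature, or "unknown"."""
    for offset, magic, label in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return label
    return "unknown"


kind = identify(b"%PDF-1.4\n%...")
```

Note that a signature only identifies the container. A ZIP signature, for example, does not say which application's format is stored inside, which is part of why such registries need domain experts.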

------
neuromantik8086
This is a good introduction to some of the issues facing academic libraries
these days. For a more in-depth look at what strategies librarians are using
to provide access to older operating systems / formats / user experiences, I'd
recommend taking a gander at "Emulation & Virtualization as Preservation
Strategies" by David S. H. Rosenthal:

[https://mellon.org/media/filer_public/0c/3e/0c3eee7d-4166-4b...](https://mellon.org/media/filer_public/0c/3e/0c3eee7d-4166-4ba6-a767-6b42e6a1c2a7/rosenthal-
emulation-2015.pdf)

------
retox
I've worked on this problem in the past for modernization and migration away
from legacy languages to modern ones. You need to write a parser for the
source language, but that is a one-time upfront cost. The parser should
populate an intermediate, language-agnostic model, and then you can write as
many generators as you require to translate that agnostic model into your
target language of choice.

We had one-click solutions for COBOL to C, Java to C#, VB to Angular, etc.
Once you have the parsers and generators the work for each project after that
is minimal.
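
A toy sketch of that parser, model and generator split, to make the shape concrete. It is not the tooling described above. The source "language" is a single COBOL-flavoured statement form (`MOVE <integer> TO <name>`) chosen for illustration, and `Assign`, `parse`, `generate_c` and `generate_python` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Assign:
    """Language-agnostic model node: store an integer in a variable."""
    target: str
    value: int


def parse(source: str) -> List[Assign]:
    """Parse lines of the form 'MOVE <integer> TO <name>' into the model."""
    model = []
    for line in source.splitlines():
        words = line.split()
        if not words:
            continue  # skip blank lines
        if len(words) != 4 or words[0] != "MOVE" or words[2] != "TO":
            raise ValueError(f"unsupported statement: {line!r}")
        model.append(Assign(target=words[3].lower(), value=int(words[1])))
    return model


def generate_c(model: List[Assign]) -> str:
    """Emit one C declaration per model node."""
    return "\n".join(f"int {node.target} = {node.value};" for node in model)


def generate_python(model: List[Assign]) -> str:
    """Emit one Python assignment per model node."""
    return "\n".join(f"{node.target} = {node.value}" for node in model)


model = parse("MOVE 5 TO X\nMOVE 12 TO TOTAL")
c_code = generate_c(model)
py_code = generate_python(model)
```

Adding another target language means writing one more generator over `Assign`; the parser is untouched.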

------
voltagex_
File formats are fascinating:
[https://github.com/corkami/pics/blob/master/binary/README.md](https://github.com/corkami/pics/blob/master/binary/README.md)

For game file formats you may want Xentax
([http://forum.xentax.com/](http://forum.xentax.com/)) or the "fork" Zenhax
([https://zenhax.com/](https://zenhax.com/))

Open some files in a hex editor and have a look. Run "strings". Investigate
and learn.
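
If you don't have a hex editor or `strings` handy, both are easy to approximate. The sketch below is a simplified stand-in, not a reimplementation of the real `strings` tool: `printable_strings` only looks for runs of printable ASCII, and the sample bytes are made up.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of at least min_len printable ASCII characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]


def hex_dump(data: bytes, width: int = 16) -> str:
    """Format data as offset, hex bytes, and an ASCII column."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        text_part = "".join(
            chr(byte) if 0x20 <= byte <= 0x7E else "." for byte in chunk
        )
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


# Made-up bytes standing in for the start of an unknown file.
sample = b"\x00\x01SAVE\x00\xffPlayer One\x00ab"
found = printable_strings(sample)
dump = hex_dump(sample)
```

Embedded strings are often the first clue to a format: field names, version tags, and the name of the program that wrote the file tend to show up in plain text.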

------
maerF0x0
When I was younger I used to play BBS games. Things like LORD, ArrowBridge,
Usurper et al. It would be cool to get those running again as a website.

~~~
unscaled
Well, it's already been done for LORD: [http://lotgd.net/](http://lotgd.net/)

I can see other ways you could do that generically for every door game:

1. Run the game in local mode inside a DOSBox running under WASM in a webpage.

2. Run classic BBS software in DOSBox using nullmodem and a virtual FOSSIL
driver, and have a telnet server listen and redirect connections to BBS nodes
(you'd need multiple nodes). There already seem to be classic telnet BBSs
doing exactly that.

3. Run more modern BBS software like Synchronet on a modern OS and shell out
to DOSBox with a nullmodem connection (and possibly a FOSSIL driver) every
time someone creates a connection.

Options 2 and 3 are much more computationally expensive, but they will give
you proper multiplayer if you share folders correctly between the nodes.

