
The Cuneiform Tablets of 2015 [pdf] - akavel
http://www.vpri.org/pdf/tr2015004_cuneiform.pdf
======
WalterBright
One of the sad consequences of perpetual copyrights is that copies of
interesting items are not being made, and hence risk getting lost completely.

~~~
mintplant
A good example of this is the early iOS game _Rolando_. Once the _Angry Birds_
of its time, it's now just... gone.

~~~
ashitlerferad
This Rolando?

[http://rolando.ngmoco.com/](http://rolando.ngmoco.com/)

~~~
golergka
Yes. I remember it from 2008, when gaming press for iOS was trying to become a
thing. Some guys published great, expensively produced video reviews of iOS
games, and this was one of the best games they ever reviewed.

------
awinter-py
As someone who doesn't care about proprietary 80s laserdiscs but does care
about software produced in 2016, 'depend on the oldest still-used technology'
is an interesting (non-horrible) design choice.

The Python PEP for universal Linux binaries advises building on an ancient
version of Fedora for ABI compatibility.

If you think about it, supporting many different deployment targets is kind of
like time travel.

~~~
thechao
The "ancient version of Fedora" is the OS I work on for one of my deploys. It
takes a 3-level bootstrap to get a working C++11/C11 compiler.

~~~
awinter-py
ok so there are downsides to being a time traveler. But buildsystem blues is a
small price to pay for passing your final history exam, saving your parents'
prom, or finding Carmen Sandiego.

------
anateus
They mention Lincos, there's another project inspired by it that attempts to
transmit Scheme called CosmicOS:
[https://cosmicos.github.io/](https://cosmicos.github.io/)

~~~
gkya
As a side note, I like the (A B | C | D) and $A ideas from it; they would make
some nice syntactic sugar for an actual Lisp that we use today.

------
timthorn
> The Computer History Museum, run by “hardware guys,” has an extensive and
> impressive collection of vintage hardware of all kinds and from all eras,
> but it sits there as a collection of lifeless beige boxes.

And yet The Centre for Computing History in Cambridge has its exhibits powered
up - including a collection of several Domesday Machines.

~~~
outofband
I think this is the point of the Living Computer Museum -
[http://www.livingcomputermuseum.org/](http://www.livingcomputermuseum.org/)

------
aaronbrethorst
...and that's why I like to shoot film. It seems improbable that JPG or TIFF
will go obsolete any time soon, but—someday—they will. The RAW files from my
digital cameras are far less likely to be readable ten years from now.

But a negative or a positive image? Not a problem. Never gonna be unreadable.
There's a time and place for both formats, just like there is a time and a
place for both a Kindle book and a real paper book, and it's important that we
think carefully about which one the circumstances call for: convenience or
long-term survival.

~~~
WalterBright
Having a lot of faded old photos, I am not convinced of their longevity.

~~~
aaronbrethorst
My fiber-based prints, properly fixed, washed and toned, are likely to survive
for 150+ years.

~~~
WalterBright
Famous last words :-)

But seriously, how many photos does anyone bother to preserve on archival
quality materials? I've lost family photos due to fading, floods, and simply
losing them. I've digitized most of the remainder, in order to preserve them.

~~~
aaronbrethorst
For me? All of them that matter. I mostly print on gelatin silver paper in a
darkroom, and I don't use digital inkjet paper that contains OBAs (optical
brightening agents).

I have digital copies too, of course, and those are stored on my iMac, on a
backup hard drive, in Backblaze, and on S3.

------
biofox
Searchable online version of the BBC Domesday data:

[http://www.bbc.co.uk/history/domesday](http://www.bbc.co.uk/history/domesday)

~~~
Nutmog
Excellent, except it's half missing, as the article mentioned. This turned out
to be a dramatic demonstration of how difficult it is to preserve data, even
when you're trying and you're as big as the BBC.

"You may remember collecting data for the 'National Disc' which,
unfortunately, we have not been able to re-publish at this time."

------
Houshalter
The other day I was thinking about what would be required to preserve all
scientific knowledge. A great many papers have been made available by hackers,
but the file size is quite large. Many of them are scanned documents.

Raw text itself is pretty compressible, but PDFs record everything about the
layout and typesetting, font choice, small smudges on the page, etc. You could
maybe run OCR on them and get the text (if the OCR is reliable enough). But
then you lose equations, figures, unusual symbols, and other important info.

~~~
Thrymr
There has been some effort in Project Gutenberg to typeset public domain books
(and at least one journal issue) in mathematics:
[http://www.gutenberg.org/wiki/Mathematics_%28Bookshelf%29](http://www.gutenberg.org/wiki/Mathematics_%28Bookshelf%29)
Doing this for all of the scientific literature is rather daunting, however.

------
jerryhuang100
Cuneiform Tablets vs. 2013/2014:

[http://imgur.com/IaiqCXX](http://imgur.com/IaiqCXX)

~~~
Houshalter
Yes, but the storage capacity of the clay tablet is a dealbreaker. It could
maybe store 1 kB. And the file format has been lost, and is only usable by
historians who have spent decades deciphering it.

------
kragen
This paper is a major contribution to digital archival. I'm embarrassed that I
hadn't read it until now. Thanks, HN!

I've been thinking a lot about how to solve this problem myself.

Kay says their Smalltalk virtual machine for the 8086 was 6 kilobytes of
machine code; I think we can do several times better than that. The most
recent BF interpreter I wrote, in 2014,
[http://canonical.org/~kragen/sw/aspmisc/brainfuck.c](http://canonical.org/~kragen/sw/aspmisc/brainfuck.c),
is a bit over a page of code, and with -Os, it compiles to 863 bytes of i386
code (768 bytes .text, 38 bytes .init, 23 bytes .fini, 34 bytes .rodata).
Maybe the ideal archival architecture would take a little more code than BF to
interpret, because an actually efficient BF implementation has to do all kinds
of somewhat unpredictable optimizations.

Some kind of Forth machine, like Calculus Vaporis
[https://github.com/kragen/calculusvaporis](https://github.com/kragen/calculusvaporis),
is one possibility.

Another would be a simple register machine, something like the PIC; also in
2014, I wrote a proposal for a nearly-MOV-machine version called "dontmove" at
[http://canonical.org/~kragen/sw/aspmisc/dontmove.md](http://canonical.org/~kragen/sw/aspmisc/dontmove.md).
The C implementation at
[http://canonical.org/~kragen/sw/aspmisc/dontmove.c](http://canonical.org/~kragen/sw/aspmisc/dontmove.c)
compiles to 855 bytes of i386 machine code (38 .init, 784 .text, 23 .fini, 10
.rodata) and is dramatically more efficient than a simple BF implementation,
and unlike BF, it has features like memory indexing and subroutine calls; but
it's not as well tested.

Each of BF and Dontmove took me about half an hour to implement. Even though a
simple Dontmove implementation is exponentially faster than a simple BF
implementation, it's still orders of magnitude slower than native code. I'm
still exploring how to better bridge that gap; ideally implementing the
virtual machine wouldn't take the entire afternoon that is the goal of Chifir,
and it wouldn't run as slow as Chifir. I suspect that some kind of SIMT
architecture (like GLSL and GPUs in general) might be the right path, allowing
a simple emulator to amortize interpretation overhead over many lanes of data.
I expect Alan would be allergic to this idea.

As mentioned in the Cuneiform paper, Lorie and van der Hoeven have published
some papers on what they call a Universal Virtual Computer, directed at
archival, but unfortunately some of the design decisions in the UVC run
strongly counter to the goal of ensuring that from-scratch implementations
have a good chance of being compatible: bignum registers, for example, and
complicated fundamental operations like float division. The consequence is
that writing emulators to run on the UVC should be very easy, but no two
implementations of the UVC will be compatible, so those emulators will not run
successfully on new implementations of the UVC written after we are all dead.

An issue barely mentioned in this paper is I/O devices. You'll note, for
example, that Chifir has no mouse and no real-time clock, although it does
have a keyboard and a framebuffer; its keyboard interface is somewhat
underspecified but appears to lack control, alt, or other similar modifier
keys, and there are no key-release events, and as abecedarius points out,
_reading the keyboard is blocking_. This means that it will be impossible to
write simple video games for Chifir that move your guy only as long as you
hold down a key, and the lack of a real-time clock means that it can't do
animations at a constant speed. It's likely that the specification and
implementation of I/O devices for an archival virtual machine will require as
much effort as that of the CPU.

------
drostie
Somehow my brain turned off for the logo & the authors and I just started
reading the main text. So somewhere in section 5, I started thinking, "Wow,
this is great, it's like several talks I've heard from Alan Kay. ... ... Wait
a minute..." Yep, he's the coauthor.

------
fritzo
But there is a mere 3-4 decade gap separating (1) when incomprehensibly
complex but working computer systems are created, and (2) when sufficiently
powerful reverse-engineering systems can simulate those obsolete systems.

------
mmckeen
I went to UCLA with one of the authors, weird to see this pop up on HN.

~~~
eschaton
They wrote a paper with Alan Kay, why wouldn't it pop up on HN?

