

Markov & statistical models unlock ancient script - sgoraya
http://www.sciencedaily.com/releases/2009/08/090803185836.htm

======
ggruschow
They haven't unlocked anything. At best, they've gained confidence in the type
of lock.

The title on the site is bad ("_unlock_ more secrets of ancient script"),
especially for a place called "_science_ daily". The title here is atrocious.
The end of the first paragraph is "researchers are using mathematics and
computer science to _try_ to piece together information about the _still-
unknown_ script."

~~~
jimmybot
Agree about the bad title; but it does resolve an important question, which
was whether the script was a language at all. Establishing that it has the
regularities of human language was a necessary first step.
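
As a rough illustration of what "regularities" can mean here, the sketch below
estimates the conditional entropy of the next sign given the current one from
adjacent-sign counts, which is the quantity a first-order Markov model
captures. This is not the researchers' method or data: the function name
`conditional_entropy` and the toy sign sequences are assumptions made up for
the example.

```python
import math
from collections import Counter


def conditional_entropy(sequences):
    """Estimate H(next sign | current sign) in bits from adjacent pairs."""
    pair_counts = Counter()   # how often sign a is followed by sign b
    first_counts = Counter()  # how often sign a is followed by anything
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1

    total_pairs = sum(pair_counts.values())
    entropy = 0.0
    for (a, b), n in pair_counts.items():
        p_pair = n / total_pairs          # P(a, b)
        p_next = n / first_counts[a]      # P(b | a)
        entropy -= p_pair * math.log2(p_next)
    return entropy


# Hypothetical toy "inscriptions": each letter stands in for one sign.
rigid = ["abcabcabc", "abcabc"]   # every sign fully determines the next
loose = ["aabacbbcca"]            # every sign is followed by a, b and c equally

print("rigid ordering:", conditional_entropy(rigid))
print("loose ordering:", conditional_entropy(loose))
```

Low values mean the next sign is highly predictable, high values mean the
order looks close to arbitrary. A number like this says nothing about meaning;
it only describes how constrained the sequence is.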

Here's another interesting example. The _directionality_ of the Luwian script
couldn't be established. It was not certain whether it was read up-down,
up-down or up-down, down-up, or in some other order. This paper used
Expectation-Maximization to establish the correct reading order for the script:

Discovering the linear writing order of a two-dimensional ancient hieroglyphic
script <http://www.csie.ntu.edu.tw/~sdlin/publications.html>

Luwian script: <http://en.wikipedia.org/wiki/Hieroglyphic_Luwian>

------
jamesk2
Google should create a translation module for dead languages. Why stop at
scanning old books? Scan some ancient tablets Sergey!

~~~
jimmybot
Modern statistical machine translation works by training on large amounts of
already-translated data, called parallel corpora. Monolingual data is helpful:
it can help you fix grammar and choose more likely words, but it won't work
without a parallel corpus. Basically, you need a Rosetta stone, and if you had
one, humans could slowly deduce all sorts of things from an amount of data
that would probably be too small to be useful for an MT system. Actually,
scanning, or OCR, is a similar process: you first train on text that has
already been properly transcribed, then you have a model that can be used to
do OCR on new data.
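
To make the dependence on parallel data concrete, here is a minimal sketch of
learning word translations from aligned sentence pairs with a few rounds of
EM, in the style of the classic IBM Model 1 (simplified, with no NULL word).
It is not how any production MT system works; the three German-English
sentence pairs and the names `train_word_translation` and `best_translation`
are assumptions invented for the example.

```python
from collections import defaultdict


def train_word_translation(pairs, iterations=10):
    """Estimate t[(target_word, source_word)] = P(target | source) by EM."""
    src_vocab = {sw for src, _ in pairs for sw in src}
    tgt_vocab = {tw for _, tgt in pairs for tw in tgt}
    # Start from a uniform guess: every target word is equally likely.
    t = {(tw, sw): 1.0 / len(tgt_vocab) for sw in src_vocab for tw in tgt_vocab}

    for _ in range(iterations):
        count = defaultdict(float)  # expected (target, source) co-alignments
        total = defaultdict(float)  # expected alignments per source word
        # E-step: share each target word among the source words in its pair.
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac
        # M-step: renormalise the expected counts into probabilities.
        for tw, sw in t:
            t[(tw, sw)] = count[(tw, sw)] / total[sw]
    return t


def best_translation(t, source_word):
    """Return the target word with the highest P(target | source_word)."""
    candidates = {tw: p for (tw, sw), p in t.items() if sw == source_word}
    return max(candidates, key=candidates.get)


# Hypothetical toy parallel corpus: (source sentence, target sentence).
corpus = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]

table = train_word_translation(corpus)
for word in ["das", "haus", "buch", "ein"]:
    print(word, "->", best_translation(table, word))
```

Everything the model learns comes from which words co-occur across the two
sides of a pair. Delete the target side and there is nothing left to count,
which is the situation with an undeciphered script.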

------
luckyland
I think the researcher here has confused Bayesian inference with Markov
chaining? I'm doubtful this is an effective approach to unlock any kind of
-meaning- from the inscriptions.

If this isn't an article for PG's comment, I don't know what is :)

------
adi92
Such an approach might teach them everything about the language's syntax but
nothing about its semantics. But I guess deciphering that would be a lot
easier once they can identify the structural components of the sentences.

------
jberryman
Hmmmm... colon, open paren, close paren... curly brace... I think I've got it!

:(){ :|:& };:

...now to run the script!

NO CARRIER

------
pageman
The paper can be found here:
<http://arxiv.org/PS_cache/arxiv/pdf/0901/0901.3017v1.pdf>

