
Team uses Ohio Supercomputer Center to translate lesser-known languages - breck
https://www.osc.edu/press/disaster_relief
======
YeGoblynQueenne
>> Schuler’s team is working to build a Bayesian sequence model based on
statistical analysis to discover a given language’s grammar. It is
hypothesized this parsing model can be trained to learn a language and make it
syntactically useful.

I'm not sure who is "hypothesising" this but the consensus in the field of
grammar induction [1] has long been that natural language learning from
examples is intractable [2].

I also see some really funny claims about "rapid grammar acquisition" from
just 20,000 sentences. For comparison, in English that'd be about 2.8 MB of
text (assuming sentences with an average of 25 words and words with an average
of 5.5 characters). So, what exactly do you need a supercomputer for, if
you're training on just 3-ish MB?
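
For anyone who wants to sanity-check that back-of-the-envelope figure (the
per-sentence and per-word numbers are just the assumptions stated above):

    # Rough size of a 20,000-sentence English corpus.
    sentences = 20_000
    words_per_sentence = 25
    chars_per_word = 5.5

    size_mb = sentences * words_per_sentence * chars_per_word / 1e6
    print(f"{size_mb:.2f} MB")  # 2.75 MB, before spaces and punctuation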

Not to mention- 2.8 MB might be OK to train a simple statistical model, but
for grammar induction? Oh, you can forget about that!

I don't see any results either- just some "excitement" about how the thing
"seems to be yielding positive results". So- where are they?

Sorry but I have to say that this article is really a bit rubbish.

________________

[1] Grammar induction is the sub-field of machine learning that researches
the possibility of learning languages from examples.

[2] Oh, I don't know- try Colin de la Higuera, _Grammatical Inference:
Learning Automata and Grammars_. You won't find any source disagreeing on the
intractability issue.

~~~
glup
Of note, children demonstrate pretty robust grammatical expectations for
specific syntactic phenomena after hearing a comparable number of utterances,
with conflicting evidence regarding the availability or reliability of
negative evidence. Of course the proofs you cite from formal work are
unassailable (=math), but it's possible that "learning" tolerates more error
than identification in the limit (see PAC learning for example), and that
developing a facility with natural language has more to do with matching the
distributions over strings from the adult grammar than inferring the correct
grammar (identifying which utterances are in the language, vs. those that
aren't). Child language induction is often used as the model for zero-resource
natural language processing.

This seems to be the direction Dr. Schuler is coming from:
[http://aclweb.org/anthology/C/C16/C16-1092.pdf](http://aclweb.org/anthology/C/C16/C16-1092.pdf).
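
To make the "matching distributions over strings" idea concrete, here is a
minimal toy sketch (my own example, nothing to do with Schuler's actual
model): an unsmoothed bigram model that gets good at judging which strings
are likely without ever identifying a grammar in Gold's sense:

    from collections import Counter

    # Toy "adult" corpus; a stand-in for the speech a child hears.
    corpus = [
        "the dog runs",
        "the cat sleeps",
        "a dog sleeps",
        "the dog sleeps",
    ]

    bigrams, unigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        for a, b in zip(tokens, tokens[1:]):
            bigrams[(a, b)] += 1
            unigrams[a] += 1

    def prob(sentence):
        """P(sentence) under the unsmoothed bigram model."""
        p = 1.0
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for a, b in zip(tokens, tokens[1:]):
            if unigrams[a] == 0:
                return 0.0
            p *= bigrams[(a, b)] / unigrams[a]
        return p

    print(prob("the dog sleeps"))  # attested pattern: nonzero probability
    print(prob("dog the sleeps"))  # unattested order: probability 0.0

Whether getting the string distribution right counts as "learning the
language" is exactly the point under dispute, but it sidesteps the
identification-in-the-limit results.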

~~~
YeGoblynQueenne
I'm not convinced about the "learning to match the distributions of strings,
rather than the full structure of a language" bit. I'm not saying that you can
only learn a language in full. For instance, I speak Greek natively and I know
a bit of Italian. I don't have to know Italian perfectly to read books (and
fumetti) and communicate in it. Then again, I already know _a_ language so the
problem of learning a new one is quite a different one from learning the
first.

The problem I have with statistical language learning, especially in children,
is that, just because you can learn a distribution over strings doesn't mean
you should, or that you do, as a human. I'll move the goal posts a bit now
(but then again, you started talking about language learning in children) but
an important ability of language users is that they can produce coherent
utterances in context. Statistical language models can learn some structure
and parameters, but I have yet to see any statistical learner that can do
what humans do: know what it's saying and what is being said.

This tells me that, although statistical learning may take us even, say,
halfway to language learning, it won't get us all the way. I'm convinced that
there's something much simpler, much easier (in terms of computational
operations) that children -and adults- do when they learn language. We just
haven't found out what it is, yet.

Btw, limited human ability for introspection notwithstanding, I can tell that
when I try to understand a spoken or written utterance I definitely do
something sort of probabilistic- I will sometimes even consciously ask myself
what is more likely being said. But that doesn't tell me that language
learning, or intelligence itself, is probabilistic- in the same way that just
because I can walk I don't think of myself as a pair of legs with stuff on top
(accurate as that description might be).

Edit: upvote from me, for being the first person on HN that knows what I'm
talking about when I'm talking about Gold's result etc. I was starting to
think it was all a dream I had.

------
thechao
I do GPU driver development, and I _love_ GPUs, but articles like this, which
are extended statements of the form “GPUs are faster than CPUs”, are just
ignorant, or a sales pitch. A lot of times I see this process:

1. CPU code is slow;

2. Port code to GPUs; it’s super-fast!

3. Backport the low-cross-communication parallel codebase to the CPU: it's
even faster!

That is, the GPU port forces developers to think about fine- and large-scale
threading granularity, and to design low-communication algorithms. However,
having done this, the same algorithm runs _better_ on the CPU than the GPU,
because the GPU is starved of bandwidth.
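
Schematically (a toy Python sketch of the shape I mean, not anyone's real
codebase): the GPU-friendly rewrite splits the work into chunks that don't
talk to each other until a single combine step, and that same shape is what
the multicore CPU then runs fastest:

    from multiprocessing import Pool

    def partial_sum(chunk):
        # Each worker touches only its own chunk: no cross-communication
        # until the single combine step at the end.
        return sum(x * x for x in chunk)

    if __name__ == "__main__":
        data = list(range(1_000_000))
        n_workers = 8
        step = len(data) // n_workers
        chunks = [data[i:i + step] for i in range(0, len(data), step)]
        with Pool(n_workers) as pool:
            total = sum(pool.map(partial_sum, chunks))  # one combine step
        print(total)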

~~~
zeusk
/offtopic

I'm interning with a GPU driver development team next summer but have no
experience with graphics programming or shaders. I have some experience from
a university-level OS course and a past internship where I worked on Linux,
but not much else.

Can you recommend some resources on GPUs from the point of view of an
operating system/driver programmer? I tried some of the AMD and Nvidia
architecture sheets, but they were not basic enough for me to grasp, and
moreover they were oriented towards extracting performance from the
respective architectures.

~~~
thechao
The only API-facing GPU programming I’ve ever done is the first dozen-or-so of
the NeHe tutorials back in 2003. And lots of single triangles & quads.

I’ve written lots of software rasterizers. If you want to understand modern
graphics HW, then try to write a fully programmable (shader-based) software
rasterizer in less than 500 LOC of C/C++. It’ll really clarify the stages,
what they do, and why they’re included. Hint: use function pointers for the
shaders.
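
To make that concrete, here's a toy sketch of the pipeline shape in Python
(the real exercise should be C/C++; the shaders here are made up, and the
point is that they're plain function values, i.e. the "function pointers"):

    def edge(a, b, p):
        """Signed area test: which side of edge a->b the point p is on."""
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

    def rasterize(tri, vertex_shader, fragment_shader, width, height):
        """Minimal programmable pipeline: shaders passed in as functions."""
        image = [[(0, 0, 0)] * width for _ in range(height)]
        v = [vertex_shader(p) for p in tri]     # vertex stage
        area = edge(v[0], v[1], v[2])           # triangle setup
        if area == 0:
            return image                        # degenerate triangle
        for y in range(height):
            for x in range(width):
                p = (x + 0.5, y + 0.5)
                # Barycentric weights from the three edge functions.
                w0 = edge(v[1], v[2], p) / area
                w1 = edge(v[2], v[0], p) / area
                w2 = edge(v[0], v[1], p) / area
                if w0 >= 0 and w1 >= 0 and w2 >= 0:  # inside test
                    # Fragment stage: shade with interpolated weights.
                    image[y][x] = fragment_shader(w0, w1, w2)
        return image

    # Toy shaders: identity transform; barycentric weights as RGB.
    img = rasterize(
        [(10, 5), (50, 40), (5, 45)],
        vertex_shader=lambda p: p,
        fragment_shader=lambda *w: tuple(int(255 * x) for x in w),
        width=64, height=64,
    )

The fixed-function parts (setup, edge tests, interpolation) stay put; only
the programmable stages swap out, which is what makes the real HW stages
click.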

------
itchyjunk
I wonder if they are using deep learning models similar to Google's. If so, I
wonder if Google's Tensor Processing Units would have been computationally
more efficient.

Tangentially, can a neural network that has learned most human languages
somehow come up with a better language?

~~~
glup
What do you mean by better? More communicatively efficient? Less ambiguous?
Easier to learn?

No NN has developed a human-like, domain-general facility with language, but
even without doing that we can think about how variants of existing natural
languages perform on the above objective functions.
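
As a toy sketch of what scoring a language against such objective functions
might look like (made-up lexicon, deliberately crude metrics):

    # Toy lexicon: each form maps to the set of meanings it can express.
    lexicon = {
        "bank": {"river_edge", "financial_institution"},
        "dog":  {"canine"},
        "run":  {"jog", "operate", "manage"},
    }

    # Ambiguity: mean meanings per form (lower = less ambiguous).
    ambiguity = sum(len(m) for m in lexicon.values()) / len(lexicon)

    # Production effort: mean form length in characters (lower = cheaper).
    effort = sum(len(f) for f in lexicon) / len(lexicon)

    print(f"ambiguity={ambiguity:.2f}, effort={effort:.2f}")

A "better" language would push on one of these axes without wrecking the
others, which is why the question needs an objective function first.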

------
nategri
Awesome! I had an account on OSC for some of my graduate research. Fantastic
resource.

~~~
antognini
As did I! I ran a _lot_ of gravitational dynamics experiments on OSC in my
graduate student days.

------
ryacko
When will Codex Seraphinianus be broken?

