
Julia on Google TPU: Shakespeare RNN - KenoFischer
https://colab.research.google.com/github/JuliaTPU/XLA.jl/blob/master/examples/3_LSTM_DistributedTraining.ipynb
======
FridgeSeal
I’ve been using Julia for a project at work and it’s been a pretty fantastic
experience.

I’ve only been doing some analysis stuff at the moment, but I’ve got a few
machine learning/NLP projects coming up that I’m super excited to use Julia
and Flux for!

~~~
xiaodai
How did you find the compilation process?

~~~
FridgeSeal
Package compilation or code compilation?

Package compilation was painless, and code compilation isn't really noticeable
once things are running; the benefits you get from native speed are evident.
I recently learnt that the IO functions are async as well (powered by libuv!),
and the parallelism is easily an order of magnitude nicer to work with than
Python's (and it's only going to get better).
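
To give a flavour, here's a minimal sketch of that task-based style (the URLs
are placeholders, and `download` is just a stand-in for whatever IO you're
doing):

```julia
# Each @async call spawns a lightweight task on Julia's event loop
# (driven by libuv), so the downloads overlap instead of running
# one after another.
function fetch_all(urls)
    tasks = [@async download(u) for u in urls]  # download returns a temp-file path
    return fetch.(tasks)                        # wait on each task, collect results
end

fetch_all(["https://example.com/a.txt", "https://example.com/b.txt"])
```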

------
KenoFischer
Oh, I should have mentioned this somewhere: credit for putting the notebook
together goes to Elliot Saba
([https://github.com/staticfloat](https://github.com/staticfloat)) :).

~~~
zackmorris
This is brilliant work. I'm not strong in NNs yet, but I am strong in
prerequisites/blockers. This demos:

* Working in a rapid application development (RAD) fashion by operating on vectors, using a language like Julia/MATLAB/Octave/Scilab that allows focusing on abstractions instead of implementation details and other distractions (see the sketch after this list).

* Running code optimized automagically on GPU/TPU/etc.

* Sharing work over the web in a standard fashion (Jupyter Notebook on colab.research.google.com)
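
As a rough sketch of that first point (the numbers are arbitrary), Julia's
dot-broadcasting lets the code read like the math:

```julia
# Whole-array style: dot syntax broadcasts elementwise and fuses
# into a single loop, so there's no index bookkeeping.
x = randn(10_000)                # arbitrary sample data
y = 3 .* x .^ 2 .+ 1             # elementwise polynomial, no explicit loop
m = sum(abs2, x) / length(x)     # mean of squares, again loop-free
```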

It's not clear to me where in this process the code is actually run on TPU
(maybe someone has a tutorial?) but that doesn't really matter. The specific
machine learning algorithm used is also not really that important.

The important part is that this enables amateurs to tinker with machine
learning, see results quickly, and share their work, which means we'll now
finally see the accelerated evolution of machine learning.

Any one of these blockers alone hindered the evolution of AI for decades, but
seeing all three knocked down in one fell swoop is pretty astonishing, at
least to me. I favorited it as a watershed moment in the history of AI!
Congrats to him.

------
fartcannon
"Error Could not access the resources needed to display output. This is
probably because third-party cookies are not allowed by your browser.
SecurityError: The operation is insecure."

You know what doesn't throw errors whenever I just try to look at it? Products
like Jupyter notebooks - stuff not made by Google. I believe Colab is using
dark patterns to discourage blocking third-party cookies by subtly breaking
what should render as a simple non-interactive webpage for passive viewers.

To me, at least, Google and their dark patterns are considered harmful.

~~~
3JPLW
Here's what I white-listed on uMatrix:

* raw.githubusercontent.com — it's loading the notebook directly from GitHub, so yeah, this one is fairly essential. Just need one XHR here.

* googleusercontent.com — no cookies, but it does load scripts, css, a frame and an XHR here.

That's it. There are some other domains it hits (like fonts.google.com and
gstatic.com), but they're not needed to view the file.

~~~
__random__
[https://github.com/googlecolab/colabtools/issues/17#issuecomment-459030596](https://github.com/googlecolab/colabtools/issues/17#issuecomment-459030596)

------
Mizza
That's a whole lot of work for a result that's indistinguishable from a Markov
chain generator.

~~~
jefftk
_> indistinguishable from a Markov chain generator_

A Markov chain generator wouldn't:

* Capitalize the first word of each line

* Make lines of approximately the right length

* Mark text with who is to speak it

While this is just a toy example, it's powerful enough to start showing the
ways RNNs can produce text that looks superficially correct.

(Generating Shakespeare is actually one of the examples given in the classic
[http://karpathy.github.io/2015/05/21/rnn-effectiveness/](http://karpathy.github.io/2015/05/21/rnn-effectiveness/).)
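
For a sense of scale, a character-level model like that is only a few lines in
Flux. This is a hedged sketch, not the notebook's actual architecture: the
alphabet and layer sizes are made up, and `LSTM(in, out)` assumes the
positional-argument Flux API of that era.

```julia
using Flux

alphabet = ['a':'z'..., ' ', '\n']   # toy character set
N = length(alphabet)

# The LSTM's hidden state is what lets the model track context across
# characters: how long the line is, whether a speaker tag is open, etc.
model = Chain(
    LSTM(N, 128),
    Dense(128, N),
    softmax,
)

x = Flux.onehot('a', alphabet)       # encode the current character
p = model(x)                         # distribution over the next character
```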

~~~
AstralStorm
You're mistaking a Markov chain toy for an actual Markov chain generator.

1. Yes, it would. It would see capitalized words as high-probability first
words on a line or after a period.

2. Obviously it could, depending on the stop condition, especially if you
include line length.

3. If trained on a corpus of plays, it certainly would.

The strength of the RNN is supposed to be in context and memory, and perhaps
in handling grammar.

There are advanced hierarchical grammars, related to Markov random field
models, that are about on par with RNNs on many text- and music-analysis
workloads. (In fact, probabilistic math is often used to describe the results
and workings of a deep NN anyway.)
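
For comparison, a character-level Markov generator of the kind I mean fits in
a few lines (a sketch that assumes ASCII text; the corpus path is a
placeholder):

```julia
# Order-k character Markov chain: record which characters follow each
# k-character context, then sample from the observed counts.
function train(text::AbstractString, k::Int)
    counts = Dict{String,Vector{Char}}()
    for i in 1:length(text)-k
        push!(get!(counts, text[i:i+k-1], Char[]), text[i+k])
    end
    return counts
end

function generate(counts, seed::String, k::Int, n::Int)
    out = collect(seed)
    for _ in 1:n
        ctx = String(out[end-k+1:end])
        haskey(counts, ctx) || break      # dead end: context never seen
        push!(out, rand(counts[ctx]))     # next char ∝ observed frequency
    end
    return String(out)
end

model = train(read("shakespeare.txt", String), 4)  # placeholder corpus path
print(generate(model, "KING", 4, 300))
```

Train that on a corpus of plays and it will happily reproduce capitalization
after line breaks and speaker-tag patterns, which is my point.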

------
aperrien
How large is the project? I'd like to clone it to my Drive account, but not if
the data sets are too huge.

~~~
KenoFischer
The notebook itself is about 400 kB. It also downloads a 4 MB dataset, and
Julia, TensorFlow, and their dependencies need about 400 MB, but those get
stored in the ephemeral Colab VM, not on Drive if you copy the notebook there.

