
Learning to Learn in TensorFlow - espeed
https://github.com/deepmind/learning-to-learn
======
panarky
"Learning to learn by gradient descent by gradient descent"

[https://arxiv.org/abs/1606.04474](https://arxiv.org/abs/1606.04474)

~~~
lucidrains
Other equally exciting papers that relate to learning to learn in DL:

"Neural Architecture Search with Reinforcement Learning"

[https://arxiv.org/abs/1611.01578](https://arxiv.org/abs/1611.01578)

"RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning"

[https://arxiv.org/abs/1611.02779](https://arxiv.org/abs/1611.02779)

"Designing Neural Network Architectures using Reinforcement Learning"

[https://arxiv.org/abs/1611.02167](https://arxiv.org/abs/1611.02167)

~~~
ericjang
Another one: "HyperNetworks"
[https://arxiv.org/abs/1609.09106](https://arxiv.org/abs/1609.09106)

and a blog post to go with it: [http://blog.otoro.net/2016/09/28/hyper-networks/](http://blog.otoro.net/2016/09/28/hyper-networks/)

~~~
lucidrains
This is why I love Hacker News. Reading this tonight, thanks, Eric :)

------
pepijndevos
What's a good explanation of TensorFlow for someone living under a rock? I
dismissed it as some machine learning library, but I read that it is in fact a
general computing framework. If I can use it for things like numerical
integration or some numpy-type tasks, that would be interesting.

~~~
conmdur
Tensorflow does general computation using data flow graphs: you assemble a
graph out of operations and variables, with tensors (multidimensional arrays)
flowing along its edges, and Tensorflow handles distributing this computation
over whatever hardware you make available.
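
As a concrete illustration, here's a minimal build-then-run sketch using the TF 1.x-style `tf.Session` API (current as of this thread):

```python
import tensorflow as tf

# 1. Build the graph: nothing runs yet, we are only describing computations.
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 1.0], [0.0, 1.0]])
product = tf.matmul(a, b)        # matrix-multiply node
total = tf.reduce_sum(product)   # scalar-reduction node

# 2. Run the graph: the Session places the ops on available devices
#    (CPU/GPU) and executes them.
with tf.Session() as sess:
    print(sess.run(total))  # -> 14.0
```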

A quick Google search turns up some impressive benchmark results for
Tensorflow [1], at least for linear algebra operations.

Despite the advantages, I think you'll find many more readily available
functions in Numpy for what you want; Tensorflow remains quite 'low level',
exposing building-block operations rather than higher-level methods (the
exception being machine learning/neural network stuff). That said, I don't
imagine it would be too difficult to implement a fast quadrature method for
integration (a sketch follows below), or whatever else your heart might
desire. This [2] is a simple example solving a PDE.
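
To make that concrete, here's a rough sketch of a trapezoidal-rule integrator built only from low-level TensorFlow ops (TF 1.x style; the exact value of the integral is 2):

```python
import numpy as np
import tensorflow as tf

# Integrate sin(x) over [0, pi] with the trapezoidal rule.
n = 1000
x = tf.linspace(0.0, np.pi, n)   # n evenly spaced sample points
y = tf.sin(x)
dx = np.pi / (n - 1)
# trapezoid: dx * (full sum minus half of each endpoint)
integral = dx * (tf.reduce_sum(y) - 0.5 * (y[0] + y[n - 1]))

with tf.Session() as sess:
    print(sess.run(integral))  # ~2.0
```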

\-----

[1] [https://simplyml.com/linear-algebra-shootout-numpy-vs-theano...](https://simplyml.com/linear-algebra-shootout-numpy-vs-theano-vs-tensorflow-2/)

[2]
[https://www.tensorflow.org/tutorials/pdes/](https://www.tensorflow.org/tutorials/pdes/)

------
espeed
A video related to the library:

Nando de Freitas - Learning to Learn, to Program, to Explore and to Seek
Knowledge (NIPS 2016)

[https://www.youtube.com/watch?v=tPWGGwmgwG0](https://www.youtube.com/watch?v=tPWGGwmgwG0)

~~~
lucidrains
The same topic presented at KDD, with a timestamp that cuts to the results.

[https://www.youtube.com/watch?v=x1kf4Zojtb0&t=1h8m46s](https://www.youtube.com/watch?v=x1kf4Zojtb0&t=1h8m46s)

------
rectangleboy
Can someone explain to me the benefits of using TensorFlow over Theano?

~~~
melse
I've heard that Tensorflow is built to take advantage of multiple GPUs
automatically, whereas Theano (by default at least) can only make use of a
single GPU.

~~~
sja
"Automatically" isn't the best word, as TensorFlow won't make use of multiple
GPUs unless you explicitly tell it to (at this time). That said, there are a
number of benefits to using TensorFlow (including the ability to use multiple
GPUs, if not automatically :) )

\- Several common gradient optimization algorithms (Momentum, AdaGrad,
AdaDelta, Adam, etc.) are implemented already, which makes it a bit faster to
get your training logic in place

\- Going along with the above, there is more in the TensorFlow API focused
specifically on training models, as opposed to being purely a math engine.
Some might consider the extra functionality "bloat", but I think it serves a
good purpose

\- The aforementioned multi-GPU functionality is nice once you get used to
it. It's good for either training multiple versions of a model in parallel or
doing data-parallel updates of parameters (a device-placement sketch follows
at the end of this comment)

\- There are tools for compiling your trained models into static C++ binaries
for mobile devices

\- The TensorFlow ecosystem is quite nice: TensorBoard for visualizing
training, the topology of your model, and various statistics (most recently,
visualizing projections of embeddings). TensorFlow Serving for deploying
trained models. TF Slim for a more Keras-like, layer-by-layer approach to
model building. Several pre-trained models to jump-start your own work.

\- No compile times. There is a "no optimizations" option in Theano to skip
compilation, but many people's experience with Theano is having to wait for
compilation before they can iterate on their code.

\- I think the community is pretty swell too :) The Google team does a good
job of responding to and working with folks who open issues or PRs

Generally, I'd say TensorFlow is really good when you want to minimize the
amount of time between researching, training, and deploying your model.

Edit: line formatting
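
Here's a minimal sketch of the explicit device placement mentioned above (the '/gpu:0' and '/gpu:1' names assume a two-GPU machine):

```python
import tensorflow as tf

# TensorFlow won't spread work across GPUs on its own; you pin
# subgraphs to devices explicitly.
with tf.device('/gpu:0'):
    a = tf.random_normal([1000, 1000])
    prod0 = tf.matmul(a, a)

with tf.device('/gpu:1'):
    b = tf.random_normal([1000, 1000])
    prod1 = tf.matmul(b, b)

total = tf.reduce_sum(prod0) + tf.reduce_sum(prod1)
# The built-in optimizers mentioned above live in tf.train, e.g.
# tf.train.AdamOptimizer(1e-3).minimize(loss), given trainable Variables.

# allow_soft_placement falls back to CPU if a named device is missing.
config = tf.ConfigProto(allow_soft_placement=True)
with tf.Session(config=config) as sess:
    print(sess.run(total))
```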

~~~
krab
> \- There are tools for compiling your trained models into static C++
> binaries for mobile devices

I'm looking for such a tool, but I haven't found anything apart from C++
libraries that also focus on training. Can you give me some pointers? Thanks.

~~~
sja
In the contrib folder in the TensorFlow project, you'll find the makefile
subdirectory:

[https://github.com/tensorflow/tensorflow/tree/master/tensorf...](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/makefile)

The readme has a general overview of how you'll approach using it. Note that
you'll want to optimize for inference (remove operations from the graph that
are unnecessary at serving time) [0] and freeze your graph (convert Variables
into constant tensors) [1] in order to drop in your own model in place of the
pretrained Inception model that's used as an example. A sketch of the freezing
step follows the links below.

[0]:
[https://github.com/tensorflow/tensorflow/blob/master/tensorf...](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/optimize_for_inference.py)

[1]:
[https://github.com/tensorflow/tensorflow/blob/master/tensorf...](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py)
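
For a sense of what "freezing" means, here's a rough sketch of the core step those scripts wrap, using tf.graph_util directly (TF 1.x; the model and output path are just placeholders):

```python
import tensorflow as tf

# Toy model: one Variable feeding a named output op.
x = tf.placeholder(tf.float32, shape=[None, 4], name='input')
w = tf.Variable(tf.ones([4, 2]))
y = tf.matmul(x, w, name='output')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # "Freezing": bake the current Variable values into the GraphDef
    # as constants, keeping only the nodes that 'output' depends on.
    frozen = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph_def, output_node_names=['output'])
    tf.train.write_graph(frozen, '/tmp', 'frozen.pb', as_text=False)
```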

------
oelmekki
From the related article:

> The move from hand-designed features to learned features in machine learning
> has been wildly successful.

Are the features here the "feature vectors" or the network architecture? Or
something else? In other words, does this project help with normalizing data,
or does it help with tweaking hyperparameters?

~~~
conmdur
Here the features are the feature vectors themselves, yes. It's been found
that taking a somewhat hands-off approach and allowing networks to engineer
their own mid-level representations from raw data can be very beneficial.

This is the idea behind the learning-to-learn paper. Instead of taking our
gradient and plugging it into a hand-engineered (i.e., derived on paper)
update rule, we feed it to a neural network, which is trained to find the
optimal update rule, in some sense (neural networks are just function
approximators, after all).
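
As a toy numpy sketch of that structure (the update network here is random and untrained; the paper actually uses a coordinatewise LSTM and meta-trains it by backpropagating through the unrolled optimization, which is omitted):

```python
import numpy as np

# Replace the hand-designed rule  theta <- theta - lr * grad
# with a learned one,             theta <- theta + m(grad),
# where m is a small network applied coordinatewise to the gradient.
rng = np.random.RandomState(0)
W1 = rng.randn(1, 8) * 0.1  # weights of m; meta-trained in the paper,
W2 = rng.randn(8, 1) * 0.1  # left random here for brevity

def m(grad):
    # Coordinatewise: every gradient entry goes through the same net.
    g = grad.reshape(-1, 1)
    h = np.tanh(g @ W1)
    return (h @ W2).reshape(grad.shape)

# Optimizee: a random quadratic, loss(theta) = ||A @ theta - b||^2.
A, b = rng.randn(5, 5), rng.randn(5)
theta = rng.randn(5)
for step in range(10):
    grad = 2 * A.T @ (A @ theta - b)
    theta = theta + m(grad)  # learned update in place of -lr * grad
# With random W1, W2 this won't actually minimize the loss;
# meta-training m so that it does is the whole point of the paper.
```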

