
Keras: Theano-Based Deep Learning Library - tekacs
https://github.com/fchollet/keras
======
jre
I know of 4 projects for deep learning based on Theano.

Keras, Blocks and Lasagne all seem to share the same goal of being libraries
more than frameworks. You can use just one part (e.g. a Layer implementation
or a training algorithm) without having to pull in everything; see the sketch
after the links:

[https://github.com/bartvm/blocks](https://github.com/bartvm/blocks)

[https://github.com/benanne/Lasagne](https://github.com/benanne/Lasagne)
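
For instance, here is a minimal sketch of that idea (my own, assuming
Lasagne's documented update-rule names; I haven't checked every version):
borrow only the optimizer from the library and write the model in raw Theano.

    import numpy as np
    import theano
    import theano.tensor as T
    import lasagne

    # hand-written logistic regression in plain Theano
    X = T.matrix('X')
    y = T.ivector('y')
    W = theano.shared(np.zeros((784, 10), dtype=theano.config.floatX))
    b = theano.shared(np.zeros(10, dtype=theano.config.floatX))
    probs = T.nnet.softmax(T.dot(X, W) + b)
    loss = T.nnet.categorical_crossentropy(probs, y).mean()

    # the only library piece used: Lasagne's Nesterov-momentum update rule
    updates = lasagne.updates.nesterov_momentum(loss, [W, b], learning_rate=0.01)
    train_fn = theano.function([X, y], loss, updates=updates)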

Then there is pylearn2, which looks more like a framework and seems to be a
good candidate for becoming the GPU-accelerated scikit-learn:

[https://github.com/lisa-lab/pylearn2](https://github.com/lisa-lab/pylearn2)

I have started using Blocks and did some tests with pylearn2. Anybody with
more experience want to share the strengths/weaknesses of each of these
projects?

~~~
albertzeyer
I know some more. Some of them are built as libraries; others are just code
examples, though you could still extract relevant code from them.

* Pylearn2, LISA lab, [http://deeplearning.net/software/pylearn2/](http://deeplearning.net/software/pylearn2/)

* LSTM, [http://deeplearning.net/tutorial/lstm.html#lstm](http://deeplearning.net/tutorial/lstm.html#lstm)

* LSTM, [https://github.com/skaae/nntools](https://github.com/skaae/nntools)

* LSTM, [https://github.com/JonathanRaiman/theano_lstm](https://github.com/JonathanRaiman/theano_lstm)

* LSTM, [https://github.com/mohammadpz/Recurrent-Neural-Networks](https://github.com/mohammadpz/Recurrent-Neural-Networks)

* LSTM, [http://christianherta.de/lehre/dataScience/machineLearning/n...](http://christianherta.de/lehre/dataScience/machineLearning/neuralNetworks/LSTM.php)

* LSTM, [https://gist.github.com/jpuigcerver/9358036](https://gist.github.com/jpuigcerver/9358036)

* LSTM + CTC, [https://github.com/kastnerkyle/net](https://github.com/kastnerkyle/net)

* Speech modeling, LSTM, [https://github.com/kastnerkyle/speech_density](https://github.com/kastnerkyle/speech_density)

* FF + RNN, [https://github.com/lmjohns3/theano-nets](https://github.com/lmjohns3/theano-nets)

* FF + RNN, [https://github.com/lmjohns3/theanets](https://github.com/lmjohns3/theanets)

Speech: [https://github.com/lmjohns3/arrnn-experiment/blob/master/tas...](https://github.com/lmjohns3/arrnn-experiment/blob/master/tasks/speech.py)

* FF, [https://github.com/benanne/Lasagne](https://github.com/benanne/Lasagne)

* RNN, [https://github.com/pascanur/trainingRNNs](https://github.com/pascanur/trainingRNNs)

* RNN, [https://github.com/pascanur/GroundHog](https://github.com/pascanur/GroundHog) (Razvan Pascanu, KyungHyun Cho, Caglar Gulcehre)

* RNN + CTC, [https://github.com/shawntan/rnn-experiment](https://github.com/shawntan/rnn-experiment) (Shawn Tan)

* RNN + CTC, [https://github.com/shawntan/theano-ctc](https://github.com/shawntan/theano-ctc) (Shawn Tan)

* RNN + CTC, [https://github.com/rakeshvar/rnn_ctc](https://github.com/rakeshvar/rnn_ctc)

* RNN + CTC, OCR, [https://github.com/rakeshvar/chamanti_ocr](https://github.com/rakeshvar/chamanti_ocr), [https://github.com/rakeshvar/chamanti3_ocr](https://github.com/rakeshvar/chamanti3_ocr)

* RNN, [https://github.com/gwtaylor/theano-rnn](https://github.com/gwtaylor/theano-rnn)

* LSTM, RBM, DBN, [https://github.com/kratarth1203/NeuralNet](https://github.com/kratarth1203/NeuralNet)

* RBM, [https://github.com/benanne/morb](https://github.com/benanne/morb)

* Q-learning, [https://github.com/spragunr/deep_q_rl](https://github.com/spragunr/deep_q_rl)

* Deep Generative Models, [https://github.com/dpkingma/nips14-ssl](https://github.com/dpkingma/nips14-ssl)

* RNN, agents, “bricks”: [https://github.com/bartvm/blocks](https://github.com/bartvm/blocks)

* NTM, [https://github.com/shawntan/neural-turing-machines/](https://github.com/shawntan/neural-turing-machines/)

* RL + CNN, [https://github.com/brian473/neural_rl](https://github.com/brian473/neural_rl)

* DRAW RNN, [https://github.com/jbornschein/draw](https://github.com/jbornschein/draw)

And this list is far from complete; there are countless more examples. Just
search on GitHub. I only picked out the ones that interest me (those that at
least have RNNs/LSTMs or some other interesting things).

~~~
benanne
Nice work! Since you mentioned you're looking for RNNs/LSTMs specifically: the
implementation at
[https://github.com/skaae/nntools](https://github.com/skaae/nntools) is an
extension of Lasagne (which used to be called nntools) and will be merged into
the library at some point. Hopefully in time for the first release, but we
don't know yet if that will be feasible.

------
michaf
How does Keras compare to Lasagne [0], which is also Python/Theano-based, and
which was used with some impressive results [1]?

    [0] https://github.com/benanne/Lasagne
    [1] http://benanne.github.io/2015/03/17/plankton.html

~~~
benanne
One of the authors of Lasagne here! Lasagne is being built by a team of deep
learning and music information retrieval researchers. Keras seems to share a
lot of design goals with our project, but there are also some significant
differences.

We both want to build something that's minimalistic, with a simple API, and
that allows for fast prototyping of new models. Keras seems to be built 'on
top of' Theano in the sense that it hides all the Theano code behind an API
(which looks almost exactly like the Torch7 API).

Lasagne is built to work 'with' Theano instead. It does not try to hide the
symbolic computation graph, because we believe that is where Theano's power
comes from. The library provides a bunch of primitives (such as Layer classes)
that make building and training neural networks a lot easier. We are also
specifically aiming at extensibility: the code is readable and it's really
easy to implement your own Layer classes.
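
To make that concrete, here is a minimal sketch (based on the current
pre-release API, so exact names may still change) of how a stack of layers
yields an ordinary Theano expression that you can keep manipulating:

    import theano
    import theano.tensor as T
    import lasagne

    # stack Layer primitives; nothing is compiled or hidden yet
    l_in = lasagne.layers.InputLayer(shape=(None, 784))
    l_hid = lasagne.layers.DenseLayer(l_in, num_units=100,
                                      nonlinearity=lasagne.nonlinearities.rectify)
    l_out = lasagne.layers.DenseLayer(l_hid, num_units=10,
                                      nonlinearity=lasagne.nonlinearities.softmax)

    # get_output returns a plain symbolic expression, so ordinary Theano
    # code mixes in freely from here on
    prediction = lasagne.layers.get_output(l_out)
    target = T.ivector('target')
    loss = T.nnet.categorical_crossentropy(prediction, target).mean()
    params = lasagne.layers.get_all_params(l_out, trainable=True)
    updates = lasagne.updates.sgd(loss, params, learning_rate=0.1)
    train_fn = theano.function([l_in.input_var, target], loss, updates=updates)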

Another difference seems to be the way we interpret the concept of a 'layer':
a Layer in Lasagne adheres as closely as possible to its definition in the
literature. Keras (and Torch7) treat each 'operation' as a separate stage
instead, so a typical fully connected layer has to be constructed as a cascade
of a dot product and an elementwise nonlinearity.
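
Roughly, in code (assuming the early Keras API as shown in its README; treat
the exact signatures as an assumption on my part):

    # Keras/Torch7 style: the affine map and the nonlinearity are two stages
    from keras.models import Sequential
    from keras.layers.core import Dense, Activation

    model = Sequential()
    model.add(Dense(784, 100))     # dot product + bias
    model.add(Activation('relu'))  # elementwise nonlinearity, added separately

    # Lasagne style: a single DenseLayer carries its nonlinearity, as in
    # the sketch above (DenseLayer(l_in, num_units=100,
    # nonlinearity=lasagne.nonlinearities.rectify))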

Layers are also first-class citizens in Lasagne, and a model is usually
referred to simply by its output layer or layers. There is no separate "Model"
class because we want to keep the interface as small as possible and so far
we've done fine without it. In Keras (and Torch7) the layers cannot function
by themselves and need to be added to a model instance first.

For now, all Lasagne really does is make it easier to construct Theano
expressions - we don't have any tools for iterating through datasets yet, for
example, but we do have plans in this direction. We plan to rely heavily on
Python generators for this. The scikit-learn-like "model.fit(X, y)" paradigm,
which Keras also seems to use, only really works for small datasets that fit
in memory. For larger datasets, we believe generators are the way to go (see
the sketch below). Incidentally, nolearn
([https://github.com/dnouri/nolearn](https://github.com/dnouri/nolearn))
provides a wrapper for Lasagne models with a scikit-learn-like interface. We
may also add this to the main library at some point.
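
For example, a minibatch generator along these lines (just a sketch of the
pattern, not library code) works regardless of dataset size, since only one
batch has to be materialized at a time:

    import numpy as np

    def iterate_minibatches(X, y, batch_size=128, shuffle=True):
        # X and y only need to support indexing, e.g. a numpy memmap
        # over a file too large to fit in RAM
        indices = np.arange(len(X))
        if shuffle:
            np.random.shuffle(indices)
        for start in range(0, len(indices) - batch_size + 1, batch_size):
            batch = indices[start:start + batch_size]
            yield X[batch], y[batch]

    # training loop: train_fn is a compiled Theano function as usual
    # for epoch in range(num_epochs):
    #     for X_batch, y_batch in iterate_minibatches(X_train, y_train):
    #         train_fn(X_batch, y_batch)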

Lasagne is not released yet - the interface is not 100% stable, and
documentation and tests are a work in progress (although both are progressing
nicely). But a lot of people have started using it already; we've built up a
nice userbase, and a lot of people have started contributing code as well!
We're currently aiming to put out the first release by the end of April.

A non-exhaustive list of our design goals for the library is in the README on
our GitHub page:
[https://github.com/benanne/Lasagne](https://github.com/benanne/Lasagne)

