Top Deep Learning Projects (github.com)
217 points by aymericdamien on Aug 11, 2016 | 35 comments

A few comments on some of these projects:

Keras is pretty much the best way to do almost anything these days. If you are starting out learning, use ConvNet JS, but after that switch to Keras.

TFLearn is really nice if you are already using Scikit.

There are a lot of frameworks on there: TensorFlow, Caffe, CNTK (that's a lot of stars for something no one outside MS uses!), Theano, Torch, etc. But I think the sleeper there is MXNet. I haven't used it, but I hear good things about it, and DMLC has a good track record of producing some pretty nice software (XGBoost!).

Also, DeepDetect! I keep trying to find that and never can remember the name.

I ported ConvNet JS to C# in order to really understand what's going on: https://github.com/cbovar/ConvNetSharp

Brilliant, I've been looking for projects like this. I'm currently working through a couple of RBM C# projects but will add this to my list of reference code.

Top tip: if you use the matrix and vector classes in Math.NET, you can optionally configure it to use optimised versions of operations such as matrix multiplication, which map through to one of the native providers, such as Intel Math Kernel Library or OpenBLAS; I think there's a CUDA provider too.

I tried the Intel MKL one and the dense matrix multiplication was about 60x faster than a plain C# version.
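The gap between a plain-language implementation and an optimised BLAS is easy to demonstrate. A minimal sketch in Python/numpy (standing in for the C#/Math.NET case; the `matmul_naive` helper is illustrative, not from any library): the triple-loop version is the kind of code a tuned GEMM replaces, while `@` dispatches to whatever BLAS numpy was built against.

```python
import numpy as np

def matmul_naive(a, b):
    """Plain triple-loop matrix multiplication, the kind of code an
    optimised BLAS (MKL, OpenBLAS, ...) replaces with tuned GEMM."""
    n, m, k = len(a), len(b), len(b[0])
    out = [[0.0] * k for _ in range(n)]
    for i in range(n):
        for j in range(k):
            s = 0.0
            for t in range(m):
                s += a[i][t] * b[t][j]
            out[i][j] = s
    return out

rng = np.random.default_rng(0)
a = rng.standard_normal((64, 64))
b = rng.standard_normal((64, 64))

naive = np.array(matmul_naive(a.tolist(), b.tolist()))
fast = a @ b  # dispatches to the BLAS numpy was linked against
assert np.allclose(naive, fast)
```

Both paths compute the same result; the speedup comes entirely from vectorised, cache-aware kernels underneath, which is why swapping in MKL gives such large factors without touching calling code.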



Thanks for the tip. I'll see where I can apply it.

Most of the time is usually spent in the convolution layers. Convolution is not a matrix multiplication in the current implementation. It could be expressed as one, though, either in the frequency domain or by using a Toeplitz matrix.

I've implemented a parallel CPU version and took a stab at a GPU implementation, but I'm not at all satisfied with the GPU version :)
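The Toeplitz-matrix idea mentioned above is usually implemented as the "im2col" trick: unroll every receptive field into a row, and the convolution becomes one matrix multiplication. A minimal numpy sketch (function names are illustrative, not from ConvNetSharp), checked against a direct loop implementation:

```python
import numpy as np

def conv2d_direct(x, w):
    """Reference 'valid' 2-D convolution (really cross-correlation,
    as in most deep learning code) via explicit loops."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def conv2d_im2col(x, w):
    """Same operation as one matrix multiplication: every receptive
    field becomes a row of `cols`, then multiply by the flat kernel."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    cols = np.empty((oh * ow, kh * kw))
    for i in range(oh):
        for j in range(ow):
            cols[i * ow + j] = x[i:i + kh, j:j + kw].ravel()
    return (cols @ w.ravel()).reshape(oh, ow)

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 8))
w = rng.standard_normal((3, 3))
assert np.allclose(conv2d_direct(x, w), conv2d_im2col(x, w))
```

The unrolled matrix costs memory (each input pixel is duplicated up to kh*kw times), but in exchange the hot loop becomes a single GEMM call that BLAS or a GPU can optimise.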



Pull requests more than welcome!

> Convolution is not a matrix multiplication in the current implementation

I figure there's a code re-organisation task there, since propagating node activations through a layer of weights is essentially a matrix multiplication (fully connected => fully dense matrix).

The optimised routines make use of vectorised CPU instructions and FMA (fused multiply-add), all of which are a perfect fit for [dense] matrix multiplication. Not so great for sparse matrices, though they still help; unless the matrix is very sparse, it's usually faster to use a dense format with zeros for the missing weights.
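A short numpy sketch of both points above (variable names are illustrative): a fully connected layer's forward pass is exactly one dense matrix multiplication, and missing weights can simply be stored as zeros so the same GEMM path still applies.

```python
import numpy as np

# A fully connected layer's forward pass is just out = act @ W + b,
# which is exactly the GEMM that vectorised/FMA BLAS kernels accelerate.
rng = np.random.default_rng(2)
act = rng.standard_normal((32, 100))  # batch of activations
W = rng.standard_normal((100, 50))
b = rng.standard_normal(50)

# "Sparsify" ~90% of the weights, but keep the dense storage layout:
drop = rng.random(W.shape) < 0.9
W_sparse_as_dense = np.where(drop, 0.0, W)

out = act @ W_sparse_as_dense + b  # same dense GEMM, zeros and all

# Check against explicitly summing only the surviving weights:
ref = np.zeros((32, 50))
for i, j in zip(*np.nonzero(W_sparse_as_dense)):
    ref[:, j] += act[:, i] * W_sparse_as_dense[i, j]
assert np.allclose(out, ref + b)
```

The dense version does ten times the arithmetic here, yet on typical hardware it still tends to win at this sparsity level because the access pattern is regular and SIMD-friendly.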

> Pull requests more than welcome!

Duly noted :)

> Keras is pretty much the best way to do almost anything these days

What gives Keras the advantage?

It's a well-designed API for using deep neural networks, rather than an API for doing optimized mathematical operations.

Compare how you build some vaguely comparable models in Keras[1] and raw TensorFlow[2]. Keras uses TensorFlow (or Theano) underneath, so there is no performance penalty.

It's like Python machine learning in general: most people use Scikit instead of implementing things in numpy.

[1] https://github.com/fchollet/keras/blob/master/examples/mnist...

[2] https://github.com/tensorflow/tensorflow/blob/master/tensorf...

Thanks! Your example explaining the differences makes me lean more towards Keras. It feels like Scikit for deep learning.

Yes, that's a decent analogy. If you are already familiar with Scikit then TFLearn is worth looking at too.

Theano itself is more like a language than a deep learning framework. There is no NeuralNetworkClassifier class, for example. You could, however, write a neural network library/framework using Theano, and it would get all the benefits of Theano (code compiled for the GPU, various common neural net ops available, etc.), which is what it looks like the Keras folks have done. I took a stab at this a while ago (1), but I didn't keep it up. I haven't used Keras much, but it looks like it fills a real gap, which I'm glad for.

(1): https://github.com/notmatthancock/neural_network

A very extensible API, an accessible and widely used programming language (Python), the ability to use either Theano or TensorFlow as a backend, and easy implementation of non-linear network topologies (where data is split and merged at will) all contribute to this. Using Keras means you will almost never need to implement a custom layer or function, whilst sacrificing very little performance-wise.

> Also, DeepDetect! I keep trying to find that and never can remember the name.

Ah, thanks for the reference, DD author here, the funny fact about the name is that translated to my mother tongue (French), it sounds like porn ;)

CNTK is actually pretty damn good. It just lacks a good scripting interface. They're adding one though.

The network description language (now 'BrainScript') is a far nicer way to specify networks than the approaches used by any other framework, especially for recurrent networks. In CNTK you can just say `X = FutureValue(Y)` or `X = PastValue(Y)`. It's so convoluted in TensorFlow that I never actually worked out how to do it.
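To make the semantics concrete: assuming (as the names suggest) that `PastValue(Y)` gives the sequence delayed by one time step and `FutureValue(Y)` advances it by one, each padded with an initial value at the boundary, a numpy sketch of the idea looks like this. This is a model of the behaviour, not the real CNTK API.

```python
import numpy as np

def past_value(y, init=0.0):
    """X[t] = Y[t-1]; the first step gets the initial value.
    Sketch of the assumed PastValue semantics, not CNTK's actual op."""
    x = np.empty_like(y)
    x[0] = init
    x[1:] = y[:-1]
    return x

def future_value(y, init=0.0):
    """X[t] = Y[t+1]; the last step gets the initial value."""
    x = np.empty_like(y)
    x[-1] = init
    x[:-1] = y[1:]
    return x

y = np.array([1.0, 2.0, 3.0, 4.0])
assert np.array_equal(past_value(y), np.array([0.0, 1.0, 2.0, 3.0]))
assert np.array_equal(future_value(y), np.array([2.0, 3.0, 4.0, 0.0]))
```

The appeal is that a recurrence like `h[t] = f(h[t-1], x[t])` can be written declaratively as a self-reference through `PastValue`, instead of manually unrolling a loop as in graph-construction APIs.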

It also has their fancy 1-bit SGD stuff, but I doubt many people use that, and it has a more restrictive license anyway.

Keras support for recurrent models leaves a bit to be desired at this point, so it's great if it has what you want, but otherwise you have to start peeking under the hood, which may be harder than just learning the underlying framework.

When you say starting out, do you mean a true beginner to machine learning, or someone new to these frameworks?

This is, frankly, a naive way to rank deep learning projects, because GitHub stars are cheap. François Chollet, the creator of Keras, comes out with a monthly ranking that takes other factors into account, such as forks, contributors and issues, all stronger signs of community and users. Here's his July update:


Most of these frameworks are Python-oriented: Keras, Theano, Caffe, TF, neon, MXNet, etc. The space is saturated. If you look at deep-learning projects by language, then Torch stands out -- it has a Lua API. And Deeplearning4j is the most widely used in Java and Scala. You don't have to crowbar it into a Spark integration, like you do with TensorFlow. http://deeplearning4j.org/

MXNet is not talked about a lot, but it's growing fast. It was heavily used by GraphLab/Turi, recently bought by Apple, so the question is what will happen with it now.

No metric is perfect, but I highly suggest following François Chollet (fchollet on Twitter and HN, creator of Keras) for his monthly updates.

He does a pretty good job of showing actual activity on github: https://twitter.com/fchollet/status/753980621823750145

I point this out because drive-by stars of a project are easy... but they don't reflect actual user activity.

This is true. Stars in this context are also more indicative of projects that are attractive to a mainstream audience. Examples like Deep Dream and neural style, where laymen visit the GitHub page and star the project because it's "cool", are prominently featured, while projects like Torch and Theano have had massively more impact on the deep learning world today.

> Stars in this context are also more indicative of projects that are attractive to a more mainstream audience.

"Popular Deep Learning GitHub Projects" would have been a more accurate title. It is mentioned in the text but the title mentions "Top" which gives a different impression than the word "Popular".

It's impressive how fast TensorFlow has become this popular (judging not only by the number of stars it has but also by the number of other projects in that list related to TF)

What about Spark MLlib?


Not a deep learning project, and it doesn't seem to be a real focus for Databricks either.

Adding a programming language column would be great!

Torch is split across many different repositories. Many of the relevant issues occur in `torch/nn` or `torch/cunn`, for example. Only considering `torch/torch7` underestimates the popularity by quite a lot.

I wonder how many people still implement their own networks as opposed to using these ready-made frameworks. And do you stick to a single framework or use some mixture of tools?

Implementing a basic neural network is a lot of work, to be honest. I tried in C, and even made it parallelizable. It was hell. Lots of hair-pulling. I thank the gods for Keras and keep my head down.

Unsurprising. Normally, you at least have something like BLAS to do linear algebra work.

It takes maybe 15 minutes in Python/numpy for something basic like a 2-layer net, but backpropagation is a little annoying to get right. Hence TensorFlow (or Theano, or Caffe, or whatever).
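For concreteness, here is roughly what that "15 minutes in numpy" version looks like: a one-hidden-layer net trained on XOR with the backprop written out by hand (the chain-rule bookkeeping is exactly the fiddly part the frameworks hide). Network size, learning rate and iteration count are arbitrary illustrative choices.

```python
import numpy as np

# Minimal 2-layer net (one hidden layer) trained on XOR with manual backprop.
rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

W1 = rng.standard_normal((2, 8)); b1 = np.zeros(8)
W2 = rng.standard_normal((8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
losses = []
for _ in range(2000):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((p - y) ** 2)))
    # backward pass: chain rule, layer by layer
    dp = 2 * (p - y) / len(X)          # d(loss)/d(p)
    dz2 = dp * p * (1 - p)             # through the sigmoid
    dW2 = h.T @ dz2; db2 = dz2.sum(0)
    dh = dz2 @ W2.T
    dz1 = dh * (1 - h ** 2)            # through the tanh
    dW1 = X.T @ dz1; db1 = dz1.sum(0)
    # gradient descent step
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

assert losses[-1] < losses[0]  # training actually reduced the loss
```

Every sign and transpose above is a place to introduce a silent bug, which is the practical argument for letting Theano/TensorFlow derive the gradients for you.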

Only as a learning exercise; for anything where you actually care about results, you want a GPU implementation.

Based on its number of stars, this project should be on its own list.

This list is old :)

What are some obvious additions the list is missing since it's old?

It was updated on 9 August. That isn't too old.

Nope, check the stats of Caffe for instance: https://github.com/BVLC/caffe

This is much older than the 9th of August.


It is 20/11819 = 0.17% stars off. It isn't old, just not automatically updated.

Ah, it was when it got posted to HN! Count was ~9K.
