
Leaf: Machine learning framework in Rust - mjhirn
https://github.com/autumnai/leaf/tree/f0b11961b5a0649544a1101b960c670a0bebf57c
======
wall_words
The performance graph is deceptive for two reasons: (1) Leaf with CuDNN v3 is
a little slower than Torch with CuDNN v3, yet the bar for Leaf is positioned
to the left of the one for Torch, and (2) there's a bar for Leaf with CuDNN
v4, but not for Torch.

It's good to see alternatives to Torch, Theano, and TensorFlow, but it's
important to be honest with the benchmarks so that people can make informed
decisions about which framework to use.

~~~
kibwen
The graph in the readme is outdated; you can see the version with Torch/CuDNN
v4 here: [http://autumnai.com/deep-learning-benchmarks](http://autumnai.com/deep-learning-benchmarks)

And I don't believe the first point counts as deceptive; the bars are ordered
by Forward ms, not by the sum of Forward and Backward. In both CuDNN v3 and
v4, Leaf is faster than Torch by that metric (25 vs 28 for v4, 31 vs 33 for
v3).
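The ordering claim is easy to check against the numbers quoted above (forward times in ms; only the four values from this comment are used):

```python
# Sorting the quoted forward times (ms) reproduces the bar order in
# the benchmark graph: bars are ranked by forward pass alone, not by
# forward + backward. The numbers are the ones quoted in this comment.
forward_ms = {
    "Leaf (CuDNN v4)": 25,
    "Torch (CuDNN v4)": 28,
    "Leaf (CuDNN v3)": 31,
    "Torch (CuDNN v3)": 33,
}
order = sorted(forward_ms, key=forward_ms.get)
print(order[0])  # Leaf (CuDNN v4) comes first
```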

------
IshKebab
I think Microsoft's approach with CNTK is far preferable to this. Rather than
defining all the layers in Rust or C++ it uses a DSL to specify mathematical
operations as a graph.

You can easily add new layer types, and recurrent connections are easy too -
you just add a delay node.

Furthermore, since the configuration file format is fairly simple, it is
possible to build GUI tools to visualise it and - in the future - edit it.
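The graph-of-operations idea can be sketched roughly like this. This is a hypothetical mini-DSL in Python, not actual CNTK syntax; the node names, the `add` op, and the `delay` semantics are invented for illustration, but they show how a delay node turns an ordinary graph into a recurrent one:

```python
# Minimal sketch of "network as a graph of operations". A "delay"
# node reads its input's value from the PREVIOUS timestep, which is
# how recurrent connections are expressed without special layers.
# Node types here are illustrative, not any real framework's DSL.

# Each node: (op, inputs).
graph = {
    "x":      ("input", []),
    "h":      ("add",   ["x", "h_prev"]),  # recurrent state update
    "h_prev": ("delay", ["h"]),            # h from the previous step
}

def run(graph, xs):
    """Evaluate the graph over a sequence of scalar inputs."""
    prev = {name: 0.0 for name in graph}   # delayed values start at 0
    outputs = []
    for x in xs:
        vals = {}
        def eval_node(name):
            if name in vals:
                return vals[name]
            op, ins = graph[name]
            if op == "input":
                vals[name] = x
            elif op == "delay":
                vals[name] = prev[ins[0]]  # previous timestep's value
            elif op == "add":
                vals[name] = sum(eval_node(i) for i in ins)
            return vals[name]
        outputs.append(eval_node("h"))
        prev = vals
    return outputs

print(run(graph, [1, 2, 3]))  # running sum: [1.0, 3.0, 6.0]
```

Adding a new "layer" is just adding a new op name to the interpreter, which is the extensibility argument made above.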

~~~
hobofan
A DSL-based format has some advantages, as it is easy to get going with
building networks. However, you are then constrained by what the program that
interprets/executes the DSL supports in terms of loading/saving data, solvers,
etc. If you want to do something more dynamic, e.g. AlphaGo, you have to go
back to a "real" programming language anyway.

That's not to say that Leaf won't have a DSL at some point, but we will wait
until the features of the layers have stabilized a bit more and we have more
clearly mapped out our goals for a DSL.

~~~
NotUsingLinux
So you're saying prototyping is easier without a DSL?

~~~
hobofan
Depends on what kind of prototyping. At the current state of neural networks,
DSLs are mainly helpful if you want to tune a network architecture for
well-established tasks like image classification on the Imagenet dataset.

Outside of that I see more dynamic alternatives used much more.

------
rubyfan
I'm honestly skeptical that Rust is all that appealing for this type of work.
It doesn't seem like (1) concerns such as performance and type safety are the
top priority in this space, or (2) this offering is differentiated enough from
what you already get from Java today.

Honestly, many modeling problems are clunky and inefficient at scale - but
that's OK. By the time you need to scale badly enough, you already have a
significant set of supporting libraries in Java.

I'm failing to see an answer to the one question I have: "why Rust?"

------
andreif
Previous discussion, 4 months ago:
[https://news.ycombinator.com/item?id=10539195](https://news.ycombinator.com/item?id=10539195)

------
YeGoblynQueenne
> super-human image recognition

That's a bold claim. As far as I know there was one paper that reported a
model beating human scores on a specific test (Imagenet, I believe). Whether
that translates to "superhuman" results in general comes with a very big
question mark.

In general I really struggle to see how any algorithm that learns from
examples, especially one that minimises a measure of error against further
examples, can ever have better performance than the entities that actually
compiled those examples in the first place (in other words, humans).

I'm saying: how is it possible to learn superhuman performance in anything
from examples of mere human performance at the same task? I don't believe in
magic.

~~~
benbou09
It is much faster than humans.

~~~
tomp
Computers have been faster than humans for the last 40 years. That doesn't
make them more intelligent.

~~~
deepnet
Then by implication this task does not require intelligence ;)

Computers are faster serial processors, but brains do more in parallel.

Parallel pipelines only really reached neural nets with GPUs, and the Imagenet
convnet solvers like Alexnet were among the first parallel implementations -
this gave a 30-300x speedup, but that's still relatively tiny compared with
squishy wetware.

------
kingnothing
I'm completely new to ML and the real-world applications it's suited for.
Are we at the point yet where you can train a computer to look at arbitrary
images and count the number of people in them? What if it was largely the
same background and only the number of people changed - for example, a
camera shooting a queue of people to determine queue depth at a bus station.

~~~
danielvf
In the scale of computer vision problems, the stationary camera case is
relatively easy. It's not too hard to isolate moving objects from a
background, it's not too hard to decide if an object is a person or not, and
it's not too hard to keep track of an object once you've identified it. You
would still have to handle overlapping people, scene illumination changes,
etc, but these can be solved and have been done before.

If you would like to play with some of this stuff, take a look at OpenCV.
[http://opencv.org](http://opencv.org)
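The stationary-camera pipeline described above (isolate moving pixels from the background, then group them into objects) can be sketched in pure Python. A real system would use OpenCV's background subtractors and contour detection instead; the tiny frames and the threshold below are made up for illustration:

```python
# Toy sketch of the stationary-camera case: 1) subtract a background
# frame, 2) threshold to a motion mask, 3) count connected blobs.
# Real code would use OpenCV; these frames/thresholds are invented.

def motion_mask(background, frame, threshold=10):
    """Pixels that differ enough from the background are 'moving'."""
    h, w = len(frame), len(frame[0])
    return [[abs(frame[y][x] - background[y][x]) > threshold
             for x in range(w)] for y in range(h)]

def count_blobs(mask):
    """Count 4-connected regions of True pixels via flood fill."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    blobs = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                blobs += 1
                stack = [(y, x)]
                while stack:  # flood-fill one region
                    cy, cx = stack.pop()
                    if (0 <= cy < h and 0 <= cx < w
                            and mask[cy][cx] and not seen[cy][cx]):
                        seen[cy][cx] = True
                        stack += [(cy + 1, cx), (cy - 1, cx),
                                  (cy, cx + 1), (cy, cx - 1)]
    return blobs

# Two bright "people" against a dark, static background.
background = [[0] * 6 for _ in range(4)]
frame = [[0, 200, 200, 0, 0, 0],
         [0, 200, 200, 0, 0, 0],
         [0, 0, 0, 0, 200, 0],
         [0, 0, 0, 0, 200, 0]]
print(count_blobs(motion_mask(background, frame)))  # 2
```

The hard parts mentioned above (overlapping people, illumination changes) are exactly what this naive version gets wrong, which is why adaptive background models exist.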

~~~
kingnothing
Excellent, thanks. I'll take a look at that and hack around!

------
eggy
I will take a look at it, but are the benchmarks comparable, given that - to
quote the site - "For now we can use C Rust wrappers for performant
libraries"? Torch is LuaJIT over C, and Tensorflow has Python and C++. Is
Rust making it fast, or the interface code to the C libraries?

~~~
hobofan
The interface code to the C libraries (which is written in Rust). We are,
however, optimistic that Rust libraries will pop up in the future that
outperform the current C implementations. (Optimistic as a Rust user, not as
a developer of Leaf.)

------
ybrah
It's interesting to see "technical debt" become a more common term. Is there
a rigid definition for it?

From the article: _" Leaf is lean and tries to introduce minimal technical
debt to your stack."_

What exactly does that mean?

~~~
jamesblonde
It's code that you write (typically quickly) that you know will need to be
re-written at a later stage. It's debt that will need to be paid at some
stage in the future. You didn't do it right the first time.

Technical debt typically arises because the code was poorly structured, or
the programmer used the wrong tools/libraries (from a longer-term
perspective), or didn't abstract when she should have. The current obsession
with MVPs has led to an increase in technical debt.

~~~
choosername
It could be any kind of maintenance, I thought.

~~~
lloyd-christmas
My view is that technical debt covers _known_ future maintenance at the time
of programming. Adding a global variable because you aren't sure how it will
tie in with someone else's current feature is a bit different from adding a
global variable because you think that's how it's supposed to be done. I try
to make a point of distinguishing bad code from technical debt, as it becomes
pretty easy to say "it's technical debt" as an excuse for doing something
poorly. I tend to put general code maintenance in a different bin. To each
their own, though.

------
eranation
This is very cool! When I presented it to my CTO, however, he said he doesn't
think it will gain traction with data scientists over Scala or Python, as
Rust is even more complex than Scala (which is not the simplest language out
there - and I say this as a big fan of both Scala and Rust, knowing this
might start a flame war).

Do you think data scientists can write their models directly in Leaf? Or do
you think there will need to be a DSL that translates from the R / Python
world to something you can run on Leaf?

~~~
kibwen
By what metric does your CTO consider Rust to be more complex than Scala? A
lot of Scala's complexity has to do with interfacing nicely with Java, and
Scala has a lot of implicit behavior and TIMTOWTDI-ness that Rust deliberately
tries to avoid. Odersky has even said that he's hoping that he can remove many
features from Scala in the future.

------
rck
The benchmarks would be a lot more useful if the context around them were more
obvious. In particular, it would be nice to know if the benchmarks are for a
single input, or for a batch of inputs. If for a batch, then the batch size is
important too. Maybe this stuff is somewhere on their site, but it shouldn't
require digging.

Without this information it's hard to make a useful comparison at all.

~~~
hobofan
You are right, batch size is important and we should make that clearer.

The numbers in the benchmark are taken from our deep-learning-benchmarks[1],
which we are still in the process of building up. It might actually make
sense to test the same model at different batch sizes. The current benchmarks
are based on the convnet-benchmarks[2], where the Alexnet model has a batch
size of 128. (Alexnet was chosen because, of those benchmarks, it's the one I
am most familiar with, since it's small enough that I can work with it on my
laptop.)

In some informal tests Leaf was generally faster than other frameworks at
smaller batch sizes, but we have no benchmarks we could publish with
confidence yet.

[1]: [https://github.com/autumnai/deep-learning-benchmarks](https://github.com/autumnai/deep-learning-benchmarks)

[2]: [https://github.com/soumith/convnet-benchmarks](https://github.com/soumith/convnet-benchmarks)
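Why batch size changes the picture comes down to simple arithmetic: benchmarks report milliseconds per batch, so per-image latency and throughput both depend on the batch size used. The timings below are invented placeholders to show the effect, not measured Leaf numbers:

```python
# Benchmarks report ms per *batch*, so per-image cost and throughput
# depend on batch size. All timings below are made-up placeholders,
# not measured results for Leaf or any other framework.

def per_image_ms(batch_ms, batch_size):
    """Average latency per image for one pass over a batch."""
    return batch_ms / batch_size

def throughput(batch_ms, batch_size):
    """Images processed per second at this batch size."""
    return batch_size / (batch_ms / 1000.0)

# Hypothetical: a batch of 128 takes 160 ms, a batch of 1 takes 4 ms.
# Bigger batches amortize fixed overhead, so per-image cost drops
# even though a single small batch finishes sooner.
print(per_image_ms(160.0, 128))  # 1.25 ms per image at batch 128
print(per_image_ms(4.0, 1))      # 4.0 ms per image at batch 1
print(throughput(160.0, 128))    # 800.0 images/s at batch 128
print(throughput(4.0, 1))        # 250.0 images/s at batch 1
```

This is why comparing a batch-128 number from one framework against a batch-1 number from another is meaningless.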

~~~
rck
That all sounds reasonable. I was surprised at how much the batch size can
matter on different hardware. Maybe you've seen this post:

[http://svail.github.io/rnn_perf/](http://svail.github.io/rnn_perf/)

It's primarily RNN-focused, but the discussion about batch sizes on GPUs is
interesting.

------
zump
Any recurrent layers?

~~~
mjhirn
We would love to have them and compare their performance with recurrent layers
of other frameworks[1]. There exists an issue for the implementation of
recurrent layers in Leaf (#73)[2].

[1]: [http://autumnai.com/deep-learning-benchmarks.html](http://autumnai.com/deep-learning-benchmarks.html)

[2]:
[https://github.com/autumnai/leaf/issues/73](https://github.com/autumnai/leaf/issues/73)

------
mastax
I'm glad that Rust has crossed the point where posts to HN that would be "_
in Rust" are now just "_". I hope this means that Rust is starting to be used
on its own merits rather than just for novelty.

~~~
dang
We changed the title to say "in Rust" because someone else complained about
"for Hackers". I suppose we could take both of them out, but the project
highlights its Rustiness so this seems more representative.

------
yarrel
1. Rust warning.

2. If "for hackers" is the new "for dummies" then gentrification is complete.

