
Open sourcing Sonnet – a new library for constructing neural networks - lopespm
https://deepmind.com/blog/open-sourcing-sonnet/
======
gcr
This isn't "yet another completely different neural network library." This
library just has some new layer types for TensorFlow.

Looks like there are some new layers for special kinds of attention RNNs, word
embeddings, alternate implementations of spatial transformers, and so on. They
also have another Batch Norm implementation that of course requires tons of
fiddling to work properly, a classic tf staple :-)

As a machine learning environment, tf is so complex that different research
groups have to define their own best practices for it. Using tf is like
learning C++ where everyone learns a slightly different, mutually
incompatible, but broadly overlapping subset of the language. We're just
seeing a glimpse into DeepMind's specialized tooling along with reference
implementations of the operations they use in their work.

This will be really useful for researchers who want to mess with DeepMind's
ideas/papers, but I'm a bit relieved that there isn't anything claimed to be
fundamentally paradigm-shifting about this release.

~~~
Cacti
It's nice to see people building on top of TF rather than continuing to
develop completely new frameworks.

I've been working with TF in detail for several months now and it is a very
nicely designed framework. It is quite low-level, as you pointed out (which
makes it a challenge to get used to, and is why you see helper wrappers/libs
like this and Keras popping up), but ML models and methods are becoming
increasingly complex, and TF provides exactly the foundation and flexibility
needed to implement this stuff while still letting you slap together some
basic layers when you need to. It also helps that the engineering is very
solid (e.g., distributed models and datasets) and most of the performance
kinks have been worked out by now.

For those putting off working with TF because of its steepish learning curve,
I'd strongly suggest you dive in. I learned last year the usual way (docs +
Google), but the new O'Reilly books are the first good ones I've seen, if
that's your style.

The analogy with C++ is a good one, and historically apt given how scarce GPU
resources are right now.

~~~
eanzenberg
If you're already using TF, have you looked into keras? Does it not have the
complexity or control you are looking for?

~~~
Cacti
Keras is a wrapper around TF that makes common model construction easier
(stacking linear sequences of layers, as in a CNN) as well as common training
operations.

You can still do whatever you want with Keras by combining it with lower-level
TF operations; it's not really keeping you from doing anything. But more
complex layer types and training loops may have to be rolled by hand, at which
point it can be easier to just do it all with the lower-level TF ops. This
comes up a lot when implementing newer papers that propose new training
methods or operations, where you really need the flexibility to get in there
and manipulate the ops directly.
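
Loosely, the "high-level wrapper over low-level ops" pattern being described
can be sketched in plain Python. This is a hedged illustration only: no
TensorFlow is used, and every name here (`Dense`, `custom_op`) is hypothetical
rather than the real Keras or Sonnet API.

    
        # A high-level "layer" object that owns parameters and applies a
        # low-level op, mixed freely with a hand-rolled op, as described above.
        class Dense(object):
            """Holds weights/bias; callable like a function on an input."""

            def __init__(self, weight, bias):
                self.weight = weight  # list-of-lists standing in for a tensor
                self.bias = bias

            def __call__(self, x):
                # The "low-level op": matrix-vector product plus bias.
                return [sum(w * xi for w, xi in zip(row, x)) + b
                        for row, b in zip(self.weight, self.bias)]

        def custom_op(v):
            """A hand-rolled lower-level op (here, a ReLU)."""
            return [max(0.0, e) for e in v]

        # Compose the wrapped layer with the hand-rolled op.
        layer = Dense(weight=[[1.0, 2.0], [3.0, -4.0]], bias=[0.5, -0.5])
        out = custom_op(layer([1.0, 1.0]))  # -> [3.5, 0.0]

The point is just that the wrapper buys convenience for the common case while
leaving you free to drop down a level whenever a paper demands something the
abstraction doesn't cover.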

------
mrdrozdov
There are definitely some examples of useful ops here:

    
        # 1. Define our computation as some op
        def useful_op(input_a, input_b,
                      use_clipping=True, remove_nans=False, solve_agi='maybe'):
            # ...
            pass

------
drej
Looks cool, but: "This installation is compatible with Linux/Mac OS X and
Python 2.7."

Sad face.

~~~
wdroz
Python 2.7 only is a no-go for me.

~~~
mark_l_watson
I am not a Python developer, but I do use Python occasionally.

Is a Python 2.7 requirement really a big deal for mostly using other people's
code (adding a bit of your own application-layer code)?

I use Anaconda and just have different environments for 2.7 and 3.x,
TensorFlow or spaCy, or NLTK, etc.

Really easy to switch to the environment I need.

~~~
ganfortran
It is more about ideology.

Python 3.x is catching up, but still not as fast as the community wants it to.
But 2.7 is here for a reason, and shaming people who use 2.7 in order to push
3.x adoption is, in my opinion, bad practice.

------
dbcurtis
For us NN newbies, could someone do a comparison to Keras?

~~~
Cacti
Not really, but from a look at the documentation, this seems to be geared
towards recurrent nets (and their internal organization), while Keras is a
little more focused on layer abstraction.

Hopefully they improve the docs as they go along; I could follow along because
I work in the area, but they're pretty researchy, and I would imagine most
would have difficulty parsing them unless they're regularly reading papers in
this area.

------
singhrac
> Making Sonnet public allows other models created within DeepMind to be
> easily shared with the community

I'm looking forward to this - DeepMind has a frustrating track record of not
open sourcing their code. While it's their prerogative and I know of many
valid reasons for doing so, the many other researchers who publish their
(sometimes terribly hacked together) models have been incredibly helpful in
verifying that their work, well, works.

------
vmsp
Where does this leave Keras? Wasn't it supposed to be the new high level
interface to TF?

EDIT: Also, what about TF-Slim?

~~~
sbbq
Not sure. I think Keras is meant more for education and off-the-shelf models,
not so much for research, due to its abstraction. In other words, it's less
customizable. I think the best trade-off between low-level code and coding
productivity is probably tf.layers, which is also in TF core now.

~~~
nl
Keras is heavily used in research.

------
waleedka
Tip for the project lead: A quick way to lose half of the early adopters is to
release a library without Python 3 support. Luckily, fixing this is easy and
shouldn't take more than a day or two. And the ROI is high if your goal is
wider adoption.
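
For reference, a lot of that "day or two" of porting is often just writing
single-source 2/3-compatible code. A minimal sketch (the `mean` function is a
hypothetical stand-in for library code, not anything from Sonnet):

    
        # Single-source Python 2/3 compatibility via __future__ imports.
        # With these, the same file behaves identically under 2.7 and 3.x
        # for printing, division, and imports.
        from __future__ import absolute_import, division, print_function

        def mean(values):
            # True division on both 2.7 and 3.x thanks to the import above,
            # so integer inputs no longer silently floor-divide on 2.7.
            return sum(values) / len(values)

        print(mean([1, 2]))  # 1.5 on both interpreters, not 1

Beyond this, the remaining work is mostly bytes-vs-str cleanup and swapping
renamed stdlib modules, which tools like 2to3 largely automate.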

~~~
rch
It's also an easy way for an outsider to contribute.

------
ReeSilva
It looks nice and I will certainly try it. But no support for Py3? Come on,
guys, it's 2k17.

------
likelynew
Anyone knowledgeable about Sonnet here mind detailing how it is different from
Keras?

------
qeternity
As a startup that's in the midst of hiring, I can't wait to see how quickly
CVs get updated to include this.

~~~
ravenstine
Do you often find a significant number of CVs listing technologies that are
too new for one to have gained any mastery over?

~~~
qeternity
No, but we've only been interviewing for a few weeks. However, we see an
unbelievable number of CVs that have a full paragraph dedicated to buzzwords
and Python libraries.

