This doesn't feel fair to say, just like it wouldn't if you replaced "TensorFlow" with "C" and "Keras" with "Python". They operate on fundamentally different levels and provide a similar trade-off in control/ease of use.
We started using Keras for a project a few months ago, and it was great while it supported what we were doing. Once we needed to go outside of the box a little bit, we essentially had to rewrite it in plain TensorFlow.
This is great news though! Hopefully it will make the barrier to entry much lower for getting started with Tensorflow.
The API looks similar to something like Keras or Lasagne, but the layers are just simple operations on tensors, so it integrates seamlessly with vanilla TensorFlow in a way that something like TF-Keras won't.
At the same time, though, the layers are the same foundation that TF-Keras looks like it'll use, so you get much of the same expressive power, but without sacrificing flexibility.
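To make the "layers are just simple operations on tensors" point concrete, here's a toy sketch in plain Python (illustrative only, not TensorFlow's actual API): when a layer is just a function from tensors to tensors, arbitrary tensor math composes freely around it, with no adapter between "layer world" and "vanilla op world".

```python
# Illustrative sketch: a "layer" as a plain function on (list-based) tensors.

def dense(x, w, b):
    """Matrix-vector product plus bias, written out by hand."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(w, b)]

def relu(x):
    return [max(0.0, v) for v in x]

# Because layers are just functions, any "vanilla" tensor op can be mixed in
# between them with no special glue code.
x = [1.0, 2.0]
h = relu(dense(x, w=[[1.0, -1.0], [0.5, 0.5]], b=[0.0, 0.0]))
y = [v * 2.0 for v in h]  # an arbitrary op slotted in between layers
```

The contrast is with layer objects that own their own model/graph state, which is where the friction with vanilla TensorFlow code tends to come from.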
Google also released prettytensor, which is designed to address the same problem.
That's a valid point. But I'd still like to see TensorFlow improve in a few areas, in particular the documentation and overall marketing/positioning. For instance, the Udacity course and the TensorFlow tutorials do not make it at all clear that TensorFlow is the low-level plumbing that you only need if you really have to customize the algorithms or build new ones.
Furthermore, the API really is over-complex, and the docs and tutorials tend to show the full complexity when much simpler approaches exist. I'd like to see context managers, scope, sessions, and explicit graphs disappear from all but the most advanced documentation - show us how to build in Tensorflow without all that cognitive overhead (and indeed, it can be done!)
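To see why explicit graphs and sessions feel like cognitive overhead, here's a toy deferred-execution graph in plain Python (a hypothetical illustration, not TensorFlow's API): even adding two numbers means first building nodes and then explicitly "running" them, which is the ceremony the comment is objecting to.

```python
# A minimal deferred-execution graph: construction and evaluation are
# separate phases, analogous to building a TF graph and then session.run().

class Node:
    def __init__(self, fn, inputs=()):
        self.fn, self.inputs = fn, inputs

    def run(self):
        # Recursively evaluate inputs, then apply this node's op.
        return self.fn(*[n.run() for n in self.inputs])

def constant(v):
    return Node(lambda: v)

def add(a, b):
    return Node(lambda x, y: x + y, (a, b))

# Graph construction: nothing is computed yet...
c = add(constant(2), constant(3))
# ...until an explicit run, the analogue of sess.run(c).
result = c.run()  # 5
```

Hiding exactly this two-phase dance behind sensible defaults is what the simpler approaches mentioned above amount to.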
TensorFlow's defaults are unfriendly too. For instance, grabbing all available memory on all of your GPUs is unexpected and unhelpful. Open up another Jupyter Notebook tab and you've got a nasty error message coming up...
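For reference, the grab-everything default can be opted out of; a sketch assuming the TF 1.x session API looks roughly like this:

```python
import tensorflow as tf

# Allocate GPU memory on demand instead of grabbing it all up front,
# so a second notebook/process can still start.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True                      # grow as needed
# config.gpu_options.per_process_gpu_memory_fraction = 0.4  # or a hard cap
sess = tf.Session(config=config)
```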
> We started using Keras for a project a few months ago, and it was great while it supported what we were doing. Once we needed to go outside of the box a little bit we essentially had to rewrite it in just Tensorflow.
I'm surprised to hear that - I've found it so much easier to implement parts in Tensorflow or Theano and then call them from Keras. Trying to reimplement all the DL best practices from scratch in Tensorflow is a huge amount of work and hard to get right first time (e.g. handling dropout in RNNs correctly), and you still end up with an API that's less elegant than what Keras already provides.
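The RNN-dropout subtlety mentioned above is a good example of what's easy to get wrong from scratch. A simplified, hypothetical sketch of the "variational" approach: one dropout mask is sampled per sequence and reused at every timestep, rather than resampling a fresh mask each step.

```python
import random

def dropout_mask(size, rate, rng):
    # Inverted dropout: surviving units are scaled by 1/keep_prob.
    keep = 1.0 - rate
    return [(1.0 / keep) if rng.random() < keep else 0.0 for _ in range(size)]

def run_rnn(xs, rate=0.5, seed=0):
    rng = random.Random(seed)
    mask = dropout_mask(len(xs[0]), rate, rng)  # sampled ONCE per sequence
    h = [0.0] * len(xs[0])
    outputs = []
    for x in xs:                                # same mask at every timestep
        h = [(hi + xi) * mi for hi, xi, mi in zip(h, x, mask)]
        outputs.append(h)
    return mask, outputs

mask, outs = run_rnn([[1.0, 1.0], [1.0, 1.0]])
```

Resampling the mask inside the loop is the naive version, and telling the two apart in your own reimplementation is exactly the "hard to get right first time" problem.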
Why did you find you needed to rewrite in TF rather than integrate with Keras?
We were building a character based RNN for generating text, with some special tweaks to it. This wasn't a good fit for Keras based on what we were reading, and there was a great straightforward Tensorflow example already out there which we could work off of so it made more sense to us at that point in time.
Personally I've found the somewhat obscure, low-level, and (comparatively) little-documented DyNet much easier to understand than TensorFlow, so the critique rings true to me.
I'd personally like to see Keras become the industry-standard interface. We are starting to give classes with it for our customers and the feedback has been positive.
CNTK is adding one: https://github.com/Microsoft/CNTK/issues/797
MXNet is there:
TF being there obviously helps.
//Begin blatant self promotion :D
No one will care about ours as much (research crowd, Python, etc.), but I'll throw our hat in the ring here too. Keras has been amazing for our use case and we are personally pushing more Python people towards that:
^^ that will bridge the spark ecosystem and friends so you can run on yarn/mesos/spark etc from a familiar interface.
//End blatant self promotion
As it happens, I literally just now posted the curriculum for part 2 of the course - http://www.fast.ai/2017/01/17/curriculum2/ . If you're near SF, you may want to join us. Either way, I'd love to get feedback on the curriculum - anything you'd like to see added? Anything from part 1 that you'd like to know more about?
For starters, I found the pairing of the lecture, the code, AND the documentation to be particularly useful. The setup in Anaconda really enables you to compare/understand inputs and outputs, which at least for me is very helpful! Big fan of learning through practical application, which the aforementioned combo is well suited for, imo.
Kudos and thank you (and the team) very much for all of the hard work! I am not sure I'll be able to attend part 2 in person but I will be sure to follow along online. If I am ever in SF at the time of a course, I will certainly apply!
All the best.
"A key teaching goal for us is that you come away from the course feeling much more comfortable reading, understanding, and implementing research papers. We’ll be sharing some simple tricks that make it much easier to quickly scan and get the key insights from a paper."
My interest is musical style transfer. I'd like to replicate these examples from Sony Computer Science Lab-Paris: http://www.flow-machines.com/odetojoy/
They've published papers, but not code (except for DeepBach).
As a person without any education in the field of maths or neural networks, the examples from the Keras docs and blog resonated with me far more than the TensorFlow ones.
Theano support will continue for as long as Keras exists. This integration changes nothing for existing Keras users, only for TF users. - https://twitter.com/fchollet/status/821090410659344384
Rather than a "new" Keras, it sounds (I could be wrong) as if the Keras API will now be included with TF as an alternative way of interacting with TF (as it has been for some time), and simply bundled with TF.
> Theano support will continue for as long as Keras exists. This integration changes nothing for existing Keras users, only for TF users.
I fear that from now on the Python API will be the best documented API used by most example code, and the low level API will over time become something obscure that is hard to approach and badly documented, just like Torch's C API.
I don't like this split between a core system written in C or C++ and a high level API written in a language that's too slow and memory hungry to write the core system in.
This architecture is probably meant to reflect an existing split between users and implementors of the system and I can understand the arguments in favor. But I think it also creates and reinforces that split, which isn't a good thing at all.
We were actually looking into moving to TensorFlow from Torch because the Torch C API is considered an internal thing not meant for public consumption.
What I do insist on is that we keep the capability to selectively dig deeper where we need to and combine different libraries with our own code.
That's why I'm always trying to reduce the frameworkness and increase the librariness of any third party code we use. Scripting language runtimes are a major roadblock in that approach.
I wouldn't consider myself an expert in the "ml" field either. I'm only really informed because of my academic/contract research work in neuroimaging, with DSP and beamforming, which meant relying on performant code that at the end of the day comes down to manipulating matrices under some set of mathematical/physical constraints. I wish reviewers had given us an easier time with our paper and just believed our results, instead of taking months before accepting it, haha.
But if you are able to use these frameworks, hack around the quirks for your use case with minimum effort, and solve your problems to a satisfactory degree, I completely understand that they can be good enough.
>…we keep the capability to selectively dig deeper…
>That's why I'm always trying to reduce the frameworkness and increase the librariness of any third party code we use. Scripting language runtimes are a major roadblock in that approach.
I don't disagree with your sentiments at all, but it seems like you are already at the intersection where you just need to go down your own road, because it's probably unrealistic to expect Google's needs to be aligned with yours in the long term. Then again, I'm the type of person who would fork the nightly version of Firefox and remove and add the stuff I want out of the box, so maybe this is not the best suggestion >.<
When your model changes every day, you really need a framework that abstracts away the underlying computational resources.
Keras goes beyond simply being a concise API, a variety of examples, or a strong community: its "best practices included" philosophy is opinionated but quite useful.
As an example, the settings for an LSTM are complex and require reasonably thorough understanding of many topics. There's dropout (and the many debated ways one could apply it), there's the forget bias, there's weight initialization, there's ... You get the idea.
If you use `keras.layers.recurrent.LSTM` however, bam, you get an opinionated version of these for free.
Initialization is Glorot uniform for most of the weights but then orthogonal for the inner weights.
The forget bias is set to one - as I hope every library does by default now, though that wasn't the case for some time.
Dropout is variational inference based dropout - recent, likely what you want, and zero complexity.
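Two of the defaults listed above can be sketched in a few lines (a back-of-envelope illustration, not Keras source, and the gate ordering below is an assumption): Glorot/Xavier uniform draws weights from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)), and the LSTM bias vector gets its forget-gate slice initialised to 1 instead of 0.

```python
import math
import random

def glorot_uniform(fan_in, fan_out, rng):
    # Glorot/Xavier uniform: limit scales with layer fan-in and fan-out.
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [rng.uniform(-limit, limit) for _ in range(fan_in * fan_out)]

def lstm_bias(units):
    # One bias vector covering four gates; assumed order: input, forget,
    # cell, output. Only the forget-gate slice starts at 1.
    b = [0.0] * (4 * units)
    b[units:2 * units] = [1.0] * units
    return b

w = glorot_uniform(3, 4, random.Random(0))
b = lstm_bias(2)  # forget-gate slice is the second block of two
```

The point is that a Keras user gets all of this without ever writing it, and only later needs to ask "wait, what's a Glorot?"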
At some point you'll likely want to learn about all the details - and this provides a smooth easy transition for that as you go "wait, what's a Glorot?" - but for getting your feet wet and/or solving a specific task, "best practices included" seems the best combination. I've successfully recommended this to high school students and they've been up and running with neural networks in short order!
Given all of this, whilst I'm a researcher who works on fiddly novel architectures that require some pretty specific features, so I use Chainer at work, I turn to Keras for my fun side projects as it keeps me sane and happy :)
Full disclosure, I've committed examples to the Keras codebase and know Francois in person.
We get the use case for autodiff, but not being focused on research, we just decided it was easier to hand-implement the layers and have the graphs be defined at the layer level.
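A minimal sketch of what "defining the graph at the layer level" can mean (hypothetical code, not any particular library): each layer is a hand-written forward pass, and the "graph" is simply the ordered list of layers, with no op-level autodiff graph underneath.

```python
class Dense:
    """A hand-implemented layer: the forward pass is written out directly."""

    def __init__(self, w, b):
        self.w, self.b = w, b

    def forward(self, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) + bi
                for row, bi in zip(self.w, self.b)]

class Network:
    """The whole 'graph' is just an ordered list of layers."""

    def __init__(self, layers):
        self.layers = layers

    def forward(self, x):
        for layer in self.layers:
            x = layer.forward(x)
        return x

net = Network([Dense([[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0])])
out = net.forward([2.0, 3.0])  # identity weights plus bias -> [3.0, 4.0]
```

The trade-off is exactly the one described: gradients must be hand-derived per layer, which is fine when the layer zoo is small and stable.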
Thanks for clarifying.
I am considering just using WebGL or something like Turbo.js to experiment with my Radeon hardware for AI.
Maybe AMD should hire some people to work full time on TensorFlow. Anyway, at this point the next thing I buy will probably be an Nvidia card, and I'll chalk up the AMD graphics card purchase to ignorance. At least I played a few games on it.