
Introducing AdaNet: Fast and Flexible AutoML with Learning Guarantees - dsr12
https://ai.googleblog.com/2018/10/introducing-adanet-fast-and-flexible.html
======
minimaxir
An example notebook from the official repo that gives a better sense of how
AdaNet works:
[https://github.com/tensorflow/adanet/blob/master/adanet/exam...](https://github.com/tensorflow/adanet/blob/master/adanet/examples/tutorials/adanet_objective.ipynb)

Despite being "AutoML", there's still a lot of work that needs to be done to
optimize the model. (the second example notebook is a bit better in this
aspect:
[https://github.com/tensorflow/adanet/blob/master/adanet/exam...](https://github.com/tensorflow/adanet/blob/master/adanet/examples/tutorials/customizing_adanet.ipynb))

~~~
scottyak-adanet
True, while the AdaNet project doesn't currently include the most advanced
architecture search policies/search spaces out of the box, it does provide an
abstraction (adanet.subnetwork.Generator) for researchers to implement their
own neural architecture search algorithms, while exposing a
production-friendly TF Estimator interface for users. This makes it easier
for algorithmic developments to reach production, and for users already
plugged into the TF Estimator ecosystem to benefit from them sooner.

I would like to point out, though, that even though some aspects of AdaNet
aren't as advanced as they could be, we have seen good results with it on a
wide variety of tasks. Try it on yours and see!

And of course, we are not going to stop after exposing an interface! We are
excited to work in this space ourselves too, and we will continue to update
this project as we make progress.
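
For concreteness, here's a rough sketch of the Builder half of that
abstraction, loosely following the simple_dnn tutorial in the repo. This is
TF 1.x-era code; the feature key "x", the layer width, and the complexity
measure are placeholder choices, and exact signatures may differ between
versions:

    import adanet
    import tensorflow as tf  # TF 1.x

    class SimpleDNNBuilder(adanet.subnetwork.Builder):
        """Builds a fully connected subnetwork of a given depth."""

        def __init__(self, num_layers):
            self._num_layers = num_layers

        def build_subnetwork(self, features, logits_dimension, training,
                             iteration_step, summary, previous_ensemble=None):
            x = tf.to_float(features["x"])  # "x" is a placeholder feature key
            for _ in range(self._num_layers):
                x = tf.layers.dense(x, units=64, activation=tf.nn.relu)
            logits = tf.layers.dense(x, units=logits_dimension)
            return adanet.Subnetwork(
                last_layer=x,
                logits=logits,
                # Complexity feeds the regularization term in the AdaNet
                # objective; sqrt(depth) is one simple choice.
                complexity=tf.sqrt(tf.to_float(self._num_layers)),
                # Persisted so a Generator can read the depth back later.
                persisted_tensors={"num_layers": tf.constant(self._num_layers)})

        def build_subnetwork_train_op(self, subnetwork, loss, var_list, labels,
                                      iteration_step, summary,
                                      previous_ensemble=None):
            return tf.train.AdamOptimizer(1e-3).minimize(loss, var_list=var_list)

        def build_mixture_weights_train_op(self, loss, var_list, logits, labels,
                                           iteration_step, summary):
            return tf.no_op()  # keep mixture weights fixed in this sketch

        @property
        def name(self):
            return "dnn_{}_layers".format(self._num_layers)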

------
cweill
Author here, along with several of the core team; we can answer questions
here.
We also have a post on r/machinelearning for further discussion:
[https://www.reddit.com/r/MachineLearning/comments/9spw5g/p_g...](https://www.reddit.com/r/MachineLearning/comments/9spw5g/p_google_ai_opensources_a_tensorflow_framework/)

~~~
bitL
I haven't read it in detail, so my apologies if this is explained in the
notebooks: can AdaNet handle blocks with variable-length skip connections
(like DenseNet), or even come up with AmoebaNet-style models on its own? What
is the meta-strategy guiding the hyperparameter/architecture selection process
(grid search/Bayesian/etc.)? Thanks!

~~~
cweill
Great question! In the simplest case, AdaNet lets you ensemble independent
subnetworks ranging from a linear model to user-defined
DenseNet/AmoebaNet-style networks. But more interesting is sharing
information (tensor outputs or which hyperparameters worked best) between
iterations so that AdaNet can do neural architecture search for you. Users
can define their own adanet.subnetwork.Generator to specify how to adapt
training across iterations.

Out of the box, the meta-strategy is little more than simple user-defined
heuristics (e.g., "if the deepest candidate subnetwork performed best, try
subnetworks that are one layer deeper than that"). However, the AdaNet
framework is flexible enough to support smarter strategies like the ones you
mentioned, and it abstracts away the complexities of distributed training
(Estimator), evaluation (TensorBoard), and serving (tf.SavedModel).
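
As a sketch of that out-of-the-box heuristic, a Generator along the lines of
the repo's simple_dnn example might look like this (assuming a
SimpleDNNBuilder like the one sketched earlier in the thread; the
"num_layers" persisted-tensor key is a placeholder):

    import adanet
    import tensorflow as tf  # TF 1.x

    class SimpleDNNGenerator(adanet.subnetwork.Generator):
        """Each iteration, proposes a same-depth and a one-layer-deeper DNN."""

        def generate_candidates(self, previous_ensemble, iteration_number,
                                previous_ensemble_reports, all_reports):
            num_layers = 0
            if previous_ensemble:
                # Read back the depth of the most recently added subnetwork
                # (persisted under "num_layers" by the builder).
                num_layers = tf.contrib.util.constant_value(
                    previous_ensemble.weighted_subnetworks[-1]
                    .subnetwork.persisted_tensors["num_layers"])
            # Candidates: one at the current depth, one a layer deeper.
            return [SimpleDNNBuilder(num_layers=num_layers),
                    SimpleDNNBuilder(num_layers=num_layers + 1)]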

------
bitL
Cool! Will there be simple Keras support for it, or just via TF Estimators?

~~~
cweill
We started with Estimator because it’s the easiest to train and deploy with
production infrastructure. Having a Keras model API is a good idea: it would
make it easier to try AdaNet. Feel free to file a feature request on the
GitHub repo:
[https://github.com/tensorflow/adanet](https://github.com/tensorflow/adanet).
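
For what it's worth, a minimal sketch of wiring a generator into the
Estimator interface might look like the following (TF 1.x era, when heads
lived in tf.contrib; SimpleDNNGenerator, train_input_fn, and eval_input_fn
are placeholders, and the step counts are arbitrary):

    import adanet
    import tensorflow as tf  # TF 1.x

    estimator = adanet.Estimator(
        head=tf.contrib.estimator.multi_class_head(n_classes=10),
        subnetwork_generator=SimpleDNNGenerator(),
        # Steps to train each iteration's candidates before generating the
        # next round and growing the ensemble.
        max_iteration_steps=5000,
        evaluator=adanet.Evaluator(input_fn=eval_input_fn))

    # From here it's the standard Estimator workflow.
    estimator.train(input_fn=train_input_fn, max_steps=50000)
    metrics = estimator.evaluate(input_fn=eval_input_fn)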

------
yazr
Yikes. There is just so much progress with ML and DL.

Can anyone estimate how much CPU $$ we need to get some results with this?

I train on about 20M samples (1K data points each).

~~~
scottyak-adanet
TL;DR: It depends on the number of subnetworks you search over and the cost of
training each subnetwork.

You could define a search space with a single DNN, give it a single
iteration, and this would behave identically to the canned DNNEstimator.

In a slightly more interesting case, suppose your search space consists of
(say) 5 DNNs that each cost X to train over (say) 1 epoch per iteration, and
you train them over (say) 10 iterations; that would cost you around
X × 5 × 10 = 50X.

Do consider, though, that using AdaNet to explore, train, select, and
ensemble these 50 DNNs is probably worth it, given how annoying it would be
to do all of that without AdaNet :) Additionally, since AdaNet is implemented
as a TensorFlow Estimator, it is easy to scale up the number of machines to
speed up training, if that's something you want.
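
In other words, a back-of-the-envelope estimate is just the product of those
three numbers:

    def search_cost(cost_per_subnetwork, candidates_per_iteration, iterations):
        """Rough total training cost of an AdaNet search."""
        return cost_per_subnetwork * candidates_per_iteration * iterations

    search_cost(1.0, 5, 10)  # -> 50.0, i.e. ~50X in the example above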

------
visarga
Can AdaNet work with CNN and RNN layers?

~~~
cweill
We have an example of AdaNet on GitHub that uses several complex CNN layers
(conv2d, pooling, separable convolutions) as part of the NASNet-A
architecture (presented in
[https://arxiv.org/abs/1707.07012](https://arxiv.org/abs/1707.07012)):
[https://github.com/tensorflow/adanet/blob/master/adanet/exam...](https://github.com/tensorflow/adanet/blob/master/adanet/examples/nasnet.py)

AdaNet supports high-level APIs (e.g., tf.layers, tf.losses), so it can
support RNN cells as well.
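
Since a Builder's build_subnetwork body is ordinary TensorFlow, swapping in
conv (or RNN) ops is just a matter of changing that body. A sketch, with
"images" as a placeholder feature key:

    import tensorflow as tf  # TF 1.x

    def _cnn_last_layer(features):
        """Sketch: a conv stack usable inside a Builder's build_subnetwork."""
        x = tf.layers.conv2d(features["images"], filters=32, kernel_size=3,
                             padding="same", activation=tf.nn.relu)
        x = tf.layers.max_pooling2d(x, pool_size=2, strides=2)
        return tf.layers.flatten(x)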

------
riku_iki
Curious whether there are any results on well-known benchmarks achieved by
this project?

~~~
scottyak-adanet
Yes, the blog post mentions that AdaNet, using NASNet-A subnetworks, achieves
a 2.30% error rate on the CIFAR-10 benchmark.

~~~
singularity2001
In what wall time? Compared to similar error rates.

------
ur-whale
Article doesn't load for me. Anyone else experiencing the same?

