
AutoML-Zero: Evolving machine learning algorithms from scratch - lainon
https://github.com/google-research/google-research/tree/master/automl_zero
======
Now: manually. Next:

\- Autosuggest database tables to use

\- Automatically reserve parallel computing resources

\- Autodetect data health issues and auto fix them

\- Autodetect concept drift and auto fix it

\- Auto engineer features and interactions

\- Autodetect leakage and fix it

\- Autodetect unfairness and auto fix it

\- Autocreate more weakly-labelled training data

\- Autocreate descriptive statistics and model eval stats

\- Autocreate monitoring

\- Autocreate regulations reports

\- Autocreate a data infra pipeline

\- Autocreate a prediction serving endpoint

\- Auto setup a meeting with relevant stakeholders on Google Calendar

\- Auto deploy on Google Cloud

\- Automatically buy carbon offset

\- Auto fire your in-house data scientists

~~~
neximo64
Would be funny, but most of those things are already in AutoML Tables,
including the carbon offset.

[https://cloud.google.com/automl-tables](https://cloud.google.com/automl-tables)

~~~
westurner
> _Would be funny but most of those things are already on AutoML Tables,
> including the carbon offset_

GCP datacenters are 100% offset with PPAs. Are you referring to different
functionality for costing AutoML instructions in terms of carbon?

...

I'd add:

\- Setup a Jupyter Notebook environment

> _Jupyter Notebooks are one of the most popular development tools for data
> scientists. They enable you to create interactive, shareable notebooks with
> code snippets and markdown for explanations. Without leaving Google Cloud's
> hosted notebook environment, AI Platform Notebooks, you can leverage the
> power of AutoML technology._

> _There are several benefits of using AutoML technology from a notebook. Each
> step and setting can be codified so that it runs the same every time by
> everyone. Also, it's common, even with AutoML, to need to manipulate the
> source data before training the model with it. By using a notebook, you can
> use common tools like pandas and numpy to preprocess the data in the same
> workflow. Finally, you have the option of creating a model with another
> framework, and ensemble that together with the AutoML model, for potentially
> better results._

[https://cloud.google.com/blog/products/ai-machine-learning/u...](https://cloud.google.com/blog/products/ai-machine-learning/use-automl-tables-from-a-jupyter-notebook)

~~~
perl4ever
This sounds like the sort of thing that would be useful outside of data
science. Which leads to the question of whether it needs to be generalized, or
redone differently for different specializations. Which in turn seems like the
sort of question that it's tricky to answer with AI.

~~~
westurner
> _This sounds like the sort of thing that would be useful outside of data
> science._

The instruction/operation costing or the computational essay/notebook
environment setup?

Ethereum ("gas") and EOS have per-instruction costing. SingularityNET is a
marketplace for AI solutions hosted on a blockchain, where you pay for AI/ML
services with the SingularityNET AGI token. GridCoin and CureCoin, for
example, compensate compute-resource donations with their own tokens, which
also have floating exchange rates.

TLJH: "The Littlest JupyterHub" describes how to set up a multi-user
JupyterHub with, e.g., Docker spawners that isolate workloads running on
shared resources like GPUs and TPUs:
[http://tljh.jupyter.org/en/latest/](http://tljh.jupyter.org/en/latest/)

"Zero to BinderHub" describes how to set up BinderHub on a k8s cluster:
[https://binderhub.readthedocs.io/en/latest/zero-to-binderhub...](https://binderhub.readthedocs.io/en/latest/zero-to-binderhub/)

~~~
perl4ever
The notebook/procedure thing. Like, doesn't everybody everywhere operate on a
basis of mixed manual/automated procedures, where it needs to fluidly
transition from one to another, yet be controlled and recorded and verified
and structured?

~~~
westurner
REES is one solution to reproducibility of the computational environment.

> _BinderHub ([https://mybinder.org/](https://mybinder.org/)) creates docker
> containers from {git repos, Zenodo, FigShare,} and launches them in free
> cloud instances also running JupyterLab by building containers with
> repo2docker (with REES (Reproducible Execution Environment Specification)).
> This means that all I have to do is add an environment.yml to my git repo in
> order to get Binder support so that people can just click on the badge in
> the README to launch JupyterLab with all of the dependencies installed._

> _REES supports a number of dependency specifications: requirements.txt,
> Pipfile.lock, environment.yml, aptSources, postBuild. With an
> environment.yml, I can install the necessary CPython/PyPy version and
> everything else._

REES:
[https://repo2docker.readthedocs.io/en/latest/specification.h...](https://repo2docker.readthedocs.io/en/latest/specification.html)

REES configuration files:
[https://repo2docker.readthedocs.io/en/latest/config_files.ht...](https://repo2docker.readthedocs.io/en/latest/config_files.html)
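For concreteness, a minimal environment.yml of the kind described above might look like this (the channel, packages, and versions here are arbitrary examples, not anything REES prescribes):

```yaml
# environment.yml -- repo2docker builds the container's conda environment
# from this file; the packages listed are illustrative.
channels:
  - conda-forge
dependencies:
  - python=3.8
  - numpy
  - pandas
```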

Storing a container built with repo2docker in a container registry is one way
to increase the likelihood that it'll be possible to run the same analysis
pipeline with the same data and get the same results years later.

...

Pachyderm ([https://pachyderm.io/platform/](https://pachyderm.io/platform/))
does Data Versioning, Data Pipelines (with commands that each run in a
container), and Data Lineage (~"data provenance"). What other platforms are
there for versioning data and recording data provenance?

...

Recording manual procedures is an area where we've somewhat departed from the
"write in a lab notebook with a pen" practice. CoCalc records all
(collaborative) inputs to the notebook with a timeslider for review.

In practice, people use notebooks for displaying generated charts, for manual
exploratory analyses (which do introduce bias), for demonstrating APIs, and
for teaching.

Is JupyterLab an ideal IDE? Nope, not by a long shot. nbdev makes it easier to
write a function in a notebook, sync it to a module, edit it with a more
complete data-science IDE (like RStudio, VSCode, Spyder, etc.), and then copy
it back into the notebook.
[https://github.com/fastai/nbdev](https://github.com/fastai/nbdev)

~~~
westurner
> _What other platforms are there for versioning data and recording data
> provenance?_

Quilt also versions data and data pipelines: [https://medium.com/pytorch/how-to-iterate-faster-in-machine-...](https://medium.com/pytorch/how-to-iterate-faster-in-machine-learning-by-versioning-data-and-models-featuring-detectron2-4fd2f9338df5)

[https://github.com/quiltdata/quilt](https://github.com/quiltdata/quilt)
(Python)

------
TaylorAlexander
Shouldn’t this link directly to the Readme?

[https://github.com/google-research/google-research/blob/mast...](https://github.com/google-research/google-research/blob/master/automl_zero/README.md)

------
lokimedes
Reminds me of
[https://www.nutonian.com/products/eureqa/](https://www.nutonian.com/products/eureqa/),
which I used quite productively to model multivariate distributions from data
back in the 2000s. Funny how everything stays the same, but with a new set of
players on the bandwagon.

~~~
jmmcd
Not really similar. Nutonian did straight-up genetic programming symbolic
regression. This does genetic programming to discover ML algorithms.
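To make the distinction concrete, here is a toy sketch of what "straight-up genetic programming symbolic regression" means: an evolutionary loop searching over expression trees for one that fits data. Everything here (the operator set, truncation selection with fresh random trees standing in for mutation) is a simplification for illustration, not Eureqa's or AutoML-Zero's actual algorithm.

```python
import random

# Toy symbolic regression: search expression trees built from +, -, *
# and the leaves 'x' and 1.0 for one that fits target data.
OPS = {'+': lambda a, b: a + b,
       '-': lambda a, b: a - b,
       '*': lambda a, b: a * b}

def rand_expr(depth=0):
    # Random expression tree, capped at depth 3.
    if depth > 2 or random.random() < 0.3:
        return random.choice(['x', 1.0])
    op = random.choice(list(OPS))
    return (op, rand_expr(depth + 1), rand_expr(depth + 1))

def evaluate(expr, x):
    if expr == 'x':
        return x
    if isinstance(expr, float):
        return expr
    op, a, b = expr
    return OPS[op](evaluate(a, x), evaluate(b, x))

def fitness(expr, xs, ys):
    # Sum of squared errors; lower is better.
    return sum((evaluate(expr, x) - y) ** 2 for x, y in zip(xs, ys))

def evolve(xs, ys, pop_size=50, gens=100, seed=0):
    random.seed(seed)
    pop = [rand_expr() for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda e: fitness(e, xs, ys))
        # Keep the best half, refill with fresh random trees.
        pop = pop[:pop_size // 2] + [rand_expr() for _ in range(pop_size // 2)]
    return min(pop, key=lambda e: fitness(e, xs, ys))

xs = [float(i) for i in range(-3, 4)]
ys = [x * x + x for x in xs]        # target: f(x) = x^2 + x
best = evolve(xs, ys)
```

Eureqa-style systems do this with far richer operator sets, crossover and mutation, and parsimony pressure; AutoML-Zero instead evolves imperative programs (setup/predict/learn routines) rather than single expressions.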

~~~
lokimedes
Actually, it is somewhat similar, as both find the model and obviously fit the
data to that model in the process. My use was finding a parameterization that
could be reused through regular regression fitting.

------
joe_the_user
_AutoML-Zero aims to automatically discover computer programs that can solve
machine learning tasks, starting from empty or random programs and using only
basic math operations._
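For intuition about "programs built from basic math operations": the paper represents each candidate algorithm as lists of instructions operating on a small memory, along the lines of the hedged sketch below (the instruction format and op names are my invention for illustration, not the paper's actual code).

```python
import math

# Sketch of an AutoML-Zero-style program: a list of instructions
# (op, dest, src1, src2) over a small addressable memory of scalars.
OPS = {
    'add': lambda a, b: a + b,
    'sub': lambda a, b: a - b,
    'mul': lambda a, b: a * b,
    'sin': lambda a, b: math.sin(a),  # unary ops ignore src2
}

def run(program, memory):
    for op, dest, s1, s2 in program:
        memory[dest] = OPS[op](memory[s1], memory[s2])
    return memory

# A hand-written "predict" computing m[3] = m[0] * m[1] + m[2]:
program = [('mul', 3, 0, 1), ('add', 3, 3, 2)]
mem = run(program, {0: 2.0, 1: 3.0, 2: 1.0, 3: 0.0})
print(mem[3])  # 7.0
```

Evolution mutates such instruction lists and keeps the ones whose learned predictions score best on held-out tasks, so "good program" is defined by prediction error on labeled data, which is exactly where any human bias in the labels enters.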

If this system is not using human bias, how is it choosing what a good program
is? Surely human-labeled data involves humans adding their bias to the data?

It seems like AlphaGoZero was able to do just end-to-end ML because it was
able to use a very clear and "objective" standard, whether a program wins or
loses at the game of Go.

Would this approach only deal with similarly unambiguous problems?

Edit: also, AlphaGoZero was one of the most compute-intensive ML systems ever
created (at least at the time of its creation). How much compute would this
require for more fully general learning? Will there be a limit to such an
approach?

~~~
darawk
> It seems like AlphaGoZero was able to do just end-to-end ML because it was
> able to use a very clear and "objective" standard, whether a program wins or
> loses at the game of Go.

Just a fun note: winning or losing at the game of Go is actually surprisingly
subjective:

[https://en.wikipedia.org/wiki/Go_(game)#Scoring_rules](https://en.wikipedia.org/wiki/Go_\(game\)#Scoring_rules)

~~~
pmontra
The game ends by agreement of the players. If they don't agree on the result
("those stones are alive!") they must keep playing. Chinese rules are much
better at this than Japanese ones especially (IMHO) the old ones with the
group tax. There are no ambiguities there. Unfortunately the group tax is
unpleasant and Chinese rules are a pain to score manually. Japanese rules are
full of flaws but are such a nice shortcut that almost everybody except China
use them or some variant of them.

Btw, if any Chinese player is reading this, how do you count the score while
playing? Do you count territory and remember the number of captured stones or
do you count both stones and territory? Thanks.

~~~
darawk
> The game ends by agreement of the players. If they don't agree on the result
> ("those stones are alive!") they must keep playing. Chinese rules are much
> better at this than Japanese ones especially (IMHO) the old ones with the
> group tax. There are no ambiguities there. Unfortunately the group tax is
> unpleasant and Chinese rules are a pain to score manually. Japanese rules
> are full of flaws but are such a nice shortcut that almost everybody except
> China use them or some variant of them.

Ambiguities? No. Subjectivity? Yes.

~~~
mlyle
No, not really. Under Chinese rules, eventually it will reach a clean,
objectively scored state. Of course, human players will agree on the score
before this point.

------
mark_l_watson
This reminds me of John Koza’s Genetic Programming, a technique for evolving
small programs. There is an old Common Lisp library to play with it.

~~~
drongoking
My reaction too. They've reinvented genetic/evolutionary programming. They
should probably read some of the decades of work that have already been done
on it.

~~~
imvetri
Same here. When I studied genetic programming, I hoped that was where problem
solving would evolve from, as it seemed flawless. But recent events prove
otherwise, which made me believe we are using the wrong tool for the wrong
problem. Here is why.

When AI gets to 100% accuracy, the equation that finds the answer becomes 100%
accurate. We no longer have to run the AI with heavy resources, and the
equation can be converted to an executable program. This model of AI would
save computing power and use resources smartly.

Example.

AI tries to find right equation to add two numbers.

AI finds the equation to add two numbers.

AI outputs the equation as an executable program.

AI discards itself.

~~~
jmmcd
You might not be familiar with how neural networks work. When training they do
use a lot of computing power. But when running they don't. Yes, they still
require some external boilerplate code to multiply the matrices, but you
already have it and it's not heavy. So yes there is some convenience in
program synthesis in a human programming language, but it is a small
convenience, not a game changer.
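To make the training-vs-inference asymmetry concrete: once the weights are fixed, running a small network is just a few multiply-adds. The weights below are made-up placeholders, not a trained model.

```python
# Forward pass of a tiny 2-layer network in plain Python: no training
# machinery is needed at inference time, only matrix-vector products.
def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, vi) for vi in v]

def forward(x, W1, b1, W2, b2):
    h = relu([hi + bi for hi, bi in zip(matvec(W1, x), b1)])
    return [oi + bi for oi, bi in zip(matvec(W2, h), b2)]

# Placeholder weights (illustrative only):
W1 = [[1.0, -1.0], [0.5, 0.5]]
b1 = [0.0, 0.0]
W2 = [[1.0, 1.0]]
b2 = [0.0]

print(forward([2.0, 1.0], W1, b1, W2, b2))  # [2.5]
```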

------
tmpmov
For those interested: AutoML-Zero cites "Evolving neural networks through
augmenting topologies" (2002), among other "learning to learn" papers, and it
is worth a read if you have the time and inclination.

For those with more background and time, would anyone mind bridging the
18-year gap succinctly? A quick look at the paper reveals solution-space
constraints (presumably for speed), discovering better optimizers, and,
specific to the AutoML-Zero paper, symbolic discovery.

------
ypcx
Now, can we evolve a ML algorithm that would in turn produce a better AutoML?
Ladies and gentlemen, the Singularity Toolkit v[quickly changing digits here].

------
jxcole
Interesting, but how does it perform on standard benchmarks like ImageNet and
MNIST?

~~~
JulianWasTaken
I am way out of my depth so maybe this is pure nonsense, but presumably the
goals aren't so much to evolve better performing models directly for some
dataset specifically, but to see what kinds of model families it evolves and
whether we've thought of all of them?

Once we've got one we can then presumably train specific models in a more
targeted way.

------
manthideaal
If AutoML-Zero is going to be more than a grid-like method, then I think it
should try to learn a probabilistic distribution over (method, problem,
efficiency) and use it to discover features for problems using an autoencoder
in which the loss function is a metric over the (method, efficiency) space.
That means using transfer learning from related problems, where the similarity
of problems is based on the (method, efficiency) difference.

Problem P1 is locally similar to P2 if (method, efficiency, P1), measured in
computation time, is similar to (method, efficiency, P2) for methods in a
local space of methods. The method should learn to classify both problems and
methods; that's similar to learning words and context words in NLP, or to
matrix factorization in recommender systems. To sample the (problem, method,
efficiency) space one needs huge resources.

Added: To compare a pair of (method, problem), some standardization should be
used. For linear problems related to solving linear systems, the condition
number of the coefficient matrix should be used as a feature for
standardization; for SAT, for example, a heuristic using the number of clauses
and variables should be used for estimating the complexity and normalizing the
problems. So the preprocessing step should use the best known heuristic for
solving the problem and for estimating its complexity, as both a feature and a
method for normalization. Heuristics plus DL for TSP are approaching SOTA (but
Concorde is still better).

Finally, perhaps some encoding of how the heuristic was obtained could be used
as a feature of the problem (heuristic from minimum spanning tree, branch and
bound, dynamic programming, recurrence, memoization, hill climbing, ...), as
an enumerated type.

So some problems for preprocessing are: 1) What is a good heuristic for
solving this problem? 2) What is a good heuristic for bounding or estimating
its complexity? 3) How can you use those heuristics to standardize or
normalize its complexity? 4) How big should the problem be so that the
asymptotic complexity takes over the noise of small problems? 5) How do you
encode the different types of heuristics? 6) How do you weigh sequential
versus parallel methods for solving the problem?

Finally, I wonder whether, once a problem is autoencoded, some kind of
curvature could be defined; that curvature should be related to the average
complexity of a local space of problems, and transitions, as in graph
problems, should also be featured. The idea is using gems of features to allow
the system to combine those or discover new, better features. Curvature could
be used for clustering problems, that is, for classifying types of problems.
For example, all preprocessed problems for solving a linear system should be
normalized to have similar efficiency when using the family F of learning
methods; otherwise a feature is introduced for further normalization. For
example, some problems could require estimating the number of local extrema
and the flat (zero-curvature) extent of those zones.

~~~
no_identd
Very insightful comment, thank you. There's one other related thing I find
worth exploring further: the population-based training used by AutoML-Zero at
the moment seems extremely simplistic, and there are a lot of bleeding-edge
methods in that area that can tremendously improve the outcomes of
evolutionary algorithms. I've tweeted about them here (and at the AutoML-Zero
people):

[https://twitter.com/no_identd/status/1238565087675330560](https://twitter.com/no_identd/status/1238565087675330560)

And it doesn't seem unlikely that tweaking these would tremendously improve
the outcomes. Combining that with what you've just described would… well, I'll
leave that to the reader's imagination. ;)

------
nobodywillobsrv
Basic Tech Bros still don't get it. This is cool, but the real problem is
finding/defining the problem. And you don't get a million guesses.

Here is a simple test: get me data to predict the future. Can an algo like
this learn to read APIs, build scripts, sign up and pay fees, collect data
(laying down a lineage for causal prediction), set up accounts, figure out how
account actions work and then take actions profitably without going bust?

If it can even do the first part of this I am in. But I doubt it. This is
still just at the level of "cool! Your dog can play mini golf."

~~~
dang
Can you please omit name calling from your comments here? I'm sure you can
make your substantive points without that.

This is in the site guidelines:
[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html).

