
Black box optimization competition - silentvoice
http://bbcomp.ini.rub.de/
======
darkmighty
I really dislike the term "Black box optimization". There's no such thing. You
_have_ to make assumptions about your function, so in the end this is just
rewarding people whose optimizers happen to match the chosen functions; but
those functions are not made explicit whatsoever. That doesn't make any sense.

For example, if the outputs/inputs are floating point numbers, then you can
assume the domain/range is [-M, M]. Otherwise, even with the most clever
algorithm you have no guarantee of ever approaching the optimum, even if the
function is continuous. And even with a limited range there are no guarantees
if the function is not well behaved -- so you have to assume, again, that the
function is well behaved. For any assumption you make there is a function on
which it is terrible. There is no best assumption, and no best algorithm,
then. You could, for instance, _assume_ the function is adversarial (trying to
make your life difficult), for which the best algorithm is perhaps just
randomly sampling the domain -- a really terrible algorithm. But that's of
course just another assumption, and a terrible one.

I would much prefer 'Typical function optimization', if unlabeled functions
are being optimized so frequently, or at least a name that doesn't try to hide
the inevitable assumptions.

TL;DR: The contest may be useful, but the concept of "Black box optimization"
is nonsense.

~~~
jpfr
Yes, there is such a thing.

There exist many more techniques than trivially assuming some "template"
function and fitting its parameters to the data.

Have a look at nonparametric modelling techniques, for example kernel
regression or Gaussian processes. You either don't make any assumptions, or
you take an uninformative prior that spreads probability over all possible
results.
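
For concreteness, a minimal Gaussian-process regression sketch (numpy only);
the RBF kernel and its length-scale below are illustrative choices, not
anything prescribed by the competition:

```python
# Minimal Gaussian process regression sketch (numpy only).
# The RBF kernel and its length-scale are modelling choices, i.e. assumptions.
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    # Squared-exponential kernel: k(a, b) = exp(-|a - b|^2 / (2 l^2))
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(X_train, y_train, X_test, noise=1e-6):
    # Posterior mean and variance of a zero-mean GP at the test points.
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_train, X_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - (v ** 2).sum(0)  # prior variance k(x, x) is 1 for this kernel
    return mean, var

# Toy usage: 5 observations of an unknown 1-D function.
X = np.linspace(-2, 2, 5).reshape(-1, 1)
y = np.sin(3 * X[:, 0])
mu, var = gp_posterior(X, y, np.array([[0.5]]))
```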

This competition involves modelling, optimisation _and_ the
exploration/exploitation tradeoff. I'm sure there will be very interesting
theory behind the winning entries...

~~~
darkmighty
The point is, I don't even need to look up your techniques (although I did,
out of respect) to know that there really is no such thing; what I stated is a
simple, almost trivial principle (apparently it has a name [1], as some have
pointed out).

Mathematics models data, and you can't model without assumptions. It's like
developing a theory which can't have axioms. For example, the probabilistic
model behind kernel regression is a terrible assumption, with very large
error, for a large class of distributions [2], and so on. We're talking about
picking the _best_ technique; that technique is going to pick some assumptions
arbitrarily that will or will not work well based on an unclear choice by the
organizers. That's why I would prefer if they instead said "Functions with
some real world relevance", or "Typical functions", or maybe "Poorly behaved
functions", and so on.

[1]
[http://en.wikipedia.org/wiki/No_free_lunch_in_search_and_optimization](http://en.wikipedia.org/wiki/No_free_lunch_in_search_and_optimization)

[2] On the Wikipedia page you can see that they do make assumptions about f,
to minimize the squared error, when choosing the kernel. It's inevitable.
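
For reference, the formal statement behind [1] (Wolpert and Macready) is,
roughly:

```latex
% No free lunch (Wolpert & Macready): for any pair of algorithms a_1, a_2
% and any number of evaluations m, summed over all objective functions f,
\sum_f P(d_m^y \mid f, m, a_1) = \sum_f P(d_m^y \mid f, m, a_2)
% where d_m^y is the sequence of m objective values the algorithm has seen:
% averaged uniformly over all functions, every algorithm performs identically.
```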

~~~
jpfr
You are fighting a mathematically pure interpretation of black boxes, one
where no assumptions are made at all. Your observations are correct, but
nobody actually interprets the term "black box" in the way you deem wrong.

Taken from here [1]:

White-box models: This is the case when a model is perfectly known; it has
been possible to construct it entirely from prior knowledge and physical
insight.

Grey-box models: This is the case when some physical insight is available, but
several parameters remain to be determined from observed data. It is useful to
consider two subcases.

1. Physical modeling: A model structure can be built on physical grounds,
which has a certain number of parameters to be estimated from data. This
could, for example, be a state-space model of given order and structure.

2. Semiphysical modeling: Physical insight is used to suggest certain
nonlinear combinations of measured data signals. These new signals are then
subjected to model structures of black-box character.

Black-box models: No physical insight is available or used, but the chosen
model structure belongs to families that are known to have good flexibility
and have been 'successful in the past'.

[1]
[http://www.sciencedirect.com/science/article/pii/0005109895001208](http://www.sciencedirect.com/science/article/pii/0005109895001208)

~~~
darkmighty
Fair enough. I wasn't familiar with the literature, to be honest; it was just
a remark.

I still dislike the term and concept, but it's hard to argue with a
conventional definition. I believe assumptions should be made as clear as
possible, and the term seems like a futile attempt at hiding them.

------
murbard2
It's a little strange that they do not have a track that gives gradient
information, given that gradients are often available in real world problems.
Also, this basically allows unlimited time between evals... So this becomes a
contest about:

1. coming up with a distribution over R^n -> R functions

2. finding the optimal evaluation points to do Bayesian updates

I predict the winner will use a mixture of Gaussian processes with various
kernels and stochastic control (with a limited look-ahead, otherwise it blows
up) to pick the test points.
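
Concretely, a minimal sketch of that kind of entry, with a greedy one-step
expected-improvement rule standing in for full stochastic control;
`black_box`, the dimension, and the budget are hypothetical placeholders:

```python
# Sketch: Bayesian optimization with a GP surrogate and expected improvement.
import numpy as np
from scipy.stats import norm

def rbf(A, B, ls=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def gp(Xt, yt, Xs, noise=1e-6):
    # Posterior mean and (diagonal) variance of a zero-mean GP.
    K = rbf(Xt, Xt) + noise * np.eye(len(Xt))
    Ks = rbf(Xt, Xs)
    mu = Ks.T @ np.linalg.solve(K, yt)
    var = 1.0 - (Ks * np.linalg.solve(K, Ks)).sum(0)
    return mu, np.maximum(var, 1e-12)

def black_box(x):                 # placeholder for the contest oracle (minimize)
    return float(((x - 0.3) ** 2).sum())

dim, budget = 2, 30
X = np.random.rand(5, dim)                     # initial random probes
y = np.array([black_box(x) for x in X])
for _ in range(budget - len(X)):
    cand = np.random.rand(2000, dim)           # candidates for the acquisition
    mu, var = gp(X, y, cand)
    sigma = np.sqrt(var)
    z = (y.min() - mu) / sigma
    ei = (y.min() - mu) * norm.cdf(z) + sigma * norm.pdf(z)  # expected improvement
    x_next = cand[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, black_box(x_next))
print("best:", y.min(), "at", X[np.argmin(y)])
```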

~~~
pilooch
You can compute the gradient, it just has a high budget cost.
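
(For instance, a forward-difference gradient in n dimensions costs n + 1
evaluations out of the budget. A minimal sketch, with `f` and `x` as
hypothetical placeholders:)

```python
import numpy as np

def fd_gradient(f, x, h=1e-6):
    # Forward differences: n + 1 calls to f, each charged to the eval budget.
    x = np.asarray(x, dtype=float)
    f0 = f(x)
    g = np.empty_like(x)
    for i in range(len(x)):
        xp = x.copy()
        xp[i] += h
        g[i] = (f(xp) - f0) / h
    return g
```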

The usual winner is a flavor of CMA-ES, though they may have picked the
functions to avoid this.

~~~
murbard2
You're missing my point. In many real world problems, it is cheap to compute
the gradient. Thus, black box optimization methods which can use gradient
information are inherently valuable, and it is surprising that there is no
track that would allow showcasing them.

~~~
nullc
In a great many real world problems, including most of the most expensive
ones, gradients are _not_ available, or are only expensively computable...
Even if your objective is differentiable, automatic differentiation isn't
cheap on non-trivial functions.

Experiences differ, but in mine the most common place to find objectives with
gradients is in optimizer challenges.

That said: sure, I agree it would be nice if there were another track, one
that gives you the gradients.

------
obstinate
Seems really interesting. Too mathy for my skillset.

If I may, I propose that the organizers remove the restriction on
disassembling the client library or intercepting network connections. This
restriction seems like it cannot benefit the organizers, unless the protocol
is insecure. People are going to ignore this rule anyway, and you can't stop
them or even detect them doing it. So why put it in there? It's only going to
generate ill will.

~~~
rer0tsaz
It's probably insecure, because you don't want to do 75,850,000 sequential
evaluations over a network. It would take over a week for a single track with
even just 10ms response time.
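
(Checking the arithmetic: 75,850,000 evaluations × 10 ms ≈ 758,500 s, which is
about 8.8 days of round-trip latency alone.)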

~~~
obstinate
Presumably a secure implementation would just batch up evaluations into groups
of ten or a hundred. You could make something like this work.
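
(At 10 ms per request, batches of 100 would cut that latency overhead to
roughly 7,600 s, about two hours; the catch is that a strictly sequential
optimizer would then have to propose each batch of points without seeing the
intermediate results.)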

------
darklajid
Whoa. The servers for this competition are about 8km away. That's the most
'local' content I've ever seen on HN.

Unfortunately I have to agree with obstinate here. The pure math is too much
for me and reverse engineering (still daunting, but interesting/possible) is
not acceptable. If any HN person wins this contest, I offer beers close to the
black box :)

~~~
pilooch
Shameless plug: anyone interested should be able to get baseline results and
above easily by using libcmaes. I am one of the authors; I have no time to
compete, but I am interested in reports on how it goes. Also, if you are a
researcher or a student, the lib should let you experiment easily with various
custom strategies.

[https://github.com/beniz/libcmaes](https://github.com/beniz/libcmaes)
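
If you'd rather prototype in Python before touching the C++ API, the same
algorithm family is also available in Nikolaus Hansen's `cma` package on PyPI
(this is not libcmaes itself). A minimal sketch on a toy sphere objective:

```python
import cma

def sphere(x):                      # toy objective; stand-in for the contest oracle
    return sum(xi * xi for xi in x)

# x0 in R^8, initial step size 0.3; fmin returns (xbest, fbest, ...) first.
res = cma.fmin(sphere, 8 * [0.5], 0.3, options={'maxfevals': 1000})
print(res[0], res[1])
```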

------
cshimmin
Wish I had seen something about this sooner. The competition began in January
and ends on the 30th of this month.

~~~
nullc
That was my thought too! But instead of even clicking on the HN comments I
went and wrote a contestant. Within a couple of hours all my runs will have
completed; assuming no big power failure I'll make the deadline! :)

(Uh, no doubt I won't do well, since I had no time to ... like.. actually test
my code on any functions except a couple trivial trials. :P ... I hope they
put up some kind of ranking information as soon as it closes. I have no idea
if my results are awful or merely bad :) (and I probably shouldn't share best
numbers before it closes) )

------
ramgorur
1. You do not know what the function looks like, and there is no gradient
information.

2. You have a fixed number of probes, M.

3. Out of M, you have N probes to get the silhouette of the function
(exploration).

4. Then, within the remaining (M - N) trials, you need to find the optimum
(exploitation).

Sounds more like a pseudo-science than a math problem to me.
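
For what it's worth, a minimal sketch of exactly that recipe, with
`black_box`, `M` and `N` as hypothetical placeholders: random exploration
followed by a budget-capped, gradient-free local search.

```python
# Explore-then-exploit: N random probes, then M - N on local refinement.
import numpy as np
from scipy.optimize import minimize

def black_box(x):                 # placeholder for the unknown objective
    return float(((x - 0.7) ** 2).sum())

dim, M, N = 5, 200, 50
X = np.random.rand(N, dim)                      # exploration: N random probes
fX = np.array([black_box(x) for x in X])
x0 = X[np.argmin(fX)]                           # best point of the silhouette
# Exploitation: gradient-free local search, capped at the leftover budget.
res = minimize(black_box, x0, method='Nelder-Mead',
               options={'maxfev': M - N})
print(res.x, res.fun)
```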

~~~
Houshalter
Huh? Who said it was a math problem? And pseudoscience? Most real world
optimization problems are like this: often you don't get gradient information
or unlimited trials.

The point of the task is to reward methods that work efficiently with limited
trials and domain information, rather than whoever can run hillclimbing on the
biggest computer or hand-tune the parameters best.

