
Non-Convex Optimization for Machine Learning - jonbaer
https://arxiv.org/abs/1712.07897
======
Xcelerate
From the preface:

> Put a bit more dramatically, [this monograph] will seek to show how problems
> that were once avoided, having been shown to be NP-hard to solve, now have
> solvers that operate in near-linear time, by carefully analyzing and
> exploiting additional task structure!

This is something I've noticed in my own research on inverse problems (signal
recovery over the action of compact groups). And it's really quite
mind-blowing. It means you can _randomly_ generate problem instances, and
these will be NP-hard to solve. However, if the problem is _not_ randomly
generated (i.e., there is some regularity in the generative process that
produced the data), there often appears to be inherent structure that can be
exploited to solve the problem quickly to its global optimum.

I feel like future research will focus on finding the line that divides the
"tractable" problems from the "intractable" ones.

~~~
gabrielgoh
A simple example of this that has been shown rigorously is compressed
sensing. Finding the sparsest vector subject to the linear constraints
Ax = b is NP-hard for general matrices, but it is solvable in polynomial
time if A satisfies the restricted isometry property (RIP), which holds with
high probability if, e.g., each entry of A is an i.i.d. Gaussian sample.
Quite surprising!
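
To make this concrete, here is a minimal sketch (assuming numpy/scipy; the
problem sizes are arbitrary) that recovers a sparse vector from Gaussian
measurements by casting the L1 relaxation, basis pursuit, as a linear
program via the split x = u - v with u, v >= 0:

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    n, m, s = 200, 80, 5                  # ambient dim, measurements, sparsity

    # Gaussian sensing matrix: satisfies the RIP w.h.p. at this scaling
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    x_true = np.zeros(n)
    x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
    b = A @ x_true

    # Basis pursuit: min ||x||_1  s.t.  Ax = b, as an LP in (u, v)
    c = np.ones(2 * n)                    # sum(u) + sum(v) = ||x||_1 at optimum
    A_eq = np.hstack([A, -A])             # A(u - v) = b
    res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None))
    x_hat = res.x[:n] - res.x[n:]
    print("recovery error:", np.linalg.norm(x_hat - x_true))

With these dimensions the error is typically near machine precision; shrink
m far enough and recovery breaks down.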

~~~
mturmon
Another example (that may be related to yours) is the general linear
programming problem: for a vector x,

    max  c^T x
    s.t. A x <= b   [component-wise]

If problem instances (A, b) are drawn at random from a rotationally
symmetric distribution, Borgwardt and others showed that, with high
probability, the number of pivot steps taken by the simplex method is
bounded by a polynomial in the number of dimensions. On the other hand,
there are explicit constructions of (A, b), such as the Klee-Minty cube,
that force the simplex method through an exponential number of steps.

The usual interpretation is that "most" problems are friendly to the
standard simplex approach, but a few are not.
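
A small experiment in that spirit, assuming numpy/scipy (the instance sizes
are arbitrary, and "highs-ds" is scipy's dual-simplex backend): draw a
rotationally symmetric random instance and look at the pivot count.

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(1)
    n, m = 10, 200                       # variables, constraints

    # Gaussian rows make the constraint set rotationally symmetric;
    # b = 1 keeps x = 0 feasible, and m >> n makes it bounded w.h.p.
    A = rng.standard_normal((m, n))
    b = np.ones(m)
    c = rng.standard_normal(n)

    res = linprog(-c, A_ub=A, b_ub=b, method="highs-ds")  # linprog minimizes
    print("pivots:", res.nit)

On random instances like this the pivot count stays modest; it is the
Klee-Minty-style constructions that force the exponential behavior.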

------
purushot
(Disclaimer: I am an author of the monograph being discussed)

The authors are very glad about the discussion this monograph has generated
and would be very happy if it leads to more research and advances in the
foundations and applications of non-convex optimization. Solvers for various
problems that arise in machine learning and signal processing applications
have seen great improvement in recent years. This includes MILP solvers,
which are very useful for scheduling and other problems, as well as
LASSO-style solvers. This monograph was an attempt to present a concise and
lucid introduction to another set of very useful and highly successful
algorithmic techniques for machine learning problems.

As one comment below mentions, the scope of non-convex problems is quite
vast, as it includes pretty much any problem that is "not convex". However,
the non-convexity that arises in machine learning and related areas is often
very structured, and this monograph was an attempt to collect our collective
knowledge of how to address these specific instances of non-convexity and
develop extremely scalable solvers for them. A key component of this effort
turns out to be a general avoidance of "relaxations", a common trick used to
convert non-convex problems into convex ones so that well-understood
techniques can be applied (LASSO-style techniques fall into this category).

Frequently, these relaxation-based approaches do not scale well. In this
monograph we present several algorithmic techniques that avoid relaxations
for this reason. These techniques are very intuitive and offer excellent
performance not just in practice (they are much faster than relaxation-based
techniques) but provably so (the monograph does indulge in formal analyses
of the algorithms). This theoretical line of work dates back at least a
decade, to the initial advances in compressive sensing due to the seminal
works of Candès, Romberg, and Tao. Those initial results were later shown to
be part of a much more general framework, which this monograph also seeks to
present through applications of non-convex optimization techniques to sparse
recovery, matrix completion, latent variable models, and robust learning.
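
To give a flavor of these relaxation-free methods: iterative hard
thresholding (IHT), one of the simplest techniques in this family, takes a
gradient step on ||Ax - b||^2 and then projects directly onto the non-convex
set of s-sparse vectors. A minimal numpy sketch (dimensions arbitrary):

    import numpy as np

    def iht(A, b, s, step=1.0, iters=200):
        # gradient step on ||Ax - b||^2, then keep the s largest entries:
        # a projection onto the non-convex set {x : ||x||_0 <= s}
        x = np.zeros(A.shape[1])
        for _ in range(iters):
            x = x + step * A.T @ (b - A @ x)
            x[np.argsort(np.abs(x))[:-s]] = 0.0
        return x

    rng = np.random.default_rng(2)
    n, m, s = 400, 120, 10
    A = rng.standard_normal((m, n)) / np.sqrt(m)   # RIP w.h.p. at this scaling
    x_true = np.zeros(n)
    x_true[rng.choice(n, s, replace=False)] = 1.0
    x_hat = iht(A, A @ x_true, s)
    print("error:", np.linalg.norm(x_hat - x_true))

Under an RIP-style condition the iterates typically converge linearly, and
each iteration costs only two matrix-vector products; no convex surrogate is
ever formed.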

We welcome comments and suggestions from all. This monograph is very much a
work in progress, especially given the fast pace at which machine learning
is moving forward, and we wish to keep the text relevant to the community in
the future by making updates and additions.

------
vladislav
Perhaps it's unavoidable, but there are large swaths of relevant literature
missing from this survey. In particular, there is not a single mention of
the non-convex Burer-Monteiro method for solving semidefinite programs, a
pretty widely applicable technique.
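
For context: Burer-Monteiro replaces the PSD matrix variable X in an SDP
with a low-rank factorization X = UU^T and optimizes over U directly, which
is non-convex but far cheaper. A minimal sketch for the max-cut SDP
(max <L, X> s.t. diag(X) = 1, X PSD), assuming numpy and a toy random graph:

    import numpy as np

    def burer_monteiro_maxcut(L, k=10, step=0.1, iters=500, seed=0):
        # X = U U^T with unit-norm rows enforces diag(X) = 1 and X PSD;
        # projected gradient ascent on f(U) = <L, U U^T> (gradient 2 L U)
        rng = np.random.default_rng(seed)
        U = rng.standard_normal((L.shape[0], k))
        U /= np.linalg.norm(U, axis=1, keepdims=True)
        for _ in range(iters):
            U += step * (L @ U)
            U /= np.linalg.norm(U, axis=1, keepdims=True)
        return U

    rng = np.random.default_rng(3)
    W = np.triu((rng.random((30, 30)) < 0.2).astype(float), 1)
    W = W + W.T                                    # random graph
    L = np.diag(W.sum(axis=1)) - W                 # its Laplacian
    U = burer_monteiro_maxcut(L)
    print("SDP objective <L, UU^T>:", np.trace(U.T @ L @ U))

For k roughly sqrt(2n) or larger, results of Boumal, Voroninski, and
Bandeira show that for generic instances the factorized problem has no
spurious second-order critical points.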

------
fmap
I wonder how the techniques in this monograph stack up against optimization
techniques on manifolds
([https://press.princeton.edu/absil](https://press.princeton.edu/absil)).
Projected gradient descent seems like an approximation to steepest descent on
a suitable manifold, so I would expect conjugate gradient or Newton methods to
perform better in practice.
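
As a concrete toy comparison (assuming numpy; the test matrix is arbitrary),
projected gradient ascent on the unit sphere for the leading-eigenvector
problem max x^T M x looks like this; a Riemannian method would instead step
in the tangent space and retract, which agrees with the projection to first
order:

    import numpy as np

    rng = np.random.default_rng(4)
    M = rng.standard_normal((50, 50))
    M = (M + M.T) / 2                    # symmetric test matrix

    x = rng.standard_normal(50)
    x /= np.linalg.norm(x)               # start on the unit sphere
    for _ in range(500):
        x = x + 0.1 * (M @ x)            # Euclidean gradient step
        x /= np.linalg.norm(x)           # project back onto the sphere

    print("Rayleigh quotient:", x @ M @ x)
    print("top eigenvalue:   ", np.linalg.eigvalsh(M)[-1])

Manifold conjugate-gradient or Newton methods (as in Absil et al.) replace
the plain gradient step but keep the same project/retract structure.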

------
XnoiVeX
Why is this getting so few comments?

~~~
govg
It's a monograph that is pretty dense, with non-trivial math. Maybe there
aren't many on HN who fully understand the material or have strong opinions
about it.

~~~
kleiba
Also, it's not clear to me why this entry makes it to the front page over
the tons of other research papers and monographs. Its significance does not
stand out to me, i.e., I cannot grasp why this is something I should look at
instead of many other interesting reads from the field.

~~~
jpfr
This was just released as part of the "Foundations and Trends in Machine
Learning" series from now Publishers. They usually publish superb reference
material: research that is foundational and/or has become a stable body of
work, but is recent enough to be interesting.

Read these and you are at the edge of what is currently happening.

[http://www.nowpublishers.com/MAL](http://www.nowpublishers.com/MAL)

