
Using mixed integer programming to assign air cargo to flights - mattmarcus
https://flexport.engineering/using-mixed-integer-programming-to-assign-air-cargo-to-flights-42437bae9945
======
dunkelheit
If anyone is interested in learning more about MIP and other discrete
optimization topics, there is a great MOOC on Coursera:
[https://www.coursera.org/learn/discrete-optimization/](https://www.coursera.org/learn/discrete-optimization/).
The lectures are quite engaging, delivered by the enthusiastic and slightly
eccentric Prof. Van Hentenryck. Also, there are no stupid quizzes, just 5
problems (with 6-8 instances each, ranging from comparatively small to huge)
that you need to solve (in any programming language) in order to pass. The
only drawback is the significant investment of time it requires.

~~~
yoshimiagava
@dunkelheit, thank you for the link to the course. I'll give it a try!

------
edejong
In my experience, each sufficiently complicated model becomes non-linear in
some respect. For example, you might want to work with margins for time-slots
that are non-linear but smooth.

So, even though I've worked with constrained linear programming in the past, I
tend to prefer algorithms with meta-heuristics, such as simulated annealing or
Tabu search. Although this might not provide the 'best' solution, it provides
a wider range of modelling tools.

To elaborate a bit with a use case: let's say we want to plan a high-school
roster. Teachers might have a maximum of 8 hours per day of work, but if we
make a hard cut-off at that time, we might miss an interesting 8:15 schedule
that gives a teacher more time to have proper lunch. If, furthermore, we do
not break the 40 hr./week rule, it might be a workable schedule. Also, it
provides the search algorithm a usable gradient, so the search space becomes
smooth and easier to navigate. Finally, if we provide several top solutions,
we give a human planner information on how the problem is (over-)constrained.

~~~
zelos
Could you handle your example in MIP through the objective function, making
time outside normal hours expensive, but balancing that with a positive value
for lunchtime? Possibly with an integer value limiting the number of days
where normal hours can be violated?

~~~
edejong
In addition to the remarks of the sibling comment, we don't always have a
well-understood objective function. For example, how much is lunchtime worth
compared with time outside normal hours? What we can say is that there is a
certain 'badness' to it, which should increase exponentially or polynomially
as we tread further outside of our preferred domain. These are known as soft
constraints.
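
To make the 'badness' idea concrete, here is a minimal sketch of a polynomial soft-constraint penalty for the 8-hour example. The function name, the quadratic exponent and the weight are illustrative assumptions, not taken from any real scheduler.

```python
def overtime_badness(hours_worked, preferred_max=8.0, weight=10.0, power=2):
    """Soft-constraint penalty: zero inside the preferred domain, growing
    polynomially with the distance outside it. All defaults are hypothetical."""
    excess = max(0.0, hours_worked - preferred_max)
    return weight * excess ** power


# A day of exactly 8 hours costs nothing, 8:15 costs a little, 10 hours costs a lot.
for hours in (8.0, 8.25, 10.0):
    print(hours, overtime_badness(hours))
```

A local-search method can simply add this value to its cost function, whereas a MIP model would need a piecewise-linear approximation of the same curve.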

A fundamental problem with soft constraints in MIP is that we cannot create
cuts in the conflict graph. Technically, everything conflicts with everything
else, but at very high badness. So, we then have to decompose the problem in a
preconceived way, such as on geographical boundaries, time boundaries or using
heuristics. This engineering can be challenging, especially given that MIP is
often hard to debug.

So, like the sibling comment: I prefer meta-heuristics over constraint logic
programming in many cases, but I do not deny that CLP/MIP can be very useful
as well.

~~~
7thaccount
Aren't CLP/MIP very different mathematically/programmatically?

I know they are more similar to each other than to meta-heuristics, but I'm
not sure why.

Assuming your linearization is good, at least you'll get a global optimum and
the MIP gap. With meta-heuristics you have no clue where you could end up,
right?
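
For readers unfamiliar with the term, the MIP gap compares the best feasible solution found so far with the best proven bound on the optimum. The sketch below uses one common relative definition; solvers differ in the exact formula, so treat the function as an assumption rather than any particular solver's behavior.

```python
def relative_mip_gap(incumbent, best_bound):
    """Relative gap between the best feasible objective found so far
    (incumbent) and the best proven bound on the optimal objective."""
    if incumbent == 0:
        return 0.0 if best_bound == 0 else float("inf")
    return abs(best_bound - incumbent) / abs(incumbent)


# Minimization example with made-up numbers: a feasible plan costing 105
# and a proven lower bound of 100.
gap = relative_mip_gap(incumbent=105.0, best_bound=100.0)
print(f"{gap:.4f}")
```

A gap of zero is a proof of optimality, which is the guarantee a meta-heuristic cannot give on its own.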

------
andrew771
Hi all, author here. I'm very humbled that this made it to the front page! A
few things I want to say:

1) Most of this work is a simplified copy of the papers I linked to. Special
thanks to James Bookbinder and his team at University of Waterloo.

2) I'm an OR novice, and this is my first optimization project. I feel now
that I can roughly model the flight assignment domain, but what the solvers do
is a mystery to me, and when I read about Lagrangian relaxation and column
generation, I'm lost. Fortunately I haven't needed those techniques in this
project.

3) The mathematical model, when written down, looks mystifying even to me, the
author, and that's unfortunate. My goal in writing this post was to reduce the
mystery and explain the model in plain English (plus math notation), but in
the end I'm afraid it still looks eye-glazingly complicated. Data science can
be all too happy to cloak itself in mystery by writing down what are actually
pretty basic equations. I don't have a good solution to this.

Most of all, I got such joy out of building the model one constraint at a time
and seeing the solver follow my directions and spit out optimal solutions. I
couldn't believe it worked. It was like I discovered electricity. Lastly, much
credit goes to Flexport leadership for allowing me, an OR novice, to embark on
this optimization project.

~~~
LolWolf
Heya! Very nice and congratulations on the front page :)

On (2), that's the great part about solvers: they're essentially quite
incredible black boxes. Most people who actually do optimization theory also
don't really know how they work. (My money is on black magic for a number of
cases.) Kidding aside, writing one is always informative and interesting (and,
with languages like Julia, surprisingly not complicated).

On (3), my suspicion is that matrix notation would actually improve the end
result quite a bit. A lot of the constraints you've written down have very
common structure ( _e.g._ , the total weight constraint can be written as
y = X'g, where g is the vector of weights, or the required assignment
constraint, X1 = 1, where 1 is the all-ones vector).

Writing this out and the corresponding descriptions next to it would make it
much easier to parse. E.g., for the above cases

    
    
      min tr(CX) + c'y
      s.t. X1 = 1    (all items need to be shipped)
           X'g = y   (y is the total weight on each ULD)
           etc.
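
As a quick sanity check of the two matrix constraints above, here is a small plain-Python sketch. The data is made up, and the layout of X (items as rows, ULDs as columns) is an assumption inferred from X1 = 1 and X'g = y.

```python
# Hypothetical data: 4 items, 2 ULDs. X[i][j] == 1 means item i is loaded on ULD j.
X = [
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
]
g = [120.0, 80.0, 45.5, 200.0]  # item weights (made-up numbers)

# X1 = 1: every row sums to one, i.e. each item is shipped on exactly one ULD.
row_sums = [sum(row) for row in X]
assert all(s == 1 for s in row_sums)

# y = X'g: total weight loaded on each ULD.
n_items, n_ulds = len(X), len(X[0])
y = [sum(X[i][j] * g[i] for i in range(n_items)) for j in range(n_ulds)]
print(y)
```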
    

Congrats again on the front page and the article!

------
philzook
I love MIP. It's super useful.

I like to think that MIP is a framework that meets halfway between a natural
modeling language and a natural solution language. The reason it's good is
that the framework naturally lends itself to abusing linear programming as a
subroutine, which can cut out a large amount of search state space. And you
tend to find you can encode many problems without too much difficulty with
just a handful of tricks.

You can encode a great deal into MIP: piecewise-linearized versions of
nonlinear problems, or logical constraints.

Some interesting applications I've seen or played with:

\- Verifying neural nets
[https://github.com/vtjeng/MIPVerify.jl](https://github.com/vtjeng/MIPVerify.jl)

\- Solving the classical Ising model
[http://www.philipzucker.com/solving-the-ising-model-using-a-...](http://www.philipzucker.com/solving-the-ising-model-using-a-mixed-integer-linear-program-solver-gurobi/)

\- Global robot arm kinematics
[http://www.philipzucker.com/2d-robot-arm-inverse-kinematics-...](http://www.philipzucker.com/2d-robot-arm-inverse-kinematics-using-mixed-integer-programming-in-cvxpy/)

\- Model predictive control with collision constraints
[http://www.philipzucker.com/flappy-bird-as-a-mixed-integer-p...](http://www.philipzucker.com/flappy-bird-as-a-mixed-integer-program/).
See also Russ Tedrake's group:
[https://groups.csail.mit.edu/locomotion/pubs.shtml](https://groups.csail.mit.edu/locomotion/pubs.shtml)

------
kragen
MIP is NP-complete; you can reduce any problem in NP to it. But there is a
large and interesting set of MIP problems where the continuous relaxation to
ordinary linear optimization ("linear programming" in the jargon, although
that's as misleading a term as "analog computer" or "military intelligence")
gives you enough interesting information about the MIP problem to solve it
enormously more efficiently than you can with a generic SAT solver.

One of the notes in Dercuano is an overview I put together of the field last
year:
[http://canonical.org/~kragen/dercuano-20191230.tar.gz](http://canonical.org/~kragen/dercuano-20191230.tar.gz)
file notes/linear-optimization-landscape.html. It seems that the best free-
software solver is COIN-OR CBC, and the best free-software modeling language
is GMPL, aka GNU MathProg, which is compatible with the popular proprietary
linear-optimization modeling language AMPL, and includes its own somewhat
weaker solver GLPK; it's much easier to get GLPK solving a problem than CBC,
but the relevant incantations are in the note. But some of the proprietary
solvers are much better; most of them are available on the NEOS Server.

I didn't evaluate embedded DSLs like Pyomo in any depth, unfortunately.

I'm somewhat disappointed with the Flexport post, which I feel is badly
formatted and uses needlessly obscure notation and then never gets around to
actually writing down a model in MathProg or Pyomo or anything similar. But I
guess my own note is only a little better, and the Flexport article at least
gives a MIP model of a nontrivial problem. (There are many more such examples
in the GNU MathProg distribution, and MIPLIB has a wealth of extremely
nontrivial ones.)

Mathematical optimization in general (optimization in the sense of minimizing
a possibly constrained function, not in the sense of making code run faster)
amounts to programming at a higher level; I think it was Norvig that described
it as "the ultimate in agile software development", because you basically just
write the tests. And linear optimization is the best-developed subfield of
mathematical optimization: linear solvers can solve enormously larger problems
than more general solvers.

(There's also an inferior PDF rendering of Dercuano for cellphones that can't
handle tarballs:
[http://canonical.org/~kragen/dercuano.20191230.pdf](http://canonical.org/~kragen/dercuano.20191230.pdf)
)

~~~
leethargo
Embedded DSLs in Python sometimes have a lot of overhead because of the way
large expressions are built with operator overloading (x + y + z + ...),
creating lots of intermediate objects.
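
A toy illustration of that overhead, using a hypothetical `LinExpr` class written for this example (it is not the API of any real modeling library): summing n terms with `+` builds a fresh, ever-larger expression at each step, while accumulating in place reuses one object.

```python
class LinExpr:
    """Toy linear expression: a dict mapping variable name -> coefficient."""

    created = 0  # counts how many expression objects have been built

    def __init__(self, coeffs=None):
        self.coeffs = dict(coeffs or {})
        LinExpr.created += 1

    def __add__(self, other):
        # Naive overloading: every '+' copies the left operand into a new object.
        result = LinExpr(self.coeffs)
        for name, coef in other.coeffs.items():
            result.coeffs[name] = result.coeffs.get(name, 0.0) + coef
        return result

    def add_in_place(self, other):
        # Accumulating into one object avoids the intermediate copies.
        for name, coef in other.coeffs.items():
            self.coeffs[name] = self.coeffs.get(name, 0.0) + coef
        return self


def count_objects(n, in_place):
    """Sum n single-variable terms; return (objects created, terms in result)."""
    terms = [LinExpr({f"x{i}": 1.0}) for i in range(n)]
    LinExpr.created = 0  # only count objects built during the summation
    total = LinExpr()
    for t in terms:
        total = total.add_in_place(t) if in_place else total + t
    return LinExpr.created, len(total.coeffs)


print(count_objects(1000, in_place=False))
print(count_objects(1000, in_place=True))
```

Because each copy in the naive version grows with the expression, the total work is quadratic in the number of terms, not just a constant-factor dispatch cost.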

But there are solver-specific alternatives, such as for Gurobi or MOSEK that
provide good performance, as well as vector-oriented modeling such as cvxpy.

Myself, I'm partial to Julia's JuMP as an embedded DSL that has good
performance, a general purpose language and "nice" syntax, that is comparable
to algebraic modeling languages.

~~~
kragen
> _Embedded DSLs in Python sometimes have a lot of overhead because of the way
> large expressions are built with operator overloading (x + y + z + ...),
> creating lots of intermediate objects._

Is this really a concern in practice? My mental model is that the algebraic
model is, you know, half a page to 10 pages of code without any loops or
recursion, and you evaluate it to get an MPS file, and you feed that MPS file
to your solver, which then chews on it for the next 15 seconds to 15 days.
It's hard to imagine that Python's dynamic dispatch overhead would contribute
more than a few milliseconds to this multi-day process. What am I missing?

~~~
mochomocha
There are low-latency applications of MIPs where such overheads matter.
Sometimes one needs to solve small MIPs very often under strong latency
requirements.

~~~
anonsivalley652
There's always rewriting in other languages or transpilation. If a framework
is so crucial, not just for prototyping, perhaps the runtime components (as
opposed to model building) could be ported to something like C, C++ or Rust.

~~~
leethargo
True. For some classes of problems, code generation is actually used here:
[https://cvxgen.com/docs/index.html](https://cvxgen.com/docs/index.html)

------
comment_guy
It's pretty cool that these guys get to put 1950s math to work in real life.
In my experience, I was never able to sell this idea to anyone, because of the
(potentially justified) objection that it would be completely unmaintainable
without a math guy around.

Edit: I can personally attest that potential bootstrappers would find fertile
ground making a service to do this stuff for transportation companies. I know
of many who basically don't try to solve this problem because they don't have
the talent and don't trust their ability to maintain something bought from a
consultant. They need a service they can offload it to.

~~~
kragen
Many of the fundamental algorithms used were developed during the 1980s and
1990s, as were modeling languages like AMPL. So it's not 1950s math, even if
the "simplex algorithm" was discovered in 1947.

It occurs to me that such a "service" could to a significant extent pick
winners and losers among transportation companies; whoever they chose to plan
for would have lower costs by several percent, and so would have profits
several times higher than the competition.

------
4er
We're seeing here the debate over model-based vs. method-based approaches to
optimization. My argument for the model-based approach is reflected in these
two sets of slides:

[https://ampl.com/MEETINGS/TALKS/2019_09_Bolzano_Model-Based....](https://ampl.com/MEETINGS/TALKS/2019_09_Bolzano_Model-Based.pdf)
[https://ampl.com/MEETINGS/TALKS/2018_08_Lille_Tutorial1.pdf](https://ampl.com/MEETINGS/TALKS/2018_08_Lille_Tutorial1.pdf)

A recording of the first presentation will be available soon. The first uses a
simpler example, but both were prepared for audiences not committed to the
model-based approach. Their purpose was to familiarize people with model-based
optimization and the circumstances in which it has clear advantages.

------
kristianp
They did a poor job of explaining what Integer Programming is. Here's the
wikipedia intro:

"An integer programming problem is a mathematical optimization or feasibility
program in which some or all of the variables are restricted to be integers."

[https://en.wikipedia.org/wiki/Integer_programming](https://en.wikipedia.org/wiki/Integer_programming)

------
siilats
CVX solves the relaxed problem trivially in MATLAB, and then you can
heuristically round to integers. See
[http://stanford.edu/class/ee364b/lectures.html](http://stanford.edu/class/ee364b/lectures.html),
under "L1 convex cardinality".

~~~
ulucs
If I may quote from the Integer Programming chapter in Vohra's book, "Before
proceeding you should convince yourself that no ‘simple’ scheme based on
solving the underlying linear program and rounding the resulting solution can
find the optimal integer solution."

~~~
LolWolf
Depends on the problem (in max-flow, min-cut the trivial rounding scheme gets
you the optimal point immediately :).

Kidding aside, generally this is very true, but for a surprising number of
practical problems, the LP relaxation and some redundant constraints will
often have zero integrality gap. (In many other problems, you’re hosed, so
there’s also that.)

EDIT: For context, this is how LDPCs are decoded in robust cases.
[https://people.eecs.berkeley.edu/~wainwrig/Papers/FelWaiKar0...](https://people.eecs.berkeley.edu/~wainwrig/Papers/FelWaiKar05.pdf)

~~~
kxyvr
I suppose we could also note that a totally unimodular constraint matrix
spares us the trouble of going through branch and bound, but that's a pretty
niche case.
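
For tiny matrices, total unimodularity can be checked by brute force straight from the definition: every square submatrix must have determinant -1, 0 or 1. The sketch below is exponential in the matrix size and is meant only as an illustration; the two example matrices are my own.

```python
from itertools import combinations


def det(m):
    """Integer determinant by Laplace expansion (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def is_totally_unimodular(a):
    """Brute force: every square submatrix must have determinant -1, 0 or 1."""
    n_rows, n_cols = len(a), len(a[0])
    for k in range(1, min(n_rows, n_cols) + 1):
        for rows in combinations(range(n_rows), k):
            for cols in combinations(range(n_cols), k):
                sub = [[a[r][c] for c in cols] for r in rows]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


# Constraint matrix of a 2x2 assignment problem
# (rows: 2 workers then 2 jobs; columns: the 4 worker-job pairs).
assignment = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
# Edge-vertex incidence matrix of a triangle (an odd cycle); its determinant is 2.
odd_cycle = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]
print(is_totally_unimodular(assignment), is_totally_unimodular(odd_cycle))
```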

I guess I react strongly whenever I hear this kind of sentiment because the
result of rounding the continuous solution can be arbitrarily bad, but now we
have a false sense of confidence that we did some kind of optimization, when
we really didn't. My experience has been that rounding continuous solutions to
integer gives some pretty interesting and incredibly bad solutions.

For posterity's sake, Wolsey has a good example of how things can go poorly in
the introduction of the book Integer Programming:

    
    
      max 1 x1 + 0.64 x2
      st  50 x1 + 31 x2 <= 250
          3 x1 - 2 x2 >= -4
          x1, x2 >=0
          x1, x2 integer
    

The linear programming solution is (376/193,950/193), which is approximately
(1.9482,4.9223). The integer optimal solution is (5,0), which is far away.
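
The example is small enough to verify by enumeration. The sketch below checks every integer point, and also checks what happens to the four points obtained by rounding each coordinate of the LP optimum up or down. Exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction


def feasible(x1, x2):
    return (50 * x1 + 31 * x2 <= 250 and 3 * x1 - 2 * x2 >= -4
            and x1 >= 0 and x2 >= 0)


def objective(x1, x2):
    return x1 + Fraction(64, 100) * x2  # 0.64 as an exact fraction


# The LP optimum sits where both constraints are tight.
lp = (Fraction(376, 193), Fraction(950, 193))
assert 50 * lp[0] + 31 * lp[1] == 250 and 3 * lp[0] - 2 * lp[1] == -4

# The first constraint implies x1 <= 5 and x2 <= 8, so this grid covers
# every feasible integer point.
points = [(x1, x2) for x1 in range(6) for x2 in range(9) if feasible(x1, x2)]
best = max(points, key=lambda p: objective(*p))

# Round each coordinate of the LP optimum down or up and keep the feasible ones.
rounded = [(x1, x2) for x1 in (1, 2) for x2 in (4, 5)]
rounded_feasible = [p for p in rounded if feasible(*p)]

print("integer optimum:", best, float(objective(*best)))
print("feasible roundings:", rounded_feasible)
```

Only one of the four roundings is feasible at all, and its objective value is below that of the true integer optimum.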

I'll also contend that the integer programming solvers nowadays are _really_
good. If we can get away with rounding the solution, then the solver will
find the solution in only a few iterations because it often does precisely
that in order to go through the branch, bound, and cut algorithm. The
difference is that a good solver can get a certificate of optimality when it
finds the solution, so that we know we're right.

------
logicchains
Is a specialized MIP solver mandatory, or would using a general SMT solver
like Z3 usually be good enough?

~~~
mattz0rt
In my experience, SMT solvers tend to be an order of magnitude slower than MIP
solvers, even when looking at the exact same constraints. But yes, given a
small enough problem, SMT solvers like Z3 are an excellent choice and provide
additional functionality like propositional logic (if A then B else C).

------
aliceryhl
Oh lucky you, your optimization problem is linear. I do something similar, but
we have to use heuristics because our problem can't be modelled as a linear
problem.

~~~
kragen
See, that's what I thought before I audited an operations research class, but
it turns out that the reduction from SAT to ILP is trivial, so there are no
problems in NP that can't be modeled as _integer_ linear optimization
problems.
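
One standard way to do this (not necessarily the exact reduction meant here) uses one binary variable per SAT variable and one linear inequality per clause: a positive literal x contributes x, a negated literal contributes (1 - x), and the sum must be at least 1. The helper names below are made up for the sketch, which checks the encoding against direct clause evaluation on a small formula.

```python
from itertools import product


def clause_to_inequality(clause):
    """Turn a CNF clause into (coeffs, rhs) meaning sum(coeffs[v] * x[v]) >= rhs.
    Literals are non-zero ints, DIMACS style: 3 means x3, -3 means NOT x3."""
    coeffs, rhs = {}, 1
    for lit in clause:
        var = abs(lit)
        if lit > 0:
            coeffs[var] = coeffs.get(var, 0) + 1
        else:
            coeffs[var] = coeffs.get(var, 0) - 1
            rhs -= 1  # move the constant 1 from (1 - x_v) to the right-hand side
    return coeffs, rhs


def clause_holds(clause, assignment):
    return any((assignment[abs(lit)] == 1) == (lit > 0) for lit in clause)


def inequality_holds(coeffs, rhs, assignment):
    return sum(c * assignment[v] for v, c in coeffs.items()) >= rhs


# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
cnf = [[1, -2], [2, 3], [-1, -3]]
for bits in product((0, 1), repeat=3):
    assignment = dict(zip((1, 2, 3), bits))
    sat = all(clause_holds(c, assignment) for c in cnf)
    ilp = all(inequality_holds(*clause_to_inequality(c), assignment) for c in cnf)
    assert sat == ilp  # the 0/1 solutions of the inequalities are the models

print(clause_to_inequality([1, -2]))
```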

~~~
LolWolf
Yeah, but the problem is that the reduction might be large in practice (in the
sense that reducing the original problem to SAT or MIP might require the
introduction of a large number of variables).

While theoretically polynomially reducible, if the rewriting multiplies the
number of variables by 1000 then it’s likely not good in practice.

~~~
kragen
I don't have experience reducing problems to SAT but I think the usual
inflation factor is a lot less than 1000. I think the trivial reduction I
found from SAT to ILP uses 3 ILP variables per SAT variable, but there might
be a better one, and I might be misremembering; it might be 4 or something.

~~~
LolWolf
Sure, the SAT -> ILP is small but the reduction to SAT—or MILP directly—can
sometimes (but not often) be large.

Either way, I agree that there is likely some way in which the original
commenter’s problem can be solved via MILP/MICP methods (indeed, I’ve found
very few problems in practice that cannot be easily reduced) :)

------
mint2
What happens if/when shipments miss their ready times?

~~~
kragen
Typically you model this kind of thing in linear optimization as a hard
constraint rather than a penalty, but you _can_ model it as a penalty in mixed
integer linear programming. It costs you two extra decision variables, one of
which is binary.
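
One plausible reading of that, written out as a sketch: the comment does not spell out the formulation, so every symbol below is an assumption. Let t be the time the plan needs the shipment, r its ready time, ℓ ≥ 0 a continuous lateness variable, z a binary "is late" flag, M a sufficiently large constant, f a fixed penalty for being late at all, and c a cost per unit of lateness.

```latex
\begin{aligned}
\min\quad & \dots + f\,z + c\,\ell \\
\text{s.t.}\quad & \ell \ge r - t, \\
& 0 \le \ell \le M z, \\
& z \in \{0, 1\}.
\end{aligned}
```

If the shipment is ready in time (r ≤ t), the solver can set ℓ = 0 and z = 0 at no cost; otherwise ℓ must cover the shortfall, which forces z = 1 and incurs the fixed charge.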

~~~
atak1
^ yes, penalties in the solver. Above post is correct.

IRL they just get rolled and we treat it as a new problem with a different set
of schedules.

