
An Intro to Integer Programming for Engineers: Simplified Bus Scheduling - dget
https://blog.remix.com/an-intro-to-integer-programming-for-engineers-simplified-bus-scheduling-bd3d64895e92#.llj20js9f
======
glangdale
Integer programming often seems magical. I remember early grad school and
seeing "Optimal and Near-optimal Global Register Allocation Using 0–1 Integer
Programming" by Goodwin et al, where the very crunchy problem of register
allocation was solved largely by punting it to a ILP solver. Gotchas abounded
but it was eye opening to see how many hard problems could be solved (or at
least adequately approximated) in this fashion.

I think this is one of those techniques that every serious computer scientist
should have in their toolbox. Someone who knew some statistics, a smattering
of something like SMT (enough to drive Z3) and some linear programming could
routinely resemble Gandalf at any shop that has people struggling with
difficult problems.

The main thing is not to turn into a cookbook guy. I've known a few people
whose ability to just build an actual algorithm to solve some interesting
problem has atrophied because they reach for an ILP solver the minute anything
gets difficult.

~~~
petters
Agreed that it seems magical. This is one of the areas where the open-source
tools are way behind commercial solvers. CPLEX and Gurobi are really
impressive pieces of software.

~~~
repsilat
Doubly magical now that computers (and solvers) are so fast. A few times I've
thought, "Oh, I could reduce this to a max-flow problem, and write/dig up an
algorithm to solve it" before realising it'd just be easier to write it as a
linear program (or an integer program "just in case".)

And then when weird constraints came along that broke the max-flow reduction, I
could usually shoehorn them into the formulation.

Can't remember how to write Dijkstra's algorithm? Meh, integer programming it
is. Want to write a sudoku solver and you can't be arsed with dancing links or
backtracking or Prolog or... Whatever, CBC will do it. It'll be nasty and a
hell of a lot slower than doing it "the right way" in a fast language, but it
might not be slower than doing it in Python or JavaScript.
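
For a taste of how little formulation that takes: a minimal sudoku-as-ILP
sketch in Python with PuLP, which drives CBC by default. (The givens and all
names here are just illustrative.)

    import pulp

    # Hypothetical givens: (row, col, value), 0-indexed rows and columns.
    givens = [(0, 0, 5), (0, 1, 3), (1, 0, 6), (4, 4, 7)]

    prob = pulp.LpProblem("sudoku", pulp.LpMinimize)
    # x[r][c][v] = 1 iff cell (r, c) holds value v.
    x = pulp.LpVariable.dicts("x", (range(9), range(9), range(1, 10)), cat="Binary")
    prob += 0  # pure feasibility: any point satisfying the constraints will do

    for r in range(9):
        for c in range(9):
            prob += pulp.lpSum(x[r][c][v] for v in range(1, 10)) == 1  # one value per cell
    for v in range(1, 10):
        for r in range(9):
            prob += pulp.lpSum(x[r][c][v] for c in range(9)) == 1  # each value once per row
        for c in range(9):
            prob += pulp.lpSum(x[r][c][v] for r in range(9)) == 1  # once per column
        for br in range(0, 9, 3):
            for bc in range(0, 9, 3):
                prob += pulp.lpSum(x[br + i][bc + j][v]
                                   for i in range(3) for j in range(3)) == 1  # once per box
    for r, c, v in givens:
        prob += x[r][c][v] == 1  # pin the givens

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    grid = [[next(v for v in range(1, 10) if x[r][c][v].value() > 0.5)
             for c in range(9)] for r in range(9)]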

~~~
glangdale
The latter is what scares me. I solved a 'bit twiddling' problem using Z3 and
felt a little ashamed. It is reminiscent of Brewster's letter to Martin
Gardner "Feeding a recreational puzzle into a computer is no more than a step
above dynamiting a trout stream". It's very easy to let skills atrophy and
resort to using black boxes we don't really understand (well, I speak for
myself, anyhow). When Z3 or CPLEX or whatever tool du jour gets wedged on a
problem, I am not sure I have the background in these tools to get unstuck.
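
For concreteness, here's a toy of the kind of thing I mean (not the actual
problem I had, just an illustrative bit-twiddling identity), using Z3's Python
bindings:

    from z3 import BitVec, prove

    x = BitVec("x", 32)
    # Two classic ways to clear the lowest set bit of x:
    # x & (x - 1), or subtract off the isolated lowest bit x & -x.
    # prove() prints "proved" or a counterexample.
    prove((x & (x - 1)) == (x - (x & -x)))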

~~~
tnecniv
With Z3, unfortunately, the issue is that we are really bad at gauging what
makes for a hard SMT problem. Often, knowing that a particular problem is hard
is the same as solving it! That said, you can do some really interesting
things with SMT solvers if you can formulate the problem the right way.

Gurobi, CPLEX, and friends are less magical if you have enough mathematical
maturity. Typically, if you are comfortable with linear algebra (a subject
woefully ignored in most undergraduate curricula), you can work with them. In
my experience, working with optimization engines is a two-step process. First,
you need to figure out how to formulate the problem in a smart way (typically
this means that it's convex with constraints in a certain format), which is
normally the hard part. Then you scroll through the list of available
algorithms that the engine has at your disposal and pick one with properties
that you like (they often have suggestions if you have no strong preference).
Normally, if you can formulate the problem so it's convex, you will at least
have a good shot at doing well with your optimization.

~~~
petters
I have a PhD in optimization and I still think CPLEX and Gurobi seem magical.
The devil is in the details. Open source solvers are comparatively not very
good.

~~~
tnecniv
Oh for sure. You could go down a real rabbit hole if you had to implement that
stuff yourself.

------
optimali
I strongly encourage using Julia for anyone applying mathematical
optimization. It is (through JuMP[1], for example) one of the areas in which
it really stands out.

[1] https://jump.readthedocs.io/en/latest/quickstart.html

------
graycat
What the OP is describing is, except for the coding, some now classic applied
math, i.e., _operations research_. That applied math got used much less often
than one might have expected because the (A) data gathering was too much pain,
expense, and botheration, (B) there was too much software to write, and
writing it was too clumsy and expensive, (C) the computing was too slow and
too expensive, and (D) even when one got a good solution ready for production,
the production situation commonly changed so fast that the work of (A)-(C)
could not be revised fast enough to keep up. And, of course, the custom, one-
shot software was vulnerable to bugs. Net, in practice, a lot of projects
failed. Mostly successful projects needed big bucks and lots of unusually
insightful sponsorship high in a big organization.

But, now (A)-(D) are no longer so difficult. This should be the beginning of a
new _Golden Age_ for such work.

Sure, since the results of such optimization can look darned _smart_, smarter
than the average human, some might call the work _artificial intelligence_. Really,
though, the work is mostly just some classic applied math now enabled in
practice by the progress in computer hard/software.

The OP is a nice introduction to vehicle routing via integer linear
programming (ILP) set partitioning. For the _linear programming_ and the case
of ILP, I'll give a quick view below. But, now, let's just dig in:

Here is an explanation of the _secret_ approach, technique, trick that can
work for vehicle routing and many other problems: the real problems can have
some just awful non-linear cost functions and absurdly tricky constraints,
e.g., from labor contract work rules, equipment maintenance schedules, "can't
do anything near point A near lunch," even some random events to respond to in
real time (_dynamically_), etc., yet we can still have a good shot at getting
a least-cost solution or nearly so. The "nearly so" part can mean saving a lot
of money not available otherwise. When there is randomness, we aim instead for
least expected cost.

So, first, the trick is to do the work in two steps.

Call the first step _evaluation_ and the second _optimization_.

From 50,000 feet up, all the tricky, non-linear, goofy stuff gets handled
essentially by enumeration in the first step, leaving some relatively simple
data for the optimization in the second step.

In practice, this first step typically needs a lot of data on, say, the
streets of a city and requires writing some software unique to the specific
problem. The second step, the optimization, may require deriving some math
and/or writing some unique software, but the hope is that the step can be done
just by routine application of some existing optimization software.

The OP mentions the now famous optimization software Gurobi from R. Bixby and
maybe some people from Georgia Tech, e.g., George Nemhauser and Ellis
Johnson (long at IBM Research and behind IBM's Optimization Subroutine Library
(OSL) and its application to crew scheduling at American Airlines).

First Step.

Suppose you are in Chicago and have 20,000 packages to deliver and 300 trucks.
Okay, what trucks deliver what packages to make all the deliveries on time,
not overload any trucks, and minimize the cost of driving the trucks? You do
have for each package the GPS coordinates and street address. And you have a
lot of data on the streets, where and when traffic is heavy during the day,
etc.

Okay, let's make some obvious, likely doable progress: of those 20,000
packages, maybe there are only 15,000 unique addresses. So, for each address,
_bundle_ all the packages that go to that address. Then regard the problem as
visiting 15,000 addresses instead of delivering 20,000 packages.

So, you write some software to _enumerate_. The enumeration results in a
collection of candidate routes, stops, and packages to be delivered for a
single truck. For each of those candidates, you adjust the order in which the
stops are made to minimize cost -- so here we get some _early_, first-cut,
simple _optimization_. You keep only those candidates that get the packages
delivered on time, meet other criteria, etc. You may have 1 million candidate
single truck routes. For each of the candidates, you find the (expected)
operating cost.

So, suppose you have n = 1 million candidate single truck routes.

Also you have m = 15,000 addresses to visit.

So, you have a table with m = 15,000 rows and n = 1 million columns. Each
column is for some one candidate route. Each row is for some one address. In
each column there is a 1 in the row of each address that candidate route
visits and a 0 otherwise. One more row at the top of the table is, for each
column, the operating cost of that candidate route.

So, you have a table of 0's and 1's with m = 15,000 rows and n = 1 million
columns. You have a row with 1 million costs, one cost for each column.

Again, you have 300 trucks. So, you want to pick, from the n columns, some <=
300 columns so that all the m addresses get served and the total cost of the
columns selected is minimized. That is, if we add the selected columns as
column _vectors_, we get a column of all 1's.
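
To make the table concrete, here is a toy version in Python (a handful of
made-up routes and costs standing in for m = 15,000 and n = 1 million):

    import numpy as np

    # Each candidate route: (operating cost, set of addresses it visits).
    # All numbers are made up for illustration.
    candidates = [
        (40.0, {0, 1}),
        (55.0, {0, 2, 3}),
        (30.0, {2}),
        (45.0, {3, 4}),
        (70.0, {1, 2, 4}),
        (25.0, {4}),
    ]
    m, n = 5, len(candidates)

    c = np.array([cost for cost, _ in candidates])  # the cost row
    A = np.zeros((m, n))                            # the table of 0's and 1's
    for j, (_, stops) in enumerate(candidates):
        for i in stops:
            A[i, j] = 1.0
    b = np.ones(m)  # every address covered exactly once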

Second Step.

Well, consider variables x_i for i = 1 to n = 1 million. Then we want x_i = 1
if we use the route in column i and 0 otherwise. Let the cost of the route in
column i be c_i. We want the total cost (TeX notation):

z(x) = sum_{i = 1}^n x_i c_i

to be minimized. So, right, we take the big table of m = 15,000 rows and n = 1
million columns and call it m x n matrix A = [a_{ij}]. We let m x 1 column
vector b have all 1's. We regard x as n x 1 where in row j = 1 to n is x_j.
Then, we get linear program

minimize z(x)

subject to

Ax = b

x >= 0

So, this is a case of _linear programming_.

Except in our problem we have one more _constraint_ -- each x_i is 0 or 1,
and in this case our problem is 0-1 integer linear programming.
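
In code, that 0-1 program is a few lines with, say, PuLP and its default CBC
solver, continuing the toy instance above (the 300-truck limit becomes one
more constraint):

    import pulp

    prob = pulp.LpProblem("set_partitioning", pulp.LpMinimize)
    x = [pulp.LpVariable(f"x_{j}", cat="Binary") for j in range(n)]

    prob += pulp.lpSum(c[j] * x[j] for j in range(n))  # z(x), total operating cost
    for i in range(m):
        # Row i of Ax = b: address i is visited by exactly one chosen route.
        prob += pulp.lpSum(A[i, j] * x[j] for j in range(n)) == 1
    prob += pulp.lpSum(x) <= 300  # at most 300 trucks

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    chosen = [j for j in range(n) if x[j].value() > 0.5]
    # For the toy data above, the only exact cover is routes 0, 2, 3 (cost 115).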

Linear Programming.

In linear programming with n variables, with the real numbers R, we go into
the n-dimensional vector space R^n. The

Ax = b

x >= 0

are the _constraints_, and the set of all x that satisfies them is the
_feasible region_ F, a subset of R^n.

In R^n, a _closed half space_ is a hyperplane together with everything on one
side of it.

Then F can be regarded as an intersection of m closed half spaces. So, F has
flat sides, straight edges, and some sharp points (_extreme points_).

Well, if there is an optimal solution, then there is an optimal solution at at
least one of those extreme points. So, the famous Dantzig simplex algorithm
looks for optimal solutions in iterations where each iteration starts at an
extreme point, moves along an edge, and stops at the next extreme point.
That's the geometric view; the algebraic view is a tweak of the standard
Gaussian elimination algorithm.

Linear programming and the simplex and other algorithms have a huge collection
of nice properties, including some surprisingly good performance both in
practice and in theory.

But asking for each x_i to be an integer is in principle, and usually in
practice, a huge difference: it gives us an NP-complete problem -- at one
time, a huge, bitter surprise.
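
A tiny numeric illustration of why integrality bites, with made-up numbers (a
covering variant: three addresses, three routes of two addresses each):

    import numpy as np
    from scipy.optimize import linprog

    A = np.array([[1, 1, 0],
                  [1, 0, 1],
                  [0, 1, 1]])  # route j covers address i where A[i, j] = 1

    # LP relaxation: cover every address at least once, 0 <= x_j <= 1.
    res = linprog(np.ones(3), A_ub=-A, b_ub=-np.ones(3),
                  bounds=[(0, 1)] * 3, method="highs")
    print(res.x, res.fun)  # x = [0.5, 0.5, 0.5], cost 1.5 -- fractional
    # Any 0-1 solution needs two routes (cost 2), so no rounding of the LP
    # answer gets the integer optimum for free.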

Warning: At one time, the field of the applied math of optimization in
operations research sometimes had an attitude, that is, placed a quasi-
religious importance on _optimal_ solutions and was contemptuous of any
solutions even 10 cents short of optimal. Well, that attitude was costly for
all concerned. Instead of all the concentration on saving the last 10 cents,
consider saving the first $1 million. Commonly in practice, we can get close
to optimality, and may be able to show that we are within 1% of optimality --
so close that the rest wouldn't even buy a nice dinner -- yet see no way short
of two more weeks of computer time to get a provably optimal solution.

So, concentrate on the big, fat doughnut, not the hole.

How to solve ILP problems is a huge subject -- e.g., one can start with George
Nemhauser -- but a major fraction of the techniques exploit some surprisingly
nice properties of the simplex algorithm. Right, likely the best known
approach is the tree search technique of _branch and bound_.
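
A bare-bones sketch of branch and bound for 0-1 programs, using an LP solver
for the bound at each node (real solvers layer cutting planes, presolve, and
much smarter branching on top of this; this is just the skeleton):

    import numpy as np
    from scipy.optimize import linprog

    def branch_and_bound(c, A_eq, b_eq):
        """Minimize c @ x subject to A_eq @ x = b_eq, x in {0, 1}^n."""
        n = len(c)
        best = {"obj": np.inf, "x": None}

        def recurse(bounds):
            res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
            # Prune: LP infeasible, or its bound cannot beat the incumbent.
            if not res.success or res.fun >= best["obj"]:
                return
            frac = [j for j in range(n) if 1e-6 < res.x[j] < 1 - 1e-6]
            if not frac:  # all integral: a new incumbent
                best["obj"], best["x"] = res.fun, np.round(res.x)
                return
            j = frac[0]   # branch: one subtree with x_j = 0, one with x_j = 1
            for fix in (0, 1):
                child = list(bounds)
                child[j] = (fix, fix)
                recurse(child)

        recurse([(0, 1)] * n)
        return best["obj"], best["x"]

On the toy instance above, branch_and_bound(c, A, b) finds the same cost-115
answer CBC did.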

------
techwizrd
I took a number of Operations Research courses in undergrad, and one course
was almost entirely on Linear Programming and Integer Programming. I strongly
recommend that engineers get familiar with them because they are powerful tools
for solving a whole class of really tricky problems.

------
ninjamayo
Nice post, but sorry guys, I've never been a fan of integer programming. Too
involved with genetic algorithms.

