
The Traveling Salesman with Simulated Annealing, R, and Shiny - lil_tee
http://toddwschneider.com/posts/traveling-salesman-with-simulated-annealing-r-and-shiny/
======
cm2012
I once had a real life issue with the travelling salesman problem - I wanted
to invite boutique fashion retailers to a conference in '09, and I thought the
best way would be to tell them about it face to face. I mapped out the address
of every single one on some app on top of Google Maps, and ended up optimizing
based on how clustered different groups of businesses were, to minimize my
walking time while maximizing number of visits.

~~~
e_modad
That is so cool! I love HN for stories like this.

------
graycat
Of course the traveling salesman problem is _combinatorial optimization,_ and
of course Uber, Lyft, Sidecar, etc. have such problems, as has long been the
case for _dial a ride_.

And, for the case of trips regularly planned, e.g., using one car to get the
same several people to work each day, would have a _deterministic_ problem. Of
course, part of the problem would be which people shared which car.

Otherwise there is something of a _probabilistic_ problem where the driver
gets a travel request at some random time for some random origin and
destination and has to decide what to do, e.g., accept the offer or not and
how to fit the offer in with what else he is doing.

Might be a lot of work, but might save some money and/or give better service
to the customers.

At one time, my Ph.D. advisor wanted me to work on dial a ride scheduling. But
I already had a problem I'd made good progress on and used that for my
dissertation instead! One reason: For the problem I used, I could claim to
have a solution that was _optimal_ in a quite strong sense, but for dial a
ride any solution would be only a big mess probabilistically, some big mess
that was just heuristic, and that could be evaluated only empirically. Bummer.

Now, I also have another problem! So, anyone who wants to work on
combinatorial optimization and hope to get interest from Uber, Lyft, Sidecar,
etc. -- go for it!

------
graycat
Nice visualization of an application of simulated annealing.

We should keep in mind that the OP is doing the traveling salesman problem
(TSP) in the plane. So we should recall the nice Karp result in about 1972:
With meager assumptions on the probability distribution of the cities, for any
number x, no matter how small, as the number of cities n goes to infinity,
there is a simple approach that will come within x% of optimality with
probability 1 - x.

This simple approach? Start by connecting the cities with a minimum spanning
tree. Then for the TSP tour, just traverse the spanning tree except going
directly to the next city in the tree traversal not already visited. That is,
when following the tree would cause visiting a city a second time, just go
directly, not following an arc in the tree, to the next city in the tree
traversal not yet visited.

So intuitively the value is, for large n, get about as close, about as often
as wish, and for small n just enumerate!

Intuitively what is going on is that one of the main challenges is identifying
the appropriate _clusters_ of cities and, then, connecting the clusters, and
the minimum spanning tree tends to find the right clusters.

Of course, there are at least two algorithms for finding minimum spanning
trees, and as I recall both run in time proportional to n^2. So, we have a
polynomial algorithm that is, for large n, for TSP in the plane, as close as
we please.

For the "plane", I meant R^2, where R is the set of real numbers, that is,
ordinary Euclidean 2-space. But Karp's approach also works the same in R^m for
any positive integer m.

------
startupfounder
This is awesome and I can see so many use cases with logistics in cities like
NYC by using addresses, and not capitals, as waypoints.

I would love to see this applied to the NYC taxi data to see how to optimise
taxi movements and see how much ridesharing would decrease car traffic.

Code is here: [https://github.com/toddwschneider/shiny-
salesman](https://github.com/toddwschneider/shiny-salesman)

~~~
mck-
I really don't mean to self-promote, but this is what we do: Route-
Optimization-as-a-Service [1]

[1] [https://www.routific.com/developers](https://www.routific.com/developers)

~~~
cfontes
How is the business like?

I kind of do the same thing but for a different sector, but we are not SaaS so
would be nice to know.

------
HCIdivision17
I enjoy finding retrospectively obvious details from essays like this: an
optimal solution won't have any crossing paths, and simply swapping cities is
a handy way to try detangling the path! The animated graphs make this plain,
but is it required? Having never thought about the TSP much (and not really
having the mathematical chops to prove it myself) I looked it up, and sure
enough SO user lhf [0,1] had a link to a grindingly in-depth article about the
problem [2].

[0] [http://stackoverflow.com/questions/2444125/crossing-edges-
in...](http://stackoverflow.com/questions/2444125/crossing-edges-in-the-
travelling-salesman-problem) [1]
[http://stackoverflow.com/a/2444288](http://stackoverflow.com/a/2444288) [2]
[http://www.ams.org/samplings/feature-column/fcarc-
tsp](http://www.ams.org/samplings/feature-column/fcarc-tsp)

------
pushedx
Simulated annealing is used heavily in chip design automation (EDA). I first
used it when writing a toy placement tool while taking a course on the
subject.

Slide 5 of this deck shows some of the problems in chip-design, most of which
have been heavily optimized by simulated annealing.
[http://www.ispd.cc/slides/ISPD12Slides/ISPD12_3_2.pdf](http://www.ispd.cc/slides/ISPD12Slides/ISPD12_3_2.pdf)

I love the beauty of the analogy to annealing a metal in metallurgy, and how
surprisingly well it translates into software. The fact that the result is
non-deterministic, but always useful, makes me smile.

~~~
eevilspock
_The probability of accepting an inferior tour is a function of how much
longer the candidate is compared to the current tour, and the temperature of
the annealing process._

Shouldn't the distance of the "nearest" neighbor be a function of temperature,
rather than (or at least in addition to) the probability of acceptance? Isn't
that how simulated annealing can escape a local minimum far from better or
global maxima? Intuitively that is how the real annealing process works.

~~~
pushedx
One of the fundamental premises of simulated annealing is this "looking
backwards" approach. It's for the sake of the sanity of the implementor. How
do you modify scalar values that determine a random process in such a way that
the outcome does not exceed some error condition? You can't (in general),
that's why you turned to annealing in the first place. You were unable to
control error bounds. Instead, you make a completely random guess and only
accept it if it's within the current "acceptable" range. This is easy to
implement and gives suprisigly good results.

------
makosdv
It's cool to see this kind of stuff. I used simulated annealing for my thesis
on cellular network optimization back in college. Haven't used any algorithms
like this since then, but they have their uses.

~~~
tjl
I used simulated annealing for my Master's thesis. I was trying to fit a
differential equation to some experimental results. With small changes in the
parameters, you could get huge changes in the solution of the DE. For some of
the experiments, the parameters it chose fit almost perfectly. As a result,
simulated annealing is one of the techniques I look at for highly non-linear
optimization.

~~~
Fede_V
Simulated annealing is not required for non-linear optimization. It's useful
for non-convex optimization. Non-linear but convex problems are trivial to
solve with Newton's method - a simple example is finding the weights of an
SVM.

~~~
tjl
True. But, in my case, I didn't know if it was convex. A small change of
parameters could go from a feasible solution to an infeasible solution.

------
joshvm
Presumably this could be extended to include, say, average flight costs and
other modes of transport so you could work out the an optimally fast and cheap
way to travel around the world?

~~~
alanctgardner3
It depends on how much you want to idealize things. If you add the assumption
that all flights/trains/etc. are available every day, and the price never
changes, then yes. In reality, trying to traverse a weighted graph optimally
with a bunch of constraints (arrival times, how long to stay in each place,
preference for difference modes of transit, transit times) is a whole field of
research in math/CS. Operations research [1] is the study of optimizing
processes with many interrelated factors.

1\.
[http://en.wikipedia.org/wiki/Operations_research](http://en.wikipedia.org/wiki/Operations_research)

------
pyk
Wow, the article didn't even mention the "original" solve of the 48 US city
problem (one in each state) in 1954 (they included Washington D.C., so 49 to
be precise)! For those interested, Dantzig (creator of the simplex method),
Fulkerson and Johnson made quite the stir when they tried out the TSP at Rand
in California more than half a century ago in their paper, "Solution of a
large scale traveling salesman problem" [1].

The paper contains a historical context of the problem and their methods
towards finding and proving (important!) that their solution was in fact
optimal.

For those interested in Dantzig, he was the famed PhD student who came late to
class one day and wrote down two open problems mistakenly thinking it was
homework. He ended up using those solutions as a portion of his PhD thesis
[2].

Here's an excerpt from Dantzig himself:

 _It happened because during my first year at Berkeley I arrived late one day
at one of [Jerzy] Neyman 's classes. On the blackboard there were two problems
that I assumed had been assigned for homework. I copied them down. A few days
later I apologized to Neyman for taking so long to do the homework — the
problems seemed to be a little harder than usual. I asked him if he still
wanted it. He told me to throw it on his desk. I did so reluctantly because
his desk was covered with such a heap of papers that I feared my homework
would be lost there forever. About six weeks later, one Sunday morning about
eight o'clock, [my wife] Anne and I were awakened by someone banging on our
front door. It was Neyman. He rushed in with papers in hand, all excited:
"I've just written an introduction to one of your papers. Read it so I can
send it out right away for publication." For a minute I had no idea what he
was talking about. To make a long story short, the problems on the blackboard
that I had solved thinking they were homework were in fact two famous unsolved
problems in statistics. That was the first inkling I had that there was
anything special about them.

A year later, when I began to worry about a thesis topic, Neyman just shrugged
and told me to wrap the two problems in a binder and he would accept them as
my thesis.

The second of the two problems, however, was not published until after World
War II. It happened this way. Around 1950 I received a letter from Abraham
Wald enclosing the final galley proofs of a paper of his about to go to press
in the Annals of Mathematical Statistics. Someone had just pointed out to him
that the main result in his paper was the same as the second "homework"
problem solved in my thesis. I wrote back suggesting we publish jointly. He
simply inserted my name as coauthor into the galley proof._

[1]
[http://www.iro.umontreal.ca/~gendron/IFT6551/LECTURES/TSP.pd...](http://www.iro.umontreal.ca/~gendron/IFT6551/LECTURES/TSP.pdf)

[2]
[http://www.snopes.com/college/homework/unsolvable.asp](http://www.snopes.com/college/homework/unsolvable.asp)

------
pradn
Lovely site design and clear explanation of simulated annealing. I've always
more or less equated hill climbing and simulated annealing. Your article makes
the difference easy to understand. Thanks!

------
chromaton
I believe the effectiveness of search algorithms goes something like this:

Monte Carlo < Hill climbing < Simulated annealing < Genetic algorithm <
Differential evolution

~~~
scishop
No. It's completely context dependent.

A hill-climbing algorithm will take the lunch of every algorithm listed here
if you have a clean convex optimization problem.

~~~
chromaton
Where would you find such a problem in the real world?

~~~
gh02t
Absolutely everywhere, it is a common subproblem in many many algorithms. For
example, the basin hopping algorithm (sister of SA) solves one at every step.

A specific example is in image deblurring via Tikhonov regularization, which
involves minimizing a function that is provably convex for many physically
realistic types of blurring.

