
Using Self-Organizing Maps to Solve the Traveling Salesman Problem - hardmaru
https://diego.codes/post/som-tsp/
======
soVeryTired
One of the professors in my PhD programme made his name by showing that
anything you can do with a self-organising map, you can do with a Gaussian
process. The work essentially killed off self-organising maps as a field of
study.

One time we had a seminar where a guest speaker had used self-organising maps,
and the professor literally fell off his chair.

~~~
nabla9
How exactly did a theoretical equivalence result kill SOMs as a field of study?

There is a similar connection between deep fully connected neural networks and
Gaussian processes.

Deep Neural Networks as Gaussian Processes
[https://arxiv.org/abs/1711.00165](https://arxiv.org/abs/1711.00165)

~~~
soVeryTired
In much the same way the result you linked to slowed research into
single-hidden-layer NNs. Another example is when the failure to learn XOR
killed off the perceptron and nearly killed neural networks as a whole.

Researchers like low-hanging fruit, and some results just make a topic look
barren.

~~~
nabla9
> result you linked to slowed research into single-layer NNs.

You are thinking of a different link.

~~~
soVeryTired
Yep, sorry. I'm thinking of the (much older) results referred to in the
abstract of the linked article.

------
Jaxan
It does not “solve” it though. It’s not even an approximation (which is also
NP-hard in the general case, not sure about the Euclidean case).

------
petters
> where we are able to find a route traversing 734 cities only 7.5% longer
> than the optimal in less than 25 seconds

I could be wrong, but that does not sound like a very good method.

But I am all for people writing blog posts and exploring methods in order to
learn, of course.

------
loverofthings
7.5% longer than optimal is way too much.

Doing a simple 2-opt/3-opt heuristic (10-100ms CPU time of optimization, 200
lines of code) gets you to within 1-3% of the optimum.
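
For anyone who hasn't seen it, here is a minimal 2-opt sketch in Python. It is only an illustration of the idea, not the tuned C++ version discussed in this thread, and it will not reproduce those timings. The function names `tour_length` and `two_opt` are made up for this example, and cities are assumed to be 2D points with straight-line distances.

```python
import math

def tour_length(points, tour):
    """Total length of the closed tour visiting points in the given order."""
    n = len(tour)
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % n]]) for i in range(n))

def two_opt(points, tour):
    """Reverse segments of the tour for as long as doing so shortens it."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a city, nothing to swap
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                # Change in length if edges (a,b),(c,d) become (a,c),(b,d).
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour

# A unit square visited in a self-crossing order; 2-opt should uncross it.
square = [(0, 0), (1, 1), (1, 0), (0, 1)]
uncrossed = two_opt(square, [0, 1, 2, 3])
```

This version applies every improving move as soon as it finds one. Picking the best move per pass, or caching evaluations that are still valid, are refinements on top of the same loop.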

~~~
n4r9
That CPU time sounds reasonable for just 2-opt but I've found that 3-opt takes
longer for tours with that many stops. The quickest strategy that involves
3-opt is to do a run of 2-opt followed by 3-opt, but for instances with 500
stops this can take nearly a second to run, even after parallelizing and
optimising the code.

The tools you use will also make a difference. Python is difficult to make as
performant as C.

~~~
loverofthings
Yeah, I was thinking of a C++ implementation. The nested for loops get
optimized very well for the 3-opt case. There's also a couple of tricks one
can do with preloading (simd) of distances for evaluating 2,3-opt
simultaneously. That all fits into 200 lines of code. There's also the fact
that after executing the best moves there's a lot of previously evaluated
moves that are still valid. This also fits into those 200 lines.

Not trivial, but IMO less trivial than self-organizing maps.

~~~
n4r9
> after executing the best moves there's a lot of previously evaluated moves
> that are still valid

Ah, this has crossed my mind as well, but I hadn't got round to implementing
it yet. You could even determine a set of independent swaps per iteration and
perform them all in parallel.

------
luan42
Nice writeup Diego ;)

I worked with Diego (the author) on a first version of this, since it was a
project in the course "Artificial Intelligence Programming" IT3105 at NTNU in
Trondheim, Norway where we both stayed in our Erasmus a year ago.

While it is not very sophisticated to use SOMs for this problem, it was rather
meant as an implementation exercise. And TSP allows a graphical representation
of the process, which is nice too. That said, we spent way too much time in
the course implementing and fine tuning Genetic algorithms...

------
KyleGalvin
This is awesome. I did exactly this for a course project in 2012. While the
travelling salesman problem code seems to have been refactored out, I have a
C++ library for SOMs and hierarchical SOMs that is attached to some OpenGL
code that demonstrates the categorization of RGB colors by label.

If you can excuse the embarrassing mess of file structure, the core lib is
here, buried among a dozen other pet projects from my student days:
[https://github.com/KyleGalvin/Typhoon/blob/master/Cpp/SOM.cp...](https://github.com/KyleGalvin/Typhoon/blob/master/Cpp/SOM.cpp)

Edit: it seems (on second glance) the above link is exactly the TSP problem,
wired up with SDL. The core library is here:
[https://github.com/KyleGalvin/Typhoon/blob/master/Cpp/src/so...](https://github.com/KyleGalvin/Typhoon/blob/master/Cpp/src/somIII.hpp)

------
tw1010
Neural networks, self-organizing maps, genetic programming. It feels like
we're back in the 80s.

~~~
flohrian
I recently read a math paper: abelian groups, Hilbert spaces, Lebesgue
measures. It felt like we're back in the 1900s.

------
JustFinishedBSG
It doesn't "solve" anything, it just gives an admissible solution.

~~~
dwighttk
or at least _a_ solution

------
NicoJuicy
It doesn't solve it, because it doesn't handle roads

~~~
edshiro
I presume to handle the roads you just have to change the distance function to
use an API like Google Maps, TomTom, GraphHopper.

This would significantly impact performance though.

~~~
Zeebrommer
Yes, and additionally the problem is then not Euclidean anymore. For example
the triangle inequality (distance from A to B directly is always shorter
than or equal to distance from A to B via C) does not hold. I'd guess that
that will deteriorate results from this method since it relies on the
Euclidean distance from the current circle to the points to be visited.

~~~
n4r9
I don't understand why you think traveling by road breaks the triangle
inequality. If the time/distance along roads from A to B is fastest via C,
then just go via C...

~~~
Filligree
The triangle inequality is precisely the statement that this never happens.

~~~
n4r9
You misunderstand. I'm saying that a situation such as:

dist(AB): 3km

dist(AC): 1km

dist(CB): 1km

is impossible even if those are road distances rather than Euclidean
distances. If the road distance of traveling A->C->B is 2km then dist(AB)
should not be greater than 2km.
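
To make that concrete, here is a small sketch. It assumes "road distance" means the shortest route through the road network, which is what a routing API would normally report. The three-node network and the helper names are hypothetical, chosen to match the numbers above (A=0, B=1, C=2).

```python
from itertools import permutations

def shortest_path_closure(direct):
    """Floyd-Warshall: turn direct segment lengths into shortest road distances."""
    n = len(direct)
    dist = [row[:] for row in direct]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def triangle_violation(dist):
    """Return the first (a, c, b) with dist[a][b] > dist[a][c] + dist[c][b], else None."""
    for a, c, b in permutations(range(len(dist)), 3):
        if dist[a][b] > dist[a][c] + dist[c][b]:
            return (a, c, b)
    return None

# Hypothetical direct road segments in km: A-B is 3, A-C is 1, C-B is 1.
segments = [
    [0, 3, 1],
    [3, 0, 1],
    [1, 1, 0],
]
road = shortest_path_closure(segments)
```

On the raw segment lengths `triangle_violation` reports A -> C -> B as shorter than the direct A-B segment. On `road` it finds nothing, because the shortest A-B route already goes via C and comes out at 2km.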

~~~
Filligree
So long as you're measuring in kilometers, as the crow flies. If you're
talking about the distance a car would have to drive... okay, you're correct
that it'd be possible to drive _through C_ and thereby cut down on cost, but
that's somewhat beside the point.

The algorithm this article is talking about assumes that all points are on a
Euclidean map, and the distance between them is simply the Euclidean distance.
This requires the triangle inequality to hold for _straight lines_, and
allows optimizations based on that assumption.

The inequality in itself is a weaker statement than saying the whole thing is
Euclidean, and I haven't read it carefully enough to be sure which is
required, but any violation of it is necessarily a violation of the
assumptions that make this heuristic work.
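
To show where the Euclidean assumption enters, here is a rough sketch of a single SOM update on a ring of neurons. It is a generic textbook-style step, not necessarily the article's code, and `som_step` and its parameters are names invented for this example.

```python
import math

def som_step(neurons, city, learning_rate, radius):
    """Pull a ring of 2D neurons toward one city (a single SOM update)."""
    n = len(neurons)
    # Winner: the neuron closest to the city in straight-line distance.
    winner = min(range(n), key=lambda i: math.dist(neurons[i], city))
    updated = []
    for i, (x, y) in enumerate(neurons):
        # Distance along the ring, wrapping around at the ends.
        ring = min(abs(i - winner), n - abs(i - winner))
        # Gaussian neighbourhood: neurons near the winner move the most.
        influence = math.exp(-(ring ** 2) / (2 * radius ** 2))
        step = learning_rate * influence
        # The move itself is a straight line in the plane toward the city.
        updated.append((x + step * (city[0] - x), y + step * (city[1] - y)))
    return updated

ring_before = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
target = (2.0, 0.0)
ring_after = som_step(ring_before, target, learning_rate=0.5, radius=1.0)
```

Straight-line geometry shows up twice: once in choosing the winner and once in moving neurons through the plane. Swapping in a road-distance lookup would only change the first of those.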

~~~
n4r9
Agreed. I was just responding to Zeebrommer's claim that the triangle
inequality fails with road distances.

------
dspillett
I've not read the article in detail, but it looks to me like they have
produced a more efficient brute force method, which might be impressive but
isn't nearly the same as solving it in any mathematical sense.

