Hacker News
The Transformer Network for the Traveling Salesman Problem (arxiv.org)
49 points by djoldman 21 days ago | hide | past | favorite | 20 comments

For context, the tests here are performed on 50- and 100-city instances, as they are limited by current GPU memory sizes. Meanwhile, the best heuristic code for TSP [1] frequently provides optimal solutions to 100k-city instances ("heuristic" = no optimality guarantee, like what this paper proposes; but one can then seek a proof of optimality if desired using a slower "exact" algorithm).
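For readers unfamiliar with how such heuristics work: LKH is built around k-opt local-search moves. A minimal 2-opt sketch (far simpler than LKH itself; the point/tour representation and function names here are my own, not from LKH) looks like:

```python
import math

def tour_length(points, tour):
    """Total length of the closed tour over 2-D points."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(points, tour):
    """Repeatedly reverse tour segments while that shortens the tour."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                a, b = tour[i], tour[(i + 1) % n]
                c, d = tour[j], tour[(j + 1) % n]
                # Replace edges (a,b),(c,d) with (a,c),(b,d) if that is shorter.
                if (math.dist(points[a], points[c]) + math.dist(points[b], points[d])
                        < math.dist(points[a], points[b]) + math.dist(points[c], points[d])):
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour
```

Lin–Kernighan generalizes this to variable-depth k-opt moves with heavy candidate-list pruning, which is where the engineering effort in LKH goes.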

Also, Dantzig, Fulkerson and Johnson solved a 50-city TSP to optimality, with an exact method, by hand, in 1954 [2]. The practical running time was slower than what is proposed here, admittedly :-).

This is not a criticism of the paper, though: To their credit, the authors are quite straightforward about this, they're not trying to hide it. Their point is to demonstrate that their machine learning approach has potential.

[1] http://webhotel4.ruc.dk/~keld/research/LKH/

[2] http://www.math.uwaterloo.ca/tsp/uk/history.html

> best heuristic code for TSP .. 100k-city instances

These "best cases" are hand-tuned for the domain by algorithm experts - right? How big is the performance jump from generic z3 or whatever?

Edit: After AlphaZero, there was lots of chatter about improving combinatorial problems. This has proven elusive.

My impression of AlphaZero research is that given enough feedback, AI would find “good” routes in the sense of sensible driving, minimal lefts, nearby bathrooms, etc — but not that it had anything in particular to say about hard combinatorics.

Ie, AI would find “fast” routes in the sense of “a human driving these gets done fast, all things considered” but not necessarily “fast” in a simple combinatorics problem — which is actually good, because simple combinatorics often fails in the real world (eg, longer routes without left turns are faster).

Did I miss something about how this more directly applies to combinatorics?

Combinatorics (and AZ) can have unlimited self-play (i.e. search) with precise reward. This is quite rare in most real world problems.

I don't really have a strong intuition for mathematics. It is possible that nets are simply good with spatial boards and not good enough with arbitrary graphs.

> These "best cases" are hand-tuned for the domain by algorithm experts - right?

Correct. It is arguably more than just "hand-tuned". The algorithm I mentioned, LKH, was entirely designed for TSP, and I'm not aware of it being applicable to much else.

> How big is the performance jump from generic z3 or whatever?

For TSP, an enormous jump. Certainly more than 3 orders of magnitude.

Provably optimal? I thought TSP was NP-Hard and could only ever be solved via heuristics

NP-hard refers to solving the problem in the general case - for a given specific instance it may be possible to find an optimal solution more easily by exploiting specific features of the graph (for instance, a series of nodes arranged in a ring would have a trivial optimal solution).
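To make the ring example concrete: for points in convex position (e.g. on a circle), the optimal Euclidean tour cannot cross itself, so it must visit the points in convex-hull order. Sorting by angle around the centroid therefore solves such an instance exactly. A sketch (function name is mine):

```python
import math

def ring_tour(points):
    """Optimal tour for points in convex position (e.g. on a circle):
    visit them in angular order around the centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sorted(range(len(points)),
                  key=lambda i: math.atan2(points[i][1] - cy, points[i][0] - cx))
```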

"We report improved performances over recent learned heuristics with an optimal gap of 0.004% for TSP50 and 0.39% for TSP100."

Complexity: O(n^2)

If true, what implications does this have for P = NP? My intuition for a long time has been that P = NP only in the limit of an "infinitely sophisticated" algorithm. Such that as we do more research (or I suppose train larger models), we get closer and closer to the polynomial ideal.

The current best heuristic, Lin-Kernighan-Helsgaun, probabilistically achieves better optimality gaps than that for TSP.

Definitely makes you wonder if P = NP is even relevant when all the interesting instances can be solved to optimality or very near to it extremely often.

Some NP-complete problems have provably bad approximations if P!=NP. You don't get the same polynomial reduction behavior if you are using approximation algorithms. So for some problems we may have utterly practical algorithms and for others we never will (assuming P!=NP).

Doesn't apply here, because this algorithm is an approximation algorithm for TSP.

According to the website below, the DP solution is O(n^2 * 2^n). That would be a big speedup.
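The DP in question is Held–Karp: the best path to each city is computed for every subset of visited cities, giving O(n^2 * 2^n) time instead of O(n!) enumeration. A minimal sketch (assuming a dense distance matrix; naming is mine):

```python
from itertools import combinations

def held_karp(dist):
    """Exact TSP via the Held-Karp DP: O(n^2 * 2^n) time, O(n * 2^n) space.
    `dist` is an n x n distance matrix; returns the optimal cost of a tour
    that starts and ends at city 0."""
    n = len(dist)
    # dp[(mask, j)] = cheapest path from 0 that visits exactly the cities
    # in `mask` (a bitmask over cities 1..n-1) and ends at city j.
    dp = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask ^ (1 << j)
                dp[(mask, j)] = min(dp[(prev, k)] + dist[k][j]
                                    for k in subset if k != j)
    full = (1 << n) - 2  # every city except 0
    return min(dp[(full, j)] + dist[j][0] for j in range(1, n))
```

Still exponential, of course — the point of the comparison is only that O(n^2 * 2^n) beats trying all (n-1)! orderings.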


When I read Transformer I thought this was some sort of electrical circuit based solution. I wonder if you could get some sort of solution by representing the network as an electrical circuit.

You could try it on a quantum computer, but you'll have to wait a decade (at least) to see one with enough qubits to solve a problem of practical size ;-)

We do not have reasons to believe that a quantum computer would be much better than a classical one for this class of problems.

You can build an analogue simulator for TSP by exploiting the properties of the path integral.

I'd like to see how it performs on trivial instances like a NxM grid of points (perhaps with small perturbations).

I believe that AI cannot solve this problem by generating a better algorithm. The only algorithm that will give the optimal result is brute force, trying every possible combination. Any form of deep learning will only do as well as it gets closer to that, and require just as much computation. The advent of working quantum computers and other hardware gains will be the only solution. There is no software that can solve the traveling salesman problem, nor will there ever be, without much better hardware.
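For scale, here is what brute force looks like: fixing the start city still leaves (n-1)! candidate tours, which is why enumeration stops being viable at a couple of dozen cities. A sketch (names are mine):

```python
import math
from itertools import permutations

def brute_force_tsp(points):
    """Enumerate every tour with city 0 fixed first: (n-1)! candidates."""
    n = len(points)
    best_cost, best_tour = math.inf, None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        cost = sum(math.dist(points[tour[i]], points[tour[(i + 1) % n]])
                   for i in range(n))
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour
```

(Though note that brute force is not the only exact method: the Held–Karp DP and branch-and-cut solvers like Concorde also return provably optimal tours, just still in super-polynomial worst-case time.)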

I mentioned this in another thread: currently we do not have reasons to believe quantum computers are better than classical computers at solving NP-hard problems.
