
The TSP Isn't NP Complete - ColinWright
https://www.ibm.com/developerworks/community/blogs/jfp/entry/no_the_tsp_isn_t_np_complete?lang=en
======
praptak
Only decision problems can be NP-_complete_; other problems can only be
NP-_hard_. The article is a horribly long-winded rant over this nitpick.

By the way, in the computational complexity community it is understood that if
you say "TSP is NP-complete" you mean the decision version of the problem (is
there a path shorter than K?), so nobody will bother to nitpick.

~~~
anonymoushn
The great thing is that the author doesn't seem to understand the distinction.
He states in his update that he has a _decision_ problem of finding the _tour
of minimal distance_.

~~~
baddox
Isn't that decision problem also in NP? The traditional TSP decision problem
is in NP, and it seems like you should be able to run that a polynomial number
of times to solve his version.

~~~
lvh
Perhaps the parent you're replying to means to say that the author doesn't
appear to understand the distinction, because the wording being used doesn't
make it a decision problem either. As long as we're nitpicking: decision
problems have yes-or-no answers.

------
tgflynn
It's true that TSP is not NP-complete because it is not in NP: it is an
optimization problem, not a decision problem.

It's also true that TSP is NP-hard. The simplest way to see this is that
solving TSP also solves the Hamiltonian Cycle problem for the corresponding
graph: give the graph's edges weight 1 and the missing edges a large weight,
and the optimal tour has total weight n exactly when a Hamiltonian cycle
exists. The Hamiltonian Cycle problem is known to be NP-complete.

What the OP does not state, and which is also very important to realize, is
that TSP is also NP-easy (and hence NP-equivalent). In other words, if you
have an algorithm that can solve any NP-complete problem in polynomial time
(i.e. P == NP), then you can use that algorithm to solve TSP (in fact, any
finite optimization problem) in polynomial time. There is a sketch of a proof
of this in one of my previous HN comments; I'm pretty sure this is well known
to complexity theorists, though there seems to be a fair amount of confusion
about it on the web.
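
For concreteness, here is a minimal sketch of that argument, assuming
non-negative integer edge weights and a hypothetical polynomial-time decision
oracle tour_exists(weights, k) answering "is there a tour of total weight at
most k?". The names and representation below are my own, purely for
illustration.

    def optimal_tour_value(weights, tour_exists):
        # Binary search for the optimal tour value. `weights` maps
        # frozenset({u, v}) -> non-negative integer cost; a tour is assumed
        # to exist (e.g. the graph is complete).
        lo, hi = 0, sum(weights.values())   # total cost of all edges is an upper bound
        while lo < hi:                      # O(log(sum)) oracle calls, i.e.
            mid = (lo + hi) // 2            # polynomial in the input size
            if tour_exists(weights, mid):
                hi = mid
            else:
                lo = mid + 1
        return lo

    def optimal_tour_edges(weights, tour_exists):
        # Self-reducibility: once the optimal value is known, try to "delete"
        # each edge by pricing it out; keep the deletion only if an optimal
        # tour survives. The edges left at their original cost form one
        # optimal tour.
        best = optimal_tour_value(weights, tour_exists)
        big = sum(weights.values()) + 1     # larger than any tour of weight <= best
        w = dict(weights)
        for e in list(w):
            saved = w[e]
            w[e] = big
            if not tour_exists(w, best):    # edge was essential: put it back
                w[e] = saved
        return [e for e in w if w[e] != big]

Binary search needs one oracle call per bit of the cost bound and the
edge-elimination pass needs one call per edge, so the whole procedure is
polynomial whenever the oracle is.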

~~~
gweinberg
That doesn't seem at all obvious. It's pretty easy to prove that if you can
answer whether or not there exists a path of length k in polynomial time then
you can also find what that path is in polynomial time. They prove this in the
Udacity course on complexity, which I found very interesting if not
necessarily practically useful. But it seems to me that the number of possible
values of k grows exponentially with problem size, so you can't just run your
decision algorithm on each candidate value of k to find the optimal k in
polynomial time.

~~~
ColinWright
The maximum possible tour length is bounded by the number of edges times the
maximum edge cost, and binary search over that range takes a number of
decision queries proportional to the number of bits in that bound, which is
polynomial in the size of the input.

~~~
wolfgke
It depends on what encoding is used. If the edge costs are integers, this is
true. Such bounds are also possible for rational costs (but you have to pay
more attention). It gets a lot more difficult if non-rational edge weights are
allowed.

~~~
tgflynn
I think we're talking here about problems that can be fed to a digital
computer and hence can be represented by a finite number of bits. I believe
that would exclude non-rational weights.

~~~
wolfgke
No, it wouldn't exclude non-rational weights. Consider for example the field
Q[sqrt{2}]. Any element a + b * sqrt{2} of it can be represented by two
rational numbers a, b. Since any field extension is a vector space (in this
case of dimension 2), addition can be implemented trivially, and
multiplication is

(a1, a2) * (b1, b2) = (a1 * b1 + 2 * a2 * b2, a1 * b2 + a2 * b1) (#)

A formula for inversion can easily be derived from (#).

TL;DR: there are other encodings besides binary fractions.
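
A small sketch of such an exact encoding, assuming rational coefficients; the
class below is my own illustration of formula (#), not anything from the
comment above.

    from fractions import Fraction

    class QSqrt2:
        # Exact arithmetic in Q[sqrt{2}]: the value a + b*sqrt(2) is stored
        # as two Fractions (a, b), so no rounding ever happens.
        def __init__(self, a, b=0):
            self.a, self.b = Fraction(a), Fraction(b)

        def __add__(self, other):
            return QSqrt2(self.a + other.a, self.b + other.b)

        def __mul__(self, other):
            # (a1, a2) * (b1, b2) = (a1*b1 + 2*a2*b2, a1*b2 + a2*b1), i.e. (#)
            return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                          self.a * other.b + self.b * other.a)

        def inverse(self):
            # Multiply by the conjugate a - b*sqrt(2); the norm a^2 - 2*b^2 is
            # a nonzero rational for any nonzero element, since sqrt(2) is
            # irrational.
            n = self.a * self.a - 2 * self.b * self.b
            return QSqrt2(self.a / n, -self.b / n)

        def __repr__(self):
            return f"{self.a} + {self.b}*sqrt(2)"

For example, QSqrt2(1, 1).inverse() gives -1 + 1*sqrt(2), the exact value of
1/(1 + sqrt(2)) = sqrt(2) - 1, with no rounding anywhere.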

------
tsahyt
This might come across as quite harsh but the article could have been much
shorter.

NP is the class of _decision_ problems that can be solved by a non-
deterministic Turing machine in polynomial time. A problem is called NP-hard
when there is a polynomial-time reduction of 3-SAT to said problem. A problem
is called NP-complete if it is both NP-hard and in NP. Since TSP is not a
decision problem, it is not in NP and therefore cannot be NP-complete.
However, it is NP- _hard_.

One can call this a minor nitpick. I still point it out every now and then
when I hear people talking about such things. However, the really important
thing is that we haven't found a way to solve NP-hard (and hence of course NP-
complete) problems efficiently on a deterministic Turing machine (and hence
computers as we can actually build them). An informal way of saying the same
thing is "TSP is really really hard to solve".

Either way, I'm always surprised at how many of the problems we encounter in
our everyday lives are actually NP-hard. Almost every decent puzzle game, for
instance, is NP-hard. Goes to show that we don't really enjoy solving easy
problems.

EDIT: Of course there's also a decision version of TSP, as others have pointed
out. That version _is_ NP-complete.

~~~
gizmo686
>Every decent puzzle game for instance is usually NP-hard.

An interesting question is how many of these puzzles are NP-hard in practice.
For example, Minesweeper is NP-hard. However, it is possible to solve many
games of Minesweeper using simple deductive reasoning in polynomial time. This
leads to the question of the average difficulty of a random game. My best
attempt at formalizing this question is to imagine reducing every game with a
polynomial-time algorithm, then looking at the complexity of the remaining
games. The complexity of an original game of size n could be the average after
you reduce all games of size n with your polynomial-time algorithm. If this
new complexity is sub-exponential, then it feels as if there is some sense in
which the game is not often NP-hard.
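
To make "simple deductive reasoning in polynomial time" concrete, here is a
hedged sketch of one pass of the two classic Minesweeper rules; the board
representation and names are my own, not taken from any particular solver.

    def deduce_once(numbers, flagged, hidden, neighbours):
        # One polynomial-time deduction pass over a Minesweeper board.
        #   numbers:    dict cell -> revealed neighbour-mine count
        #   flagged:    set of cells already marked as mines
        #   hidden:     set of unrevealed, unflagged cells
        #   neighbours: dict cell -> set of adjacent cells
        # Returns (safe_cells, mine_cells) found by the two classic rules.
        safe, mines = set(), set()
        for cell, n in numbers.items():
            unknown = neighbours[cell] & hidden
            flags = len(neighbours[cell] & flagged)
            if not unknown:
                continue
            if flags == n:                      # all mines already accounted for:
                safe |= unknown                 # the remaining neighbours are safe
            elif flags + len(unknown) == n:     # every unknown neighbour is a mine
                mines |= unknown
        return safe, mines

Running such a pass to a fixed point takes polynomially many iterations; a
board is "easy" in the sense above exactly when that alone finishes it.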

A more concrete example that comes to mind is Haskell's type inference. In
general this type inference is NP-hard. However, there are algorithms that can
solve it in polynomial time assuming a limit on nesting depth. Because no such
limit actually exists, type inference is still exponential in general. But
because we never see arbitrarily deep nesting (except in contrived examples),
this does not matter in practice.

~~~
tsahyt
Many NP-hard or NP-complete problems have (infinitely many) instances that are
"easy" to solve. It is quite straightforward to write up a procedure that
produces infinitely many instances of 3-SAT that can be solved in polynomial
time by the DPLL algorithm. But it is important to note that a _problem_ is
actually a set of instances.
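
As a hedged illustration (the encoding and generator below are mine): a family
of 3-CNF instances, one for each n, that the unit-propagation core of DPLL
finishes in linear time.

    def easy_instance(n):
        # A unit-forced implication chain x1 -> x2 -> ... -> xn, written as
        # 3-CNF. Clauses are lists of non-zero ints (DIMACS-style literals);
        # literals are repeated to pad each clause to width 3.
        clauses = [[1, 1, 1]]                                  # forces x1 = True
        clauses += [[-i, i + 1, i + 1] for i in range(1, n)]   # x_i implies x_{i+1}
        return clauses

    def unit_propagate(clauses):
        # Repeatedly satisfy forced (unit) literals. This alone solves the
        # instances above; conflicts and real search are not handled, since
        # these instances never need them.
        assignment = {}
        clauses = [set(c) for c in clauses]
        while clauses:
            units = [next(iter(c)) for c in clauses if len(c) == 1]
            if not units:
                break                          # a real solver would branch here
            lit = units[0]
            assignment[abs(lit)] = lit > 0
            clauses = [c - {-lit} for c in clauses if lit not in c]
        return assignment

unit_propagate(easy_instance(n)) returns the all-True assignment after a
linear number of steps, no matter how large n gets.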

Your observation basically boils down to "how rare is the worst case?". There
are algorithms with exponential worst-case complexity that are used all over
the place because, on average, they perform really well. The simplex algorithm
for linear programming is an example. Every deterministic pivoting rule for
this algorithm has known worst-case instances on which it takes exponential
time. However, on average it is quite a fast algorithm that can handle
millions of variables and corresponding constraints on modern hardware. Still,
that doesn't change the fact that, from a theoretical point of view, simplex
cannot be _guaranteed_ to run faster than exponential time (linear programming
itself can be solved in polynomial time by other methods, such as
interior-point algorithms). As is the case with DPLL, good heuristics can
speed up an algorithm a lot without changing its worst-case complexity. Other
examples are all over the place in AI. It's also in AI where the concept of
_pruning_ is probably encountered most often. This is the practice of
discarding whole subtrees of the search tree by some sort of heuristic or
deductive reasoning.
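
Since pruning comes up, here is a toy branch-and-bound TSP solver; it is my
own illustration of the idea, not something from the article, and it assumes a
complete graph with non-negative, symmetric weights.

    def tsp_branch_and_bound(weight, n):
        # Branch and bound with the simplest possible pruning rule: any
        # partial tour whose cost already matches or exceeds the best complete
        # tour found so far is dropped, along with the entire search subtree
        # below it. weight[(u, v)] must be defined for every ordered pair of
        # cities 0..n-1.
        best = [sum(weight[(i, (i + 1) % n)] for i in range(n))]  # tour 0, 1, ..., n-1

        def extend(path, cost, remaining):
            if cost >= best[0]:            # prune: cannot improve on the best tour
                return
            if not remaining:
                best[0] = min(best[0], cost + weight[(path[-1], path[0])])  # close the circuit
                return
            for city in remaining:
                extend(path + [city], cost + weight[(path[-1], city)],
                       remaining - {city})

        extend([0], 0, set(range(1, n)))
        return best[0]

With a stronger lower bound (a minimum-spanning-tree bound, say) the pruning
becomes far more aggressive, but the worst case remains exponential either
way.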

Those things aside, I think what we should be looking at from a theoretical
point of view is when it makes sense to consider average-case complexity
instead of worst-case complexity. This brings us back to "how rare is the
worst case". Quicksort, for instance, is said to be an O(n log n) algorithm,
but it has O(n^2) worst-case complexity (trivially so). However, degenerate
behaviour can be detected cheaply while the algorithm runs, for example by
watching the recursion depth, and a good sorting implementation switches to a
different algorithm once this has been detected (this is essentially what the
std::sort implementations in the STL do). This is definitely a case where it
makes sense to "forget" worst-case complexity.
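
A minimal introsort-style sketch of that fallback; this is an illustration of
the idea, not the actual STL code.

    import heapq
    from math import log2

    def introsort(a):
        # Quicksort that falls back to heapsort once the recursion gets
        # suspiciously deep, which caps the worst case at O(n log n).
        def heapsort_slice(lo, hi):
            sub = a[lo:hi]
            heapq.heapify(sub)
            a[lo:hi] = [heapq.heappop(sub) for _ in range(len(sub))]

        def sort(lo, hi, depth):
            if hi - lo <= 1:
                return
            if depth == 0:                 # too deep: quicksort is degenerating
                heapsort_slice(lo, hi)
                return
            pivot = a[hi - 1]              # Lomuto partition around the last element
            i = lo
            for j in range(lo, hi - 1):
                if a[j] <= pivot:
                    a[i], a[j] = a[j], a[i]
                    i += 1
            a[i], a[hi - 1] = a[hi - 1], a[i]
            sort(lo, i, depth - 1)
            sort(i + 1, hi, depth - 1)

        sort(0, len(a), 2 * int(log2(len(a))) if a else 0)

Calling introsort(xs) sorts xs in place; the depth cap of roughly 2*log2(n)
guarantees the heapsort fallback kicks in before quadratic behaviour can
develop.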

For SAT, researchers have actually worked up statistics on how hard certain
instances are in terms of the number of clauses and the number of symbols. It
turns out that there is a range of ratios between the two where problems are
the hardest (roughly where the clause-to-symbol ratio is between 4 and 6). I
don't have a source right now, but googling "satisfiability threshold
conjecture" might
turn up some results.

As such, yes, not all _instances_ of NP-hard problems are actually hard to
solve, but there are infinitely many that are. When encountering such problems
in practice, one can usually exploit the special properties of the instance
(or class of instances) one actually encounters and devise an algorithm that
performs rather well on average.

------
poke111
As others have pointed out, this glosses over the distinction between the
decision version of the problem and the optimization version. But the reason
it often gets glossed over is that it doesn't make much of a difference when
it comes to the complexity class of the problem. In this case, the decision
version of the problem is: does a path exist shorter than K? Once you have a
solution to that, you can use binary search to answer the optimization
version: find the smallest k such that a path of length k exists and no
shorter path does.

But when it comes to the question of verifiers, the distinction does matter.
Because NP is only concerned with decision problems, the OP is incorrect when
he says that there's no poly-time verifier: there is one for the decision
version of the problem, and that's all that matters.
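
For concreteness, here is a sketch of such a polynomial-time verifier for the
decision version ("is there a tour of length at most K?"); the certificate is
simply a proposed tour, and the names below are mine.

    def verify_tour(n, weight, tour, K):
        # Accept iff the certificate `tour` really is a tour of total length <= K.
        #   n:      number of cities, labelled 0..n-1
        #   weight: dict (u, v) -> edge cost, assumed symmetric
        #   tour:   the certificate, a proposed ordering of the cities
        #   K:      the bound from the decision question
        if sorted(tour) != list(range(n)):     # must visit every city exactly once
            return False
        total = 0
        for i in range(n):
            u, v = tour[i], tour[(i + 1) % n]  # wrap around to close the circuit
            if (u, v) not in weight:
                return False
            total += weight[(u, v)]
        return total <= K

The check runs in polynomial (in fact near-linear) time, which is exactly what
membership of the decision version in NP requires.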

------
slacka
I'm a little confused by this issue. According to Proof Wiki: "Because the
Traveling Salesman problem is both NP and NP-hard the Traveling Salesman
Problem _is_ NP-complete."

[http://www.proofwiki.org/wiki/Traveling_Salesman_Problem_is_...](http://www.proofwiki.org/wiki/Traveling_Salesman_Problem_is_NP-complete#The_Traveling_Salesman_Problem_is_NP)

Here's another site claiming it's not even in NP:
[http://www.nomachetejuggling.com/2012/09/14/traveling-salesm...](http://www.nomachetejuggling.com/2012/09/14/traveling-salesman-the-most-misunderstood-problem/)

I'd love a clear explanation.

~~~
praptak
NP is a class of _decision_ problems - the solution has to be "yes" or "no".
NP-complete problems are a subset of NP, so only decision problems can be NP-
complete.

TSP, the optimization version (what's the shortest path?), is not a decision
problem and is therefore NP-_hard_ rather than NP-_complete_. TSP, the
decision version (is there a path shorter than K?), _is_ NP-complete.

------
mydpy
The title is misleading! I was so excited... and then abruptly disappointed.
:(

~~~
Pirate-of-SV
First I was petrified then I was relieved.

~~~
arakawa
Cats and dogs living together, mass hysteria, P=NP!

------
j2kun
The OP's update is still incorrect. A problem in NP is a set of strings which
a deciding Turing machine must accept (and it must reject all other strings).
You cannot define a problem in NP by what the machine outputs, because for a
decision problem you can only output a single bit.

What the OP meant to say is that you can define TSP as a decision problem as
follows:

TSP = {<G, T> : G is a weighted graph, and T is a tour of minimal total
weight}

Then this achieves what we want: that deciding whether a given tour has
minimal weight is NP-hard but not known to be in NP. You can also formulate
the TSP problem as a decision problem this way:

TSP = {<G, v>: G is a weighted graph with a tour of weight at most v}.

This formulation is essentially equivalent to the first problem.

------
taejo
> The problem is to find a circuit that goes through each city once and that
> ends where it starts. This in itself isn't difficult.

This is the Hamiltonian circuit problem, and is NP-complete.

And the OP doesn't know what a decision problem is.

Not the best person to be correcting other people's confusion.

