Hacker News new | past | comments | ask | show | jobs | submit login

It is absolutely an algorithm in the sense of "a set of rules to be followed". I think you mean that it doesn't guarantee an optimal solution. That just means it's a heuristic algorithm, same as simulated annealing is a heuristic algorithm for solving optimisation problems.

Nope. An algorithm has to be effective. You can find pathological cases for k-means such that it will never converge on anything useful. So if you set your termination case to be convergence it will never terminate and if you don't then it will never be effective.

I think you might be in the minority in this opinion. Many algorithms have pathological cases but are still considered algorithms

Minority? This is directly from Knuth.

Knuth defines effectiveness as: "... all of the operations to be performed in the algorithm must be sufficiently basic that they can in principle be done exactly and in a finite length of time by a man using paper and pencil."

K-means and other heuristic algorithms fit that description.

BogoSort is an algorithm. Not a very good algorithm, but an algorithm nevertheless.

No it absolutely is not.

The lack of fundamental computer science knowledge in this thread is alarming.

Negative 2 points on a post saying that a computational method that possibly never terminates is not an algorithm... Oh dear...

It never terminates with probability 0.

As far as I can tell you're only arguing against poor implementations of K-means. If you demand that the score strictly improves at each iteration then the algorithm must terminate.

And how do you "demand that the score strictly improves"? It's an NP-hard problem.

K-means implementations generally terminate once there's an iteration where the score doesn't improve. This happens when there is convergence to a local minimum or - less likely - the state hops between two nearby local minima with the same score. But it will terminate on something, and most of the time that something will be pretty good.

I saw your mention of Knuth elsewhere, I looked it up and he demanded that

> An algorithm must always terminate after a finite number of steps ... a very finite number, a reasonable number

This is a pretty niche characterization and almost certainly not what the original post was asking for. However, I concur that there is no guarantee on how quickly K-means terminates or on how good the output will be,. But... if you're going to be that strict about it you would even have to rule out the Simplex Algorithm, which everyone I've ever spoken to thinks of as an algorithm.

In that sense kmeans may be better referred to as a 'computational method' rather than an algorithm.


Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact