Hacker News new | comments | ask | show | jobs | submit login

In this case the fundamental idea is not especially complicated, but the math may make it seem more complex than it is to somebody unfamiliar with the notation/vocabulary.

Imagine the case where an input dataset has three dimensions: each piece of input data could be represented as a point in 3D space. We want to classify each point with a category. Imagine a cube containing all such points, and imagine the cube was subdivided into many tiny cubes of the same size. If these tiny cubes were sufficiently small, then each cube would only contain points belonging to a single category (sufficiently small cubes would only contain a single point, making this trivially true). We could then assign to each cube a category based on the category of the points in it, then if we wanted to classify a new point, we'd just check what cube it's inside and look up the category of that cube.

The above process doesn't scale well (would be too slow in practice). An alternative is, rather than subdividing the cube into tiny evenly-sized cubes, we use an iterative process, starting with the whole cube, finding an ideal dimension along which to split it, splitting it (subject to the constraint that splitting it wouldn't make the children too small), then repeating this process with the two 3D rectangles that resulted from splitting the original shape. This could produce extremely long, thin 3D rectangles however, which might not generalise well to new data, so we cap the aspect ratio of the child rectangles so they can't get too thin. This approach may leave some areas of the original cube unclassified, so for each such area we find the category of the 3D rectangle that the unclassified areas touches most, and assign its category to the unclassified area, then repeat until there are no unclassified areas remaining. The end result approximates the result of subdividing the cube into tiny cubes, but is significantly faster to produce.

The above approach is roughly what's described in the paper, except they describe an n-dimensional hypercube rather than just a 3-dimensional one, and provide more detail on how to determine the ideal axis along which to split a hyperrectangle into child hyperrectangles.

This sounds like a K-d tree. Am I understanding it right or have I missed something?

Yep, pretty much, except it's constructed differently to how we'd normally construct a K-d tree.

Cool, thanks for the translation! I'm gonna go read the paper now...

Thanks for the tldr. This sounds rather a lot like a tree based method. How fair is that analogy?

It seems fair to me, in the sense that they are effectively constructing a kind of decision tree. There may be some subtle-yet-significant differences, but they don't contrast their model with other decision trees in the paper.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact