Their hypercube covering formalism can be seen as decision tree induction with a specific partitioning rule, where branching terminates only at uniformly labeled leaves. But they are using the tree nodes as a kind of embedding to apply a softmax on. I like the connection between ReLUs and the geometrical representation; it makes it easier to think about in spatial terms.
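To make the decision-tree reading concrete, here is a minimal sketch, not the paper's actual method. The partitioning rule (cycle through the axes and split at the median of the distinct values) and the names grow, leaf_index and embed are my own assumptions for illustration. It splits until every leaf is uniformly labeled, then uses the one-hot leaf index as the embedding a softmax layer could be trained on.

```python
from collections import Counter

def grow(points, labels, leaves, depth=0):
    """Split recursively until each leaf holds a single label."""
    if len(set(labels)) == 1:
        leaves.append(labels[0])
        return {"leaf": len(leaves) - 1}
    dim = len(points[0])
    # Assumed partitioning rule: try axes in rotation, skipping constant ones.
    for k in range(dim):
        axis = (depth + k) % dim
        vals = sorted({p[axis] for p in points})
        if len(vals) > 1:
            break
    else:
        # Identical points with conflicting labels: fall back to the majority label.
        leaves.append(Counter(labels).most_common(1)[0][0])
        return {"leaf": len(leaves) - 1}
    t = vals[(len(vals) - 1) // 2]  # median of distinct values, always below the max
    lo = [i for i, p in enumerate(points) if p[axis] <= t]
    hi = [i for i, p in enumerate(points) if p[axis] > t]
    return {
        "axis": axis,
        "t": t,
        "lo": grow([points[i] for i in lo], [labels[i] for i in lo], leaves, depth + 1),
        "hi": grow([points[i] for i in hi], [labels[i] for i in hi], leaves, depth + 1),
    }

def leaf_index(tree, x):
    """Walk from the root to the leaf (axis-aligned box) containing x."""
    while "leaf" not in tree:
        tree = tree["lo"] if x[tree["axis"]] <= tree["t"] else tree["hi"]
    return tree["leaf"]

def embed(tree, x, n_leaves):
    """One-hot leaf membership, usable as input to a softmax classifier."""
    idx = leaf_index(tree, x)
    return [1.0 if i == idx else 0.0 for i in range(n_leaves)]

# XOR-style toy data: no single axis-aligned split separates the labels.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]
leaves = []  # leaves[i] is the label stored at leaf i
tree = grow(points, labels, leaves)
```

Because branching stops only at uniformly labeled leaves, every training point lands in a leaf carrying its own label, which is the usual memorize-everything behaviour of a fully grown tree.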

Reading this I had several moments of déjà vu from my grad school classes on classical ML. I like the direction, but it feels like it could be better if it admitted that it's a variant of decision tree embedding and built on some of the massive amount of research in that area. At least in terms of understanding.

I suspect doing a random forest version of this would actually help. Perhaps we will see this as a legit pre-training step.

> terminating branching only at uniformly labeled leaves

Also called a perfect decision tree.
