
> Much better than a "this will be fast, if you're lucky" distribution.

This is a really wrongheaded way to look at this. B-trees are themselves an adaptive function that can be used at the top or bottom of an index. We've just got other such functions being proposed here that are much faster and smaller, when they work. And there is a methodology here to identify and train them, and to tell when they do.

This exists in sharp contrast to our existing family of algorithms, which tend to lack any sort of introspection in their operation. Tree building, for example, is (like many divide-and-conquer algorithms, which are really trees in disguise) incredibly sensitive to the ordering of its inputs, and the algorithms that balance trees use only structural information and do the balancing at the last possible (and often most expensive) moment.

What's most amazing about this approach is that you've got a recursive model trained in layers, where each layer proposes a submodel, or "an expert which has a better knowledge about certain keys."

That model could be a tree, or a b-tree, or a linear regression, or a neural network. They actually test using other tree types and search algorithms (which are implicitly tree types) in the paper.

This process adaptively trains models which can, in a very real sense, increase and decrease their level of detail as the structure of the data demands.
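To make that concrete, here's roughly what a two-stage version of that looks like. This is my own sketch, not the paper's code: I'm using plain linear regressions (np.polyfit) for both stages, a fixed number of second-stage models, and a bounded local search around the prediction; the paper also uses neural nets, varies the model count, and swaps in B-trees where a learned model's error is too large.

    import numpy as np

    class TwoStageRMI:
        """Toy recursive model index: a root model routes each key to one of
        several second-stage models; each of those predicts the key's position
        in the sorted array, with a recorded max error so lookup stays exact."""

        def __init__(self, keys, num_leaf_models=100):
            self.keys = np.asarray(keys)            # sorted key array
            self.n, self.m = len(self.keys), num_leaf_models
            positions = np.arange(self.n)

            # Stage 1: one linear model over all keys, used only for routing.
            self.root = np.polyfit(self.keys, positions, 1)

            # Stage 2: one linear model per partition, plus its max error.
            self.leaves = []
            routed = self._route(self.keys)
            for j in range(self.m):
                idx = positions[routed == j]
                if len(idx) == 0:
                    self.leaves.append((np.array([0.0, 0.0]), 0))
                    continue
                coeffs = (np.polyfit(self.keys[idx], idx, 1) if len(idx) > 1
                          else np.array([0.0, float(idx[0])]))
                err = int(np.max(np.abs(np.polyval(coeffs, self.keys[idx]) - idx)))
                self.leaves.append((coeffs, err))

        def _route(self, key):
            pred = np.polyval(self.root, key) / self.n * self.m
            return np.clip(pred, 0, self.m - 1).astype(int)

        def lookup(self, key):
            coeffs, err = self.leaves[int(self._route(key))]
            guess = int(np.clip(np.polyval(coeffs, key), 0, self.n - 1))
            lo, hi = max(0, guess - err - 1), min(self.n, guess + err + 2)
            # Bounded local search around the predicted position.
            return lo + int(np.searchsorted(self.keys[lo:hi], key))

    keys = np.sort(np.random.randint(0, 10**8, size=100_000))
    rmi = TwoStageRMI(keys)
    q = keys[12345]
    assert keys[rmi.lookup(q)] == q

The nice property is that lookup cost is bounded by each submodel's recorded error rather than by log n, so partitions the models fit well get very cheap lookups.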

And in the latter case (which seems to excite people less but excites me more) they propose a Bloom filter alternative that can automatically identify useful structural elements and latch onto those as part of its categorization, which is perhaps one of the most revolutionary ideas I've seen for bringing ML into the distributed systems world.
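For anyone who hasn't read that part: the construction is a classifier screening queries, with a small conventional Bloom filter built only over the keys the classifier misses, so the no-false-negatives guarantee survives. Here's a toy sketch of the mechanics (my own code, with a stand-in prefix heuristic where the paper trains a real model over the key strings):

    import hashlib

    class BloomFilter:
        """Plain Bloom filter used as the backup structure."""
        def __init__(self, num_bits=1 << 16, num_hashes=4):
            self.bits = bytearray(num_bits // 8)
            self.m, self.k = num_bits, num_hashes

        def _positions(self, item):
            for i in range(self.k):
                h = hashlib.blake2b(f"{i}:{item}".encode(), digest_size=8)
                yield int.from_bytes(h.digest(), "big") % self.m

        def add(self, item):
            for p in self._positions(item):
                self.bits[p // 8] |= 1 << (p % 8)

        def __contains__(self, item):
            return all(self.bits[p // 8] & (1 << (p % 8))
                       for p in self._positions(item))

    class LearnedBloomFilter:
        """Classifier in front, backup filter behind for the keys it misses."""
        def __init__(self, keys, model, threshold=0.5):
            self.model, self.threshold = model, threshold
            self.backup = BloomFilter()
            for key in keys:
                if model(key) < threshold:
                    self.backup.add(key)

        def __contains__(self, key):
            return self.model(key) >= self.threshold or key in self.backup

    # Stand-in "model": scores keys by a structural cue (here, a prefix).
    # In the paper this would be a trained classifier over the key strings.
    def toy_model(key):
        return 1.0 if key.startswith("user:") else 0.0

    keys = [f"user:{i}" for i in range(9000)] + [f"legacy-{i}" for i in range(1000)]
    lbf = LearnedBloomFilter(keys, toy_model)
    assert all(k in lbf for k in keys)   # no false negatives by construction
    print("odd query present?", "cart:42" in lbf)

The win comes when the model captures most of the keys' structure, because then the backup filter only has to cover the stragglers and can be tiny.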




B-trees incrementally rebalance on every insert

https://en.wikipedia.org/wiki/B-tree


I thought maybe I had made an error in writing or editing that implied b-trees were not self-balancing, so I reread it twice. Maybe I'm just not seeing where I implied otherwise.

That doesn't mean the amount of computation, or the number of memory comparisons, is independent of the order of the input.
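Quick toy demonstration of what I mean (my own CLRS-style B-tree, nothing from the paper), counting key comparisons and node splits for sorted vs. shuffled inserts of the same keys:

    import random

    class Node:
        def __init__(self, leaf=True):
            self.keys, self.children, self.leaf = [], [], leaf

    class BTree:
        """Minimal B-tree (min degree t) that counts key comparisons and node
        splits, to show the work done depends on insert order even though the
        tree stays balanced. Toy code, not tuned for real use."""
        def __init__(self, t=16):
            self.t, self.root = t, Node()
            self.comparisons = self.splits = 0

        def _find_index(self, node, key):
            # Linear scan from the right, counting every key comparison.
            i = len(node.keys) - 1
            while i >= 0:
                self.comparisons += 1
                if key >= node.keys[i]:
                    break
                i -= 1
            return i + 1

        def _split_child(self, parent, i):
            self.splits += 1
            t, child, new = self.t, parent.children[i], Node()
            new.leaf = child.leaf
            parent.keys.insert(i, child.keys[t - 1])
            new.keys, child.keys = child.keys[t:], child.keys[:t - 1]
            if not child.leaf:
                new.children, child.children = child.children[t:], child.children[:t]
            parent.children.insert(i + 1, new)

        def _insert_nonfull(self, node, key):
            i = self._find_index(node, key)
            if node.leaf:
                node.keys.insert(i, key)
                return
            if len(node.children[i].keys) == 2 * self.t - 1:
                self._split_child(node, i)
                if key > node.keys[i]:
                    i += 1
            self._insert_nonfull(node.children[i], key)

        def insert(self, key):
            if len(self.root.keys) == 2 * self.t - 1:
                new_root = Node(leaf=False)
                new_root.children.append(self.root)
                self.root = new_root
                self._split_child(new_root, 0)
            self._insert_nonfull(self.root, key)

    n = 100_000
    for name, order in [("sorted", list(range(n))),
                        ("shuffled", random.sample(range(n), n))]:
        tree = BTree()
        for k in order:
            tree.insert(k)
        print(f"{name:8s} comparisons={tree.comparisons} splits={tree.splits}")

The tree ends up balanced either way, but the comparison and split counts come out noticeably different between the two orders, which is all I was claiming.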



