Hacker News new | comments | show | ask | jobs | submit login
Using rational numbers to key nested sets [pdf] (arxiv.org)
70 points by espeed 55 days ago | hide | past | web | 11 comments | favorite

This builds on Vadim Tropashko's work related to nested intervals and is related to the Stern-Brocot and Farey sequence and Calkin-Wilf tree:

Nested Intervals Tree Encoding with Continued Fractions https://arxiv.org/pdf/cs/0402051.pdf

Nested Intervals Tree Encoding in SQL http://www09.sigmod.org/sigmod/record/issues/0506/p47-articl...

Nested Set Model https://en.wikipedia.org/wiki/Nested_set_model

Farey sequence https://en.wikipedia.org/wiki/Farey_sequence

Stern–Brocot tree https://en.wikipedia.org/wiki/Stern%E2%80%93Brocot_tree

Calkin–Wilf tree https://en.wikipedia.org/wiki/Calkin%E2%80%93Wilf_tree

Recounting the rationals https://www.math.upenn.edu/~wilf/website/recounting.pdf

The Stern-Brocot or Farey Tree http://oeis.org/stern_brocot.html

Infinite Fractions - Numberphile [video] https://www.youtube.com/watch?v=DpwUVExX27E part 2: https://www.youtube.com/watch?v=zv5M9zuZiXo

I explored Tropashko's work in PostgreSQL back in 2004:


I updated it in April 2016 to then-current PostgreSQL versions. It was interesting to me to notice not only the improvements in PostgreSQL over those 12 years, but also how my own practices had changed.

Supercool. I tried to wrap my head around the reverse continued fractions technique a few years ago [1] and got lost evaluating the other methods.

Is it reasonable to call this a bidirectional version of Tro05? Skipping the inequality flips and the "lowest index last" by/and encoding snv and sdv (all of them, before they exist, in order) with their parent seems like a very useful improvement to only being able to calc the parents from a sibling and having to count down as the tree grows. Hopefully I didn't butcher that, corrections appreciated.

Predicates 2.5 and 2.6 intuitively are more descriptive and easier to visualize than the Cel04 nested sets 1.1 and 1.2.

numberphile and all of Brady's chans[2] are pure platinum.

[1] http://911datasets.org/index.php/Help:Tree_Structures

[2] https://www.youtube.com/user/BradyStuff


> A Novel Method for Representing Hierarchies in a Relational Database Using Bignums and SQLite

Stephen Huntley, Tcl conference 2011.


Does this sound a bit like using arithmetic coding (when done for compression)? I remember implementing that a while back and seemed to be based on a similar idea.

We were really happy to read the work of Vadim Tropashko but found during prototyping that we couldn't use it because alternate rows reversed the order of siblings. This encoding started out as a simple stab at skipping every second row to correct the sibling ordering problem. But the maths fell out so easily it seemed worth writing it up. Vadim incidentally was particularly approachable at the time. Lovely fellow.

By essentially skipping every second row of Vadim's, you end up only being able to go about half as deep before you blow BIGINT. And that wasn't so deep to begin with.

I haven't visited this stuff in many years now, but it seems likely that if you were to use the pgmp postgreSQL extension, or something similar that provided arbitrary precision integers/rationals, you would solve the depth limitation.

I'd be really interested to hear about it.

Hi Dan - Nice paper. Have you seen "FUNCTIONAL PEARL: Enumerating the Rationals"? -- I just discovered it today:


I remembered Dan working on this project many years ago. The key idea is to create hierarchies in SQL without having to do self joins. This can be useful for computing permissions quickly, for instance, a department manager may have permissions for his/her department, while someone working under the manager would have a subset of his permissions. One of the problems encountered is there is a limit to how deep the hierarchy can be, and the modelling limitations of mapping permissions to an org chart.

Can anyone explain what is interesting in this work and how can it be applied? Maybe an ELI5?

You can encode and store an entire tree path/sequence in a single rational number.

The encoded number is reversible -- the number is encoded in such a way that you can algorithmically calculate/derive each node's parents and ancestors without querying the database or traversing the tree. Possible siblings and children can be calculated the same way (if they exist).

More exotic adaptations might include a form of semantic content-based addressing by mapping the code to coordinates in a metric space. For example, see the similar Calkin-Wilf H-tree layout: https://en.wikipedia.org/wiki/Calkin–Wilf_tree#Definition_an...

this could be helpful to iterate through tree-like structures. Such as trees. * I was going to mention Calkin-Wilf * discussed below.

Conway games also take values that are fractions but only dyadic numbers (2^k as denominator) https://en.wikipedia.org/wiki/On_Numbers_and_Games

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact