Hacker Newsnew | comments | show | ask | jobs | submit login

Why not index size(key)?

Then lookup the n-th largest key and then pull the values?

eg. SELECT value from key_value_pairs order by size(key) LIMIT N




I don't think that's what the OP wants. The real result expected is supposed to be ordered by the value of the key itself rather than the size of the key.

-----


I don't get it,

How would you use this approach to find say, the 195687th largest element in O(log(n)) time?

Edit: However, I do know I can write a SQL query to find this - an ordinary index plus limit should do this. It's just such an approach gets really awkward and so unpredictable when you are dealing with a lot of distinct columns and tables - why I implemented this with key-value DB (a custom index on top of Kyoto Cabinet).

-----


Why does lots of columns and tables make this approach awkward? Sure you might need a couple computed columns, and a little table de normalization, but IMHO most of where NoSQL gets it's performance is from de normalization.

Regardless of how many elements compose the key the calculation is the same. I'd be surprised if you couldn't approach O(log(n)) time with a good SQL implementation, some computed columns and/or indexed views.

Just wondering was the b-tree implementation you did a pure B-Tree or was the data structure itself modified in some way. eg. storing the number of elements to the left or right?

-----


You certainly could do a SQL implementation. For my purposes, each of the many queries was going to involve something like at least six lines of logic (it was "find the nth value satisfying X, Y, and Z condition"). Debugging and trying make sure the indices worked correctly for each query looked uglier than creating a whole different structure - my structure was native C++ so there wasn't the translation issues of SQL.

I used a B-tree modified by adding a member variable including the total number of children of each parent. Its so simple a modification I don't know why more implementations don't use it but, it seems they don't.

-----




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact

Search: