Show HN: Array with Constant Time Access and Fast Insertion and Deletion (github.com/igushev)
300 points by igushev on Sept 4, 2019 | 103 comments



For those reading along, the post describes a vector-like data structure that has O(1) access time and O(sqrt(N)) insertion and deletion time at an arbitrary index. The idea is pretty clever. The data structure maintains sqrt(N) circular arrays, each of size sqrt(N). For reference, a circular array has O(1) access time and O(1) insertion and deletion time at its ends.

For accesses, you can in constant time determine which circular array contains the element at the given index (it's simply i / sqrt(N), with the offset within that array being i % sqrt(N)) and then in constant time access the element from the underlying array. For insertions and deletions, you find the circular array that contains the location you want to insert into. First you make sure there is space in that array by moving one element from each circular array to the next one. Since deleting and inserting at the ends of a circular array takes O(1) and there are O(sqrt(N)) arrays, this takes O(sqrt(N)) time in total. Then you insert the new element into the middle of the designated circular array, which has size sqrt(N), so this takes O(sqrt(N)) in the worst case. Insertions therefore take O(sqrt(N)) time in total.
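To make the index math concrete, here's a minimal sketch (my own illustrative C++, not the repository's code; B stands for sqrt(N)):

    #include <cstddef>

    struct Position {
        std::size_t block;   // which circular array
        std::size_t offset;  // logical offset within it (before adjusting
                             // for that array's own head index)
    };

    Position locate(std::size_t i, std::size_t B) {
        return { i / B, i % B };  // both operations are O(1)
    }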

As immawizard pointed out, there is a generalized version of this idea called tiered vectors[0] that supports an arbitrary level of nesting. A 1-tiered vector is a circular array. A k-tiered vector is an array of n^(1/k) tiered vectors of tier (k-1). You can show that for a k-tiered vector, access time is O(k), while insertion and deletion run in O(n^(1/k)). The data structure described in the post can be considered a 2-tiered vector.
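A rough sketch of why access is O(k): the index splits into k base-B digits, one per tier, where B = n^(1/k) (illustrative code; it ignores each tier's circular head offsets):

    #include <cstddef>
    #include <vector>

    std::vector<std::size_t> tierDigits(std::size_t i, std::size_t B, int k) {
        std::vector<std::size_t> digits(k);
        for (int level = k - 1; level >= 0; --level) {
            digits[level] = i % B;  // offset at this tier
            i /= B;                 // move up one tier
        }
        return digits;              // one O(1) step per tier => O(k)
    }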

The post includes benchmarks comparing the data structure to std::vector. I would be interested in seeing benchmarks against a binary search tree. Even though this data structure has O(sqrt(N)) performance, that's still a lot slower than O(log(N)): the square root of a million is 1000, while the log base 2 of a million is only ~20.

One nitpick is that the author names the data structure after themselves. Naming things after yourself is typically a faux pas.

[0] https://www.ics.uci.edu/~goodrich/pubs/wads99.pdf


Here's an implementation that combines the paper you referenced with http://www.cs.uwaterloo.ca/research/tr/1999/09/CS-99-09.pdf, which gives dynamic sizing: https://github.com/KevinStern/software-and-algorithms/blob/m...


Thanks for feedback! Any naming suggestions? :-)


The key to having things named after you is to give them generic names. Name your thing "array", literally, everywhere in the readme and the code. Then people using it won't have any choice but to say "igushev's array" and that'll stick in the long run.


In competitive programming circles I've heard the basic idea called root or sqrt decomposition.


2-tiered vector.


This is fun, let's have a naming party!

Here's my nomination:

Loop List


This seems like the winner, although I would tweak it to "looped list".


That just sounds like a circular linked list. They were probably going for "list of loops", so I think the ambiguity here makes it a bad choice.


tiered_array


Re benchmarking vs other data structures: Here is a B+Tree based sequence container I made: https://github.com/det/segmented_tree

Basically, it has O(log n) everything, like the binary search tree you suggested, but with very good constant factors.

By default, leaves hold 1024/sizeof(T) elements and branches hold 47 elements, so it can access up to 13,289,344 8 byte elements with only 4 levels.
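(Checking the arithmetic: 1024/8 = 128 elements per leaf, and 128 * 47^3 = 13,289,344.)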


How is that different from a rope data structure?


Depends what you consider a Rope.

Does it require constant time concat?

Does it require immutability?

Does it require a balancing scheme based on the fib sequence?

Does it require that the tree is binary?

My tree is an attempt to be as fast as possible with log everything operations and no pointer/reference stability.


Should've said a generalized rope, as ropes are generally described as a replacement for strings, but if you replace characters with other fixed size data types everything still works.


I believe this data structure is called Tiered vectors, and is described in: https://www.ics.uci.edu/~goodrich/pubs/wads99.pdf


I need to read the paper, but at a high level it indeed looks similar.


I don't want to detract from the cleverness here, but I believe your benchmarks could use some work. Here are a few suggestions:

1. Simply testing something 1000 times and (presumably) presenting the arithmetic mean is not very informative. Looking at the detailed reported benchmark times (in the output file in tests), it looks like many of the timing outcomes have high variance. Rather than running the tests 1000 times and taking the mean, you might consider running 10 batches of 100 tests (or 1000, if you can) and presenting the mean and variance of the resulting distribution. In general, k sample groups each of size p will provide more reliable information about the underlying distribution than one sample group of size k*p (for reasonable k and p, obviously); see the sketch at the end of this comment.

2. Related to that, the results of the "inserting a number of elements" and "deleting a number of elements" tests are significantly worse for the tiered vector (vs. std::vector) than the "insert/delete a single element" tests. You don't mention this in the readme, but thinking about why that is might be informative. Thrashing seems like a possible explanation, and one you might be able to mitigate.

3. Are you making sure your cache is warm before starting to measure performance? (Pardon, I didn't look through every line of your tests.) Particularly for std::vector, and likely your intermediate deques too, this will have a big effect on timing.

4. Finally, it looks like you're primarily testing using ints (?). It would probably be a good idea to see if your results hold for a different payload size.

I don't know whether these will improve or worsen your comparison against std::vector, but they will make your claims more robust.
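For point 1, here's a minimal sketch of what such a batched measurement could look like (a hypothetical harness; all names are illustrative):

    #include <chrono>
    #include <cmath>
    #include <cstdio>
    #include <vector>

    template <typename F>
    void report(F&& run_one_test, int batches = 10, int per_batch = 100) {
        std::vector<double> batch_means;
        for (int b = 0; b < batches; ++b) {
            auto start = std::chrono::steady_clock::now();
            for (int t = 0; t < per_batch; ++t) run_one_test();
            std::chrono::duration<double> elapsed =
                std::chrono::steady_clock::now() - start;
            batch_means.push_back(elapsed.count() / per_batch);
        }
        // mean and sample variance across the batch means
        double mean = 0, var = 0;
        for (double m : batch_means) mean += m;
        mean /= batch_means.size();
        for (double m : batch_means) var += (m - mean) * (m - mean);
        var /= batch_means.size() - 1;
        std::printf("mean %.3g s, stddev %.3g s\n", mean, std::sqrt(var));
    }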


He's going to have cache issues since he requires one extra memory lookup.

His lookup should be roughly twice as expensive as a regular array's when the cache is cold -- it will be dominated by the cache misses, two versus one.

With a hot cache, the array lookup becomes a single load op of a few cycles, and even if he gets two cache hits, his lookup will probably be about 6-10 ops, with 2 loads, a div, and a few more ops to get the index of the sub-bucket.

With one cache hit and a possible miss on the other (e.g. his top-level index is in cache, but the bucket might not be), he's going to get high variance in his lookup timings. OP: to help the iterator access (sum), you might be able to prefetch (e.g. force a read of) the following bucket before you need it. When you move over to bucket 2, cause a fetch of bucket 3 at the same time. You might be able to hide some of that cache miss latency that way.

Also, with the DEQs you are potentially starting in the middle of a slab and having to roll around its front. That isn't going to prefetch well, so you might want to fetch both the next slab's base address and the address it logically starts at, since those can be different. Just try to hide as much cache miss latency as you can, because that is probably where you are getting killed in the iterator access.
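A sketch of the prefetch idea using the GCC/Clang __builtin_prefetch extension (the bucket layout here is hypothetical, not the repository's actual structure):

    #include <cstddef>

    long sum_buckets(int** buckets, std::size_t nbuckets, std::size_t bsize) {
        long sum = 0;
        for (std::size_t b = 0; b < nbuckets; ++b) {
            if (b + 1 < nbuckets)
                __builtin_prefetch(buckets[b + 1]);  // start pulling in the
                                                     // next bucket's lines
            for (std::size_t j = 0; j < bsize; ++j)
                sum += buckets[b][j];
        }
        return sum;
    }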


Thanks for feedback!


There's some interesting pointers for how to improve your data structure here: https://stackoverflow.com/questions/10478260/looking-for-a-d...

There's also some benchmarks for another implementation of the same idea linked to in the README here: https://github.com/mettienne/tiered-vector/blob/master/READM.... That implementation also just punted on growing the maximum capacity of the area.


Correct me if I am wrong, but I think the insertion/deletion time complexity is incorrect. It should be O(N). If each DEQ is implemented using an array instead of a list, wouldn't the insert/delete operation of IgushArray still take linear time? Popping or pushing (pick one) on an array-backed DEQ still takes linear time, which means, say an insertion happens at position N/2, then elements N/2 ... N-1 still need to be moved to the right by one? Popping/pushing is only constant-time if the DEQ is implemented with a linked list, but that would mean linear access time instead of constant.

Update: the answer is to implement the DEQ with a circular buffer for O(1) pop/push.


You can implement push/pop on both ends of an array-backed DEQ in O(1) time. Store the backing array (and its size), and then store two indexes: the front of the DEQ and the back of the DEQ. You update the front/back in the obvious way on push/pop, except you do modular arithmetic if it would take you out of bounds.

The DEQ has a limited size, of course, but here that's fine; it's actually a constant size for all of them except the final, possibly smaller one.

So now, what work do you have to do on insert? For the destination DEQ, you need to do one pop, one copy-in, and up to n^(1/2) moves. For every DEQ after that, up to about n^(1/2) of them, you need to do one pop and one push, each of which costs O(1).
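Here's a minimal sketch of such a fixed-capacity ring-buffer DEQ (illustrative names, not the repository's implementation; it assumes the caller guarantees there is room before pushing):

    #include <cstddef>
    #include <vector>

    template <typename T>
    struct RingDeq {
        std::vector<T> buf;    // backing array with fixed capacity
        std::size_t head = 0;  // index of the front element
        std::size_t count = 0;

        explicit RingDeq(std::size_t cap) : buf(cap) {}

        void push_front(T v) {  // O(1): step head back, wrapping around
            head = (head + buf.size() - 1) % buf.size();
            buf[head] = v;
            ++count;
        }
        void push_back(T v) {   // O(1): write just past the last element
            buf[(head + count) % buf.size()] = v;
            ++count;
        }
        T pop_front() {         // O(1)
            T v = buf[head];
            head = (head + 1) % buf.size();
            --count;
            return v;
        }
        T pop_back() {          // O(1)
            --count;
            return buf[(head + count) % buf.size()];
        }
    };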


Insertion into and deletion from a DEQ is indeed linear, but of all the DEQs in the structure only one (!) needs an insertion; the rest are just popping/pushing front/back.


> the rest are just popping/pushing front/back.

Sure, but this step would require a shift for (in the worst case) every element in the structure. Suppose you have the following DEQs:

    [R0] -> [0, 1, 2]
    [R1] -> [3, 4, 5]
    [R2] -> [6, 7, 8]
And you want to remove 4. From my understanding of your README, the structure will look like this after the operation:

    [R0] -> [0, 1, 2]
    [R1] -> [3, 5, 6]
    [R2] -> [7, 8, _]
This was not an O(sqrt(n)) operation, because you had to move elements 5, 6, 7, and 8 (not just 5, as suggested by your figures). So in general you still need to move O(n) elements on any insertion or deletion.

EDIT: Just read some of the other comments and realized that this can be done with a circular buffer, with a bit more memory than the subarrays themselves. Makes sense + clever! @OP, I really think you should clarify this important implementation detail in your writeup.


5 and 6 do need to move. 7 and 8 can stay where they are. In R2 you'd change the variable that stores where the front of the DEQ is to point to 7, but 7 and 8 themselves stay put.


Thanks for feedback! I’ll update the README.md


My question was that each move is a push/pop pair. A DEQ based on a fixed array has linear complexity for either push (if left-aligned) or pop (if right-aligned). One way to get constant push AND pop is to allow the left and right elements to not be aligned to the initial fixed array memory allocation.

For example, for a full DEQ, in order to perform a push, I would have to:

1. Pop the last element

2. Move all elements to the right by 1

3. Push the item into position 0

This would be a linear operation. Unless the address of the starting and ending element can change, I don’t think it’s possible to get both push and pop to less than linear time.

Edit: I was wrong -- the key is a circular buffer.


You've answered your own question implicitly. The address of the starting and ending element can change. You just define yourself where in the backing array the start/end are, and let them wrap around (use modular arithmetic for the indexes).


Thanks! DEQ implementation with circular buffer. :)

Edit: should have thought of that. Brain fart -- long day at work.


Exactly! I really like this stuff, little clever ways to skip effort. Data structures and algos don't get enough love :)


A DEQ has a floating starting index; when you push or pop, you just change this index. The implementation in the repository also has a custom implementation of DEQs which is even more optimized for this structure.


I think the data structure is careful to make sure the main array is at most SQRT(N) and all the internal queues are also SQRT(N).

It's a neat restructuring of the vector into a SQRT(N) * SQRT(N) arrangement.


I also need to add automatic restructuring when it expands or shrinks significantly.


Interesting that some people found an article about tiered vectors; I didn't find it back then. I implemented this structure many, many years ago and also published it in the Bauman Moscow State Technical University journal back in 2012: http://engjournal.ru/articles/101/101.pdf (in Russian)


Tiered vectors have been around for a while. The paper on them came out in 1999. They've just always been rather underappreciated. I only found out about them through this thread too.

I don't think any of us mean to imply that you copied off another implementation; independent discovery happens all the time! It's great you've made this a usable C++ library and have contributed your own thoughts on the structure!

It looks like you're having difficulty with some of the resizing. The two-array approach described here https://stackoverflow.com/questions/10478260/looking-for-a-d... makes resizing a lot easier. You just double the size of the main array and increase the size of the offset array by a factor of sqrt(2), without needing to fiddle around with anything else. You also get unchanged amortized time bounds.


I also found out about tiered vectors only in this thread. At the time of implementation I tried to Google for something like this but didn't find anything.


Thank you very much for making this MIT licensed and not GPL. I'm glad it's actually usable by people. Looks like a great project!


This is very similar to what std::deque does under the covers already. It's a chunked array. The main difference AFAICT from std::deque is that this offers O(n^(1/2)) insertion and deletion within the body of the array. This guarantee is provided by allowing for the possibility of significant overallocation in the chunks themselves, if many modifications happen in the middle of the array. On the other hand, std::deque does not overallocate by as much -- just a chunk at the beginning and a chunk at the end. But as a consequence it has to move trailing elements when an element is inserted or deleted in the middle.

Interesting!


DEQs also allocate internal arrays with hardcoded size only (8 in std, if I remember correctly)


At least in libc++ it's the greater of 4kB or 16 * sizeof(value_type). That's the default, it is configurable. 8 would be a pretty unfortunate design choice.


Can you also benchmark against std::deque?


I'd like to see the performance of growing the list (a common scenario):

    ArrayList list = new ArrayList();
    for (int i = 0; i < N; i++) {
      list.add(i);
    }
(Forgive me, Java is my mother tongue).


I believe this case is handled very badly in this data structure implementation. It seems to be saying that you want to occasionally manually recreate the DS as it grows bigger, as the DEQs won't automatically resize. So you'll just have linear behavior in that case, with extra constants because the DEQ logic is pure overhead at that point.

I suspect an amortized data structure could do this rebalancing operation automatically and internally, but I'd have to work out the scheme and the analysis. At a guess, something like what normal dynamic arrays do, where you multiply the max size by a constant when it fills up, would give you decent amortized bounds.


Correct. I need to add automatic restructuring. Currently the structure is sensitive to this and benefits a lot from using the reserve() method.


In C++ the std::vector type handles this by doubling the capacity when full. I think this approach also works for this data structure: when the capacity has been reached, simply double it and rebuild from scratch. Perhaps quadruple it, because then each double-ended queue would double in size, avoiding that pesky sqrt(2). The amortized time should be the same.
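A sketch of that rebuild-on-full idea, assuming the std::vector-like interface the README describes (the header name and the helper are hypothetical):

    #include <utility>
    #include "igush_array.h"  // hypothetical header name

    template <typename T>
    void grow_if_full(IgushArray<T>& arr) {
        if (arr.size() < arr.capacity()) return;
        IgushArray<T> bigger;
        bigger.reserve(arr.capacity() * 4);  // 4x total: the DEQ count and
                                             // each DEQ's length both double
        for (const T& x : arr) bigger.push_back(x);
        arr = std::move(bigger);
    }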


> In C++ the std::vector type handles this by doubling the capacity when full.

Of course, the growth factor is implementation-dependent, and I believe some standard library implementations use smaller growth factors to improve performance.


The optimal growth factor is the golden ratio, (1+sqrt(5))/2 ≈ 1.618, but 2 is much faster to calculate.


Is calculating the new size really the bottleneck here?


Here’s an old article about the performance characteristics of Deques.

https://www.codeproject.com/Articles/5425/An-In-Depth-Study-...


> Forgive me, Java is my mother tongue

I hope your mother taught you the value of using generics ;)


This is cool. I wanted to comment on the necessity of knowing the approximate size of N a priori, but this is very clearly stated on the github page.

Interesting alternatives that also support insertion, deletion, and lookup (is there a name for this interface?) are AVL trees and hash tables.

As described in https://arxiv.org/pdf/1711.00275.pdf, tiered vectors can be improved further:

"We present a highly optimized implementation of tiered vectors, a data structure for maintaining a sequence of n elements supporting access in time O(1) and insertion and deletion in time O(n^ε) for ε > 0 while using o(n) extra space."


It seems I need to add automatic restructuring to the structure, so it maintains ideal DEQ lengths.


I wonder how this would compare to Judy arrays. Would anyone like to benchmark this?

https://en.wikipedia.org/wiki/Judy_array http://judy.sourceforge.net

To install the Judy C lib:

  brew install judy
  apt-get install libjudy-dev
Then compile with -lJudy and #include <Judy.h>

Example: http://judy.sourceforge.net/doc/JudyL_3x.htm


Interesting. I wonder how similar a Judy array is to an rrb-tree (relaxed radix balanced tree), or if they're the same thing.

An rrb-tree is effectively O(1) for every category of operation. Scala's Vector type implements one.

If interested about the data structure: https://youtu.be/sPhpelUfu8Q

Scala bigO doc: https://docs.scala-lang.org/overviews/collections-2.13/perfo... Scala's doc on it: https://github.com/nicolasstucki/scala-rrb-vector/blob/maste...


Why would you choose this over a van Emde Boas tree, where indexing, insert, and delete are O(log log) and prev/next are constant?

https://en.wikipedia.org/wiki/Van_Emde_Boas_tree

(it basically trades a little more work on indexing for a slightly looser structure)

I like the use of the DE queue, but having to keep all the sub-vectors packed to ensure constant-time lookup might not be completely needed if you are willing to keep a little more information around or have two candidate buckets. Not sure if it would be a win or not though, since then you're probably better off just using a vEB tree.

E.g., I don't think you always need to compress everything back to the front if you can make sure all the buckets are within 1 of each other, but then you can hit cases where you have to look in two buckets, which isn't that difficult since you know the high and low of each bucket from the head and tail of each DEQ.

But then you would need to keep some local coloring info to keep the buckets perfectly balanced, as everything should be.


Doesn't really seem much better than just a balanced size-tagged binary tree. You get O(log n) access, insertion, and deletion. Yes, access is slower, but insertion and deletion are much faster. Remember that even for one billion the logarithm is merely 30. That's hardly anything, whereas the square root is more than 30 thousand.


That sounds like trees are silver bullets, but they're not, and that's why we have a variety of basic data structures.


I'm not saying they are a silver bullet but personally from experience I think they are a good choice in the majority of scenarios that programmers encounter. Of course our experiences could differ.


Really depends on your dataset size, right? You give 1 billion as an example; that is 8GB of RAM of overhead alone in a single data structure if you have a pointer for each tree leaf.

This data structure makes more sense in the 10k-1M size, where chasing pointers is expensive and the scaling doesn't dominate.


Right, or an rrb-tree which is effectively O(1) for everything. It's a tree of arrays where the arrays are usually 32 or 64 elements. Though, there's a bit more to it than that.


This type of data structure ought to have much better cache and branch prediction behaviour than most tree implementations.


I wonder if you can save a bit in the constant by only having to chase one pointer instead of log(n) of them.


Is there a reason you didn’t just use std::deque? You’re already in C++.


std::deque has O(n) insertion/deletion in the middle. Or do you mean replacing the internal deques with std::deque?


> std::deque has O(n) insertion/deletion in the middle

I'm curious to know from approximately which n this is a problem though. It can be easy to make something with a better O complexity that is actually slower for what computers can actually handle.

edit: never mind, I thought deque was smarter than it is, but it needs to move all elements when inserting/deleting in the middle, so it won't have a great constant factor.


I believe a std::deque is an rb-tree (radix binary tree), which suffers from insertions in the middle.

An rrb-tree solves this problem by being relaxed, which means allowing random empty slots in the cache-line-sized arrays within the structure. This way insertion is effectively O(1).

Modern languages like Scala implement these, but C++ has been around for a while, so the STL is not the fastest in a few ways. std::deque is one of them. As another example, std::sort isn't the fastest, I believe.


I think the typical implementation of a std::deque is actually something like a vector of pointers to fixed-size arrays of T (except the top level can't be a std::vector exactly, because then you'd only get amortized bounds on pushes).


It says in bold type at the beginning "The IgushArray class fully implements std::vector interface" but it's clear it cannot provide an equivalent of std::vector::data.


There is a disclaimer about that: https://github.com/igushev/IgushArray#limitations


Side-question: is there any way to explicitly tell github whether a file is C or C++? I have this same issue, where C code is marked as C++ and vice-versa. Not a huge issue, but a bit annoying. It seems a bit weird that linguist (github's tool for doing language recognition) marks a file with namespaces in it as C, which is obviously not supported.


All of those operations are achievable when replacing sqrt with log by using Skip lists (https://en.wikipedia.org/wiki/Skip_list), and probably with Zip trees as well (https://arxiv.org/abs/1806.06726).

That being said, the overhead of a complex data structure might multiply the running time by a factor of up to ~10 (mostly because of cache misses), so simpler structures will be more performant for short inputs.

There is a more general pattern in most tree data structures where you can transform the log(n) recursive operations on the tree into sqrt(n) operations on a simpler structure with 2 levels.

I happen to have described the solution to a problem where you must use a sqrt-decomposition data structure to get updates in O(1) and queries in O(sqrt(n)), because O(log(n)) everywhere is not good enough when there are a lot of updates to do (https://tryalgo.org/en/2017/09/01/path-statistics/).


The performance numbers would be most interesting if he benchmarked against std::vector, std::list, std::multiset and std::deque just to see how it compares to all the major collections. Using different items, say std::string, int64_t and a big POD struct might fill in a few things too.


If the DEQ has to be fixed beforehand at length k, then with many insertions k can end up much smaller than n^0.5, and I think you might end up with a much worse worst case for insertion? Something on the order of O(n) rather than O(n^0.5).

Or there has to be a readjustment of k.

Can somebody point out my thinking error?


Yes, the structure is sensitive to that and benefits a lot from using reserve(). I need to add automatic restructuring to maintain a perfect k.


The push-front time could be improved to O(1)* by making it circular (assuming "*" means amortized). But as-is it does provide a better comparison to an array: a circular IgushArray would be better compared to a circular array (which a std::vector isn't).


That's a good point. That would make computation of indexes much more complicated though


You could do this with O(N^(1/K)) for arbitrary K, using K layers of indirection, no?

This is similar to how you can do O(logN) for insertion / deletion & access using an indexed B-tree, except you're using a fixed height instead of a fixed branching factor.


True. If you go k levels deep you get indexing complexity of O(k) (because you do a constant-time operation for each level) and insertion/removal complexity of O(k + n^(1/k)) (because that's indexing plus a linear insert/remove operation on the array at the bottom). If you push k towards log n you get O(log n) for both. (I'm ignoring rebalancing issues here, but these can be dealt with too.) The article did mention that you can push k to values larger than 2 in the Generalization section. I think it would've been nice if they had explored such values when doing the performance tests, but either way I found it clear and nicely written.


For insert/remove you also need to move the last element up for each deque with a higher index, and if I understand it correctly, there should be O(N^(2/3)) (i.e. N^(1/3) * N^(1/3)) such deques.


Cool! It looks like, building on that, you could maybe:

1. extend the construction to three dimensions, so that the insertion cost is a cubic root?

2. iterate the extension to an increasing number of dimensions, so that the cost becomes logarithmic?


I mentioned that extension in the published paper, but the complexity of access becomes O(d), where d is the number of dimensions.


The benchmarks compare this to standard arrays, but I'd be curious to see how tiered vectors stack up against persistent vectors (either Clojure's or Python's [pyrsistent's])


As someone who doesn't know much about this level of CS, can someone explain to me why inserting/deleting an element in a linear array is expensive?


You have to move the data to make space for the new value (or to fill the space after a deletion). Imagine you have an array of 5 numbers:

    1 2 3 4 5
Deleting number three requires you to do three operations:

    1 2 4 4 5  # Move 4th element to the 3rd place
    1 2 4 5 5  # Move 5th element to the 4th place
    1 2 4 5    # Clear the 5th value
Similarly, inserting a second three requires three operations:

    1 2 3 4 5 5  # Move 5th element to the 6th place
    1 2 3 4 4 5  # Move 4th element to the 5th place
    1 2 3 3 4 5  # Insert the second 3
This means that as the array gets bigger you have to do more work to insert/delete values. If your arrays are large and you need to do this often, your software will be very slow and you should use a more suitable data structure with different trade-offs.
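The same deletion written out in C++ (a sketch; std::vector::erase does essentially this internally):

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    int main() {
        std::vector<int> a = {1, 2, 3, 4, 5};
        std::size_t pos = 2;  // delete the 3 (at index 2)
        // shift everything after the hole left by one: O(N) moves
        std::move(a.begin() + pos + 1, a.end(), a.begin() + pos);
        a.pop_back();         // drop the stale last slot; a == {1, 2, 4, 5}
    }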


Basically a linear array is packed tightly, so to put something in the middle, first you have to move what's already there. Because it's packed tight, you end up having to move everything one space to the right, from the right end back to where you're creating a hole. Then you can put what you want in the middle. Sometimes you get lucky and you're adding something far to the right and don't need to shuffle many items first, but sometimes you need to put it on the very left and reshuffle the whole thing. You'll never have to shuffle anything twice, but it is possible to end up shuffling everything once, so we say it's O(N), which means it never takes more than some constant multiple of the number of items.


Inserting an element at a position in the array means shifting the element at that position and every subsequent element to the right by 1. Naively, this takes linear time.


Are you a software developer? No judging, just curious


Turns out it was a misunderstanding on my part. I interpreted the word "insert" as "replace" and I was wondering why replacing an element in a linear array would be an expensive operation. And no, I am not a software developer, I'm a chemical engineering student. Though I've done a bit of work on Nintendo 3DS hacking by writing tools for various exploits, and also bit of STM32 tinkering (https://github.com/jason0597)


I made useful stuff in high level languages for years before I dug deeper to discover that it was sometimes useful to worry about the characteristics of my sequences of things.


Your idea appears to build upon the concepts in https://en.wikipedia.org/wiki/Hashed_array_tree .

Alternatively, here's my list ADT implementation which has O(log n) access/insertion/deletion time: https://www.nayuki.io/page/avl-tree-list .


sqrt(N) can be described as "fast" only with significant amounts of spin added. Use ropes instead. https://en.wikipedia.org/wiki/Rope_(data_structure)


I had a hard time parsing O(N^1/2)... ;)

( O(N^(1/2)), O(N^0.5), O(sqrt(N)), O(N¹ᐟ²), O(√N̅) would have been better... )


Reminds me of a skip list.


Nothing beats !a[$0]++


My somewhat naïve question is: if this is indeed an objectively superior data structure, as claimed, why hasn't anyone else thought of it before in the history of computer science? Extraordinary claims of devising a radically better implementation of a fundamental data structure require extraordinary evidence that A. the idea is indeed novel and B. it doesn't come with hidden trade-offs that are only advertised in the fine print. The work seems impressive, and maybe it is somehow a completely new genius idea no one has thought of, but whatever the case, it needs to reference existing literature much more heavily to credibly make the barely-qualified claims contained in the README. (And to be clear: I actually hope that such references bear out the claims, because if so -- awesome! But in the meantime: "trust but verify".)

Edit: curious about the downvotes - care to explain? I feel like this is a legitimate and fairly uncontroversial take, but happy to hear differing perspectives...


The downvotes are probably because your question could be asked of any new thing. "If this is so great, why didn't someone do it before? Therefore it is probably not so great." I mean you could literally paste that same comment with almost no modification as a response to almost anything new thing posted on this site. That's a good signal that the comment is not helpful.

Also, I don't think the author claimed it's "objectively superior." It has objectively superior asymptotic performance in certain relatively niche workloads. However, its constant time factor is iffy due to all the pointer following and calculations you have to do for random accesses. And the benchmark results in the README show this clearly -- both the strengths and the weaknesses.

Actually, your whole train of thought around the claims in the README is puzzling to me. You seem to think they're grandiose and over the top, but the README seems mostly explanatory to me. What statements in there do you think ought to be qualified, specifically, and how?

If you want upvotes next time, explain in detail what claims you object to, and why. This comment I'm afraid did not add much to the discussion, at least for me.


Thanks for your explanation.

> It has objectively superior asymptotic performance in certain relatively niche workloads. However, its constant time factor is iffy due to all the pointer following and calculations you have to do for random accesses. And the benchmark results in the README show this clearly -- both the strengths and the weaknesses.

I got that impression after going through a fair bit of the README. My feeling is neither the post title nor the introductory statements in the README reflected these very relevant qualifications to the claim of "An Array with Constant Time Access and Fast Insertion and Deletion", and putting the onus on the reader to dig through the whole thing to discover this is either naïve or a bit disingenuous. As others have noted, and as I implied in my comment, there is in fact existing literature that captures the particular ideas and tradeoffs in this implementation - Tiered Vectors. I believe the author has a duty to cite that up front and prominently in the spirit of honest academic discourse.


> As others have noted, and as I implied in my comment, there is in fact existing literature that captures the particular ideas and tradeoffs in this implementation - Tiered Vectors. I believe the author has a duty to cite that up front and prominently in the spirit of honest academic discourse.

What "academic discourse"? This is a datastructure implementation with some documentation in a Github repo, not a submission to a scientific journal. You're insinuating a level of rigor that is completely unwarranted.

Just lean back, relax, and accept that you overreacted a bit in your original post. It's not the end of the world to admit that.


> very relevant qualifications to the claim of "An Array with Constant Time Access and Fast Insertion and Deletion"

Why would the author need to add qualifications to the title? That is literally and exactly what the data structure is.


As other commentors have pointed out [1], it appears to be a Tiered Vector implementation [2].

[1] https://news.ycombinator.com/item?id=20873110 [2] https://www.ics.uci.edu/~goodrich/pubs/wads99.pdf


No one claimed this is an objectively superior data structure. Even the author presented benchmark numbers showing that in some scenarios (calculating the sum) this data structure is slower than std::vector at all data sizes. To me, the author's claim isn't extraordinary at all. He simply offered a new data structure with a different set of trade-offs compared to arrays and linked lists. Also, I don't think the author hid the trade-offs in fine print. They're literally present in the first table of Big O complexities, and the first benchmark result literally shows his data structure beaten soundly by std::vector.


That is true. The complexities are in Big O notation, but I put all real-life measurements in README.md so readers could see. On small item counts the overhead of the data structure itself makes it perform worse compared to std::vector; only on large item counts does it perform better. Note that access is slower than std::vector due to the complexity of the data structure, but the multiplier is always the same.


I think it would do some stuff slower. It's probably a boring old trade-off, like the one you'd make if you swapped arrays for linked lists.

Where I think this would be slower is streaming access. Say you have 10,000 elements; this would store them in a 100 by 100 grid. You want to read off elements 200-500.

A normal array would allow you to sequence through them as a continuous block of memory, probably as fast as it gets. I'm outside my expertise here, but I believe that block could be loaded into the processor's innermost cache?

This array, OTOH, would require some indexing/pointer chasing along the way, which would slow it down.

So this will add to some of the constants that the O notation conveniently omits!



