

Ask HN: Which algorithms change implementation of data structures at runtime - madewael

Data structures "define" how real-world concepts are structured and represented in computer memory. For different kinds of computations, different data structures should/can be used to achieve acceptable performance (e.g., a linked-list versus an array implementation).

Self-adaptive (cf. self-updating) data structures are data structures that change their internal state according to a concrete usage pattern (e.g., self-balancing trees). These changes are internal, i.e., they depend on the data. Moreover, these changes are anticipated by design.

Other algorithms can benefit from an external change of representation. In matrix multiplication, for instance, it is a well-known performance trick to transpose "the second matrix" (so that caches are used more efficiently). This actually changes the matrix representation from row-major to column-major order. Because "A" is not the same as "Transposed(A)", the second matrix is transposed again after the multiplication to keep the program semantically correct.

A second example is using a linked list at program start-up to populate "the data structure" and changing to an array-based implementation once the content of the list becomes "stable".

I was wondering whether other programmers have similar experiences: example programs where an external change of representation is performed in their application in order to get better performance, and where the representation (chosen implementation) of a data structure is changed at runtime as an explicit part of the program.
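The transpose trick can be sketched like this (a minimal illustration in plain Python with function names of my own; the actual cache win shows up in languages like C, where transposing first lets the inner loop read memory sequentially instead of striding down columns):

```python
def transpose(m):
    """Return the transpose of a row-major matrix (list of rows)."""
    return [list(col) for col in zip(*m)]

def matmul_transposed(a, b):
    """Multiply a and b, internally switching b to column-major order.

    After transposing, the inner product walks bt[j] sequentially
    (good spatial locality) instead of striding down b's columns.
    """
    bt = transpose(b)  # representation change: row-major -> column-major
    return [[sum(x * y for x, y in zip(row, col)) for col in bt]
            for row in a]

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul_transposed(a, b))  # [[19, 22], [43, 50]]
```

Here the transposed copy is local, so the original b never changes; in the in-place variant described above, b itself is transposed and must be transposed back afterwards to keep the program semantically correct.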
======
segner
Another application area for switching data representations explicitly is
Computational Geometry. The abstract reason for this is that the initial
representations of geometric objects are highly implicit.

As an example, consider a set of points in 2D specified as a list of pairs of
numbers. When you need to find the point in the set closest to a given
location the easiest and fastest method is to run through the list and look
for the minimal distance. However, if you repeat such a query for different
locations while the set of points remains the same, there are much faster
(and more complicated) ways to do it: geometric search trees and Delaunay
triangulations. These make the implicit structure of the geometry of a set of
points explicit. Whether this is actually worth it is a matter of
understanding the application and its profiling data.

These can also be seen as 'acceleration data structures'.
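To make the trade-off concrete, here is a hedged sketch (function and variable names are my own): a one-off nearest-point query is just a linear scan, and only when the same point set is queried many times does it pay to build an explicit search structure (e.g., a k-d tree) once up front.

```python
import math

def closest_point(points, q):
    """One-off query: linear scan over the implicit representation,
    O(n) per query."""
    return min(points, key=lambda p: math.dist(p, q))

points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
print(closest_point(points, (0.9, 1.2)))  # (1.0, 1.0)

# For many queries against the same points, one would instead build
# a search structure once (amortizing the construction cost), then
# answer each query in roughly logarithmic time.
```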

------
Hounshell
"Self-adaptive" data structures are fairly common. Splay trees
([http://en.wikipedia.org/wiki/Splay_tree](http://en.wikipedia.org/wiki/Splay_tree))
would be one of the canonical examples, where a read operation causes a node
to move up in the tree, hopefully making subsequent reads of the same value
faster.
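A full splay tree is lengthy, but the same idea — reads reorganize the structure so that frequently accessed items become cheaper — appears in its simplest form in a move-to-front list. A sketch (class and method names invented for illustration, not splay-tree code):

```python
class MoveToFrontList:
    """Self-adjusting list: each successful lookup promotes the hit
    to the front, so frequently read items are found quickly."""

    def __init__(self, items=()):
        self._items = list(items)

    def contains(self, x):
        try:
            i = self._items.index(x)  # linear search
        except ValueError:
            return False
        # self-adjustment: move the accessed item to the front
        self._items.insert(0, self._items.pop(i))
        return True

mtf = MoveToFrontList(["a", "b", "c", "d"])
mtf.contains("c")
print(mtf._items)  # ['c', 'a', 'b', 'd']
```

As with splay trees, these changes are internal to the structure: the caller only sees a membership test, not the reordering.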

"Representation" changes are fairly common as well. Typically these come about
with size. There was some well-known data structure (I want to say C#'s
Dictionary, but a quick search disproves this) that would use an array for
some small number of items (fewer than 10 or so) and then switch to a binary
tree beyond that, because for a small number of items the array proved to be
faster.
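That size-triggered switch can be sketched as follows (a hypothetical class of my own; it uses a hash table rather than a binary tree as the "large" representation, since the Python standard library has no tree map, but the switching logic is the same):

```python
class HybridMap:
    """Stores entries as a flat association list while small, and
    switches to a dict once it grows past a threshold."""

    THRESHOLD = 8

    def __init__(self):
        self._small = []    # list of (key, value) pairs
        self._large = None  # dict once we have switched

    def put(self, key, value):
        if self._large is not None:
            self._large[key] = value
            return
        for i, (k, _) in enumerate(self._small):
            if k == key:
                self._small[i] = (key, value)
                return
        self._small.append((key, value))
        if len(self._small) > self.THRESHOLD:
            # representation change: list of pairs -> hash table
            self._large = dict(self._small)
            self._small = None

    def get(self, key, default=None):
        if self._large is not None:
            return self._large.get(key, default)
        for k, v in self._small:
            if k == key:
                return v
        return default

m = HybridMap()
for i in range(10):
    m.put(i, i * i)
print(m.get(7))              # 49
print(m._large is not None)  # True: switched past the threshold
```

Java's HashMap does something similar in reverse: buckets that grow past a threshold are converted from linked lists into red-black trees.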

~~~
madewael
I was hoping to find more examples of application-level changes, where a
developer is undecided between two (or more) data structures or data
representations and swaps back and forth between them at runtime.

The examples you provide are very interesting, but they are what I like to
call "internal", i.e., the user is not necessarily aware of these changes.

------
kidbees
A hard-disk defragmentation tool could have been an easy analogy, I guess.

