
Rope (data structure) - jmduke
http://en.wikipedia.org/wiki/Rope_%28data_structure%29
======
josephg
I made a skip list based rope implementation in C and Javascript. In C, the
crossover point when ropes become faster is around 200 bytes. They're an
awesome little data structure.

C: [https://github.com/josephg/librope](https://github.com/josephg/librope)

JS: [https://github.com/josephg/jumprope](https://github.com/josephg/jumprope)

~~~
kyzyl
In the usage example for the C lib, you have

'uint8_t *str = rope_createcstr(rope, NULL);'

Did you mean 'r' instead of 'rope'?

~~~
josephg
Yep - fixed! (Thanks @rymo4 for the pull request)

~~~
kyzyl
Heh, no problem. Not sure why I decided to post here instead of just doing it
there.

Nice lib, though. I'll keep it in mind.

------
pcwalton
In SpiderMonkey, JavaScript strings turn into ropes based on some heuristics
(if you append to them a lot, I think). This helps a few benchmarks.

So while you might think a JavaScript string is just a pointer + length, the
implementation is actually significantly more complex: the engine will pick
different implementations depending on how it thinks you're using it.

~~~
mccr8
The whole hierarchy of SpiderMonkey strings is laid out in this comment in the
code:

[http://mxr.mozilla.org/mozilla-central/source/js/src/vm/Stri...](http://mxr.mozilla.org/mozilla-central/source/js/src/vm/String.h#88)

------
Chris_Newton
If you like the idea of ropes, you might also be interested in a somewhat
related paper by Charles Crowley called _Data Structures for Text Sequences_ :

[http://www.cs.unm.edu/~crowley/papers/sds.pdf](http://www.cs.unm.edu/~crowley/papers/sds.pdf)

The material about the “piece table” method is particularly interesting IMHO.

------
rav
A while ago, I implemented ropes based on the 1995 paper by Boehm et al in
JavaScript. The interface provides methods charAt, charCodeAt, concat and
substring, so it functions as a drop-in replacement for JavaScript strings
(provided you limit your use to the given supported methods).

[https://github.com/Mortal/ropejs/blob/master/rope.js](https://github.com/Mortal/ropejs/blob/master/rope.js)

------
snprbob86
I'd like to see a rope-variant of the 2-3 finger tree.

2-3 finger trees are immutable/persistent and support access to both ends in
amortized constant time and logarithmic concatenation and splitting.

The complicating factor being that for flyweight values, like characters, the
interior nodes of the tree would be prohibitive for one leaf per character.
Surely there must be a variant of 2-3 finger trees that addresses this.

~~~
jeffffff
you can implement a rope as a 2-3 finger tree of string literals. i wrote a
java implementation of 2-3 finger trees a few months ago and implemented both
ropes of bytes and ropes of chars (rope backed by finger tree of java Strings
here:
[https://github.com/jeffplaisance/fingertree/blob/master/src/...](https://github.com/jeffplaisance/fingertree/blob/master/src/main/java/com/jeffplaisance/util/fingertree/rope/Rope.java)).

------
Roboprog
Interesting, but I'm going to have to have a really huge job before I want to
implement something like this rather than something like an
array-of-arrays/list-of-lists or something like a DOM tree. That is, find a way to
naively partition the text into intuitive chunks (newlines, spaces, tags,
whatever as chunk boundaries).

Yes, I'm deliberately being contrarian, as the data structure takes a bit of
work to comprehend, and even more work to understand the merge and split
operations when you make updates. Not saying that I would never use it, just
that the need had better justify the effort. (I'm not against going outside the
out-of-the-box libraries to do things like tries or radix sort, for example; I
just want a justification to do the work.)

~~~
bzbarsky
A DOM tree is rather a pain to implement in practice. You need fast
foo.childNodes[i] and also fast foo.nextChild. And fast inserts and removes.
And APIs that use (parent, offset) pairs as their basic concept (like ranges)
and other APIs that use (parent, prevSibling) pairs (like node insertion). Oh,
and memory use needs to be minimal. And the code needs to have low constants,
not just low algorithmic complexity.

~~~
Roboprog
Good point! Sloth and inertia usually allow me to use the work of others in
this case (XML DOM, when appropriate), but that doesn't make it internally
simple (or efficient).

------
philsnow
I guess if you have an array of characters and you want to insert into the
middle, you could promote the array into a rope as follows:

    
    
      1. break the array into two extents (start address / length pairs)
      2. make a rope out of those two extents
      3. do the insert on the resulting rope
    

this assumes that you're okay with starting with an array of characters and
ending up with a rope.
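
The three steps above can be sketched roughly as follows. This is only an illustration in Python: `Extent`, `Node`, `promote_and_insert` and `flatten` are made-up names, not part of any library mentioned in this thread, and the "rope" is an unbalanced two-level tree. The point is that the original buffer is never copied at insert time; the extents just refer into it.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Extent:
    """A (start, length) view into a buffer; no characters are copied."""
    buf: str
    start: int
    length: int


@dataclass(frozen=True)
class Node:
    """Interior rope node joining two sub-ropes."""
    left: "Rope"
    right: "Rope"


Rope = Union[Extent, Node]


def promote_and_insert(buf: str, pos: int, new_text: str) -> Rope:
    if not 0 <= pos <= len(buf):
        raise IndexError(pos)
    # 1. break the array into two extents
    left = Extent(buf, 0, pos)
    right = Extent(buf, pos, len(buf) - pos)
    # 2. make a rope out of those two extents
    rope = Node(left, right)
    # 3. do the insert on the resulting rope (here: at the split point)
    inserted = Extent(new_text, 0, len(new_text))
    return Node(Node(rope.left, inserted), rope.right)


def flatten(rope: Rope) -> str:
    """Materialise the rope as one contiguous string (copies everything)."""
    parts: List[str] = []
    stack = [rope]
    while stack:
        node = stack.pop()
        if isinstance(node, Node):
            stack.append(node.right)  # pushed first, so visited last
            stack.append(node.left)
        else:
            parts.append(node.buf[node.start:node.start + node.length])
    return "".join(parts)
```

For example, `flatten(promote_and_insert("hello world", 5, ","))` gives back `"hello, world"`, while the rope itself still points at the original `"hello world"` buffer.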

ISTR this pattern was common-ish in Erlang last time I looked at Erlang, but
they call ropes (of bytes) "iolists", and support for iolists goes all the way
down into the standard library.

~~~
Cthulhu_
> this assumes that you're okay with starting with an array of characters and
> ending up with a rope.

Well, in any OO language, I guess both a string and a rope would still be
considered a String object; whether a character array or a tree-like
structure backs it wouldn't really matter to the public interface.

------
shawn-butler
I became aware of them as an SGI-proposed extension to the C++ standard library.

A good evaluation of them remains:

[http://www.sgi.com/tech/stl/Rope.html](http://www.sgi.com/tech/stl/Rope.html)

~~~
nly
The GNU C++ standard library still includes them as a non-standard extension.

[http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a0006...](http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a00066.html)

------
Tycho
Can someone explain intuitively what this is?

~~~
inportb
A rope stores a long string as a collection (binary tree) of shorter strings.
Ordinary strings are stored contiguously in memory, so mutation (insertion,
deletion, catenation, etc) often involves much copying of data, which could be
problematic with long strings. Ropes avoid this problem, but have some
additional code complexity and resource overhead associated with data
fragmentation.
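
To make that concrete, here is a minimal sketch of the idea in Python. The names (`Leaf`, `Concat`, `concat`, `char_at`, `to_str`) are invented for illustration, and the sketch assumes no rebalancing at all, which a real rope would need to keep lookups logarithmic.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    """A short contiguous string stored at the bottom of the tree."""
    text: str

    def __len__(self) -> int:
        return len(self.text)


@dataclass(frozen=True)
class Concat:
    """Interior node: the concatenation of two sub-ropes."""
    left: "Rope"
    right: "Rope"
    length: int  # cached total length of both subtrees

    def __len__(self) -> int:
        return self.length


Rope = Union[Leaf, Concat]


def concat(a: Rope, b: Rope) -> Rope:
    """Join two ropes by allocating one node; no characters are copied."""
    return Concat(a, b, len(a) + len(b))


def char_at(rope: Rope, i: int) -> str:
    """Walk down the tree, using left-subtree lengths to pick a side."""
    if not 0 <= i < len(rope):
        raise IndexError(i)
    while isinstance(rope, Concat):
        left_len = len(rope.left)
        if i < left_len:
            rope = rope.left
        else:
            i -= left_len
            rope = rope.right
    return rope.text[i]


def to_str(rope: Rope) -> str:
    """Flatten the rope into one contiguous string, left to right."""
    parts, stack = [], [rope]
    while stack:
        node = stack.pop()
        if isinstance(node, Concat):
            stack.append(node.right)  # pushed first, so visited last
            stack.append(node.left)
        else:
            parts.append(node.text)
    return "".join(parts)
```

Concatenation here costs one small allocation regardless of string length; the price is the extra nodes and the pointer chasing in `char_at`, which is the overhead mentioned above.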

------
zippie
C implementation:
[https://github.com/kshulgin/crope](https://github.com/kshulgin/crope)

------
joe_the_user
As far as I can tell, all the properties described come because this is a
balanced binary tree of strings "with rank".

I'd be curious if there is anything else that makes this different?

(it's fairly easy to adjust any balanced binary tree to allow the "report"
function, ie generating an ordered sublist of size m in O(log(n) + m) time )
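
A rough sketch of that "report" operation on a tree of strings with rank, in Python. All names here (`Leaf`, `Node`, `join`, `report`) are hypothetical, and the tree is built by hand rather than kept balanced, so the O(log(n) + m) bound only holds under the assumption that the tree is balanced and the leaves are of bounded size.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Leaf:
    text: str


@dataclass(frozen=True)
class Node:
    left: "Tree"
    right: "Tree"
    weight: int  # the "rank": number of characters in the left subtree
    size: int    # total number of characters below this node


Tree = Union[Leaf, Node]


def size(t: Tree) -> int:
    return len(t.text) if isinstance(t, Leaf) else t.size


def join(left: Tree, right: Tree) -> Tree:
    return Node(left, right, size(left), size(left) + size(right))


def _collect(t: Tree, lo: int, hi: int, parts: List[str]) -> None:
    """Append characters [lo, hi) of t, pruning subtrees outside the range."""
    if lo >= hi:
        return
    if isinstance(t, Leaf):
        parts.append(t.text[lo:hi])
        return
    # The left subtree covers [0, weight); the right covers [weight, size).
    _collect(t.left, lo, min(hi, t.weight), parts)
    _collect(t.right, max(lo - t.weight, 0), hi - t.weight, parts)


def report(t: Tree, start: int, m: int) -> str:
    """Return up to m characters beginning at index start."""
    if start < 0 or m < 0:
        raise ValueError("start and m must be non-negative")
    parts: List[str] = []
    _collect(t, start, min(start + m, size(t)), parts)
    return "".join(parts)
```

The stored weights let the descent skip whole subtrees that fall outside the requested range, so only the two boundary paths plus the leaves that actually overlap the range are visited.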

------
manish_gill
Hmm. Is this the data structure that's used in the "rope" libraries for Python
symbol lookup etc?

~~~
pjscott
Unless I mistake what you're talking about, that's something very different:

[http://rope.sourceforge.net](http://rope.sourceforge.net)

------
moondowner
One more resource on Ropes, more Java oriented
[http://www.ibm.com/developerworks/library/j-ropes/](http://www.ibm.com/developerworks/library/j-ropes/)

------
EdiX
PSA: a Gap Buffer is a far simpler data structure than a rope that works just
as well in most (but not all) circumstances.
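
For comparison, a gap buffer can be sketched in a few dozen lines of Python. This is a minimal illustration, not any particular editor's implementation; the class name `GapBuffer`, its methods, and the default gap size of 16 are all assumptions made for the example.

```python
class GapBuffer:
    """One flat array with a movable run of free slots (the gap)."""

    def __init__(self, text: str = "", gap: int = 16) -> None:
        self._buf = list(text) + [None] * gap
        self._gap_start = len(text)      # first free slot
        self._gap_end = len(self._buf)   # one past the last free slot

    def __len__(self) -> int:
        return len(self._buf) - (self._gap_end - self._gap_start)

    def _move_gap(self, pos: int) -> None:
        """Slide the gap so that it starts at logical position pos."""
        while self._gap_start > pos:     # shift characters across, gap moves left
            self._gap_start -= 1
            self._gap_end -= 1
            self._buf[self._gap_end] = self._buf[self._gap_start]
        while self._gap_start < pos:     # gap moves right
            self._buf[self._gap_start] = self._buf[self._gap_end]
            self._gap_start += 1
            self._gap_end += 1

    def _grow(self, needed: int) -> None:
        """Make sure the gap has room for at least `needed` characters."""
        gap_size = self._gap_end - self._gap_start
        if gap_size >= needed:
            return
        extra = max(needed - gap_size, len(self._buf))
        self._buf[self._gap_end:self._gap_end] = [None] * extra
        self._gap_end += extra

    def insert(self, pos: int, text: str) -> None:
        if not 0 <= pos <= len(self):
            raise IndexError(pos)
        self._move_gap(pos)
        self._grow(len(text))
        for ch in text:                  # fill the gap from its left edge
            self._buf[self._gap_start] = ch
            self._gap_start += 1

    def delete(self, pos: int, count: int) -> None:
        if pos < 0 or count < 0 or pos + count > len(self):
            raise IndexError(pos)
        self._move_gap(pos)
        self._gap_end += count           # deleted characters join the gap

    def __str__(self) -> str:
        return "".join(self._buf[:self._gap_start] + self._buf[self._gap_end:])
```

Repeated edits near the same position only touch the gap edges, which is why this works well for typical cursor-local editing; an edit far from the gap pays for moving it, which is one of the circumstances where a rope can do better.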

------
itcmcgrath
Our database uses Ropes internally as it deals with large strings and
frequently inserts and removes sections.

