
Is the 'rope' data structure still relevant in modern applications?

Ropes are an alternative approach to representing strings in memory, but do they still have a strong hand nowadays and, if so, why aren't they as widely adopted as classical character strings? This is something I'm going to explore using a simple C++ implementation and a couple of benchmarks.

What are ropes?

A rope represents a string in a tree-like fashion. Leaf nodes store the actual characters, while internal nodes store weights, typically the length of the node's left subtree, which is what steers index lookups. Wikipedia has a pretty nice article, which I encourage you to familiarise yourself with.
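To make the weight bookkeeping concrete, here's a minimal illustrative sketch (the names and layout are mine, not taken from any particular implementation). An internal node's weight is the length of its left subtree, so looking up a character is just a descent comparing the index against weights:

#include <cstddef>
#include <memory>
#include <string>

// Illustrative only: a leaf holds actual text; an internal node holds the
// total length of its left subtree as its weight.
struct rope_node {
    std::size_t weight = 0;  // leaf: text.size(); internal: left subtree length
    std::string text;        // non-empty only for leaves
    std::unique_ptr<rope_node> left, right;

    char at(std::size_t i) const {              // assumes i is within bounds
        if (!left && !right) return text[i];    // leaf: direct lookup
        if (i < weight) return left->at(i);     // index falls into the left subtree
        return right->at(i - weight);           // skip 'weight' characters, go right
    }
};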

Additionally, this classic white paper covers ropes in more depth.

What are the advantages over character strings?

The main selling point is concatenation, which is O(1): joining two ropes only allocates a new root node and copies no characters (as sketched below). Additionally, insertion and deletion happen in O(log n). This makes ropes a perfect candidate as the underlying data structure for a text editor buffer, but aside from that, what else can they be used for?
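The O(1) bound follows directly from the representation. Reusing the illustrative rope_node sketch from the previous section (the total length of the left operand is passed in, since a real rope would track it anyway):

// Illustrative only: concatenation is one constant-size allocation,
// regardless of how long the two operands are.
std::unique_ptr<rope_node> concat(std::unique_ptr<rope_node> lhs,
                                  std::unique_ptr<rope_node> rhs,
                                  std::size_t lhs_length) {
    auto root = std::make_unique<rope_node>();
    root->weight = lhs_length;  // length of the whole left subtree
    root->left = std::move(lhs);
    root->right = std::move(rhs);
    return root;
}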

Applications

One thing that comes to mind is string builders. Golang has strings.Builder, C++ has std::stringstream. Both of these provide much more functionality than plain, boring string concatenation (like arbitrary type-to-string conversion), but I'm going to focus on the simplest use case: let's explore whether ropes perform any better than plain old std::stringstream when used purely for string concatenation.
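For reference, the simplest use case I have in mind is just accumulating chunks and materialising the final string once at the end; a trivial sketch:

#include <iostream>
#include <sstream>
#include <string>

int main() {
    std::stringstream ss;
    for (int i = 0; i < 3; ++i)
        ss << "chunk-" << i << ';';       // append arbitrary pieces
    const std::string result = ss.str();  // materialise the final string once
    std::cout << result << '\n';          // prints: chunk-0;chunk-1;chunk-2;
}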

What am I going to test?

Custom implementation

I've always had a soft spot for the rope data structure, so I couldn't resist implementing one myself. Of course, this is a very limited implementation with many caveats, but it should be good enough as a proof of concept.

You can find my repo, with all the details, here.

Boost.Text

Boost.Text is a library by Zachary Laine that aims to bring more robust UTF-8 support to C++. Aside from that, it also implements a set of interesting data structures, one of which is the rope. Boost.Text is meant to be proposed as an addition to the Boost libraries. This is very exciting! There's a good chance C++ will get a production-ready rope implementation available to everyone (which would make C++ quite unique in that regard amongst other languages).

Boost.Text deserves a separate post of its own, which I may write, since it packs a lot of interesting features, but that's outside the scope of this post. I recommend going through the official documentation available on the project's GitHub pages.

Advantages

The rope implementation within Boost.Text offers a lot more features compared to my simple implementation. To name a few:

  • Copy-on-write
  • Proper Unicode support
  • std::string referencing
  • Thread safety

In fact, the unencoded_rope implementation is a lot more complex and resembles a B-tree more than a classic binary rope. More can be found in the official documentation.
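Here's a minimal usage sketch of what the copy-on-write behaviour buys you, assuming unencoded_rope's documented value semantics and size() accessor (this isn't lifted from the docs verbatim, so treat it as illustrative):

#include <boost/text/unencoded_rope.hpp>
#include <cassert>
#include <string>

int main() {
    boost::text::unencoded_rope r{"I'm the initial string"};
    boost::text::unencoded_rope copy = r;  // cheap: copy-on-write, no character copying
    r += std::string{", extended"};        // only r observes the change
    assert(r.size() > copy.size());        // copy still refers to the original contents
}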

Custom implementation

For the sake of this comparison, I've implemented a rope following the description on Wikipedia. I didn't target any particular production application, nor did I try to optimise the code or make it production-proof in any shape or form. My implementation is rather straightforward, without too much thought put into it.

I’ve got a common interface for nodes:

class node {
public:
    virtual ~node() = default;

    // Weight: length of the left subtree for internal nodes,
    // length of the stored string for leaves.
    virtual std::size_t weight() const = 0;

    // Total number of characters held in this subtree.
    virtual std::size_t size() const = 0;

    virtual std::string to_string() const = 0;

    // Character at the given index within this subtree.
    virtual std::string::value_type at(std::size_t) const = 0;

    // Split the subtree in two at the given offset.
    virtual std::pair<std::unique_ptr<node>, std::unique_ptr<node>> split_at(std::size_t offs) = 0;
};

There are two types of nodes:

  • leaves - these actually contain the string characters
  • internals - these glue the nodes together in the tree

I won't go into much detail, as the code is quite short and easy to comprehend, so if you're interested, I encourage you to have a look here.
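Just to give a taste, here's roughly what a leaf could look like against that interface (a sketch in the spirit of my implementation, not a verbatim excerpt; it assumes the node interface above plus <memory>, <string> and <utility>):

// A leaf stores a contiguous chunk of characters.
class leaf : public node {
public:
    explicit leaf(std::string data) : data_{std::move(data)} {}

    // For a leaf, weight and size coincide: the number of stored characters.
    std::size_t weight() const override { return data_.size(); }
    std::size_t size() const override { return data_.size(); }

    std::string to_string() const override { return data_; }

    std::string::value_type at(std::size_t idx) const override { return data_[idx]; }

    // Splitting a leaf yields two smaller leaves around the offset.
    std::pair<std::unique_ptr<node>, std::unique_ptr<node>> split_at(std::size_t offs) override {
        return {std::make_unique<leaf>(data_.substr(0, offs)),
                std::make_unique<leaf>(data_.substr(offs))};
    }

private:
    std::string data_;
};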

Benchmarking

This is just a very rough, back-of-the-envelope comparison to see whether it makes sense to resort to ropes at all. The presented implementation and the comparison itself should therefore be taken with a pinch of salt; they are by no means meant to resemble solid scientific analysis.

Right, that being said, here are my benchmarks:

// Includes added for completeness; the custom rope header name is assumed
// here - check the repo for the actual layout.
#include <benchmark/benchmark.h>
#include <boost/text/unencoded_rope.hpp>
#include <sstream>
#include <string>
#include <utility>

#include "rope.hpp"

static void BM_rope_concatenation(benchmark::State& state) {
    rope::rope r{"I'm the initial string"};
    const std::size_t cnt = state.range(0);

    for (auto _ : state) {
        state.PauseTiming();
        std::string s(cnt, 'a');
        state.ResumeTiming();
        r.append(std::move(s));
    }
}

static void BM_stringstream_concatenation(benchmark::State& state) {
    std::stringstream ss;
    ss << "I'm the initial string";
    const std::size_t cnt = state.range(0);

    for (auto _ : state) {
        state.PauseTiming();
        std::string s(cnt, 'a');
        state.ResumeTiming();
        ss << s;
    }
}

static void BM_string_concatenation(benchmark::State& state) {
    std::string str{"I'm the initial string"};

    const std::size_t cnt = state.range(0);

    for (auto _ : state) {
        state.PauseTiming();
        std::string s(cnt, 'a');
        state.ResumeTiming();
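        // NB: std::string::append has no rvalue overload, so despite the
        // std::move the characters are still copied.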
        str.append(std::move(s));
    }
}

static void BM_boost_text_concatenation(benchmark::State& state) {
    boost::text::unencoded_rope r{"I'm the initial string"};

    const std::size_t cnt = state.range(0);

    for (auto _ : state) {
        state.PauseTiming();
        std::string s(cnt, 'a');
        state.ResumeTiming();
        r += s;
    }
}

Since the benchmarks rely on Boost.Text, I’ve kept them on a separate branch within the repo.
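
The registration isn't shown above, but to produce the chunk sizes and iteration counts you'll see in the results, it would look along these lines (the exact form is my assumption):

// Powers of four from 256 bytes to 1 MiB, 5000 iterations each, which
// matches the /256/iterations:5000 style labels in the output below.
BENCHMARK(BM_rope_concatenation)->RangeMultiplier(4)->Range(256, 1 << 20)->Iterations(5000);
BENCHMARK(BM_stringstream_concatenation)->RangeMultiplier(4)->Range(256, 1 << 20)->Iterations(5000);
BENCHMARK(BM_string_concatenation)->RangeMultiplier(4)->Range(256, 1 << 20)->Iterations(5000);
BENCHMARK(BM_boost_text_concatenation)->RangeMultiplier(4)->Range(256, 1 << 20)->Iterations(5000);

BENCHMARK_MAIN();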

A quick word about what's going on here. The first benchmark parameter:

const std::size_t cnt = state.range(0);

… defines the length of the string appended in each iteration. Additionally, the definition of the string s is wrapped in pause/resume timing statements: I want to measure purely the concatenation time, and the allocation of the string being appended is irrelevant, hence I excluded it from the measurements. Here are the results:

Run on (8 X 3900 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x4)
  L1 Instruction 32 KiB (x4)
  L2 Unified 256 KiB (x4)
  L3 Unified 8192 KiB (x1)
Load Average: 0.00, 0.00, 0.01
------------------------------------------------------------------------------------------------
Benchmark                                                      Time             CPU   Iterations
------------------------------------------------------------------------------------------------
BM_rope_concatenation/256/iterations:5000                    675 ns          673 ns         5000
BM_rope_concatenation/1024/iterations:5000                   604 ns          612 ns         5000
BM_rope_concatenation/4096/iterations:5000                   594 ns          592 ns         5000
BM_rope_concatenation/16384/iterations:5000                  559 ns          547 ns         5000
BM_rope_concatenation/65536/iterations:5000                  609 ns          568 ns         5000
BM_rope_concatenation/262144/iterations:5000                 756 ns          666 ns         5000
BM_rope_concatenation/1048576/iterations:5000                854 ns          762 ns         5000
BM_stringstream_concatenation/256/iterations:5000            460 ns          463 ns         5000
BM_stringstream_concatenation/1024/iterations:5000           606 ns          614 ns         5000
BM_stringstream_concatenation/4096/iterations:5000          2101 ns         2104 ns         5000
BM_stringstream_concatenation/16384/iterations:5000         7704 ns         7710 ns         5000
BM_stringstream_concatenation/65536/iterations:5000        29568 ns        29468 ns         5000
BM_stringstream_concatenation/262144/iterations:5000      116913 ns       116582 ns         5000
BM_stringstream_concatenation/1048576/iterations:5000     515868 ns       514569 ns         5000
BM_string_concatenation/256/iterations:5000                  450 ns          456 ns         5000
BM_string_concatenation/1024/iterations:5000                 612 ns          609 ns         5000
BM_string_concatenation/4096/iterations:5000                1711 ns         1716 ns         5000
BM_string_concatenation/16384/iterations:5000               7115 ns         7118 ns         5000
BM_string_concatenation/65536/iterations:5000              28635 ns        28538 ns         5000
BM_string_concatenation/262144/iterations:5000            114379 ns       114045 ns         5000
BM_string_concatenation/1048576/iterations:5000           508107 ns       506853 ns         5000
BM_boost_text_concatenation/256/iterations:5000             1673 ns         1684 ns         5000
BM_boost_text_concatenation/1024/iterations:5000            2056 ns         2063 ns         5000
BM_boost_text_concatenation/4096/iterations:5000            1874 ns         1889 ns         5000
BM_boost_text_concatenation/16384/iterations:5000           4993 ns         5020 ns         5000
BM_boost_text_concatenation/65536/iterations:5000          17931 ns        17936 ns         5000
BM_boost_text_concatenation/262144/iterations:5000         80412 ns        80369 ns         5000
BM_boost_text_concatenation/1048576/iterations:5000       312508 ns       312409 ns         5000

It's visible from the above that std::stringstream adds only a small overhead over bare std::string::append. The append time grows proportionally with the chunk size both for std::string::append and std::stringstream. This is expected, as copying the appended data can't be avoided.

It's a completely different story for my simple rope implementation: the measured time remains roughly constant, independent of the chunk size, since appending boils down to allocating a couple of constant-size nodes.

I can't really comment on boost::text::unencoded_rope; the underlying data structure is a lot more complex. Potentially there's tree rebalancing happening at some point, which makes the time dependent on the chunk size. Maybe the nodes are restricted to a certain number of characters, which incurs an additional copying penalty. Still, for the largest chunks it's almost twice as fast as a conventional std::stringstream or std::string.

Here’s how it looks plotted on a graph:

[Figure: benchmark timings plotted against chunk size - /images/rope_plot.png]

Conclusion

Standard streams and std::string::append on their own are a perfectly good solution for string building when dealing with small strings. However, the comparison above makes it clear that ropes, even in their simplest form, may yield significant advantages as the chunk size grows. The benefit is quite obvious for chunks bigger than 1 KiB, where, in the case of a simple rope implementation like the one discussed in this post, the only cost of appending a string is the allocation of a constant-size internal rope node. Use cases for ropes are therefore quite limited, but they remain a relevant data structure that can be a very attractive alternative for a specialised set of applications.