Experimenting with Robin Hood hashing

I’ve recently discovered a paper¹ about Robin Hood hashing and decided to perform some simple experiments to check how a trivial, custom implementation of a hash table employing this algorithm stacks up against unordered_map.

Collisions

In general, collisions in hash tables are handled either with linear probing (meaning the colliding element is inserted into the next available slot when the one it hashes to is already occupied) or with element chaining, in which case elements with the same hash are simply chained together into a linked list.
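
Since the rest of this post deals with the linear-probing flavour, here is a minimal sketch of a plain linear-probing insert for reference (illustrative only; LinearProbingTable and its layout are my own names, not from the paper):

#include <array>
#include <cstddef>
#include <functional>
#include <optional>

// Plain linear probing: on a collision, keep walking forward until a free
// slot is found (no PSL bookkeeping yet).
template <typename T, std::size_t N> struct LinearProbingTable {
  std::array<std::optional<T>, N> slots{};

  bool insert(const T &v) {
    std::size_t p = std::hash<T>{}(v) % N;
    for (std::size_t i = 0; i < N; ++i) { // at most N probes
      if (!slots[p]) {                    // free slot -- take it
        slots[p] = v;
        return true;
      }
      p = (p + 1) % N; // probe the next slot, wrapping around
    }
    return false; // table is full
  }
};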

Robin Hood hashing is a technique applied to implementations using linear probing that aims to minimise the probe sequence length in order to speed up lookups. Based on that, it’s safe to assume it targets read-heavy workloads.

It’s all about PSL

In hash tables employing linear probing, the PSL (probe sequence length) describes the distance between an element’s hashed (home) index and the index it actually occupies. These two can be far apart if the hash table is heavily populated and a lot of elements hash to the same index. For example, an element that hashes to index 3 but ends up stored at index 5 has a PSL of 2.
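
Expressed in code (a sketch; home, actual and capacity are illustrative names of mine), with wrap-around at the end of the table taken into account:

#include <cstddef>

// PSL of an element whose hashed (home) index is `home` but which actually
// resides at index `actual`, in a table of `capacity` slots.
std::size_t probeSequenceLength(std::size_t home, std::size_t actual,
                                std::size_t capacity) {
  return (actual + capacity - home) % capacity; // handles wrap-around
}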

Spoiler alert: Robin Hood hashing aims to minimise the PSL like so:

void insert(v) {
    p = (hash of v) % b.length;
    vpsl= 0;
    while (b[p] != null) {
        if (vpsl > b[p].psl) {
            swap v and b[p].v;
            swap vpsl and b[p].psl;
        }

        p= (p+1) % b.length;
        vpsl= vpsl+1;
    }

    b[p].v= v;
    b[p].psl= vpsl;
}

I’ve copied this pseudo-code from the original paper verbatim. vpsl is a bit of a strange name for a variable, but here it simply means the PSL of v, the latter being the name of the element being inserted. In short, the while loop runs until we find a free slot in the table. If there are no collisions, it doesn’t execute at all: the PSL of v is zero and the element is inserted straight into its designated hashed index. Therefore, psl == 0 indicates that a given element was inserted with no collisions.

In case the hashed index of v is already occupied, the loop linearly looks for the next free slot in the table, incrementing vpsl with each iteration.

The magic happens in:

...
if (vpsl > b[p].psl) {
    swap v and b[p].v;
    swap vpsl and b[p].psl;
}
...

This swaps the element being inserted with the one at the current index whenever the incoming element is “poorer” (its PSL is higher), which evens out the PSLs of elements across the table. For example, if the incoming element has already probed three slots (PSL 3) and hits an element sitting in its home slot (PSL 0), the incoming element takes that slot and the displaced element continues probing.

Implementation & benchmarks

I’ve come up with this quick implementation sketch just to get some benchmark numbers:

#include <cstddef>
#include <functional>
#include <iostream>
#include <utility>

template <typename ValueT, std::size_t LenV> struct RHHT {
  struct EntryT {
    ValueT v;
    int psl{0};
    bool isUsed{false};
  };

  EntryT data[LenV] = {};
  size_t occupied = 0;

  static size_t hashValue(ValueT v) { return std::hash<ValueT>{}(v); }

  void dump() const {
    for (size_t i = 0; i < LenV; ++i) {
      const auto &d = data[i];
      if (!d.isUsed) {
        continue;
      }
      std::cout << "#" << i << " (v:" << d.v << ")[psl:" << d.psl << "]"
                << std::endl;
    }
  }

  bool hasSpace() const { return occupied < LenV; }

  bool insert(ValueT v) {
    auto p = hashValue(v) % LenV;
    auto vpsl = 0;
    if (!hasSpace()) {
      return false;
    }

    while (data[p].isUsed) {
      // Robin Hood: if the incoming element is "poorer" (has probed
      // further), it takes this slot and the displaced one keeps probing.
      if (vpsl > data[p].psl) {
        std::swap(data[p].v, v);
        std::swap(data[p].psl, vpsl);
      }
      p = (p + 1) % LenV;
      vpsl += 1;
    }

    data[p].v = v;
    data[p].psl = vpsl;
    data[p].isUsed = true;
    occupied++;
    return true;
  }

  bool has(ValueT v) const {
    auto p = hashValue(v) % LenV;
    auto vpsl = 0;
    while (data[p].isUsed) {
      if (data[p].v == v) {
        return true;
      }

      // If v were in the table, it would have been swapped in before an
      // element with a smaller PSL -- safe to stop early.
      if (vpsl > data[p].psl) {
        return false;
      }

      p = (p + 1) % LenV;
      vpsl++;
    }
    return false;
  }

  double avgPsl() const {
    size_t total = 0, used = 0;
    for (const auto &d : data) {
      if (!d.isUsed) {
        continue;
      }
      total += d.psl;
      used++;
    }
    return used ? static_cast<double>(total) / used : 0.0;
  }

  double avgLoad() const { return static_cast<double>(occupied) / LenV; }
};
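
To sanity-check the behaviour, here’s a quick usage sketch (assuming, as is the case on typical standard-library implementations, that std::hash<int> is the identity, so 1, 9 and 17 all map to slot 1 in a table of 8 and have to fight over it):

#include <iostream>

int main() {
  RHHT<int, 8> ht;
  for (int v : {1, 9, 17, 3}) {
    ht.insert(v);
  }
  ht.dump(); // prints each occupied slot with its value and PSL
  std::cout << "avg psl: " << ht.avgPsl() << ", load: " << ht.avgLoad()
            << std::endl;
  return 0;
}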

The insert algorithm is reimplemented exactly as in the paper. There are no special quirks or features added on top of it.

With the help of LLMs, I’ve generated the following Google Benchmark suite, which compares the performance of my implementation against unordered_map.

#include <benchmark/benchmark.h>

#include <algorithm>
#include <random>
#include <unordered_map>
#include <vector>

std::vector<int> generateTestData(size_t count, int seed = 42) {
  std::mt19937 gen(seed);
  std::uniform_int_distribution<int> dis(0, 1000000);

  std::vector<int> data;
  data.reserve(count);
  for (size_t i = 0; i < count; ++i) {
    data.push_back(dis(gen));
  }
  return data;
}

template <size_t Size>
static void BM_RobinHood_Insert(benchmark::State &state) {
  auto data = generateTestData(Size);

  for (auto _ : state) {
    RHHT<int, Size> ht;
    for (const auto &val : data) {
      benchmark::DoNotOptimize(ht.insert(val));
    }
    benchmark::ClobberMemory();
  }

  state.SetItemsProcessed(state.iterations() * Size);
}

static void BM_UnorderedMap_Insert(benchmark::State &state) {
  size_t size = state.range(0);
  auto data = generateTestData(size);

  for (auto _ : state) {
    std::unordered_map<int, int> map;
    for (const auto &val : data) {
      benchmark::DoNotOptimize(map.insert({val, val}));
    }
    benchmark::ClobberMemory();
  }

  state.SetItemsProcessed(state.iterations() * size);
}

template <size_t Size, int LoadFactorPercent>
static void BM_RobinHood_Lookup_Sequential(benchmark::State &state) {
  size_t numElements = (Size * LoadFactorPercent) / 100;
  auto data = generateTestData(numElements);

  // Pre-populate
  RHHT<int, Size> ht;
  for (const auto &val : data) {
    ht.insert(val);
  }

  size_t idx = 0;
  for (auto _ : state) {
    bool found = ht.has(data[idx % data.size()]);
    benchmark::DoNotOptimize(found);
    idx++;
  }

  state.SetItemsProcessed(state.iterations());
}

template <size_t Size, int LoadFactorPercent>
static void BM_UnorderedMap_Lookup_Sequential(benchmark::State &state) {
  size_t numElements = (Size * LoadFactorPercent) / 100;
  auto data = generateTestData(numElements);

  // Pre-populate
  std::unordered_map<int, int> map;
  for (const auto &val : data) {
    map[val] = val;
  }

  size_t idx = 0;
  for (auto _ : state) {
    bool found = map.find(data[idx % data.size()]) != map.end();
    benchmark::DoNotOptimize(found);
    idx++;
  }

  state.SetItemsProcessed(state.iterations());
}

// Random lookup pattern (more realistic)
template <size_t Size, int LoadFactorPercent>
static void BM_RobinHood_Lookup_Random(benchmark::State &state) {
  size_t numElements = (Size * LoadFactorPercent) / 100;
  auto data = generateTestData(numElements);
  auto lookupKeys =
      generateTestData(numElements, 123); // Different seed for lookups

  // Pre-populate
  RHHT<int, Size> ht;
  for (const auto &val : data) {
    ht.insert(val);
  }

  size_t idx = 0;
  for (auto _ : state) {
    bool found = ht.has(lookupKeys[idx % lookupKeys.size()]);
    benchmark::DoNotOptimize(found);
    idx++;
  }

  state.SetItemsProcessed(state.iterations());
}

template <size_t Size, int LoadFactorPercent>
static void BM_UnorderedMap_Lookup_Random(benchmark::State &state) {
  size_t numElements = (Size * LoadFactorPercent) / 100;
  auto data = generateTestData(numElements);
  auto lookupKeys = generateTestData(numElements, 123);

  // Pre-populate
  std::unordered_map<int, int> map;
  for (const auto &val : data) {
    map[val] = val;
  }

  size_t idx = 0;
  for (auto _ : state) {
    bool found = map.find(lookupKeys[idx % lookupKeys.size()]) != map.end();
    benchmark::DoNotOptimize(found);
    idx++;
  }

  state.SetItemsProcessed(state.iterations());
}

// ============================================================================
// MIXED WORKLOAD (90% reads, 10% writes)
// ============================================================================

template <size_t Size>
static void BM_RobinHood_Mixed_90Read(benchmark::State &state) {
  size_t numElements = (Size * 80) / 100;
  auto data = generateTestData(numElements);
  auto lookupKeys = generateTestData(numElements, 123);

  for (auto _ : state) {
    state.PauseTiming();
    RHHT<int, Size> ht;
    for (const auto &val : data) {
      ht.insert(val);
    }
    state.ResumeTiming();

    for (size_t i = 0; i < 1000; ++i) {
      if (i % 10 == 0) {
        // 10% writes
        benchmark::DoNotOptimize(ht.insert(lookupKeys[i % lookupKeys.size()]));
      } else {
        // 90% reads
        benchmark::DoNotOptimize(ht.has(lookupKeys[i % lookupKeys.size()]));
      }
    }
  }

  state.SetItemsProcessed(state.iterations() * 1000);
}

static void BM_UnorderedMap_Mixed_90Read(benchmark::State &state) {
  size_t size = state.range(0);
  size_t numElements = (size * 80) / 100;
  auto data = generateTestData(numElements);
  auto lookupKeys = generateTestData(numElements, 123);

  for (auto _ : state) {
    state.PauseTiming();
    std::unordered_map<int, int> map;
    for (const auto &val : data) {
      map[val] = val;
    }
    state.ResumeTiming();

    for (size_t i = 0; i < 1000; ++i) {
      if (i % 10 == 0) {
        benchmark::DoNotOptimize(map[lookupKeys[i % lookupKeys.size()]] = i);
      } else {
        benchmark::DoNotOptimize(map.find(lookupKeys[i % lookupKeys.size()]) !=
                                 map.end());
      }
    }
  }

  state.SetItemsProcessed(state.iterations() * 1000);
}

// ============================================================================
// CACHE BEHAVIOR TEST (Many lookups on same small set)
// ============================================================================

template <size_t Size>
static void BM_RobinHood_HotCache(benchmark::State &state) {
  size_t numElements = (Size * 80) / 100;
  auto data = generateTestData(numElements);

  RHHT<int, Size> ht;
  for (const auto &val : data) {
    ht.insert(val);
  }

  // Hot set: only 100 keys accessed repeatedly
  std::vector<int> hotKeys(data.begin(),
                           data.begin() + std::min(size_t(100), data.size()));

  size_t idx = 0;
  for (auto _ : state) {
    bool found = ht.has(hotKeys[idx % hotKeys.size()]);
    benchmark::DoNotOptimize(found);
    idx++;
  }

  state.SetItemsProcessed(state.iterations());
}

template <size_t Size>
static void BM_UnorderedMap_HotCache(benchmark::State &state) {
  size_t numElements = (Size * 80) / 100;
  auto data = generateTestData(numElements);

  std::unordered_map<int, int> map;
  for (const auto &val : data) {
    map[val] = val;
  }

  std::vector<int> hotKeys(data.begin(),
                           data.begin() + std::min(size_t(100), data.size()));

  size_t idx = 0;
  for (auto _ : state) {
    bool found = map.find(hotKeys[idx % hotKeys.size()]) != map.end();
    benchmark::DoNotOptimize(found);
    idx++;
  }

  state.SetItemsProcessed(state.iterations());
}

// ============================================================================
// REGISTER BENCHMARKS
// ============================================================================

// Small size (1K elements)
BENCHMARK_TEMPLATE(BM_RobinHood_Lookup_Sequential, 1024, 75);
BENCHMARK_TEMPLATE(BM_UnorderedMap_Lookup_Sequential, 1024, 75);
BENCHMARK_TEMPLATE(BM_RobinHood_Lookup_Sequential, 1024, 90);
BENCHMARK_TEMPLATE(BM_UnorderedMap_Lookup_Sequential, 1024, 90);

BENCHMARK_TEMPLATE(BM_RobinHood_Lookup_Random, 1024, 75);
BENCHMARK_TEMPLATE(BM_UnorderedMap_Lookup_Random, 1024, 75);
BENCHMARK_TEMPLATE(BM_RobinHood_Lookup_Random, 1024, 90);
BENCHMARK_TEMPLATE(BM_UnorderedMap_Lookup_Random, 1024, 90);

// Medium size (10K elements)
BENCHMARK_TEMPLATE(BM_RobinHood_Lookup_Sequential, 10240, 75);
BENCHMARK_TEMPLATE(BM_UnorderedMap_Lookup_Sequential, 10240, 75);
BENCHMARK_TEMPLATE(BM_RobinHood_Lookup_Sequential, 10240, 90);
BENCHMARK_TEMPLATE(BM_UnorderedMap_Lookup_Sequential, 10240, 90);

BENCHMARK_TEMPLATE(BM_RobinHood_Lookup_Random, 10240, 75);
BENCHMARK_TEMPLATE(BM_UnorderedMap_Lookup_Random, 10240, 75);
BENCHMARK_TEMPLATE(BM_RobinHood_Lookup_Random, 10240, 90);
BENCHMARK_TEMPLATE(BM_UnorderedMap_Lookup_Random, 10240, 90);

// Large size (100K elements)
BENCHMARK_TEMPLATE(BM_RobinHood_Lookup_Sequential, 102400, 75);
BENCHMARK_TEMPLATE(BM_UnorderedMap_Lookup_Sequential, 102400, 75);
BENCHMARK_TEMPLATE(BM_RobinHood_Lookup_Sequential, 102400, 90);
BENCHMARK_TEMPLATE(BM_UnorderedMap_Lookup_Sequential, 102400, 90);

BENCHMARK_TEMPLATE(BM_RobinHood_Lookup_Random, 102400, 75);
BENCHMARK_TEMPLATE(BM_UnorderedMap_Lookup_Random, 102400, 75);
BENCHMARK_TEMPLATE(BM_RobinHood_Lookup_Random, 102400, 90);
BENCHMARK_TEMPLATE(BM_UnorderedMap_Lookup_Random, 102400, 90);

// Insertion benchmarks
BENCHMARK_TEMPLATE(BM_RobinHood_Insert, 1024);
BENCHMARK(BM_UnorderedMap_Insert)->Arg(1024);
BENCHMARK_TEMPLATE(BM_RobinHood_Insert, 10240);
BENCHMARK(BM_UnorderedMap_Insert)->Arg(10240);
BENCHMARK_TEMPLATE(BM_RobinHood_Insert, 102400);
BENCHMARK(BM_UnorderedMap_Insert)->Arg(102400);

// Mixed workload
BENCHMARK_TEMPLATE(BM_RobinHood_Mixed_90Read, 10240);
BENCHMARK(BM_UnorderedMap_Mixed_90Read)->Arg(10240);

// Hot cache
BENCHMARK_TEMPLATE(BM_RobinHood_HotCache, 10240);
BENCHMARK_TEMPLATE(BM_UnorderedMap_HotCache, 10240);

BENCHMARK_MAIN();

Debug build (-O0 -g) results are very promising:

2025-11-16T18:06:17+00:00
Running ./bld/rh
Run on (16 X 1624.14 MHz CPU s)
CPU Caches:
  L1 Data 48 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 1280 KiB (x8)
  L3 Unified 18432 KiB (x1)
Load Average: 0.65, 0.85, 0.68
--------------------------------------------------------------------------------------------------------
Benchmark                                              Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------------------------
BM_RobinHood_Lookup_Sequential<1024, 75>            23.2 ns         23.2 ns     29431364 items_per_second=43.0989M/s
BM_UnorderedMap_Lookup_Sequential<1024, 75>         47.7 ns         47.7 ns     14715163 items_per_second=20.9762M/s
BM_RobinHood_Lookup_Sequential<1024, 90>            30.1 ns         30.0 ns     22890742 items_per_second=33.301M/s
BM_UnorderedMap_Lookup_Sequential<1024, 90>         49.0 ns         48.9 ns     14332389 items_per_second=20.4436M/s
BM_RobinHood_Lookup_Random<1024, 75>                23.3 ns         23.2 ns     30824388 items_per_second=43.0181M/s
BM_UnorderedMap_Lookup_Random<1024, 75>             50.0 ns         49.9 ns     14074009 items_per_second=20.0314M/s
BM_RobinHood_Lookup_Random<1024, 90>                30.8 ns         30.8 ns     22487564 items_per_second=32.5128M/s
BM_UnorderedMap_Lookup_Random<1024, 90>             53.3 ns         53.2 ns     13227582 items_per_second=18.7909M/s
BM_RobinHood_Lookup_Sequential<10240, 75>           15.1 ns         15.1 ns     46823053 items_per_second=66.3617M/s
BM_UnorderedMap_Lookup_Sequential<10240, 75>        51.5 ns         51.5 ns     13923980 items_per_second=19.4305M/s
BM_RobinHood_Lookup_Sequential<10240, 90>           19.4 ns         19.4 ns     37017310 items_per_second=51.5001M/s
BM_UnorderedMap_Lookup_Sequential<10240, 90>        54.2 ns         54.2 ns     12995774 items_per_second=18.4631M/s
BM_RobinHood_Lookup_Random<10240, 75>               14.1 ns         14.1 ns     49210793 items_per_second=71.0811M/s
BM_UnorderedMap_Lookup_Random<10240, 75>            56.7 ns         56.7 ns     12466727 items_per_second=17.6515M/s
BM_RobinHood_Lookup_Random<10240, 90>               19.0 ns         18.9 ns     36736116 items_per_second=52.8209M/s
BM_UnorderedMap_Lookup_Random<10240, 90>            59.8 ns         59.8 ns     11552573 items_per_second=16.734M/s
BM_RobinHood_Lookup_Sequential<102400, 75>          24.5 ns         24.5 ns     28520988 items_per_second=40.7975M/s
BM_UnorderedMap_Lookup_Sequential<102400, 75>       61.6 ns         61.5 ns     11358184 items_per_second=16.2709M/s
BM_RobinHood_Lookup_Sequential<102400, 90>          33.4 ns         33.3 ns     21381782 items_per_second=30.0186M/s
BM_UnorderedMap_Lookup_Sequential<102400, 90>       56.5 ns         56.4 ns     12453230 items_per_second=17.7316M/s
BM_RobinHood_Lookup_Random<102400, 75>              23.1 ns         23.0 ns     28817244 items_per_second=43.4076M/s
BM_UnorderedMap_Lookup_Random<102400, 75>           67.5 ns         67.3 ns     10439771 items_per_second=14.8492M/s
BM_RobinHood_Lookup_Random<102400, 90>              33.2 ns         33.2 ns     21677768 items_per_second=30.1504M/s
BM_UnorderedMap_Lookup_Random<102400, 90>           57.8 ns         57.7 ns     12323824 items_per_second=17.3335M/s
BM_RobinHood_Insert<1024>                          95319 ns        95212 ns         7270 items_per_second=10.755M/s
BM_UnorderedMap_Insert/1024                       112268 ns       112161 ns         6246 items_per_second=9.12976M/s
BM_RobinHood_Insert<10240>                       4249993 ns      4245093 ns          164 items_per_second=2.4122M/s
BM_UnorderedMap_Insert/10240                     1162218 ns      1161018 ns          604 items_per_second=8.81984M/s
BM_RobinHood_Insert<102400>                    826079369 ns    824980437 ns            1 items_per_second=124.124k/s
BM_UnorderedMap_Insert/102400                   15235647 ns     15212770 ns           46 items_per_second=6.73119M/s
BM_RobinHood_Mixed_90Read<10240>                   20910 ns        20882 ns        33439 items_per_second=47.889M/s
BM_UnorderedMap_Mixed_90Read/10240                239302 ns       239000 ns         2929 items_per_second=4.1841M/s
BM_RobinHood_HotCache<10240>                        9.40 ns         9.39 ns     73922140 items_per_second=106.523M/s
BM_UnorderedMap_HotCache<10240>                     49.3 ns         49.3 ns     14228030 items_per_second=20.2833M/s

Clearly, there’s at least a 2x lookup advantage for Robin Hood hashing over the default unordered_map (insertions into the larger tables are a different story). This is perhaps better visible on the graphs:

Debug build, access times

Debug build, insert operations times

The access-time advantage quickly diminishes when switching to an actual release build (-O3):

2025-11-16T18:04:25+00:00
Running ./bld/rh
Run on (16 X 400 MHz CPU s)
CPU Caches:
  L1 Data 48 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 1280 KiB (x8)
  L3 Unified 18432 KiB (x1)
Load Average: 2.59, 1.05, 0.70
--------------------------------------------------------------------------------------------------------
Benchmark                                              Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------------------------
BM_RobinHood_Lookup_Sequential<1024, 75>            5.40 ns         5.39 ns    133449597 items_per_second=185.489M/s
BM_UnorderedMap_Lookup_Sequential<1024, 75>         5.27 ns         5.27 ns    132876156 items_per_second=189.839M/s
BM_RobinHood_Lookup_Sequential<1024, 90>            8.55 ns         8.54 ns     83301857 items_per_second=117.098M/s
BM_UnorderedMap_Lookup_Sequential<1024, 90>         5.39 ns         5.39 ns    129537377 items_per_second=185.468M/s
BM_RobinHood_Lookup_Random<1024, 75>                5.89 ns         5.88 ns    119774512 items_per_second=170.068M/s
BM_UnorderedMap_Lookup_Random<1024, 75>             5.90 ns         5.90 ns    118744624 items_per_second=169.57M/s
BM_RobinHood_Lookup_Random<1024, 90>                8.85 ns         8.85 ns     80458434 items_per_second=113.045M/s
BM_UnorderedMap_Lookup_Random<1024, 90>             6.27 ns         6.26 ns    111711716 items_per_second=159.653M/s
BM_RobinHood_Lookup_Sequential<10240, 75>           4.83 ns         4.82 ns    144810114 items_per_second=207.391M/s
BM_UnorderedMap_Lookup_Sequential<10240, 75>        5.48 ns         5.48 ns    126778145 items_per_second=182.631M/s
BM_RobinHood_Lookup_Sequential<10240, 90>           7.54 ns         7.53 ns     93302180 items_per_second=132.745M/s
BM_UnorderedMap_Lookup_Sequential<10240, 90>        7.45 ns         7.44 ns     93659417 items_per_second=134.361M/s
BM_RobinHood_Lookup_Random<10240, 75>               4.39 ns         4.39 ns    159898091 items_per_second=227.92M/s
BM_UnorderedMap_Lookup_Random<10240, 75>            14.4 ns         14.4 ns     49152677 items_per_second=69.4271M/s
BM_RobinHood_Lookup_Random<10240, 90>               6.32 ns         6.31 ns    111408154 items_per_second=158.355M/s
BM_UnorderedMap_Lookup_Random<10240, 90>            15.9 ns         15.9 ns     43591477 items_per_second=62.957M/s
BM_RobinHood_Lookup_Sequential<102400, 75>          15.5 ns         15.5 ns     45106080 items_per_second=64.451M/s
BM_UnorderedMap_Lookup_Sequential<102400, 75>       14.5 ns         14.5 ns     48171299 items_per_second=69.1551M/s
BM_RobinHood_Lookup_Sequential<102400, 90>          18.5 ns         18.5 ns     37614565 items_per_second=54.039M/s
BM_UnorderedMap_Lookup_Sequential<102400, 90>       10.9 ns         10.9 ns     63764135 items_per_second=91.5571M/s
BM_RobinHood_Lookup_Random<102400, 75>              14.8 ns         14.8 ns     47009683 items_per_second=67.4697M/s
BM_UnorderedMap_Lookup_Random<102400, 75>           24.1 ns         24.0 ns     29192357 items_per_second=41.6372M/s
BM_RobinHood_Lookup_Random<102400, 90>              18.3 ns         18.3 ns     38073755 items_per_second=54.6051M/s
BM_UnorderedMap_Lookup_Random<102400, 90>           18.6 ns         18.6 ns     37694527 items_per_second=53.7428M/s
BM_RobinHood_Insert<1024>                            236 ns          236 ns      2962331 items_per_second=4.33331G/s
BM_UnorderedMap_Insert/1024                        30669 ns        30636 ns        22923 items_per_second=33.4252M/s
BM_RobinHood_Insert<10240>                          2292 ns         2291 ns       305683 items_per_second=4.47005G/s
BM_UnorderedMap_Insert/10240                      458507 ns       457909 ns         1533 items_per_second=22.3625M/s
BM_RobinHood_Insert<102400>                        22836 ns        22828 ns        30668 items_per_second=4.48567G/s
BM_UnorderedMap_Insert/102400                    7463094 ns      7448520 ns           95 items_per_second=13.7477M/s
BM_RobinHood_Mixed_90Read<10240>                    9574 ns         9551 ns        73202 items_per_second=104.697M/s
BM_UnorderedMap_Mixed_90Read/10240                123672 ns       123521 ns         5690 items_per_second=8.09582M/s
BM_RobinHood_HotCache<10240>                        2.35 ns         2.34 ns    298672432 items_per_second=426.647M/s
BM_UnorderedMap_HotCache<10240>                     5.42 ns         5.41 ns    129175740 items_per_second=184.684M/s

Release build, access times

Release build, insert operations times

Still, there’s a very noticeable performance advantage when performing random lookups, which is surely helped by the CPU cache and the data locality of my flat-array implementation. That makes this approach viable as some sort of lookup cache.