| 200 | In this section we compare lookup scalability of CHT with ordinary spinlock guarded hash tables. |

| 201 | The test consists of searching for keys in tables with load factor 4 (ie 4 items/bucket on average). |

| 202 | Half of all lookups searched for keys not present in the table. Moreover, all items were inserted |

| 203 | into the tables before the benchmark started. |

| 204 | |

| 205 | In the figure above: |

| 206 | - //ht/spinlock// - represents the original non-resizible kernel hash table with 127 buckets |

| 207 | and guarded by a single global spinlock. |

| 208 | - //ht/bktlock// - same as ht/spinlock but protects each bucket with a separate spinlock. |

| 209 | - //cht/a-rcu// - CHT protected by A-RCU, uses 128 buckets. |

| 210 | |

| 211 | Note that CHT was set up with the resize triggering load factor high enough for CHT to never |

| 212 | resize. However, CHT was still tracking the number of items in the table and still checking |

| 213 | if it should resize -- quite unlike the spinlock protected tables that are not resizible |

| 214 | and therefore do not need to track the number of elements nor check if a resize is in order. |

| 215 | |

| 216 | What is more, CHT and HTs use different hash functions. While CHT mixes user supplied |

| 217 | hashes to produce a good hash, HTs divide user supplied hashes by a prime number and |

| 218 | use the remainder as the final hash. This influences both the distribution of items |

| 219 | in buckets as well as the time to actually compute a hash. Keys were selected to favor |

| 220 | traditional HTs which achieved an optimal item distribution (exactly 4 per each bucket). |

| 221 | On the other hand, in CHT buckets contained from 0 to 9 items. Furthermore, the hash |

| 222 | mixing function used by CHT in rev1589 appears to be slightly slower (~10% slower) than |

| 223 | dividing by a prime. |

| 224 | |

| 225 | As expected, a HT protected by a single spinlock scales negatively. On the other hand, |

| 226 | a HT with one spinlock/bucket scales fairly well but is significantly slower than CHT. |

| 227 | CHT performs slightly better than both HTs in the base case of running on a single cpu. |

| 228 | |