| 200 | In this section we compare lookup scalability of CHT with ordinary spinlock guarded hash tables. |
| 201 | The test consists of searching for keys in tables with load factor 4 (ie 4 items/bucket on average). |
| 202 | Half of all lookups searched for keys not present in the table. Moreover, all items were inserted |
| 203 | into the tables before the benchmark started. |
| 204 | |
| 205 | In the figure above: |
| 206 | - //ht/spinlock// - represents the original non-resizible kernel hash table with 127 buckets |
| 207 | and guarded by a single global spinlock. |
| 208 | - //ht/bktlock// - same as ht/spinlock but protects each bucket with a separate spinlock. |
| 209 | - //cht/a-rcu// - CHT protected by A-RCU, uses 128 buckets. |
| 210 | |
| 211 | Note that CHT was set up with the resize triggering load factor high enough for CHT to never |
| 212 | resize. However, CHT was still tracking the number of items in the table and still checking |
| 213 | if it should resize -- quite unlike the spinlock protected tables that are not resizible |
| 214 | and therefore do not need to track the number of elements nor check if a resize is in order. |
| 215 | |
| 216 | What is more, CHT and HTs use different hash functions. While CHT mixes user supplied |
| 217 | hashes to produce a good hash, HTs divide user supplied hashes by a prime number and |
| 218 | use the remainder as the final hash. This influences both the distribution of items |
| 219 | in buckets as well as the time to actually compute a hash. Keys were selected to favor |
| 220 | traditional HTs which achieved an optimal item distribution (exactly 4 per each bucket). |
| 221 | On the other hand, in CHT buckets contained from 0 to 9 items. Furthermore, the hash |
| 222 | mixing function used by CHT in rev1589 appears to be slightly slower (~10% slower) than |
| 223 | dividing by a prime. |
| 224 | |
| 225 | As expected, a HT protected by a single spinlock scales negatively. On the other hand, |
| 226 | a HT with one spinlock/bucket scales fairly well but is significantly slower than CHT. |
| 227 | CHT performs slightly better than both HTs in the base case of running on a single cpu. |
| 228 | |