| | 1 | = Scalable Concurrent Hash Table = |
| | 2 | * GSoC 2012, ticket #398 |
| | 3 | * mentor: Jakub Jermar |
| | 4 | * student: Adam Hraska |
| | 5 | * repository: lp:~adam-hraska+lp/helenos/rcu/ |
| | 6 | * benchmarking repository: lp:~adam-hraska+lp/helenos/cht-bench/ |
| | 7 | |
| | 8 | The goal of the project is to create a scalable concurrent resizable hash table (CHT): |
| | 9 | * scalable = performance improves with added cpus |
| | 10 | * concurrent = the table allows parallel access from multiple threads/cpus preferably without blocking |
| | 11 | * resizable = the table grows and shrinks automatically as item are added or removed from it |
| | 12 | Moreover, the table must be suitable to replace the current global page hash table in the kernel. |
| | 13 | Optionally, it may be ported to user space as well. |
| | 14 | |
| | 15 | == Current Status == |
| | 16 | {{{#!comment |
| | 17 | |
| | 18 | supported platforms, what needs to be implemented in others (atomics, ipis vs config_smp) |
| | 19 | NMI safe or not |
| | 20 | todos |
| | 21 | }}} |
| | 22 | |
| | 23 | Todos: |
| | 24 | * |
| | 25 | |
| | 26 | == Implementation == |
| | 27 | === CHT Algorithm === |
| | 28 | === RCU Algorithms === |
| | 29 | |
| | 30 | == Testing == |
| | 31 | The CHT replies on working atomic operations, SMP calls (ie IPIs), RCU and the system work queue. |
| | 32 | In order to test these out, pull and build the source code (eg on x86-64 machines): |
| | 33 | {{{ |
| | 34 | bzr checkout lp:~adam-hraska+lp/helenos/rcu/ |
| | 35 | make PROFILE=amd64 |
| | 36 | }}} |
| | 37 | Boot HelenOS with eg 4 CPUs in QEMU: |
| | 38 | {{{ |
| | 39 | qemu-system-x86_64 -cdrom image.iso -smp4 |
| | 40 | }}} |
| | 41 | Once fully operational, switch to the kernel console by pressing F12 |
| | 42 | (or typing kcon in the shell). You can then finally run the following |
| | 43 | relevant tests: |
| | 44 | {{{ |
| | 45 | test atomic1 |
| | 46 | test smpcall1 |
| | 47 | test workqueue |
| | 48 | test rcu1 |
| | 49 | test cht |
| | 50 | }}} |
| | 51 | Neither of the tests should panic, crash, leak memory, loop infinitely, or report errors. |
| | 52 | The only test //workqueue// reports an error which is however expected and marked appropriately |
| | 53 | with line //Stress testing a custom queue. Stops prematurely. Errors are expected.// |
| | 54 | |
| | 55 | You may be interested in runtime statistics gathered by RCU or the system work queue. |
| | 56 | Run these commands in the kernel console: |
| | 57 | {{{ |
| | 58 | rcu |
| | 59 | workq |
| | 60 | }}} |
| | 61 | |
| | 62 | == Benchmarks == |
| | 63 | In order to benchmark the CHT or RCU implementation, pull sources of the benchmarking |
| | 64 | branch and build the code in a release build (ie clear the debug build option): |
| | 65 | {{{ |
| | 66 | bzr checkout lp:~adam-hraska+lp/helenos/cht-bench/ |
| | 67 | make PROFILE=amd64 |
| | 68 | make config # this is where you have to clear the debug build option |
| | 69 | make |
| | 70 | }}} |
| | 71 | Boot HelenOS in QEMU with eq 4 CPUS and list the available benchmarks in the kernel console with the |
| | 72 | //chtbench// command: |
| | 73 | {{{ |
| | 74 | qemu-system-x86_64 -cdrom image.iso -smp 4 |
| | 75 | |
| | 76 | / # kcon |
| | 77 | kconsole> chtbench 0 0 0 |
| | 78 | }}} |
| | 79 | |
| | 80 | === Results === |
| | 81 | The following results were obtained by benchmarking [http://bazaar.launchpad.net/~adam-hraska+lp/helenos/cht-bench/revision/1589 r1589] |
| | 82 | on Core(TM) i7 CPU 920 @ 2.67GHz (4 cores, each with 2 !HyperThreads), running |
| | 83 | a release build (with networking): |
| | 84 | {{{contrib/conf/net-qe.sh e1k -smp 4 }}} |
| | 85 | Each benchmark run consists of repeating a certain operation in a loop for 10 seconds |
| | 86 | on the specified number of cpus (ie individual threads wired to these cpus). |
| | 87 | Benchmarks were repeated as many times as needed (between 10 and 20 consecutive |
| | 88 | runs) in order to obtain a confidence interval of the number of completed operations |
| | 89 | within about 1% of the average number of completed operations of the runs. The following |
| | 90 | figures include error bars depicting a single standard deviation of the completed number |
| | 91 | of operations in a single run (ie it is //not// the SD of the mean, aka standard error). |
| | 92 | However the estimated SD is typically below 1-2% of the mean (ie tiny) and therefore not |
| | 93 | visible in the figures without zooming ;-). |