= Using QEMU and GDB to debug kernel and uspace tasks =

[[PageOutline(2-3)]]

{{{#!box type=note
Some of the debugging techniques and procedures described in this article are illustrated on an example involving the [wiki:Arch/Ia32 ia32] architecture. Other architectures should behave in a similar or analogous way, but dealing with potential differences is left to the reader as an exercise.
}}}

The combination of QEMU and GDB allows HelenOS to be comfortably debugged either on the assembly or the source code level. For detailed information on low-level debugging, see for example this [http://d3s.mff.cuni.cz/teaching/crash_dump_analysis/ course] on crash dump analysis.

== Preparing the build ==

The default HelenOS build should produce unstripped binaries. If necessary, this can be enforced by making sure the `Strip binaries` build configuration option is not checked. Unstripped binaries come with symbols, but do not contain any fancier debugging information. To get the most out of the debug build, make sure to configure HelenOS with the `Line debugging information` option. Another thing which may impede debugging is optimization, so consider changing the optimization level to 0 using the `OPTIMIZATION` variable in the respective Makefiles ([browser:mainline/kernel/Makefile] or [browser:mainline/uspace/Makefile.common]). When everything is set, make sure to rebuild with the new settings.

== Starting QEMU ==

QEMU provides two command line options for debugging with GDB: `-s` and `-S`. The former instructs QEMU to listen for GDB connections on localhost:1234 (but does not wait for one) and the latter stops the guest CPU at startup so that debugging is possible from the very beginning. When starting emulation using the [browser:mainline/tools/ew.py] script, one can add these options like this:

{{{
$ `tools/ew.py -d` -s -S
}}}

== Connecting GDB to QEMU ==

Once QEMU is started with the `-s` (and optionally also the `-S`) option, it is possible to connect GDB to it.
For our purposes, we will assume the respective cross-GDB built by the [browser:mainline/tools/toolchain.sh] is ready to be used and named `/usr/local/cross/ia32/bin/i686-pc-linux-gnu-gdb`. Needless to say, the cross-GDB should always match the architecture of the HelenOS guest.

{{{
$ /usr/local/cross/ia32/bin/i686-pc-linux-gnu-gdb
GNU gdb (GDB) 7.11
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "--host=x86_64-pc-linux-gnu --target=i686-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see: .
Find the GDB manual and other documentation resources online at: .
For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb)
}}}

Connect to QEMU using the following command at the `(gdb)` prompt:

{{{
(gdb) target remote :1234
Remote debugging using :1234
0x0000fff0 in ?? ()
}}}

Note that if QEMU was not started with the `-S` option, you will have to manually break into the debugger by pressing `Ctrl-C`.

== Loading symbols ==

Depending on the subject of our debugging session, we will need some symbols. The easiest case is kernel debugging. In that case, we simply load the kernel symbols from `kernel.raw`:

{{{
(gdb) symbol-file kernel/kernel.raw
Reading symbols from kernel/kernel.raw...done.
}}}

Debugging userspace tasks is a little more complicated because unlike the kernel, there will be multiple user processes running at the same time, so setting a mere breakpoint on a user address will probably not do. We will need to get some assistance from the kernel.
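
The connection and kernel symbol loading steps described above can also be combined into a single GDB invocation. The following is only a minimal sketch: the function name `attach_kernel` and the `GDB` and `KERNEL` variables are made up for this illustration, and it assumes that it is run from the top of the HelenOS source tree with QEMU already listening on port 1234. It relies on GDB's `-ex` command line option, which executes the given GDB command after startup.

```shell
#!/bin/sh
# Hypothetical helper: attach the cross-GDB to a QEMU instance started
# with -s and load the kernel symbols in one step.
# GDB and KERNEL default to the paths used in this article.
GDB="${GDB:-/usr/local/cross/ia32/bin/i686-pc-linux-gnu-gdb}"
KERNEL="${KERNEL:-kernel/kernel.raw}"

attach_kernel() {
    # Each -ex command is run in the order given, after which GDB
    # drops to the interactive (gdb) prompt.
    "$GDB" -ex "target remote :1234" -ex "symbol-file $KERNEL"
}
```

After sourcing the snippet, calling `attach_kernel` should leave you at the `(gdb)` prompt in the same state as after typing the two commands by hand.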
== Setting and hitting breakpoints ==

To set a breakpoint, for example on the kernel function that handles invalid memory accesses from userspace, we type the following command:

{{{
(gdb) break fault_from_uspace_core
Breakpoint 1 at 0x801237ff: file generic/src/interrupt/interrupt.c, line 169.
}}}

{{{#!box type=warning
Sadly the set breakpoint is not always hit. At this point, the cause of this behavior is unknown.
}}}

To continue the emulation, tell GDB to continue:

{{{
(gdb) c
Continuing.
}}}

When our breakpoint is later hit, for example as a result of executing inside HelenOS:

{{{
# tester fault1
}}}

we will break into the debugger prompt again:

{{{
Breakpoint 1, fault_from_uspace_core (istate=0x86d47fb4, fmt=fmt@entry=0x8016f3ce "Page fault: %p.", args=args@entry=0x86d47ec4 "\004") at generic/src/interrupt/interrupt.c:169
169 {
(gdb)
}}}

Note that at any time while the guest is running, you can also break into the debugger by pressing `Ctrl-C`.

Now that a breakpoint in the kernel was hit, we can inspect the state of the kernel a little bit. Typing `bt` will show us the stack trace of the current kernel thread:

{{{
(gdb) bt
#0 fault_from_uspace_core (istate=0x86d47fb4, fmt=fmt@entry=0x8016f3ce "Page fault: %p.", args=args@entry=0x86d47ec4 "\004") at generic/src/interrupt/interrupt.c:169
#1 0x80123e2a in fault_if_from_uspace (istate=istate@entry=0x86d47fb4, fmt=fmt@entry=0x8016f3ce "Page fault: %p.") at generic/src/interrupt/interrupt.c:206
#2 0x80130bc9 in as_page_fault (address=4, access=PF_ACCESS_WRITE, istate=istate@entry=0x86d47fb4) at generic/src/mm/as.c:1497
#3 0x8010f978 in page_fault (n=14, istate=0x86d47fb4) at arch/ia32/src/mm/page.c:99
#4 0x80123bda in exc_dispatch (n=14, istate=0x86d47fb4) at generic/src/interrupt/interrupt.c:131
#5 0x8010ab62 in int_14 () at arch/ia32/src/asm.S:437
#6 0x0000000e in ??
()
}}}

To see the information about the interrupted context, in this case the user context as it existed when the page fault exception occurred, we can print the `istate` structure:

{{{
(gdb) set radix 16
Input and output radices now set to decimal 16, hex 10, octal 20.
(gdb) p *istate
$2 = {edx = 0x80, ecx = 0x7, ebx = 0x7013cee0, esi = 0x70037f98, edi = 0x7001b784, ebp = 0x7013ce78, eax = 0x4, ebp_frame = 0x0, eip_frame = 0x3da7, gs = 0x30, fs = 0x23, es = 0x23, ds = 0x23, error_word = 0x6, eip = 0x3da7, cs = 0x1b, eflags = 0x10202, esp = 0x7013ce78, ss = 0x23}
}}}

The printed `eip` member is the program counter of the instruction which caused the exception. We will remember this value along with the value of `esp` and `ebp` for later.

== Useful macros ==

When debugging the kernel, it is sometimes useful to find out some information about the current process. For that, we will need to mimic the computation of the address of the `THE` structure. We will start by defining a macro:

{{{
(gdb) macro define the ((the_t *) ((uintptr_t )$esp & ~0x1fff))
}}}

From this time on, we can do things like:

{{{
(gdb) p the->task->name
$3 = "tester\000it\000\000\000\000\000\000\000\000\000\000"
}}}

Note that this will work only when `$esp` corresponds to the kernel stack.

== Switching to the user context ==

Let us assume that we have somehow found the values of the userspace registers EIP, ESP and EBP (for example by inspecting the `istate` structure as shown above) and know the name of the process (for example `tester` from the example above).
Before we can add symbol information for this process, we need to find out the load address of its `.text` section (for some reason, GDB does not use the information provided in the ELF file):

{{{
$ objdump -h uspace/app/tester/tester

uspace/app/tester/tester:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .init         0000002e  000010b4  000010b4  000000b4  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .text         0002f5a0  000010e8  000010e8  000000e8  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .data         00001b08  000316a0  000316a0  0002f6a0  2**5
                  CONTENTS, ALLOC, LOAD, DATA
  3 .tbss         00000048  000331a8  000331a8  000311a8  2**2
                  ALLOC, THREAD_LOCAL
  4 .bss          00000184  000331c0  000331c0  000311a8  2**5
                  ALLOC
  5 .comment      00000011  00000000  00000000  000311a8  2**0
                  CONTENTS, READONLY
  6 .debug_abbrev 0000878f  00000000  00000000  000311b9  2**0
                  CONTENTS, READONLY, DEBUGGING
  7 .debug_aranges 00000db8  00000000  00000000  00039948  2**3
                  CONTENTS, READONLY, DEBUGGING
  8 .debug_info   000293a5  00000000  00000000  0003a700  2**0
                  CONTENTS, READONLY, DEBUGGING
  9 .debug_line   0000cfc7  00000000  00000000  00063aa5  2**0
                  CONTENTS, READONLY, DEBUGGING
 10 .debug_ranges 00000358  00000000  00000000  00070a6c  2**0
                  CONTENTS, READONLY, DEBUGGING
 11 .debug_str    000078fc  00000000  00000000  00070dc4  2**0
                  CONTENTS, READONLY, DEBUGGING
}}}

So the `.text` section gets loaded at 0x10e8 in this case. We will use this address to load our userspace symbols:

{{{
(gdb) add-symbol-file uspace/app/tester/tester 0x000010e8
add symbol table from file "uspace/app/tester/tester" at
    .text_addr = 0x10e8
(y or n) y
Reading symbols from uspace/app/tester/tester...done.
}}}

The last step before we can print out our userspace stack trace is restoring registers to their userspace contents (as for example captured in the `istate` structure):

{{{
(gdb) set $eip=0x3da7
(gdb) set $esp=0x7013ce78
(gdb) set $ebp=0x7013ce78
}}}

We are now ready to do some userspace debugging:

{{{
(gdb) bt
#0 0x00003da7 in test_fault1 () at fault/fault1.c:34
#1 0x000010f6 in run_test (test=0x31770 ) at tester.c:84
#2 0x00001374 in main (argc=0x43060, argv=0x0) at tester.c:161
#3 0x000134e3 in __main (pcb_ptr=0x7001b784) at generic/libc.c:121
#4 0x000010e2 in __entry () at arch/ia32/src/entry.S:69
#5 0x7001b784 in ?? ()
}}}

{{{
(gdb) disassemble
Dump of assembler code for function test_fault1:
   0x00003d9f <+0>:     push   %ebp
   0x00003da0 <+1>:     mov    %esp,%ebp
   0x00003da2 <+3>:     mov    $0x4,%eax
=> 0x00003da7 <+8>:     movl   $0x0,(%eax)
   0x00003dad <+14>:    mov    $0x2d973,%eax
   0x00003db2 <+19>:    pop    %ebp
   0x00003db3 <+20>:    ret
End of assembler dump.
}}}

These two snippets above confirm that the page fault exception was indeed deliberately injected by the `tester` application when running test `fault1`.
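
Looking up the `.text` load address by hand, as done earlier in this section, becomes tedious when switching between several tasks. The lookup can be scripted; the following is a rough sketch only, with the helper names `text_vma` and `add_symbol_cmd` made up for this illustration. It assumes the `objdump -h` output layout shown above, in which the section name is the second field and the VMA the fourth.

```shell
#!/bin/sh
# Hypothetical helper: read `objdump -h` output on standard input and
# print the VMA of the .text section with a 0x prefix.
text_vma() {
    # Section rows look like:
    #   1 .text  0002f5a0  000010e8  000010e8  000000e8  2**3
    # The flag rows (CONTENTS, ALLOC, ...) never have ".text" as field 2.
    awk '$2 == ".text" { print "0x" $4; exit }'
}

# Hypothetical helper: print the add-symbol-file command for the
# binary given as the first argument.
add_symbol_cmd() {
    printf 'add-symbol-file %s %s\n' "$1" "$(objdump -h "$1" | text_vma)"
}
```

For the `tester` binary from the example, `add_symbol_cmd uspace/app/tester/tester` should print the same `add-symbol-file` command that was typed manually above, ready to be pasted at the `(gdb)` prompt.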