Version 5 (modified by Martin Decky, 5 months ago)

Using QEMU and GDB to debug kernel and uspace tasks

Some of the debugging techniques and procedures described in this article are illustrated on an example involving the ia32 architecture. Other architectures should behave in a similar or analogous way, but dealing with potential differences is left to the reader as an exercise.

The combination of QEMU and GDB allows HelenOS to be comfortably debugged either on the assembly or the source code level.

Preparing the build

The default HelenOS build should produce unstripped binaries. If necessary, this can be enforced by making sure the Strip binaries build configuration option is not checked. Unstripped binaries come with symbols, but they do not contain any richer debugging information. To get the most out of a debug build, configure HelenOS with the Line debugging information option. Compiler optimization can also impede debugging, so consider setting the Optimization level configuration option to 0. Once everything is set, rebuild with the new settings.

Starting QEMU

QEMU provides two command line options for debugging with GDB: -s and -S. The former instructs QEMU to listen for GDB connections on localhost:1234 (without waiting for one to arrive), and the latter stops the guest CPU at startup so that debugging is possible from the very beginning. When starting emulation using the tools/ew.py script, one can add these options like this:

$ `tools/ew.py -d` -s -S
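The same idea can be sketched in Python for cases where the QEMU command line is assembled by a script. This is only an illustration: it assumes, as the backtick invocation above suggests, that tools/ew.py -d prints a QEMU command line instead of running it. The function name add_gdb_flags and the sample command line are hypothetical.

```python
import shlex
from typing import List


def add_gdb_flags(cmdline: str, halt_at_startup: bool = True) -> List[str]:
    """Append QEMU's GDB options to an existing QEMU command line.

    -s makes QEMU listen for GDB on port 1234,
    -S keeps the guest CPU stopped at startup.
    """
    args = shlex.split(cmdline)
    if "-s" not in args:
        args.append("-s")
    if halt_at_startup and "-S" not in args:
        args.append("-S")
    return args


if __name__ == "__main__":
    # Hypothetical command line; the real one comes from tools/ew.py.
    print(shlex.join(add_gdb_flags("qemu-system-i386 -m 256")))
```

Leaving out -S (halt_at_startup=False) lets the guest boot immediately, in which case GDB has to interrupt it later.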

Connecting GDB to QEMU

Once QEMU is started with the -s (and optionally also the -S) option, it is possible to connect GDB to it. For our purposes, we will assume that the respective cross-GDB built by tools/toolchain.sh is ready to be used and installed as /usr/local/cross/bin/i686-helenos-gdb. Needless to say, the cross-GDB should always match the architecture of the HelenOS guest.

$ /usr/local/cross/bin/i686-helenos-gdb
GNU gdb (GDB) 7.11
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-pc-linux-gnu --target=i686-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb) 

Connect to QEMU using the following command at the (gdb) prompt:

(gdb) target remote :1234
Remote debugging using :1234
0x0000fff0 in ?? ()

Note that if QEMU was not started with the -S option, you will have to manually break into the debugger by pressing Ctrl+C.

Loading symbols

Depending on the subject of our debugging session, we will need different symbols. The easiest case is kernel debugging. In that case, we simply load the kernel symbols from kernel.elf:

(gdb) symbol-file dist/boot/kernel.elf 
Reading symbols from dist/boot/kernel.elf...done.
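The connect and symbol-loading steps can also be passed to GDB up front through its -ex command line option, which runs a command as if it were typed at the prompt. The following Python sketch only builds such an argument list; the helper name gdb_argv is hypothetical, and the paths are the ones assumed in this article.

```python
from typing import List, Optional


def gdb_argv(gdb_path: str, port: int = 1234,
             symbol_file: Optional[str] = None) -> List[str]:
    """Build a GDB invocation that connects to QEMU and loads symbols."""
    # Equivalent to typing "target remote :<port>" at the (gdb) prompt.
    argv = [gdb_path, "-ex", "target remote :%d" % port]
    if symbol_file is not None:
        # Equivalent to typing "symbol-file <file>" at the (gdb) prompt.
        argv += ["-ex", "symbol-file %s" % symbol_file]
    return argv


if __name__ == "__main__":
    print(gdb_argv("/usr/local/cross/bin/i686-helenos-gdb",
                   symbol_file="dist/boot/kernel.elf"))
```

The resulting list could be handed to subprocess.call() from the HelenOS source root, so that the relative path to kernel.elf resolves.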

Debugging user space tasks is a little more complicated because unlike the kernel, there will be multiple user processes running at the same time, so setting a mere breakpoint on a user address will probably not do. We will need to get some assistance from the kernel.

Setting and hitting breakpoints

To set a breakpoint, for example on the kernel function that handles invalid memory accesses from user space, we type the following command:

(gdb) break fault_from_uspace_core
Breakpoint 1 at 0x801237ff: file generic/src/interrupt/interrupt.c, line 169.

Note that the default breakpoint mechanism modifies the code running in QEMU by inserting a breakpoint instruction (INT3 on ia32). Thus, if you set up a breakpoint using the break command before the kernel boots or a user space task loads, the breakpoint instruction might get overwritten and the breakpoint rendered ineffective. To avoid this situation, you can use hardware-assisted breakpoints via the hbreak command. Unfortunately, the number of hardware-assisted breakpoints is limited.

To continue the emulation, tell GDB to continue:

(gdb) c
Continuing.

When our breakpoint is later hit, for example as a result of executing the following command inside HelenOS:

# tester fault1

we will break into the debugger prompt again:

Breakpoint 1, fault_from_uspace_core (istate=0x86d47fb4, fmt=fmt@entry=0x8016f3ce "Page fault: %p.", args=args@entry=0x86d47ec4 "\004") at generic/src/interrupt/interrupt.c:169
169	{
(gdb)

Note that you can also break into the debugger at any time while the guest is running by pressing Ctrl+C in GDB.

Now that a breakpoint in the kernel was hit, we can inspect the state of the kernel a little bit. Typing bt will show us the stack trace of the current kernel thread:

(gdb) bt
#0  fault_from_uspace_core (istate=0x86d47fb4, fmt=fmt@entry=0x8016f3ce "Page fault: %p.", args=args@entry=0x86d47ec4 "\004") at generic/src/interrupt/interrupt.c:169
#1  0x80123e2a in fault_if_from_uspace (istate=istate@entry=0x86d47fb4, fmt=fmt@entry=0x8016f3ce "Page fault: %p.") at generic/src/interrupt/interrupt.c:206
#2  0x80130bc9 in as_page_fault (address=4, access=PF_ACCESS_WRITE, istate=istate@entry=0x86d47fb4) at generic/src/mm/as.c:1497
#3  0x8010f978 in page_fault (n=14, istate=0x86d47fb4) at arch/ia32/src/mm/page.c:99
#4  0x80123bda in exc_dispatch (n=14, istate=0x86d47fb4) at generic/src/interrupt/interrupt.c:131
#5  0x8010ab62 in int_14 () at arch/ia32/src/asm.S:437
#6  0x0000000e in ?? ()

To see the information about the interrupted context, in this case the user context as it existed when the page fault exception occurred, we can print the istate structure:

(gdb) set radix 16
Input and output radices now set to decimal 16, hex 10, octal 20.
(gdb) p *istate
$2 = {edx = 0x80, ecx = 0x7, ebx = 0x7013cee0, esi = 0x70037f98, edi = 0x7001b784, ebp = 0x7013ce78, eax = 0x4, ebp_frame = 0x0, eip_frame = 0x3da7, gs = 0x30, fs = 0x23, es = 0x23, ds = 0x23, 
  error_word = 0x6, eip = 0x3da7, cs = 0x1b, eflags = 0x10202, esp = 0x7013ce78, ss = 0x23}

The printed eip member is the program counter of the instruction which caused the exception. We will remember this value along with the value of esp and ebp for later.
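Two other members of the structure are worth a quick look. On ia32, the low three bits of the page fault error code and the low two bits of a segment selector have architecturally defined meanings, so error_word and cs can be decoded by hand. The following Python sketch shows the arithmetic; the function names are hypothetical and not part of HelenOS or GDB.

```python
def decode_page_fault_error(error_word: int) -> dict:
    """Decode the low three bits of an ia32 page fault error code."""
    return {
        # Bit 0: 0 = page not present, 1 = protection violation.
        "protection_violation": bool(error_word & 0x1),
        # Bit 1: 0 = read access, 1 = write access.
        "write_access": bool(error_word & 0x2),
        # Bit 2: 0 = fault in supervisor mode, 1 = fault in user mode.
        "user_mode": bool(error_word & 0x4),
    }


def selector_rpl(selector: int) -> int:
    """Return the requested privilege level (low two bits) of a selector."""
    return selector & 0x3


if __name__ == "__main__":
    print(decode_page_fault_error(0x6))  # error_word from the istate above
    print(selector_rpl(0x1b))            # cs from the istate above
```

For the values printed above, error_word = 0x6 decodes to a user mode write to a page that is not present, and cs = 0x1b carries privilege level 3. Both agree with the access=PF_ACCESS_WRITE argument and the fault_from_uspace_core frame in the stack trace.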

Useful macros

When debugging the kernel, it is sometimes useful to find out some information about the current task. For that, we will need to mimic the computation of the address of the CURRENT structure. We will start by defining a macro:

(gdb) macro define CURRENT ((current_t *) ((uintptr_t) $esp & ~0x1fff))

From now on, we can do things like:

(gdb) x/s CURRENT->task->name
0x879748c8:	"/app/tester"

Note that this will work only when $esp corresponds to the kernel stack.
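The arithmetic behind the macro is simple: the mask 0x1fff clears the low 13 bits of the stack pointer, which rounds it down to an 8 KiB boundary. The Python sketch below reproduces that computation. The interpretation that CURRENT sits at the base of an 8 KiB aligned kernel stack is inferred from the macro alone, and the istate address from the earlier trace is used merely as a stand-in for a kernel stack pointer.

```python
STACK_SIZE = 0x2000  # 8 KiB, implied by the 0x1fff mask in the macro


def current_base(esp: int, stack_size: int = STACK_SIZE) -> int:
    """Round a 32-bit stack pointer down to the base of its stack."""
    return esp & ~(stack_size - 1) & 0xFFFFFFFF


if __name__ == "__main__":
    # istate lives on the kernel stack, so it yields a kernel address.
    print(hex(current_base(0x86d47fb4)))
    # A user space esp yields an address that does not hold CURRENT.
    print(hex(current_base(0x7013ce78)))
```

The second call illustrates why the macro is only meaningful while $esp points into the kernel stack: with a user space stack pointer the result is an unrelated user address.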

Switching to the user context

Let us assume that we know the values of the user space registers EIP, ESP and EBP (for example by inspecting the istate structure as shown above). We can load the user space symbols:

(gdb) add-symbol-file dist/app/tester
add symbol table from file "dist/app/tester" 
(y or n) y
Reading symbols from dist/app/tester...done.

The last step before we can print out our userspace stack trace is restoring registers to their user space contents (as for example captured in the istate structure):

(gdb) set $eip=0x3da7
(gdb) set $esp=0x7013ce78
(gdb) set $ebp=0x7013ce78
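Copying the three values by hand is error-prone, so the step can be scripted. The following Python sketch parses text in the format printed by p *istate (with the radix set to 16) and emits the matching set commands. The helper names are hypothetical, and the parser only understands flat name = 0x... pairs, which is all the istate output above contains.

```python
import re

# The istate dump shown earlier in this article.
SAMPLE = """$2 = {edx = 0x80, ecx = 0x7, ebx = 0x7013cee0, esi = 0x70037f98,
  edi = 0x7001b784, ebp = 0x7013ce78, eax = 0x4, ebp_frame = 0x0,
  eip_frame = 0x3da7, gs = 0x30, fs = 0x23, es = 0x23, ds = 0x23,
  error_word = 0x6, eip = 0x3da7, cs = 0x1b, eflags = 0x10202,
  esp = 0x7013ce78, ss = 0x23}"""


def parse_gdb_struct(text: str) -> dict:
    """Collect 'name = 0x...' pairs from a GDB struct dump into a dict."""
    pairs = re.findall(r"(\w+) = (0x[0-9a-fA-F]+)", text)
    return {name: int(value, 16) for name, value in pairs}


def restore_commands(regs: dict, names=("eip", "esp", "ebp")) -> list:
    """Produce GDB commands that load the saved user space registers."""
    return ["set $%s=%#x" % (name, regs[name]) for name in names]


if __name__ == "__main__":
    for command in restore_commands(parse_gdb_struct(SAMPLE)):
        print("(gdb) " + command)
```

Run on the dump from this article, the script prints the same three set commands as shown above.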

We are now ready to do some user space debugging:

(gdb) bt
#0  0x00003da7 in test_fault1 () at fault/fault1.c:34
#1  0x000010f6 in run_test (test=0x31770 <tests+208>) at tester.c:84
#2  0x00001374 in main (argc=0x43060, argv=0x0) at tester.c:161
#3  0x000134e3 in __main (pcb_ptr=0x7001b784) at generic/libc.c:121
#4  0x000010e2 in __entry () at arch/ia32/src/entry.S:69
#5  0x7001b784 in ?? ()
(gdb) disassemble
Dump of assembler code for function test_fault1:
   0x00003d9f <+0>:	push   %ebp
   0x00003da0 <+1>:	mov    %esp,%ebp
   0x00003da2 <+3>:	mov    $0x4,%eax
=> 0x00003da7 <+8>:	movl   $0x0,(%eax)
   0x00003dad <+14>:	mov    $0x2d973,%eax
   0x00003db2 <+19>:	pop    %ebp
   0x00003db3 <+20>:	ret    
End of assembler dump.

The two snippets above confirm that the page fault exception was indeed deliberately triggered by the tester application when running the fault1 test: the faulting instruction writes to address 0x4.

Interactive user space debugging

Using GDB outside QEMU to interactively debug a user space task in QEMU is tricky, since GDB does not understand the notion of tasks and threads within the virtual machine. Nevertheless, we can achieve it at least to a certain degree. First we need to make sure that we are running in the proper user space task context, for example by setting a conditional kernel breakpoint like this:

(gdb) hbreak as_switch:1690 if $_streq(CURRENT->task->name, "/app/tester")
Hardware assisted breakpoint 1 at 0x801199b3: file ../../HelenOS/kernel/generic/src/mm/as.c, line 1623.

Note that the breakpoint position should be the last statement of the as_switch() function. When the breakpoint hits, we can safely assume that we are running in the address space of /app/tester, and we can set up additional breakpoints in user space. However, while single-stepping the user space task, we need to be prepared for preemption to kernel space (and eventually to a different task) at any time due to exception and interrupt handling.