Opened 12 years ago

Closed 12 years ago

#465 closed defect (notadefect)

HelenOS does not boot in Ski

Reported by: Jakub Jermář Owned by: Jakub Jermář
Priority: major Milestone: 0.5.0
Component: helenos/boot/ia64 Version: mainline
Keywords: speculation, gcc, compiler_bug Cc:
Blocker for: Depends on:
See also:

Description

As of mainline,1539, HelenOS does not boot in the Ski simulator. Simulation stops at this point:

HelenOS bootloader, release 0.4.3 (Sashimi), revision 1539M (vojtechhorky@users.sourceforge.net-20120710130310-jg176a9lre52wzlb)
Built on 2012-07-14 21:34:34 for ia64
Copyright (c) 2001-2012 HelenOS project
 0

Change History (8)

comment:1 by Jakub Jermář, 12 years ago

Just verified that the problem does not occur with the same sources built using the gcc 4.6.3 toolchain.

comment:2 by Jakub Jermář, 12 years ago

I located the sequence of instructions that is responsible for this. In printf_core(), there is the following piece of code:

 4404550:       11 00 08 1d 80 11       [MIB]       st1 [r14]=r66
 4404556:       00 00 00 02 00 00                   nop.i 0x0
 440455c:       b8 11 00 50                         br.call.sptk.many b0=4405700 <ascii_check>;;
 4404560:       10 00 00 00 01 00       [MIB]       nop.m 0x0
 4404566:       80 00 20 20 00 00                   zxt1 r8=r8
 440456c:       00 00 00 20                         nop.b 0x0
 4404570:       09 70 00 42 38 10       [MMI]       ld8.s r14=[r33]
 4404576:       a0 04 90 70 20 20                   ld8.s r74=[r36]
 440457c:       19 00 00 90                         mov r73=1;;
 4404580:       11 38 00 10 06 39       [MIB]       cmp.eq p7,p6=0,r8
 4404586:       60 70 04 80 83 03                   mov b6=r14

At address 0x4404570, we read the contents of memory pointed to by r33 into r14. In this case, r33 is 0x440ff40. For some reason, this load fails and, since it is a speculative read, it defers the exception by setting r14's NaT bit. At address 0x4404586, r14 with the NaT bit set propagates to b6 and, because branch registers do not have a NaT bit, the processor is forced to report the NaT Consumption vector exception. This is the immediate reason why Ski halts the emulation here.

We need to find out why reading from 0x440ff40 generates the deferred exception. According to the link map, that address lies in .bss of image.boot.
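The failure mode described above can be illustrated with a toy model. This is only a sketch, not Ski's or the hardware's actual behaviour in full: the names Reg, ld8_s, mov_to_br and NaTConsumption are hypothetical, and an empty dictionary stands in for the assumption that 0x440ff40 is not readable at that point.

```python
class NaTConsumption(Exception):
    """Raised when a NaT-tagged value reaches a consumer that cannot hold NaT."""


class Reg:
    """General register: a value plus a NaT (Not a Thing) bit."""

    def __init__(self, value=0, nat=False):
        self.value = value
        self.nat = nat


def ld8_s(memory, addr):
    """Speculative load: a failing access is deferred by setting the NaT bit."""
    if addr in memory:
        return Reg(memory[addr])
    return Reg(0, nat=True)  # no exception is raised at this point


def mov_to_br(src):
    """Move to a branch register, which has no NaT bit, so NaT must fault now."""
    if src.nat:
        raise NaTConsumption("NaT-tagged register moved to a branch register")
    return src.value


memory = {}                     # assumption: 0x440ff40 is not readable
r14 = ld8_s(memory, 0x440ff40)  # models 4404570: ld8.s r14=[r33]
try:
    b6 = mov_to_br(r14)         # models 4404586: mov b6=r14
    outcome = "ok"
except NaTConsumption:
    outcome = "NaT consumption"
print(outcome)
```

In this model the load itself never fails. The fault only appears once the tagged value is moved into a register that cannot carry the tag.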

comment:3 by Jakub Jermář, 12 years ago

Interestingly enough, it appears that in the continuation of the above excerpt, the speculative read of r14 is checked and, if needed, reattempted in the speculation recovery path:

 440458c:       a0 fd ff 4b                   (p07) br.cond.dpnt.few 4404320 <printf_core+0x19e0>;;
 4404590:       c2 40 8a 19 00 21       [MII] (p06) adds r72=98,r12
 4404596:       00 00 00 02 00 40                   nop.i 0x0;;
 440459c:       a3 04 00 01                         chk.s.i r74,4404730 <printf_core+0x1df0>
 44045a0:       08 b8 38 00 40 04       [MMI]       chk.s.m r14,4404710 <printf_core+0x1dd0>
 44045a6:       00 00 00 02 00 00                   nop.m 0x0
 44045ac:       00 00 04 00                         nop.i 0x0
 44045b0:       11 00 00 00 01 00       [MIB]       nop.m 0x0
 44045b6:       00 00 00 02 00 00                   nop.i 0x0
 44045bc:       68 00 80 10                         br.call.sptk.many b0=b6;;

The speculation recovery is here:

 4404710:       09 00 00 00 01 00       [MMI]       nop.m 0x0
 4404716:       e0 00 84 30 20 00                   ld8 r14=[r33]
 440471c:       00 00 04 00                         nop.i 0x0;;
 4404720:       10 00 00 00 01 00       [MIB]       nop.m 0x0
 4404726:       60 70 04 80 03 00                   mov b6=r14
 440472c:       90 fe ff 48                         br.few 44045b0 <printf_core+0x1c70>

Could it be that the assignment:

 4404586:       60 70 04 80 83 03                   mov b6=r14

before the speculation check at 0x44045a0 is a compiler bug?
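The ordering question can be phrased as a small check over an instruction list. The sketch below is a deliberate simplification and does not settle whether the GCC output is wrong: the tuple encoding, the mov_to_br opcode name and the FAULTING_CONSUMERS set are assumptions made for illustration, and only moves to branch registers are treated as unable to tolerate NaT.

```python
# Opcodes assumed (for this sketch) to fault on a NaT-tagged source.
FAULTING_CONSUMERS = {"mov_to_br"}


def unchecked_uses(instructions):
    """Return indices of faulting consumers that read a possibly NaT-tagged
    register before a chk.s has validated it.

    Each instruction is a tuple (opcode, dest, sources).
    """
    pending = set()  # registers that may still carry a NaT bit
    flagged = []
    for index, (opcode, dest, sources) in enumerate(instructions):
        tainted = any(src in pending for src in sources)
        if opcode == "chk.s":
            pending.difference_update(sources)  # validated from here on
        elif opcode == "ld8.s":
            pending.add(dest)
        elif opcode in FAULTING_CONSUMERS:
            if tainted:
                flagged.append(index)
        elif dest is not None:
            # Other instructions simply propagate or clear the tag.
            if tainted:
                pending.add(dest)
            else:
                pending.discard(dest)
    return flagged


# Order as seen in the disassembly: mov b6=r14 precedes chk.s r14.
as_emitted = [
    ("ld8.s", "r14", ["r33"]),
    ("ld8.s", "r74", ["r36"]),
    ("mov_to_br", "b6", ["r14"]),
    ("chk.s", None, ["r74"]),
    ("chk.s", None, ["r14"]),
    ("br.call", "b0", ["b6"]),
]

# Hypothetical alternative order: the move happens after the check.
checked_first = [
    ("ld8.s", "r14", ["r33"]),
    ("ld8.s", "r74", ["r36"]),
    ("chk.s", None, ["r74"]),
    ("chk.s", None, ["r14"]),
    ("mov_to_br", "b6", ["r14"]),
    ("br.call", "b0", ["b6"]),
]

print(unchecked_uses(as_emitted))
print(unchecked_uses(checked_first))
```

With the emitted order the check flags the move at index 2. With the move placed after chk.s r14, nothing is flagged.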


comment:4 by Jakub Jermář, 12 years ago

I have filed GCC bug 53975; let's see what they think of it.

comment:5 by Jakub Jermář, 12 years ago

Keywords: speculation gcc compiler_bug added
Milestone: 0.5.0 → 0.5.1

I am rescheduling this for the next release as this is probably a compiler bug. For 0.5.0, HelenOS/ia64 will have to be built using the older version of the toolchain.

comment:6 by Martin Decky, 12 years ago

Milestone: 0.5.1 → 0.5.0
Resolution: fixed
Status: new → closed

A temporary workaround was implemented in changeset 1556.

comment:7 by Jakub Jermář, 12 years ago

Resolution: fixed
Status: closed → reopened

comment:8 by Jakub Jermář, 12 years ago

Resolution: notadefect
Status: reopened → closed