Opened 14 years ago

Closed 14 years ago

#211 closed defect (worksforme)

Time sometimes appears to stop on mips32

Reported by: Jakub Jermář
Owned by:
Priority: critical
Milestone: 0.4.3
Component: helenos/kernel/mips32
Version: mainline
Keywords:
Cc:
Blocker for:
Depends on:
See also:

Description (last modified by Jakub Jermář)

Time sometimes appears to stop on mips32. This has several symptoms such as:

  • in tetris, the blocks stop falling, even though one can still move them sideways
  • cursor stops blinking
  • it is not possible to switch to another vc when running tester loop 1

After some time, the system wakes up and continues to operate normally, but the problem may return.

This can be observed with the testmips machine in GXemul (0.4.7.2).

Change History (4)

comment:1 by Jakub Jermář, 14 years ago

Description: modified (diff)

comment:2 by Jakub Jermář, 14 years ago

This happens because, for some reason, the count register outgrows the compare register, as evidenced by the following output of GXemul commands issued during the non-preemptive period:

GXemul> reg ,c
cpu0:    index =         0x0000001f   random =         0x00000015
cpu0: entrylo0 = 0x0000000000028a2e entrylo1 = 0x0000000000000000
cpu0:  context = 0x0000000000000090 pagemask = 0x0000000000006000
cpu0:    wired =         0x00000001  reserv7 = 0x0000000000000000
cpu0: badvaddr = 0x000000000004bf74    count =         0x488340d7
cpu0:  entryhi = 0x0000000000048007  compare =         0x24887a5a
cpu0:   status = 0x0000000030008411    cause = 0x0000000000000028
cpu0:      epc = 0x0000000000011d88     prid = 0x0000000000000400
cpu0:   config = 0x0000000000804240   lladdr = 0x0000000000002812
cpu0:  watchlo = 0x0000000000000000  watchhi = 0x0000000000000000
cpu0: xcontext = 0x0000000000000090 reserv21 = 0x0000000000000000
cpu0: reserv22 = 0x0000000000000000    debug = 0x0000000000000000
cpu0:     depc = 0x0000000000000000  perfcnt = 0x0000000000000000
cpu0:   errctl = 0x0000000000000000 cacheerr = 0x0000000000000000
cpu0: tagdatlo = 0x0000000000000000 tagdathi = 0x0000000000000000
cpu0: errorepc = 0x0000000000000000   desave = 0x0000000000000000

and a while later, while the system was still not preemptive:

GXemul> reg ,c
cpu0:    index =         0x00000023   random =         0x0000000a
cpu0: entrylo0 = 0x0000000000000000 entrylo1 = 0x0000000000028d2e
cpu0:  context = 0x00000000000000a0 pagemask = 0x0000000000006000
cpu0:    wired =         0x00000001  reserv7 = 0x0000000000000000
cpu0: badvaddr = 0x0000000000057ee0    count =         0x4a113d97
cpu0:  entryhi = 0x0000000000050007  compare =         0x24887a5a
cpu0:   status = 0x0000000030008411    cause = 0x0000000000000028
cpu0:      epc = 0x0000000000011d88     prid = 0x0000000000000400
cpu0:   config = 0x0000000000804240   lladdr = 0x0000000000002812
cpu0:  watchlo = 0x0000000000000000  watchhi = 0x0000000000000000
cpu0: xcontext = 0x00000000000000a0 reserv21 = 0x0000000000000000
cpu0: reserv22 = 0x0000000000000000    debug = 0x0000000000000000
cpu0:     depc = 0x0000000000000000  perfcnt = 0x0000000000000000
cpu0:   errctl = 0x0000000000000000 cacheerr = 0x0000000000000000
cpu0: tagdatlo = 0x0000000000000000 tagdathi = 0x0000000000000000
cpu0: errorepc = 0x0000000000000000   desave = 0x0000000000000000

Since an interrupt is generated when the two registers match, there was no interrupt until count wrapped around 0xffffffff and reached compare from below.
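
To put a number on the wait described above, here is a minimal sketch of the arithmetic, assuming a free-running 32-bit count that raises the interrupt only on an exact match with compare. The helper name ticks_until_match is made up for illustration and is not part of HelenOS or GXemul.

```c
#include <stdint.h>

/* Hypothetical helper: ticks until a free-running 32-bit counter next
 * equals `compare`. Unsigned subtraction is modulo 2^32, which models
 * the counter wrapping around 0xffffffff. */
static uint32_t ticks_until_match(uint32_t count, uint32_t compare)
{
    return compare - count;
}
```

With the values from the second dump (count = 0x4a113d97, compare = 0x24887a5a) this gives 0xda773cc3, roughly 3.67 billion ticks, so count has to run through most of its 32-bit range before the next timer interrupt can fire.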

comment:3 by Jakub Jermář, 14 years ago

An instrumented version of GXemul reveals that the last value written into the compare register was greater (by about 97000) than the current value of the count register, which is pretty normal. This disproves the theory that the overflow happens as a result of writing the compare register after the count register has already missed it.
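
The theory ruled out here is the usual race when re-arming the timer: if compare is written with a value that count has already passed, no match occurs until count wraps. A minimal sketch of how such a miss could be detected, with wraparound taken into account, is below. The helper names next_compare and compare_reached are assumptions for illustration and are not taken from the HelenOS sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: next compare value, `delta` ticks ahead of count
 * (modulo 2^32). */
static uint32_t next_compare(uint32_t count, uint32_t delta)
{
    return count + delta;
}

/* Hypothetical helper: true if count has already reached or passed
 * compare. The difference is judged within half of the 32-bit range so
 * that a wrapped compare value is still treated as lying ahead. */
static bool compare_reached(uint32_t count_now, uint32_t compare)
{
    return (uint32_t)(count_now - compare) < 0x80000000u;
}
```

For a compare value written about 97000 ticks ahead of count, as the instrumented GXemul reported, compare_reached returns false, which is consistent with the conclusion that the write itself was not late.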

comment:4 by Jakub Jermář, 14 years ago

Resolution: worksforme
Status: new → closed

There appears to be a problem in GXemul which is responsible for this behavior. Anders Gavare says:

that the code which asserted the interrupt pin did so when it had updated the count register to
go past the compare register, but since there was another hack in cpu_mips_coproc.c which _also_
updated the count register, then the first mechanism sometimes failed.

As for the likely GXemul fix, Anders said:

the solution was to keep track of the number of updates done in cpu_mips_coproc.c (i.e. for each
read of the count register), and do a "rollback" just before doing the
count-goes-past-the-compare-register check.

I have tested the fixed version provided by Anders and the issue was no longer reproducible.
Closing this ticket as not a defect.
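
The failure mode and the rollback Anders describes can be modelled roughly as follows. This is only a toy model based on one reading of his explanation: the structure, the function names and the one-tick-per-read increment are all assumptions, and none of it is GXemul's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: every name here is hypothetical, none comes from GXemul. */
typedef struct {
    uint32_t count;       /* emulated count register */
    uint32_t compare;     /* emulated compare register */
    uint32_t read_bumps;  /* increments applied by count reads, not yet rolled back */
    bool irq;             /* timer interrupt pin */
} toy_cpu_t;

/* Side path: each guest read of count also nudges it forward by one
 * (an assumed amount, chosen for illustration). */
static uint32_t toy_read_count(toy_cpu_t *cpu)
{
    cpu->count++;
    cpu->read_bumps++;
    return cpu->count;
}

/* Main path: advance count by `ticks` and raise the interrupt if the step
 * reaches or crosses compare. With `rollback` set, the read-side increments
 * are undone first, so the check starts from the value this path last saw. */
static void toy_advance(toy_cpu_t *cpu, uint32_t ticks, bool rollback)
{
    if (rollback)
        cpu->count -= cpu->read_bumps;
    cpu->read_bumps = 0;

    uint32_t distance = cpu->compare - cpu->count;  /* modulo 2^32 */
    if (distance != 0 && distance <= ticks)
        cpu->irq = true;
    cpu->count += ticks;
}
```

Starting from count = 98 and compare = 100, five calls to toy_read_count push count to 103 without any check. A following toy_advance of 8 ticks without rollback then starts beyond compare and never raises the interrupt, whereas the same call with rollback starts again from 98 and does raise it.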
