
Opened 9 years ago

Last modified 4 weeks ago

#4 reopened defect

HelenOS/sparc64 unstable with CONFIG_TSB

Reported by: Jakub Jermář
Owned by: Jakub Jermář
Priority: major
Milestone:
Component: helenos/kernel/sparc64
Version: mainline
Keywords:
Cc:
Blocker for:
Depends on:
See also:

Description

I found out that when I double the size of the buffer allocated for the TSB, the problem disappears. However, the size used for TSB allocation seems right. Therefore, it seems like something is damaging the content of the TSB memory.
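To make "the size used for TSB allocation seems right" checkable, here is a minimal sketch of the size arithmetic. It assumes the UltraSPARC I/II TSB layout (16-byte entries, 512 * 2^size entries per TSB, base naturally aligned to the TSB size) and assumes the buffer holds an ITSB followed by a DTSB. The names are hypothetical and are not taken from the HelenOS sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions (UltraSPARC I/II TSB layout, not stated in this ticket):
 * a TSB entry is an 8-byte tag plus 8 bytes of data, and a TSB holds
 * 512 * 2^size_field entries, with size_field in 0..7. */
#define TSB_ENTRY_BYTES   16u
#define TSB_BASE_ENTRIES  512u

/* Number of entries in one TSB for a given size field. */
static size_t tsb_entry_count(unsigned size_field)
{
    assert(size_field <= 7);
    return (size_t) TSB_BASE_ENTRIES << size_field;
}

/* Bytes needed for one TSB. */
static size_t tsb_bytes(unsigned size_field)
{
    return tsb_entry_count(size_field) * TSB_ENTRY_BYTES;
}

/* Bytes needed for one buffer holding an ITSB followed by a DTSB
 * (the single-buffer layout is an assumption). */
static size_t tsb_pair_bytes(unsigned size_field)
{
    return 2 * tsb_bytes(size_field);
}

/* Assumed requirement: a TSB base is naturally aligned to the TSB size. */
static int tsb_is_aligned(uintptr_t base, unsigned size_field)
{
    return (base & (tsb_bytes(size_field) - 1)) == 0;
}
```

Under these assumptions a size field of 0 gives an 8 KiB TSB, so an ITSB plus DTSB pair needs 16 KiB. Comparing `tsb_pair_bytes()` and `tsb_is_aligned()` against what the allocator actually returns is one way to rule out a sizing or alignment mistake before suspecting an external writer.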

I still haven't seen this anywhere other than on one of the Ultra 60s.

Disabling TSB at compile time is a workaround for this bug.

By further investigating the issue, I have come to the conclusion that the bug first shows up with revision 2161. More likely than a new bug in 2161, though, is that an already existing bug was exposed by fixing another bug in 2161: that revision fixes a bug which prevented the TSB from functioning at all. So it looks like a TSB issue.

I have never seen this with r2128.
The earliest revision I saw this bug on is r2174.
I have not investigated the revisions in between yet.
The problem seems to be independent of whether the kernel was compiled with gcc 4.1.1 or gcc 4.1.2.
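The untested range between r2128 (never seen failing) and r2174 (earliest seen failing) could be narrowed down by bisection. The sketch below is only an illustration: `rev_is_bad_fn` is a hypothetical callback standing in for "build and boot this revision and watch for the symptoms", and `bad_from_2161` is a stand-in predicate that encodes the r2161 hypothesis rather than a real test result.

```c
#include <stdbool.h>

/* Hypothetical callback: returns true when the given revision shows the bug.
 * In practice this means building and booting that revision, ideally
 * several times, because the failure is intermittent. */
typedef bool (*rev_is_bad_fn)(int rev);

/* Find the first bad revision in (good, bad], assuming every revision
 * after the first bad one is also bad. */
static int first_bad_revision(int good, int bad, rev_is_bad_fn is_bad)
{
    while (bad - good > 1) {
        int mid = good + (bad - good) / 2;
        if (is_bad(mid))
            bad = mid;   /* bug present: first bad revision is at or before mid */
        else
            good = mid;  /* bug absent: first bad revision is after mid */
    }
    return bad;
}

/* Stand-in predicate for illustration only: pretends the bug appears
 * from r2161 onwards, as hypothesized in the description. */
static bool bad_from_2161(int rev)
{
    return rev >= 2161;
}
```

With the stand-in predicate, `first_bad_revision(2128, 2174, bad_from_2161)` converges on 2161 after six boots. Because the real failure is intermittent, a "good" verdict for a revision is only as reliable as the number of boots behind it.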

I saw this only on one Ultra 60 when trying to boot revisions around 2233 from a CD-ROM.
What happened was one of the three scenarios:

  1. the kernel booted just fine, but the ns task got the data_access_error exception (as reported in klog) and died; several tasks died afterwards, most likely because they could not connect to ns; the kconsole was responsive in this case and I could investigate the content of the klog
  2. the kernel booted just fine, but the ns task exited and no exception was reported in klog; some other tasks died after ns exited; the kconsole was responsive in this case and I could investigate the content of the klog
  3. the kernel booted but then appeared to hang: no console task UI, and the kconsole was not responsive

Change History (13)

comment:1 Changed 9 years ago by Jakub Jermář

Component: kernel/sparc64

comment:2 Changed 8 years ago by Jakub Jermář

Summary: Sudden death of userspace tasks → HelenOS/sparc64 unstable with CONFIG_TSB

The issue still exists with revision 4684, but I think it has slightly different symptoms considering the huge evolution step HelenOS made from 2233 to 4684.

comment:3 Changed 8 years ago by Jakub Jermář

Milestone: 0.5.0

comment:4 Changed 8 years ago by Jakub Jermář

The respective Ultra 60 system ran fine (without any of the above symptoms) with the current version of HelenOS over the night, having the following load:

  • played tetris to around 4500 points
  • ran kernel and userspace tests
  • ran tester loop1 test
  • ran the factorial sysel example in an infinite loop

This morning, the system did not boot, either hanging, or killing the userspace tasks due to a data_access_error, or both. The data_access_error trap is a sign of a hardware problem (i.e. a machine check exception).

comment:5 Changed 7 years ago by Jakub Jermář

Status: new → accepted

comment:6 Changed 7 years ago by Jakub Jermář

Status: accepted → assigned

comment:7 Changed 7 years ago by Jakub Jermář

Milestone: 0.5.0 → 0.5.1

comment:8 Changed 7 years ago by Jakub Jermář

Resolution: worksforme
Status: assigned → closed

Closing as not reproducible. This ticket has been reproducible only on one Ultra 60, which will shortly become unavailable to me. If the issue reproduces on some other machine, please file a new ticket with up-to-date data.

comment:9 Changed 7 years ago by Jakub Jermář

Resolution: worksforme
Status: closed → reopened

Reopening as my new Ultra 60, 2x CPU, 2GiB RAM exhibits the same problem (mainline,1018).

comment:10 Changed 7 years ago by Jakub Jermář

The two CPUs identify as:

cpu0: manuf=UltraSPARC, impl=UltraSPARC II, mask=160 (450 MHz)
cpu1: manuf=UltraSPARC, impl=UltraSPARC II, mask=160 (450 MHz)

comment:11 Changed 6 years ago by Jakub Jermář

Milestone: 0.5.0 → 0.5.1

comment:12 Changed 3 years ago by Jakub Jermář

Milestone: 0.6.0 → 0.7.1

comment:13 Changed 4 weeks ago by Jakub Jermář

Milestone: 0.7.1