Opened 14 years ago
Closed 14 years ago
#290 closed defect (fixed)
Infrequent hang of loaders while reading from fat during startup
Reported by: | Jakub Jermář | Owned by: | Jakub Jermář |
---|---|---|---|
Priority: | major | Milestone: | 0.4.3 |
Component: | helenos/fs/fat | Version: | |
Keywords: | Cc: | ||
Blocker for: | Depends on: | ||
See also: |
Description
In about 1/10 boots of default mainline,844 on ia32, I experience a hang when loading of the bdsh tasks on all vc's gets stuck in the loader phase. The loader tasks seem to be waiting for VFS / FAT to answer a read() request.
Looking at the FAT task, I can see that there is one dispatched pair of VFS_OUT_READ and IPC_M_DATA_READ. The other reads are still in a non-dispatched state.
For some reason, FAT seems to be unable to make enough forward progress to answer the read request.
Change History (10)
comment:1 by , 14 years ago
Status: | new → accepted |
---|
comment:2 by , 14 years ago
comment:3 by , 14 years ago
It looks like we are dealing with a corrupted collision chain of the cache block hash table. By adding some printfs into hash_table_remove(), hash_table_remove() operates on a hash table which looks like this:
0x38608 is the head link_t
Then there are two entries that form a closed cycle via the next pointer:
0x1e42c next = 0x38628 0x38628 next = 0x1e42c
Note that the second entry is actually only 32 bytes away from the head link. But this is normally impossible, because the list is formed of block_t structures that are 88 bytes long and the hash_link itself is located at offset 76.
comment:4 by , 14 years ago
From another instance of the bug:
head=0x385c8 next=0x1ed1c prev=0x1ed1c 0x1ed1c next=0x38618 prev=0x38618 0x38618 next=0x1ed1c prev=0x1ed1c
So the list's prev and next pointers are consistent, only the last record should not be there. This time, it is 80 bytes offset from the head of the list, meaning that there is not enough space between the head and the record for the record to be from a valid allocated memory at the same time as the head of the list.
Has the memory holding the head of the list been freed?
comment:5 by , 14 years ago
I managed to reproduce this even when I added some extra debugging facilities:
- free() sets the freed area with a special pattern so that use-after-free is detected
- list operations verify that the link is a link and head is a head
These techniques helped me to discover bugs in other code (already fixed in mainline) but unfortunately did not help with debugging of this issue.
follow-up: 7 comment:6 by , 14 years ago
I think I have just had an epiphany: the linked list looks like the middle record was first inserted to the list with head in the first record and later it was inserted into another list with head in the third record without first being removed from the first list. This would suggest an logic error in maintaining the block cache.
comment:7 by , 14 years ago
Replying to jermar:
I think I have just had an epiphany: the linked list looks like the middle record was first inserted to the list with head in the first record and later it was inserted into another list with head in the third record without first being removed from the first list. This would suggest an logic error in maintaining the block cache.
And given that the linked lists in question form a hash table suggests that the middle record was inserted twice, each time with a key hashed to a different bucket.
comment:8 by , 14 years ago
I managed to catch the FAT server when it was corrupting the hash table. The stack shows:
fat_read() → fat_block_get() → _fat_block_get() → block_get(),
where we detect an attempt to initialize a hash table link, which is still hashed in.
We have this piece of assembly:
7ee7: 8b 53 58 mov 0x58(%ebx),%edx 7eea: 8b 43 54 mov 0x54(%ebx),%eax 7eed: c7 43 5c e0 af dc ba movl $0xbadcafe0,0x5c(%ebx) 7ef4: c7 43 60 01 00 00 00 movl $0x1,0x60(%ebx) 7efb: 89 02 mov %eax,(%edx) 7efd: 8b 43 54 mov 0x54(%ebx),%eax 7f00: c7 43 54 00 00 00 00 movl $0x0,0x54(%ebx) 7f07: 89 50 04 mov %edx,0x4(%eax) 7f0a: 8b 43 40 mov 0x40(%ebx),%eax 7f0d: 89 45 e0 mov %eax,-0x20(%ebp) 7f10: 8d 45 e0 lea -0x20(%ebp),%eax 7f13: c7 43 58 00 00 00 00 movl $0x0,0x58(%ebx) 7f1a: 52 push %edx 7f1b: 6a 01 push $0x1 7f1d: 50 push %eax 7f1e: 8b 45 c8 mov -0x38(%ebp),%eax 7f21: 50 push %eax 7f22: e8 69 e9 00 00 call 16890 <hash_table_remove> 7f27: 83 c4 10 add $0x10,%esp 7f2a: 83 ec 0c sub $0xc,%esp 7f2d: 53 push %ebx 7f2e: e8 2d 58 00 00 call d760 <fibril_mutex_initialize> 7f33: c7 43 18 01 00 00 00 movl $0x1,0x18(%ebx) 7f3a: c6 43 1c 00 movb $0x0,0x1c(%ebx) 7f3e: c6 43 1d 00 movb $0x0,0x1d(%ebx) 7f42: 8d 43 20 lea 0x20(%ebx),%eax 7f45: 89 04 24 mov %eax,(%esp) 7f48: e8 93 5b 00 00 call dae0 <fibril_rwlock_initialize> 7f4d: 83 c4 10 add $0x10,%esp 7f50: 81 7b 5c e0 af dc ba cmpl $0xbadcafe0,0x5c(%ebx) 7f57: 0f 84 1d 02 00 00 je 817a <block_get+0x56a> 7f5d: c7 43 54 00 00 00 00 movl $0x0,0x54(%ebx) 7f64: c7 43 58 00 00 00 00 movl $0x0,0x58(%ebx) 7f6b: c7 43 5c e0 af dc ba movl $0xbadcafe0,0x5c(%ebx) 7f72: c7 43 60 01 00 00 00 movl $0x1,0x60(%ebx) 7f79: 81 7b 6c e0 af dc ba cmpl $0xbadcafe0,0x6c(%ebx) 7f80: 0f 84 db 01 00 00 je 8161 <block_get+0x551>
8161: 83 7b 70 02 cmpl $0x2,0x70(%ebx) 8165: 0f 85 1b fe ff ff jne 7f86 <block_get+0x376> 816b: c7 05 00 00 00 00 01 movl $0x1,0x0 <==== CRASHED HERE 8172: 00 00 00 8175: e9 0c fe ff ff jmp 7f86 <block_get+0x376>
The above roughly corresponds to this piece of block_get():
fibril_mutex_unlock(&b->lock); /* * Unlink the block from the free list and the hash * table. */ list_remove(&b->free_link); temp_key = b->lba; hash_table_remove(&cache->block_hash, &temp_key, 1); } block_initialize(b); <=== CRASHED HERE b->devmap_handle = devmap_handle; b->size = cache->lblock_size; b->lba = ba; b->pba = ba_ltop(devcon, b->lba); hash_table_insert(&cache->block_hash, &key, &b->hash_link);
In block_initialize(), we do:
fibril_mutex_initialize(&b->lock); b->refcnt = 1; b->dirty = false; b->toxic = false; fibril_rwlock_initialize(&b->contents_lock); link_initialize(&b->free_link); link_initialize(&b->hash_link); <=== CRASHED HERE
comment:9 by , 14 years ago
I think what happens is this:
- block_get(ba1)
- block_get(ba2), ba2 == ba1
Thanks to a race condition, block_get() does not detect that it has just instantiated a block_t for ba1 and instantiates it once again for ba2. So that we end up with two block_t's in the hash table having the same key.
- block_put(ba2)
- block_get(ba3), ba3 != ba1
In block_get() we now decide to recycle ba2 for instantiating ba3. So we remove ba2 from the free list and want to hash it out from the hash table. So we call hash_table_remove() and pass ba2 as the key. The problem is that this will find and unhash the block_t for ba1 because it was the first block_t with this key added to the hash bucket list. ba2 will remain in the hash table. Then we hash ba2's block_t for the second time, corrupting thus the hash table.
comment:10 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | accepted → closed |
Fixed in changeset:mainline,880.
After yesterday's luck, this issue was much harder to reproduce today (about 1/100). I noticed that the fat thread is in the ready state when observed from kconsole, which suggests some kind of infinite loop. Using the btrace kconsole command, I can see that the stack trace remains one of the following: