
Opened 8 years ago

Closed 8 years ago

#373 closed defect (fixed)

Hang while mounting a file_bd device backed by an image located on root FS

Reported by: Jakub Jermář Owned by: Jakub Jermář
Priority: major Milestone: 0.5.0
Component: helenos/srv/vfs Version: mainline
Keywords: Cc:
Blocker for: Depends on:
See also:

Description

On 09/06/2011 11:13 AM, Maurizio Lombardi wrote:

2) This is a bug that I'm unable to detect and I'm not sure is in the
minixfs driver:

#mkfile -s 300k disk
#file_bd disk loop
#mkminix loop <--- make a minix v3 filesystem with 4K block size
#mount mfs mnt loop

Here the mfs driver hangs: it successfully reads the superblock via
block_read_direct(), then it initializes the block cache and tries to
read the root inode. It calls block_get(), but this function never
returns (the arguments are valid, so I don't understand what is going
wrong).

Note that it works if I use block sizes of 2K or 1K.


On 09/06/2011 09:15 PM, Maurizio Lombardi wrote:

I forgot to mention that this bug is reproducible only with "file_bd",
not with "ata_bd".


On 09/06/2011 11:26 PM, Martin Sucha wrote:

There seems to be a deadlock involving the fat server: using ipc
<task_id> from kconsole, I can see there is a dispatched call (I have
not investigated further which message it is).

I also tried using tmpfs as ramdisk format and the problem indeed is not
reproducible if using tmpfs instead of fat.

Change History (4)

comment:1 Changed 8 years ago by Jakub Jermář

This seems to be only reproducible when the image file is located on the root file system. Placing it under /tmp will make this bug non-reproducible.

comment:2 Changed 8 years ago by Jakub Jermář

Summary: Hang in block_get() → Hang while mounting a file_bd device backed by an image located on root FS

comment:3 Changed 8 years ago by Jakub Jermář

Component: helenos/bd/other → helenos/srv/vfs
Owner: set to Jakub Jermář

When inspecting the IPC state of the VFS and FAT tasks using the kconsole ipc command, we can see that VFS sends out VFS_OUT_MOUNT (for mfs) and VFS_OUT_READ (for file_bd) over the same phone. This leads to a deadlock: the answer to VFS_OUT_MOUNT will not come until the VFS_OUT_READ request completes, which won't happen because the fibril that could perform VFS_OUT_READ is blocked processing VFS_OUT_MOUNT.

The problem seems to be in this code:

=== modified file 'uspace/srv/vfs/vfs_ops.c'
--- uspace/srv/vfs/vfs_ops.c	2011-08-19 08:58:50 +0000
+++ uspace/srv/vfs/vfs_ops.c	2011-09-09 15:22:42 +0000
@@ -223,8 +223,8 @@
 		return;
 	}
 	
+	async_wait_for(msg, &rc);
 	vfs_exchange_release(exch);
-	async_wait_for(msg, &rc);
 	
 	if (rc == EOK) {
 		rindex = (fs_index_t) IPC_GET_ARG1(answer);

In other words, it looks like the phone over which we sent VFS_OUT_MOUNT is recycled for the second request. Waiting for the answer first should force VFS_OUT_READ to use another phone.
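
The ordering effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the HelenOS implementation: the two-phone pool and the functions exchange_begin(), exchange_release() and nested_read_shares_phone() are hypothetical stand-ins, and "waiting for the answer" is modelled simply as the point up to which the mount's phone stays reserved.

```c
#include <assert.h>
#include <stdbool.h>

#define NPHONES 2	/* hypothetical pool size */

/* held[i] is true while phone i is reserved by an open exchange. */
static bool held[NPHONES];

/* Reserve the first free phone; returns its index, or -1 if none is free. */
static int exchange_begin(void)
{
	for (int i = 0; i < NPHONES; i++) {
		if (!held[i]) {
			held[i] = true;
			return i;
		}
	}
	return -1;
}

/* Return a phone to the pool so that later requests may reuse it. */
static void exchange_release(int phone)
{
	assert(phone >= 0 && phone < NPHONES);
	held[phone] = false;
}

/*
 * Model one mount whose processing needs a nested read.
 * Returns true if the read was sent over the same phone that
 * still carries the unanswered mount request.
 */
static bool nested_read_shares_phone(bool release_before_wait)
{
	int mount_phone = exchange_begin();
	assert(mount_phone >= 0);

	/* The mount request is now in flight and not yet answered. */
	if (release_before_wait)
		exchange_release(mount_phone);	/* old order */

	/* Serving the mount requires a read from the backing image. */
	int read_phone = exchange_begin();
	assert(read_phone >= 0);
	bool shared = (read_phone == mount_phone);

	exchange_release(read_phone);
	if (!release_before_wait)
		exchange_release(mount_phone);	/* new order: release after the answer */

	return shared;
}
```

In this model, nested_read_shares_phone(true) reports that the read reuses the mount's phone, while nested_read_shares_phone(false) reports that it is pushed onto a different one, which matches the reasoning behind swapping the two calls in the diff.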

comment:4 Changed 8 years ago by Jakub Jermář

Resolution: fixed
Status: new → closed

Fixed in mainline,1220.
