Opened 11 years ago

Closed 10 years ago

#558 closed enhancement (fixed)

More flexible locking scheme for fibril timers

Reported by: Jiri Svoboda Owned by: Jiri Svoboda
Priority: major Milestone: 0.6.0
Component: helenos/lib/c Version: mainline
Keywords: Cc:
Blocker for: Depends on:
See also: #556, #557

Description

Once #556 is addressed, we get a timer that provides reasonable mutual-exclusion guarantees, but slightly inflexible with respect to locking.

As an example, imagine we implement a state machine that reacts to events and timeouts. It has a state A. Upon reception of event E the machine transitions to state B. If the event B does not come for a specified amount of time, the state machine transitions to state C.

We implement the state machine using state variable S, state lock SL and a timeout timer T.

Event E received:

  • lock SL
  • if S == A then clear T; S := B
  • unlock SL

Timeout T expires:

  • lock SL
  • S := C
  • unlock SL

Obviously S must be accessed with SL held. Now the T.lock is held when T handler is invoked. That means lock order T.lock → SL. However we must clear T while holding SL. That means lock order SL → T.lock and we get a deadlock.

In other words it is not possible to hold a lock L while setting or clearing the timer if L is taken inside the timer handler.

With the current fibril timer interface, there seems to be no way of breaking this deadlock and keeping the correctness/mutual exclusion of state transitions.

Change History (2)

comment:1 by Jiri Svoboda, 11 years ago

One way to get around this problem would be to unify the two locks, i.e. to allow the caller to specify the lock used for accessing the timer. IMHO a slightly more elegant way is to change the interface so that locking the timer lock is done explicitly by the caller using functions such as:

  • fibril_timer_lock()
  • fibril_timer_unlock()

this gives the caller a chance to take the timer lock before any locks they intend to take inside the timer handler.

comment:2 by Jiri Svoboda, 10 years ago

Resolution: fixed
Status: newclosed

Fixed in mainline,2136. In the end I went with the option of allowing the caller to specify a mutex to use for locking the timer (or NULL to use the internal one). The reason is that explicit locking of the timer's internal lock is not good enough. Imagine code like:

fibril_timer_lock(T);
fibril_mutex_lock(M);

while (...)
    fibril_condvar_wait(CV, M);

fibril_timer_clear_locked(T);
fibril_mutex_unlock(M);
fibril_timer_unlock(T);

You can see that trying to wait on a conditional variable with mutex M leads to a deadlock, because fibril_condvar_wait(CV, M) will drop M (but not T.lock) and then it will try to re-acquire M with T.lock held. If we simply allow the use of M for locking T, there is no problem.

Note: See TracTickets for help on using tickets.