Opened 11 years ago
Closed 10 years ago
#558 closed enhancement (fixed)
More flexible locking scheme for fibril timers
Reported by: | Jiri Svoboda | Owned by: | Jiri Svoboda |
---|---|---|---|
Priority: | major | Milestone: | 0.6.0 |
Component: | helenos/lib/c | Version: | mainline |
Keywords: | Cc: | ||
Blocker for: | Depends on: | ||
See also: | #556, #557 |
Description
Once #556 is addressed, we get a timer that provides reasonable mutual-exclusion guarantees, but slightly inflexible with respect to locking.
As an example, imagine we implement a state machine that reacts to events and timeouts. It has a state A. Upon reception of event E the machine transitions to state B. If the event B does not come for a specified amount of time, the state machine transitions to state C.
We implement the state machine using state variable S, state lock SL and a timeout timer T.
Event E received:
- lock SL
- if S == A then clear T; S := B
- unlock SL
Timeout T expires:
- lock SL
- S := C
- unlock SL
Obviously S must be accessed with SL held. Now the T.lock is held when T handler is invoked. That means lock order T.lock → SL. However we must clear T while holding SL. That means lock order SL → T.lock and we get a deadlock.
In other words it is not possible to hold a lock L while setting or clearing the timer if L is taken inside the timer handler.
With the current fibril timer interface, there seems to be no way of breaking this deadlock and keeping the correctness/mutual exclusion of state transitions.
Change History (2)
comment:1 by , 11 years ago
comment:2 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Fixed in mainline,2136. In the end I went with the option of allowing the caller to specify a mutex to use for locking the timer (or NULL to use the internal one). The reason is that explicit locking of the timer's internal lock is not good enough. Imagine code like:
fibril_timer_lock(T); fibril_mutex_lock(M); while (...) fibril_condvar_wait(CV, M); fibril_timer_clear_locked(T); fibril_mutex_unlock(M); fibril_timer_unlock(T);
You can see that trying to wait on a conditional variable with mutex M leads to a deadlock, because fibril_condvar_wait(CV, M) will drop M (but not T.lock) and then it will try to re-acquire M with T.lock held. If we simply allow the use of M for locking T, there is no problem.
One way to get around this problem would be to unify the two locks, i.e. to allow the caller to specify the lock used for accessing the timer. IMHO a slightly more elegant way is to change the interface so that locking the timer lock is done explicitly by the caller using functions such as:
this gives the caller a chance to take the timer lock before any locks they intend to take inside the timer handler.