Thread Coordination
- In many scenarios involving shared state, threads must coordinate changes to that state.
- A common pattern is for thread A to signal thread B when the state has changed, where thread B blocks while waiting for that change.
- Rust provides support for traditional, Mesa-style condition variables. Unfortunately, it misses an opportunity to significantly improve on their use.
- Condition variables are part of the monitor pattern, which has 3 parts:
  1. shared state and a mutex that protects that state
  2. a boolean condition that is checked before and after waiting
  3. one or more condition variables tied to the monitor
- Rust makes part 1 safe, but parts 2 and 3 are as error-prone as in C.
Condition Variables
- Condition variables represent queues with operations on them:
  - wait: add the caller to the condition variable’s queue and block the caller. The monitor lock is released while the caller is blocked and reacquired before wait returns.
  - signal (aka notify): unblock (at least) one thread from the condition variable’s queue (if any).
  - broadcast (aka notifyAll): unblock all threads waiting on the condition variable.
- These primitives have relaxed and (somewhat) unreliable semantics:
  - wait may return even though there was no signal (spurious wakeups).
  - signal may wake up more than one waiter.
  - signals aren’t stored: if no one is waiting when signal is called, nothing happens. This necessitates a boolean condition on the shared state to avoid losing wakeups.
- These relaxations reflect a compromise between implementors and users; they result from the goal of providing a low-level, efficient, and flexible primitive.
- But they’re prone to usage errors. The monitor pattern is the only way to use condition variables correctly.
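To make the wait and broadcast semantics concrete, here is a minimal sketch using only std. The names `release_all`, `gate`, and `opened` are made up for illustration. Each waiter re-checks the boolean condition in a loop, so a spurious wakeup or an early broadcast does no harm.

```rust
use std::sync::{Condvar, Mutex};
use std::thread;

/// Starts `waiters` threads that block until the gate opens, then opens it
/// with a single broadcast. Returns how many threads got through.
/// (Hypothetical helper, for illustration only.)
fn release_all(waiters: usize) -> usize {
    let gate = Mutex::new(false); // shared state: is the gate open?
    let opened = Condvar::new(); // condition variable tied to `gate`
    thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..waiters {
            handles.push(s.spawn(|| {
                let mut open = gate.lock().unwrap();
                // Re-check the condition after every wakeup.
                while !*open {
                    open = opened.wait(open).unwrap();
                }
                1usize
            }));
        }
        {
            // Change the state under the lock, then broadcast.
            let mut open = gate.lock().unwrap();
            *open = true;
            opened.notify_all();
        }
        handles.into_iter().map(|h| h.join().unwrap()).sum::<usize>()
    })
}

fn main() {
    println!("{} waiters released", release_all(4));
}
```

A waiter that starts after the broadcast never blocks at all, because it finds the flag already set. That is the role of the boolean condition: the signal itself is not stored, but the state is.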
Correct Use of Condition Variables
A condition variable must always be used with the same mutex, and always in the following pattern.
// Waiter side
acquire_mutex(&M);
...
while (condition that says event has not occurred)
    cond_wait(&C, &M);  // releases M while blocked, reacquires it before returning
...
// act on shared state, knowing that event has occurred
release_mutex(&M);

// Signaler side
acquire_mutex(&M);
// act on shared state; produce the state change the waiter is interested in
cond_signal(&C);
...
release_mutex(&M);
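The same pattern can be sketched in Rust with std's `Mutex` and `Condvar`. The `Monitor` struct and the functions `wait_for_value`, `publish`, and `run` are hypothetical names chosen for this example. The mutex is released when the guard goes out of scope, so there is no explicit release call.

```rust
use std::sync::{Condvar, Mutex};
use std::thread;

/// Hypothetical monitor: shared state plus the condition variable tied to it.
struct Monitor {
    state: Mutex<Option<u32>>, // None until the event has occurred
    changed: Condvar,
}

/// Waiter side: loop around the wait until the condition holds.
fn wait_for_value(m: &Monitor) -> u32 {
    let mut guard = m.state.lock().unwrap(); // acquire_mutex
    while guard.is_none() {
        // cond_wait: releases the mutex while blocked, reacquires on return
        guard = m.changed.wait(guard).unwrap();
    }
    (*guard).expect("loop exits only once the value is set")
} // guard dropped here: release_mutex

/// Signaler side: produce the state change, then signal, under the mutex.
fn publish(m: &Monitor, v: u32) {
    let mut guard = m.state.lock().unwrap(); // acquire_mutex
    *guard = Some(v);
    m.changed.notify_one(); // cond_signal
} // guard dropped here: release_mutex

fn run() -> u32 {
    let m = Monitor { state: Mutex::new(None), changed: Condvar::new() };
    thread::scope(|s| {
        let waiter = s.spawn(|| wait_for_value(&m));
        s.spawn(|| publish(&m, 7));
        waiter.join().unwrap()
    })
}

fn main() {
    println!("waiter saw {}", run());
}
```

The result does not depend on which thread runs first: if `publish` wins the race, the waiter finds `Some(7)` and never calls `wait`.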
Rust’s Condition Variables do not prevent naked waits
This code can hang forever: if notify_one is called before wait, the signal is lost and the waiter is never woken.
use rand::{thread_rng, RngCore}; // assumes the rand 0.8 crate
use std::sync::{Condvar, Mutex};

let coin_flipped = &Condvar::new();
let coin = &Mutex::new(0);
std::thread::scope(|s| {
    s.spawn(move || {
        let mut rng = thread_rng();
        if let Ok(mut v) = coin.lock() {
            *v = rng.next_u32() % 2;
            coin_flipped.notify_one();
        }
    });
    s.spawn(move || {
        if let Ok(v) = coin.lock() {
            // naked wait without checking the boolean condition
            if let Ok(v2) = coin_flipped.wait(v) {
                println!("Coin flipped to {}", v2);
            }
        }
    });
});
Correct use
.wait() should not be part of the Rust API because (in my opinion) it’s impossible to use correctly.
Instead, the condition must be checked before and after waiting; .wait_while() performs this check in a loop. For conceptual clarity, create a struct that combines state and condition variable:
use rand::{thread_rng, RngCore}; // assumes the rand 0.8 crate
use std::sync::{Condvar, Mutex};

struct CoinFlip<'c> {
    value: u32,    // value of coin flip (0 or 1)
    settled: bool, // whether value is valid
    coin_flipped: &'c Condvar,
}

let coin_flipped = Condvar::new();
let coin = Mutex::new(CoinFlip {
    value: 0,
    settled: false,
    coin_flipped: &coin_flipped,
});
Correct use, continued
std::thread::scope(|s| {
    let coin = &coin; // share via immutable reference
    s.spawn(move || {
        let mut rng = thread_rng();
        if let Ok(mut v) = coin.lock() {
            v.value = rng.next_u32() % 2; // flip coin
            v.settled = true;
            v.coin_flipped.notify_one(); // notify waiter
        }
    });
    s.spawn(move || {
        if let Ok(v) = coin.lock() {
            if let Ok(v) = v.coin_flipped.wait_while(v, |v| !v.settled) {
                println!("Coin flipped to {}", v.value);
            }
        }
    });
});
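One way to rule out naked waits entirely is to hide the condition variable behind a small wrapper that only offers a predicate-based wait. The sketch below is an assumption about how such a wrapper could look, not an API that std provides: `Monitor`, `update`, `wait_until`, and `run` are hypothetical names, the coin value is fixed at 1 so that the example needs no external crate, and lock poisoning is simply unwrapped.

```rust
use std::sync::{Condvar, Mutex};
use std::thread;

/// Hypothetical helper (not part of std): the only way to block on it is
/// `wait_until`, so callers cannot write a naked wait.
struct Monitor<T> {
    state: Mutex<T>,
    changed: Condvar,
}

impl<T> Monitor<T> {
    fn new(initial: T) -> Self {
        Monitor { state: Mutex::new(initial), changed: Condvar::new() }
    }

    /// Mutate the state under the lock, then wake all waiters.
    fn update(&self, f: impl FnOnce(&mut T)) {
        let mut guard = self.state.lock().unwrap();
        f(&mut *guard);
        self.changed.notify_all();
    }

    /// Block until `ready` holds, then read a result out of the state.
    fn wait_until<R>(&self, ready: impl Fn(&T) -> bool, read: impl FnOnce(&T) -> R) -> R {
        let guard = self.state.lock().unwrap();
        // wait_while re-checks the predicate before and after every wait.
        let guard = self.changed.wait_while(guard, |s| !ready(&*s)).unwrap();
        read(&*guard)
    }
}

struct CoinFlip {
    value: u32,    // value of coin flip (0 or 1)
    settled: bool, // whether value is valid
}

fn run() -> u32 {
    let coin = Monitor::new(CoinFlip { value: 0, settled: false });
    thread::scope(|s| {
        s.spawn(|| coin.update(|c| { c.value = 1; c.settled = true; }));
        let waiter = s.spawn(|| coin.wait_until(|c| c.settled, |c| c.value));
        waiter.join().unwrap()
    })
}

fn main() {
    println!("Coin flipped to {}", run());
}
```

Using `notify_all` in `update` is a deliberately conservative choice: the wrapper cannot know which waiter's predicate a given change satisfies, so it wakes everyone and lets each waiter re-check.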
Lock Poisoning
- Notice that Mutex::lock returns a LockResult<MutexGuard<'_, T>> rather than a MutexGuard directly.
// LockResult is a type alias for Result, so Ok and Err are the usual variants
type LockResult<Guard> = Result<Guard, PoisonError<Guard>>;
- Thus, it can fail, both at .lock() and at the implicit reacquisition of the lock in .wait() or .wait_while().
- A mutex acquisition fails if a thread panicked while holding the mutex. It is assumed that the state guarded by this mutex is then inconsistent, so further attempts to lock and access it fail as well.
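The following sketch shows poisoning in action, using only std. The function name `poison_and_recover` is made up for this example. A thread panics while holding the lock, the next `lock()` returns `Err`, and the guard can still be recovered from the error with `PoisonError::into_inner` if the caller decides the state is usable.

```rust
use std::sync::{Arc, Mutex, PoisonError};
use std::thread;

/// Poisons a mutex by panicking while it is held, then recovers the guard.
/// Returns (whether lock() reported poisoning, the value found inside).
fn poison_and_recover() -> (bool, i32) {
    let m = Arc::new(Mutex::new(0));
    let m2 = Arc::clone(&m);
    let handle = thread::spawn(move || {
        let mut guard = m2.lock().unwrap();
        *guard = 42;
        panic!("simulated failure while holding the lock");
    });
    // join returns Err because the thread panicked; ignore it here.
    let _ = handle.join();

    // The mutex is now poisoned, so a plain lock() fails.
    let was_poisoned = m.lock().is_err();
    // The error still carries the guard; into_inner hands it back.
    let value = *m.lock().unwrap_or_else(PoisonError::into_inner);
    (was_poisoned, value)
}

fn main() {
    let (poisoned, value) = poison_and_recover();
    println!("poisoned: {poisoned}, value: {value}");
}
```

Whether recovering is safe depends on the invariants of the protected state. Here a single integer cannot be left half-updated, so reading it is fine; for richer state, propagating the error is often the safer default.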