Thread Coordination

In many scenarios involving shared state, threads must coordinate changes to that state.
A common pattern: thread B blocks until the state changes, and thread A signals thread B once it has made the change.
Rust provides support for traditional, Mesa-style condition variables. Unfortunately, it misses an opportunity to significantly improve on their use.
Condition variables are part of the monitor pattern, which has 3 parts:
1. shared state and a mutex that protects shared state
2. a boolean condition that is checked before and after waiting
3. one or more condition variables tied to the monitor
Rust makes part 1 safe, but parts 2 and 3 are as error-prone as in C.

Condition Variables

Condition variables represent queues of waiting threads, with the following operations:
- wait - add caller to condition variable’s queue, block caller. Also drop monitor lock while blocked.
- signal - (aka notify) - unblock (at least) one thread from condition variable’s queue (if any).
- broadcast - (aka notifyAll) - unblock all threads waiting on the condition variable
These primitives have relaxed and (somewhat) unreliable semantics:
- wait may return even though there was no signal (spurious wakeups).
- signal may wake up more than one waiter.
- signals aren’t stored: if no one is waiting when signal is called, nothing happens. This necessitates a boolean condition on the state to avoid lost wakeups.
These relaxations reflect a compromise between implementors and users, and result from the goal of providing a low-level, efficient, and flexible primitive.
But they’re prone to usage errors. The monitor pattern is the only way to use condition variables correctly.
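
The point that signals aren't stored can be illustrated with a small sketch. It assumes a hypothetical helper, flag_seen, and a plain bool as the shared state; neither comes from the original material. A notification sent while nobody is waiting leaves no trace, whereas a flag stored under the mutex is still there when the waiter arrives.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Hypothetical helper: waits (with a timeout) until the flag is set.
/// Returns true if the flag was observed set, false if the wait timed out.
fn flag_seen(pair: &(Mutex<bool>, Condvar), timeout: Duration) -> bool {
    let (lock, cvar) = pair;
    let guard = lock.lock().unwrap();
    // wait_timeout_while checks the predicate before blocking and after every wakeup.
    let (guard, _timeout_result) = cvar
        .wait_timeout_while(guard, timeout, |set| !*set)
        .unwrap();
    *guard
}

fn main() {
    let pair = (Mutex::new(false), Condvar::new());

    // A notification with no waiter is simply dropped: it is not remembered.
    pair.1.notify_one();
    assert!(!flag_seen(&pair, Duration::from_millis(50)));

    // State stored under the mutex persists, so a later waiter still sees it.
    *pair.0.lock().unwrap() = true;
    pair.1.notify_one();
    assert!(flag_seen(&pair, Duration::from_millis(50)));
}
```

The timeout is only there so the demonstration terminates; the boolean flag is what actually carries the information from signaler to waiter.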

Correct Use of Condition Variables

A condition variable must always be used with the same mutex, and always in the following pattern.

// Waiter side
acquire_mutex(&M);
...
while (event has not occurred)   // re-check the condition after every wakeup
   cond_wait(&C, &M);            // releases M while blocked, reacquires it before returning
...
// act on shared state, knowing that event has occurred
release_mutex(&M);


// Signaler side
acquire_mutex(&M);
// act on shared state; produce the state change waiter is interested in  
cond_signal(&C);
...
release_mutex(&M);
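
As a rough Rust rendering of the two sides above, here is a minimal sketch with an explicit loop around wait. The Event type and its method names are hypothetical, chosen only for illustration; the shared state is reduced to a single bool.

```rust
use std::sync::{Condvar, Mutex};
use std::thread;

/// Hypothetical monitor: a flag guarded by a mutex, plus its condition variable.
struct Event {
    occurred: Mutex<bool>,
    cond: Condvar,
}

impl Event {
    fn new() -> Self {
        Event { occurred: Mutex::new(false), cond: Condvar::new() }
    }

    /// Waiter side: loop until the condition holds.
    fn wait(&self) {
        let mut occurred = self.occurred.lock().unwrap();
        while !*occurred {
            // wait releases the mutex while blocked and reacquires it before returning.
            occurred = self.cond.wait(occurred).unwrap();
        }
        // Here the event has occurred and the mutex is held.
    }

    /// Signaler side: change the state under the mutex, then notify.
    fn signal(&self) {
        let mut occurred = self.occurred.lock().unwrap();
        *occurred = true;
        self.cond.notify_one();
    }
}

/// Runs one signaler and one waiter; returns the final value of the flag.
fn run_once() -> bool {
    let event = Event::new();
    thread::scope(|s| {
        s.spawn(|| event.signal());
        s.spawn(|| event.wait());
    });
    let occurred = *event.occurred.lock().unwrap();
    occurred
}

fn main() {
    assert!(run_once());
}
```

Because the waiter checks the flag before its first wait, the program terminates regardless of which thread acquires the mutex first.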


Rust’s Condition Variables do not prevent naked waits

This code hangs forever if notify_one is called before wait: the notification is lost, and nothing is left to wake the waiter.

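use rand::{thread_rng, RngCore};   // assumes the rand crate (0.8-style API)
use std::sync::{Condvar, Mutex};

let coin_flipped = &Condvar::new();
let coin = &Mutex::new(0);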

std::thread::scope(|s| {
    s.spawn(move || {
        let mut rng = thread_rng();
        if let Ok(mut v) = coin.lock() {
            *v = rng.next_u32() % 2;
            coin_flipped.notify_one();
        }
    });

    s.spawn(move || {
        if let Ok(v) = coin.lock() {
            // naked wait without checking the boolean condition
            if let Ok(v2) = coin_flipped.wait(v) {
                println!("Coin flipped to {}", v2);
            }
        }
    });
});


Correct use

.wait() should not be part of the Rust API because (in my opinion) it’s all but impossible to use correctly: any call outside a loop that rechecks the condition is a naked wait.

Instead, the condition must be checked both before and after waiting. For conceptual clarity, create a struct that combines state and condition variable:

struct CoinFlip<'c> {
    value: u32,         // value of coin flip (0 or 1)
    settled: bool,      // whether value is valid
    coin_flipped: &'c Condvar,
}


    let coin_flipped = Condvar::new();
    let coin = Mutex::new(CoinFlip {
        value: 0,
        settled: false,
        coin_flipped: &coin_flipped,
    });


Correct use, continued

std::thread::scope(|s| {
    let coin = &coin;   // share via immutable reference

    s.spawn(move || {
        let mut rng = thread_rng();
        if let Ok(mut v) = coin.lock() {
            v.value = rng.next_u32() % 2;   // flip coin
            v.settled = true;
            v.coin_flipped.notify_one();    // notify waiter
        }
    });
    s.spawn(move || {
        if let Ok(v) = coin.lock() {
            if let Ok(v) = v.coin_flipped.wait_while(v, |v| !v.settled) {
                println!("Coin flipped to {}", v.value);
            }
        }
    });
});
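
One way to picture the missed opportunity mentioned at the start is a wrapper that exposes no bare wait at all. The following is only a sketch: Monitor, update, and wait_until are hypothetical names and not part of std, and the random coin flip is replaced by a fixed outcome parameter so that the example needs no external crate.

```rust
use std::sync::{Condvar, Mutex};
use std::thread;

/// Hypothetical wrapper (not part of std): state, mutex, and condvar in one type.
struct Monitor<T> {
    state: Mutex<T>,
    changed: Condvar,
}

impl<T> Monitor<T> {
    fn new(initial: T) -> Self {
        Monitor { state: Mutex::new(initial), changed: Condvar::new() }
    }

    /// Mutate the state under the lock, then wake all waiters.
    fn update(&self, change: impl FnOnce(&mut T)) {
        let mut guard = self.state.lock().unwrap();
        change(&mut *guard);
        self.changed.notify_all();
    }

    /// Block until `ready` holds, then read a result from the state under the lock.
    /// There is no way to wait without supplying a condition.
    fn wait_until<R>(&self, mut ready: impl FnMut(&T) -> bool, read: impl FnOnce(&T) -> R) -> R {
        let guard = self.state.lock().unwrap();
        let guard = self.changed.wait_while(guard, |s| !ready(&*s)).unwrap();
        read(&*guard)
    }
}

struct CoinFlip {
    value: u32,    // value of coin flip (0 or 1)
    settled: bool, // whether value is valid
}

/// One thread "flips" the coin to `outcome`; another waits for it and reports the value.
fn flip_and_wait(outcome: u32) -> u32 {
    let coin = Monitor::new(CoinFlip { value: 0, settled: false });
    thread::scope(|s| {
        s.spawn(|| {
            coin.update(|c| {
                c.value = outcome;
                c.settled = true;
            })
        });
        s.spawn(|| coin.wait_until(|c| c.settled, |c| c.value))
            .join()
            .unwrap()
    })
}

fn main() {
    println!("Coin flipped to {}", flip_and_wait(1));
}
```

The sketch uses notify_all rather than notify_one because the wrapper cannot know which waiters' conditions a given update satisfies; that is a design choice of this example, not something the original prescribes.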


Lock Poisoning

Notice that Mutex::lock returns a LockResult<MutexGuard<'_, T>> rather than a MutexGuard directly.

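// In std::sync, LockResult is a type alias for Result, not a separate enum:
type LockResult<Guard> = Result<Guard, PoisonError<Guard>>;
// Its variants are therefore Ok(Guard) and Err(PoisonError<Guard>).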


Thus, it can fail, both at .lock and at the implicit reacquisition of the lock in .wait or .wait_while.
A mutex acquisition fails if a thread panicked while holding the mutex. The state guarded by the mutex is then assumed to be inconsistent, so further attempts to lock and access it fail as well.
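
Poisoning can be observed with a short sketch. The function name poison_and_recover is hypothetical; the calls it relies on (Mutex::is_poisoned and PoisonError::into_inner) are part of std. A worker thread panics while holding the lock, after which lock returns Err, but the guard can still be recovered if the caller decides the data is usable.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons a mutex by panicking while it is held, then recovers the value.
/// Returns (was the mutex poisoned?, value read after recovery).
fn poison_and_recover() -> (bool, i32) {
    let data = Arc::new(Mutex::new(0));

    let worker = {
        let data = Arc::clone(&data);
        thread::spawn(move || {
            let mut guard = data.lock().unwrap();
            *guard = 42;
            // The guard is dropped during unwinding, which marks the mutex as poisoned.
            panic!("simulated failure while holding the lock");
        })
    };
    // join returns Err because the worker panicked; the error itself is ignored here.
    let _ = worker.join();

    let was_poisoned = data.is_poisoned();
    let value = match data.lock() {
        Ok(guard) => *guard,
        // The PoisonError still carries the guard; into_inner hands it over.
        Err(poisoned) => *poisoned.into_inner(),
    };
    (was_poisoned, value)
}

fn main() {
    let (was_poisoned, value) = poison_and_recover();
    println!("poisoned: {}, value: {}", was_poisoned, value);
}
```

Whether recovering via into_inner is appropriate depends on whether the guarded state can actually be trusted after the panic; the sketch only shows that the mechanism exists.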