Last time:
* Thread intro
* Create
  - asynchronous function call
  - user vs. system
* Within thread - order respected
* Relative rates unknown (problematic)
* Exit/join
  - synchronization & communication
  - Join returns after exit completes, passes exit value
* Volatile
  - prevent register caching, part of the type system

Today:
* "synch: mutual exclusion"
* pthreads: mutexes (locks)
* Rules for using mutexes safely (some non482 stuff)
  - granularity
  - deadlock
  - composability
* pthread_once
* trylocks and why they're almost always a bad idea

Readings:
* Link: LLNL thread tutorial (see Phorum)
* Scheduler activations - interesting, not required
* Dr. Dobbs on volatile

Mutual exclusion is a way to guarantee that at most one thread is in any of a set of sequences of operations at the same time.

Recall from last time:

    carve()
      while(done < count)
        sleep(1);
      write(C);

    subMult()
      ...
      done++

This isn't going to work because "done++" is likely not atomic (load, inc register, store on a load/store arch; nonatomic inc instruction on x86). We want to guarantee that the two load/inc/store sequences from different threads cannot be interleaved.

Solution: Mutexes (type pthread_mutex_t)

Can be initialized statically or dynamically.

pthread_mutex_lock(m)
  - Waits until m is free
  - acquires it
  - returns
  At most one thread can get the lock at once.

pthread_mutex_unlock(m)
  - Changes m's state from held to free.

pthread_mutex_trylock(m)
  - Acquires the lock if it's not already held by someone else.
  - Otherwise does nothing. (so it's nonblocking)
  - Returns the old state of the lock (i.e.)

There are two ways to write multithreaded programs:
* Think really carefully before you start.
* Write sequential version, then add threads and locks until it works.

The rule: Do not hack before think. (Think first, hack later.)
Corollary: do not code when you can't blink. (i.e., you're tired)

Step 1: For each piece of shared mutable state, associate a lock with it.
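As a rough sketch of the mutex calls listed above, here is the done counter paired with its own lock. The names done_lock, worker and run_workers, and the thread and iteration counts, are assumptions made for illustration; they are not from the lecture. The mutex is initialized statically; pthread_mutex_init() would be the dynamic alternative.

```c
#include <assert.h>
#include <pthread.h>

#define NTHREADS   4
#define INCREMENTS 100000

/* Shared mutable state and the lock associated with it (illustrative names). */
static int done = 0;
static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER; /* static init */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&done_lock);   /* waits until free, then acquires */
        done++;                           /* load/inc/store runs under the lock */
        pthread_mutex_unlock(&done_lock); /* held -> free */
    }
    return NULL;
}

/* Starts NTHREADS workers, joins them all, and returns the final count. */
int run_workers(void)
{
    pthread_t tid[NTHREADS];
    int rc;

    done = 0;
    for (int i = 0; i < NTHREADS; i++) {
        rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NTHREADS; i++) {
        rc = pthread_join(tid[i], NULL);
        assert(rc == 0);
    }
    (void)rc;
    return done; /* safe to read unlocked: every worker has been joined */
}
```

With the lock/unlock pair in place, run_workers() should return NTHREADS * INCREMENTS. Removing the pair would typically produce a smaller number, because increments from different threads get lost.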
(Note that Step 1 doesn't say "a new lock" or "only one lock".)

Step 2: For each piece of shared mutable state, identify the invariants on that state. Then identify the legal transformations under those invariants.

These two steps are the hard parts (step 2 in particular). Everything after is mechanical.

1) Before accessing shared state, you MUST acquire the associated mutex.
2) You must hold the mutex for the duration of the transformation.
3) You must release the lock after the transformation completes and the invariant is restored (this must hold for all paths out of the transformation).

Goal: lock as late as possible, release as soon as possible (maximize concurrency). When in doubt, worry about correctness over performance.

Sequences protected by locks are called critical sections.

    lock(m);
    while(done < count) {
        unlock(m);
        sleep(1);
        lock(m);
    }
    unlock(m);

Don't forget the unlock/lock inside the loop, otherwise no one will ever increment done.

How do we get the bank account checking<->savings transfer case correct? (try to transfer $100 both ways at the same time)

We could have a "global bank lock", but that would be outrageous in terms of concurrency. There are two other alternatives: a lock for each account, and a lock for each customer.

Note that the way we choose to assign locks leads to certain ways of thinking about transformations. If we're concerned with customers, a transfer is one action. If we're thinking about accounts, a transfer is 2 operations (a debit and a credit).

In the accounts case, we have to think about how to compose these two operations. The obvious version:

    lock(check)
    debit $100, checking
    unlock(check)
    lock(savings)
    credit $100, savings
    unlock(savings)

is totally useless. If I check my balances between the two critical sections, the $100 appears to have vanished, generating customer service calls.

So, we could write transfer as one function:

    transfer(fromA, toA)
      lock(fromA.m)
      lock(toA.m)
      debit/credit
      unlock(toA.m)
      unlock(fromA.m)

Sadly, this doesn't work.
Consider: transfer(sav,check) & transfer(check,sav). There's a deadlock if each thread succeeds in its first lock attempt and then blocks on the other thread when trying to get the second lock.

From the point of view of locks, deadlock occurs when:
* You have to acquire more than one lock and
* Two threads can try to acquire the same set of locks in different orders.

The most common way to solve the deadlock problem with mutexes is to enforce a global partial order. This order says that if any two locks can be held concurrently, there can be only one order in which they are acquired. For the bank account problem, just take them in account number order.

This can be complicated because it might not be clear that two locks are taken at the same time, and sometimes it's "just hard". To deal with this, we assign locks hierarchically, and then we can acquire top-down and left-to-right. (If you can structure data in a tree, there's a natural order. This is harder on general graphs.)

The best example of there not being a clear order: mv foo ... To do name lookup, you have to lock from the root of the filesystem down, but the obvious implementation of this function locks the file and then locks the destination.

It can also happen that multiple programs are each internally consistent in their lock order, but don't agree with one another. This sort of thing is a trylock() temptation.

Aside about transactions:
* A transaction in the formal sense has 4 important properties: atomicity (provided by locks), isolation (no interleaving, provided by locks), abortability (give up and restart), and something else that apparently wasn't important enough to mention. Point being that you could use trylock() and abort if it fails.

The other problem with locks is that they're not composable. If I wrote an account-centric view with thread-safe per-account debit() and credit(), there is no way to write an atomic transfer. You can't pull the lock outside of debit() and credit(). It doesn't quite work even with counting locks.
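For reference, going back to the single-function transfer and the "account number order" fix described earlier, a minimal sketch might look like the following. The struct account layout, its id field, the two static accounts, and the helper names are all assumptions made for illustration; the lecture only gives the pseudocode and the ordering rule.

```c
#include <assert.h>
#include <pthread.h>

#define ROUNDS 10000

/* Hypothetical account type: id defines the global lock order. */
struct account {
    int id;
    long balance;
    pthread_mutex_t m;
};

static struct account checking = { 1, 1000, PTHREAD_MUTEX_INITIALIZER };
static struct account savings  = { 2, 1000, PTHREAD_MUTEX_INITIALIZER };

/* Moves amount between accounts, always locking the lower-numbered
 * account first so every thread acquires the pair in the same order. */
void transfer(struct account *from, struct account *to, long amount)
{
    if (from == to)
        return; /* never lock the same mutex twice */

    struct account *first  = (from->id < to->id) ? from : to;
    struct account *second = (from->id < to->id) ? to : from;

    pthread_mutex_lock(&first->m);
    pthread_mutex_lock(&second->m);
    from->balance -= amount; /* debit */
    to->balance   += amount; /* credit */
    pthread_mutex_unlock(&second->m);
    pthread_mutex_unlock(&first->m);
}

static void *check_to_sav(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++)
        transfer(&checking, &savings, 1);
    return NULL;
}

static void *sav_to_check(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++)
        transfer(&savings, &checking, 1);
    return NULL;
}

/* Runs transfers in both directions at once; returns the combined balance. */
long run_transfers(void)
{
    pthread_t a, b;
    int rc;

    rc = pthread_create(&a, NULL, check_to_sav, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, sav_to_check, NULL);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    (void)rc;
    return checking.balance + savings.balance;
}
```

Because both directions take the checking lock (id 1) before the savings lock (id 2), neither thread can hold one lock while waiting on the other in the opposite order. This only helps when the whole transfer lives in one function that sees both accounts, which is the composability limitation discussed above.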
(counting locks let you lock more than once and require the corresponding number of unlocks.) xfer(A,B) (suppose A