Missed summary again.

Today: idioms for thread usage. We're still talking about task parallelism.

Grain size: too few threads -> idle CPU resources. (Note to self: review Amdahl's law.) Too many threads (or threads that are too short-lived) -> CPU resources wasted on management overhead. Encouragement: play around with our multicore machines and determine their performance characteristics. A drawback of threads is that your program can't be platform-independent from a performance point of view: partitioning the program for 2 cores is very different from partitioning it for 32 cores.

Data independence: ideally, no inter-thread dependencies.

There are three main idioms people use, and bnoble is going to add a fourth that's not usually found in textbooks. All of the idioms start with a bunch of work that has to be partitioned and farmed out to threads.

1. Boss/worker model. The boss thread takes a set of work as input and forks off one worker thread per unit of work. Each worker does its unit, reports its results (externally or to the boss), and then dies. In this idiom threads are continuously created and destroyed, so we want to make sure each thread has substantial work to do. This model is used by, among other systems, the Apache web server, except that Apache forks processes instead of threads. The final 482 project is also an example of this model.

Insufficient work means idle resources, and an overload leads to a performance collapse because you spend all your time managing threads. The problem: the grain size is totally dependent on the workload. One technique for dealing with this is to establish a "high water mark" -- limit the number of threads. Another technique is load shedding: when there's too much work, don't do some of it. This idiom isn't very common because creating and destroying threads is expensive.

2. Worker pool. Requests come in through the operating system (filesystem, network, serial port, etc.), where the mechanism is usually not thread-safe.
You then have a fixed number of threads in the pool, and they acquire work from the OS mechanism through a "boss thread" via a thread-safe queue. Tuning the number of threads in the pool is an art. Furthermore, you often partition the pool by role. For example, Coda allows replication across machines, and its server had 12 thread types. The bulk of them were vprocs (client requests). There was also an admin thread, a pool of backup threads, some data collection threads, and a whole bunch of other stuff. A lot of it wasn't really necessary, but no fewer than 6 Ph.D.s were earned out of Coda. Sometimes we have to aggregate the workers' results: parallel reduce.

3. Pipelining -- a generalization of the worker pool. Everyday example: g++. It uses at least 3 programs: the preprocessor, the compilation step (two passes: front end, back end), and the assembler.

WARNING: a reading is coming up out of the patterns book. bnoble hasn't figured out the right way to present this, because the way you talk about the decomposition depends on the application.

THE MYSTERY FOURTH IDIOM:

4. Deferred work. There might be situations in which you can respond to a request before you've done all the work needed to make the data structures happy. Example: splay trees (zig, zig-zag, zag-zig). It's a tree where you fiddle with the tree after servicing a request, but you don't have to do the fiddling until after the request has been serviced.
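
A rough sketch of the deferred-work idea, not from lecture: to keep it short, a move-to-front list is swapped in for the splay tree, and the class name DeferredStore and its methods are made up for illustration. The point it tries to show is only the ordering: lookup() returns the answer first, and the reorganization is handed to a background thread through a thread-safe queue.

```python
import queue
import threading


class DeferredStore:
    """Hypothetical lookup table that answers first and reorganizes later.

    Assumption: a move-to-front list stands in for a splay tree here.
    """

    def __init__(self, items):
        self._items = list(items)        # (key, value) pairs, searched linearly
        self._lock = threading.Lock()    # protects self._items
        self._deferred = queue.Queue()   # thread-safe queue of pending fix-ups
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def lookup(self, key):
        """Service the request, then schedule the fix-up instead of doing it."""
        with self._lock:
            for k, v in self._items:
                if k == key:
                    result = v
                    break
            else:
                return None              # miss: nothing to reorganize
        self._deferred.put(key)          # the deferred work
        return result

    def _drain(self):
        """Background thread: move recently used keys to the front."""
        while True:
            key = self._deferred.get()
            with self._lock:
                for i, (k, _) in enumerate(self._items):
                    if k == key:
                        self._items.insert(0, self._items.pop(i))
                        break
            self._deferred.task_done()

    def flush(self):
        """Block until all deferred fix-ups have been applied."""
        self._deferred.join()

    def order(self):
        """Return the current key order (for inspection)."""
        with self._lock:
            return [k for k, _ in self._items]
```

The caller never waits on the reorganization; flush() exists only so that a test (or a shutdown path) can wait for the backlog to drain. The trade-off is that a lookup issued before the fix-up runs sees the old, less favorable ordering.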