Fixed-up version of the example from last time:

    template <typename T>
    class RootStream {
        T root;
        bool done;
    public:
        RootStream(T & r) : root(r), done(false) {}
        //yields the root exactly once, then reports an empty stream
        bool pop_if_present(T & n) {
            if(!done) {
                done = true;
                n = root;
                return true;
            }
            return false;
        }
    };

    class TBody {
        elt (*fn)(elt);
        parallel_while<TBody> & w;
    public:
        TBody(elt (*f)(elt), parallel_while<TBody> & wh) : fn(f), w(wh) {}
        typedef tree * argument_type;
        void operator()(argument_type t) const {
            if(t) {
                t->elt = fn(t->elt);
                if(t->left) w.add(t->left);
                if(t->right) w.add(t->right);
            }
        }
    };

    void apply(tree * t, elt (*fn)(elt)) {
        parallel_while<TBody> w;
        TBody b(fn, w);
        RootStream<tree *> s(t);
        w.run(s, b);
    }

For doing this with a DAG, atomically mark nodes.

    //garbled, this was way low on the board.
    //the test and the set of done must happen as one atomic step
    foreach outedge e
        if(n(e).done == false) {
            n(e).done = true;
            w.add(n(e));
        }

Book update: We've mostly done chapters 1-4, and we're now looking at atomic types from Chapter 7. We also talked about Chapter 9 on the task scheduler. So far, nothing in Chapter 5 is compelling to talk about in this class vs. 280/281.

Atomic type synopsis:

    atomic<bool> b;

Members:
* fetch_and_store(bool y) - atomically does {bool old = b; b = y; return old;}

So we can easily mark nodes atomically and fast.

There are some situations where you think you want atomic and you don't, and the book's example is really good. However, consider the following:

    //X is global and atomic
    do {
        curX = X; //atomic if X is
        if(curX) {
            newX = ...
        }
    } while(X.compare_and_swap(newX, curX) != curX);

This is optimistic concurrency control: you speculate that the value of X won't change while you compute the new value, and the compare_and_swap checks that it didn't. It's really cool, but you can get into really big trouble with it.

The A-B-A problem: if someone can change X to B and back to A without you noticing, and that's a problem for you, then you can't use optimistic concurrency control. For example, you cache a pointer into a linked list. Someone deletes the node you were using, and then someone else allocates a new node and inserts it.
It's pretty likely that the new node reuses the old chunk of memory, so the pointer compares equal even though the node is different, and you're probably hosed. The solution: don't speculate here. You could also tag nodes with serial numbers and speculate on those, though the serial numbers must not wrap around for this to be safe. That technique is generalized into "version vectors".

Pipelines
=========

Pluggable filters: it's a pipeline like in any other notion of pipelines. The filters can be marked as serial or parallel. Some reading is required for this pipeline stuff; the example would take forever.

Pipeline class -- just use.
Filter -- abstract base class.

Implement a subclass for each filter stage type.
Instantiate 1 per stage.
Instantiate a pipeline.
Add filters to the pipeline in order. (only linear pipelines supported)
Then run the pipeline.
Filters must be removed from the pipeline before they are destroyed: either explicitly destroy pl, or call pl.clear().

    namespace tbb {
        class filter {
        protected:
            filter(bool is_serial);
        public:
            bool is_serial() const;
            virtual void * operator()(void * inp) = 0;
            virtual ~filter() {}
        };
    }

    class MyFilter : public tbb::filter {
    public:
        //serial guarantees that only one instance runs at a time and that
        //elements are seen in the order they were yielded.
        MyFilter() : tbb::filter(true) {}
        void * operator()(void * p) {
            Foo * fp = static_cast<Foo *>(p);
            //1st stage: creating elts
            //last stage: return value ignored
            return fp;
        }
    };

    class pipeline {
    public:
        //ctor, dtor
        void add_filter(filter & f);
        void clear();
        void run(size_t num_tokens);
    };

throughput = throughput of slowest filter
latency is about (# stages) * (work/stage/elt)

parallelism depends on:
* structure of pipeline
* number of tokens in flight
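Since the real example would take forever, here is a toy sketch of the calling convention only. Everything in namespace toy is a hypothetical stand-in written for these notes, not the TBB classes: it copies the shape of the filter interface above but runs the stages one element at a time on a single thread, so num_tokens is ignored and there is no parallelism. The Source, Square and Sum filters and the sum_of_squares helper are made-up names. It assumes C++11 (nullptr).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace toy {  // hypothetical stand-in, NOT namespace tbb

// Same shape as the filter interface in the notes.
class filter {
    bool serial_;
protected:
    explicit filter(bool is_serial) : serial_(is_serial) {}
public:
    bool is_serial() const { return serial_; }
    virtual void * operator()(void * inp) = 0;
    virtual ~filter() {}
};

// Sequential model of a linear pipeline: each element flows through
// every stage before the next element is created.
class pipeline {
    std::vector<filter *> stages_;
public:
    void add_filter(filter & f) { stages_.push_back(&f); }
    void clear() { stages_.clear(); }
    void run(std::size_t /* num_tokens: ignored in this sequential model */) {
        assert(!stages_.empty());
        for (;;) {
            // 1st stage creates an element; a null pointer means end of input.
            void * item = (*stages_[0])(nullptr);
            if (!item) return;
            for (std::size_t i = 1; i < stages_.size(); ++i)
                item = (*stages_[i])(item);
            // Return value of the last stage is ignored.
        }
    }
};

} // namespace toy

// Stage 1 (serial): hands out pointers to the vector's elements in order.
class Source : public toy::filter {
    std::vector<int> & data_;
    std::size_t next_;
public:
    explicit Source(std::vector<int> & d) : toy::filter(true), data_(d), next_(0) {}
    void * operator()(void *) {
        return next_ < data_.size() ? &data_[next_++] : nullptr;
    }
};

// Stage 2 (marked parallel): squares the element in place.
class Square : public toy::filter {
public:
    Square() : toy::filter(false) {}
    void * operator()(void * p) {
        int * ip = static_cast<int *>(p);
        *ip *= *ip;
        return ip;
    }
};

// Stage 3 (serial): accumulates a running total.
class Sum : public toy::filter {
public:
    long total;
    Sum() : toy::filter(true), total(0) {}
    void * operator()(void * p) {
        total += *static_cast<int *>(p);
        return nullptr;
    }
};

long sum_of_squares(std::vector<int> data) {
    Source src(data);
    Square sq;
    Sum sum;
    toy::pipeline pl;
    pl.add_filter(src);   // filters are added in stage order
    pl.add_filter(sq);
    pl.add_filter(sum);
    pl.run(4);
    pl.clear();           // remove filters before they are destroyed
    return sum.total;
}
```

Calling sum_of_squares on the vector {1, 2, 3} pushes 1, 2 and 3 through the three stages and returns 14. Only the middle stage is marked parallel because it touches nothing but its own element; the source and the sum both carry state across elements, which is what serial is for.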