Introduction
- a program using fork/join can be seen as a DAG
- Vertixes: units of work
- Edges: dependencies (source must finish before destination starts)
fork
- current node ends, 2 outgoing edges and vertices created. If possible, then in parallel
- The newly forked task
- The continuation of original task
join
- one node with 2 incoming edges
- 1 edge from the forked task that finished
- 1 edge from continuation of original task that was waiting for the forked task
- program cannot proceed until both incoming branches complete
Example, Fibonacci
Performance Model
Introduction
: execution time on P worker threads. This does not necessarily mean physical hardware cores.
To determine how much a program can potentially speed up by running in parallel, the model relies on two critical metrics derived from the DAG.
Speedup:
Maximum Parallelism
Work:
- execution time if sequentially run
- on DAG, total number of nodes in the entire graph.
Span, Critical Path:
- longest chain of dependent operations in the DAG (tasks that cannot be run simultaneously)
Goals and Motivation
DAG Bounds
Lower bound
Work law: linear speedup
Span law the longest sequential chain determines the time, no matter how many worker threads we have
Combining Work and Span law:
Upper bound
The total time should not exceed the time it takes to do the perfectly divided work () plus the time spent waiting on the critical path’s dependencies ().
Execution times with ForkJoin