Introduction

  • a program using fork/join can be seen as a DAG
  • Vertixes: units of work
  • Edges: dependencies (source must finish before destination starts)

fork

  • current node ends, 2 outgoing edges and vertices created. If possible, then in parallel
    1. The newly forked task
    2. The continuation of original task

join

  • one node with 2 incoming edges
  • 1 edge from the forked task that finished
  • 1 edge from continuation of original task that was waiting for the forked task
  • program cannot proceed until both incoming branches complete

Example, Fibonacci

Performance Model

Introduction

: execution time on P worker threads. This does not necessarily mean physical hardware cores.

To determine how much a program can potentially speed up by running in parallel, the model relies on two critical metrics derived from the DAG.

Speedup:

Maximum Parallelism

Work:

  • execution time if sequentially run
  • on DAG, total number of nodes in the entire graph.

Span, Critical Path:

  • longest chain of dependent operations in the DAG (tasks that cannot be run simultaneously)

Goals and Motivation

DAG Bounds

Lower bound

Work law: linear speedup

Span law the longest sequential chain determines the time, no matter how many worker threads we have

Combining Work and Span law:

Upper bound

The total time should not exceed the time it takes to do the perfectly divided work () plus the time spent waiting on the critical path’s dependencies ().

Execution times with ForkJoin