A number of problems in parallel computing require reasoning about the
dependency structure in parallel programs. For example, dynamic race detection
relies on efficient "on-the-fly" determination of dependencies between
sequential and parallel tasks (e.g. to quickly determine whether or not two
memory accesses occur in parallel). Several solutions to this "parallel order
maintenance" problem has been proposed, but they all have several drawbacks,
including lack of provable bounds, high asymptotic or practical overheads, and
poor support for parallel execution.
In this paper, we present a solution to the parallel order maintenance problem
that is provably efficient, fully parallel, and practical. Our algorithm --
called DePa -- represents a computation as a graph and encodes vertices in the
graph with two components: a dag-depth and a fork-path. In this encoding, each
query requires O(f/ω) work, where f is the minimum dynamic nesting depth of the
two vertices compared, and ω is the word-size. In the common case (where f is
small, e.g., less than 100), each query requires only a single memory lookup and
a small constant number of bitwise instructions. Furthermore, graph maintenance
at forks and joins requires only constant work, resulting in no asymptotic
impact on overall work and span. DePa is therefore work-efficient and fully
parallel.