Visualizing PVM Executions

Thomas Kunz and David J. Taylor

Department of Computer Science, University of Waterloo

Waterloo, Ontario, Canada, N2L 3G1.

To facilitate reasoning about the execution of massively parallel and distributed applications, we are developing a visualization tool that displays application execution using process-time diagrams. These displays are based on the notion that each process consists of a totally ordered sequence of primitive events, each representing some activity performed by a process and considered to take place at an instant in time. Typically, the lowest level of observed behaviour consists of events representing process interactions, such as sending and receiving messages and process creation and termination.

Our tool, POET, displays processes and events using two-dimensional process-time diagrams. The placement of events along the time axis is based on either their occurrence in real time or their relationship to other events in the partial order introduced by Lamport. Each display mode has value: for example, partial-order displays are useful for debugging, while real-time displays are useful for performance analysis.
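The partial-order placement can be illustrated with Lamport-style logical timestamps. The following is a minimal sketch, not POET's actual layout algorithm: the function name `lamport_positions` and the trace format (a dictionary of per-process event lists plus a list of send/receive pairs) are assumptions made for this example. Each event is placed one column after the latest of its predecessors, where a predecessor is either the previous event in the same process or the matching send.

```python
from collections import defaultdict, deque


def lamport_positions(traces, messages):
    """Assign each event a column on the time axis of a partial-order display.

    traces:   dict mapping a process name to its events in local order
    messages: list of (send_event, receive_event) pairs
    Both formats are hypothetical and chosen only for this sketch.
    """
    preds = defaultdict(list)
    succs = defaultdict(list)

    # Events within one process are totally ordered.
    for events in traces.values():
        for earlier, later in zip(events, events[1:]):
            preds[later].append(earlier)
            succs[earlier].append(later)

    # A send always precedes its matching receive.
    for send, recv in messages:
        preds[recv].append(send)
        succs[send].append(recv)

    all_events = [e for events in traces.values() for e in events]
    remaining = {e: len(preds[e]) for e in all_events}
    ready = deque(e for e in all_events if remaining[e] == 0)

    position = {}
    while ready:
        event = ready.popleft()
        # One column after the latest predecessor; column 1 if there is none.
        position[event] = 1 + max((position[p] for p in preds[event]), default=0)
        for nxt in succs[event]:
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                ready.append(nxt)

    if len(position) != len(all_events):
        raise ValueError("trace is not a valid partial order (cycle detected)")
    return position


traces = {"P0": ["a1", "a2"], "P1": ["b1", "b2"]}
messages = [("a2", "b1")]  # P0 sends at a2, P1 receives at b1
print(lamport_positions(traces, messages))
```

In this sketch the receive `b1` is drawn to the right of the send `a2`, regardless of the real-time clocks on the two machines, which is the property that makes such displays convenient for debugging.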

Massively parallel and distributed executions typically contain many processes and primitive events. To assist in understanding such executions, abstract visualizations are provided in which processes are grouped into process clusters and primitive events are grouped into abstract events. Such abstractions are either derived automatically or created manually by a user. POET allows a user to navigate the resulting hierarchy of abstract views, for example to collect increasingly detailed information about smaller parts of the execution.
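One way to picture a process cluster is as a single trace line that stands in for several processes. The sketch below is an illustration under stated assumptions, not a description of POET's implementation: the function `cluster_view`, the message format, and in particular the choice to hide messages exchanged inside a cluster are all assumptions introduced here.

```python
def cluster_view(messages, cluster_of):
    """Rewrite process-level messages as cluster-level messages.

    messages:   iterable of (sender_process, receiver_process) pairs
    cluster_of: dict mapping a process to its cluster name; processes
                that are not listed are shown as themselves
    Assumption of this sketch: a message whose endpoints fall in the
    same cluster is internal to that cluster and is not displayed.
    """
    visible = []
    for sender, receiver in messages:
        src = cluster_of.get(sender, sender)
        dst = cluster_of.get(receiver, receiver)
        if src != dst:
            visible.append((src, dst))
    return visible


messages = [("p1", "p2"), ("p2", "p3"), ("p3", "p1")]
cluster_of = {"p1": "A", "p2": "A"}  # p1 and p2 form cluster A
print(cluster_view(messages, cluster_of))
```

Expanding a cluster then amounts to removing its entries from `cluster_of`, which restores the more detailed view for that part of the execution.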

Currently, POET runs in a variety of target environments, such as DCE, Hermes, ABC++, uC++, and SR. This paper describes how we adapted POET to PVM 3.3, using its tracing facility. The basic event model in POET allows only one-to-one communication. To support PVM features such as multicast and group communication, we used event abstraction: each such activity is modeled as multiple primitive events involving one-to-one communication, which are then grouped into an abstract event and displayed as a single entity.
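The mapping from a multicast to one-to-one events can be sketched as follows. This is a hypothetical data model written for illustration only: the classes `PrimitiveEvent` and `AbstractEvent` and the function `model_multicast` are not taken from POET or from the PVM library, and the task names are made up.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class PrimitiveEvent:
    event_id: int
    process: str   # process in which the event occurs
    kind: str      # "send" or "receive"
    partner: str   # the single process at the other end


@dataclass
class AbstractEvent:
    label: str
    members: list = field(default_factory=list)


def model_multicast(sender, receivers, ids=None):
    """Expand one multicast into one-to-one send/receive pairs.

    The pairs are collected in a single AbstractEvent so that a display
    could draw the whole multicast as one entity.
    """
    if ids is None:
        ids = count(1)  # hypothetical event-id generator
    group = AbstractEvent(label=f"multicast from {sender}")
    for receiver in receivers:
        group.members.append(PrimitiveEvent(next(ids), sender, "send", receiver))
        group.members.append(PrimitiveEvent(next(ids), receiver, "receive", sender))
    return group


mcast = model_multicast("task1", ["task2", "task3"])
for event in mcast.members:
    print(event)
```

Keeping the primitive events inside the abstract event means that a user can still open the abstraction and inspect the individual one-to-one communications when needed.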

A paper version of the presentation is not yet available. However, a list of other publications describing POET and our approach to process clustering and event abstraction can be found on the Web server of our research group.
Sat May 6 10:48:36 EST 1995