Interfacing Parallel Scientific Applications with Multiple Visualization Systems: The CUMULVS Approach

James Arthur Kohl and Philip M. Papadopoulos
Computer Science and Mathematics Division
Oak Ridge National Laboratory

As high-performance computer simulation increasingly replaces more expensive and time-consuming conventional alternatives, namely physical prototyping and experimentation, it becomes important to provide mechanisms for interacting with parallel or distributed simulation programs. Such interactive exchanges can close the loop on simulation experiments by providing visualization of intermediate results and then allowing scientists to respond by manipulating the simulation while it is running. This kind of visualization and computational steering feedback system can dramatically shorten the experimental cycle by pruning experiments that are quickly seen to be undesirable. Further, such interaction can provide enhanced capabilities for "what if" analyses, even beyond what would be possible in a physical environment. This approach also introduces new forms of collaboration by allowing multiple remote collaborators to cooperate and interact via the running simulation programs, each person exploring their own view of the data and then sharing their understanding with the group.

The CUMULVS (Collaborative User Migration, User Library for Visualization and Steering) system [1,2] is a middleware library infrastructure that allows multiple, potentially remote, scientists to monitor and coordinate control over a parallel simulation program. Using CUMULVS, various front-end viewers can dynamically attach to a simulation to view snapshots of the ongoing computation, or manipulate (steer) parameters of the simulation, and then detach. The snapshots of intermediate data array values can be rendered using a variety of visualization systems, including AVS, Tcl/Tk, and VTK, or as simple text dumps. CUMULVS provides the interface between the simulation tasks and the visualization systems, and transparently handles the viewer connection protocols and the data collection required for decomposed or distributed data. CUMULVS also provides mechanisms for developing fault-tolerant applications in heterogeneous distributed computing environments.

To use CUMULVS with a simulation program, the program must be instrumented to annotate the individual data fields and parameters of interest. Any distributed decomposition of data arrays must be described so that CUMULVS knows where each data element resides. This "knowledge" allows viewer programs to reference the application data using global coordinates, independent of any underlying data decomposition. CUMULVS uses the decomposition information to map each request expressed in the global view onto the tasks where the distributed data actually reside (see Figure 1).
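The mapping from global coordinates to distributed storage can be illustrated with a small sketch. It is not the CUMULVS API: the function names (`block_bounds`, `locate`, `collect`) and the choice of a one-dimensional block decomposition are assumptions made only for illustration, and the per-task local arrays are plain Python lists standing in for data held by separate simulation tasks.

```python
def block_bounds(n, p, rank):
    """Global index range [lo, hi) owned by `rank` when n elements are
    split into p contiguous blocks (earlier ranks absorb any remainder)."""
    base, extra = divmod(n, p)
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi


def locate(g, n, p):
    """Map a global index g to (owning rank, local index on that rank)."""
    if not 0 <= g < n:
        raise IndexError("global index out of range")
    for rank in range(p):
        lo, hi = block_bounds(n, p, rank)
        if lo <= g < hi:
            return rank, g - lo
    raise AssertionError("unreachable: every valid index has an owner")


def collect(lo, hi, n, local_arrays):
    """Gather the global range [lo, hi) from the per-task local arrays
    into one contiguous list, as a viewer would expect to receive it."""
    p = len(local_arrays)
    region = []
    for g in range(lo, hi):
        rank, i = locate(g, n, p)
        region.append(local_arrays[rank][i])
    return region


if __name__ == "__main__":
    data = list(range(100, 110))                 # the "global" array
    parts = [data[0:4], data[4:7], data[7:10]]   # held by 3 hypothetical tasks
    print(locate(4, 10, 3))
    print(collect(3, 8, 10, parts))
```

The viewer side only ever names the global range; the decomposition description is what lets the request be resolved to specific tasks and local offsets.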

While this manual preparation of the application requires some specification time on the part of the programmer, it leads to a variety of powerful capabilities. The additional semantic information about program variables allows CUMULVS to transparently collect and organize subregions of data for transport to the viewer programs. Existing applications need not be torn apart to insert visualization directives. Instead, a single initialization segment defines the available data, which CUMULVS can then access periodically to manage external connections. Efficient user-directed checkpointing, reconfiguration and task migration are also possible, even across heterogeneous architecture boundaries, using the application's data roadmap to manipulate and reorganize program data.

It is this flexibility that allows CUMULVS to interface with a variety of visualization systems: the simulation programs need not be concerned with any external data formats, and the visualization tools are not required to understand the application's decomposed data layout. CUMULVS serves as an intermediate translator of the data formats, supplying the data in simple, contiguous regions that the front-end viewers can understand. CUMULVS even transforms data from column-major to row-major format and vice versa, applying truncation and promotion of data across various data types as requested to match the viewer's needs.
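As an illustration of the kind of translation described above, the following minimal sketch reorders a flat column-major buffer into row-major order and converts each element to the type the viewer asks for. The function name `column_to_row_major` and its `convert` parameter are hypothetical and do not come from CUMULVS; Python's `int` and `float` stand in for truncation and promotion between data types.

```python
def column_to_row_major(flat, rows, cols, convert=float):
    """Reorder a flat column-major buffer into row-major order.

    In column-major storage, element (r, c) lives at flat[c * rows + r];
    in the returned row-major list it lives at index r * cols + c.
    `convert` is applied to every element, e.g. int to truncate
    or float to promote.
    """
    if len(flat) != rows * cols:
        raise ValueError("buffer length does not match rows * cols")
    return [convert(flat[c * rows + r])
            for r in range(rows)
            for c in range(cols)]


if __name__ == "__main__":
    # The 2 x 3 matrix [[1, 2, 3], [4, 5, 6]] stored column by column.
    column_major = [1, 4, 2, 5, 3, 6]
    print(column_to_row_major(column_major, 2, 3, float))  # promotion
    print(column_to_row_major([1.9, 4.2, 2.5, 5.5, 3.1, 6.7], 2, 3, int))  # truncation
```

Because the reordering depends only on the array shape, the same routine serves a Fortran-style simulation feeding a C-style viewer, and swapping the roles of `rows` and `cols` gives the reverse conversion.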

Figure 1: Multiple Global Views of Decomposed Data

This model can be generalized to encompass a wide spectrum of different data mappings, to support simultaneous connections to different visualization tools from the same running simulation program. Further, this technique suggests a means for connecting arbitrary applications together for other kinds of interactive cooperation. Any set of applications that communicate by sharing or transferring data could use this approach to translate their information back and forth; execution monitoring tools that collect trace data are one example.

The ongoing CUMULVS work includes extending the existing communication protocols to support more complex types of interactions. Currently, CUMULVS supports the attachment of serial viewer programs, but this could prove inadequate for more elaborate visualization systems. The attachment protocols will be generalized to support parallel-to-parallel connections in addition to the serial-to-parallel special case. This work complements other research at ORNL in next-generation heterogeneous computing, including the HARNESS project [3]. HARNESS utilizes a pluggable architecture model in which user applications can customize their execution environment by "plugging in" elements at several levels, from low-level communication protocols, to programming models, to entire virtual machines. One of the specific goals of HARNESS is to allow the "plugging together" of arbitrary application programs using general interface definition semantics and built-in negotiation protocols. The CUMULVS interaction protocols for attaching viewer programs to simulations will form the basis for this more general collaborative mechanism.

Other, more fundamental visualization research will explore more sophisticated sampling techniques for CUMULVS. Viewers can already request subregions of data at various levels of granularity by sampling every k-th point on a coordinate axis. This simple sampling approach reduces network load and is sufficient to zoom the level of visualization detail in and out, but it might obscure high-frequency trends in the data. To address this, new techniques will be developed to apply statistical functions to the sampling, such as computing sums, averages, or other reductions on the data before returning them to the viewer for presentation.
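The difference between the two sampling styles can be seen in a small sketch. The function names `stride_sample` and `block_average` are hypothetical, and averaging is used only as one example of the reductions mentioned above; this is not the CUMULVS implementation.

```python
def stride_sample(values, k):
    """Keep every k-th point along one axis, starting at index 0."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return values[::k]


def block_average(values, k):
    """Replace each run of k consecutive points by its mean
    (the final run may be shorter than k)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    result = []
    for start in range(0, len(values), k):
        block = values[start:start + k]
        result.append(sum(block) / len(block))
    return result


if __name__ == "__main__":
    # A signal that oscillates at the highest frequency the grid can hold.
    signal = [0, 10, 0, 10, 0, 10, 0, 10]
    print(stride_sample(signal, 2))   # every second point
    print(block_average(signal, 2))   # mean of each pair
```

With `k = 2`, striding returns only the zeros and the oscillation vanishes from the viewer's data, whereas the averaged result at least reflects that the region has a mean of 5. Both approaches send the same number of values over the network.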

Currently, CUMULVS works with parallel or serial simulation programs written in C or Fortran that use either PVM [4] or MPI [5] as a message-passing substrate. To improve the applicability of CUMULVS, it will be ported to work with other languages, programming models and communication systems. For example, HPF [6] supplies sufficient accessor functions for its data array decompositions to allow automation of some of the instrumentation required for using CUMULVS. Other development environments, such as InDEPS (formerly POET) [7], could be merged with CUMULVS to expedite the instrumentation of simulation programs.

[1] G.A. Geist, J.A. Kohl, P.M. Papadopoulos, "CUMULVS: Providing Fault-Tolerance, Visualization and Steering of Parallel Applications," International Journal of Supercomputing Applications.

[2] J.A. Kohl, P.M. Papadopoulos, "A Library for Visualization and Steering of Distributed Simulations using PVM and AVS," Proceedings of the High Performance Computing Symposium, Montreal, Canada, 1995, pp. 243--254.

[3] J.A. Kohl, G.A. Geist, P.M. Papadopoulos, S. Scott, "Beyond PVM 3.4: What We've Learned, What's Next, and Why," Proceedings of EuroPVM-MPI 97, November 1997.

[4] G.A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, V. Sunderam, PVM: Parallel Virtual Machine, A User's Guide and Tutorial for Networked Parallel Computing, MIT Press, Cambridge, MA, 1994.

[5] M. Snir, S. Otto, S. Huss-Lederman, D. Walker, J. Dongarra, MPI: The Complete Reference, MIT Press, Cambridge, MA, 1996.

[6] C. Koebel, D. Loveman, R. Schreiber, G. Steele Jr., M. Zosel, The High Performance Fortran Handbook, MIT Press, Cambridge, MA, 1994.

[7] R. Armstrong, P. Wyckoff, C. Yam, M. Bui-Pham, N. Brown, "Frame-Based Components for Generalized Particle Methods," High Performance Distributed Computing (HPDC '97), Portland, OR, August 1997.

[8] J.A. Kohl, P.M. Papadopoulos, "Fault-Tolerance and Reconfigurability Using CUMULVS," Cluster Computing Conference, Emory University, Atlanta, GA, March 9-11, 1997.