Resource Management for Grid-based Visualization

Ian Bowman    ijbowman@ucdavis.edu
John Shalf
Kwan-Liu Ma


Summary

Large-scale scientific computing will soon become Grid-based over high-capacity and high-speed networks. Appropriate Grid-based visualization tools must also be developed to support remote, collaborative data analysis making use of geographically distributed high-performance computing and storage facilities. The Grid offers the baseline infrastructure for launching this distributed pipeline, but offers no services that support even marginally optimal resource selection or partitioning for a visualization pipeline. Our work is creating a methodology that enables optimal resource allocation and pipeline partitioning for Grid-based visualization.

Developing such a methodology involves several steps. We first need to define the visualization pipeline that we will work with. Next, we need to implement it. Finally, we will use this pipeline to develop and test our resource management theories. The specifics of these objectives are listed and discussed in detail below.

Tasks

Task | Completion status | Est. comp. date
define visualization pipeline | 100% | NA
develop serializable datasets | 100% | NA
develop basic pipeline components | 100% | NA
develop advanced pipeline components | 100% | Jan 2003
create performance models for basic components | 100% | Jan 2003
create performance models for advanced components | 100% | Feb 2003
use performance models for performance prediction of basic visualization pipeline | 100% | Jan 2003
use performance models for performance prediction of advanced visualization pipeline | 100% | Feb 2003
create resource allocation strategy based on shortest path algorithm | 70% | May 2003
use resource allocation strategy to optimally partition visualization pipeline among available resources | 0% | May 2003


Define Visualization Pipeline

Status: Complete


The above figures represent the two basic pipeline designs. The top pipeline has three components: Reader, Isosurface Extraction, and Display. It has two datasets: 3D Grid and Triangle List. Each dataset can either be handed off locally or serialized and sent to a different machine. If all components are on different machines, data flows as follows. First, the Reader opens a dataset file and uses it to initialize a 3D Grid dataset. The 3D Grid is serialized and its packets are sent to the Isosurface Extractor. The Isosurface Extractor deserializes the 3D Grid and uses it (along with an isovalue) to create the isosurface triangles. These triangles are used to initialize a Triangle List, which is serialized as before and sent to the Display. The Display renders and displays the triangles.

The second pipeline depicted is similar to the first. The only difference is that the triangle list is sent to an off-screen renderer, which renders the triangles and sends image data to the Display. In both pipelines, the Isosurface Extraction can be done in parallel on a cluster.


The advanced pipeline shown above features a Parallel Isosurface Extraction and Image Compositing component.


Develop Serializable Datasets

Status: Complete

3D Grid, Triangle List, and Image datasets were developed. They all can be serialized and deserialized into TCP packets.
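As an illustration only, serializing a 3D Grid into a byte stream and cutting it into packet-sized chunks might look like the sketch below. The wire layout (three 32-bit dimensions followed by 32-bit floats in network byte order), the chunk size, and the function names are all assumptions made for this example; the project's actual format is not described on this page.

```python
import struct

# Hypothetical header: nx, ny, nz as unsigned 32-bit ints, network byte order.
HEADER = struct.Struct("!3I")


def serialize_grid(dims, values):
    """Pack grid dimensions and scalar values into one byte string."""
    nx, ny, nz = dims
    if len(values) != nx * ny * nz:
        raise ValueError("value count does not match dimensions")
    return HEADER.pack(nx, ny, nz) + struct.pack(f"!{len(values)}f", *values)


def deserialize_grid(payload):
    """Inverse of serialize_grid: recover (dims, values) from bytes."""
    nx, ny, nz = HEADER.unpack_from(payload, 0)
    count = nx * ny * nz
    values = list(struct.unpack_from(f"!{count}f", payload, HEADER.size))
    return (nx, ny, nz), values


def split_into_chunks(payload, chunk_size=1024):
    """Cut the byte string into pieces small enough to send one at a time."""
    return [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]
```

The receiving side would concatenate the chunks in order and call deserialize_grid on the result. A Triangle List or Image dataset could follow the same pattern with a different header.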


Develop Basic Pipeline Components

Status: Complete

A file reader, isosurface extractor (serial and parallel), off-screen renderer, and display were developed. All of these components use local I/O, or deserialize and serialize datasets as required.


Develop Advanced Pipeline Components

Status: 100% complete, ETC Jan.2003

There is only one advanced component: the Parallel Isosurface Extraction and Image Compositing component. The obstacle lay not only in developing it, but also in finding a cluster that could do the off-screen rendering.


Create Performance Models for Basic Components

Status: 100% complete, ETC Jan.2003

The performance models for all components are complete, including the parallel isosurface extractor and the on-screen display. I assume that the display is fixed and do not model it. The performance models are described below, using the following notation: time is represented by t, size by s, number (as in count) by n, and a machine-specific constant by C.

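The performance model for the reader is

$t_{\text{reader}}(s_{\text{3Dgrid}}) = s_{\text{3Dgrid}} \cdot C_{\text{reader}}$

The performance model for the isosurface extractor is

$t_{\text{iso}}(n_{\text{tris}}, s_{\text{3Dgrid}}) = \text{base}(s_{\text{3Dgrid}}) + n_{\text{tris}} \cdot C_{\text{iso}}$,

where $\text{base}(s_{\text{3Dgrid}}) = s_{\text{3Dgrid}} \cdot C_{\text{base}}$

The performance model for the off-screen renderer is

$t_{\text{render}}(n_{\text{tris}}) = n_{\text{tris}} \cdot C_{\text{render}}$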
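The three models above are simple enough to write as plain functions. The sketch below is only a transcription of the formulas; the MachineConstants container and its field names are hypothetical, and the constants would have to be measured on each machine.

```python
from dataclasses import dataclass


@dataclass
class MachineConstants:
    """Machine-specific constants C (hypothetical container, values measured per machine)."""
    c_reader: float  # seconds per unit of grid size read
    c_base: float    # seconds per unit of grid size traversed by the isosurfacer
    c_iso: float     # seconds per triangle generated
    c_render: float  # seconds per triangle rendered


def t_reader(s_grid, m):
    """Reader time: grid size times the reader constant."""
    return s_grid * m.c_reader


def t_iso(n_tris, s_grid, m):
    """Isosurface time: a base cost for the grid plus a per-triangle cost."""
    base = s_grid * m.c_base
    return base + n_tris * m.c_iso


def t_render(n_tris, m):
    """Off-screen render time: triangle count times the render constant."""
    return n_tris * m.c_render
```

Note that the isosurfacer model needs the triangle count, which depends on the chosen isovalue, so it must be known or estimated before the prediction is made.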


Create Performance Models for Advanced Components

Status: 100% complete, ETC Feb.2003

Work modeling the Parallel Isosurface Extraction and Image Compositing Component has been completed.


Use Performance Models for Performance Prediction of Basic Visualization Pipeline

Status: 100% complete, ETC Jan.2003

Tests were done to predict the performance of the various basic components, including the parallel isosurfacer, using the performance models. We predicted the performance of the components individually, as well as that of the overall pipeline. The result graphs are described below. All times are in seconds.


The above graph shows the predicted versus real times for the reader component on a CIPIC machine. Three datasets were used.


The above graph shows the predicted versus real times for the isosurfacer component on a CIPIC machine. Three datasets were used.


The above graph shows the predicted versus real times for the renderer component on a CIPIC machine. Three datasets were used.


Network timing information was collected and combined with the component performance models to predict overall pipeline performance. The above graph shows the predicted versus real times for the overall pipeline performance. Three different permutations of four machines were used to partition the pipeline.
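How the network timings and component models were combined is not spelled out here. One plausible reading, shown as a sketch, is a simple additive model: the predicted pipeline time is the sum of the component times plus a transfer time for every hop that crosses machines. The latency-plus-size-over-bandwidth transfer formula and both function names are assumptions made for this example.

```python
def transfer_time(n_bytes, bandwidth_bytes_per_s, latency_s=0.0):
    """Assumed network model: fixed latency plus size divided by bandwidth."""
    return latency_s + n_bytes / bandwidth_bytes_per_s


def predict_pipeline_time(stage_times, hops):
    """Add up component times and the transfer times between them.

    stage_times: predicted time of each component, in pipeline order.
    hops: one entry per hand-off between consecutive components, either
          None for a local hand-off or a tuple
          (n_bytes, bandwidth_bytes_per_s, latency_s) for a network transfer.
    """
    total = sum(stage_times)
    for hop in hops:
        if hop is not None:
            total += transfer_time(*hop)
    return total
```

For example, predict_pipeline_time([1.0, 2.0, 0.5], [(1000, 500.0, 0.0), None]) describes a three-stage pipeline whose first hand-off crosses the network and whose second is local.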


Use Performance Models for Performance Prediction of Advanced Visualization Pipeline

Status: 100% complete, ETC Feb.2003

Performance prediction for pipelines that include the Parallel Isosurface Extraction and Image Compositing component has been completed.


Create Resource Allocation Strategy Based on Shortest Path Algorithm

Status: 70% complete, ETC Jan.2003

The key to this method lies in the graph construction. The conceptual model is more or less complete. What remains is to implement the shortest path algorithm and the graph construction so that the optimal pipeline configuration can be computed automatically. I plan on having some of the graph information read from a file (the network B/W info) and the rest calculated (performance prediction info). All of the arc weights will be in terms of time: serialization time for a file of XX bytes, ISO time for dataset DD with isolevel II, serialization time for YY triangles, render time, etc. The objective, of course, is to minimize the total time across the pipeline. In other words, the objective is to find the shortest path from reader to display. I'll post an illustration as soon as I make one.
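Since the implementation is still pending, the following is only a sketch of one way the idea could work, not the project's method. It assumes a layered graph whose nodes are (stage, machine) pairs, with each arc weighted by the compute time of the current stage plus the transfer time to the next machine, and runs Dijkstra's algorithm over it. The function name, the dictionary layouts, and the choice of Dijkstra are all assumptions made for this example.

```python
import heapq


def best_partition(stages, candidates, compute, transfer):
    """Find the cheapest machine assignment for a linear pipeline.

    stages: ordered stage names, e.g. ["reader", "iso", "display"].
    candidates: stage name -> list of machines allowed to run it.
    compute: (stage, machine) -> predicted compute time in seconds.
    transfer: (stage, src, dst) -> predicted time to ship that stage's
              output from src to dst; same-machine hand-offs cost 0.
    Returns (total_time, [machine chosen for each stage]).
    """
    source, sink = ("SOURCE", None), ("SINK", None)

    def neighbors(node):
        if node == source:
            for m in candidates[stages[0]]:
                yield (0, m), 0.0
            return
        i, m = node
        cost = compute[(stages[i], m)]
        if i == len(stages) - 1:
            yield sink, cost
            return
        for nxt in candidates[stages[i + 1]]:
            hop = 0.0 if nxt == m else transfer[(stages[i], m, nxt)]
            yield (i + 1, nxt), cost + hop

    dist = {source: 0.0}
    prev = {}
    counter = 0  # tie-breaker so the heap never compares nodes directly
    heap = [(0.0, counter, source)]
    while heap:
        d, _, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        if node == sink:
            break
        for nxt, w in neighbors(node):
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                counter += 1
                heapq.heappush(heap, (nd, counter, nxt))

    # Walk back from the sink to recover the chosen machine per stage.
    path = []
    node = prev[sink]
    while node != source:
        path.append(node[1])
        node = prev[node]
    path.reverse()
    return dist[sink], path
```

In this sketch the compute table would come from the performance models and the transfer table from the network bandwidth file. Pinning a stage to a fixed machine (for instance the Reader to the host that holds the data file) is done by giving it a single candidate.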


Use Resource Allocation Strategy to Optimally Partition Visualization Pipeline Among Available Resources

Status: 0% complete, ETC Feb.2003
