MBender: Leveraging QuickTime VR as a Delivery Vehicle for Remote and Distributed Visualization

Lawrence Berkeley National Laboratory Visualization Group, November 2004

The scientific community faces a well-recognized challenge - the need to perform data analysis and visualization on data that is located "somewhere else." This situation occurs when large simulation results are placed on large storage caches at computing centers, yet the scientist or analyst is located elsewhere. In the general subject area of remote and distributed visualization, there are several different approaches aimed at solving the fundamental problem of facilitating interactive exploration of large, complex, multidimensional datasets from a remote location.

The image below shows three possible partitionings of the visualization pipeline. The portions of the image in red are the remote location, while the portions in blue are local to the user. In the top row, data is stored at, read from, visualized and rendered on the remote resource. The resulting images are sent to the viewer. In the middle row, data is read from the remote location, and visualized and rendered locally. The bottom row shows a configuration where data is stored at, read from and visualized on the remote resource, and visualization results (predominantly geometry) are sent to the local site, where they are rendered and displayed. The boundary between the red and blue regions indicates where data of some type is transferred over the network.

Each approach has its own strengths and weaknesses. If we assume for the moment that the network link between the remote and local sites is the primary performance bottleneck, we can make the following generalizations about performance. In the top row, the primary factor influencing performance is the cost of sending image data. That means performance is more or less independent of the amount of data being visualized, and more sensitive to the resolution of the images being rendered. One caveat is that compressed image size (and the cost of compression) is sensitive to scene contents - "busy" scenes produce images that do not compress as well as images with less variation in content. While the "send images" approach has substantial bandwidth requirements (about 90MB/s to send 1024x1024 RGB uncompressed images at 30fps, and perhaps 5-10MB/s to send the same stream compressed at an effective ratio of about 10-20:1), this figure remains more or less constant regardless of the amount of data being processed.
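The bandwidth figures quoted above follow from simple arithmetic; the sketch below (an illustrative function of our own, not part of any real pipeline) reproduces them:

```javascript
// Back-of-envelope bandwidth estimate for the "send images" pipeline.
// compressionRatio is an assumed effective ratio (1 = uncompressed).
function streamBandwidthMBps(width, height, bytesPerPixel, fps, compressionRatio) {
  const bytesPerFrame = (width * height * bytesPerPixel) / compressionRatio;
  return (bytesPerFrame * fps) / (1024 * 1024); // MB/s
}

// 1024x1024 RGB at 30fps, uncompressed:
console.log(streamBandwidthMBps(1024, 1024, 3, 30, 1));  // 90 MB/s
// The same stream with an assumed ~15:1 effective compression ratio:
console.log(streamBandwidthMBps(1024, 1024, 3, 30, 15)); // 6 MB/s
```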

If we send all of the simulation or experiment data to the workstation for analysis, visualization and rendering, then pipeline performance will be dominated by the cost of sending the data across the network link. While this approach provides the best performance in terms of local desktop interactivity, it also presumes (1) that the data will all fit on the local workstation, and (2) that you have time to wait for the data to be moved. The fact of life we face is that scientific simulations routinely generate multi-terabyte datasets, and neither of the above presumptions applies.
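To make "time to wait" concrete, a simple estimate (our own illustrative arithmetic, with an assumed link speed) shows why moving whole datasets is impractical:

```javascript
// Hours required to move a dataset of dataTB terabytes over a link
// sustaining linkMBps megabytes per second (1 TB = 1024*1024 MB here).
function transferHours(dataTB, linkMBps) {
  const megabytes = dataTB * 1024 * 1024;
  return megabytes / linkMBps / 3600;
}

// Moving a single terabyte over an assumed 10 MB/s WAN link takes roughly a day:
console.log(transferHours(1, 10)); // ~29 hours
```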

Another alternative is shown in the bottom row, where data is visualized then sent to the client for rendering. This approach has the advantage of providing excellent desktop interactivity once the visualization results have moved across the network link. Additionally, this approach permits use of scalable visualization resources on the remote site to accelerate processing, amortize the cost of I/O across many processors, and so forth. It also presumes that all the visualization results will fit onto the desktop. In many instances, that assumption is not true.

Generally speaking, as data sizes grow larger, the approach of sending all data to the local workstation or site for processing becomes increasingly intractable. Growth of network performance (in particular) and local storage capacity does not keep pace with the Moore's Law increase in computational capacity at central computing facilities.

Motivation and Approach

In the MBender project, we're interested in alternatives to the traditional pipeline decompositions to better support interactive, remote, 3D visualization. Specifically, we want to be able to support fully interactive, 3D visualization of time varying data using "standard desktop tools" that are readily available to anyone. We observe that in many instances, users require only a subset of all possible visualization functionality: to understand 3D shape and depth relationships, and to browse time-varying data presented in a 3D format.

The approach we take is to explore ways to deliver visualization and rendering results - images - in a way that provides the experience of 3D, interactive exploration. We would like to have a delivery mechanism that strikes a balance between cost and functionality. Sending images rather than data is a better approach in terms of scaling up to ever larger data. However, sending images is expensive in its own right, and interactivity suffers when there is even minor latency in the WAN network link.

The simplest way to think of our approach is as an "interactive, 3D movie." To do so, we generate a number of images of a 3D scene by varying the viewpoint as well as the time step, then provide the user the ability to easily select a viewpoint or timestep. In this way, the user has the experience of 3D interaction under their control, as well as the ability to perform 3D interaction with time varying data. Later in this article, we explain several alternative approaches that will meet these objectives: JavaScript-based media, QuickTime VR Object movies, and MBrowser, a multiresolution image browser.
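Generating "a number of images of a 3D scene by varying the viewpoint" amounts to sampling camera positions on a sphere around the scene. The sketch below (hypothetical function and parameter names of our own) shows one way an encoder might enumerate those viewpoints before rendering each one:

```javascript
// Sample camera positions on a sphere of radius r around the scene center:
// panSteps views around the vertical axis, tiltSteps elevations between the
// poles (the poles themselves are skipped to avoid degenerate "up" vectors).
function cameraPositions(panSteps, tiltSteps, r) {
  const views = [];
  for (let i = 0; i < tiltSteps; i++) {
    // tilt ranges over (-pi/2, pi/2), exclusive of the poles
    const tilt = (Math.PI * (i + 1)) / (tiltSteps + 1) - Math.PI / 2;
    for (let j = 0; j < panSteps; j++) {
      const pan = (2 * Math.PI * j) / panSteps;
      views.push({
        x: r * Math.cos(tilt) * Math.sin(pan),
        y: r * Math.sin(tilt),
        z: r * Math.cos(tilt) * Math.cos(pan),
      });
    }
  }
  return views;
}

// 36 pans x 19 tilts gives the 684 views used in the example movie below.
console.log(cameraPositions(36, 19, 1).length); // 684
```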

This approach sacrifices the complete generality of other solutions, but offers numerous advantages.

The JavaScript Encoder

One of our first experiments was to explore a "low-tech" solution using JavaScript to perform client-side dynamic image selection based upon mouse position. We can use any one of a number of visualization applications to generate images for a set of viewpoints, and save the images into raster files. Next, we run an "encoder" that creates a web page containing links to the visualization images stored on the webserver, along with the JavaScript code that performs client-side image selection from the array of images depending upon the position of the mouse. If the viewpoints are chosen in the right way, then the JavaScript interface "feels like" (behaves like) a virtual trackball.
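The heart of such client-side selection is a mapping from mouse drag distance to an image index. A minimal sketch of that logic (function and parameter names are ours, not taken from the actual encoder output):

```javascript
// Map a horizontal drag to a frame index in a ring of numFrames views.
// pixelsPerFrame controls drag sensitivity; the index wraps around so the
// model can spin indefinitely -- a crude virtual trackball for one axis.
function frameForDrag(startFrame, dragPixels, pixelsPerFrame, numFrames) {
  const delta = Math.round(dragPixels / pixelsPerFrame);
  // double modulo keeps the result non-negative for leftward drags
  return (((startFrame + delta) % numFrames) + numFrames) % numFrames;
}

// A page script would call this from a mousemove handler and swap the
// displayed image's src to the frame it returns.
console.log(frameForDrag(0, 50, 10, 36));  // dragging right 50px -> frame 5
console.log(frameForDrag(0, -10, 10, 36)); // dragging left wraps to frame 35
```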

Click here to launch one of the JavaScript demos. After the new browser window appears, you interact with the model by pressing and dragging with the left mouse button. This example shows single-axis rotation about the y-axis. Click here to launch the other JavaScript demo. After the new browser window appears, you interact with the model by pressing and dragging with the left mouse button. This example shows two-axis rotation about both the x- and y-axes. (Note: we occasionally have problems with this particular demo running. Let us know if you have problems at SC04 and we'll try to help.)

The primary advantage of the JavaScript encoder approach is that a standard web browser can be used to do the 3D interaction. There are, however, a couple of noteworthy disadvantages that prohibit its use for "large and complex" 3D interactions. The JavaScript will first preload all the images for the scene into your browser's cache before you can interact with the scene. When the scene contains a small number of images, this limitation is rarely a problem. When the "interactive movie" contains a large number of images - perhaps representing many finely spaced viewpoints to provide good visual fidelity - it is possible your browser will crash or "lock up" once local memory/cache capacity has been exceeded. We believe the browser stores images in uncompressed form in memory, resulting in an explosion of memory consumption. We found the JavaScript approach to be interesting, but of relatively limited use for interactive 3D visualization. We did not attempt to implement the ability to browse through time, due to the memory constraints encountered with even a single time step's worth of visualization results.
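The memory explosion is easy to estimate if, as we believe, the browser holds each preloaded image decompressed. A rough calculation (assuming 4 bytes per pixel for decoded RGBA; the function name is ours) shows why large view sets fail:

```javascript
// Approximate memory, in MB, consumed by numFrames images held
// decompressed in browser memory at width x height resolution.
function preloadMemoryMB(numFrames, width, height, bytesPerPixel) {
  return (numFrames * width * height * bytesPerPixel) / (1024 * 1024);
}

// 684 finely spaced views at 1024x1024, decoded as RGBA:
console.log(preloadMemoryMB(684, 1024, 1024, 4)); // 2736 MB
// -- far beyond what a 2004-era workstation browser could hold.
```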

What is QuickTime VR?

QuickTime VR is a media format created and maintained by Apple Computer. It is a cross-platform, multimedia technology for manipulating, enhancing, and storing video, sound, animation, graphics, text, music, and even 360-degree "virtual reality." It also allows you to stream digital video where the data stream can be either live or stored.

The following few paragraphs were copied verbatim from the International QuickTime VR Association Website, and include a few editorial modifications.

There are two main types of [QuickTime] VR, ["panorama movies" and "object movies."] Both forms are user-navigated movies that can be freely explored on either Windows or Mac. [Movie] content is created either photographically or with 3D software, and usually the environments are photo-realistic interpretations of reality. Often the VR environments come alive with sound and animations in the movies, adding to the richness of the experience.

Object movies simulate viewing an object from many sides, such as holding a small object in your hand with the ability to turn it around. Likewise, it could be a very large object, such as flying around the exterior of the Great Pyramids. VR panoramas simulate environments that we can view around in 360 degrees. These panoramas could be of literally anywhere, a viewer could be on a gorgeous beach on Maui island, or standing on stage at the Royal Albert Hall. By combining objects, panoramas and audio, you can create a virtual tour to tell a story in a unique and dynamic way.

There are various players for viewing VR movies, and they differ greatly in quality and performance. There is Apple Computer's QuickTimeVR (QTVR), PTViewer, and a few other Java-based players. Java will play VR in a limited way on computers that do not have QuickTime. Java-based PTViewer has many limitations compared to QuickTime VR, but offers a basic VR tour and has the best quality of the Java players. QuickTime VR is the most widely recognized player with superior quality and excellent performance. QuickTimeVR is the industry standard for photo-realistic virtual tours. Because QTVR is part of QuickTime, there are many unique features and useful benefits to using it, and it offers an open-ended ability to customize the VR being created. With QuickTimeVR a highly interactive and high resolution VR experience is possible, and this is the wave of the creative future.

This QuickTime VR Panorama movie (borrowed from Apple's website) provides a view inside Grand Central station. This QuickTime VR Panorama movie (borrowed from HRL's website) shows an example of a georectified information presentation system.

There are lots of QuickTime VR movies on the web. Google search for "QuickTime VR" and you'll get more hits "than you can shake a stick at."

QuickTime VR Object Movies

As indicated in the previous section, there are two main kinds of QTVR: panorama and object movies. With a panorama movie, you can rotate your head 360 degrees from a fixed point to look at the world around you. Object movies, on the other hand, are more akin to looking inside a fishbowl from any direction. In other words, you can position your viewpoint at any location on the surface of a sphere while having your viewpoint "look in" at the center of the sphere.

There's another subtle but important distinction: when viewing panorama movies, the QuickTime player will smoothly interpolate images as you change your viewpoint. The QTVR "image source" is a panoramic image, and what you see onscreen is a small subset of the larger panorama. Object movies, on the other hand, consist of an array of discrete images: the QTVR object movie player does no interpolation. That means when you have a large number of images taken from closely spaced views, the motion from one frame to the next in the object movie will seem to be smoother than if you have fewer frames for a given range of view motion.

The QTVR object movie is an excellent way to present 3D visualization results for it permits interactive manipulation to select from different views. The QTVR player also supports a "zoom-in, zoom-out" capability, as well as the ability to "flipbook" through images representing time varying visualization results.

This example shows how QuickTime VR Object movies can be used to deliver interactive, "3D-like" scientific visualization to the desktop. This visualization is a volume rendering of a physics simulation. To create the movie, we rendered the scene from 36*19 different viewpoints, for a total of 684 images. The size of this QuickTime VR object movie is about 14MB. Note that we could have created the object movie with fewer images; in fact, the number of images used to create a QTVR Object Movie is completely arbitrary.

This example shows a QuickTime VR Object movie with only a single axis of interaction (rotating the scene about the y-axis), but with the added dimension of time. Unfortunately, we have not yet seen a QuickTime player that will permit you to pause playback over time. This movie was created using 36 different views over 11 timesteps, for a total of 396 frames. The size of this QTVR object movie is about 7MB.
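The frame counts above suggest how an encoder might address individual frames in a flat array of rendered images. The sketch below is purely illustrative (it does not reflect the actual QTVR container layout; we simply assume a row-major ordering with pan varying fastest):

```javascript
// Hypothetical frame addressing for a view grid stored as a flat array:
// one row per tilt angle, panSteps columns per row, pan varying fastest.
function frameIndex(panIndex, tiltIndex, panSteps) {
  return tiltIndex * panSteps + panIndex;
}

// With the 36-pan x 19-tilt grid from the example above:
console.log(frameIndex(0, 0, 36));   // first view -> frame 0
console.log(frameIndex(35, 18, 36)); // last view -> frame 683 (of 684)
```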

LBNL's contribution in this area is the creation of an encoder that will generate either QuickTime VR Object movies, or JavaScript/HTML that is a coarse approximation to a QuickTime VR object movie in terms of being able to perform virtual, interactive 3D transformations. As the encoder source code stabilizes, we will release it using an Open Source license. We had no luck in locating any freely available tools for generating QuickTime VR object movies.

QTVR Object movies hold much promise for use in remote visualization of large scientific datasets. First, the QTVR object movie creation process can be conducted completely out of core. Note, however, that the source images must be generated by a visualization application, and not all visualization applications are capable of out-of-core execution.

Discussion and Future Work

If you play around with the "zoom-in, zoom-out" feature in QTVR, you'll quickly see there is a problem with fixed image resolution. As you zoom in, the resolution doesn't improve, so you end up seeing magnified pixels from the image source. What would be better is the ability to truly zoom in to higher image resolutions when needed, with the higher resolution images loaded on demand to make effective use of local memory and cache.

The images below illustrate exactly this point. The "main image" is the coarser resolution representation. If you zoom in with the normal QTVR player, you see the fuzzy image on the left. With true multiresolution representation, you see the crisp, highly detailed image on the right.

The image on the left shows the degraded visual quality you receive when zooming in using "regular" QTVR. On the right, zooming in triggers a change in data source so that a higher resolution image source is used for close-up views. This approach - which uses a progressive, multiresolution image model - results in better visual fidelity during data exploration. However, the result is no longer "true QTVR," and a special display client must be run on the workstation.
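At the core of such a progressive, multiresolution model is a rule for picking which resolution level to fetch at a given zoom factor. A minimal sketch (our own illustrative logic, assuming each finer level doubles the resolution of the previous one):

```javascript
// Choose a level in an image pyramid so that one source pixel covers at
// most one screen pixel at the current zoom factor. Level 0 is the
// coarsest "main image"; each finer level is assumed to double resolution.
function levelForZoom(zoom, numLevels) {
  const level = Math.ceil(Math.log2(Math.max(zoom, 1)));
  return Math.min(level, numLevels - 1); // clamp to the finest level we have
}

// At 1x the coarse image suffices; zooming in triggers finer levels:
console.log(levelForZoom(1, 4)); // 0
console.log(levelForZoom(3, 4)); // 2
console.log(levelForZoom(16, 4)); // 3 (clamped to the finest available)
```

A display client built on this rule would fetch only the tiles of the chosen level that intersect the current view, which is what keeps memory use bounded.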

The multiresolution work was inspired by a combination of the need to maintain high visual fidelity during the interactive, 3D visualization experience and the emergence of similar nascent technologies in industry. Viewpoint's ZoomView product implements something like this, but its application is somewhat different from what we have in mind, and it is also proprietary technology.

Contact Information