Preface Comment from Ilmi Yoon:

Just one curiosity: a component has much larger granularity than

an object in terms of reusability or usage itself.   A component is a kind of

package of objects that has interfaces to communicate with other components.

So, components are much more portable and easily reusable without knowing

the programming environment of different components - they can be in different

programming languages, etc. as long as they know the interfaces to each

other... I just feel some discussions are related to object-oriented, not

component-oriented, design. Maybe it is from my ignorance and/or lack of certain

background from the last meeting.

 

 

=============The Survey=========================

 

1) Data Structures/Representations/Management==================

The center of every successful modular visualization architecture has been a flexible core set of data structures for representing data that is important to the targeted application domain.  Before we can begin working on algorithms, we must come to some agreement on common methods (either data structures or accessors/method  calls) for exchanging data between components of our vis framework.

 

There are two potentially disparate motivations for defining the data representation requirements.  In the coarse-grained case, we need to define standards for exchanging data between components in this framework (interoperability).  In the fine-grained case, we want to define some canonical data structures that can be used within a component -- one developed specifically for this framework.  These two use-cases may drive different sets of requirements and implementation issues.

        * Do you feel both of these use cases are equally important or should we focus exclusively on one or the other?

 

Randy: I think that interoperability (both in terms of data and perhaps more

critically operation/interaction) is more critical than fine-grained

data sharing.  My motivation: there is no way that DiVA will be able to

meet all needs initially and in many cases, it may be fine for data to

go "opaque" to the framework once inside a "limb" in the framework (e.g.

VTK could be a limb).  This allows the framework to be easily populated

with a lot of solid code bases and shifts the initial focus on important

interactions (perhaps domain centric).  Over time, I see the fine-grain

stuff coming up, but perhaps proposed by the "limbs" rather than the

framework.  I do feel that the coarse level must take into account

distributed processing however...

 

I want to facilitate interfaces

between packages, opting for (possibly specific) data models that map

to the application at hand.  I could use some generic mechanisms

provided by DiVA to reduce the amount of code I need or bootstrap

more rapid prototyping, but it is not key that the data model be

burned fully into the Framework.  I certainly feel that the Framework

should be able to support more than one data model (since we have

repeatedly illustrated that all "realizable" models have design

boundaries that we will eventually hit).

 

Pat: I think both cases are important, but agreeing upon the fine-grained access

will be harder.

 

JohnC: Too soon to tell. Focus on both until the issues become more clear.

 

Jim: I think for now we need to exclusively focus on exchanging data between

components, rather than any fine-grained generalized data objects...

 

The first order entry into any component development is to "wrap up

what ya got".  The "rip things apart" phase comes after you can glue

all the coarse-grained pieces together reliably...

 

Ilmi: I think we need to decide on the coarse-grained case, something like SOAP that

wraps the internal data in an XML format. But I think we don't need to decide the

fine-grained case, since each component can choose its own way/format and

then publish that format, so the party who wants to use the component needs

to follow the interface. But if we would like to decide on an initial set of formats

that must/may be supported by DiVA components, then we can list the most popular

formats and choose some/all of them.
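
As a rough illustration of the coarse-grained, XML-wrapped exchange Ilmi suggests, the sketch below packs a small array into an XML envelope and unpacks it again. It is not real SOAP; the element names (`dataset`, `dims`, `values`) and the functions `wrap_array` and `unwrap_array` are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

def wrap_array(name, dims, values):
    """Wrap a flat list of floats in a hypothetical XML envelope."""
    root = ET.Element("dataset", {"name": name, "type": "float64"})
    ET.SubElement(root, "dims").text = " ".join(str(d) for d in dims)
    # repr() keeps enough digits for each float to round-trip exactly.
    ET.SubElement(root, "values").text = " ".join(repr(v) for v in values)
    return ET.tostring(root, encoding="unicode")

def unwrap_array(xml_text):
    """Recover (name, dims, values) from the envelope built by wrap_array."""
    root = ET.fromstring(xml_text)
    dims = [int(d) for d in root.find("dims").text.split()]
    values = [float(v) for v in root.find("values").text.split()]
    return root.get("name"), dims, values

message = wrap_array("pressure", (2, 2), [1.0, 2.5, 3.0, 4.25])
name, dims, values = unwrap_array(message)
```

A text envelope like this is language-neutral and easy to inspect, but it is likely to be expensive for large arrays, so a practical format would probably carry bulk data in binary and keep XML for the description.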

 

JohnS: While I am very interested in design patterns, data structures, and services that could make the design of the interior of parallel/distributed components easier, it is clear that the interfaces between components are the central focus of this project.  So the definition of inter-component data exchanges is preeminent.

 

Wes: Both are important. The strongest case, IMO, for the intra-component DS/DM

is that I have a stable set of data modeling/mgt tools that I can use for

families of components. Having a solid DS/DM base will free me to focus

on vis and rendering algorithms, which is how I want to spend my time.

 

The strongest case for the inter-component DS/DM is the "strong typing"

property that makes AVS and apps of its ilk work so well.

 

The "elephant in the living room" is that there is no silver bullet.

I favor an approach that is, by design, incremental. What I mean is that

we can deal with structured grids, unstructured grids, geom and other

renderable data, etc. in a more or less piecemeal fashion with an eye

towards component level interoperability in the long term. In the beginning,

there won't be 100% interoperability as if, for example, all data models

and types were stuffed into a vector bundles interface. OTOH, a more

conciliatory approach will permit forward progress among multiple

independent groups who are all eyeing "interoperability". This is the

real goal, not a "single true data model."

 

        * Do you feel the requirements for each of these use-cases are aligned or will they involve two separate development tracks?  For instance, using "accessors" (method calls that provide abstract access to essentially opaque data structures) will likely work fine for the coarse-grained data exchanges between components, but will lead to inefficiencies if used to implement algorithms within a particular component.

        * As you answer the "implementation and requirements" questions below, please try to identify where coarse-grained and fine-grained use cases will affect the implementation requirements.

 

Randy: I think you hit the nail on the head.  Where necessary, I see sub-portions

of the framework working out the necessary fine-grained, efficient,

"aware" interactions and data structures as needed.  I strongly doubt we

would get that part right initially and think it would lead to some of

the same constraints that are forcing us to re-invent frameworks right

now.  IMHO: the fine-grain stuff must be flexible and dynamic over

time as development and research progress.

 

Pat: I think the focus should be on interfaces rather than data structures.  I

would advocate this approach not just because it's the standard

"object-oriented" way, but because it's the one we followed with FEL,

and now FM, and it has been a big win for us.  It's a significant benefit

not having to maintain different versions of the same visualization

technique, each dedicated to a different method for producing the

data (i.e., different data structures).  So, for example, we use the same

visualization code in both in-core and out-of-core cases.  Assuming up

front that  an interface-based approach would be too slow is, in my

humble opinion, classic premature optimization.
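
A minimal sketch of the interface-first approach Pat describes, in which one routine serves both in-core and out-of-core data. The names (`ScalarField`, `InCoreField`, `OutOfCoreField`, `field_range`) are hypothetical and are not FM's or FEL's actual API; the file layout (raw little-endian float64 values) is also an assumption.

```python
import os
import struct
import tempfile
from abc import ABC, abstractmethod

class ScalarField(ABC):
    """Hypothetical accessor interface: algorithms see only these methods."""
    @abstractmethod
    def size(self): ...
    @abstractmethod
    def value(self, i): ...

class InCoreField(ScalarField):
    """Holds all values in memory."""
    def __init__(self, values):
        self._values = list(values)
    def size(self):
        return len(self._values)
    def value(self, i):
        return self._values[i]

class OutOfCoreField(ScalarField):
    """Reads one float64 at a time from a file instead of loading it all."""
    def __init__(self, path):
        self._path = path
        self._n = os.path.getsize(path) // 8
    def size(self):
        return self._n
    def value(self, i):
        with open(self._path, "rb") as f:
            f.seek(8 * i)
            return struct.unpack("<d", f.read(8))[0]

def field_range(field):
    """One analysis routine shared by both back ends (assumes size() >= 1)."""
    lo = hi = field.value(0)
    for i in range(1, field.size()):
        v = field.value(i)
        lo, hi = min(lo, v), max(hi, v)
    return lo, hi

data = [3.0, -1.5, 7.25, 0.0]
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(struct.pack("<4d", *data))
in_core_result = field_range(InCoreField(data))
out_of_core_result = field_range(OutOfCoreField(tmp.name))
os.remove(tmp.name)
```

The out-of-core class here is deliberately naive (it reopens the file on every access); the point is only that `field_range` never needs to know which back end it was given.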

 

Jim: Two separate development tracks.  Definitely.  There are different driving

design forces and they can be developed (somewhat) independently (I hope).

 

Lori: The TSTT center is not interested in defining a data representation

per se - that is dictating what the data structure will look like.  Rather,

we are interested in defining how data can be accessed in a uniform

way from a wide variety of different data structures (for both structured

and unstructured meshes).  This came about because we recognize

that

  1.  there are a lot of different meshing/data frameworks out there,

       that have many man years of effort behind their development,

       that are not going to change their data structures very easily

       (if at all).  Moreover, these infrastructures have made their

       choices for a reason - if there was a one-size-fits-all answer,

       someone probably would have found it by now :-)

  2.  Because of the difference in data structures - it has been very

       difficult for application scientists (and tool builders) to experiment

       with and/or support different data infrastructures which has

       severely limited their ability to play with different meshing strategies,

       discretization schemes, etc.

 

We are trying to address this latter point: by developing common

interfaces for a variety of infrastructures, applications can easily

experiment with different techniques, and developers of supporting tools

(such as mesh quality improvement and front tracking codes) can

write their tools to a single API and automatically support multiple

infrastructures.

 

We are also experimenting with the language interoperability tools

provided by the Babel team at LLNL and have ongoing work to

evaluate its performance (and the performance of our interface in

general) for fine- and coarse-grained access to mesh (data) entities -

something that I suspect will be of interest to this group as well.

 

JohnC: I think it's premature to say. We need to have agreement on the

questions below first.

 

Ilmi: There will be some overhead and inefficiency using accessors for data

exchange, but I like the approach of accessors and believe the CCA achieves

reusability at the expense of performance, as OOP does anyway. We just try to

make the expense as small as possible.

 

JohnS: Given the focus on inter-component data exchange, I think accessors provide the most straightforward paradigm for data exchange.  The arguments to the data access methods can involve elemental data types rather than composite data structures (e.g., we use scalars and arrays of basic machine data types rather than hierarchical structures).  Therefore we should look closely at FM's API organization as well as the accessors employed by SCIRun V1 (before they employed dynamic compilation).
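
To make the "elemental data types" idea concrete, here is a small sketch of an accessor whose arguments and results are only scalars and flat arrays of machine types. The class `StructuredGridAccessor`, its method names, and the first-index-fastest storage order are all assumptions for illustration; they are not taken from FM or SCIRun.

```python
from array import array

class StructuredGridAccessor:
    """Hypothetical accessor; every argument and result is a scalar or flat array."""

    def __init__(self, dims, values):
        self._dims = array("i", dims)       # grid extent along each axis
        self._values = array("d", values)   # one double per grid point

    def get_rank(self):
        return len(self._dims)

    def get_dims(self):
        return array("i", self._dims)       # copy of a flat integer array

    def get_value(self, ijk):
        """Value at grid index ijk (flat integer array, first index fastest)."""
        if len(ijk) != len(self._dims):
            raise ValueError("index rank does not match grid rank")
        offset, stride = 0, 1
        for idx, n in zip(ijk, self._dims):
            if not 0 <= idx < n:
                raise IndexError("grid index out of range")
            offset += idx * stride
            stride *= n
        return self._values[offset]

    def get_values(self):
        return array("d", self._values)     # copy of the whole data array

grid = StructuredGridAccessor(dims=[3, 2], values=[0, 1, 2, 10, 11, 12])
corner = grid.get_value(array("i", [2, 0]))
interior = grid.get_value(array("i", [1, 1]))
```

Because nothing richer than integers, doubles, and flat arrays crosses the interface, a signature like this should be comparatively easy to express in other languages or across a network boundary.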

 

The accessor method works well for abstracting component location, but requires potentially redundant copying of data for components in the same memory space.  It may be necessary to use reference counting in order to reduce the need to recopy data arrays between co-located components, but I'd really like to avoid making ref counting a mandatory requirement if we can avoid it.  (does anyone know how to avoid redundant data copying between opaque components without employing reference counting?)
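
The copy-versus-share trade-off can be sketched as follows, with a hypothetical `Producer` component offering both a copying accessor and a read-only view onto the same memory. This only illustrates the "no redundant copy" half of the question for co-located components; it does not show a lifetime scheme free of reference counting, since CPython itself keeps the shared buffer alive by counting references behind the scenes.

```python
from array import array

class Producer:
    """Hypothetical component that owns a data array."""

    def __init__(self, values):
        self._data = array("d", values)

    def set_value(self, i, v):
        self._data[i] = v

    def get_data_copy(self):
        return array("d", self._data)                # independent copy

    def get_data_view(self):
        return memoryview(self._data).toreadonly()   # shares the same memory

producer = Producer([1.0, 2.0, 3.0])
copied = producer.get_data_copy()
view = producer.get_data_view()
producer.set_value(0, 99.0)   # a later change made by the owner

# The consumer cannot write through the read-only view.
try:
    view[0] = 5.0
    consumer_can_write = True
except TypeError:
    consumer_can_write = False
```

The copy is isolated from later changes by the owner, while the view sees them immediately; which behavior a consumer should get is itself a design decision for the interface.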

 

Wes: They are aligned to a large degree - data structures/models are produced and

consumed by component code, but may also be manipulated (serialized,

marshalled, etc) by the framework.

 

What are the requirements for the data representations that must be supported by a common infrastructure?  We will start by answering Pat's questions about representation requirements and follow up with personal experiences involving particular domain scientists' requirements.

        Must: support for structured data

 

Randy: Must-at the coarse level, I think this could form the basis of all

other representations.

 

Pat: Structured data support is a must.

 

JohnC: Must

 

Jim: Must

 

JohnS: Must

 

Wes: Agree.

 

        Must/Want: support for multi-block data?

 

Randy: Must-at the coarse level, I think this is key for scalability,

domain decomposition and streaming/multipart data transfer.

 

Pat: We have unstructured data, mostly based on tetrahedral or prismatic meshes.

We need support for at least those types.  I do not think we could simply

graft unstructured data support on top of our structured data structures.

 

JohnC: Must

 

Jim: Must

 

JohnS: Must

 

Wes: Must. We must set targets that meet our needs, and not sacrifice

requirements for speed of implementation.

 

        Must/Want: support for various unstructured data representations? (which ones?)

 

Randy: Nice-but I would be willing to live with an implementation on top

of structured, multi-block (e.g. Exodus).  I feel accessors are

fine for this at the "framework" level (not at the leaves).

 

Pat: We have unstructured data, mostly based on tetrahedral or prismatic meshes.

We need support for at least those types.  I do not think we could simply

graft unstructured data support on top of our structured data structures.

 

JohnC: Not sure.  Not a priority.

 

Jim: Want (low priority)

 

JohnS: Cell-based unstructured representations first.  Need support for arbitrary connectivity eventually, but it is not mandatory.  I liked Iris Explorer's hierarchical model, as it seems more general than the model offered by other vis systems.

 

Wes: Must. Unstructured data reps are widely used and they should not be

excluded from the base set of DS/DM technologies.

 

        Must/Want: support for adaptive grid standards?  Please be specific about which adaptive grid methods you are referring to.  Restricted block-structured AMR (aligned grids), general block-structured AMR (rotated grids), hierarchical unstructured AMR, or non-hierarchical adaptive structured/unstructured meshes.

 

Randy: Similar to my comments on unstructured data reps.  In the long

run, something like boxlib with support for both P and H adaptivity

will be needed (IMHO, VTK might provide this).

 

Pat: Adaptive grid support is a "want" for us currently, probably eventually

a "must".  The local favorite is CART3D, which consists of hierarchical

regular grids.  The messy part is that CART3D also supports having

more-or-less arbitrary shapes in the domain, e.g., an aircraft fuselage.

Handling the shape description and all the "cut cell" intersections

I expect will be a pain.

 

JohnC: Adaptive grid usage is in its infancy at NCAR. But I suspect it is the

way of the future. Too soon to be specific about which adaptive grid

methods are preferred.

 

Jim: Want (low priority).  The AMR folks have been trying to get together and define

a standard API, and have been as yet unsuccessful.  Who are we to attempt

this where they have failed...?

 

JohnS: If we can define the data models rigorously for the individual grid types (ie. structured and unstructured data), then adaptive grid standards really revolve around an infrastructure for indexing data items.  We normally think of indexing datasets by time and by data species.  However, we need to have more general indexing methods that can be used to support concepts of spatial and temporal relationships.  Support for pervasive indexing structures is also important for supporting other visualization features like K-d trees, octrees, and other such methods that are used to accelerate graphics algorithms.  We really should consider how to pass such representations down the data analysis pipeline in a uniform manner because they are used so commonly.
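
One way to picture such a general indexing layer is a small index that locates data blocks by time step, species, refinement level, or spatial region. Everything here (`Box`, `BlockIndex`, the query parameters) is a hypothetical sketch rather than a proposed API, and the linear scan stands in for whatever tree structure a real implementation would use.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box; lo and hi are tuples of equal length."""
    lo: tuple
    hi: tuple

    def intersects(self, other):
        return all(s_lo <= o_hi and o_lo <= s_hi
                   for s_lo, s_hi, o_lo, o_hi
                   in zip(self.lo, self.hi, other.lo, other.hi))

class BlockIndex:
    """Hypothetical index: finds data blocks by time, species, level, or region."""

    def __init__(self):
        self._entries = []  # (time_step, species, level, box, payload)

    def add(self, time_step, species, level, box, payload):
        self._entries.append((time_step, species, level, box, payload))

    def query(self, time_step=None, species=None, level=None, region=None):
        """Return payloads matching every criterion that is not None."""
        hits = []
        for t, s, lev, box, payload in self._entries:
            if time_step is not None and t != time_step:
                continue
            if species is not None and s != species:
                continue
            if level is not None and lev != level:
                continue
            if region is not None and not box.intersects(region):
                continue
            hits.append(payload)
        return hits

index = BlockIndex()
index.add(0, "density", 0, Box((0, 0), (8, 8)), "coarse block")
index.add(0, "density", 1, Box((2, 2), (4, 4)), "fine block A")
index.add(0, "density", 1, Box((6, 6), (8, 8)), "fine block B")
index.add(1, "density", 0, Box((0, 0), (8, 8)), "coarse block, t=1")

near_origin = index.query(time_step=0, region=Box((0, 0), (3, 3)))
fine_only = index.query(time_step=0, level=1)
```

Treating time, species, level, and space as interchangeable query keys is the design choice being illustrated: an AMR hierarchy then becomes a set of ordinary structured blocks plus index entries.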

 

Wes: Want, badly. We could start with Berger-Colella AMR since it is widely

used. I'm not crazy about Boxlib, though, and hope we can do something

that is easier to use.

 

        Must/Want: "vertex-centered" data, "cell-centered" data? other-centered?

 

Randy: Must.

 

Pat: Most of the data we see is still vertex-centered.  FM supports other

associations, but we haven't used them much so far.

 

Jim: Want (low priority)

All of these should be "Wants", to the extent that they require more

sophisticated handling, or are less well-known in terms of generalizing

the interfaces.

 

For example, the AMR folks have been trying to get together and define

a standard API, and have been as yet unsuccessful.  Who are we to attempt

this where they have failed...?

 

So to clarify, if we *really* understand (or think we do) a particular

data representation/organization, or even a specific subset of a general

representation type, then by all means let's whittle an API into our stuff.

Otherwise, leave it alone for someone else to do, or do as strictly needed.

 

JohnS: The accessors must understand (or not preclude) all centering.  This is particularly important for structured grids, where vis systems are typically lax in storing/representing this information.
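
A small sketch of what "understanding centering" could mean for a structured grid: the field carries its centering explicitly, and the number of values is checked against it. The names `Centering`, `expected_count`, and `CenteredField` are hypothetical, and only vertex and cell centering are covered here.

```python
from enum import Enum
from math import prod

class Centering(Enum):
    VERTEX = "vertex"
    CELL = "cell"

def expected_count(vertex_dims, centering):
    """Number of values a structured grid needs for the given centering."""
    if centering is Centering.VERTEX:
        return prod(vertex_dims)
    # Assumption: n vertices along an axis give n - 1 cells, and an axis
    # with a single vertex is treated as flat (one layer of cells).
    return prod(max(n - 1, 1) for n in vertex_dims)

class CenteredField:
    """Hypothetical field that rejects data whose size contradicts its centering."""

    def __init__(self, vertex_dims, centering, values):
        needed = expected_count(vertex_dims, centering)
        if len(values) != needed:
            raise ValueError(
                f"{centering.value}-centered data on {tuple(vertex_dims)} "
                f"needs {needed} values, got {len(values)}")
        self.vertex_dims = tuple(vertex_dims)
        self.centering = centering
        self.values = list(values)

    def get_centering(self):
        return self.centering

vertex_field = CenteredField((4, 3), Centering.VERTEX, [0.0] * 12)
cell_field = CenteredField((4, 3), Centering.CELL, [0.0] * 6)
```

Recording the centering next to the data means a downstream component does not have to guess it from the array length, which is where the ambiguity usually arises.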

 

Wes: Don't care - will let someone else answer this.

 

Note: It sounds like at least time-varying data handling is well understood by the people who want it.

 

        Must: support time-varying data, sequenced, streamed data?

 

Randy: Must, but way too much to say here to do it justice.  I will say

that the core must deal with time-varying/sequenced data.  Streaming

might be able to be placed on top of that, if it is designed

properly.  I will add that we have a need for progressive data as

well.

 

Pat: Support for time-varying data is a must.

 

JohnC: Must. Time varying data is what makes so many of our problems currently

intractable. Too many of the available tools (e.g. VTK) assume static

data and completely fall apart when the data is otherwise.

 

Defin