Preface Comment from Ilmi Yoon:
Just one curiosity: a component has much larger granularity than an
object in terms of reusability or usage. A component is a kind of
package of objects that has interfaces to communicate with other
components. So components are much more portable and easily reusable
without knowing the programming environment of other components - they
can be written in different programming languages, etc., as long as
they know the interfaces to each other... I just feel some discussions
are about object-oriented, not component-oriented, design. Maybe this is
from my ignorance and/or lack of certain background from the last
meeting.
=============The Survey=========================
1) Data Structures/Representations/Management==================
The center of every successful modular
visualization architecture has been a flexible core set of data structures for
representing data that is important to the targeted application domain. Before we can begin working on
algorithms, we must come to some agreement on common methods (either data
structures or accessors/method
calls) for exchanging data between components of our vis framework.
There are two potentially disparate
motivations for defining the data representation requirements. In the coarse-grained case, we need to
define standards for exchanging data between components in this framework
(interoperability). In the
fine-grained case, we want to define some canonical data structures that can
be used within a component -- one developed specifically for this
framework. These two use-cases may
drive different sets of requirements and implementation issues.
* Do you feel both of
these use cases are equally important or should we focus exclusively on one or
the other?
Randy: I think that interoperability (both
in terms of data and perhaps more
critically operation/interaction) is more
critical than fine-grained
data sharing. My motivation: there is no way that DiVA will be able to
meet all needs initially and in many cases,
it may be fine for data to
go "opaque" to the framework once
inside a "limb" in the framework (e.g.
VTK could be a limb). This allows the framework to be easily
populated
with a lot of solid code bases and shifts
the initial focus to important
interactions (perhaps domain centric). Over time, I see the fine-grain
stuff coming up, but perhaps proposed by the
"limbs" rather than the
framework. I do feel that the coarse level must take into account
distributed processing however...
I want to facilitate interfaces
between packages, opting for (possibly
specific) data models that map
to the application at hand. I could use some generic mechanisms
provided by DiVA to reduce the amount of
code I need or bootstrap
more rapid prototyping, but it is not key
that the data model be
burned fully into the Framework. I certainly feel that the Framework
should be able to support more than one data
model (since we have
repeatedly illustrated that all
"realizable" models have design
boundaries that we will eventually hit).
Pat: I think both cases are important, but
agreeing upon the fine-grained access
will be harder.
John C: Too soon to tell. Focus on both
until the issues become more clear.
Jim: I think for now we need to exclusively
focus on exchanging data between
components, rather than any fine-grained
generalized data objects...
The first order entry into any component
development is to "wrap up
what ya got". The "rip things apart" phase
comes after you can glue
all the coarse-grained pieces together
reliably...
Ilmi: I think we need to decide on the coarse-grained exchange, something
like SOAP, which wraps the internal data in an XML format. But I think we
don't need to decide the fine-grained one, since each component can choose
its own way/format and then publish that format, so the party who wants to
use the component needs to follow the interface. But if we would like to
decide on an initial set of formats that must/may be supported by DiVA
components, then we can list the most popular formats and choose some/all
of them.
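
To make the "wrap the internal data in XML" idea concrete, here is a minimal
Python sketch. It is not SOAP itself and not a proposed DiVA format: the
element and attribute names (envelope, field, dims, dtype) and both function
names are assumptions made up for illustration.

```python
import math
import xml.etree.ElementTree as ET


def wrap_field(name, dims, values):
    """Wrap a flat list of floats in a hypothetical XML envelope."""
    if math.prod(dims) != len(values):
        raise ValueError("dims do not match the number of values")
    envelope = ET.Element("envelope")  # hypothetical element names throughout
    field = ET.SubElement(
        envelope, "field",
        name=name, dtype="float64", dims=" ".join(str(d) for d in dims),
    )
    # repr() of a Python float round-trips exactly through float().
    field.text = " ".join(repr(float(v)) for v in values)
    return ET.tostring(envelope, encoding="unicode")


def unwrap_field(xml_text):
    """Recover (name, dims, values) from the envelope produced above."""
    field = ET.fromstring(xml_text).find("field")
    dims = tuple(int(d) for d in field.get("dims").split())
    values = [float(v) for v in (field.text or "").split()]
    return field.get("name"), dims, values
```

The receiving component only needs to know the published envelope layout, not
how the sender stores the data internally. A text encoding like this is costly
for large arrays, which is one reason the fine-grained case is treated
separately in this survey.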
JohnS: While I am very interested
in design patterns, data structures, and services that could make the design of
the interior of parallel/distributed components easier, it is clear that the
interfaces between components are the central focus of this project. So the definition of inter-component
data exchanges is preeminent.
Wes: Both are important. The strongest case, IMO, for the
intra-component DS/DM
is that I have a stable set of data
modeling/mgt tools that I can use for
families of components. Having a solid
DS/DM base will free me to focus
on vis and rendering algorithms, which is
how I want to spend my time.
The strongest case for the inter-component
DS/DM is the "strong typing"
property that makes AVS and apps of its
ilk work so well.
The "elephant in the living
room" is that there is no silver bullet.
I favor an approach that is, by design,
incremental. What I mean is that
we can deal with structured grids,
unstructured grids, geom and other
renderable data, etc. in a more or less
piecemeal fashion with an eye
towards component level interoperability
in the long term. In the beginning,
there won't be 100% interoperability as
if, for example, all data models
and types were stuffed into a vector
bundles interface. OTOH, a more
conciliatory approach will permit forward
progress among multiple
independent groups who are all eyeing
"interoperability". This is the
real goal, not a "single true data model."
* Do you feel the requirements for each of these use-cases are aligned or will they involve two separate development tracks? For instance, using "accessors" (method calls that provide abstract access to essentially opaque data structures) will likely work fine for the coarse-grained data exchanges between components, but will lead to inefficiencies if used to implement algorithms within a particular component.
* As you answer the "implementation and requirements" questions below, please try to identify where coarse-grained and fine-grained use cases will affect the implementation requirements.
Randy: I think you hit the nail on the
head. Where necessary, I see
sub-portions
of the framework working out the necessary
fine-grained, efficient,
"aware" interactions and
data structures as needed. I
strongly doubt we
would get that part right initially and
think it would lead to some of
the same constraints that are forcing us to
re-invent frameworks right
now.
IMHO: the fine-grain stuff must be flexible and dynamic over
time as development and research progress.
Pat: I think the focus should be on
interfaces rather than data structures.
I
would advocate this approach not just
because it's the standard
"object-oriented" way, but
because it's the one we followed with FEL,
and now FM, and it has been a big win for
us. It's a significant benefit
not having to maintain different versions of
the same visualization
technique, each dedicated to a different
method for producing the
data (i.e., different data
structures). So, for example, we
use the same
visualization code in both in-core and
out-of-core cases. Assuming up
front that an interface-based approach would be too slow is, in my
humble opinion, classic premature
optimization.
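
As a rough illustration of the interface-first approach described above (and
not FM's or FEL's actual API), the Python sketch below writes one "technique"
against an abstract interface and runs it unchanged on in-core and out-of-core
data. The names ScalarField, InCoreField, OutOfCoreField and field_range are
hypothetical, as is the raw little-endian float64 file layout.

```python
import struct
from abc import ABC, abstractmethod


class ScalarField(ABC):
    """Hypothetical interface: the only thing algorithms are allowed to see."""

    @abstractmethod
    def __len__(self):
        """Number of samples in the field."""

    @abstractmethod
    def value_at(self, i):
        """Return sample i as a float."""


class InCoreField(ScalarField):
    """All samples held in memory."""

    def __init__(self, values):
        self._values = [float(v) for v in values]

    def __len__(self):
        return len(self._values)

    def value_at(self, i):
        return self._values[i]


class OutOfCoreField(ScalarField):
    """Reads one little-endian float64 per call from a raw binary file."""

    _ITEM = struct.Struct("<d")

    def __init__(self, path, count):
        self._path = path
        self._count = count

    def __len__(self):
        return self._count

    def value_at(self, i):
        if not 0 <= i < self._count:
            raise IndexError(i)
        with open(self._path, "rb") as f:
            f.seek(i * self._ITEM.size)
            return self._ITEM.unpack(f.read(self._ITEM.size))[0]


def field_range(field):
    """Stand-in for a visualization technique, written once against ScalarField."""
    if len(field) == 0:
        raise ValueError("empty field")
    lo = hi = field.value_at(0)
    for i in range(1, len(field)):
        v = field.value_at(i)
        lo, hi = min(lo, v), max(hi, v)
    return lo, hi
```

Calling field_range on either implementation gives the same answer. The
per-sample file open in OutOfCoreField is deliberately naive; a real
implementation would cache or page data behind the same interface.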
Jim: Two separate development tracks. Definitely. There are different driving
design forces and they can be developed
(somewhat) independently (I hope).
Lori: The TSTT center is not interested in
defining a data representation
per se - that is dictating what the data
structure will look like. Rather,
we are interested in defining how data can
be accessed in a uniform
way from a wide variety of different data
structures (for both structured
and unstructured meshes). This came about because we recognize
that
1. there are a lot of
different meshing/data frameworks out there,
that have many man years of
effort behind their development,
that are not going to change
their data structures very easily
(if at all). Moreover, these infrastructures have
made their
choices for a reason - if
there was a one-size-fits-all answer,
someone probably would have
found it by now :-)
2. Because of the
difference in data structures - it has been very
difficult for application
scientists (and tool builders) to experiment
with and/or support different
data infrastructures which has
severely limited their ability
to play with different meshing strategies,
discretization schemes, etc.
We are trying to address this latter point:
by developing common
interfaces for a variety of infrastructures,
applications can easily
experiment with different techniques, and
supporting tool developers
(such as those writing mesh quality improvement
and front tracking codes) can
write their tools to a single API and
automatically support multiple
infrastructures.
We are also experimenting with the language
interoperability tools
provided by the Babel team at LLNL and have
ongoing work to
evaluate its performance (and the
performance of our interface in
general) for fine and coarse grained access
to mesh (data) entities -
something that I suspect will be of interest
to this group as well.
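
The "single API over many existing data structures" idea can be sketched as
thin adapters that leave each package's storage untouched. This is an
assumption-laden Python illustration, not the TSTT interface: MeshAccess, the
two adapter classes and bounding_box are hypothetical names, and the two
storage layouts are invented stand-ins for existing infrastructures.

```python
from abc import ABC, abstractmethod


class MeshAccess(ABC):
    """Hypothetical uniform accessor interface; not the TSTT interface itself."""

    @abstractmethod
    def num_vertices(self):
        """Number of vertices in the mesh."""

    @abstractmethod
    def vertex_coords(self, i):
        """Return vertex i as an (x, y, z) tuple."""


class InterleavedMeshAdapter(MeshAccess):
    """Wraps storage laid out as [x0, y0, z0, x1, y1, z1, ...] without copying."""

    def __init__(self, xyz):
        self._xyz = xyz

    def num_vertices(self):
        return len(self._xyz) // 3

    def vertex_coords(self, i):
        return tuple(self._xyz[3 * i:3 * i + 3])


class SeparateArraysMeshAdapter(MeshAccess):
    """Wraps storage that keeps one array per coordinate axis."""

    def __init__(self, xs, ys, zs):
        self._xs, self._ys, self._zs = xs, ys, zs

    def num_vertices(self):
        return len(self._xs)

    def vertex_coords(self, i):
        return (self._xs[i], self._ys[i], self._zs[i])


def bounding_box(mesh):
    """A tool written once against MeshAccess; works with either adapter."""
    n = mesh.num_vertices()
    if n == 0:
        raise ValueError("mesh has no vertices")
    coords = [mesh.vertex_coords(i) for i in range(n)]
    lo = tuple(min(c[axis] for c in coords) for axis in range(3))
    hi = tuple(max(c[axis] for c in coords) for axis in range(3))
    return lo, hi
```

Neither underlying layout changes; only the adapter knows how to reach into it.
The per-vertex call overhead is the fine-grained access cost that the
performance evaluation mentioned above would need to measure.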
JohnC: I think it's premature to say. We
need to have agreement on the
questions below first.
Ilmi: There will be some overhead and inefficiency in using accessors for
data exchange, but I like the accessor approach and believe the CCA
achieves reusability at the expense of performance, as OOP does anyway.
We just try to make that expense as small as possible.
JohnS: Given the focus on inter-component
data exchange, I think accessors provide the most straightforward paradigm for
data exchange. The arguments to the
data access methods can involve elemental data types rather than composite data
structures (e.g., we use scalars and arrays of basic machine data types rather
than hierarchical structures).
Therefore we should look closely at FM's API organization as well as the
accessors employed by SCIRun V1 (before they employed dynamic compilation).
The accessor method works well for
abstracting component location, but requires potentially redundant copying of
data for components in the same memory space. It may be necessary to use reference counting in order to
reduce the need to recopy data arrays between co-located components, but I'd
really like to avoid making ref counting a mandatory requirement if we can
avoid it. (Does anyone know how to
avoid redundant data copying between opaque components without employing
reference counting?)
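
A minimal Python sketch of the trade-off just described: accessors that pass
only elemental types, with one path that copies and one that shares the buffer
between co-located components. StructuredBlock and its method names are
hypothetical. Note that this does not answer the open question above, since
Python's runtime itself reference-counts the shared buffer; it only shows the
copy versus view distinction.

```python
from array import array


class StructuredBlock:
    """Hypothetical producer-side component holding one block of float64 data."""

    def __init__(self, dims, values):
        self._dims = tuple(int(d) for d in dims)
        self._data = array("d", values)
        expected = 1
        for d in self._dims:
            expected *= d
        if expected != len(self._data):
            raise ValueError("dims do not match the number of values")

    # Accessors take and return only elemental types: ints and flat arrays.
    def get_rank(self):
        return len(self._dims)

    def get_dims(self):
        return self._dims

    def get_data_view(self):
        """Zero-copy, read-only view for a consumer in the same memory space."""
        return memoryview(self._data).toreadonly()

    def get_data_copy(self):
        """Independent copy (raw bytes), as a remote consumer would receive."""
        return self._data.tobytes()
```

A consumer holding the view sees later changes made by the producer, while a
consumer holding the copy does not. That difference in semantics, and not only
the cost of copying, is something the accessor design would have to specify.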
Wes: They are aligned to a large degree -
data structures/models are produced and
consumed by component code, but may also
be manipulated (serialized,
marshalled, etc) by the framework.
What are the requirements for the data representations that must be supported by a common infrastructure? We will start by answering Pat's questions about representation requirements and follow up with personal experiences involving particular domain scientists' requirements.
Must: support for structured data
Randy: Must-at the coarse level, I think
this could form the basis of all
other representations.
Pat: Structured data support is a must.
JohnC: Must
Jim: Must
JohnS: Must
Wes: Agree.
Must/Want: support for multi-block data?
Randy: Must-at the coarse level, I think
this is key for scalability,
domain decomposition and streaming/multipart
data transfer.
Pat: We have unstructured data, mostly
based on tetrahedral or prismatic meshes.
We need support for at least those
types. I do not think we could
simply
graft unstructured data support on top of
our structured data structures.
JohnC: Must
Jim: Must
JohnS: Must
Wes: Must. We must set targets that meet
our needs, and not sacrifice
requirements for speed of implementation.
Must/Want: support for various unstructured data representations? (which ones?)
Randy: Nice-but I would be willing to live
with an implementation on top
of structured, multi-block (e.g.
Exodus). I feel accessors are
fine for this at the "framework"
level (not at the leaves).
Pat: We have unstructured data, mostly
based on tetrahedral or prismatic meshes.
We need support for at least those
types. I do not think we could
simply
graft unstructured data support on top of
our structured data structures.
JohnC: Not sure. Not a priority.
Jim: Want (low priority)
JohnS: Cell-based unstructured representations initially. Need
support for arbitrary connectivity eventually, but not mandatory. I liked Iris Explorer's hierarchical
model as it seems more general than the model offered by other vis systems.
Wes: Must. Unstructured data reps are
widely used and they should not be
excluded from the base set of DS/DM
technologies.
Must/Want: support for adaptive grid standards? Please be
specific about which adaptive grid methods you are referring to: restricted block-structured AMR
(aligned grids), general block-structured AMR (rotated grids), hierarchical
unstructured AMR, or non-hierarchical adaptive structured/unstructured meshes.
Randy: Similar to my comments on
unstructured data reps. In the
long
run, something like boxlib with support for
both P and H adaptivity
will be needed (IMHO, VTK might provide
this).
Pat: Adaptive grid support is a
"want" for us currently, probably eventually
a "must". The local favorite is CART3D, which
consists of hierarchical
regular grids. The messy part is that CART3D also supports having
more-or-less arbitrary shapes in the
domain, e.g., an aircraft fuselage.
Handling the shape description and all the
"cut cell" intersections
I expect will be a pain.
JohnC: Adaptive grid usage is in its infancy
at NCAR. But I suspect it is the
way of the future. Too soon to be specific
about which adaptive grid
methods are preferred.
Jim: Want (low priority). The AMR folks
have been trying to get together and define
a standard API, and have been as yet
unsuccessful. Who are we to
attempt
this where they have failed...?
JohnS: If we can define the data models
rigorously for the individual grid types (i.e., structured and unstructured
data), then adaptive grid standards really revolve around an infrastructure for
indexing data items. We normally
think of indexing datasets by time and by data species. However, we need to have more general
indexing methods that can be used to support concepts of spatial and temporal
relationships. Support for
pervasive indexing structures is also important for supporting other
visualization features like K-d trees, octrees, and other such methods that are
used to accelerate graphics algorithms.
We really should consider how to pass such representations down the data
analysis pipeline in a uniform manner because they are used so commonly.
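
One way to picture "adaptive grids as an indexing problem" is the Python
sketch below: blocks of an already-defined grid type are indexed by time step
and refinement level, with a spatial query on top. Block, BlockIndex and
data_id are hypothetical names, the boxes are axis-aligned (the restricted
block-structured case), and the linear scan stands in for the octree or k-d
tree a real system would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    level: int     # refinement level, 0 = coarsest
    lo: tuple      # physical lower corner
    hi: tuple      # physical upper corner (exclusive)
    data_id: str   # opaque handle to the block's data, owned elsewhere

    def contains(self, point):
        return all(l <= p < h for l, p, h in zip(self.lo, point, self.hi))


class BlockIndex:
    """Indexes blocks by time step, refinement level, and spatial location."""

    def __init__(self):
        self._by_time = {}

    def add(self, time_step, block):
        self._by_time.setdefault(time_step, []).append(block)

    def blocks_at(self, time_step, level=None):
        blocks = self._by_time.get(time_step, [])
        return [b for b in blocks if level is None or b.level == level]

    def finest_containing(self, time_step, point):
        """Return the highest-level block covering point, or None."""
        hits = [b for b in self.blocks_at(time_step) if b.contains(point)]
        return max(hits, key=lambda b: b.level, default=None)
```

The index stores only handles, so the same structure could be passed down a
pipeline alongside structured or unstructured block data without either side
knowing the other's layout.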
Wes: Want, badly. We could start with
Berger-Colella AMR since it is widely
used. I'm not crazy about Boxlib, though,
and hope we can do something
that is easier to use.
Must/Want: "vertex-centered" data, "cell-centered" data? other-centered?
Randy: Must.
Pat: Most of the data we see is still vertex-centered. FM supports other
associations, but we haven't used them
much so far.
Jim: Want (low priority)
All of these should be "Wants",
to the extent that they require more
sophisticated handling, or are less
well-known in terms of generalizing
the interfaces.
For example, the AMR folks have been
trying to get together and define
a standard API, and have been as yet
unsuccessful. Who are we to
attempt
this where they have failed...?
So to clarify, if we *really* understand
(or think we do) a particular
data representation/organization, or even a
specific subset of a general
representation type, then by all means let's
whittle an API into our stuff.
Otherwise, leave it alone for someone else
to do, or do as strictly needed.
JohnS: The accessors must understand (or not preclude) all
centerings. This is particularly important
for structured grids, where vis systems are typically lax in
storing/representing this information.
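
A small Python sketch of what "storing centering explicitly" could mean for a
structured grid: the centering is declared with the data and checked against
the grid size. Centering, expected_value_count and check_field are
hypothetical names, and only vertex and cell centering are covered here; face
or edge centering would need further cases.

```python
from enum import Enum
from math import prod


class Centering(Enum):
    VERTEX = "vertex"
    CELL = "cell"


def expected_value_count(vertex_dims, centering):
    """Number of data values a structured grid needs for a given centering.

    vertex_dims is the number of vertices along each axis.
    """
    if any(n < 2 for n in vertex_dims):
        raise ValueError("each axis needs at least two vertices")
    if centering is Centering.VERTEX:
        return prod(vertex_dims)
    if centering is Centering.CELL:
        # One cell between each pair of neighboring vertices along every axis.
        return prod(n - 1 for n in vertex_dims)
    raise ValueError(f"unsupported centering: {centering!r}")


def check_field(vertex_dims, centering, values):
    """Reject data whose length contradicts its declared centering."""
    expected = expected_value_count(vertex_dims, centering)
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    return True
```

For a grid with 4 x 3 x 2 vertices this gives 24 vertex-centered values but
only 6 cell-centered values, so a consumer that guesses the centering from the
array alone can silently misplace the data.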
Wes: Don't care - will let someone else
answer this.
Note: It sounds like at least time-varying data handling is well understood by the people who want it.
Must: support time-varying data, sequenced, streamed data?
Randy: Must, but way too much to say here to
do it justice. I will say
that the core must deal with time-varying/sequenced
data. Streaming
might be able to be placed on top of that,
if it is designed
properly. I will add that we have a need for progressive data as
well.
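
The layering suggested above, with sequenced time steps in the core and
streaming built on top, might look like the following Python sketch. TimeSeries,
stream and the loader callback are hypothetical names, not part of any
existing framework.

```python
class TimeSeries:
    """Core abstraction: random access to a sequence of time steps."""

    def __init__(self, times, loader):
        self._times = list(times)  # simulation time of each step
        self._loader = loader      # callable: step index -> data for that step

    def num_steps(self):
        return len(self._times)

    def time_of(self, step):
        return self._times[step]

    def get_step(self, step):
        return self._loader(step)


def stream(series, start=0, stop=None):
    """Streaming layered on the core: yields (time, data) one step at a time."""
    stop = series.num_steps() if stop is None else stop
    for step in range(start, stop):
        yield series.time_of(step), series.get_step(step)
```

Because stream is a generator, each step is loaded only when the consumer asks
for it, so a data set that does not fit in memory can still be traversed.
Progressive data would need a further refinement dimension that this sketch
does not model.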
Pat: Support for time-varying data is a
must.
JohnC: Must. Time varying data is what makes
so many of our problems currently
intractable. Too many of the available tools
(e.g. VTK) assume static
data and completely fall apart when the data
is otherwise.
Defin