Data sets containing millions of records, each with dozens or hundreds of variables, are increasingly common in many domains, including earth and space science, medicine, telecommunications, and commerce. A key problem accompanying these very large data sets is finding effective tools for gaining insight into structures, trends, and anomalies in the data. In this talk, I will present some of our recent research on designing and implementing visualization techniques and interactive tools for exploring multivariate data sets that have been hierarchically clustered or partitioned. The visualizations convey summaries of each cluster, and the navigation tools allow selective drill-down and roll-up for different subsets of the hierarchy. Filtering and distortion techniques are integrated to let users interactively reduce clutter and focus on regions of interest. I will demonstrate these concepts and techniques using XmdvTool, a public-domain visualization package developed at WPI to support the data exploration process. To conclude, I will briefly describe some of our related work on database techniques that support interactive data exploration, as well as our plans for future research.
Snacks will be provided.
See Conundrum Talks for more information about this series.