Data Analytics and Visualization Vignettes

Data analytics and visualization play an integral role in the scientific process, offering a way to see the unseen by creating images of experimental data or theoretical simulation results. The projects listed on this page represent recent or current collaborative efforts between the CRD Data Analytics and Visualization Group and others performing scientific research in both simulation/computational and experimental sciences.

Jump to year 2017, 2015, 2014, 2013, 2012. For previous research highlights, please visit the group’s archival visualization vignettes page.


Nested Tracking Graphs Show Relationship of Features in Simulations Over Time

New technique makes it easier to see how features in simulations are related and how they evolve. (More information)

Web-based Visual Exploration Improves Radiological Source Detection

We developed a web-based system that integrates data management and visual analysis to facilitate access to large amounts of observational data, supporting algorithm development for finding radiological material. (More information)

NWB:N – Beta of Neurodata Without Borders 2.0 Format and Software Released

Neurodata Without Borders: Neurophysiology (NWB:N) is a project to develop a unified data format for cellular-based neurophysiology data. The NWB:N team consists of neuroscientists and software developers who recognize that adoption of a unified data format is an important step toward breaking down the barriers to data sharing in neuroscience. Neuroscientists can now explore a beta version of the new Neurodata Without Borders: Neurophysiology (NWB:N 2.0) software and format.

The 2.0 software version was developed by Lawrence Berkeley National Laboratory’s Oliver Ruebel and Andrew Tritt, in close collaboration with Kristofer Bouchard (Berkeley Lab), Loren Frank (UCSF), Eddie Chang (UCSF), and the broader Neurodata Without Borders (NWB) community. The beta update was released in November 2017 in conjunction with the 2017 Society for Neuroscience meeting in Washington, D.C.

For more information, download the PDF of the poster by clicking on the image on the left, and see the following LBNL news article. The NWB:N format schema and the PyNWB Python API are publicly available on GitHub.

BASTet: Berkeley Analysis and Storage Toolkit

BASTet is a novel framework for shareable and reproducible data analysis that supports standardized data and analysis interfaces, integrated data storage, data provenance, workflow management, and a broad set of integrated tools. BASTet was motivated by the critical need to enable mass spectrometry imaging (MSI) researchers to share, reuse, reproduce, validate, interpret, and apply common and new analysis methods.

For more information, see the BASTet homepage as well as the IEEE TVCG article on BASTet. BASTet has been released to the public; see the sources (GitHub) and the online documentation.


OpenMSI Receives R&D 100 Award

The OpenMSI software/service developed by Oliver Ruebel and Benjamin P. Bowen (LBNL) was selected as “One of the 100 Most Technologically Significant New Products of the Year in Software/Services.”

For more information see the following LBNL news article. For a full list of award winners see here. OpenMSI has been released to the public and is available online at

BrainFormat: A Data Standardization Framework for Neuroscience Data

Thanks to standardized image file formats such as JPEG, PNG, or TIFF, people can easily share selfies and other pictures with anybody connected to a computer, mobile phone, or the Internet, without having to download any special software to see the pictures. But in many science fields, such as neuroscience, sharing data is not that simple because no standard data format exists. The BrainFormat library developed at LBNL specifies a general data format standardization framework based on the widely used HDF5 format and implements a novel file format for management and storage of neuroscience data, in particular electrocorticography (ECoG) data.

For more information see the following LBNL news article as well as the Frontiers in Neuroinformatics article. The BrainFormat library has been released to the public and is available online at


Visible-Wavelength Polarized Light Emission with Small-Diameter InN Nanowires

Using NERSC’s Cray XC30 supercomputer “Edison,” University of Michigan researchers found that the semiconductor indium nitride (InN), which typically emits infrared light, will emit green light if reduced to 1-nanometer-wide wires. Moreover, just by varying their sizes, these nanostructures could be tailored to emit different colors of light, which could lead to more natural-looking white lighting while avoiding some of the efficiency loss today’s LEDs experience at high power. The simulation data were visualized and analyzed with custom tools based on VTK and ParaView, and the final results were rendered with POV-Ray. The visualizations appeared on the cover of the ACS journal Nano Letters and on the NERSC website, and were made into a stereoscopic 3D presentation for SC14. (More information)

Petascale Study of Ion Foreshock Dynamics

This research leverages petascale simulations of the Earth’s magnetosphere to study the physics and dynamics of space weather phenomena. A radial interplanetary magnetic field (IMF) configuration was studied. Such configurations are common during high-intensity solar storms, when the solar wind becomes highly turbulent. The study is based upon one of the largest global simulations of the Earth’s magnetosphere ever made. The simulation ran on 150,000 cores for 368 hours, for a total of approximately 55.2 million core hours. The fields and derived quantities are kept on a 1024x2048x1024 grid, and 1 trillion particles were simulated. The simulation produced more than 128 TB of analysis data. The NERSC DAS group applied parallel visualization and analysis techniques using ParaView, VTK, and POV-Ray, including interactive and batch volume rendering on Edison using ParaView on 256 cores and batch-mode ray tracing using POV-Ray on 36,000 cores. (More information)
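As a rough sanity check on the scale of this run, the quoted figures can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the variable names are illustrative, "1 trillion" is taken as a round 10^12, and the particles-per-cell value is a simple average derived here, not a number reported by the study.

```python
# Back-of-the-envelope check of the simulation figures quoted above.
# All inputs come from the text; variable names are illustrative only.

cores = 150_000
wall_clock_hours = 368
core_hours = cores * wall_clock_hours  # total compute budget

grid = (1024, 2048, 1024)              # field grid dimensions
cells = grid[0] * grid[1] * grid[2]    # total number of grid cells

particles = 10**12                     # "1 trillion", assumed to be a round figure
particles_per_cell = particles / cells # simple average, not reported by the study

print(f"core hours:         {core_hours:,}")
print(f"grid cells:         {cells:,}")
print(f"particles per cell: {particles_per_cell:.0f}")
```

The product of cores and wall-clock hours matches the stated total of about 55.2 million core hours, and the grid works out to roughly 2.1 billion cells.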


OpenMSI: A Science Gateway to Sort Through Bio-Imaging’s Big Datasets

OpenMSI is a collaborative research effort with the goal of making high-performance, advanced data management, model building, analysis, and visualization resources for mass spectrometry imaging accessible to scientists via the web. The development and application of cutting-edge analytical methods is a core driver for new scientific discoveries, medical diagnostics, and commercial innovation. Mass spectrometry imaging (MSI) holds promise as a transformative technology for advanced studies of metabolic processes, with broad applications in life sciences, bioenergy, and health. MSI enables fast assays of microbial metabolism essential in bioengineering problems common to the development of biofuels, drugs, and diagnostics for cancer and other diseases. While the data can be routinely collected, the broad application of MSI is currently limited by the lack of easily accessible analysis methods that can process data of the size, volume, diversity, and complexity generated by MSI experiments. The OpenMSI project aims to overcome these challenges and enable broad use of MSI by researchers by providing a web-based gateway for the management and storage of MSI data, the visualization of the hyper-dimensional contents of the data, and statistical analysis.

More information is available in the following LBNL and news articles as well as on the OpenMSI project website.


High Performance Visualization: Enabling Extreme Scale Scientific Insight

Visualization and analysis tools, techniques, and algorithms have undergone a rapid evolution in recent decades to accommodate explosive growth in data size and complexity and to exploit emerging multi- and many-core computational platforms. High Performance Visualization: Enabling Extreme-Scale Scientific Insight focuses on the subset of scientific visualization concerned with algorithm design, implementation, and optimization for use on today’s largest computational platforms. (More information)

Streamline Integration using MPI-Hybrid Parallelism on a Large, Multi-Core Architecture

Studies show the hybrid-parallel implementation of streamline integration, a key visualization algorithm, performs better and moves less data than a traditional MPI-only implementation. (More information)

Efficient Parallel Extraction of Crack-free Isosurfaces from Adaptive Mesh Refinement Data

This work addresses the efficient parallel extraction of crack-free isosurfaces from adaptive mesh refinement (AMR) data in a distributed-memory setting. (More information)