Visualization of large remote data sets has traditionally involved copying the data to your local disk before viewing it. As data set sizes grow, and network throughput becomes as fast as or faster than disk throughput, this is no longer a necessary or desirable paradigm. However, obtaining good wide-area I/O performance is not trivial and requires a good deal of tuning.
To address this situation, we developed the Distributed-Parallel Storage System (DPSS), a scalable, high-performance, distributed-parallel data storage system built from low-cost commodity hardware components. This talk will describe the architecture of the DPSS and the DPSS client library, and explain how the DPSS is "network-aware", automatically tuning TCP parameters to current network conditions.
Snacks will be provided.
See Conundrum Talks for more information about this series.