Abstract
We present a collection of techniques for exploiting latent I/O asynchrony that can substantially improve performance in data-intensive parallel applications. Latent asynchrony refers to an application’s tolerance for decoupling ancillary operations from its core computation, a property of HPC codes not fully explored by current HPC I/O systems. Decoupling operations such as buffering and staging, reorganization, and format conversion from core codes, in both space and time, can shorten I/O phases and preserve valuable MPP compute cycles. In this paper we describe DataTaps, IOgraphs, and Metabots, three tools that allow HPC developers to implement decoupled I/O operations. Using these tools, asynchrony can be exploited by data generators that overlap computation with communication, and by data consumers that perform data conversion and reorganization out-of-band and on demand. In the context of a data-intensive fusion simulation, we show that exploiting latent asynchrony through decoupling of operations can provide significant performance benefits.
