Resource Management for Processing Wide Area Data Streams on Supercomputers

Date
2020-08-01
Authors
Chung, Joaquin
Adhikari, Mainak
Srirama, Satish Narayana
Jung, Eun Sung
Kettimuthu, Rajkumar
Abstract
Modern scientific instruments generate enormous amounts of data. Typically, the data collected from an instrument are stored in one or more files that are then moved to a distant supercomputer for processing, and the final results are sent back to the user. To make effective use of time on expensive instruments, experimenters want to process the data as they are generated, streaming them from the instrument's memory directly to a supercomputer's memory for analysis. Because the compute nodes in a supercomputer are not connected directly to the wide area network, the data streams must pass through intermediate gateway nodes. Unlike best-effort file transfers, data streaming applications require resources at a specific time for a specific period. In this paper, we present a system model for enabling data streaming through gateway nodes and an algorithm to efficiently allocate gateway node resources along with compute nodes. We evaluate the algorithm using real-world traces on the Chameleon Cloud. The results show that our system can schedule compute and gateway resources efficiently for streaming analysis.
Keywords
data streaming, online resource provisioning, wide area networks
Citation
Proceedings - International Conference on Computer Communications and Networks, ICCCN. v.2020-August