Lightweight Provenance Service for High-Performance Computing

Dong Dai; Yong Chen; Philip Carns; John Jenkins; Robert Ross

doi:10.1109/PACT.2017.14


Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Scopus citations

Abstract

Provenance describes detailed information about the history of a piece of data, containing the relationships among elements such as users, processes, jobs, and workflows that contribute to the existence of data. Provenance is key to supporting many data management functionalities that are increasingly important in operations such as identifying data sources, parameters, or assumptions behind a given result; auditing data usage; or understanding details about how inputs are transformed into outputs. Despite its importance, however, provenance support is largely underdeveloped in highly parallel architectures and systems. One major challenge is the demanding requirements of providing provenance service in situ. The need to remain lightweight and to be always on often conflicts with the need to be transparent and offer an accurate catalog of details regarding the applications and systems. To tackle this challenge, we introduce a lightweight provenance service, called LPS, for high-performance computing (HPC) systems. LPS leverages a kernel instrument mechanism to achieve transparency and introduces representative execution and flexible granularity to capture comprehensive provenance with controllable overhead. Extensive evaluations and use cases have confirmed its efficiency and usability. We believe that LPS can be integrated into current and future HPC systems to support a variety of data management needs.

Original language	English
Title of host publication	Proceedings - 26th International Conference on Parallel Architectures and Compilation Techniques, PACT 2017
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	117-129
Number of pages	13
ISBN (Electronic)	9781467395243
DOIs	https://doi.org/10.1109/PACT.2017.14
State	Published - Oct 31 2017
Event	26th International Conference on Parallel Architectures and Compilation Techniques, PACT 2017 - Portland, United States (Duration: Sep 9 2017 → Sep 13 2017)

Publication series

Name	Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
Volume	2017-September
ISSN (Print)	1089-795X

Conference

Conference	26th International Conference on Parallel Architectures and Compilation Techniques, PACT 2017
Country/Territory	United States
City	Portland
Period	09/09/17 → 09/13/17

Keywords

Data management
High-performance computing
Lightweight
Provenance
Rich metadata

Access to Document

10.1109/PACT.2017.14

Cite this

Dai, D., Chen, Y., Carns, P., Jenkins, J., & Ross, R. (2017). Lightweight Provenance Service for High-Performance Computing. In Proceedings - 26th International Conference on Parallel Architectures and Compilation Techniques, PACT 2017 (pp. 117-129). (Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT; Vol. 2017-September). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/PACT.2017.14

Dai, Dong ; Chen, Yong ; Carns, Philip et al. / Lightweight Provenance Service for High-Performance Computing. Proceedings - 26th International Conference on Parallel Architectures and Compilation Techniques, PACT 2017. Institute of Electrical and Electronics Engineers Inc., 2017. pp. 117-129 (Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT).

@inproceedings{5fd8d594f9ac443582ab7d467791d6e2,

title = "Lightweight Provenance Service for High-Performance Computing",

abstract = "Provenance describes detailed information about the history of a piece of data, containing the relationships among elements such as users, processes, jobs, and workflows that contribute to the existence of data. Provenance is key to supporting many data management functionalities that are increasingly important in operations such as identifying data sources, parameters, or assumptions behind a given result; auditing data usage; or understanding details about how inputs are transformed into outputs. Despite its importance, however, provenance support is largely underdeveloped in highly parallel architectures and systems. One major challenge is the demanding requirements of providing provenance service in situ. The need to remain lightweight and to be always on often conflicts with the need to be transparent and offer an accurate catalog of details regarding the applications and systems. To tackle this challenge, we introduce a lightweight provenance service, called LPS, for high-performance computing (HPC) systems. LPS leverages a kernel instrument mechanism to achieve transparency and introduces representative execution and flexible granularity to capture comprehensive provenance with controllable overhead. Extensive evaluations and use cases have confirmed its efficiency and usability. We believe that LPS can be integrated into current and future HPC systems to support a variety of data management needs.",

keywords = "Data management, High-performance computing, Lightweight, Provenance, Rich metadata",

author = "Dong Dai and Yong Chen and Philip Carns and John Jenkins and Robert Ross",

note = "Publisher Copyright: {\textcopyright} 2017 IEEE.; 26th International Conference on Parallel Architectures and Compilation Techniques, PACT 2017 ; Conference date: 09-09-2017 Through 13-09-2017",

year = "2017",

month = oct,

day = "31",

doi = "10.1109/PACT.2017.14",

language = "English",

series = "Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "117--129",

booktitle = "Proceedings - 26th International Conference on Parallel Architectures and Compilation Techniques, PACT 2017",

}

Dai, D, Chen, Y, Carns, P, Jenkins, J & Ross, R 2017, Lightweight Provenance Service for High-Performance Computing. in Proceedings - 26th International Conference on Parallel Architectures and Compilation Techniques, PACT 2017. Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT, vol. 2017-September, Institute of Electrical and Electronics Engineers Inc., pp. 117-129, 26th International Conference on Parallel Architectures and Compilation Techniques, PACT 2017, Portland, United States, 09/9/17. https://doi.org/10.1109/PACT.2017.14

Lightweight Provenance Service for High-Performance Computing. / Dai, Dong; Chen, Yong; Carns, Philip et al.
Proceedings - 26th International Conference on Parallel Architectures and Compilation Techniques, PACT 2017. Institute of Electrical and Electronics Engineers Inc., 2017. p. 117-129 (Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT; Vol. 2017-September).


TY - GEN

T1 - Lightweight Provenance Service for High-Performance Computing

AU - Dai, Dong

AU - Chen, Yong

AU - Carns, Philip

AU - Jenkins, John

AU - Ross, Robert

PY - 2017/10/31

Y1 - 2017/10/31

AB - Provenance describes detailed information about the history of a piece of data, containing the relationships among elements such as users, processes, jobs, and workflows that contribute to the existence of data. Provenance is key to supporting many data management functionalities that are increasingly important in operations such as identifying data sources, parameters, or assumptions behind a given result; auditing data usage; or understanding details about how inputs are transformed into outputs. Despite its importance, however, provenance support is largely underdeveloped in highly parallel architectures and systems. One major challenge is the demanding requirements of providing provenance service in situ. The need to remain lightweight and to be always on often conflicts with the need to be transparent and offer an accurate catalog of details regarding the applications and systems. To tackle this challenge, we introduce a lightweight provenance service, called LPS, for high-performance computing (HPC) systems. LPS leverages a kernel instrument mechanism to achieve transparency and introduces representative execution and flexible granularity to capture comprehensive provenance with controllable overhead. Extensive evaluations and use cases have confirmed its efficiency and usability. We believe that LPS can be integrated into current and future HPC systems to support a variety of data management needs.

KW - Data management

KW - High-performance computing

KW - Lightweight

KW - Provenance

KW - Rich metadata

UR - http://www.scopus.com/inward/record.url?scp=85043572310&partnerID=8YFLogxK

U2 - 10.1109/PACT.2017.14

DO - 10.1109/PACT.2017.14

M3 - Conference contribution

AN - SCOPUS:85043572310

T3 - Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT

SP - 117

EP - 129

BT - Proceedings - 26th International Conference on Parallel Architectures and Compilation Techniques, PACT 2017

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 26th International Conference on Parallel Architectures and Compilation Techniques, PACT 2017

Y2 - 9 September 2017 through 13 September 2017

ER -

Dai D, Chen Y, Carns P, Jenkins J, Ross R. Lightweight Provenance Service for High-Performance Computing. In Proceedings - 26th International Conference on Parallel Architectures and Compilation Techniques, PACT 2017. Institute of Electrical and Electronics Engineers Inc. 2017. p. 117-129. (Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT). doi: 10.1109/PACT.2017.14
