Dynamic Resource Shaping for Compute Clusters

Carra, Damiano
2019-01-01

Abstract

Data centers are largely under-utilized because resource allocation is based on reservation mechanisms that ignore actual resource utilization. Indeed, it is common to reserve resources for peak demand, which may occur only for a small portion of the application lifetime. As a consequence, cluster resources often sit idle. In this work, we propose a mechanism that improves the utilization and responsiveness of compute clusters while preventing application failures due to contention for finite resources such as RAM. Our method monitors resource utilization and employs a data-driven approach to resource demand forecasting that quantifies the uncertainty in its predictions. Using the demand forecast and its confidence, our mechanism modulates the cluster resources assigned to running applications, and reduces turnaround time by more than one order of magnitude while keeping application failures under control. Thus, tenants enjoy a responsive system and providers benefit from efficient cluster utilization.
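To make the idea of shaping allocations from a forecast and its confidence concrete, here is a minimal sketch. It is not the authors' method: the abstract does not state the forecasting model, so the sketch substitutes a naive window mean for the forecast and the sample standard deviation for the uncertainty. The function name `shaped_allocation`, the safety factor `k`, and the units are all hypothetical.

```python
from statistics import mean, stdev


def shaped_allocation(samples, reserved, k=2.0):
    """Suggest an allocation from recent usage samples (hypothetical helper).

    samples  -- recent resource usage measurements, e.g. RAM in GB
    reserved -- the amount originally reserved by the tenant (upper bound)
    k        -- assumed safety factor applied to the uncertainty estimate
    """
    if len(samples) < 2:
        # Not enough history to estimate uncertainty: keep the reservation.
        return reserved

    forecast = mean(samples)      # stand-in for a real demand forecast
    uncertainty = stdev(samples)  # stand-in for predictive uncertainty

    # Allocate the forecast plus a margin that grows with uncertainty,
    # never below the most recent observed usage.
    target = max(forecast + k * uncertainty, samples[-1])

    # Never hand out more than the tenant reserved.
    return min(reserved, target)


if __name__ == "__main__":
    usage_gb = [2.0, 4.0, 6.0]
    print(shaped_allocation(usage_gb, reserved=16.0))
```

The margin term reflects the trade-off described in the abstract: a larger `k` frees fewer resources but lowers the risk of failures caused by memory contention, while a smaller `k` does the opposite.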
2019
978-1-7281-2772-9
cluster management
machine learning
Files in this item:
main.pdf (authorized users only)
Type: Pre-print document
License: Restricted access
Size: 547.38 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1000823