Exploring characteristics of inter-cluster machines and cloud applications on Google clusters
Abstract
Modern cluster management systems have been evolving to cope with running and managing diverse cloud applications on heterogeneous computing clusters. Consequently, the system behaviours become complex and non-trivial to explain. In this paper we take the recently published Google trace data set version 3 (V3) as a case study to explore various aspects of inter-cluster differences. We analyse the distribution of the underlying physical machine resources, e.g. the number and types of machines, and metrics of computational job requests, e.g. job duration, utilisation and Cycles Per Instruction (CPI). We also apply an unsupervised learning algorithm to the metrics to characterise jobs. Our analysis suggests that the composition of the underlying machine resources in different cells can be substantially different, and that cells with similar machine resource structures can utilise resources differently depending on the characteristics of job requests.
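As an illustration only, the following minimal sketch shows how per-job metrics of the kind named in the abstract could be turned into features and grouped without labels. The abstract does not say which unsupervised algorithm the authors use; plain k-means is assumed here purely for demonstration. The job records, the function names `cpi` and `kmeans`, and the choice of two clusters are all hypothetical and are not taken from the trace or the paper.

```python
from statistics import mean


def cpi(cycles, instructions):
    """Cycles Per Instruction: CPU cycles divided by instructions executed."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return cycles / instructions


def kmeans(points, k, iterations=20):
    """Plain k-means on equal-length tuples; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return labels, centroids


# Hypothetical per-job records (not real trace data):
# (duration in hours, mean CPU utilisation, cycles, instructions)
jobs = [
    (0.1, 0.05, 2.0e9, 2.0e9),
    (24.0, 0.60, 9.0e12, 3.0e12),
    (0.2, 0.04, 4.4e9, 4.0e9),
    (30.0, 0.70, 1.2e13, 4.0e12),
]

# One feature vector per job: duration, utilisation and CPI.
features = [(d, u, cpi(cyc, ins)) for d, u, cyc, ins in jobs]
labels, centroids = kmeans(features, k=2)
print(labels)
```

In this toy input the two short, lightly loaded jobs fall into one group and the two long-running jobs into the other. The features are left unscaled, so duration dominates the distance; a real analysis would normally normalise each metric first.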
Citation
Lin, Y, Barker, A D & Ceesay, S 2020, Exploring characteristics of inter-cluster machines and cloud applications on Google clusters. in The 4th Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications (BPOD). IEEE Computer Society, IEEE International Conference on Big Data - IEEE BigData 2020, 10/12/20.
Publication
The 4th Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications (BPOD)
Type
Conference item
Description
Funding: ABC project (Adaptive Brokerage for the Cloud) funded by UK EPSRC EP/R010528/1.