Exploring characteristics of inter-cluster machines and cloud applications on Google clusters

Modern cluster management systems have evolved to run and manage diverse cloud applications on heterogeneous computing clusters. Consequently, system behaviour has become complex and non-trivial to explain. In this paper we take the recently published Google trace data set version 3 (V3) as a case study to explore various aspects of inter-cluster differences. We analyse the distribution of the underlying physical machine resources, e.g. the number and types of machines, and metrics of computational job requests, e.g. job duration, utilisation and Cycles Per Instruction (CPI). We also apply an unsupervised learning algorithm to these metrics to characterise jobs. Our analysis suggests that the composition of the underlying machine resources can differ substantially between cells, and that cells with similar machine resource structures can utilise resources differently depending on the characteristics of job requests.

Citation

Lin, Y, Barker, A D & Ceesay, S 2020, Exploring characteristics of inter-cluster machines and cloud applications on Google clusters. in The 4th Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications (BPOD). IEEE Computer Society, IEEE International Conference on Big Data - IEEE BigData 2020, 10/12/20.

Publication

The 4th Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications (BPOD)

Type

Conference item

Description

Funding: ABC project (Adaptive Brokerage for the Cloud) funded by UK EPSRC EP/R010528/1.

Collections

University of St Andrews Research

URI

https://hdl.handle.net/10023/21190