Using unsupervised machine learning for fault identification in virtual machines

Schneider, Christopher

View/Open

ChrisSchneiderPhDThesis.pdf (1.355Mb)

Date

2015

Abstract

Self-healing systems promise operating cost reductions in large-scale computing environments through the automated detection of, and recovery from, faults. However, at present there appears to be little known empirical evidence comparing the different approaches, or demonstrations that such implementations reduce costs. This thesis compares previous and current self-healing approaches before demonstrating a new, unsupervised approach that combines artificial neural networks with performance tests to perform fault identification in an automated fashion, i.e. the correct and accurate determination of which computer features are associated with a given performance test failure. Several key contributions are made in the course of this research including an analysis of the different types of self-healing approaches based on their contextual use, a baseline for future comparisons between self-healing frameworks that use artificial neural networks, and a successful, automated fault identification in cloud infrastructure, and more specifically virtual machines. This approach uses three established machine learning techniques: Naïve Bayes, Baum-Welch, and Contrastive Divergence Learning. The latter demonstrates minimisation of human-interaction beyond previous implementations by producing a list in decreasing order of likelihood of potential root causes (i.e. fault hypotheses) which brings the state of the art one step closer toward fully self-healing systems. This thesis also examines the impact of that different types of faults have on their respective identification. This helps to understand the validity of the data being presented, and how the field is progressing, whilst examining the differences in impact to identification between emulated thread crashes and errant user changes – a contribution believed to be unique to this research. Lastly, future research avenues and conclusions in automated fault identification are described along with lessons learned throughout this endeavor. This includes the progression of artificial neural networks, how learning algorithms are being developed and understood, and possibilities for automatically generating feature locality data.

Type

Thesis, PhD Doctor of Philosophy

Rights

Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International

http://creativecommons.org/licenses/by-nc-nd/4.0/

Collections

Computer Science Theses

URI

https://hdl.handle.net/10023/7327

Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International

Except where otherwise noted within the work, this item's licence for re-use is described as Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International