Item metadata

dc.contributor.advisor: Varghese, Blesson
dc.contributor.author: Wu, Di
dc.coverage.spatial: 213
dc.date.accessioned: 2024-11-11T20:53:41Z
dc.date.available: 2024-11-11T20:53:41Z
dc.date.issued: 2024-12-03
dc.identifier.uri: https://hdl.handle.net/10023/30919
dc.description.abstract: The demand for distributed machine learning (DML) systems, which distribute training workloads across multiple nodes, has surged over the past decade due to the rapid growth of datasets and computational requirements. At the same time, executing ML training at the edge has gained importance for preserving data privacy and reducing the communication costs of sending raw data to the cloud. These trends have motivated a new ML training paradigm: DML at the edge. However, implementing DML systems at the edge presents four key challenges: (i) limited and heterogeneous hardware resources at the edge result in impractical training times; (ii) the communication costs of DML systems at the edge are substantial; (iii) on-device training cannot be carried out on low-end devices; and (iv) there is no comprehensive framework that tackles the above challenges and supports efficient DML systems at the edge.

This thesis presents four techniques to address these challenges. First, it proposes an adaptive deep neural network (DNN) partitioning and offloading technique that addresses limited device resources with cloud assistance. This DNN partitioning-based federated learning (DPFL) system is further optimized by a reinforcement learning agent to adapt to heterogeneous devices. The thesis then introduces pre-training initialization and a replay buffer to reduce gradient and activation communication, identified as bottlenecks in a DPFL system. Additionally, a dual-phase layer-freezing technique is proposed to minimize on-device computation. Finally, a holistic framework is developed that integrates these techniques, maximizing their application and impact. The proposed framework supports building a new DPFL system that is more efficient than classic DML at the edge. Experimental evaluation on two real-world testbeds, across various datasets and model architectures, demonstrates the improvements of the proposed DML system on a range of quality and performance metrics, such as final accuracy, training latency, and communication cost.
dc.language.iso: en
dc.rights: Creative Commons Attribution-NonCommercial 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/
dc.title: Distributed machine learning on edge computing systems
dc.type: Thesis
dc.contributor.sponsor: Rakuten Mobile, Inc.
dc.type.qualificationlevel: Doctoral
dc.type.qualificationname: PhD Doctor of Philosophy
dc.publisher.institution: The University of St Andrews
dc.identifier.doi: https://doi.org/10.17630/sta/1164


Except where otherwise noted within the work, this item's licence for re-use is Creative Commons Attribution-NonCommercial 4.0 International.