Item metadata

dc.contributor.advisor: Varghese, Blesson
dc.contributor.author: Wu, Di
dc.coverage.spatial: 213
dc.date.accessioned: 2024-11-11T20:53:41Z
dc.date.available: 2024-11-11T20:53:41Z
dc.date.issued: 2024-12-03
dc.identifier.uri: https://hdl.handle.net/10023/30919
dc.description.abstract: The demand for distributed machine learning (DML) systems, which distribute training workloads across multiple nodes, has surged over the past decade due to the rapid growth of datasets and computational requirements. At the same time, executing ML training at the edge has gained importance for preserving data privacy and reducing the communication costs of sending raw data to the cloud. These trends have motivated a new ML training paradigm: DML at the edge. However, implementing DML systems at the edge presents four key challenges: (i) limited and heterogeneous hardware resources at the edge result in impractical training times; (ii) the communication costs of DML systems at the edge are substantial; (iii) on-device training cannot be carried out on low-end devices; and (iv) there is no comprehensive framework that tackles the above challenges and supports efficient DML systems at the edge.

This thesis presents four techniques to address these challenges. First, it proposes an adaptive deep neural network (DNN) partitioning and offloading technique that addresses limited device resources with cloud assistance. This DNN partitioning-based federated learning (DPFL) system is further optimized by a reinforcement learning agent to adapt to heterogeneous devices. The thesis then introduces pre-training initialization and a replay buffer to reduce gradient and activation communication, identified as bottlenecks in a DPFL system. Additionally, a dual-phase layer-freezing technique is proposed to minimize on-device computation. Finally, a holistic framework is developed that integrates these techniques, maximizing their application and impact. The proposed framework supports building a new DPFL system that is more efficient than classic DML at the edge. Experimental evaluation on two real-world testbeds, across various datasets and model architectures, demonstrates the improvements of the proposed DML system on a range of quality and performance metrics, such as final accuracy, training latency, and communication cost.
dc.language.iso: en
dc.rights: Creative Commons Attribution-NonCommercial 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/
dc.title: Distributed machine learning on edge computing systems
dc.type: Thesis
dc.contributor.sponsor: Rakuten Mobile, Inc.
dc.type.qualificationlevel: Doctoral
dc.type.qualificationname: PhD Doctor of Philosophy
dc.publisher.institution: The University of St Andrews
dc.identifier.doi: https://doi.org/10.17630/sta/1164


Except where otherwise noted within the work, this item's licence for re-use is Creative Commons Attribution-NonCommercial 4.0 International.