Item metadata

dc.contributor.author: Zhang, Zihan
dc.contributor.author: Rodgers, Philip
dc.contributor.author: Kilpatrick, Peter
dc.contributor.author: Spence, Ivor
dc.contributor.author: Varghese, Blesson
dc.date.accessioned: 2024-07-08T15:30:16Z
dc.date.available: 2024-07-08T15:30:16Z
dc.date.issued: 2024-11
dc.identifier: 303931145
dc.identifier: aff69919-034d-4dc2-8ee1-46ee39e948d4
dc.identifier.citation: Zhang, Z, Rodgers, P, Kilpatrick, P, Spence, I & Varghese, B 2024, 'PiPar: Pipeline parallelism for collaborative machine Learning', Journal of Parallel and Distributed Computing, vol. 193, 104947. https://doi.org/10.1016/j.jpdc.2024.104947 [en]
dc.identifier.issn: 0743-7315
dc.identifier.uri: https://hdl.handle.net/10023/30110
dc.description: Funding: This work was sponsored by Rakuten Mobile, Inc., Japan. [en]
dc.description.abstract: Collaborative machine learning (CML) techniques, such as federated learning, have been proposed to train deep learning models across multiple mobile devices and a server. CML techniques are privacy-preserving as a local model that is trained on each device instead of the raw data from the device is shared with the server. However, CML training is inefficient due to low resource utilization. We identify idling resources on the server and devices due to sequential computation and communication as the principal cause of low resource utilization. A novel framework PiPar that leverages pipeline parallelism for CML techniques is developed to substantially improve resource utilization. A new training pipeline is designed to parallelize the computations on different hardware resources and communication on different bandwidth resources, thereby accelerating the training process in CML. A low overhead automated parameter selection method is proposed to optimize the pipeline, maximizing the utilization of available resources. The experimental results confirm the validity of the underlying approach of PiPar and highlight that when compared to federated learning: (i) the idle time of the server can be reduced by up to 64.1×, and (ii) the overall training time can be accelerated by up to 34.6× under varying network conditions for a collection of six small and large popular deep neural networks and four datasets without sacrificing accuracy. It is also experimentally demonstrated that PiPar achieves performance benefits when incorporating differential privacy methods and operating in environments with heterogeneous devices and changing bandwidths.
dc.format.extent: 17
dc.format.extent: 6185727
dc.language.iso: eng
dc.relation.ispartof: Journal of Parallel and Distributed Computing [en]
dc.subject: Collaborative machine learning [en]
dc.subject: Resource utilization [en]
dc.subject: Pipeline parallelism [en]
dc.subject: Edge computing [en]
dc.subject: QA75 Electronic computers. Computer science [en]
dc.subject: NDAS [en]
dc.subject.lcc: QA75 [en]
dc.title: PiPar: Pipeline parallelism for collaborative machine Learning [en]
dc.type: Journal article [en]
dc.contributor.institution: University of St Andrews. School of Computer Science [en]
dc.identifier.doi: 10.1016/j.jpdc.2024.104947
dc.description.status: Peer reviewed [en]