Item metadata

dc.contributor.author: Majeed, Ayesha Abdul
dc.contributor.author: Kilpatrick, Peter
dc.contributor.author: Spence, Ivor
dc.contributor.author: Varghese, Blesson
dc.contributor.editor: Ardagna, Claudio Agostino
dc.contributor.editor: Bian, Hongyi
dc.contributor.editor: Chang, Carl K.
dc.contributor.editor: Chang, Rong N.
dc.contributor.editor: Damiani, Ernesto
dc.contributor.editor: Elia, Gabriele
dc.contributor.editor: He, Qiang
dc.contributor.editor: Puig, Vicenç
dc.contributor.editor: Ward, Robert
dc.contributor.editor: Xhafa, Fatos
dc.contributor.editor: Zhang, Jia
dc.date.accessioned: 2022-05-23T10:30:09Z
dc.date.available: 2022-05-23T10:30:09Z
dc.date.issued: 2022-08-24
dc.identifier.citation: Majeed, A A, Kilpatrick, P, Spence, I & Varghese, B 2022, CONTINUER: maintaining distributed DNN services during edge failures. in C A Ardagna, H Bian, C K Chang, R N Chang, E Damiani, G Elia, Q He, V Puig, R Ward, F Xhafa & J Zhang (eds), 2022 IEEE International Conference on Edge Computing and Communications (EDGE 2022), 9860277, IEEE International Conference on Edge Computing and Communications, IEEE Computer Society, IEEE International Conference on Edge Computing and Communications, Barcelona, Spain, 11/07/22. https://doi.org/10.1109/EDGE55608.2022.00029 (en)
dc.identifier.citation: conference (en)
dc.identifier.isbn: 9781665481410
dc.identifier.isbn: 9781665481403
dc.identifier.issn: 2767-990X
dc.identifier.other: PURE: 279710703
dc.identifier.other: PURE UUID: a04788c4-ec47-4d37-b42a-fab98805cc5b
dc.identifier.uri: http://hdl.handle.net/10023/25431
dc.description: Funding: A. A. Majeed is supported by a Schlumberger Scholarship and B. Varghese by a Royal Society Short Industry Fellowship. (en)
dc.description.abstract: Partitioning and deploying Deep Neural Networks (DNNs) across edge nodes may be used to meet performance objectives of applications. However, the failure of a single node may result in cascading failures that will adversely impact the delivery of the service and will result in failure to meet specific objectives. The impact of these failures needs to be minimised at runtime. Three techniques are explored in this paper, namely repartitioning, early-exit and skip-connection. When an edge node fails, the repartitioning technique will repartition and redeploy the DNN, thus avoiding the failed nodes. The early-exit technique makes provision for a request to exit early, before the failed node. The skip-connection technique dynamically routes the request by skipping the failed nodes. This paper will leverage trade-offs in accuracy, end-to-end latency and downtime for selecting the best technique given user-defined objectives (accuracy, latency and downtime thresholds) when an edge node fails. To this end, CONTINUER is developed. Two key activities of the framework are estimating the accuracy and latency when using the techniques for distributed DNNs and selecting the best technique. It is demonstrated on a lab-based experimental testbed that CONTINUER estimates accuracy and latency when using the techniques with an average error of no more than 0.28% and 13.06%, respectively, and selects the suitable technique with a low overhead of no more than 16.82 milliseconds and an accuracy of up to 99.86%.
dc.format.extent: 10
dc.language.iso: eng
dc.publisher: IEEE Computer Society
dc.relation.ispartof: 2022 IEEE International Conference on Edge Computing and Communications (EDGE 2022) (en)
dc.relation.ispartofseries: IEEE International Conference on Edge Computing and Communications (en)
dc.rights: Copyright © 2022 IEEE. This work has been made available online in accordance with publisher policies or with permission. Permission for further reuse of this content should be sought from the publisher or the rights holder. This is the author created accepted manuscript following peer review and may differ slightly from the final published version. The final published version of this work is available at https://doi.org/10.1109/EDGE55608.2022.00029. (en)
dc.subject: Distributed DNNs (en)
dc.subject: Edge computing (en)
dc.subject: Failures (en)
dc.subject: QA75 Electronic computers. Computer science (en)
dc.subject: NS (en)
dc.subject.lcc: QA75 (en)
dc.title: CONTINUER: maintaining distributed DNN services during edge failures (en)
dc.type: Conference item (en)
dc.description.version: Postprint (en)
dc.contributor.institution: University of St Andrews. School of Computer Science (en)
dc.identifier.doi: https://doi.org/10.1109/EDGE55608.2022.00029