CONTINUER : maintaining distributed DNN services during edge failures

Majeed, Ayesha Abdul; Kilpatrick, Peter; Spence, Ivor; Varghese, Blesson

Files in this item

Name: CONTINUER_acceptedversion.pdf
Size: 1.134Mb
Format: PDF

Item metadata

dc.contributor.author	Majeed, Ayesha Abdul
dc.contributor.author	Kilpatrick, Peter
dc.contributor.author	Spence, Ivor
dc.contributor.author	Varghese, Blesson
dc.contributor.editor	Ardagna, Claudio Agostino
dc.contributor.editor	Bian, Hongyi
dc.contributor.editor	Chang, Carl K.
dc.contributor.editor	Chang, Rong N.
dc.contributor.editor	Damiani, Ernesto
dc.contributor.editor	Elia, Gabriele
dc.contributor.editor	He, Qiang
dc.contributor.editor	Puig, Vicenç
dc.contributor.editor	Ward, Robert
dc.contributor.editor	Xhafa, Fatos
dc.contributor.editor	Zhang, Jia
dc.date.accessioned	2022-05-23T10:30:09Z
dc.date.available	2022-05-23T10:30:09Z
dc.date.issued	2022-08-24
dc.identifier	279710703
dc.identifier	a04788c4-ec47-4d37-b42a-fab98805cc5b
dc.identifier	000861398600016
dc.identifier	85146201388
dc.identifier.citation	Majeed, A A, Kilpatrick, P, Spence, I & Varghese, B 2022, CONTINUER : maintaining distributed DNN services during edge failures. In C A Ardagna, H Bian, C K Chang, R N Chang, E Damiani, G Elia, Q He, V Puig, R Ward, F Xhafa & J Zhang (eds), 2022 IEEE International conference on edge computing and communications (EDGE 2022), 9860277, IEEE International conference on edge computing and communications, IEEE Computer Society, Piscataway, NJ, pp. 143-152, IEEE International Conference on Edge Computing and Communications, Barcelona, Spain, 11/07/22. https://doi.org/10.1109/EDGE55608.2022.00029	en
dc.identifier.citation	conference	en
dc.identifier.isbn	9781665481410
dc.identifier.isbn	9781665481403
dc.identifier.issn	2767-990X
dc.identifier.uri	https://hdl.handle.net/10023/25431
dc.description	Funding: A. A. Majeed is supported by a Schlumberger Scholarship and B. Varghese by a Royal Society Short Industry Fellowship.	en
dc.description.abstract	Partitioning and deploying Deep Neural Networks (DNNs) across edge nodes may be used to meet performance objectives of applications. However, the failure of a single node may result in cascading failures that will adversely impact the delivery of the service and will result in failure to meet specific objectives. The impact of these failures needs to be minimised at runtime. Three techniques are explored in this paper, namely repartitioning, early-exit and skip-connection. When an edge node fails, the repartitioning technique will repartition and redeploy the DNN, thus avoiding the failed nodes. The early-exit technique makes provision for a request to exit (early) before the failed node. The skip-connection technique dynamically routes the request by skipping the failed nodes. This paper will leverage trade-offs in accuracy, end-to-end latency and downtime for selecting the best technique given user-defined objectives (accuracy, latency and downtime thresholds) when an edge node fails. To this end, CONTINUER is developed. Two key activities of the framework are estimating the accuracy and latency when using the techniques for distributed DNNs and selecting the best technique. It is demonstrated on a lab-based experimental testbed that CONTINUER estimates accuracy and latency when using the techniques with no more than an average error of 0.28% and 13.06%, respectively, and selects the suitable technique with a low overhead of no more than 16.82 milliseconds and an accuracy of up to 99.86%.
dc.format.extent	10
dc.format.extent	1189954
dc.language.iso	eng
dc.publisher	IEEE Computer Society
dc.relation.ispartof	2022 IEEE International conference on edge computing and communications (EDGE 2022)	en
dc.relation.ispartofseries	IEEE International conference on edge computing and communications	en
dc.subject	Distributed DNNs	en
dc.subject	Edge computing	en
dc.subject	Failures	en
dc.subject	QA75 Electronic computers. Computer science	en
dc.subject	NS	en
dc.subject	MCC	en
dc.subject.lcc	QA75	en
dc.title	CONTINUER : maintaining distributed DNN services during edge failures	en
dc.type	Conference item	en
dc.contributor.institution	University of St Andrews. School of Computer Science	en
dc.identifier.doi	10.1109/EDGE55608.2022.00029
dc.identifier.url	https://ieeexplore.ieee.org/xpl/conhome/9860271/proceeding	en
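
Illustrative sketch (not part of the repository record): the abstract describes selecting one of three techniques (repartitioning, early-exit, skip-connection) against user-defined accuracy, latency and downtime thresholds. The Python below is a minimal sketch of such a threshold-based selection. All names (`Estimate`, `Objectives`, `select_technique`), the tie-breaking rule and the numbers in the usage example are assumptions made for illustration; the record does not describe how CONTINUER itself estimates or ranks the techniques.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Estimate:
    """Hypothetical per-technique estimate after a node failure."""
    technique: str
    accuracy: float      # estimated accuracy, in percent
    latency_ms: float    # estimated end-to-end latency
    downtime_ms: float   # estimated service downtime


@dataclass(frozen=True)
class Objectives:
    """User-defined thresholds, as named in the abstract."""
    min_accuracy: float
    max_latency_ms: float
    max_downtime_ms: float


def meets(e: Estimate, o: Objectives) -> bool:
    """True if the estimate satisfies all three thresholds."""
    return (
        e.accuracy >= o.min_accuracy
        and e.latency_ms <= o.max_latency_ms
        and e.downtime_ms <= o.max_downtime_ms
    )


def select_technique(
    estimates: Iterable[Estimate], objectives: Objectives
) -> Optional[Estimate]:
    """Return a feasible estimate, or None if no technique qualifies.

    Assumed tie-break (not from the paper): prefer higher accuracy,
    then lower latency, then lower downtime.
    """
    feasible = [e for e in estimates if meets(e, objectives)]
    if not feasible:
        return None
    return max(
        feasible,
        key=lambda e: (e.accuracy, -e.latency_ms, -e.downtime_ms),
    )


if __name__ == "__main__":
    # Made-up values for illustration only; not results from the paper.
    candidates = [
        Estimate("repartitioning", 92.0, 40.0, 900.0),
        Estimate("early-exit", 85.0, 20.0, 5.0),
        Estimate("skip-connection", 90.0, 30.0, 5.0),
    ]
    goal = Objectives(min_accuracy=88.0, max_latency_ms=50.0, max_downtime_ms=100.0)
    chosen = select_technique(candidates, goal)
    print(chosen.technique if chosen else "no technique meets the objectives")
```

Filtering on the thresholds first and ranking second keeps the user-defined objectives as hard constraints; a weighted score across the three metrics would be an equally plausible design under different assumptions.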

This item appears in the following Collection(s)

University of St Andrews Research