Item metadata

dc.contributor.author: Eccles, Bailey J.
dc.contributor.author: Rodgers, Philip
dc.contributor.author: Kilpatrick, Peter
dc.contributor.author: Spence, Ivor
dc.contributor.author: Varghese, Blesson
dc.date.accessioned: 2023-10-30T17:30:03Z
dc.date.available: 2023-10-30T17:30:03Z
dc.date.issued: 2024-03
dc.identifier: 293968823
dc.identifier: 918adda9-15d0-438a-92ce-8022d06a4498
dc.identifier.citation: Eccles, B. J., Rodgers, P., Kilpatrick, P., Spence, I. & Varghese, B. 2024, 'DNNShifter: an efficient DNN pruning system for edge computing', Future Generation Computer Systems, vol. 152, pp. 43-54. https://doi.org/10.48550/arXiv.2309.06973, https://doi.org/10.1016/j.future.2023.09.025
dc.identifier.issn: 0167-739X
dc.identifier.other: ORCID: /0000-0002-5533-7503/work/146006739
dc.identifier.uri: https://hdl.handle.net/10023/28595
dc.description: Funding: This research is funded by Rakuten Mobile, Japan.
dc.description.abstract: Deep neural networks (DNNs) underpin many machine learning applications. Production-quality DNN models achieve high inference accuracy by training millions of DNN parameters, which carries a significant resource footprint. This presents a challenge for resources operating at the extreme edge of the network, such as mobile and embedded devices with limited computational and memory resources. To address this, models are pruned to create lightweight variants more suitable for these devices. Existing pruning methods cannot provide models of similar quality to their unpruned counterparts without significant time costs and overheads, or are limited to offline use cases. Our work rapidly derives suitable model variants while maintaining the accuracy of the original model. The model variants can be swapped quickly as system and network conditions change to match workload demand. This paper presents DNNShifter, an end-to-end DNN training, spatial pruning, and model switching system that addresses these challenges. At the heart of DNNShifter is a novel methodology that prunes sparse models using structured pruning, combining the accuracy-preserving benefits of unstructured pruning with the runtime performance improvements of structured pruning. The pruned model variants generated by DNNShifter are smaller and thus faster than their dense and sparse predecessors, making them suitable for inference at the edge while retaining accuracy close to that of the original dense model. DNNShifter generates a portfolio of model variants that can be swiftly interchanged depending on operational conditions. DNNShifter produces pruned model variants up to 93x faster than conventional training methods. Compared to sparse models, the pruned variants are up to 5.14x smaller and have a 1.67x inference latency speedup, with no compromise to sparse model accuracy. In addition, DNNShifter has up to 11.9x lower overhead for switching models and up to 3.8x lower memory utilisation than existing approaches. DNNShifter is available for public use from https://github.com/blessonvar/DNNShifter.
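The abstract's core idea, pruning a sparse model with structured pruning so that zeroed weights become physically removable, can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch example, not code from the DNNShifter repository: it magnitude-prunes whole convolution filters (the unstructured, accuracy-preserving step) and then rebuilds the layer pair without the all-zero filters (the structured step that shrinks the model). The function names, the sparsity parameter, and the assumption of two directly connected convolutions with no batch normalisation in between are all illustrative.

    # Hypothetical sketch of pruning sparse models structurally; not the
    # DNNShifter implementation. Assumes two directly connected Conv2d
    # layers with default dilation/groups and no batchnorm in between.
    import torch
    import torch.nn as nn

    def magnitude_prune_filters(conv: nn.Conv2d, sparsity: float) -> None:
        """Unstructured step: zero out the lowest-magnitude output filters."""
        with torch.no_grad():
            norms = conv.weight.abs().sum(dim=(1, 2, 3))  # per-filter L1 norm
            k = int(sparsity * conv.out_channels)
            _, idx = torch.topk(norms, k, largest=False)  # weakest filters
            conv.weight[idx] = 0.0
            if conv.bias is not None:
                conv.bias[idx] = 0.0

    def remove_zero_filters(conv: nn.Conv2d, next_conv: nn.Conv2d):
        """Structured step: drop all-zero filters and the matching input
        channels of the following layer, yielding a smaller dense model."""
        keep = (conv.weight.abs().sum(dim=(1, 2, 3)) > 0).nonzero().flatten()
        new_conv = nn.Conv2d(conv.in_channels, len(keep), conv.kernel_size,
                             conv.stride, conv.padding,
                             bias=conv.bias is not None)
        new_next = nn.Conv2d(len(keep), next_conv.out_channels,
                             next_conv.kernel_size, next_conv.stride,
                             next_conv.padding, bias=next_conv.bias is not None)
        with torch.no_grad():
            new_conv.weight.copy_(conv.weight[keep])
            if conv.bias is not None:
                new_conv.bias.copy_(conv.bias[keep])
            new_next.weight.copy_(next_conv.weight[:, keep])
            if next_conv.bias is not None:
                new_next.bias.copy_(next_conv.bias)
        return new_conv, new_next

    # Usage (illustrative): prune half the filters in a two-layer stack.
    # c1, c2 = nn.Conv2d(3, 64, 3), nn.Conv2d(64, 128, 3)
    # magnitude_prune_filters(c1, 0.5)
    # c1_small, c2_small = remove_zero_filters(c1, c2)

In this sketch the rebuilt layers are ordinary dense modules, which is what makes the variant smaller and faster at inference time; a portfolio of such variants at different sparsities could then be held and swapped at runtime as operating conditions change, which is the switching behaviour the abstract describes.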
dc.format.extent: 2659299
dc.language.iso: eng
dc.relation.ispartof: Future Generation Computer Systems
dc.subject: Deep neural networks
dc.subject: Machine learning
dc.subject: Internet of things
dc.subject: Edge computing
dc.subject: Model compression
dc.subject: Model pruning
dc.subject: QA75 Electronic computers. Computer science
dc.subject: NDAS
dc.subject.lcc: QA75
dc.title: DNNShifter: an efficient DNN pruning system for edge computing
dc.type: Journal article
dc.contributor.institution: University of St Andrews. School of Computer Science
dc.identifier.doi: 10.48550/arXiv.2309.06973
dc.description.status: Peer reviewed

