Classification of hyper-scale multimodal imaging datasets
Item metadata
dc.contributor.author | Macfadyen, Craig
dc.contributor.author | Duraiswamy, Ajay
dc.contributor.author | Harris-Birtill, David
dc.date.accessioned | 2023-12-18T09:30:09Z
dc.date.available | 2023-12-18T09:30:09Z
dc.date.issued | 2023-12-13
dc.identifier | 297485875
dc.identifier | e6148d10-129f-4489-931c-563cd512eee9
dc.identifier.citation | Macfadyen, C, Duraiswamy, A & Harris-Birtill, D 2023, 'Classification of hyper-scale multimodal imaging datasets', PLOS Digital Health, vol. 2, no. 12, e0000191. https://doi.org/10.1371/journal.pdig.0000191 | en
dc.identifier.issn | 2767-3170
dc.identifier.other | Jisc: 1584021
dc.identifier.other | publisher-id: pdig-d-23-00002
dc.identifier.other | ORCID: /0000-0002-0740-3668/work/149333109
dc.identifier.uri | https://hdl.handle.net/10023/28884
dc.description.abstract | Algorithms that classify hyper-scale multimodal datasets, comprising millions of images, into their constituent modality types can help researchers quickly retrieve and classify diagnostic imaging data, accelerating clinical outcomes. This research demonstrates that a deep neural network trained on a hyper-scale dataset (4.5 million images) of heterogeneous multimodal data can achieve high modality classification accuracy (96%). By combining 102 medical imaging datasets, a dataset of 4.5 million images was created. ResNet-50, ResNet-18, and VGG16 models were trained to classify these images by the imaging modality used to capture them (Computed Tomography (CT), Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET), and X-ray) across many body locations. The classification accuracy of the models was then tested on unseen data. The best-performing model achieved a classification accuracy of 96% on unseen data, on par with or exceeding the accuracy of more complex implementations using EfficientNets or Vision Transformers (ViTs). The model achieved a balanced accuracy of 86%. This research shows it is possible to train Deep Learning (DL) Convolutional Neural Networks (CNNs) with hyper-scale multimodal datasets composed of millions of images; such models can find use in real-world applications with image volumes in the hyper-scale range, such as medical imaging repositories or national healthcare institutions. Further research can expand this classification capability to include 3D scans.
dc.format.extent | 15
dc.format.extent | 1504949
dc.language.iso | eng
dc.relation.ispartof | PLOS Digital Health | en |
dc.subject | QA75 Electronic computers. Computer science | en |
dc.subject | DAS | en |
dc.subject.lcc | QA75 | en |
dc.title | Classification of hyper-scale multimodal imaging datasets | en |
dc.type | Journal article | en |
dc.contributor.institution | University of St Andrews. School of Computer Science | en |
dc.contributor.institution | University of St Andrews. Centre for Research into Ecological & Environmental Modelling | en |
dc.identifier.doi | https://doi.org/10.1371/journal.pdig.0000191
dc.description.status | Peer reviewed | en |
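
The abstract above describes fine-tuning standard CNNs (ResNet-50, ResNet-18, VGG16) to sort images into four modality classes, and reports both plain accuracy (96%) and balanced accuracy (86%). The sketch below illustrates what such a setup can look like in PyTorch; it is not the authors' code. The pretrained-weight choice, learning rate, input size, and the `train_step`/`balanced_accuracy` helpers are assumptions made for illustration.

```python
import torch
import torch.nn as nn
from torchvision import models

# Illustrative sketch only -- not the paper's implementation.
# Four modality classes, as listed in the abstract.
MODALITIES = ["CT", "MRI", "PET", "X-ray"]

# Assumption: start from ImageNet-pretrained weights and replace the
# 1000-way classification head with a 4-way modality head.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, len(MODALITIES))

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr is a guess

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One optimisation step on a mini-batch of (N, 3, 224, 224) images."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

def balanced_accuracy(preds: torch.Tensor, labels: torch.Tensor,
                      num_classes: int = len(MODALITIES)) -> float:
    """Mean per-class recall over predicted class indices: each modality
    counts equally, however many test images it has."""
    recalls = []
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            recalls.append((preds[mask] == c).float().mean())
    return torch.stack(recalls).mean().item()
```

The gap between the two reported metrics (96% accuracy vs 86% balanced accuracy) is consistent with a class-imbalanced test set: because balanced accuracy weights every modality equally, a model can score well overall while under-performing on rarer modalities.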