Classification of hyper-scale multimodal imaging datasets
Item metadata
dc.contributor.author | Macfadyen, Craig
dc.contributor.author | Duraiswamy, Ajay
dc.contributor.author | Harris-Birtill, David
dc.date.accessioned | 2023-12-18T09:30:09Z
dc.date.available | 2023-12-18T09:30:09Z
dc.date.issued | 2023-12-13
dc.identifier | 297485875
dc.identifier | e6148d10-129f-4489-931c-563cd512eee9
dc.identifier.citation | Macfadyen, C, Duraiswamy, A & Harris-Birtill, D 2023, 'Classification of hyper-scale multimodal imaging datasets', PLOS Digital Health, vol. 2, no. 12, e0000191. https://doi.org/10.1371/journal.pdig.0000191 | en
dc.identifier.issn | 2767-3170
dc.identifier.other | Jisc: 1584021
dc.identifier.other | publisher-id: pdig-d-23-00002
dc.identifier.other | ORCID: /0000-0002-0740-3668/work/149333109
dc.identifier.uri | https://hdl.handle.net/10023/28884
dc.description.abstract | Algorithms that classify hyper-scale multimodal datasets, comprising millions of images, into their constituent modality types can help researchers quickly retrieve and classify diagnostic imaging data, accelerating clinical outcomes. This research demonstrates that a deep neural network trained on a hyper-scale dataset (4.5 million images) of heterogeneous multimodal data can achieve high modality classification accuracy (96%). By combining 102 medical imaging datasets, a dataset of 4.5 million images was created. ResNet-50, ResNet-18, and VGG16 models were trained to classify these images by the imaging modality used to capture them (Computed Tomography (CT), Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET), and X-ray) across many body locations. The classification accuracy of the models was then tested on unseen data. The best-performing model achieved a classification accuracy of 96% on unseen data, on par with or exceeding the accuracy of more complex implementations using EfficientNets or Vision Transformers (ViTs). The model achieved a balanced accuracy of 86%. This research shows it is possible to train Deep Learning (DL) Convolutional Neural Networks (CNNs) with hyper-scale multimodal datasets composed of millions of images; such models can find use in real-world applications with image volumes in the hyper-scale range, such as medical imaging repositories or national healthcare institutions. Further research can expand this classification capability to include 3D scans.
dc.format.extent | 15
dc.format.extent | 1504949
dc.language.iso | eng
dc.relation.ispartof | PLOS Digital Health | en |
dc.subject | QA75 Electronic computers. Computer science | en |
dc.subject | DAS | en |
dc.subject.lcc | QA75 | en |
dc.title | Classification of hyper-scale multimodal imaging datasets | en |
dc.type | Journal article | en |
dc.contributor.institution | University of St Andrews. School of Computer Science | en |
dc.contributor.institution | University of St Andrews. Centre for Research into Ecological & Environmental Modelling | en |
dc.identifier.doi | https://doi.org/10.1371/journal.pdig.0000191
dc.description.status | Peer reviewed | en |
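
The abstract above describes fine-tuning standard CNNs (ResNet-50, ResNet-18, VGG16) to sort images into four modality classes, and reports both plain accuracy (96%) and balanced accuracy (86%). The sketch below illustrates what such a setup can look like in PyTorch; it is not the authors' code. The pretrained-weight choice, learning rate, input size, and the `train_step`/`balanced_accuracy` helpers are assumptions made for illustration.

```python
import torch
import torch.nn as nn
from torchvision import models

# Illustrative sketch only -- not the paper's implementation.
# Four modality classes, as listed in the abstract.
MODALITIES = ["CT", "MRI", "PET", "X-ray"]

# Assumption: start from ImageNet-pretrained weights and replace the
# 1000-way classification head with a 4-way modality head.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, len(MODALITIES))

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr is a guess

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One optimisation step on a mini-batch of (N, 3, 224, 224) images."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

def balanced_accuracy(preds: torch.Tensor, labels: torch.Tensor,
                      num_classes: int = len(MODALITIES)) -> float:
    """Mean per-class recall over predicted class indices: each modality
    counts equally, however many test images it has."""
    recalls = []
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            recalls.append((preds[mask] == c).float().mean())
    return torch.stack(recalls).mean().item()
```

The gap between the two reported metrics (96% accuracy vs 86% balanced accuracy) is consistent with a class-imbalanced test set: because balanced accuracy weights every modality equally, a model can score well overall while under-performing on rarer modalities.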