NLP-supervised stroke detection in medical images
Item metadata
dc.contributor.advisor | Harris-Birtill, David Cameron Christopher | |
dc.contributor.advisor | O'Neil, Alison Q. | |
dc.contributor.author | Schrempf, Patrick Maurice | |
dc.coverage.spatial | 247 | en_US |
dc.date.accessioned | 2023-02-27T11:28:42Z | |
dc.date.available | 2023-02-27T11:28:42Z | |
dc.date.issued | 2023-06-14 | |
dc.identifier.uri | https://hdl.handle.net/10023/27064 | |
dc.description.abstract | In the UK, around 100,000 people have a stroke each year, equating to one stroke every five minutes. Prompt treatment is required to give a good patient outcome. To help decide what the treatment options are, most suspected stroke patients in the UK have a brain scan performed on admission to hospital. Each scan is examined by a radiologist, and a report describing the images is written. An automatic artificial intelligence (AI) solution for brain scan analysis could support the radiologist to report scans quickly and accurately, in turn helping with the speed and accuracy of treatment. However, training AI models for medical image analysis requires large amounts of expertly annotated data which are time-consuming and expensive to obtain. Large unlabelled datasets such as those for stroke patients are currently difficult to use directly for training. An emerging solution is to leverage radiology reports as a source of expert information in order to construct imaging training annotations. In this thesis, a system for extracting suitable labels from radiology reports of suspected stroke patients is proposed and implemented in collaboration with NHS Greater Glasgow and Clyde. A state-of-the-art deep learning natural language processing (NLP) model is first trained to extract the labels from radiology reports automatically by combining per-label attention with a novel data augmentation approach that uses templates and knowledge bases. This NLP model is used to automatically create training annotations from an uncurated dataset of over 27,000 reports, larger than is realistically feasible using only manual annotations. Finally, an image analysis model is trained to detect haemorrhagic regions in the brain. Using automatically extracted annotations improves performance over manual annotations by 20%. 
Being able to use these data without requiring time-consuming and expensive annotation effort from already stretched healthcare professionals opens a promising avenue for transformative medical image analysis research. | en_US |
dc.language.iso | en | en_US |
dc.rights | Creative Commons Attribution-NonCommercial 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc/4.0/ | * |
dc.subject | Artificial intelligence | en_US |
dc.subject | Machine learning | en_US |
dc.subject | Natural language processing | en_US |
dc.subject | Medical AI | en_US |
dc.subject | Medical imaging | en_US |
dc.subject | Image processing | en_US |
dc.title | NLP-supervised stroke detection in medical images | en_US |
dc.type | Thesis | en_US |
dc.contributor.sponsor | Canon Medical Research Europe | en_US |
dc.contributor.sponsor | Data Lab | en_US |
dc.contributor.sponsor | UK Research and Innovation (Agency) | en_US |
dc.type.qualificationlevel | Doctoral | en_US |
dc.type.qualificationname | DEng Doctor of Engineering | en_US |
dc.publisher.institution | The University of St Andrews | en_US |
dc.rights.embargodate | 2025-01-26 | |
dc.rights.embargoreason | Thesis restricted in accordance with University regulations. Restricted until 26th January 2025 | en |
dc.identifier.doi | https://doi.org/10.17630/sta/305 | |
dc.identifier.grantnumber | 104690 | en_US |