Item metadata

dc.contributor.author: Shi, Wen
dc.contributor.author: Kelsey, Tom
dc.contributor.author: Sullivan, Francis
dc.identifier.citation: Shi, W, Kelsey, T & Sullivan, F 2020, 'Efficient identification of patients eligible for clinical studies using case-based reasoning on Scottish Health Research register (SHARE)', BMC Medical Informatics and Decision Making, vol. 20, 70.
dc.identifier.other: PURE: 267500831
dc.identifier.other: PURE UUID: 1c0efae3-1a81-4da5-9f79-da6938a6fc28
dc.identifier.other: ORCID: /0000-0002-8091-1458/work/72842491
dc.identifier.other: Scopus: 85083755641
dc.identifier.other: WOS: 000529005800001
dc.identifier.other: ORCID: /0000-0002-6623-4964/work/72842744
dc.description: W.S. is a PhD student funded by the University of St Andrews where T.K. and F.S. are faculty members. [en]
dc.description.abstract: Background: Trials often struggle to achieve their target sample size, with only half doing so. Some researchers have turned to Electronic Health Records (EHRs), seeking a more efficient way of recruitment. The Scottish Health Research Register (SHARE) obtained patients’ consent for their EHRs to be used as a searching base from which researchers can find potential participants. However, because EHR data is not complete, sufficient or accurate, a database search strategy may not generate the best case-finding result. The current study aims to evaluate the performance of a case-based reasoning method in identifying participants for population-based clinical studies recruiting through SHARE, and to assess the difference between its resultant cohort and the original one derived from searching EHRs. Methods: A case-based reasoning framework was applied to 119 participants in nine projects using two-fold cross-validation, with records from a further 86,292 individuals used for testing. A prediction score for study participation was derived from the diagnosis, procedure, pharmaceutical prescription, and laboratory test result attributes of each participant. Evaluation was conducted by calculating the Area Under the ROC Curve and information retrieval metrics for the test set ranked by prediction score. We compared the most likely participants as identified by searching a database to those ranked highest by our model. Results: The average ROC AUC for the nine projects was 81%, indicating strong predictive ability for these data. However, the derived ranking lists showed lower predictive performance, with only 21% of the persons ranked within the top 50 positions being the same as those identified by searching databases. Conclusions: Case-based reasoning may be more effective than a database search strategy for participant identification for clinical studies using population EHRs. The lower performance of ranking lists derived from case-based reasoning means that patients identified as highly suitable for study participation may still not be recruited. This suggests that further study is needed into improvements in the collection and curation of population EHRs, such as use of free text data to aid reliable identification of people more likely to be recruited to clinical trials.
dc.relation.ispartof: BMC Medical Informatics and Decision Making [en]
dc.rights: Copyright © The Author(s). 2020 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit
dc.subject: Clinical studies [en]
dc.subject: Electronic health record [en]
dc.subject: Machine learning [en]
dc.subject: Artificial intelligence [en]
dc.subject: QA76 Computer software [en]
dc.title: Efficient identification of patients eligible for clinical studies using case-based reasoning on Scottish Health Research register (SHARE) [en]
dc.type: Journal article [en]
dc.description.version: Publisher PDF [en]
dc.contributor.institution: University of St Andrews. Population and Behavioural Science Division [en]
dc.contributor.institution: University of St Andrews. School of Medicine [en]
dc.contributor.institution: University of St Andrews. School of Computer Science [en]
dc.contributor.institution: University of St Andrews. Centre for Interdisciplinary Research in Computational Algebra [en]
dc.contributor.institution: University of St Andrews. Sir James Mackenzie Institute for Early Diagnosis [en]
dc.description.status: Peer reviewed [en]
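
The abstract reports two kinds of evaluation figure: an area under the ROC curve for the prediction scores, and the share of the top 50 ranked persons who were also found by database search. The sketch below illustrates how such figures can be computed in general. It is not the authors' code; the function names `roc_auc` and `top_k_overlap` and the toy inputs are hypothetical and chosen only for illustration.

```python
from typing import Collection, Hashable, Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the probability that a randomly chosen positive
    scores higher than a randomly chosen negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_k_overlap(
    ranked_ids: Sequence[Hashable],
    reference_ids: Collection[Hashable],
    k: int,
) -> float:
    """Fraction of the first k ranked ids that also occur in reference_ids."""
    top = ranked_ids[:k]
    if k <= 0 or not top:
        raise ValueError("k must be positive and ranked_ids non-empty")
    reference = set(reference_ids)
    return sum(1 for item in top if item in reference) / len(top)


if __name__ == "__main__":
    # Toy data (hypothetical, not taken from the study).
    toy_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
    toy_labels = [1, 0, 1, 0, 0]
    print(round(roc_auc(toy_scores, toy_labels), 3))

    ranked = ["a", "b", "c", "d"]       # ids sorted by descending score
    found_by_search = {"b", "d", "z"}   # ids returned by a database query
    print(top_k_overlap(ranked, found_by_search, k=2))
```

With `k=50`, `top_k_overlap` corresponds to the kind of comparison the abstract summarises as 21%. The pairwise AUC loop is quadratic in the number of records, so a rank-based formula or a library routine would be the practical choice for a test set of tens of thousands of individuals.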