Item metadata

dc.contributor.author: Carson, Jamie Kirk
dc.contributor.author: Kirby, Graham Njal Cameron
dc.contributor.author: Dearle, Alan
dc.contributor.author: Williamson, Lee
dc.contributor.author: Garrett, Eilidh
dc.contributor.author: Reid, Alice
dc.contributor.author: Dibben, Christopher John Lloyd
dc.date.accessioned: 2014-09-12T09:01:05Z
dc.date.available: 2014-09-12T09:01:05Z
dc.date.issued: 2013-03-05
dc.identifier.citation: Carson, J K, Kirby, G N C, Dearle, A, Williamson, L, Garrett, E, Reid, A & Dibben, C J L 2013, Exploiting historical registers: Automatic methods for coding c19th and c20th cause of death descriptions to standard classifications. in New Techniques and Technologies for Statistics. Eurostat, http://www.cros-portal.eu/content/ntts-2013-proceedings, pp. 598-607, New Techniques and Technologies for Statistics (NTTS 2013), Brussels, Belgium, 5/03/13. https://doi.org/10.2901/Eurostat.C2013.001 [en]
dc.identifier.citation: conference [en]
dc.identifier.other: PURE: 44100399
dc.identifier.other: PURE UUID: 1314709c-8a20-4431-92c5-576dca1a9b56
dc.identifier.other: ORCID: /0000-0002-4422-0190/work/28429094
dc.identifier.uri: http://hdl.handle.net/10023/5409
dc.description.abstract: The increasing availability of digitised registration records presents a significant opportunity for research. Returning to the original records allows researchers to classify descriptions, such as cause of death, to modern medical understandings of illness and disease, rather than relying on contemporary registrars’ classifications. Linkage of an individual’s records together also allows the production of sparse life-course micro-datasets. The further linkage of these into family units then presents the possibility of reconstructing family structures and producing multi-generational studies. We describe work to develop a method for automatically coding to standard classifications the causes of death from 8.3 million Scottish death certificates. We have evaluated a range of approaches using text processing and supervised machine learning, obtaining accuracy ranging from 72% to 96% on several test sets. We present results and speculate on further development that may be needed for classification of the full data set.
dc.format.extent: 10
dc.language.iso: eng
dc.publisher: Eurostat
dc.relation.ispartof: New Techniques and Technologies for Statistics [en]
dc.rights: © European Communities. Reproduction is authorised, provided the source is acknowledged, save where otherwise stated. http://www.cros-portal.eu/page/legal-notice.en
dc.subject: QA75 Electronic computers. Computer science [en]
dc.subject: HA Statistics [en]
dc.subject.lcc: QA75 [en]
dc.subject.lcc: HA [en]
dc.title: Exploiting historical registers: Automatic methods for coding c19th and c20th cause of death descriptions to standard classifications [en]
dc.type: Conference item [en]
dc.description.version: Postprint [en]
dc.contributor.institution: University of St Andrews. School of Computer Science [en]
dc.contributor.institution: University of St Andrews. Office of the Principal [en]
dc.contributor.institution: University of St Andrews. Geography & Sustainable Development [en]
dc.contributor.institution: University of St Andrews. St Andrews Sustainability Institute [en]
dc.identifier.doi: https://doi.org/10.2901/Eurostat.C2013.001
dc.identifier.url: http://www.cros-portal.eu/content/ntts-2013 [en]

