Item metadata

dc.contributor.author: Filgueira, Rosa
dc.date.accessioned: 2023-02-17T15:30:01Z
dc.date.available: 2023-02-17T15:30:01Z
dc.date.issued: 2022-12-14
dc.identifier: 280547080
dc.identifier: a83434d6-9660-488a-b7b8-d77dc60dcc47
dc.identifier: 85145436249
dc.identifier.citation: Filgueira, R 2022, frances: a deep learning NLP and text mining web tool to unlock historical digital collections: a case study on the Encyclopaedia Britannica. in 2022 IEEE 18th International Conference on e-Science (e-Science), 9973695, IEEE international conference on e-science and grid computing, IEEE, pp. 246-255, 18th IEEE International eScience Conference (eScience 2022), Salt Lake City, Utah, United States, 10/10/22. https://doi.org/10.1109/eScience55777.2022.00038 [en]
dc.identifier.citation: conference [en]
dc.identifier.isbn: 9781665461252
dc.identifier.isbn: 9781665461245
dc.identifier.uri: https://hdl.handle.net/10023/27006
dc.description: Funding: This work was supported by the NLS Digital Fellowship and by the Google Cloud Platform research credit program. [en]
dc.description.abstract: This work presents frances, an integrated text mining tool that combines information extraction, knowledge graphs, NLP, deep learning, parallel processing and Semantic Web techniques to unlock the full value of historical digital textual collections, offering new capabilities for researchers to use powerful analysis methods without being distracted by the technology and middleware details. To demonstrate these capabilities, we use the first eight editions of the Encyclopaedia Britannica offered by the National Library of Scotland (NLS) as an example digital collection to mine and analyse. We have developed novel parallel heuristics to extract terms from the original collection (alongside metadata), which provides a mix of unstructured and semi-structured input data, and populated a new knowledge graph with this information. Our Natural Language Processing models enable frances to perform advanced analyses that go significantly beyond simple search using the information stored in the knowledge graph. Furthermore, frances also allows for creating and running complex text mining analyses at scale. Our results show that the novel computational techniques developed within frances provide a vehicle for researchers to formalize and connect findings and insights derived from the analysis of large-scale digital corpora such as the Encyclopaedia Britannica.
dc.format.extent: 10
dc.format.extent: 15990429
dc.language.iso: eng
dc.publisher: IEEE
dc.relation.ispartof: 2022 IEEE 18th International Conference on e-Science (e-Science) [en]
dc.relation.ispartofseries: IEEE international conference on e-science and grid computing [en]
dc.subject: Information extraction [en]
dc.subject: Knowledge graph [en]
dc.subject: Transfer learning [en]
dc.subject: Natural language processing [en]
dc.subject: Text mining [en]
dc.subject: Web tools [en]
dc.subject: Semantic web [en]
dc.subject: Parallel computing [en]
dc.subject: Digital tools [en]
dc.subject: Digital textual collections [en]
dc.subject: Deep learning [en]
dc.subject: Metadata [en]
dc.subject: Knowledge engineering [en]
dc.subject: Information retrieval [en]
dc.subject: QA75 Electronic computers. Computer science [en]
dc.subject: Z665 Library Science. Information Science [en]
dc.subject: Artificial Intelligence [en]
dc.subject: Computer Science Applications [en]
dc.subject: Information Systems [en]
dc.subject: T-NDAS [en]
dc.subject: MCC [en]
dc.subject: NIS [en]
dc.subject.lcc: QA75 [en]
dc.subject.lcc: Z665 [en]
dc.title: frances: a deep learning NLP and text mining web tool to unlock historical digital collections: a case study on the Encyclopaedia Britannica [en]
dc.type: Conference item [en]
dc.contributor.institution: University of St Andrews. School of Computer Science [en]
dc.identifier.doi: 10.1109/eScience55777.2022.00038
dc.date.embargoedUntil: 2022-10-11
dc.identifier.url: https://ieeexplore.ieee.org/xpl/conhome/9973400/proceeding [en]
dc.identifier.url: https://www.escience-conference.org/2022/ [en]

