Item metadata

dc.contributor.advisor: Nederhof, Mark-Jan
dc.contributor.author: McCaffery, Martin
dc.coverage.spatial: xvi, 185 p. [en_US]
dc.date.accessioned: 2017-11-14T10:12:04Z
dc.date.available: 2017-11-14T10:12:04Z
dc.date.issued: 2017-09-28
dc.identifier.uri: https://hdl.handle.net/10023/12080
dc.description.abstract: We present a multifaceted investigation into the relevance of word order in machine translation. We introduce two tools, DTED and DERP, each using dependency structure to detect differences between the structures of machine-produced translations and human-produced references. DTED applies the principle of Tree Edit Distance to calculate edit operations required to convert one structure into another. Four variants of DTED have been produced, differing in the importance they place on words which match between the two sentences. DERP represents a more detailed procedure, making use of the dependency relations between words when evaluating the disparities between paths connecting matching nodes. In order to empirically evaluate DTED and DERP, and as a standalone contribution, we have produced WOJ-DB, a database of human judgments. Containing scores relating to translation adequacy and more specifically to word order quality, this is intended to support investigations into a wide range of translation phenomena. We report an internal evaluation of the information in WOJ-DB, then use it to evaluate variants of DTED and DERP, both to determine their relative merit and their strength relative to third-party baselines. We present our conclusions about the importance of structure to the tools and their relevance to word order specifically, then propose further related avenues of research suggested or enabled by our work. [en_US]
dc.language.iso: en [en_US]
dc.publisher: University of St Andrews
dc.rights: Attribution-ShareAlike 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-sa/4.0/ [*]
dc.subject: Machine translation [en_US]
dc.subject: Natural language processing [en_US]
dc.subject: Translation evaluation [en_US]
dc.subject: Dependency structure [en_US]
dc.subject: Evaluation dataset [en_US]
dc.subject: Tree edit distance [en_US]
dc.subject.lcc: P308.M3
dc.subject.lcsh: Machine translating. [en]
dc.subject.lcsh: Natural language processing (Computer science) [en]
dc.subject.lcsh: Translators (Computer programs) [en]
dc.title: The mat sat on the cat : investigating structure in the evaluation of order in machine translation [en_US]
dc.type: Thesis [en_US]
dc.contributor.sponsor: Engineering and Physical Sciences Research Council (EPSRC) [en_US]
dc.type.qualificationlevel: Doctoral [en_US]
dc.type.qualificationname: PhD Doctor of Philosophy [en_US]
dc.publisher.institution: The University of St Andrews [en_US]


Except where otherwise noted within the work, this item's licence for re-use is described as Attribution-ShareAlike 4.0 International.