Automatic classification of human translation and machine translation : a study from the perspective of lexical diversity

Fu, Yingxue; Nederhof, Mark Jan

Files in this item

Name: Fu_2021_Automatic_classification_of_human_MOTRA_91_CCBY.pdf
Size: 135.7 KB
Format: PDF

Item metadata

dc.contributor.author	Fu, Yingxue
dc.contributor.author	Nederhof, Mark Jan
dc.contributor.editor	Bizzoni, Yuri
dc.contributor.editor	Teich, Elke
dc.contributor.editor	España-Bonet, Cristina
dc.contributor.editor	van Genabith, Josef
dc.date.accessioned	2021-06-03T10:30:14Z
dc.date.available	2021-06-03T10:30:14Z
dc.date.issued	2021-05-31
dc.identifier	273993683
dc.identifier	777591c5-314d-4ecb-be42-2d2ae1d834bc
dc.identifier.citation	Fu, Y & Nederhof, M J 2021, Automatic classification of human translation and machine translation : a study from the perspective of lexical diversity. in Y Bizzoni, E Teich, C España-Bonet & J van Genabith (eds), Proceedings for the First Workshop on Modelling Translation : Translatology in the Digital Age. NEALT Proceedings Series, Linkoping University Electronic Press, pp. 91–99, Workshop on Modelling Translation, Online City, Iceland, 31/05/21. <https://aclanthology.org/previews/ingest-nodalida/2021.motra-1.10/>	en
dc.identifier.citation	workshop	en
dc.identifier.issn	1650-3686
dc.identifier.other	ORCID: /0000-0002-1845-6829/work/95041670
dc.identifier.uri	https://hdl.handle.net/10023/23304
dc.description.abstract	By using a trigram model and fine-tuning a pretrained BERT model for sequence classification, we show that machine translation and human translation can be classified with an accuracy above chance level, which suggests that machine translation and human translation differ in a systematic way. The classification accuracy of machine translation is much higher than that of human translation. We show that this may be explained by the difference in lexical diversity between machine translation and human translation. If machine translation has patterns that are independent of human translation, automatic metrics which measure the deviation of machine translation from human translation may conflate difference with quality. Our experiment with two different types of automatic metrics shows correlation with the result of the classification task. Therefore, we suggest that the difference in lexical diversity between machine translation and human translation be given more attention in machine translation evaluation.
dc.format.extent	139015
dc.language.iso	eng
dc.publisher	Linkoping University Electronic Press
dc.relation.ispartof	Proceedings for the First Workshop on Modelling Translation	en
dc.relation.ispartofseries	NEALT Proceedings Series	en
dc.subject	Q Science (General)	en
dc.subject	Artificial Intelligence	en
dc.subject	3rd-DAS	en
dc.subject.lcc	Q1	en
dc.title	Automatic classification of human translation and machine translation : a study from the perspective of lexical diversity	en
dc.type	Conference item	en
dc.contributor.institution	University of St Andrews. School of Computer Science	en
dc.identifier.url	https://aclanthology.org/previews/ingest-nodalida/2021.motra-1.10/	en

This item appears in the following Collection(s)

University of St Andrews Research
