A Siamese transformer network for zero-shot ancient coin classification

Guo, Zhongliang; Arandelovic, Oggie; Reid, David; Lei, Yaxiong

Files in this item

Name: Guo_2023_A_Siamese_transformer_network_JImaging_09_00107_CCBY.pdf
Size: 5.236Mb
Format: PDF

Item metadata

dc.contributor.author	Guo, Zhongliang
dc.contributor.author	Arandelovic, Oggie
dc.contributor.author	Reid, David
dc.contributor.author	Lei, Yaxiong
dc.date.accessioned	2023-05-25T12:30:07Z
dc.date.available	2023-05-25T12:30:07Z
dc.date.issued	2023-05-25
dc.identifier	286256413
dc.identifier	367138bb-8a9d-4fe8-8ff7-937e736be9ef
dc.identifier	85163753595
dc.identifier.citation	Guo, Z, Arandelovic, O, Reid, D & Lei, Y 2023, 'A Siamese transformer network for zero-shot ancient coin classification', Journal of Imaging, vol. 9, no. 6, 107. https://doi.org/10.3390/jimaging9060107	en
dc.identifier.issn	2313-433X
dc.identifier.other	ORCID: /0000-0002-6025-3021/work/135851474
dc.identifier.other	ORCID: /0000-0002-0697-7942/work/137914978
dc.identifier.uri	https://hdl.handle.net/10023/27673
dc.description.abstract	Ancient numismatics, the study of ancient coins, has in recent years become an attractive domain for the application of computer vision and machine learning. Though rich in research problems, the predominant focus in this area to date has been on the task of attributing a coin from an image, that is, of identifying its issue. This may be considered the cardinal problem in the field and it continues to challenge automatic methods. In the present paper, we address a number of limitations of previous work. Firstly, the existing methods approach the problem as a classification task. As such, they are unable to deal with classes with no or few exemplars (which would be most, given over 50,000 issues of Roman Imperial coins alone), and require retraining when exemplars of a new class become available. Hence, rather than seeking to learn a representation that distinguishes a particular class from all the others, herein we seek a representation that is overall best at distinguishing classes from one another, thus relinquishing the demand for exemplars of any specific class. This leads to our adoption of the paradigm of pairwise coin matching by issue, rather than the usual classification paradigm, and the specific solution we propose in the form of a Siamese neural network. Furthermore, while adopting deep learning, motivated by its successes in the field and its unchallenged superiority over classical computer vision approaches, we also seek to leverage the advantages that transformers have over the previously employed convolutional neural networks, and in particular their non-local attention mechanisms, which ought to be particularly useful in ancient coin analysis by associating semantically but not visually related distal elements of a coin’s design.
Evaluated on a large data corpus of 14,820 images and 7605 issues, using transfer learning and only a small training set of 542 images of 24 issues, our Double Siamese ViT model is shown to surpass the state of the art by a large margin, achieving an overall accuracy of 81%. Moreover, our further investigation of the results shows that the majority of the method’s errors are unrelated to the intrinsic aspects of the algorithm itself, but are rather a consequence of unclean data, which is a problem that can be easily addressed in practice by simple pre-processing and quality checking.
dc.format.extent	34
dc.format.extent	5490832
dc.language.iso	eng
dc.relation.ispartof	Journal of Imaging	en
dc.subject	Siamese neural network	en
dc.subject	Matching	en
dc.subject	Deep learning	en
dc.subject	Computer vision	en
dc.subject	Machine learning	en
dc.subject	Low-shot learning	en
dc.subject	CJ Numismatics	en
dc.subject	QA75 Electronic computers. Computer science	en
dc.subject	NDAS	en
dc.subject	MCC	en
dc.subject.lcc	CJ	en
dc.subject.lcc	QA75	en
dc.title	A Siamese transformer network for zero-shot ancient coin classification	en
dc.type	Journal article	en
dc.contributor.institution	University of St Andrews. School of Computer Science	en
dc.identifier.doi	10.3390/jimaging9060107
dc.description.status	Peer reviewed	en
dc.identifier.url	https://www.mdpi.com/journal/jimaging/special_issues/873SA697YH	en

This item appears in the following Collection(s)

University of St Andrews Research