

Item metadata

dc.contributor.author: Rieutort-Louis, Warren
dc.contributor.author: Arandelovic, Ognjen
dc.date.accessioned: 2016-07-23T23:31:40Z
dc.date.available: 2016-07-23T23:31:40Z
dc.date.issued: 2016-11-03
dc.identifier.citation: Rieutort-Louis, W & Arandelovic, O 2016, 'Descriptor transition tables for object retrieval using unconstrained cluttered video acquired using a consumer level handheld mobile device', in 2016 International Joint Conference on Neural Networks (IJCNN), 7727584, IEEE, pp. 3030-3037, IEEE World Congress on Computational Intelligence, Vancouver, Canada, 24/07/16. https://doi.org/10.1109/IJCNN.2016.7727584
dc.identifier.citation: conference
dc.identifier.other: PURE: 242457497
dc.identifier.other: PURE UUID: 21d614de-847a-443e-9e3f-e3afbf119e55
dc.identifier.other: Scopus: 85007247670
dc.identifier.other: WOS: 000399925503031
dc.identifier.uri: https://hdl.handle.net/10023/9201
dc.description.abstract: Visual recognition and vision-based retrieval of objects from large databases are tasks with a wide spectrum of potential applications. In this paper we propose a novel method for recognition from video sequences, suitable for retrieval from databases acquired in highly unconstrained conditions, e.g. using a consumer-level mobile device such as a phone. On the lowest level, we represent each sequence as a 3D mesh of densely packed local appearance descriptors. While image-plane geometry is captured implicitly by a large overlap of the neighbouring regions from which the descriptors are extracted, 3D information is extracted by means of a descriptor transition table, learnt from a single sequence for each known gallery object. These tables allow us to connect local descriptors along the third dimension (which corresponds to viewpoint changes), resulting in a set of variable-length Markov chains for each video. The matching of two sets of such chains is formulated as a statistical hypothesis test, whereby a subset of each is chosen to maximize the likelihood that the corresponding video sequences show the same object. The effectiveness of the proposed algorithm is empirically evaluated on the Amsterdam Library of Object Images and a new, highly challenging video data set acquired using a mobile phone. On both data sets our method is shown to recognize objects successfully in the presence of background clutter and large viewpoint changes.
dc.language.iso: eng
dc.publisher: IEEE
dc.relation.ispartof: 2016 International Joint Conference on Neural Networks (IJCNN)
dc.rights: © 2016, IEEE. This work is made available online in accordance with the publisher's policies. This is the author-created, accepted version manuscript following peer review and may differ slightly from the final published version. The final published version of this work is available at ieeexplore.ieee.org / https://dx.doi.org/10.1109/IJCNN.2016.7727584
dc.subject: QA75 Electronic computers. Computer science
dc.subject: 3rd-DAS
dc.subject.lcc: QA75
dc.title: Descriptor transition tables for object retrieval using unconstrained cluttered video acquired using a consumer level handheld mobile device
dc.type: Conference item
dc.description.version: Postprint
dc.contributor.institution: University of St Andrews. School of Computer Science
dc.identifier.doi: https://doi.org/10.1109/IJCNN.2016.7727584
dc.date.embargoedUntil: 2016-07-24
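The abstract's core idea of a descriptor transition table can be illustrated with a minimal sketch. This is an assumption-laden simplification, not the authors' implementation: it presumes local descriptors have already been quantized into integer codewords, and it reduces the paper's variable-length Markov chains and hypothesis-test matching to a single first-order chain scored by log-likelihood.

```python
import math
from collections import defaultdict

def learn_transition_table(codewords):
    """Illustrative only: count transitions between consecutive quantized
    descriptors (codewords) observed as the viewpoint changes, then
    normalize each row into transition probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(codewords, codewords[1:]):
        counts[a][b] += 1
    table = {}
    for a, row in counts.items():
        total = sum(row.values())
        table[a] = {b: c / total for b, c in row.items()}
    return table

def chain_log_likelihood(codewords, table, floor=1e-6):
    """Score a query chain of codewords under a gallery object's
    transition table; unseen transitions get a small floor probability
    (a hypothetical smoothing choice, not from the paper)."""
    score = 0.0
    for a, b in zip(codewords, codewords[1:]):
        score += math.log(table.get(a, {}).get(b, floor))
    return score
```

Under this sketch, a query sequence whose descriptor transitions match those seen in a gallery object's training video scores a higher log-likelihood than one whose transitions were never observed.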

