Item metadata

dc.contributor.author: Connor, Richard
dc.contributor.author: Dearle, Al
dc.contributor.author: Vadicamo, Lucia
dc.contributor.editor: Mecella, Massimo
dc.contributor.editor: Amato, Giuseppe
dc.contributor.editor: Gennaro, Claudio
dc.date.accessioned: 2019-07-11T12:30:02Z
dc.date.available: 2019-07-11T12:30:02Z
dc.date.issued: 2019-07-09
dc.identifier: 259580206
dc.identifier: 831b338f-f0d8-4677-a975-2764e8100db0
dc.identifier: 85069491938
dc.identifier.citation: Connor, R, Dearle, A & Vadicamo, L 2019, Modelling string structure in vector spaces. in M Mecella, G Amato & C Gennaro (eds), Proceedings of the 27th Italian Symposium on Advanced Database Systems: Castiglione della Pescaia (Grosseto), Italy, June 16th to 19th, 2019., 45, CEUR Workshop Proceedings, vol. 2400, Sun SITE Central Europe, SEBD 2019 27th Italian Symposium on Advanced Database Systems, Castiglione della Pescaia, Italy, 17/06/19. <http://ceur-ws.org/Vol-2400/paper-45.pdf> [en]
dc.identifier.citation: workshop [en]
dc.identifier.issn: 1613-0073
dc.identifier.uri: https://hdl.handle.net/10023/18082
dc.description.abstract: Searching for similar strings is an important and frequent database task both in terms of human interactions and in absolute world-wide CPU utilisation. A wealth of metric functions for string comparison exist. However, with respect to the wide range of classification and other techniques known within vector spaces, such metrics allow only a very restricted range of techniques. To counter this restriction, various strategies have been used for mapping string spaces into vector spaces, approximating the string distances within the mapped space and therefore allowing vector space techniques to be used. In previous work we have developed a novel technique for mapping metric spaces into vector spaces, which can therefore be applied for this purpose. In this paper we evaluate this technique in the context of string spaces, and compare it to other published techniques for mapping strings to vectors. We use a publicly available English lexicon as our experimental data set, and test two different string metrics over it for each vector mapping. We find that our novel technique considerably outperforms previously used techniques in preserving the actual distance.
dc.format.extent: 12
dc.format.extent: 3047972
dc.language.iso: eng
dc.publisher: Sun SITE Central Europe
dc.relation.ispartof: Proceedings of the 27th Italian Symposium on Advanced Database Systems [en]
dc.relation.ispartofseries: CEUR Workshop Proceedings [en]
dc.subject: Metric mapping [en]
dc.subject: n-Simplex projection [en]
dc.subject: Pivoted embedding [en]
dc.subject: String [en]
dc.subject: Jensen-Shannon distance [en]
dc.subject: Levenshtein distance [en]
dc.subject: QA75 Electronic computers. Computer science [en]
dc.subject: DAS [en]
dc.subject.lcc: QA75 [en]
dc.title: Modelling string structure in vector spaces [en]
dc.type: Conference item [en]
dc.contributor.institution: University of St Andrews. School of Computer Science [en]
dc.date.embargoedUntil: 2019-07-09
dc.identifier.url: http://ceur-ws.org/Vol-2400/paper-45.pdf [en]

