Querying metric spaces with bit operations
Abstract
Metric search techniques can be usefully characterised by the time at which distance calculations are performed during a query. Most exact search mechanisms use a “just-in-time” approach where distances are calculated as part of a navigational strategy. An alternative is to use a “one-time” approach, where distances to a fixed set of reference objects are calculated at the start of each query. These distances are typically used to re-cast data and queries into a different space where querying is more efficient, allowing an approximate solution to be obtained. In this paper we use a “one-time” approach for an exact search mechanism. A fixed set of reference objects is used to define a large set of regions within the original space, and each query is assessed with respect to the definition of these regions. Data is then accessed if, and only if, it is useful for the calculation of the query solution. As dimensionality increases, the number of defined regions must increase, but the memory required for the exclusion calculation does not. We show that the technique gives excellent performance over the SISAP benchmark data sets, and most interestingly we show how increases in dimensionality may be countered by relatively modest increases in the number of reference objects used.
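The "one-time" exclusion idea described above can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: regions are taken to be balls around the reference objects, each region is stored as one bitmap over the data set (a Python integer), and the names `build_index`, `range_search`, `refs` and `radii` are hypothetical. It is not the authors' implementation, only one way such a mechanism could work for an exact range query with threshold `t`.

```python
import math

def euclidean(a, b):
    # Any proper metric would do; Euclidean distance is an assumption.
    return math.dist(a, b)

def build_index(data, refs, radii, dist=euclidean):
    """One bitmap per ball region: bit j is set iff data[j] lies inside the ball."""
    bitmaps = []
    for p, r in zip(refs, radii):
        bits = 0
        for j, s in enumerate(data):
            if dist(s, p) <= r:
                bits |= 1 << j
        bitmaps.append(bits)
    return bitmaps

def range_search(q, t, data, refs, radii, bitmaps, dist=euclidean):
    """Return indices j with dist(q, data[j]) <= t, in ascending order."""
    all_bits = (1 << len(data)) - 1
    candidates = all_bits
    # Query-to-reference distances are computed once, up front.
    for p, r, inside in zip(refs, radii, bitmaps):
        d = dist(q, p)
        if d > r + t:
            # Triangle inequality: every solution lies outside this ball.
            candidates &= ~inside & all_bits
        elif d <= r - t:
            # Triangle inequality: every solution lies inside this ball.
            candidates &= inside
    # Only surviving candidates are fetched and checked against the original data.
    results = []
    j = 0
    while candidates:
        if candidates & 1 and dist(q, data[j]) <= t:
            results.append(j)
        candidates >>= 1
        j += 1
    return results
```

In this sketch the exclusion step touches only the bitmaps, never the data itself, and the final check over the surviving candidates keeps the search exact rather than approximate.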
Citation
Connor, R & Dearle, A 2018, Querying metric spaces with bit operations. in S Marchand-Maillet, Y N Silva & E Chávez (eds), Similarity Search and Applications: 11th International Conference, SISAP 2018, Lima, Peru, October 7-9, 2018, Proceedings. Lecture Notes in Computer Science, vol. 11223, Springer, Cham, pp. 33-46, 11th International Conference on Similarity Search and Applications (SISAP 2018), Lima, Peru, 7/10/18. https://doi.org/10.1007/978-3-030-02224-2_3
Publication
Similarity Search and Applications
ISSN
0302-9743
Type
Conference item
Rights
© 2018, Springer Nature Switzerland AG. This work has been made available online in accordance with the publisher’s policies. This is the author created accepted version manuscript following peer review and as such may differ slightly from the final published version. The final published version of this work is available at https://doi.org/10.1007/978-3-030-02224-2_3
Description
Funding: This work was supported by ESRC grant ES/L007487/1 “Administrative Data Research Centre—Scotland”.