Dual-tree fast exact max-kernel search



The problem of max-kernel search arises in many settings: given a query point $p_q$, a set of reference objects $S_r$, and some kernel $\mathcal{K}$, find $\arg\max_{p_r \in S_r} \mathcal{K}(p_q, p_r)$. Thanks to the wide applicability of kernels, max-kernel search appears in many domains of science, including image matching, information retrieval, bioinformatics, similarity search, and collaborative filtering. However, there is no generalized technique for efficiently solving max-kernel search. This paper presents a single-tree algorithm called single-tree FastMKS, which returns the max-kernel solution for a single query point in provably $O(\log N)$ time (where $N$ is the number of reference objects), and a dual-tree algorithm (dual-tree FastMKS), which is useful for max-kernel search with many query points. If the set of query points is of size $O(N)$, the dual-tree algorithm returns a solution in provably $O(N)$ time, which is significantly better than the $O(N^2)$ linear scan solution; these bounds depend on the expansion constant of the data. These algorithms work for abstract objects, as they do not require explicit representation of the points in kernel space. Empirical results on a variety of datasets show speedups of up to five orders of magnitude. In addition, we present approximate extensions of the FastMKS algorithms that can achieve further speedups.
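For orientation, the linear scan baseline that the abstract compares against can be sketched in a few lines of Python. This is not the FastMKS algorithm and is not taken from the paper; it is a minimal illustration of the problem statement, assuming points are plain tuples of floats. The names `linear_scan_max_kernel`, `linear_kernel`, and `gaussian_kernel` are hypothetical and chosen only for this example.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]


def linear_scan_max_kernel(
    query: Point,
    references: Sequence[Point],
    kernel: Callable[[Point, Point], float],
) -> Tuple[int, float]:
    """Return (index, kernel value) of the reference maximizing kernel(query, ref).

    Evaluates the kernel once per reference object, so a single query costs
    time linear in the number of references. This is the baseline, not FastMKS.
    """
    if not references:
        raise ValueError("references must be non-empty")
    best_index = 0
    best_value = kernel(query, references[0])
    for i in range(1, len(references)):
        value = kernel(query, references[i])
        if value > best_value:
            best_index, best_value = i, value
    return best_index, best_value


def linear_kernel(x: Point, y: Point) -> float:
    """Plain inner product (hypothetical example kernel)."""
    return sum(a * b for a, b in zip(x, y))


def gaussian_kernel(x: Point, y: Point, bandwidth: float = 1.0) -> float:
    """Gaussian (RBF) kernel with an assumed bandwidth parameter."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


if __name__ == "__main__":
    refs = [(0.0, 0.0), (1.0, 1.0), (3.0, 0.5)]
    q = (0.9, 1.2)
    print(linear_scan_max_kernel(q, refs, linear_kernel))
    print(linear_scan_max_kernel(q, refs, gaussian_kernel))
```

Because the scan only ever calls `kernel(query, ref)`, it never needs an explicit representation of the points in kernel space, which is the same property the abstract notes for the tree-based algorithms. Repeating the scan for every point in a query set of comparable size gives the quadratic cost mentioned above.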
