This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA
template
c h a i n
c h a i n
) {
return
c1
−>
struct
chain ;
bool c h a i n _ s o r t ( const ∗ c1 , const ∗ c 2
f i r s t _ l e v e l < c2
−>
first_level
;
}
template
struct c h a i n { int f i r s t _ l e v e l ; std : : multiset < chain ∗ > children_chains ; Point
p;
};
Figure 1: Candidate structure for chains in a cover tree
Figure 2: This gure shows the recursive insertion algorithm. Note the existence of two implicit loops, colored in red and green.
Thin lines represent explicit
paths in the code, and gross lines denote implicit control ow across the recursive invocation.
3
Algorithm 2 portraits the use of an optimization feature: the use of a 'break' in the second loop. Its use is possible because once a parent is found for a chain, there is nothing else to do. In the other side, there is no explicit way of breaking the green loop of gure 2, except perhaps throwing an exception. And there are little chances that even a sophisticated functional compiler can optimize with tail invocation superous returns in the green loop.
Algorithm 2: Non recursive insert algoritm Input: p (point to insert) stack= empty stack (stack) i = higher level of root chain with direct Qi = {ri } where ri is node of root chain
while true do
children (integer) at level
i
(set of nodes)
// Meaning of i: the level currently considered. // Meaning of Qi : a set of nodes such that d(p, q) 6 2i+1 . Q = {Children(q) : q ∈ Qi } // Q is at level i-1 if d(p, Q) > 2i then
else
break
stack.push(Qi ) // Update i and Qi for the level below; // note that i is updated after Qi . Qi = q ∈ Q : d(p, q) 6 2i i=i-1
end end while true do i += 1
// Go up one level
stack.pop() if d(p, Qi ) ≤ 2i then i Pick q ∈ Qi such that d(p, q) 6 2 Make p a child of q
Qi
=
break end end
It is possible to further exploit the just-one-parent issue and store a suitable candidate in each run of the rst loop, eliminating the stack and the second loop. Such idea is depicted in algorithm 3.
1 over 1.
If implemented, algorithm 3 might yield a considerable speedup
Note that these results are ultimately dependent on the implemention, the target dataset and the the costs of computing distances between points.
1 In
my implementation, using a cheap
l1
metric in a one-dimensional dataset, algorithm 3
was approximately two times faster than 1
4
Algorithm 3: Optimized, non recursive insert algoritm Input: p (point to insert) Input: r (root chain) Input: imax (level high enough such that d(p, r) < 2imax i = imax Q∗ = {r} candidateparent
while true do
=
)(integer)
(r, imax + 1)
// i: the level currently considered. // Q∗ : every node q at level i such that d(p, q) 6 2i+1 , Q = {Children(q) : q ∈ Q∗ } // Q is at level i-1 // Q∗ is Qi till here Q∗ = Ø // Q∗ is Qi−1 from here foreach q ∈ Q do dq = d(q, p) if dq 6 2i then Add q to Q∗ if rst level of q ≥ i then // Previous condition implies that q ∈ Qi candidateparent = (q, i)
end end end if Q∗ = Ø then break else i=i−1
end end
(q, i) = candidateparent Insert p as child of q , level
of
q
is
i
and level of
p
is
i−1
References [1] A. Beygelzimer, S. Kakade, and J. Langford. Cover trees for nearest neighbor. 2006.
5