A
A
Arkady2012-04-04 13:02:36
Python
Arkady, 2012-04-04 13:02:36

Is there anything faster than BK-Tree?

I'm going to be storing a lot of data to search through it with "hamming distance".
Volume ~1,000,000. I decided to use a BK tree, but with the same million, the search turns out to be long, about 10-20 seconds.
The tree itself: dumpz.org/192335/ (Cython)
Maybe there are other, faster algorithms?

Answer the question

In order to leave comments, you need to log in

3 answer(s)
M
Mrrl, 2012-04-04
@p0is0n

What order of distance are assumed in the query?
I would take an ordinary binary tree (branching by the value of some bit; each node indicates which bit to branch by). Then search(tree,d)=search(subtree with correct bit value,d)+search(subtree with incorrect bit value,d-1).

A
AxisPod, 2012-04-04
@AxisPod

And for example, use specialized tools, such as sphinx?

A
akzhan, 2012-04-04
@akzhan

v-tree. stackoverflow.com/questions/6389841/efficiently-find-binary-strings-with-low-hamming-distance-in-large-set

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question