Database
Konstantin, 2014-10-03 19:56:49

Which database is best for selecting rows by bitwise XOR?

Greetings to everyone knowledgeable and interested!
I would be very grateful for practical recommendations on solving this problem, or at least a pointer on where to dig.
Given: a large array of strings (each exactly 32 bytes), somewhat over 100 million of them.
Task: I will describe it with a MySQL query:

SELECT str FROM tbl WHERE BIT_COUNT(str ^ :search) / 256 <= :pc
Example:
a = X'fffffffffff727d9181b191bf95ffc1f981f981f98399839ffffffffffffffff' // the search string
b = X'ffffffffffe407e9181b191bf91ffe1f981f981f98399839ffffffffffffffff' // one of the strings in the database
r = X'0000000000132030000000000040020000000000000000000000000000000000' // result of the bitwise XOR
c = 8 // number of set bits in the result
If :pc = 0.05, then string "b" from the database will be selected, because 8 / 256 = 0.03125, which is below 0.05.
I'm currently testing this case in MySQL, but something tells me the result will not satisfy me.
Please suggest a tool and/or algorithm that will speed up the search, but not one that takes a long time to study and implement: time is tight.
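
The arithmetic in the example can be checked with a short Python sketch. It is not part of the original post; the helper name `hamming_ratio` is made up for illustration, and the two hex values are copied from the example above. It computes the same quantity as `BIT_COUNT(str ^ :search) / 256`.

```python
# Values copied from the example in the question.
a = bytes.fromhex("fffffffffff727d9181b191bf95ffc1f981f981f98399839ffffffffffffffff")
b = bytes.fromhex("ffffffffffe407e9181b191bf91ffe1f981f981f98399839ffffffffffffffff")


def hamming_ratio(x: bytes, y: bytes) -> float:
    """Fraction of bits that differ between two equal-length byte strings."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    diff = int.from_bytes(x, "big") ^ int.from_bytes(y, "big")
    return bin(diff).count("1") / (8 * len(x))


# Byte-wise XOR, corresponding to "r" in the example.
r = bytes(p ^ q for p, q in zip(a, b))

# Number of set bits in the XOR result, corresponding to "c".
set_bits = sum(bin(byte).count("1") for byte in r)

pc = 0.05
selected = hamming_ratio(a, b) <= pc

print(r.hex(), set_bits, hamming_ratio(a, b), selected)
```

In other words, the query selects every row whose Hamming distance to the search string is at most 5% of 256 bits, that is, at most 12 differing bits.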


2 answers
xmoonlight, 2014-10-03
@xmoonlight

One possible solution (not the best one!):
I answered a question about hashing strings here.
The goal there was to find the closest match between strings...
It is not very elegant, of course, but it is quite workable...

Alex Chistyakov, 2014-10-04
@alexclear

None.
Your query will always be a full scan; you cannot even build a functional index for it (MySQL has no functional indexes, although MariaDB and PostgreSQL do).
You need to change the algorithm so that it does not search by iterating over the entire table.
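
One way to avoid scanning every row, sketched here as an assumption rather than as the approach the answer had in mind, is the pigeonhole principle: if two 256-bit strings differ in at most k bits and the bits are split into k + 1 disjoint chunks, at least one chunk must match exactly. Exact chunk matches can be looked up through ordinary indexes, and only those candidates need the full XOR and bit count. The class and function names below are hypothetical, and in-memory dictionaries stand in for database indexes.

```python
import math
from collections import defaultdict

BITS = 256  # each row is 32 bytes


def chunk_bounds(n_chunks, bits=BITS):
    """Split [0, bits) into n_chunks contiguous (start, width) ranges."""
    base, extra = divmod(bits, n_chunks)
    bounds, start = [], 0
    for i in range(n_chunks):
        width = base + (1 if i < extra else 0)
        bounds.append((start, width))
        start += width
    return bounds


def chunk_value(value, start, width):
    """Extract `width` bits of `value` beginning at bit offset `start`."""
    return (value >> start) & ((1 << width) - 1)


class PigeonholeIndex:
    """Hypothetical in-memory stand-in for k + 1 indexed chunk columns."""

    def __init__(self, max_distance):
        self.max_distance = max_distance
        self.bounds = chunk_bounds(max_distance + 1)
        self.tables = [defaultdict(list) for _ in self.bounds]
        self.rows = []

    def add(self, row):
        value = int.from_bytes(row, "big")
        row_id = len(self.rows)
        self.rows.append(value)
        for table, (start, width) in zip(self.tables, self.bounds):
            table[chunk_value(value, start, width)].append(row_id)

    def search(self, query):
        q = int.from_bytes(query, "big")
        # Collect rows that agree with the query on at least one chunk.
        candidates = set()
        for table, (start, width) in zip(self.tables, self.bounds):
            candidates.update(table.get(chunk_value(q, start, width), ()))
        # Verify each candidate with the real XOR and bit count.
        return [
            self.rows[i].to_bytes(BITS // 8, "big")
            for i in sorted(candidates)
            if bin(self.rows[i] ^ q).count("1") <= self.max_distance
        ]


# :pc = 0.05 corresponds to at most floor(0.05 * 256) = 12 differing bits.
index = PigeonholeIndex(max_distance=math.floor(0.05 * BITS))
```

With 12 allowed differing bits this gives 13 chunks of 19 or 20 bits each. If the bits were uniformly distributed over 100 million rows, each lookup would return on the order of a hundred or two hundred candidates instead of the whole table. The sample strings contain long runs of identical bytes, so real data may be much less favorable and would need to be measured.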
