Algorithms
Zubchick, 2011-01-19 23:07:28

Fuzzy search?

There are two strings: the first is short (1-3 words), the second is long (10-20 words). I need to determine whether the first string occurs in the second, or what percentage of it is present there. Please recommend algorithms :)

7 answer(s)
Nicolette, 2011-01-20
@Nicolette

I once wrote a thesis on this topic. It turned out that Russian words are best compared by the length of their longest common prefix (taken as a percentage of the length of the shorter word, which must exceed a threshold). To compare sentences, compare the words of the two strings pairwise and derive the similarity function from the distances between similar words.
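
A minimal Python sketch of the prefix measure described above. The function names and the 0.75 threshold are illustrative assumptions, not values from the thesis.

```python
def prefix_similarity(a: str, b: str) -> float:
    """Longest common prefix length divided by the length of the shorter word."""
    a, b = a.lower(), b.lower()
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return common / shorter


def words_match(a: str, b: str, threshold: float = 0.75) -> bool:
    # The threshold is a hypothetical value; tune it on real data.
    return prefix_similarity(a, b) >= threshold
```

With these definitions, "поиск" and "поиска" share the whole shorter word as a prefix, so they count as a match, while "running" and "runner" share only 4 of 6 characters and fall below 0.75.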

bear11, 2011-01-19
@bear11

Try calculating the Levenshtein distance: D0%B8%D0%B5_%D0%9B%D0%B5%D0%B2%D0%B5%D0%BD%D1%88%D1%82%D0%B5%D0%B9%D0%BD%D0%B0
Example:
bytes.com/topic/python/answers/580959-fuzzy-string-comparison
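
For reference, a short sketch of the Levenshtein distance in Python, using the usual two-row dynamic programming table. The `similarity_percent` helper is an assumed way of turning the distance into a percentage; it is not taken from the linked thread.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity_percent(a: str, b: str) -> float:
    # Hypothetical normalisation: distance relative to the longer string.
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(a, b) / longest)
```

Note that comparing a 1-3 word string against a whole 10-20 word string this way will always give a low percentage, so in practice the distance is better applied word by word or to windows of the long string.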

Nikita, 2011-01-19
@Nigrimmist

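// C#: checks for an exact, case-sensitive substring only
if (longStr.Contains(shortStr)) { /* shortStr occurs verbatim in longStr */ }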

tampere, 2011-01-20
@tampere

There are many different distance measures between words. I would split the phrases into words, find for each word its best pairwise match score, and take the average of those maxima.
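
A rough Python sketch of this idea. The word-level measure here is `difflib.SequenceMatcher.ratio()` from the standard library, chosen only as an assumption for illustration; any other word distance could be plugged in instead.

```python
from difflib import SequenceMatcher


def word_similarity(a: str, b: str) -> float:
    # Assumed word measure: difflib's ratio in [0, 1].
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def phrase_similarity(short: str, long: str) -> float:
    """Average, over the short phrase's words, of the best match in the long phrase."""
    short_words = short.split()
    long_words = long.split()
    if not short_words or not long_words:
        return 0.0
    best = [max(word_similarity(s, w) for w in long_words) for s in short_words]
    return sum(best) / len(best)
```

Averaging over the words of the short phrase (rather than the long one) keeps the score from being diluted by the many unrelated words in the long string.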

Oleg Matrozov, 2011-01-20
@Mear

You can compare using the trigram method. It gives a usable result even when the words have different endings, etc.
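
A small Python sketch of trigram comparison, assuming Jaccard overlap of the trigram sets as the similarity and padding the string with spaces so that word boundaries produce trigrams too. Both choices are assumptions; other variants exist.

```python
def trigrams(text: str) -> set[str]:
    """Set of 3-character windows over the lowercased, space-padded text."""
    text = text.lower().strip()
    if not text:
        return set()
    padded = "  " + text + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two trigram sets, in [0, 1]."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

For example, "searching" and "searched" differ in their endings but still share 6 of 13 distinct trigrams, so they score about 0.46, while unrelated words score 0.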

lightcaster, 2011-01-20
@lightcaster

Traditionally, the Levenshtein distance is used.
But I would recommend the longest common subsequence. You can also introduce a penalty for gaps between words.
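
One possible Python sketch of this, working at character level. It recovers a single longest common subsequence alignment (not necessarily the one with the fewest gaps) and subtracts a penalty for every skipped character inside the matched span. The scoring formula and the `gap_penalty` default are assumptions made for illustration.

```python
def lcs_positions(short: str, long: str) -> list[int]:
    """Indices in `long` of one longest common subsequence with `short`."""
    n, m = len(short), len(long)
    # table[i][j] = LCS length of short[i:] and long[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if short[i] == long[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    positions = []
    i = j = 0
    while i < n and j < m:
        if short[i] == long[j]:
            positions.append(j)
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return positions


def lcs_score(short: str, long: str, gap_penalty: float = 0.1) -> float:
    """Matched characters minus a gap penalty, relative to len(short)."""
    positions = lcs_positions(short.lower(), long.lower())
    if not positions:
        return 0.0
    span = positions[-1] - positions[0] + 1
    gaps = span - len(positions)  # skipped characters inside the matched span
    return max(0.0, (len(positions) - gap_penalty * gaps) / len(short))
```

A verbatim occurrence such as "search" inside "fuzzy search algorithms" scores 1.0, and the score drops as the matched characters spread further apart in the long string.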

xmoonlight, 2016-02-06
@xmoonlight

How to determine the similarity of two strings?
