Notepad++
AlexandrMa, 2021-02-01 16:34:08

How to compare strings by partial similarity?

The site has product names.
A third-party price list contains the same products, but the names differ (commas, dashes, spaces), and sometimes the words are swapped.
Is there a way to match the strings automatically?
P.S. I don't know which tags to choose for this question. Please correct them.

2 answers
Roman Mirilaczvili, 2021-02-01
@2ord

Use the Levenshtein distance or another string-similarity metric for the comparison.
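
As a rough illustration of that suggestion, here is a minimal Python sketch of the Levenshtein distance turned into a 0..1 similarity score. The helper names `normalize` and `similarity`, and the cleanup rule (lowercase, punctuation replaced by spaces), are assumptions made for this example and are not part of the answer.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter string as columns
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def normalize(name: str) -> str:
    """Hypothetical cleanup: lowercase, punctuation to spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^\w]+", " ", name.lower()).split())


def similarity(a: str, b: str) -> float:
    """1.0 means identical after cleanup; values near 0.0 mean very different."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```

With this cleanup, `similarity("USB-cable, 2m", "usb cable 2m")` treats the two names as identical, because the comma and dash are removed before the distance is computed. Any cutoff for accepting a match would have to be tuned on real data.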

Sergey Ilyin, 2021-02-02
@sunsexsurf

The Levenshtein distance has already been suggested (as a basic metric to start with). The problem, in my view, begins once you realize that "words can be rearranged." At that point you have to remember combinatorics and factorials: even for three words, the number of permutations is 3! = 6. And you will most likely split each string on spaces and compare every word with every other word. Do you see how the complexity grows?
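
To make the growth concrete, the sketch below tries every word order of one name and keeps the smallest edit distance, which costs n! comparisons for n words. The function names are hypothetical. `sorted_words_distance` is a different technique that the answer does not mention: sorting the words of both names gives each a single canonical order. It is a cheap heuristic and is not guaranteed to return the same value as the brute-force search.

```python
from itertools import permutations
from math import factorial


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def best_permutation_distance(a: str, b: str) -> int:
    """Brute force: compare every word order of `a` with `b` (n! candidates)."""
    words = a.split()
    return min(levenshtein(" ".join(order), b) for order in permutations(words))


def sorted_words_distance(a: str, b: str) -> int:
    """Assumed shortcut: sort the words of both names, then compare once."""
    return levenshtein(" ".join(sorted(a.split())), " ".join(sorted(b.split())))


if __name__ == "__main__":
    # Number of word orders the brute-force search has to try.
    for n in (3, 5, 8):
        print(n, "words:", factorial(n), "permutations")
```

For three words the brute-force search is harmless (6 candidates), but for eight words it already needs 40320 distance computations per pair of names, while the sorted variant always needs one.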
