How to compare strings by partial similarity?
The site has product names.
A third-party price list contains the same products, but the names differ (commas, dashes, spaces), and sometimes the words are swapped.
Is there any way to automatically match the strings?
P.S. I don't know which tags to choose for this question. Please correct them.
Others have already mentioned the Levenshtein distance as a basic metric to start with. In my view, the real problem begins once you realize that the words can be rearranged. That is where combinatorics and factorials come in: even for three words there are 3! = 6 permutations. And, most likely, you will end up splitting each string on spaces and comparing every word with every other word. Do you see how the complexity grows?
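To make the point concrete, here is a minimal Python sketch, not a definitive solution. It implements the Levenshtein distance by hand, shows the brute-force approach that tries every word order (n! candidates), and then a cheaper heuristic that is an assumption of this example rather than something proposed above: normalize both names, sort their words, and compare once. The function names (`tokens`, `best_permutation_distance`, `sorted_token_distance`) and the sample product names are hypothetical.

```python
import re
from itertools import permutations
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # drop a character from a
                current[j - 1] + 1,      # add a character to a
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def tokens(name: str) -> List[str]:
    """Lowercase the name and keep only runs of word characters,
    which discards commas, dashes and extra spaces."""
    return re.findall(r"\w+", name.lower())


def best_permutation_distance(a: str, b: str) -> int:
    """Brute force: try every word order of `a`. This is n! comparisons,
    so it is only usable for very short names."""
    target = " ".join(tokens(b))
    return min(
        levenshtein(" ".join(order), target)
        for order in permutations(tokens(a))
    )


def sorted_token_distance(a: str, b: str) -> int:
    """Heuristic (an assumption of this sketch): sort the words of both
    names so that word order no longer matters, then compare once."""
    left = " ".join(sorted(tokens(a)))
    right = " ".join(sorted(tokens(b)))
    return levenshtein(left, right)


if __name__ == "__main__":
    site_name = "Apple iPhone 13, 128GB"      # hypothetical site entry
    price_name = "iPhone 13 Apple - 128GB"    # hypothetical price-list entry
    print(best_permutation_distance(site_name, price_name))  # 0
    print(sorted_token_distance(site_name, price_name))      # 0
```

Sorting the words replaces the n! permutations with a single comparison, at the cost of treating any reordering as equivalent. A small non-zero distance would still need a threshold chosen by testing on real data.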