How to remove all duplicate variations?
Let's say there is a list of phrases:
I want a red ball
red ball I want
where to get a red ball
what color a red ball
red ball I want to find a
ball I want red
The list needs to be cleaned so that only unique phrases remain: a phrase made up of the same words as another phrase, just in a different order, counts as a duplicate.
I.e., from this list you need to get this:
I want a red ball
where to get a red ball
what color a red ball
red ball I want to find a
And remove the following phrases:
red ball I want - analogue: I want a red ball
ball I want red - analogue: I want a red ball
You can sort the words of each phrase alphabetically, then remove the duplicate lines.
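The sorting approach could look roughly like this in Python. This is only a sketch: the function name dedupe_by_sorted_words is made up for illustration, words are assumed to be separated by whitespace, and the sample phrases are written without articles so that the reordered variants contain exactly the same words.

```python
def dedupe_by_sorted_words(phrases):
    """Keep the first phrase seen for every distinct set of words."""
    seen = set()
    unique = []
    for phrase in phrases:
        # Lowercase, split on whitespace and sort, so word order stops mattering.
        key = tuple(sorted(phrase.lower().split()))
        if key not in seen:
            seen.add(key)
            unique.append(phrase)
    return unique


# Hypothetical sample data modelled on the list in the question.
phrases = [
    "I want red ball",
    "red ball I want",
    "where to get red ball",
    "what color red ball",
    "red ball I want to find",
    "ball I want red",
]

for phrase in dedupe_by_sorted_words(phrases):
    print(phrase)
```

The first occurrence of each word set is kept, so the original order of the list is preserved.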
Sorting words is too easy. You can split the string into words (remembering to convert them to the same case), hash each word, then add the resulting hashes together (as numbers) and use the sum as the hash of the phrase. Since addition does not depend on order, phrases with the same words in a different order get the same hash. Then comes the output object: loop over the array of source lines, adding each line to the output object with its hash as the key. If a key has already been used, the previous value is overwritten. It all comes out simple and elegant, with a one-line body for the deduplication loop. Along the way, do not forget to come up with a mechanism for dealing with collisions.
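A rough Python sketch of the hash-sum idea, with a few assumptions that are not in the answer: zlib.crc32 is used as the word hash (the built-in hash() for strings changes between runs), the names phrase_hash and dedupe_by_hash_sum are made up, the first phrase is kept instead of being overwritten so the result matches the one asked for in the question, and collisions are handled by comparing the actual words when two sums are equal.

```python
import zlib


def phrase_hash(words):
    """Order-independent hash: the sum of the CRC32 values of the words."""
    return sum(zlib.crc32(word.encode("utf-8")) for word in words)


def dedupe_by_hash_sum(phrases):
    buckets = {}  # phrase hash -> sorted word lists already seen with that hash
    unique = []
    for phrase in phrases:
        words = sorted(phrase.lower().split())
        bucket = buckets.setdefault(phrase_hash(words), [])
        # Collision check: an equal sum only counts as a duplicate
        # if the words themselves match as well.
        if words not in bucket:
            bucket.append(words)
            unique.append(phrase)
    return unique


# Hypothetical sample data modelled on the list in the question.
phrases = [
    "I want red ball",
    "red ball I want",
    "where to get red ball",
    "what color red ball",
    "red ball I want to find",
    "ball I want red",
]

for phrase in dedupe_by_hash_sum(phrases):
    print(phrase)
```

Note that the collision check compares the sorted words anyway, so in this sketch the hash sum only works as a quick pre-filter on top of the sorting approach.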