Programming
jslby, 2015-08-22 03:15:25

How to remove all duplicate variations?

Let's say there is a list of phrases:
I want a red ball
red ball I want
where to get a red ball
what color a red ball
I want to find a red ball
ball I want red
The list needs to be cleaned up so that only unique phrases remain, i.e. no phrase is just the words of another phrase in a different order.
That is, from this list you need to get this:
I want a red ball
where to get a red ball
what color a red ball
I want to find a red ball
And remove the following phrases:
red ball I want - equivalent to: I want a red ball
ball I want red - equivalent to: I want a red ball

3 answers
Saboteur, 2015-08-22
@saboteur_kiev

You can sort the words of each phrase alphabetically, then remove the duplicate lines.
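
A minimal Python sketch of this idea, under a few assumptions that are not in the question: words are split on whitespace, case is ignored, and the first phrase of each group is the one kept. The function name `dedupe_by_sorted_words` is made up for illustration, and the sample phrases drop the article so that the variants are exact permutations of each other.

```python
def dedupe_by_sorted_words(phrases):
    """Keep the first phrase for each group of words, ignoring order and case."""
    seen = set()
    result = []
    for phrase in phrases:
        # Canonical form: lowercased words in alphabetical order.
        key = tuple(sorted(phrase.lower().split()))
        if key not in seen:
            seen.add(key)
            result.append(phrase)
    return result


phrases = [
    "want red ball",
    "red ball want",
    "where to get red ball",
    "ball want red",
]
print(dedupe_by_sorted_words(phrases))
# ['want red ball', 'where to get red ball']
```

The sorted tuple is used only as a lookup key, so the phrases that survive keep their original word order.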

Vladlen Grachev, 2015-08-22
@gwer

Sorting words is too easy. You can split each string into words (remembering to convert them to the same case), hash each word, then add the resulting hashes (as numbers) and use the sum as the hash of the phrase. After that, build an output object: loop over the array of source strings and add each string to the output object, using its hash as the key. If a key has already been used, the previous value is overwritten. It all comes out simple and elegant, with a one-line loop body for the deduplication. Along the way, don't forget to come up with a mechanism for dealing with collisions.

Why not?
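
A rough Python sketch of the hash-sum approach, with my own assumptions filled in: `zlib.crc32` stands in for the per-word hash, collisions are handled by comparing the actual word counts inside each bucket, and the first phrase seen is kept rather than overwritten, since that matches the expected output in the question. The names `phrase_hash` and `dedupe_by_hash_sum` are hypothetical.

```python
import zlib
from collections import Counter


def phrase_hash(words):
    """Sum of per-word CRC32 values; addition makes it independent of word order."""
    return sum(zlib.crc32(w.encode("utf-8")) for w in words)


def dedupe_by_hash_sum(phrases):
    buckets = {}  # hash sum -> list of (word counts, phrase) pairs
    for phrase in phrases:
        words = phrase.lower().split()
        counts = Counter(words)
        bucket = buckets.setdefault(phrase_hash(words), [])
        # Collision handling: an equal sum only counts as a duplicate
        # when the actual word counts match as well.
        if not any(counts == seen for seen, _ in bucket):
            bucket.append((counts, phrase))
    # Phrases come out grouped by bucket, in order of each bucket's first appearance.
    return [phrase for bucket in buckets.values() for _, phrase in bucket]


print(dedupe_by_hash_sum([
    "want red ball",
    "red ball want",
    "where to get red ball",
    "Ball want RED",
]))
```

Without the bucket comparison, two different phrases whose word hashes happen to add up to the same number would silently replace each other.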

dtestyk, 2015-10-26
@dtestyk

Sorting words is too easy.
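Assign each distinct word its own prime number.
Then each phrase is represented by the product of its words' primes.
Multiplication does not depend on order and prime factorization is unique, so two phrases get the same product exactly when they consist of the same words.
Remove the phrases whose product has already been seen, for example by using the product as a key in a hash map.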
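
A small Python sketch of the prime-product idea, assuming whitespace splitting, case-insensitive matching, and keeping the first phrase for each product. The helper names `primes` and `dedupe_by_prime_product` are invented for this example. Python integers have arbitrary precision, so the product cannot overflow, although it grows quickly for long phrases.

```python
from itertools import count


def primes():
    """Yield 2, 3, 5, ... by trial division (fine for small vocabularies)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def dedupe_by_prime_product(phrases):
    next_prime = primes()
    prime_of = {}    # word -> prime assigned to it
    by_product = {}  # product -> first phrase that produced it
    for phrase in phrases:
        product = 1
        for word in phrase.lower().split():
            if word not in prime_of:
                prime_of[word] = next(next_prime)
            product *= prime_of[word]
        # setdefault keeps the first phrase seen for each product.
        by_product.setdefault(product, phrase)
    return list(by_product.values())


print(dedupe_by_prime_product([
    "want red ball",
    "red ball want",
    "where to get red ball",
    "ball want red",
]))
# ['want red ball', 'where to get red ball']
```

Unlike a sum of hashes, the product needs no collision handling, at the cost of keeping a word-to-prime table in memory.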
