SunGoesDown, 2015-10-27 13:28:32
Algorithms

Which algorithm should I use to remove duplicate URLs?

There is a list like this:

site.com/index.php?id=1
site.com/index.php?id=2
site.com/index.php?id=3
site.com/index.php?id=1&param=a
site.com/index.php?id=1&param=b
site2.com/index.php?id=3
site2.com/index.php?id=4
site2.com/index.php?id=5
site2.com/index.php?id=1&param=x
site2.com/index.php?id=1&param=y
site2.com/index.php?id=2&param=z

It needs to be reduced to this form:
site.com/index.php?id=1
site.com/index.php?id=1&param=a
site2.com/index.php?id=3
site2.com/index.php?id=1&param=x

That is, remove links that differ only in their parameter values while the parameter names are the same. Which algorithm would be the best fit, and what is the simplest way to implement it?
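
One possible reading of this rule, as a minimal sketch rather than a definitive answer: treat two links as duplicates when they share the same host, path, and set of parameter names, and keep only the first link seen for each such combination. The function names `url_signature` and `dedupe_by_params` are hypothetical, and the sketch assumes the links have no scheme (as in the list above) and that parameter order does not matter.

```python
from urllib.parse import urlsplit, parse_qsl


def url_signature(url):
    """Return (host, path, sorted parameter names) for a scheme-less URL."""
    # Assumption: links look like "site.com/index.php?id=1", so "//" is
    # prepended to make urlsplit treat the first component as the host.
    parts = urlsplit("//" + url)
    # Keep only parameter names; values are ignored on purpose.
    names = {name for name, _ in parse_qsl(parts.query, keep_blank_values=True)}
    return (parts.netloc, parts.path, tuple(sorted(names)))


def dedupe_by_params(urls):
    """Keep the first URL for each (host, path, parameter names) signature."""
    seen = set()
    result = []
    for url in urls:
        sig = url_signature(url)
        if sig not in seen:
            seen.add(sig)
            result.append(url)
    return result


if __name__ == "__main__":
    links = [
        "site.com/index.php?id=1",
        "site.com/index.php?id=2",
        "site.com/index.php?id=1&param=a",
        "site.com/index.php?id=1&param=b",
    ]
    for link in dedupe_by_params(links):
        print(link)
```

This is a single pass with a hash set, so it runs in roughly linear time in the number of links and preserves the original order. If parameter order should count as a difference, the `sorted` call could be dropped.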
