MongoDB
Anton, 2021-06-23 15:13:59

How to find duplicate rows in MongoDB?

Hey!
There is a document in MongoDB like this:

{
    "_id" : ObjectId("5ea3138daee55c0001eac29f"),
    "linkRole" : [
        {
            "role" : "admin",
            "Organization" : "a32cc286-256b-40e5-fc5d-5ecbdc341ab1"
        },
        {
            "role" : "superadmin",
            "Organization" : "a32cc286-256b-40e5-fc5d-5ecbdc341ab1"
        },
        {
            "role" : "user",
            "Organization" : "a32cc286-256b-40e5-fc5d-5ecbdc341ab1"
        },
        {
            "role" : "user",
            "Organization" : "a32cc286-256b-40e5-fc5d-5ecbdc341ab1"
        },
        {
            "role" : "admin",
            "Organization" : "dd79f23d-2382-4eb7-a2f3-634890eba0bb"
        },
        {
            "role" : "superadmin",
            "Organization" : "dd79f23d-2382-4eb7-a2f3-634890eba0bb"
        }
    ]
}


Or, the same document in a different form:

linkRole[0].role:admin
linkRole[0].Organization:a32cc286-256b-40e5-fc5d-5ecbdc341ab1
linkRole[1].role:superadmin
linkRole[1].Organization:a32cc286-256b-40e5-fc5d-5ecbdc341ab1
linkRole[2].role:user
linkRole[2].Organization:a32cc286-256b-40e5-fc5d-5ecbdc341ab1
linkRole[3].role:user
linkRole[3].Organization:a32cc286-256b-40e5-fc5d-5ecbdc341ab1
linkRole[4].role:admin
linkRole[4].Organization:dd79f23d-2382-4eb7-a2f3-634890eba0bb
linkRole[5].role:superadmin
linkRole[5].Organization:dd79f23d-2382-4eb7-a2f3-634890eba0bb


Question: how do I count all documents whose linkRole array contains identical role/Organization pairs (in this example, elements [2] and [3]), and then remove the duplicates?


1 answer(s)
Alexander Romanov, 2021-06-24
@hunk3r

Only an aggregation with $unwind comes to mind, but $unwind is an expensive operation, since it emits one document per array element.

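// Find every (document, role, Organization) combination that occurs more than once
const pipeline = [
  {$unwind: "$linkRole"},
  {$group: {
    _id: {
      d_id: "$_id",
      role: "$linkRole.role",
      Organization: "$linkRole.Organization"
    },
    count: {$sum: 1}
  }},
  {$match: {count: {$gt: 1}}}
];
db.users.aggregate(pipeline).forEach(a => {
  // Re-read the document so earlier removals for the same user are taken into account
  const user = db.users.findOne({_id: a._id.d_id});
  // Remove count - 1 copies, leaving a single occurrence of the pair
  for (let i = 1; i < a.count; i++) {
    user.linkRole.splice(
      user.linkRole.findIndex(lr => lr.role === a._id.role && lr.Organization === a._id.Organization),
      1
    );
  }
  // save() is deprecated in mongosh; write back only the changed array instead
  db.users.updateOne({_id: user._id}, {$set: {linkRole: user.linkRole}});
});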

The aggregation returns one result per duplicated role/Organization pair, with the owning document's _id in _id.d_id and the number of occurrences in count. The extra copies are then removed in client code.
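
The client-side step can also be written without MongoDB at all, which makes it easier to check. Below is a minimal sketch in plain JavaScript, assuming a duplicate means the same role and the same Organization. The helper names dedupeLinkRole and countDocsWithDuplicates are hypothetical and not part of the answer above, and the sample array is the one from the question.

```javascript
// Hypothetical helper: drops repeated role/Organization pairs from a
// linkRole array, keeping the first occurrence of each pair.
function dedupeLinkRole(linkRole) {
  const seen = new Set();
  return linkRole.filter((lr) => {
    // Serialising the pair as a two-element array gives an unambiguous key.
    const key = JSON.stringify([lr.role, lr.Organization]);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Hypothetical helper: counts documents whose linkRole array holds
// at least one repeated pair.
function countDocsWithDuplicates(docs) {
  return docs.filter(
    (d) => dedupeLinkRole(d.linkRole).length < d.linkRole.length
  ).length;
}

// Sample data copied from the question.
const user = {
  linkRole: [
    { role: "admin", Organization: "a32cc286-256b-40e5-fc5d-5ecbdc341ab1" },
    { role: "superadmin", Organization: "a32cc286-256b-40e5-fc5d-5ecbdc341ab1" },
    { role: "user", Organization: "a32cc286-256b-40e5-fc5d-5ecbdc341ab1" },
    { role: "user", Organization: "a32cc286-256b-40e5-fc5d-5ecbdc341ab1" },
    { role: "admin", Organization: "dd79f23d-2382-4eb7-a2f3-634890eba0bb" },
    { role: "superadmin", Organization: "dd79f23d-2382-4eb7-a2f3-634890eba0bb" },
  ],
};

console.log(dedupeLinkRole(user.linkRole).length); // 5
console.log(countDocsWithDuplicates([user])); // 1
```

In the shell, the deduplicated array could then be written back with updateOne and $set on linkRole, which avoids the repeated findIndex/splice calls when a pair occurs many times.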
