
I have an aggregation pipeline in MongoDB composed of a $match and a $project stage, and the result is an array of objects.

An example of result:

[{"pdb_id":"1avy"},{"pdb_id":"1avy"},{"pdb_id":"1lwu"}]

I would like to remove duplicated objects, so the result should be:

[{"pdb_id":"1avy"},{"pdb_id":"1lwu"}]

An example of working solution is:

const uniqueArray = result.filter((object, index) => index === result.findIndex(obj => JSON.stringify(obj) === JSON.stringify(object)));

But this becomes extremely slow when more data is involved. Is there a faster solution?

Please consider that the objects in the result may have more than one property. For example:

[{"pdb_id":"1avy", "pdb_chain":"A"},{"pdb_id":"1avy", "pdb_chain":"A"},{"pdb_id":"1lwu", "pdb_chain":"A"}]

needs to be filtered to:

[{"pdb_id":"1avy", "pdb_chain":"A"},{"pdb_id":"1lwu", "pdb_chain":"A"}]

1 Answer

If you want to do it with Mongo's help, you don't have much choice other than using $group:

db.collection.aggregate([
  {
    $project: {
      _id: 0
    }
  },
  {
    $group: {
      _id: "$$ROOT"
    }
  },
  {
    "$replaceRoot": {
      "newRoot": "$_id"
    }
  }
])

Mongo Playground

Obviously Mongo has some limits: the $group stage is subject to a 100 MB memory limit (unless you enable allowDiskUse). Assuming you stay under that limit, doing the deduplication in the database should always be the faster option.
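If you do need to deduplicate in application code instead, the filter/findIndex solution from the question is O(n²) because it rescans the array for every element. A single-pass Set-based version is O(n). This is a sketch assuming the objects contain only JSON-serializable values; canonicalKey and dedupe are illustrative names, not library functions. Sorting the keys makes objects with the same contents but different property order compare equal, which plain JSON.stringify does not guarantee:

```javascript
// Build a canonical string key: sort property names so that objects with
// the same contents but different key order produce the same key.
function canonicalKey(obj) {
  return JSON.stringify(Object.keys(obj).sort().map((k) => [k, obj[k]]));
}

// Single-pass dedup: keep the first occurrence of each distinct object.
function dedupe(objects) {
  const seen = new Set();
  return objects.filter((obj) => {
    const key = canonicalKey(obj);
    if (seen.has(key)) return false; // duplicate: drop it
    seen.add(key);                   // first occurrence: keep it
    return true;
  });
}

const result = [
  { pdb_id: "1avy", pdb_chain: "A" },
  { pdb_id: "1avy", pdb_chain: "A" },
  { pdb_id: "1lwu", pdb_chain: "A" },
];

console.log(dedupe(result).length); // 2
```

The Set lookup is effectively constant time, so this scales linearly with the size of the result array, at the cost of holding one string key per distinct object in memory.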
