So I mad the mistake and saved a lot of doduments twice because I messed up my document id. Because I did a Insert, i multiplied my documents everytime I saved them. So I want to delete all duplicates except the first one, that i wrote. Luckilly the documents have an implicit unique key (match._id) and I should be able to tell what the first one was, because I am using the object id.
The documents look like this:
{
_id: "5e8e2d28ca6e660006f263e6"
match : {
_id: 2345
...
}
...
}
So, right now I have a aggregation that tells me what elements are duplicated and stores them in a collection. There is for sure a more elegant way, but I am still learning.
[{$sort: {"$_id": 1},
{$group: {
_id: "$match._id",
duplicateIds: {$push: "$_id"},
count: {$sum: 1}
}},
{$match: {
count: { $gt: 1 }
}}, {$addFields: {
deletableIds: { $slice: ["$duplicateIds", 1, 1000 ] }
}},
{$out: 'DeleteableIds'}]
Now I do not know how to proceed further, as it does not seem to have a "delete" operation in aggregations and I do not want to write those temp data to a db just so I can write a delete command with that, as I want to delete them in one go. Is there any other way to do this? I am still learning with mongodb and feel a little bit overwhelmed :/