I am using the following aggregate pipeline in MongoDB to traverse a two-way graph:
{ $match: { from: "some_root" } },
{
  $graphLookup: {
    from: "connections",
    startWith: "$to",
    connectFromField: "to",
    connectToField: "from",
    as: "children",
    depthField: "depth",
    maxDepth: 100
  }
}
With my data structure, this results in zero or more output documents which all have the same array of connection documents in their children fields. I would like to do further processing to those connection documents but avoid working with the duplicates.
How can I select only the first document's children array and promote them to output documents for processing in the pipeline's next steps?
Adding the following stages results in what I want but with duplicate documents for each output document of the $graphLookup stage:
{ $unwind: { path: "$children" } },
{ $replaceRoot: { newRoot: "$children" } }
To me, this seems like the semantically correct approach; the only thing missing before it is a stage to "$select" the first document's children field for processing.
I am aware that a $group stage might be able to filter out duplicates but that feels like a convoluted solution to a simple problem (i.e., why filter out duplicates if you can avoid having those duplicates in the first place).
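For completeness, the $group-based deduplication I am trying to avoid would look roughly like this (a sketch; grouping on children._id assumes _id uniquely identifies a connection document):

```javascript
{ $unwind: { path: "$children" } },
{ $group: { _id: "$children._id", doc: { $first: "$children" } } },
{ $replaceRoot: { newRoot: "$doc" } }
```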
Pseudo-example for clarification
The $graphLookup stage results in:
{
  _id: 1,
  children: [
    { _id: 10 },
    { _id: 11 }
  ]
},
{
  _id: 2,
  children: [
    { _id: 10 },
    { _id: 11 }
  ]
}
From which I wish to extract one set of children and promote them to documents for later steps in the pipeline, i.e., to have as input documents:
{ _id: 10 },
{ _id: 11 }
Comments:

"What about { $limit: 1 } before $unwind? But is it possible that an object appears in the children array for other documents, not in the first document, and do you want those objects?"

"The children arrays will always be identical. (That probably is indicative that I should optimize the pipeline further.)"
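Assuming the children arrays really are always identical (as stated above), the $limit suggestion can be sketched as a full pipeline; only the matched documents change, so $limit: 1 must come after $match/$graphLookup and before $unwind:

```javascript
{ $match: { from: "some_root" } },
{ $graphLookup: { /* ... as above ... */ } },
{ $limit: 1 },                               // keep only one copy of the identical children arrays
{ $unwind: { path: "$children" } },
{ $replaceRoot: { newRoot: "$children" } }
```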