I am using the following aggregation pipeline in MongoDB to traverse a two-way graph:

{ $match: { from: "some_root" } },
{
  $graphLookup: {
    from: "connections",
    startWith: "$to",
    connectFromField: "to",
    connectToField: "from",
    as: "children",
    depthField: "depth",
    maxDepth: 100
  },
}

With my data structure, this produces zero or more output documents, all of which have the same array of connection documents in their children field. I would like to process those connection documents further without handling the duplicates.

How can I select only the first document's children array and promote its elements to output documents for processing in the pipeline's next stages?

Adding the following stages results in what I want but with duplicate documents for each output document of the $graphLookup stage:

{ $unwind: { path: "$children" } },
{ $replaceRoot: { newRoot: "$children" } }

To me, this seems like the semantically correct approach; all that is missing is a preceding stage that "selects" the children field of the first document only.

I am aware that a $group stage might be able to filter out duplicates but that feels like a convoluted solution to a simple problem (i.e., why filter out duplicates if you can avoid having those duplicates in the first place).
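
For reference, this is roughly the $group-based deduplication I have in mind. It is only a sketch: the constant name dedupeWithGroup is a label for illustration, and it assumes every connection document has a unique _id.

```javascript
// Hypothetical name; these stages would follow the $graphLookup stage.
const dedupeWithGroup = [
  // One output document per element of each children array (duplicates included).
  { $unwind: { path: "$children" } },
  // Collapse the repeated children into one group per _id, keeping the first copy seen.
  { $group: { _id: "$children._id", doc: { $first: "$children" } } },
  // Promote the kept connection document to the top level.
  { $replaceRoot: { newRoot: "$doc" } },
];
```

Note that $group does not guarantee any particular output order, so a $sort would be needed afterwards if the order matters.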

Pseudo-example for clarification

The $graphLookup stage results in:

{
  _id: 1,
  children: [
    { _id: 10 },
    { _id: 11 }
  ]
},
{
  _id: 2,
  children: [
    { _id: 10 },
    { _id: 11 }
  ]
}

From this, I want to extract one set of children and promote them to documents for the later stages of the pipeline, i.e., to have as input documents:

{ _id: 10 },
{ _id: 11 }
  • Limit 1 ({ $limit: 1 }) before $unwind? But is it possible that an object appears in the children array of other documents but not in the first document, and if so, do you want those objects? Commented Jan 6 at 9:42
  • Wow, so simple. How could I not think of this :D Thank you very much! If you want to write that up as an answer, I'll accept. And to answer your question: No, in my case the children arrays will always be identical. (That probably indicates that I should optimize the pipeline further.) Commented Jan 6 at 16:13

1 Answer

Written up from my comment, as requested.

Add a $limit stage with a value of 1 before the $unwind stage. The question author only needs the objects in the children array of the first document returned by the previous stage, and the children arrays are identical across documents, so keeping that single document avoids the duplicates.

[
  ...,  // Previous stages
  { $limit: 1 },
  { $unwind: { path: "$children" } },
  { $replaceRoot: { newRoot: "$children" } }
]
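
To see why the $limit has to come before the $unwind, the effect of the three stages can be imitated on plain arrays. This is a minimal sketch in ordinary JavaScript, not MongoDB code: the helpers limit, unwindChildren, and replaceRootWithChildren are hypothetical stand-ins written for this illustration, and the input is the pseudo-example from the question.

```javascript
// Plain-JavaScript stand-ins for three aggregation stages, operating on arrays.
// These helpers are illustrative only; they are not part of any MongoDB driver.

// $limit: keep the first n documents.
const limit = (docs, n) => docs.slice(0, n);

// $unwind on "children": emit one copy of the document per array element.
// Like the default $unwind, documents with a missing or empty array are dropped.
const unwindChildren = (docs) =>
  docs.flatMap((doc) =>
    (doc.children ?? []).map((child) => ({ ...doc, children: child }))
  );

// $replaceRoot with newRoot "$children": promote the embedded document.
const replaceRootWithChildren = (docs) => docs.map((doc) => doc.children);

// Output of the $graphLookup stage from the question's pseudo-example.
const lookupOutput = [
  { _id: 1, children: [{ _id: 10 }, { _id: 11 }] },
  { _id: 2, children: [{ _id: 10 }, { _id: 11 }] },
];

const withoutLimit = replaceRootWithChildren(unwindChildren(lookupOutput));
const withLimit = replaceRootWithChildren(
  unwindChildren(limit(lookupOutput, 1))
);

console.log(withoutLimit.map((d) => d._id)); // [ 10, 11, 10, 11 ]
console.log(withLimit.map((d) => d._id)); // [ 10, 11 ]
```

Without the limit, every parent document contributes its own copy of each child; with it, only the first parent is unwound, so each child appears once.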