I have the following post documents:

{
   "_id" : ObjectId("56960cd909b0d8801d145543"),
   "title" : "Post title",
   "body" : "Post body"
}

{
   "_id" : ObjectId("56960cd909b0d8801d145544"),
   "_post": ObjectId("56960cd909b0d8801d145543"),
   "body" : "Comment one"
}

{
   "_id" : ObjectId("56960cd909b0d8801d145544"),
   "_post": ObjectId("56960cd909b0d8801d145543"),
   "body" : "Comment Two"
}

As you can see from the documents above, this is a flat list that holds both posts and comments (like SO). If a document has a _post field, it is a comment; otherwise it is the post itself.

When I query for post 56960cd909b0d8801d145543, I need MongoDB to return a response in the following shape:

// query
Post.aggregate({_id: ObjectId("56960cd909b0d8801d145543")});

// result 
{
   "_id" : ObjectId("56960cd909b0d8801d145543"),
   "title" : "Post title",
   "body" : "Post body",
   "comments": [{
      "_id" : ObjectId("56960cd909b0d8801d145544"),
      "_post": ObjectId("56960cd909b0d8801d145543"),
      "body" : "Comment one"
   },
   {
      "_id" : ObjectId("56960cd909b0d8801d145544"),
      "_post": ObjectId("56960cd909b0d8801d145543"),
      "body" : "Comment Two"
   }]
}

How should I construct the aggregation pipeline to get the result above?

1 Answer

The following pipeline should work for you:

var pipeline = [
    {
        "$project": {
            "title": 1, "body": 1, 
            "post_id": { "$ifNull": [ "$_post", "$_id" ] }
        }
    },  
    {
        "$group": {
            "_id": "$post_id",
            "title": { "$first": "$title" },
            "body": { "$first": "$body" },
            "comments": {
                "$push": {
                    "_id": "$_id",
                    "_post": "$post_id",
                    "body": "$body"
                }
            }
        }
    },
    {
        "$project": {
            "title": 1, "body": 1,
            "comments": {
                "$setDifference": [
                    {
                        "$map": {
                            "input": "$comments",
                            "as": "el",
                            "in": {
                                "$cond": [
                                    { "$ne": [ "$$el._id", "$$el._post" ] },
                                    "$$el",
                                    false
                                ]
                            }
                        }
                    },
                    [false]
                ]
            }
        }
    }
];

Post.aggregate(pipeline, function (err, result) {
    if (err) { /* handle error */ return; }
    console.log(result);
});
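
Note that the pipeline as written groups every post in the collection, while the question asks for one post. One option is to narrow the input first. The following is a minimal sketch, assuming a hypothetical helper named forSinglePost (it is not part of MongoDB or Mongoose) that prepends a $match stage selecting the post and its comments, plus a $sort on _post so that the post, which has no _post field and therefore sorts first, reaches $group ahead of its comments:

```javascript
// Hypothetical helper: prepends stages that restrict a pipeline to one post.
// postId is assumed to already be an ObjectId, not a string.
function forSinglePost(postId, stages) {
    return [
        // Keep the post itself and every comment that points at it.
        { "$match": { "$or": [ { "_id": postId }, { "_post": postId } ] } },
        // Documents without a _post field sort before those that have one,
        // so the post comes first and $first picks up its title and body.
        { "$sort": { "_post": 1 } }
    ].concat(stages);
}
```

It could then be used as Post.aggregate(forSinglePost(postId, pipeline), callback), where pipeline is the array defined above.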

The first stage, $project, adds a post_id field that serves as the group-by key in the next stage. Posts and comments live in the same flat collection: a comment carries its parent's id in _post, while the post itself has no such field. The $ifNull operator acts as a coalesce here. It returns $_post when that field exists and falls back to $_id otherwise, so the post and all of its comments end up with the same post_id.
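
To make the effect of this stage concrete, here is a rough plain-JavaScript imitation of it. It does not talk to MongoDB; the short string ids ("p1", "c1", "c2") are stand-ins for the ObjectIds in the question and are assumptions made for readability:

```javascript
// Plain-JavaScript illustration of the first $project stage (no MongoDB involved).
// String ids stand in for ObjectIds.
var docs = [
    { _id: "p1", title: "Post title", body: "Post body" },
    { _id: "c1", _post: "p1", body: "Comment one" },
    { _id: "c2", _post: "p1", body: "Comment Two" }
];

var projected = docs.map(function (doc) {
    return {
        _id: doc._id,
        title: doc.title,
        body: doc.body,
        // Same idea as { "$ifNull": [ "$_post", "$_id" ] }:
        // use _post when present, otherwise fall back to _id.
        post_id: doc._post != null ? doc._post : doc._id
    };
});

console.log(projected.map(function (d) { return d.post_id; }));
```

All three documents end up with the same post_id, which is what lets the next stage collect them into one group.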

The next stage, $group, groups the documents by post_id. It is similar to SQL's GROUP BY clause: every output field other than the grouping key has to be produced by an accumulator. Here the $push operator builds the comments array, and $first takes the title and body from the first document in each group. Note that $first depends on the order in which documents reach the stage, so it yields the post's title and body only when the post document comes before its comments.
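
As a rough illustration of what $first and $push do in this stage, the grouping can be imitated in plain JavaScript. This is only a sketch of the semantics, not how MongoDB implements it, and the string ids are assumed stand-ins for the ObjectIds:

```javascript
// Plain-JavaScript illustration of the $group stage (no MongoDB involved).
// Input mimics the output of the first $project; string ids stand in for ObjectIds.
var projected = [
    { _id: "p1", title: "Post title", body: "Post body", post_id: "p1" },
    { _id: "c1", body: "Comment one", post_id: "p1" },
    { _id: "c2", body: "Comment Two", post_id: "p1" }
];

var groups = {};
projected.forEach(function (doc) {
    var group = groups[doc.post_id];
    if (!group) {
        // $first: title and body come from the first document seen in the group.
        group = groups[doc.post_id] = {
            _id: doc.post_id,
            title: doc.title,
            body: doc.body,
            comments: []
        };
    }
    // $push: every document, including the post itself, is appended.
    group.comments.push({ _id: doc._id, _post: doc.post_id, body: doc.body });
});

console.log(JSON.stringify(groups["p1"], null, 2));
```

The comments array now has three entries, and the one that represents the post is recognisable because its _id equals its _post.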

The final stage filters the comments array to remove the entry for the post itself, which is not a comment. Because the post's post_id equals its own _id, it is the only element in the pushed array whose _id equals _post. The $map operator applies an expression to each element of an array and returns an array of the results; here it keeps each comment and replaces the post entry with false. The $setDifference operator then returns the elements that appear in the first set but not in the second (the relative complement), which strips out the false placeholder. Because $setDifference treats its inputs as sets, duplicate elements are collapsed and the order of the resulting array is not guaranteed.
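
If the server runs MongoDB 3.2 or later, the $filter operator may be a simpler way to write this last stage, since it works on the array directly instead of going through set semantics. This swaps in a different operator from the one used in the answer; the variable name finalStage is only illustrative:

```javascript
// Possible replacement for the last $project stage, assuming MongoDB 3.2+,
// where the $filter array operator is available.
var finalStage = {
    "$project": {
        "title": 1, "body": 1,
        "comments": {
            "$filter": {
                "input": "$comments",
                "as": "el",
                // Keep only elements whose _id differs from _post,
                // i.e. drop the entry that represents the post itself.
                "cond": { "$ne": [ "$$el._id", "$$el._post" ] }
            }
        }
    }
};
```

Unlike the $setDifference approach, $filter returns the matching elements in their original order and does not collapse duplicates.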

2 Comments

Best answer I've seen ever :)) Thanks!
@Erik No worries, happy to help :)
