Given these documents:

{
    values: [
        { attribute: 1 },
        { attribute: 2 },
        { attribute: 3 },
        { attribute: 4 },
    ]
},
{
    values: [
        { attribute: 2 },
        { attribute: 3 },
        { attribute: 4 },
    ]
},
{
    values: [
        { attribute: 2 },
        { attribute: 3 },
    ]
}

I'm trying to get the common "attribute" values:

[ 2, 3 ]

I've been looking at the aggregation framework, but so far I haven't found anything that does what I need.

I'm using MongoDB 2.4.6.

Thanks in advance for your answers!

EDIT

In fact, my documents can have duplicated attributes (but I want to count them only once per document).

Given this data

{
    values: [
        { attribute: 1 },
        { attribute: 2 },
        { attribute: 3 },
        { attribute: 3 },
        { attribute: 4 },
    ]
},
{
    values: [
        { attribute: 2 },
        { attribute: 2 },
        { attribute: 3 },
        { attribute: 4 },
    ]
},
{
    values: [
        { attribute: 2 },
        { attribute: 3 },
    ]
}

Then the query should return:

{
        "result" : [
                {
                        "values" : 2
                },
                {
                        "values" : 3
                }
        ],
        "ok" : 1
}

Anand, the query you posted counts attribute 2 four times instead of three, because the second document contains it twice. I tried to modify it, but the pipeline is still quite cryptic to me...

Thanks in advance.

  • What exactly do you mean by "common" attribute values? Do you want to find attributes that are present in every document in the collection? For example, 4 is present in the 1st and 2nd documents. Is that considered common? Commented Mar 27, 2014 at 15:51
  • Hello Anand. No, I mean the attribute must be present in all the documents. So, in this example, only attributes 2 and 3 should be selected. Commented Mar 27, 2014 at 19:28
  • I've posted a solution. Check if that's what you are looking for. Commented Mar 27, 2014 at 19:56

2 Answers

I'm not certain I understand your question completely, but I'm going to give it a shot.

If you want to find only the attributes that are present in every document in the collection, one approach is to get the document count in a separate query and then use an aggregation query like below.

db.collection.aggregate([
    // Unwind the values array
    { "$unwind" : "$values"}, 
    // Group by "values.attribute" and get the count for each
    { "$group" : {_id:"$values.attribute", count:{$sum:1}}}, 
    // Filter only those documents where count equals number of docs in the collection (i.e., 3)
    { "$match" : {count:3}}, // Replace 3 with document count
    // Project phase to make the result prettier and in the format you want
    { "$project" :{_id:0, values:"$_id"}}
])

This is the output you'll get when you run the above query:

{
        "result" : [
                {
                        "values" : 3
                },
                {
                        "values" : 2
                }
        ],
        "ok" : 1
}

I don't think this can be achieved in a single query though (i.e., without running a separate query for the document count). Maybe someone will post here if there's a better approach.
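As a rough sketch of how the two steps could be wired together, the pipeline can be built from whatever count the separate query returns. The helper name buildCommonAttributePipeline is made up for illustration and is not part of MongoDB; the commented shell usage assumes the collection is called "collection", as in the query above.

```javascript
// Hypothetical helper (not a MongoDB API): returns the same pipeline as above,
// with the hard-coded 3 replaced by the supplied document count.
function buildCommonAttributePipeline(docCount) {
    return [
        // Unwind the values array
        { $unwind: "$values" },
        // Group by "values.attribute" and get the count for each
        { $group: { _id: "$values.attribute", count: { $sum: 1 } } },
        // Keep only attributes whose count equals the number of documents
        { $match: { count: docCount } },
        // Reshape the output
        { $project: { _id: 0, values: "$_id" } }
    ];
}

// In the mongo shell (assumption: the collection is named "collection"):
//   var docCount = db.collection.count();
//   db.collection.aggregate(buildCommonAttributePipeline(docCount));
```

Note that the count and the aggregation are still two separate round trips, so a document inserted between them could skew the result.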

EDIT: For the edge case you have described, you can take advantage of the _id field that's present in every document and is unique across the collection by adding an additional $group phase including the _id:

db.collection.aggregate([
    // Unwind the values array
    { "$unwind" : "$values"}, 
    // Group by "_id" and "values.attribute" to pick just one element from the array per document
    { "$group" : {_id:{_id:"$_id", attrValue: "$values.attribute"}}},
    // Group by "values.attribute" and get the count for each
    { "$group" : {_id:"$_id.attrValue", count:{$sum:1}}}, 
    // Filter only those documents where count equals number of docs in the collection (i.e., 3)
    { "$match" : {count:3}}, // Replace 3 with document count
    // Project phase to make the result prettier and in the format you want
    { "$project" :{_id:0, values:"$_id"}}
])
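
To see why the extra $group stage matters, here is a plain JavaScript sketch that mimics the counting in memory on the sample data from the question's EDIT. It is only an illustration of the logic, not how MongoDB executes the pipeline; the _id values and the helper names countAttributes and commonAttributes are made up, and it assumes the attributes are numbers.

```javascript
// Sample documents from the question's EDIT (the _id values are invented here).
var docs = [
    { _id: 1, values: [{ attribute: 1 }, { attribute: 2 }, { attribute: 3 }, { attribute: 3 }, { attribute: 4 }] },
    { _id: 2, values: [{ attribute: 2 }, { attribute: 2 }, { attribute: 3 }, { attribute: 4 }] },
    { _id: 3, values: [{ attribute: 2 }, { attribute: 3 }] }
];

// Hypothetical helper: counts occurrences of each attribute.
// dedupe === false mirrors the first query (every array element counts);
// dedupe === true mirrors the second query (at most once per document).
function countAttributes(docs, dedupe) {
    var counts = {};
    docs.forEach(function (doc) {
        var seen = {};
        doc.values.forEach(function (v) {
            if (dedupe && seen[v.attribute]) { return; }
            seen[v.attribute] = true;
            counts[v.attribute] = (counts[v.attribute] || 0) + 1;
        });
    });
    return counts;
}

// Hypothetical helper: attributes whose per-document count equals the number of documents.
function commonAttributes(docs) {
    var counts = countAttributes(docs, true);
    return Object.keys(counts)
        .filter(function (key) { return counts[key] === docs.length; })
        .map(Number); // object keys are strings; assumes numeric attributes
}

console.log(countAttributes(docs, false)); // attribute 2 is counted 4 times
console.log(countAttributes(docs, true));  // attribute 2 is counted 3 times
console.log(commonAttributes(docs));       // [ 2, 3 ]
```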

2 Comments

Hello Anand. Thank you very much for your answer! It does what I need, but I forgot about an edge case that slightly changes the request. I've updated the initial question to show it. Regards.
I have added another query for the edge case. Hope it helps!

We came up with this solution:

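db.collection.aggregate([
    // Keep only the attribute values
    { $project: { "values.attribute": 1 } },
    // One document per array element
    { $unwind: "$values" },
    // Collect the distinct attributes of each document
    { $group: {
        _id: "$_id",
        attribute: { $addToSet: "$values.attribute" }
      }
    },
    // One document per distinct attribute
    { $unwind: "$attribute" },
    // Count the documents that contain each attribute
    { $group: { _id: "$attribute", count: { $sum: 1 } } },
    // Keep attributes present in every document (replace 3 with the document count)
    { $match: { count: 3 } }
])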

In our tests, $addToSet seems faster than the $group on a composite key.

Thank you very much, Anand. Your help was much appreciated!
