I have a transactions table that is populated with holidays taken by employees. I need help expressing the following SQL query in MongoDB.

select employee, month, year, count(distinct holiday_type) from
transactions group by employee, month, year

I need to use aggregation in MongoDB and wrote the query below, but it gives me the wrong result:

db.transactions.aggregate([
    { "$group": { 
        "_id": { 
            "Month": { "$month" : "$date" }, 
            "Year": { "$year" : "$date" },
            "employee" : "$employee",
            "holiday_type" : "$holiday_type"
        },
        "Count_of_Transactions" : { "$sum" : 1 }
     }}
 ]);

I am confused about how to implement count distinct logic in MongoDB. Any suggestions would be helpful.

1 Answer

You are part of the way there, but you need to get the "distinct" values for "holiday_type" first and then $group again:

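db.transactions.aggregate([
    // Stage 1: collapse to one document per distinct
    // employee / month / year / holiday_type combination
    { "$group": {
        "_id": {
            "employee": "$employee",
            "Month": { "$month": "$date" },
            "Year": { "$year": "$date" },
            "holiday_type": "$holiday_type"
        }
    }},
    // Stage 2: count the distinct holiday types per employee / month / year
    { "$group": {
        "_id": {
            "employee": "$_id.employee",
            "Month": "$_id.Month",
            "Year": "$_id.Year"
        },
        "count": { "$sum": 1 }
    }}
], { "allowDiskUse": true });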

That is the general process, as "distinct" in SQL is effectively a grouping operation in itself. So you need two $group stages in order to get the correct result.
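To make the two-stage idea concrete, here is a minimal plain JavaScript sketch that mimics both $group stages on an in-memory array. It is only an illustration, not how MongoDB executes the pipeline. The sample documents and their values are hypothetical; the field names (employee, date, holiday_type) are taken from the question, and the month and year are read in UTC.

```javascript
// Hypothetical sample documents; only the field names come from the question.
const transactions = [
  { employee: "alice", date: new Date("2014-03-03T00:00:00Z"), holiday_type: "annual" },
  { employee: "alice", date: new Date("2014-03-10T00:00:00Z"), holiday_type: "annual" },
  { employee: "alice", date: new Date("2014-03-17T00:00:00Z"), holiday_type: "sick" },
  { employee: "alice", date: new Date("2014-04-01T00:00:00Z"), holiday_type: "sick" },
  { employee: "bob", date: new Date("2014-03-05T00:00:00Z"), holiday_type: "annual" },
];

// Stage 1: keep one entry per distinct (employee, month, year, holiday_type).
const distinctKeys = new Set(
  transactions.map((t) =>
    JSON.stringify([
      t.employee,
      t.date.getUTCMonth() + 1, // getUTCMonth() is 0-based, $month is 1-based
      t.date.getUTCFullYear(),
      t.holiday_type,
    ])
  )
);

// Stage 2: drop holiday_type from the key and count the remaining entries.
const counts = new Map();
for (const key of distinctKeys) {
  const [employee, month, year] = JSON.parse(key);
  const groupKey = JSON.stringify({ employee: employee, Month: month, Year: year });
  counts.set(groupKey, (counts.get(groupKey) || 0) + 1);
}

// Shape the output like the aggregation result: { _id: {...}, count: n }.
const result = [...counts].map(([key, count]) => ({ _id: JSON.parse(key), count: count }));
console.log(result);
```

In this sample, the two "annual" entries for the same employee and month collapse into one key in stage 1, so they contribute only once to the count in stage 2.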

5 Comments

I get the following error when executing this query: assert: command failed: { "errmsg" : "exception: Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.", "code" : 16945, "ok" : 0 } : aggregate failed
@Karthi The memory usage limit for $group operations was reduced in MongoDB 2.6, but you may also have a particularly large collection. You can add "allowDiskUse" to counter that in the "options" section of the method arguments, as I have included in the edit. Also see the aggregate command manual page.
You are right. I didn't include allowDiskUse in the query. This is very helpful.
If I want to add a sum, for example of hours, which $group stage should contain it (say "total": { "$sum": "$hours" })? It needs to be grouped by month, year and employee.
@Karthi That sounds like another question and is probably better expressed as a question than a short comment. Please feel free to ask it.