I have a transactions table that is populated with holidays taken by employees. I need help expressing the following SQL query in MongoDB.

select employee, month, year, count(distinct holiday_type), sum(hours) from transactions group by employee, month, year

I started using MongoDB a couple of weeks ago. I got a partial answer from the Stack Overflow post "Mongodb count distinct with multiple group fields", and now I am looking to add a sum function.

Any guidance would be really helpful. Here is a sample of the data in table form:

Employee    date      holiday_type  hours
1           1/1/2014  1             8 
1           1/5/2014  2             7 
1           2/15/2014 1             8 
1           3/15/2014 3             16 
11          1/1/2014  1             8 
11          1/5/2014  1             6 
11          2/15/2014 3             8 
11          3/15/2014 3             8
  • What exactly is hours? Where does that come from? Can you perhaps show some sample data so we have an idea? Commented May 14, 2014 at 11:08

1 Answer

So "hours" is actually a field (property) in your documents to begin with. Building on the previous answer, you just extend the double grouping as follows:

db.transactions.aggregate([
    // Stage 1: one document per employee/month/year/holiday_type, summing hours
    { "$group": {
        "_id": {
            "employee": "$employee",
            "Month": { "$month": "$date" },
            "Year": { "$year": "$date" },
            "holiday_type": "$holiday_type"
        },
        "hours": { "$sum": "$hours" }
    }},
    // Stage 2: drop holiday_type from the key; each incoming document is one distinct type
    { "$group": {
        "_id": {
            "employee": "$_id.employee",
            "Month": "$_id.Month",
            "Year": "$_id.Year"
        },
        "count": { "$sum": 1 },
        "hours": { "$sum": "$hours" }
    }}
], { "allowDiskUse": true });

So you are simply using $sum in both stages.
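To see why two groupings give a distinct count plus a total, here is a minimal in-memory sketch of the same two stages in plain JavaScript, using the sample rows from the question. It is an illustration only, not how MongoDB executes the pipeline. The `groupBy` helper is a hypothetical name, not a MongoDB API, and the month and year are read in UTC on the assumption that this matches what `$month` and `$year` return for these dates.

```javascript
// Sample rows from the question (dates parsed as UTC midnight).
const transactions = [
  { employee: 1,  date: new Date("2014-01-01"), holiday_type: 1, hours: 8 },
  { employee: 1,  date: new Date("2014-01-05"), holiday_type: 2, hours: 7 },
  { employee: 1,  date: new Date("2014-02-15"), holiday_type: 1, hours: 8 },
  { employee: 1,  date: new Date("2014-03-15"), holiday_type: 3, hours: 16 },
  { employee: 11, date: new Date("2014-01-01"), holiday_type: 1, hours: 8 },
  { employee: 11, date: new Date("2014-01-05"), holiday_type: 1, hours: 6 },
  { employee: 11, date: new Date("2014-02-15"), holiday_type: 3, hours: 8 },
  { employee: 11, date: new Date("2014-03-15"), holiday_type: 3, hours: 8 },
];

// Hypothetical helper: group rows by a key object and accumulate per group.
function groupBy(rows, keyOf, init, step) {
  const groups = new Map();
  for (const row of rows) {
    const key = keyOf(row);
    const id = JSON.stringify(key); // stable because keyOf builds keys in a fixed order
    if (!groups.has(id)) groups.set(id, { _id: key, ...init() });
    step(groups.get(id), row);
  }
  return [...groups.values()];
}

// Stage 1: one entry per employee/month/year/holiday_type, summing hours.
const stage1 = groupBy(
  transactions,
  (t) => ({
    employee: t.employee,
    Month: t.date.getUTCMonth() + 1,
    Year: t.date.getUTCFullYear(),
    holiday_type: t.holiday_type,
  }),
  () => ({ hours: 0 }),
  (acc, t) => { acc.hours += t.hours; }
);

// Stage 2: drop holiday_type from the key. Every stage-1 entry stands for one
// distinct holiday_type, so counting entries gives count(distinct holiday_type).
const stage2 = groupBy(
  stage1,
  (g) => ({ employee: g._id.employee, Month: g._id.Month, Year: g._id.Year }),
  () => ({ count: 0, hours: 0 }),
  (acc, g) => { acc.count += 1; acc.hours += g.hours; }
);

console.log(stage2);
```

Employee 11 in January shows the point: the two rows share holiday_type 1, so stage 1 merges them into a single entry with 14 hours, and stage 2 counts that entry once.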

Additionally, it is worthwhile to take a look at the SQL to Aggregation Mapping Chart in the official documentation. It has many examples of common SQL operations and how to implement them in a MongoDB way.
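As a possible alternative to the double grouping (a different technique from the one in this answer), the distinct values can be collected with `$addToSet` in a single `$group` and then counted with `$size` in a `$project` stage. This is a sketch that assumes MongoDB 2.6 or later, where `$size` is available as an aggregation operator. The field name `types` is made up for illustration. The pipeline is built as a plain array so it can be inspected before being passed to `aggregate`.

```javascript
// Alternative sketch: one $group plus a $project, instead of two $group stages.
const pipeline = [
  { "$group": {
    "_id": {
      "employee": "$employee",
      "Month": { "$month": "$date" },
      "Year": { "$year": "$date" }
    },
    // "types" is a hypothetical field name holding the distinct holiday types
    "types": { "$addToSet": "$holiday_type" },
    // total hours per employee/month/year
    "hours": { "$sum": "$hours" }
  }},
  { "$project": {
    // the size of the set corresponds to count(distinct holiday_type)
    "count": { "$size": "$types" },
    "hours": 1
  }}
];

// In the mongo shell this would be run as:
//   db.transactions.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

The trade-off is that the set of distinct values is held in each group document, so the double `$group` may be the safer choice when a group can contain a very large number of distinct values.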


Using your own data, inserted by me like this:

db.transactions.insert([
    { "employee": 1,  "date": new Date("2014-01-01"), "holiday_type":  1, "hours": 8   },
    { "employee": 1,  "date": new Date("2014-01-05"), "holiday_type":  2, "hours": 7   },
    { "employee": 1,  "date": new Date("2014-02-15"), "holiday_type":  1, "hours": 8   },
    { "employee": 1,  "date": new Date("2014-03-15"), "holiday_type":  3, "hours": 16  },
    { "employee": 11, "date": new Date("2014-01-01"), "holiday_type":  1, "hours": 8   },
    { "employee": 11, "date": new Date("2014-01-05"), "holiday_type":  1, "hours": 6   },
    { "employee": 11, "date": new Date("2014-02-15"), "holiday_type":  1, "hours": 8   },
    { "employee": 11, "date": new Date("2014-03-15"), "holiday_type":  3, "hours": 8   }
])

It is not the best example, since only January has more than one row per employee, but it does show the "distinct" count on "holiday_type" when grouping this way. The result is:

{
    "_id" : {
            "employee" : 1,
            "Month" : 2,
            "Year" : 2014
    },
    "count" : 1,
    "hours" : 8
}
{
    "_id" : {
            "employee" : 11,
            "Month" : 2,
            "Year" : 2014
    },
    "count" : 1,
    "hours" : 8
}
{
    "_id" : {
            "employee" : 1,
            "Month" : 1,
            "Year" : 2014
    },
    "count" : 2,
    "hours" : 15
}
{
    "_id" : {
            "employee" : 11,
            "Month" : 1,
            "Year" : 2014
    },
    "count" : 1,
    "hours" : 14
}
{
    "_id" : {
            "employee" : 1,
            "Month" : 3,
            "Year" : 2014
    },
    "count" : 1,
    "hours" : 16
}
{
    "_id" : {
            "employee" : 11,
            "Month" : 3,
            "Year" : 2014
    },
    "count" : 1,
    "hours" : 8
}

5 Comments

Thanks @Neil. I tried this, but I am getting zeros in the hours column. The sample data set is: { "employee" : "Karthick", "holiday_type" : 1, "hourrs" : 8, "date" : "2009-01-01" }, { "employee" : "Karthick11", "holiday_type" : 1, "hourrs" : 8, "date" : "2009-01-01" }, { "employee" : "Karthick12", "holiday_type" : 1, "hourrs" : 8, "date" : "2009-01-01" }
@Karthi your "hourrs" field is spelled differently (and incorrectly), so you cannot just "cut and paste"; you need to look at the differences. Also, this really should have been an edit to your question, as I have already done (or attempted from your previous comments). So please edit your question in the future rather than posting additional details (especially data) in comments. But I do think the general question is worthwhile for demonstrating the concept, so you got an upvote for that.
Thanks for the suggestion. In my real data the field is named "hours". I populated the sample data by hand and misspelled it. What else could have gone wrong?
@Karthi you clearly are doing something wrong in your setup. See the additional information I have provided in the answer. The query as shown works as expected.
This is working now. The mistakes were in my query. Thanks a lot, Neil. I understand the logic clearly now.
