I have a set of ~35k documents that look like this:
{
"_id" : ObjectId("583dabfc7572394f93ac6ef2"),
"updatedAt" : ISODate("2016-11-29T16:25:32.130Z"),
"createdAt" : ISODate("2016-11-29T16:25:32.130Z"),
"sourceType" : "report",
"sourceRef" : ObjectId("583da865686e3dfbd977f059"),
"type" : "video",
"caption" : "lorem ipsum",
"timestamps" : {
"postedAt" : ISODate("2016-08-26T15:09:35.000Z"),
"monthOfYear" : 7, // 0-based
"dayOfWeek" : 5, // 0-based
"hourOfDay" : 16 // 0-based
},
"stats" : {
"comments" : 0,
"likes" : 8
},
"user" : {
"id" : "123456",
"username" : "johndoe",
"fullname" : "John",
"picture" : ""
},
"images" : {
"thumbnail" : "",
"low" : "",
"standard" : ""
},
"mentions" : [
"janedoe"
],
"tags" : [
"holiday",
"party"
],
"__v" : 0
}
I want to produce an aggregate report that will be used to graph the frequency of documents by hour of day, day of week, and month of year, along with counts for each mention and tag:
{
// Each frequency is independent of the others,
// i.e. the counts within each frequency should
// sum to ~35k.
dayFrequency: [
{ day: 0, count: 1400 }, // Sunday (the sample document stores a Friday as 5)
{ day: 1, count: 1700 }, // Monday
{ day: 2, count: 1800 }, // Tuesday
{ /* etc */ },
{ day: 6, count: 1200 } // Saturday
],
monthFrequency: [
{ month: 0, count: 200 }, // January
{ month: 1, count: 250 }, // February
{ month: 2, count: 300 }, // March
{ /* etc */ },
{ month: 11, count: 150 } // December
],
hourFrequency: [
{ hour: 0, count: 150 }, // 0am
{ hour: 1, count: 200 }, // 1am
{ hour: 2, count: 275 }, // 2am
{ /* etc */ },
{ hour: 23, count: 150 }, // 11pm
],
mentions: {
janedoe: 12,
johnsmith: 11,
peter: 54,
/* and so on */
},
tags: {
holiday: 872,
party: 1029,
/* and so on */
}
}
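To make the target concrete, here is a minimal in-memory sketch of the report described above, assuming the matching documents have already been loaded into a plain array. The names `countBy`, `toFrequency` and `buildReport` are hypothetical helpers made up for illustration; they are not part of Mongoose or MongoDB, and this is not meant as the final solution, only as a statement of the intended semantics.

```javascript
// Hypothetical helper: tallies how often each value occurs.
function countBy(values) {
  const counts = new Map();
  for (const value of values) {
    counts.set(value, (counts.get(value) || 0) + 1);
  }
  // Convert to a plain object such as { holiday: 1, party: 2 }.
  return Object.fromEntries(counts);
}

// Hypothetical helper: turns a { key: count } tally with numeric keys into
// a sorted array such as [{ day: 2, count: 1 }, { day: 5, count: 1 }].
function toFrequency(counts, label) {
  return Object.keys(counts)
    .map(Number)
    .sort((a, b) => a - b)
    .map((key) => ({ [label]: key, count: counts[key] }));
}

// Every document contributes exactly once to each of the three frequencies,
// and once per array element to the mention and tag counts.
function buildReport(docs) {
  return {
    dayFrequency: toFrequency(countBy(docs.map((d) => d.timestamps.dayOfWeek)), 'day'),
    monthFrequency: toFrequency(countBy(docs.map((d) => d.timestamps.monthOfYear)), 'month'),
    hourFrequency: toFrequency(countBy(docs.map((d) => d.timestamps.hourOfDay)), 'hour'),
    mentions: countBy(docs.flatMap((d) => d.mentions)),
    tags: countBy(docs.flatMap((d) => d.tags)),
  };
}

// Two made-up documents, reduced to the fields the report reads.
const sample = [
  { timestamps: { monthOfYear: 7, dayOfWeek: 5, hourOfDay: 16 }, mentions: ['janedoe'], tags: ['holiday', 'party'] },
  { timestamps: { monthOfYear: 7, dayOfWeek: 2, hourOfDay: 16 }, mentions: [], tags: ['party'] },
];
const report = buildReport(sample);
console.log(JSON.stringify(report, null, 2));
```

Note that this sketch only lists days, months and hours that actually occur in the data; buckets with no documents are simply absent rather than reported with a count of 0.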
Is this possible, and if so, how would I write it? As I understand it, since I am aggregating over all matching documents, everything would effectively fall into a single group?
My code so far simply groups all matching records into one group, but I'm not sure how to move forward.
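const mongoose = require('mongoose');

Model.aggregate([
  // aggregate() does not cast pipeline values against the schema, so the
  // ObjectId has to be constructed explicitly for the match to succeed.
  { $match: {
    sourceType: 'report',
    sourceRef: new mongoose.Types.ObjectId('583da865686e3dfbd977f059')
  }},
  { $group: {
    _id: '$sourceRef'
  }}
], (err, res) => {
  console.log(err);
  console.log(res);
});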
Counting the frequencies as a plain array of counts (e.g. [ 1400, 1700, 1800, /* etc */ 1200 ]) would also be acceptable. That led me to look at $count and a few of the other operators, but again I'm not clear on how to use them.
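One direction that might fit, assuming MongoDB 3.4 or later (where the `$facet` and `$sortByCount` stages are available), is to run several independent sub-pipelines over the same matched documents. The sketch below only builds the pipeline as a plain array; `frequencyFacet`, `arrayCountFacet` and `buildReportPipeline` are hypothetical helper names, and the pipeline has not been run against the real collection.

```javascript
// Hypothetical helper: one $facet sub-pipeline that counts documents per
// distinct value of `field` and renames `_id` to `label`.
function frequencyFacet(field, label) {
  return [
    { $group: { _id: '$' + field, count: { $sum: 1 } } },
    { $project: { _id: 0, [label]: '$_id', count: 1 } },
    { $sort: { [label]: 1 } },
  ];
}

// Hypothetical helper: counts occurrences of each element of an array field.
function arrayCountFacet(field) {
  return [
    { $unwind: '$' + field },
    { $sortByCount: '$' + field }, // emits { _id: <value>, count: <n> }
  ];
}

// `sourceRef` is expected to be an ObjectId supplied by the caller.
function buildReportPipeline(sourceRef) {
  return [
    { $match: { sourceType: 'report', sourceRef: sourceRef } },
    { $facet: {
      dayFrequency: frequencyFacet('timestamps.dayOfWeek', 'day'),
      monthFrequency: frequencyFacet('timestamps.monthOfYear', 'month'),
      hourFrequency: frequencyFacet('timestamps.hourOfDay', 'hour'),
      mentions: arrayCountFacet('mentions'),
      tags: arrayCountFacet('tags'),
    } },
  ];
}
```

The result of passing this to `Model.aggregate(...)` should be a single document whose fields are the five arrays. In this sketch `mentions` and `tags` come back as arrays of `{ _id, count }` rather than as objects keyed by name, so the conversion to `{ janedoe: 12, ... }` would be left to application code, and hours, days or months with no documents would not appear at all.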