Situation: I have a collection with a huge number of documents produced by a map-reduce (aggregation). Documents in the collection look like this:
/* 0 */
{
    "_id" : {
        "appId" : ObjectId("1"),
        "timestamp" : ISODate("2014-04-12T00:00:00.000Z"),
        "name" : "GameApp",
        "user" : "[email protected]",
        "type" : "game"
    },
    "value" : {
        "count" : 2
    }
}
/* 1 */
{
    "_id" : {
        "appId" : ObjectId("2"),
        "timestamp" : ISODate("2014-04-29T00:00:00.000Z"),
        "name" : "ScannerApp",
        "user" : "[email protected]",
        "type" : "game"
    },
    "value" : {
        "count" : 5
    }
}
...
And I search inside this collection with the aggregation framework:
db.myCollection.aggregate([match, project, group, sort, skip, limit]); // the aggregation can return results on a daily or monthly time base depending on the user's search criteria, with pagination etc...
Possible search criteria:
1. {appId, timestamp, name, user, type}
2. {appId, timestamp}
3. {name, user}
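For illustration, a pipeline for criterion 2 ({appId, timestamp}) might look roughly like the sketch below. The concrete stage contents (date range, grouping key, page size) are assumptions for illustration, not the exact pipeline I use; the appId is the placeholder value from the example documents above.

// Rough sketch of a daily-count pipeline for search criterion 2 ({appId, timestamp}).
db.myCollection.aggregate([
    { $match: {
        "_id.appId": ObjectId("1"),   // placeholder ObjectId, as in the example documents
        "_id.timestamp": {
            $gte: ISODate("2014-04-01T00:00:00Z"),
            $lt:  ISODate("2014-05-01T00:00:00Z")
        }
    } },
    { $project: { day: "$_id.timestamp", count: "$value.count" } }, // keep only what we need
    { $group: { _id: "$day", total: { $sum: "$count" } } },         // one bucket per day
    { $sort: { _id: 1 } },
    { $skip: 0 },
    { $limit: 20 }
]);

Note that only a $match (or $sort) at the very start of the pipeline can take advantage of an index, which is why the filter on the _id subfields comes first.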
I'm getting the correct result, exactly what I need. But from an optimisation point of view I have doubts about indexing.
Questions:
- Is it possible to create indexes for such a collection?
- How can I create indexes for documents with such a complex _id field?
- How can I do the analog of db.collection.find().explain() to verify which index is used?
- And is it a good idea to index such a collection, or is this just performance paranoia?
Answer summary:
- MongoDB creates an index on the _id field automatically, but that is useless in the case of a complex _id field like in the example. For a field like _id: {name: "", timestamp: ""} you must create an index such as db.myCollection.ensureIndex({"_id.name": 1, "_id.timestamp": 1}); only after that will your collection be properly indexed on the _id field (see the index sketch after this list).
- For tracking how your indexes are used by the aggregation framework you cannot use db.myCollection.aggregate().explain(); the proper way of doing that is:
db.runCommand({
    aggregate: "collection_name",
    pipeline: [match, proj, group, sort, skip, limit],
    explain: true
})
- My testing on a local computer shows that such indexing seems to be a good idea, but this requires more testing with big collections.
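For the first point above, a minimal sketch of indexes that would cover the three search criteria, assuming the field names from the example documents (db.collection.createIndex() is the newer equivalent of ensureIndex()):

// Compound index on the _id subfields used by criterion 1;
// its {appId, timestamp} prefix also serves criterion 2.
db.myCollection.ensureIndex({
    "_id.appId": 1,
    "_id.timestamp": 1,
    "_id.name": 1,
    "_id.user": 1,
    "_id.type": 1
});

// Separate index for criterion 3 ({name, user}), since that query
// does not start with appId and cannot use a prefix of the index above.
db.myCollection.ensureIndex({ "_id.name": 1, "_id.user": 1 });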