I'm having performance problems when using the MongoDB Aggregation Framework through C#. An aggregation which works fast through Mongo shell takes forever when executed with C#.
Before trying to call the framework through C#, I executed the following aggregation through Mongo shell to check that everything works:
db.runCommand(
{
aggregate: "actions",
pipeline :
[
{ $match : { CustomerAppId : "f5357224-b1a8-4f1a-8ea2-a06a00ca597a", ActionName : "install"}},
{ $group : { _id : { CustomerAppId:"$CustomerAppId",ActionDate:"$ActionDate" }, count : { $sum : 1 } }}
]
});
The script executed in < 500ms and returns the expected around 200 results (The CustomerAppId is defined as a string in the database. It's not possible to use GUIDs with aggregation framework.).
Then, I ported the same script to C#:
var pipeline = new BsonArray
{
new BsonDocument
{
{
"$match",
new BsonDocument
{
{"CustomerAppId", "f5357224-b1a8-4f1a-8ea2-a06a00ca597a"},
{"ActionName", "install"}
}
},
{ "$group",
new BsonDocument
{
{ "_id", new BsonDocument
{
{
"CustomerAppId","$CustomerAppId"
},
{
"ActionName","$ActionName"
}
}
},
{
"Count", new BsonDocument
{
{
"$sum", 1
}
}
}
}
}
}
};
var command = new CommandDocument
{
{ "aggregate", "actions" },
{ "pipeline", pipeline }
};
(Please let me know if there's an easier way to write the aggregation in C# :) )
Which I'm executing like this:
var result = db.RunCommand(command);
The problem is that it kills the server: The CPU and mem usage go way up. When I check db.currentOp(), I can see the aggregate operation but I eventually have to kill it using db.killOp(1281546):
"opid" : 1281546,
"active" : true,
"secs_running" : 294,
"op" : "query",
"ns" : "database.actions",
"query" : {
"aggregate" : "actions",
"pipeline" : [
{
"$match" : {
"CustomerAppId" : "f5357224-b1a8-4f1a-8ea2-a06a00ca597a",
"ActionName" : "install"
},
"$group" : {
"_id" : {
"CustomerAppId" : "$CustomerAppId",
"ActionName" : "$ActionName"
},
"Count" : {
"$sum" : 1
}
}
}
]
},
To me the operation looks completely fine and similar to the script I run directly from mongo shell. It feels like running the aggregation through C# causes the MongoDB to miss the index and it's doing a table scan for all the ~6 million documents in the collection.
Any ideas?
Update: Logs
Thanks to cirrus' suggestion, I enabled the verbose logging and then used tail to get the queries. And they are different! So I think there is something wrong in my C# port. Any ideas on how to format the query correctly?
The query when executed through shell:
Mon Oct 8 15:00:13 [conn1] run command database.$cmd { aggregate: "actions", pipeline: [ { $match: { CustomerAppId: "f5357224-b1a8-4f1a-8ea2-a06a00ca597a", ActionName: "install" } }, { $group: { _id: { CustomerAppId: "$CustomerAppId", ActionDate: "$ActionDate" }, count: { $sum: 1.0 } } } ] }
Mon Oct 8 15:00:13 [conn1] command database.$cmd command: { aggregate: "actions", pipeline: [ { $match: { CustomerAppId: "f5357224-b1a8-4f1a-8ea2-a06a00ca597a", ActionName: "install" } }, { $group: { _id: { CustomerAppId: "$CustomerAppId", ActionDate: "$ActionDate" }, count: { $sum: 1.0 } } } ] } ntoreturn:1 keyUpdates:0 locks(micros) r:27944 reslen:12705 29ms
And the query when executed through C#:
Mon Oct 8 15:00:16 [conn8] run command database.$cmd { aggregate: "actions", pipeline: [ { $match: { CustomerAppId: "f5357224-b1a8-4f1a-8ea2-a06a00ca597a", ActionName: "install" }, $group: { _id: { CustomerAppId: "$CustomerAppId", ActionDate: "$ActionDate" }, Count: { $sum: 1 } } } ] }
Second line is missing, I suppose because the query doesn't finish.
And here are the logs again for easier comparison. Script is up, C# down:
Mon Oct 8 15:00:13 [conn1] run command database.$cmd { aggregate: "actions", pipeline: [ { $match: { CustomerAppId: "f5357224-b1a8-4f1a-8ea2-a06a00ca597a", ActionName: "install" } }, { $group: { _id: { CustomerAppId: "$CustomerAppId", ActionDate: "$ActionDate" }, count: { $sum: 1.0 } } } ] }
Mon Oct 8 15:00:16 [conn8] run command database.$cmd { aggregate: "actions", pipeline: [ { $match: { CustomerAppId: "f5357224-b1a8-4f1a-8ea2-a06a00ca597a", ActionName: "install" }, $group: { _id: { CustomerAppId: "$CustomerAppId", ActionDate: "$ActionDate" }, Count: { $sum: 1 } } } ] }