We've recently decided to revisit some of our MongoDB indexes and came across a peculiar result when using a compound index which contains a multikey part.
It's important to note that we're using v2.4.5.
TLDR: When using a compound index with a multikey part, one of the two bounds of a range restriction on a non-multikey field is dropped.
I'll explain the problem with an example:
Create some data
db.demo.insert(
[{ "foo" : 1, "attr" : [ { "name" : "a" }, { "name" : "b" }, { "name" : "c" } ]},
{ "foo" : 2, "attr" : [ { "name" : "b" }, { "name" : "c" }, { "name" : "d" } ]},
{ "foo" : 3, "attr" : [ { "name" : "c" }, { "name" : "d" }, { "name" : "e" } ]},
{ "foo" : 4, "attr" : [ { "name" : "d" }, { "name" : "e" }, { "name" : "f" } ]}])
Index
db.demo.ensureIndex({'attr.name': 1, 'foo': 1})
Query & Explain
Query on 'attr.name' but constrain the range of the non-multikey field 'foo':
db.demo.find({foo: {$lt:3, $gt: 1}, 'attr.name': 'c'}).hint('attr.name_1_foo_1').explain()
{
    "cursor" : "BtreeCursor attr.name_1_foo_1",
    "isMultiKey" : true,
    "n" : 1,
    "nscannedObjects" : 2,
    "nscanned" : 2,
    "nscannedObjectsAllPlans" : 2,
    "nscannedAllPlans" : 2,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "attr.name" : [
            [
                "c",
                "c"
            ]
        ],
        "foo" : [
            [
                -1.7976931348623157e+308,
                3
            ]
        ]
    }
}
As you can see, the range of 'foo' is not as defined in the query: the lower end ($gt: 1) is completely ignored, which results in nscanned being larger than it should be.
Changing the order of the range operands changes which end is dropped:
db.demo.find({foo: {$gt: 1, $lt:3}, 'attr.name': 'c'}).hint('attr.name_1_foo_1').explain()
{
    "cursor" : "BtreeCursor attr.name_1_foo_1",
    "isMultiKey" : true,
    "n" : 1,
    "nscannedObjects" : 2,
    "nscanned" : 2,
    "nscannedObjectsAllPlans" : 2,
    "nscannedAllPlans" : 2,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "attr.name" : [
            [
                "c",
                "c"
            ]
        ],
        "foo" : [
            [
                1,
                1.7976931348623157e+308
            ]
        ]
    }
}
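To make the cost of the dropped bound concrete, here is a small plain-JavaScript simulation of the scan over the demo data. It is only a sketch: `buildIndexKeys` and `countScanned` are hypothetical helpers, not MongoDB internals, and it assumes the scan visits every index key inside the reported bounds, with the `$lt`/`$gt` ends treated as exclusive.

```javascript
// The four demo documents inserted above.
const docs = [
  { foo: 1, attr: [{ name: "a" }, { name: "b" }, { name: "c" }] },
  { foo: 2, attr: [{ name: "b" }, { name: "c" }, { name: "d" }] },
  { foo: 3, attr: [{ name: "c" }, { name: "d" }, { name: "e" }] },
  { foo: 4, attr: [{ name: "d" }, { name: "e" }, { name: "f" }] },
];

// Hypothetical helper: emit one (name, foo) key per array element,
// mimicking the entries of a multikey index on {'attr.name': 1, foo: 1}.
function buildIndexKeys(documents) {
  const keys = [];
  for (const doc of documents) {
    for (const item of doc.attr) {
      keys.push({ name: item.name, foo: doc.foo });
    }
  }
  return keys;
}

// Hypothetical helper: count keys with the given name whose foo lies
// strictly between lower and upper (assumption: exclusive bounds).
function countScanned(keys, name, lower, upper) {
  return keys.filter(
    (k) => k.name === name && k.foo > lower && k.foo < upper
  ).length;
}

const keys = buildIndexKeys(docs);
const lowerDropped = countScanned(keys, "c", -Infinity, 3); // first explain
const upperDropped = countScanned(keys, "c", 1, Infinity);  // second explain
const bothApplied = countScanned(keys, "c", 1, 3);          // expected bounds

console.log({ lowerDropped, upperDropped, bothApplied });
```

Under these assumptions each one-sided scan visits 2 keys, which matches the nscanned of 2 in both explain outputs, while a scan that honoured both bounds would visit only 1.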
We're either missing out on some multikey index basics, or we're facing a bug.
We've gone through similar topics, including:
- https://groups.google.com/forum/#!searchin/mongodb-user/multikey$20bounds/mongodb-user/RKrsyzRwHrE/_i0SxdJV5qcJ
- Order of $lt and $gt in MongoDB range query
Unfortunately, these posts address a different use case, where the range is set on the multikey field itself.
Other things we've tried to do:
- Change the compound index ordering, starting with the non-multikey field.
- Put the 'foo' value inside each of the subdocuments in the 'attr' array, index by ('attr.name', 'attr.foo') and do an $elemMatch on 'attr' with a range constraint on 'foo'.
- Use an $and operator when defining the range:
  db.demo.find({'attr.name': 'c', $and: [{foo: {$lt: 3}}, {foo: {$gt: 1}}]})
- Use MongoDB v2.5.4
None of the above had any effect (v2.5.4 made things worse by dumping both ends of the range completely).
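For readers who want to reproduce the $elemMatch attempt, the reshaping it describes could look roughly like the sketch below. The helper name `embedFoo`, the choice to keep the top-level 'foo', and the exact query shape are assumptions made for illustration, not the exact code that was run.

```javascript
// Hypothetical helper: copy the top-level foo into every attr subdocument.
function embedFoo(doc) {
  return {
    foo: doc.foo, // kept at the top level as well (assumption)
    attr: doc.attr.map((item) => ({ name: item.name, foo: doc.foo })),
  };
}

// One of the demo documents, before and after reshaping.
const original = { foo: 2, attr: [{ name: "b" }, { name: "c" }, { name: "d" }] };
const reshaped = embedFoo(original);

// Compound index over the two subdocument fields.
const indexSpec = { "attr.name": 1, "attr.foo": 1 };

// Both conditions must hold for the same array element.
const query = {
  attr: { $elemMatch: { name: "c", foo: { $gt: 1, $lt: 3 } } },
};

// In the mongo shell these would be used along the lines of:
//   db.demo.insert(reshaped)
//   db.demo.ensureIndex(indexSpec)
//   db.demo.find(query).explain()
console.log(JSON.stringify(reshaped));
console.log(JSON.stringify(query));
```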
Any kind of help would be highly appreciated!
Many Thanks,
Roi