It seems that using $or with a sort does a full table scan and ignores my indexes on title and keywords. How can I get an $or query to use both indexes?

This query uses both the title and keywords indexes:

db.tasks.find({$or: [{keywords: /^japan/}, {title:/^japan/}]})

This one does a full scan and uses only my total_-1 index:

db.tasks.find({$or: [{keywords: /^japan/}, {title:/^japan/}]}).sort({total:-1})

Queries against keywords or title alone, with the same sort, do use the index on keywords or title respectively:

db.tasks.find({title:/^japan/}).sort({total:-1})
db.tasks.find({keywords:/^japan/}).sort({total:-1})

1 Answer

Sorting and indexes in Mongo are a complex topic. Mongo also raises an error when you sort without an index and the result set is too large. So it's good that you're asking about indexes, because an un-indexed sort will eventually start failing.

There is a bug in JIRA that seems to cover your issue; however, there are some extra details to consider.

The first thing to note is your last two queries:

db.tasks.find({title:/^japan/}).sort({total:-1})
db.tasks.find({keywords:/^japan/}).sort({total:-1})

These queries will eventually fail because you are indexing only on title (or keywords), not on title/total. Here's a script that demonstrates the problem:

> db.foo.ensureIndex({title:1})
> for(var i = 0; i < 100; i++) { db.foo.insert({title: 'japan', total: i}); }
> db.foo.count()
100
> db.foo.find({title: 'japan'}).sort({total:-1}).explain()
... uses BtreeCursor title_1
> // Now try with one million items
> for(var i = 0; i < 1000000; i++) { db.foo.insert({title: 'japan', total: i}); }
> db.foo.find({title: 'japan'}).sort({total:-1}).explain()
Sat Mar 31 05:57:41 uncaught exception: error: {
        "$err" : "too much data for sort() with no index.  add an index or specify a smaller limit",
        "code" : 10128
}

So if you plan to query & sort on title and total, then you need an index on both, in that order:

> db.foo.ensureIndex({title:1,total:1})
> db.foo.find({title: 'japan'}).sort({total:-1}).explain()
{
        "cursor" : "BtreeCursor title_1_total_1 reverse",
...
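
To see why the compound index removes the in-memory sort, here is a rough sketch in plain JavaScript (it runs in Node and does not touch MongoDB). It models a {title:1,total:1} index as an array kept sorted by title and then total; the names `entries` and `compareKeys` are made up for illustration, and this is only a toy model, not how MongoDB stores an index internally. The point is that the entries for one title form a single contiguous run already ordered by total, so walking that run backwards yields total descending, which matches the "reverse" in the explain output above.

```javascript
// Toy model of a {title: 1, total: 1} index: keys kept in sorted order.
// Illustration only; the names here are hypothetical.
function compareKeys(a, b) {
  if (a.title !== b.title) return a.title < b.title ? -1 : 1;
  return a.total - b.total;
}

var entries = [
  { title: 'korea', total: 7 },
  { title: 'japan', total: 3 },
  { title: 'china', total: 9 },
  { title: 'japan', total: 12 },
  { title: 'japan', total: 5 }
].sort(compareKeys);

// An equality match on title selects one contiguous run of the sorted array.
var first = entries.findIndex(function (e) { return e.title === 'japan'; });
var last = first;
while (last + 1 < entries.length && entries[last + 1].title === 'japan') last++;

// Walking the run backwards gives total descending with no extra sort step.
var totalsDesc = [];
for (var i = last; i >= first; i--) totalsDesc.push(entries[i].total);
```

Note that this only holds for an equality match on title. A prefix match such as /^japan/ can span several distinct titles, and across titles the run is no longer ordered by total.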

The JIRA bug I listed above is for something like the following:

> db.foo.find({$or: [{title:/^japan/}, {title:/^korea/}]}).sort({total:-1})

Yours is slightly different, but it will run into the same problem. Even if you have indexes on both title/total and keywords/total, MongoDB will not be able to use them optimally for an $or query with a sort.
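
One possible workaround, which is my own suggestion and not something taken from the JIRA ticket, is to run the two indexed queries separately, each sorted by total descending, and merge the results in application code. The helper below is only a sketch: `mergeByTotalDesc` is a hypothetical name, and it assumes each input array is already sorted by total descending and that every document has an `_id` whose string form is unique. Documents that match both clauses are emitted once.

```javascript
// Merge two arrays that are each already sorted by total descending.
// Documents present in both inputs (same _id) are emitted only once.
function mergeByTotalDesc(a, b) {
  var out = [];
  var seen = Object.create(null); // string form of _id -> true
  var i = 0, j = 0;

  function push(doc) {
    var key = String(doc._id);
    if (!seen[key]) {
      seen[key] = true;
      out.push(doc);
    }
  }

  // Standard two-way merge: always take the larger total next.
  while (i < a.length && j < b.length) {
    if (a[i].total >= b[j].total) push(a[i++]);
    else push(b[j++]);
  }
  while (i < a.length) push(a[i++]);
  while (j < b.length) push(b[j++]);
  return out;
}
```

In the shell, the inputs could come from something like db.tasks.find({title:/^japan/}).sort({total:-1}).limit(n).toArray() and the same query on keywords. Taking the first n items of the merged result then gives the top n overall. Each individual query is still subject to the sort limit described above unless it has a suitable index.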

2 Comments

If you have an index on {total:-1, title:1} and sort by {total:-1, title:1}, does it at least make the index scan more efficient?
@drogon what are you filtering by? Is there a find() clause in there? The JIRA bug is very specific to using $or clause with sorting.
