I have a problem with very slow distinct commands that include a query. From what I have observed, the distinct command only makes use of an index if you do not specify a query.

I have created a test database on my MongoDB 3.0.10 server with 1 million documents. Each document looks as follows:

{
    "_id" : ObjectId("56e7fb5303858265f53c0ea1"),
    "field1" : "field1_6",
    "field2" : "field2_10",
    "field3" : "field3_29",
    "field4" : "field4_64"
}

The number at the end of each field value is a random integer from 0 to 99.
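For anyone who wants to reproduce the setup, a data set of this shape could be generated roughly as follows. This is only a sketch: the helper names `randInt`, `makeDoc` and `loadTestData` and the batch size are assumptions made for illustration, not the script that was actually used.

```javascript
// Returns a random integer in the range [0, n).
function randInt(n) {
    return Math.floor(Math.random() * n);
}

// Builds one document in the shape shown above.
// _id is omitted so that the server assigns an ObjectId.
function makeDoc() {
    return {
        field1: "field1_" + randInt(100),
        field2: "field2_" + randInt(100),
        field3: "field3_" + randInt(100),
        field4: "field4_" + randInt(100)
    };
}

// Inserts `total` documents into `coll` in batches of `batchSize`.
// `coll` is assumed to be a shell collection object such as db.dbtest.
function loadTestData(coll, total, batchSize) {
    var batch = [];
    for (var i = 0; i < total; i++) {
        batch.push(makeDoc());
        // Flush when the batch is full or the last document was added.
        if (batch.length === batchSize || i === total - 1) {
            coll.insert(batch);
            batch = [];
        }
    }
}

// Usage in the mongo shell (not executed here):
// loadTestData(db.dbtest, 1000000, 1000);
```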

Two simple indexes and one compound index have been created on the collection:

{ "field1" : 1 } # simple index on "field1"
{ "field2" : 1 } # simple index on "field2"
{                # compound index on all fields
    "field2" : 1,
    "field1" : 1,
    "field3" : 1,
    "field4" : 1
}
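In the mongo shell, these three indexes could be created along the following lines. The wrapper function `createTestIndexes` and the variable `indexSpecs` are names made up for this sketch; `createIndex` is the shell's collection method, and the collection is assumed to be `db.dbtest`.

```javascript
// Index key specifications, in the same order as listed above.
// The key order inside the compound index matters, so it is kept as-is.
var indexSpecs = [
    { field1: 1 },
    { field2: 1 },
    { field2: 1, field1: 1, field3: 1, field4: 1 }
];

// Creates every index in `indexSpecs` on the given collection object.
function createTestIndexes(coll) {
    indexSpecs.forEach(function (spec) {
        coll.createIndex(spec);
    });
}

// Usage in the mongo shell (not executed here):
// createTestIndexes(db.dbtest);
```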

Now I execute distinct queries on that database:

db.runCommand({ distinct: 'dbtest', key: 'field1' })

The result contains 100 values, nscanned=100, and the index on "field1" was used.

Now the same distinct query is limited by a query:

db.runCommand({ distinct: 'dbtest', key: 'field1', query: { field2: "field2_10" } })

The result again contains 100 values, but nscanned=9991 and the index used is the third one, the compound index on all fields.

Now I drop the third index, which was used in the last query, and execute the same query again:

db.runCommand({ distinct: 'dbtest', key: 'field1', query: { field2: "field2_10" } })

The result again contains 100 values with nscanned=9991, and the index used is the one on "field2".

Conclusion: If I execute a distinct command without a query, the result is taken directly from an index. However, when I combine a distinct command with a query, only the query uses an index; the distinct operation itself does not.
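To make the difference in nscanned concrete, here is a toy model in plain JavaScript. It is not how MongoDB is implemented: arrays stand in for index entries, the data is a small deterministic stand-in for the real collection, and all function names are invented for this illustration. It only contrasts "jump from one distinct key to the next" with "visit every entry that matches the query".

```javascript
// Toy model only: sorted arrays stand in for index entries.

// Distinct by skipping: after reading a key, jump (via binary search) to
// the first entry with a larger key. Touches one entry per distinct value.
function distinctBySkipping(sortedKeys) {
    var values = [], examined = 0, i = 0;
    while (i < sortedKeys.length) {
        var v = sortedKeys[i];
        values.push(v);
        examined++;
        var lo = i + 1, hi = sortedKeys.length;
        while (lo < hi) {
            var mid = (lo + hi) >> 1;
            if (sortedKeys[mid] > v) { hi = mid; } else { lo = mid + 1; }
        }
        i = lo;
    }
    return { values: values, examined: examined };
}

// Distinct with a filter: an index on the filter field narrows the work to
// the matching entries, but each of them is still visited to collect `key`.
function distinctByVisitingMatches(docs, filterField, filterValue, key) {
    var seen = {}, values = [], examined = 0;
    docs.forEach(function (d) {
        if (d[filterField] !== filterValue) { return; }
        examined++;
        if (!seen[d[key]]) {
            seen[d[key]] = true;
            values.push(d[key]);
        }
    });
    return { values: values, examined: examined };
}

// Deterministic stand-in data: 100,000 documents, 100 values per field.
function buildToyDocs() {
    var docs = [];
    for (var i = 0; i < 100000; i++) {
        docs.push({
            field1: "field1_" + (i % 100),
            field2: "field2_" + (Math.floor(i / 100) % 100)
        });
    }
    return docs;
}

var toyDocs = buildToyDocs();
var field1Keys = toyDocs.map(function (d) { return d.field1; }).sort();
var noQuery = distinctBySkipping(field1Keys);
var withQuery = distinctByVisitingMatches(toyDocs, "field2", "field2_10", "field1");
```

In this model both runs return 100 distinct values, but the first touches one entry per distinct value while the second touches every entry that matches the filter, which mirrors the nscanned=100 versus nscanned=9991 observation above.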

My problem is that I need to run a distinct command with a query on a very large database. The set of matching documents is very large but contains only ~100 distinct values. As a result, the complete distinct command takes ages (> 5 minutes) because it has to cycle through all matching values.

What needs to be done so that the distinct command presented above can be answered by the database directly from an index?

  • Currently not supported in mongo, duplicate of: stackoverflow.com/questions/36006208/… Commented Mar 15, 2016 at 13:35
  • Thanks, that explains a lot. It is really great to have a "database" with data in it, well indexed, and the "database" ignores the index. Maybe one should rethink calling Mongo a database... Commented Mar 15, 2016 at 13:56

1 Answer

The index is automatically used for distinct queries if your Mongo database version supports it.

Using an index in a distinct query requires MongoDB version 3.4 or higher; it works with both storage engines (MMAPv1 and WiredTiger).

See also the bug ticket https://jira.mongodb.org/browse/SERVER-19507

11 Comments

This does not answer the question
I disagree - the question was "how can we use the index in distinct queries " and the answer as presented is "upgrade to Mongo 3.4". What answer do you expect? The question explicitly indicates that the necessary indices already exist.
The question was "how to use AN index ..." not "THE" index. I was personally looking for a description of, well, how to define an index to work with the distinct command (field order, taking advantage of covered queries), since it's not documented. This may be the title of the question though, and not the answer.
Don't forget that the title is not the question itself. I gave an example of what does not work, and the resulting question was how to make it work with an index. You really should take your time and read the complete question, and if it does not match your problem, that is not a problem of the answer. Answers should be interpreted in the context of the question, not in the context of what you are trying to do.
@Max I added the link to the bug ticket.