0

I need to index this query:

db.messages.find({
  $or: [
    { $and: [
      { receiverFbId: 1 },
      { senderFbId: 2 }
    ]},
    { $and: [
      { receiverFbId: 2 },
      { senderFbId: 1 }
    ]}
  ]
}).sort({ timestamp: -1 });

I have created indexes:

db.messages.ensureIndex({ receiverFbId: 1 });
db.messages.ensureIndex({ senderFbId: 1 });
db.messages.ensureIndex({ receiverFbId: 1, senderFbId: 1, timestamp: -1 });

The first two indexes work for query whithout sorting by timestamp. The third index should work for query with sort but it doesn't. The query with explain() function returns BasicCursor.

So what index should I create to index this query with sort by timestamp?

6
  • as far as I know, index on sort is only one direction. oh and you only need 1 index on (receiverFbId:1, senderFbId: 1) Commented Oct 6, 2014 at 21:07
  • what version are you using? @Anzel I don't know what you comment means, but it seems incorrect to me. Commented Oct 6, 2014 at 22:20
  • Based on my understanding it should work. Can you tell whether just { receiverFbId: 1, senderFbId: 1} works? Commented Oct 6, 2014 at 23:18
  • I tried on MongoDB version V2.6.4, any of these three indexes will be used, and the last one has the preference. But when trying on V2.4.8, none of them will be used. It seems that V2.4.8 is not intelligent enough on index selection. Commented Oct 6, 2014 at 23:23
  • @AsyaKamsky, please see my answer below. Your assumption on mongo index is wrong, regardless of mongo version. Commented Oct 7, 2014 at 13:05

2 Answers 2

1

I made a test by db.messages.ensureIndex({ receiverFbId: 1, senderFbId: 1, timestamp: -1 }, {name:"rst"});. The index was used on MongoDB V2.6.4, but not used on V2.4.8.
So, perhaps you are out of luck if your MongoDB version is less than 2.6. :)

Otherwise, I want to say, it's almost impossible to use index to succeed on this query and sort by timestamp completely even on V2.6.4. Here I give an example.

Run below codes on mongo shell, (all are V2.6.4)

// initialize data
var docs = [ 
// group 1
{
    _id : 1,
    receiverFbId : 1,
    senderFbId : 2,
    timestamp : new Date("2014-10-09")
}, {
    _id : 2,
    receiverFbId : 1,
    senderFbId : 2,
    timestamp : new Date("2014-10-08")
}, {
    _id : 3,
    receiverFbId : 1,
    senderFbId : 2,
    timestamp : new Date("2014-10-07")
}, 
// group 2
{
    _id : 4,
    receiverFbId : 2,
    senderFbId : 1,
    timestamp : new Date("2014-10-08")
}, {
    _id : 5,
    receiverFbId : 2,
    senderFbId : 1,
    timestamp : new Date("2014-10-07")
}, {
    _id : 6,
    receiverFbId : 2,
    senderFbId : 1,
    timestamp : new Date("2014-10-09")
}, 
// group 3
{
    _id : 7,
    receiverFbId : 1,
    senderFbId : 8,
    timestamp : new Date("2014-10-09")
}, {
    _id : 8,
    receiverFbId : 2,
    senderFbId : 6,
    timestamp : new Date("2014-10-01")
} ];

var c = db["messages"];
c.drop();
c.insert(docs);
c.ensureIndex({ receiverFbId: 1, senderFbId: 1, timestamp: -1 }, {name: "rst"});

// make an output test
c.find({
      $or: [
        { $and: [
          { receiverFbId: 1 },
          { senderFbId: 2 }
        ]},
        { $and: [
          { receiverFbId: 2 },
          { senderFbId: 1 }
        ]}
      ]
    }).sort({ timestamp: -1 }); 
// result
{ "_id" : 1, "receiverFbId" : 1, "senderFbId" : 2, "timestamp" : ISODate("2014-10-09T00:00:00Z") }
{ "_id" : 6, "receiverFbId" : 2, "senderFbId" : 1, "timestamp" : ISODate("2014-10-09T00:00:00Z") }
{ "_id" : 2, "receiverFbId" : 1, "senderFbId" : 2, "timestamp" : ISODate("2014-10-08T00:00:00Z") }
{ "_id" : 4, "receiverFbId" : 2, "senderFbId" : 1, "timestamp" : ISODate("2014-10-08T00:00:00Z") }
{ "_id" : 3, "receiverFbId" : 1, "senderFbId" : 2, "timestamp" : ISODate("2014-10-07T00:00:00Z") }
{ "_id" : 5, "receiverFbId" : 2, "senderFbId" : 1, "timestamp" : ISODate("2014-10-07T00:00:00Z") }

// make an explain
c.find({
      $or: [
        { $and: [
          { receiverFbId: 1 },
          { senderFbId: 2 }
        ]},
        { $and: [
          { receiverFbId: 2 },
          { senderFbId: 1 }
        ]}
      ]
    }).sort({ timestamp: -1 }).explain();
// result
{
    "clauses" : [ {
        "cursor" : "BtreeCursor rst",
        "isMultiKey" : false,
        "n" : 3,
        "nscannedObjects" : 3,
        "nscanned" : 3,
        "scanAndOrder" : false,             // Attention on this line
        "indexOnly" : false,
        "nChunkSkips" : 0,
        "indexBounds" : {
            "receiverFbId" : [ [ 1, 1 ] ],
            "senderFbId" : [ [ 2, 2 ] ],
            "timestamp" : [ [ {
                "$maxElement" : 1
            }, {
                "$minElement" : 1
            } ] ]
        }
    }, {
        "cursor" : "BtreeCursor rst",
        "isMultiKey" : false,
        "n" : 3,
        "nscannedObjects" : 3,
        "nscanned" : 3,
        "scanAndOrder" : false,             // Attention on this line
        "indexOnly" : false,
        "nChunkSkips" : 0,
        "indexBounds" : {
            "receiverFbId" : [ [ 2, 2 ] ],
            "senderFbId" : [ [ 1, 1 ] ],
            "timestamp" : [ [ {
                "$maxElement" : 1
            }, {
                "$minElement" : 1
            } ] ]
        }
    } ],
    "cursor" : "QueryOptimizerCursor",
    "n" : 6,
    "nscannedObjects" : 6,
    "nscanned" : 6,
    "nscannedObjectsAllPlans" : 6,
    "nscannedAllPlans" : 6,
    "scanAndOrder" : false,                 // Attention on this line
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "server" : "Duke-PC:27017",
    "filterSet" : false
}

According to above output, it almost follows the expectation. But also we can find something else,

  • Both groups (group 1 and group 2) have been selected and sorted by index respectively.
  • But, these two groups have intersection on timestamp. To provide correct result, an extra global sorting in memory is necessary. This sorting should be very fast because each group has been ordered.

I understand the last line of "scanAndOrder" : false from the .explain() as it almost but not completely implements sorting without extra memory sorting.


--------------- Edit ------------------

CLARIFICATION

My previous comprehension about the last line of scanAndOrder : false in .explain() is wrong.
$or can perfectly merge results from those indexes, without extra burdened buffering.

Thanks for the help from Asya Kamsky.

Sign up to request clarification or add additional context in comments.

4 Comments

I don't know what you mean about scanAndOrder - it indicates that an in-memory sort was NOT needed, and the index was fully used. There is no global sort - it's a merge of two sorted lists, which is very efficient. Other than that, your answer is 100% correct, for 2.6 or later, earlier versions couldn't do this.
@Asya Kamsky, I understand ScanAndOrder completely by the description of manual: If scanAndOrder is false, MongoDB can use the order of the documents in an index to return sorted results. My first thought is it returns final sorted results directly. But now I find that it could be only the sorted results from that index, by literal. When $or merges tow sorted lists, the behavior of comparing and inserting from one list into another list happens, and this is what about global sorting in memory - maybe I used improper words.
@Asya Kamsky, if returned result is very large, is it true that $or has to use a lot of memory to buffer the final sorted result? Because these two sorted list are independent.
no, it's not true - merged lists can be merged and returned without loading the entire list into memory. Consider merging the lists 1,2,3,4,5,6,7,8,9,100 and 5,60,70,80,90,91,92 etc. I have to read the first element of second list and then I know I don't have to find the next one until I read the next five from the first list, then looking at the second item of second list (60) I can continue reading the returning the first list without having to load any of the second list ...
0

It seems index on sort is really ONE direction only, meaning it can index sorting on the same direction (either ascending or descending), please refer to this. And you do not need to place multiple indexes for the same prefix. So, for your sort to work best, it should be just like this:

db.messages.ensureIndex({ receiverFbId: 1, senderFbId: 1 });

The sort must specify the same sort direction (i.e.ascending/descending) for all its keys as the index key pattern or specify the reverse sort direction for all its keys as the index key pattern. For example, an index key pattern { a: 1, b: 1 } can support a sort on { a: 1, b: 1 } and { a: -1, b: -1 } but not on { a: -1, b: 1 }.

So db.messages.ensureIndex({receiverFbId:1, senderFbId:1, timestamp:-1}) simply doesn't work.

and when you try to sort, use this: db.messages.find({...}).sort({timestamp: -1});

And to answer your question, no, currently it doesn't support indexing on the sort query you are suggesting. If you really insist to make this work, you have to rework your timestamp as something like (future - timestamp) to make it indexable on ascending so they can all be sorted in same direction.

6 Comments

you are incorrect. the referenced passage only refers to sorting by multiple fields. In the given query the OP only wants to sort by a single field "timestamp" NOT by multiple fields. The leading parts of the index are used for filtering, the last part is used for sorting (in 2.6) before 2.6 because of the $or clause the use of the index was limited.
How can it be the index limited since OP uses $or and $and on both receiverFbId and senderFbId fields? it equals to "Please return all records with both fields existed", and the sequence involved is a SORT, otherwise it would be just `ensureIndex({timestamp: -1}). You are completely wrong
@AsyaKamsky, please re-read OP question, he wants to index this db.messages.ensureIndex({ receiverFbId: 1, senderFbId: 1, timestamp: -1 }); and this is MULTIPLE; also I do have a solution for it, make timestamp indexable altogether by making it sort in same direction
the index has three fields. his query filters on the first two fields via equality and then uses the third for sorting which can happen in ascending or descending direction using the index.
that I agreed. however to build an index on a query with sort still requires indexing on three fields. and his query involves both $or $and on 2 fields clearly showed OP wants a Sort too
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.