
I am trying to optimize a MongoDB query for better response time:

db.myReports.find({
    "CheckInDate": {
        "$gte": ISODate("2015-01-12T00:00:00Z"),
        "$lte": ISODate("2015-03-31T00:00:00Z")
    },
    "SubscriberPropertyId": NumberLong(47984),
    "ChannelId": {
        "$in": [701, 8275]
    },
    "PropertyId": {
        "$in": [47984, 3159, 5148, 61436, 66251, 70108]
    },
    "LengthOfStay": 1
}, {
    "CheckInDate": 1,
    "SubscriberPropertyId": 1,
    "ChannelId": 1,
    "PropertyId": 1
});

Currently it takes around 3 minutes just to find the data among 3 million records.

One document from the collection:

{
    "_id" : ObjectId("54dba46c320caf5a08473074"),
    "OptimisationId" : NumberLong(1),
    "ScheduleLogId" : NumberLong(3),
    "ReportId" : NumberLong(4113235),
    "SubscriberPropertyId" : NumberLong(10038),
    "PropertyId" : NumberLong(18166),
    "ChannelId" : 701,
    "CheckInDate" : ISODate("2014-09-30T18:30:00Z"),
    "LengthOfStay" : 1,
    "OccupancyIndex" : 1.0,
    "CreatedDate" : ISODate("2014-09-11T06:31:08Z"),
    "ModifiedDate" : ISODate("2014-09-11T06:31:08Z")
}

The indexes created are:

db.myReports.getIndexes();
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "db.myReports"
    },
    {
        "v" : 1,
        "key" : {
            "CheckInDate" : 1,
            "SubscriberPropertyId" : 1,
            "ReportId" : 1,
            "ChannelId" : 1,
            "PropertyId" : 1
        },
        "name" : "CheckInDate_1_SubscriberPropertyId_1_ReportId_1_ChannelId_1_PropertyId_1",
        "ns" : "db.myReports"
    },
    {
        "v" : 1,
        "key" : {
            "CheckInDate" : 1
        },
        "name" : "CheckInDate_1",
        "ns" : "db.myReports"
    }
]

I have created indexes on the fields that seemed relevant.

  • You wanna share what the documents look like, and what index you got so far? Commented Feb 13, 2015 at 7:39
  • It would be great to know how your indexes are set up Commented Feb 13, 2015 at 7:52
  • Also, what is lakhs? Commented Feb 13, 2015 at 7:52
  • Technically 3 million, as a lakh is apparently 100,000, so 30 times 100,000 is 3 million. You learn something new every day. :) Commented Feb 13, 2015 at 8:07
  • Use query.explain() to check which index is being used in the query (a sketch of this follows these comments). Commented Feb 13, 2015 at 8:31
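
A minimal sketch of that suggestion, run against the original query from the question. The shape of the output depends on the server version: 2.6 shells report "cursor" and "nscanned", while 3.0+ shells report the winning plan and, with explain("executionStats"), "totalKeysExamined".

// Append explain() to the query to see which index the planner picks
// and how many index entries and documents it had to scan.
db.myReports.find({
    "CheckInDate": {
        "$gte": ISODate("2015-01-12T00:00:00Z"),
        "$lte": ISODate("2015-03-31T00:00:00Z")
    },
    "SubscriberPropertyId": NumberLong(47984),
    "ChannelId": { "$in": [701, 8275] },
    "PropertyId": { "$in": [47984, 3159, 5148, 61436, 66251, 70108] },
    "LengthOfStay": 1
}).explain();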

1 Answer


Put equality queries first, then range queries:

db.myReports.find({
  "SubscriberPropertyId":  NumberLong(47984),
  "ChannelId": {
    "$in": [701, 8275]
  },
  "PropertyId": {
    "$in": [47984, 3159, 5148, 61436, 66251, 70108]
  },
  "CheckInDate": {
    "$gte" : ISODate("2015-01-12T00:00:00Z"),
    "$lte" : ISODate("2015-03-31T00:00:00Z")
  },
  "LengthOfStay": 1 // low selectivity, move to the end
}, {
  "CheckInDate": 1,
  "SubscriberPropertyId": 1,
  "ChannelId": 1,
  "PropertyId": 1
});

Make sure the index fits, i.e. make the index (SubscriberPropertyId, ChannelId, PropertyId, CheckInDate). LengthOfStay probably has too low selectivity to be worth indexing, but that depends on your data.
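
For example, a minimal sketch of creating that index in the shell (createIndex on current shells; older shells use ensureIndex with the same key document):

db.myReports.createIndex({
    "SubscriberPropertyId": 1,  // equality match
    "ChannelId": 1,             // $in over a small set of values
    "PropertyId": 1,            // $in over a small set of values
    "CheckInDate": 1            // the range predicate goes last
});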

That should reduce nscanned significantly, but retrieving 300k results will still take its time (actually reading them, I mean).
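
If reading the matched documents is itself the expensive part, a further sketch, assuming LengthOfStay is appended to the end of the suggested index and _id is excluded from the projection: every filtered and projected field is then present in the index, so the query can be covered and answered from the index alone.

db.myReports.find({
    "SubscriberPropertyId": NumberLong(47984),
    "ChannelId": { "$in": [701, 8275] },
    "PropertyId": { "$in": [47984, 3159, 5148, 61436, 66251, 70108] },
    "CheckInDate": {
        "$gte": ISODate("2015-01-12T00:00:00Z"),
        "$lte": ISODate("2015-03-31T00:00:00Z")
    },
    "LengthOfStay": 1
}, {
    "_id": 0,  // must be excluded, or the query cannot be covered by the index
    "CheckInDate": 1,
    "SubscriberPropertyId": 1,
    "ChannelId": 1,
    "PropertyId": 1
});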


3 Comments

Both queries (my query and the query modified by you) give the same result. The number of nscanned elements is the same, and the time taken is also the same. If I run the same query in SQL Server it returns results in a few seconds. Don't we consider MongoDB much faster than relational DBs for big data?
NoSQL isn't about speed but scalability, and it's not the tools that make the difference but the paradigms your code is based on (remember, Facebook runs on MySQL). Your approach is quite read-heavy and there's a ton of indexes; that's generally what RDBMSs are good at. You'll have to reason about the selectivity of the indexes and understand the access patterns, or choose a more write-heavy data structure. Also, 300k rows is hardly 'big data'... all in all that's maybe 400 MB of data; I can fit 40x that amount in my off-the-shelf laptop's RAM...
Can you suggest any solution for the current query?
