
I am trying to optimize a MongoDB query for better response time:

db.myReports.find({
    "CheckInDate": {
        "$gte": ISODate("2015-01-12T00:00:00Z"),
        "$lte": ISODate("2015-03-31T00:00:00Z")
    },
    "SubscriberPropertyId": NumberLong(47984),
    "ChannelId": {
        "$in": [701, 8275]
    },
    "PropertyId": {
        "$in": [47984, 3159, 5148, 61436, 66251, 70108]
    },
    "LengthOfStay": 1
}, {
    "CheckInDate": 1,
    "SubscriberPropertyId": 1,
    "ChannelId": 1,
    "PropertyId": 1
});

Currently it takes around 3 minutes just to find the data among 3 million records.

One document from the collection:

{
    "_id" : ObjectId("54dba46c320caf5a08473074"),
    "OptimisationId" : NumberLong(1),
    "ScheduleLogId" : NumberLong(3),
    "ReportId" : NumberLong(4113235),
    "SubscriberPropertyId" : NumberLong(10038),
    "PropertyId" : NumberLong(18166),
    "ChannelId" : 701,
    "CheckInDate" : ISODate("2014-09-30T18:30:00Z"),
    "LengthOfStay" : 1,
    "OccupancyIndex" : 1.0,
    "CreatedDate" : ISODate("2014-09-11T06:31:08Z"),
    "ModifiedDate" : ISODate("2014-09-11T06:31:08Z")
}

The indexes created are:

db.myReports.getIndexes();
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "db.myReports"
    },
    {
        "v" : 1,
        "key" : {
            "CheckInDate" : 1,
            "SubscriberPropertyId" : 1,
            "ReportId" : 1,
            "ChannelId" : 1,
            "PropertyId" : 1
        },
        "name" : "CheckInDate_1_SubscriberPropertyId_1_ReportId_1_ChannelId_1_PropertyId_1",
        "ns" : "db.myReports"
    },
    {
        "v" : 1,
        "key" : {
            "CheckInDate" : 1
        },
        "name" : "CheckInDate_1",
        "ns" : "db.myReports"
    }
]

I have created indexes on the fields that seemed relevant.

  • You wanna share what the documents look like, and what index you got so far? Commented Feb 13, 2015 at 7:39
  • It would be great to know how your indexes are set up Commented Feb 13, 2015 at 7:52
  • Also, what is lakhs? Commented Feb 13, 2015 at 7:52
  • Technically 3 million, as a lakh is apparently 100,000, so 30 times 100,000 is 3 million. You learn something new every day. :) Commented Feb 13, 2015 at 8:07
  • Use query.explain() to check which index is being used in the query (a sketch of this follows these comments). Commented Feb 13, 2015 at 8:31
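
A minimal sketch of that suggestion, run against the original query from the question. The shape of the output depends on the server version: 2.6 shells report "cursor" and "nscanned", while 3.0+ shells report the winning plan and, with explain("executionStats"), "totalKeysExamined".

// Append explain() to the query to see which index the planner picks
// and how many index entries and documents it had to scan.
db.myReports.find({
    "CheckInDate": {
        "$gte": ISODate("2015-01-12T00:00:00Z"),
        "$lte": ISODate("2015-03-31T00:00:00Z")
    },
    "SubscriberPropertyId": NumberLong(47984),
    "ChannelId": { "$in": [701, 8275] },
    "PropertyId": { "$in": [47984, 3159, 5148, 61436, 66251, 70108] },
    "LengthOfStay": 1
}).explain();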

1 Answer


Put equality queries first, then range queries:

db.myReports.find({
  "SubscriberPropertyId":  NumberLong(47984),
  "ChannelId": {
    "$in": [701, 8275]
  },
  "PropertyId": {
    "$in": [47984, 3159, 5148, 61436, 66251, 70108]
  },
  "CheckInDate": {
    "$gte" : ISODate("2015-01-12T00:00:00Z"),
    "$lte" : ISODate("2015-03-31T00:00:00Z")
  },
  "LengthOfStay": 1 // low selectivity, move to the end
}, {
  "CheckInDate": 1,
  "SubscriberPropertyId": 1,
  "ChannelId": 1,
  "PropertyId": 1
});

Make sure the index fits, i.e. make the index (SubscriberPropertyId, ChannelId, PropertyId, CheckInDate). LengthOfStay probably has too low selectivity to be worth indexing, but that depends on your data.
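
For example, a minimal sketch of creating that index in the shell (createIndex on current shells; older shells use ensureIndex with the same key document):

db.myReports.createIndex({
    "SubscriberPropertyId": 1,  // equality match
    "ChannelId": 1,             // $in over a small set of values
    "PropertyId": 1,            // $in over a small set of values
    "CheckInDate": 1            // the range predicate goes last
});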

That should reduce nscanned significantly, but retrieving 300k results will still take its time (actually reading them, I mean).
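
If reading the matched documents is itself the expensive part, a further sketch, assuming LengthOfStay is appended to the end of the suggested index and _id is excluded from the projection: every filtered and projected field is then present in the index, so the query can be covered and answered from the index alone.

db.myReports.find({
    "SubscriberPropertyId": NumberLong(47984),
    "ChannelId": { "$in": [701, 8275] },
    "PropertyId": { "$in": [47984, 3159, 5148, 61436, 66251, 70108] },
    "CheckInDate": {
        "$gte": ISODate("2015-01-12T00:00:00Z"),
        "$lte": ISODate("2015-03-31T00:00:00Z")
    },
    "LengthOfStay": 1
}, {
    "_id": 0,  // must be excluded, or the query cannot be covered by the index
    "CheckInDate": 1,
    "SubscriberPropertyId": 1,
    "ChannelId": 1,
    "PropertyId": 1
});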


3 Comments

Both queries (my query and the query modified by you) give the same result. The number of nscanned elements is the same, and the time taken is also the same. If I run the same query in SQL Server it returns results in a few seconds. Don't we consider MongoDB much faster than relational DBs for big data?
NoSQL isn't about speed but scalability, and it's not the tools that make the difference but the paradigms your code is based on (remember, Facebook runs on MySQL). Your approach is quite read-heavy and there's a ton of indexes; that's generally what RDBMSs are good at. You'll have to reason about the selectivity of the indexes and understand the access patterns, or choose a more write-heavy data structure. Also, 300k rows is hardly 'big data'... all in all that's maybe 400 MB of data; I can fit 40x that amount in my off-the-shelf laptop's RAM...
Can you suggest any solution for the current query?
