I am working on a Golang project backed by MongoDB. I am executing the query below, but it takes too long to return data. It pulls data from several collections across multiple stages.

db.getCollection('Collection1').aggregate([
{
    "$lookup": {
        "localField": "uid",
        "from": "collection2",
        "foreignField": "_id",
        "as": "user_info"
    }
},
{
    "$unwind": "$user_info"
},
{
    "$lookup": {
        "localField": "cid",
        "from": "collection3",
        "foreignField": "_id",
        "as": "cust_info"
    }
},
{
    "$lookup": {
        "from": "logs",
        "let":  {"id": "$_id"},
        "pipeline": [
                {"$match": {"$expr": {"$eq": ["$$id", "$log_id"]}}},
                {"$sort": {"log_type": 1}}],
        "as": "logs_data"
    }
},
{
    "$sort": {"logs_data.logged_on":-1}
},
{
    "$skip": 1
},
{
    "$limit": 2
},
])

My requirement is to apply two sorts within the same query:

  1. Within the logs array: {"$sort": {"log_type": 1}}
  2. On the end result: {"$sort": {"logs_data.logged_on": -1}}

For this I have tried the following indexes:

{"logged_on" : -1}
{"log_id":1, "log_type":1}

But the query still takes 6-7 seconds to execute.

If I remove "$sort": {"logs_data.logged_on": -1} the query runs fast, but with that sort it takes too much time.

What can I do to improve the response time?

Comments:

  • Will the logs collection contain any documents that don't match an _id from collection1? – Joe
  • @Joe No, it is not possible. logs does not contain any document that doesn't match an _id from collection1.

1 Answer


What that aggregation is doing:

  1. retrieve all documents from collection1
  2. for each document in collection1, find a single document in collection2
  3. for each document in collection1, find a single document in collection3
  4. for each document in collection1, find all related documents in logs
  5. for each document in collection1, perform an in-memory sort of the documents retrieved from logs
  6. perform an in-memory sort of the entire result set by logs_data.logged_on
  7. keep 2 of these documents and discard the rest

For each document in collection1, that is 3 document fetches (plus an unknown number of fetches in #4), 2 index scans, and an in-memory sort.

If there is a non-trivial number of documents in collection1, that is a ton of work, all of which is wasted for all but 2 of the documents.
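
You can confirm where the time goes by running the same pipeline through explain with execution stats; a minimal sketch, eliding the stages already shown in the question:

db.getCollection('Collection1').explain("executionStats").aggregate([
    // ...the exact stages from the question...
])

Note that the final {"$sort": {"logs_data.logged_on": -1}} can never use an index: logs_data is produced by a $lookup stage inside the pipeline, and indexes only apply to fields as they exist in the collection. That is why the {logged_on: -1} index does not help.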

If it is safe to assume that every document in logs contains a log_id that maps back to collection1 (which the comments above confirm), you could (see the sketch after this list):

  • create an index on {logged_on:1, log_id:1}
  • start the aggregation on the logs collection
  • sort by logged_on: 1
  • project {logged_on:1, log_id:1, _id:0} (this makes the first part of the aggregation fully covered by the above index)
  • group by log_id, taking the $first value of logged_on
  • sort by logged_on: 1 (grouping disturbs the sort order)
  • skip and limit as desired
  • lookup from collection1 with local log_id foreign _id
  • replaceRoot with the newRoot being the looked up document
  • execute the existing pipeline stages you were using - this time they will only be fetching/sorting for the 2 documents you want to return.
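
A minimal sketch of that reordered pipeline in the shell. All collection and field names come from the question; the "doc" alias and the inline comments are my own:

db.getCollection('logs').aggregate([
    // covered by the {logged_on: 1, log_id: 1} index
    {"$sort": {"logged_on": 1}},
    {"$project": {"logged_on": 1, "log_id": 1, "_id": 0}},
    {"$group": {"_id": "$log_id", "logged_on": {"$first": "$logged_on"}}},
    {"$sort": {"logged_on": 1}},
    {"$skip": 1},
    {"$limit": 2},
    {
        "$lookup": {
            "from": "Collection1",
            "localField": "_id",    // after $group, _id holds the original log_id
            "foreignField": "_id",
            "as": "doc"
        }
    },
    {"$unwind": "$doc"},
    {"$replaceRoot": {"newRoot": "$doc"}}
    // ...followed by the original $lookup/$unwind/$sort stages,
    // which now operate on only the 2 selected documents
])

If you need the original descending order, flip both $sort stages to {"logged_on": -1}; the {logged_on: 1, log_id: 1} index can be traversed in reverse, so the first sort and project remain covered.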