
I need to get docs from the collection that match this condition:

last_updated $gte ISODate("2020-02-26T22:10:55.364Z")

Input Collection name : intensity_log
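In plain shell syntax, the filter I have in mind is roughly the following, but a simple find() like this only matches whole documents and does not give me the per-lane sums and averages I need:

db.intensity_log.find({
    "intensities.data.last_updated": { $gte: ISODate("2020-02-26T22:10:55.364Z") }
})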

Sample Docs :

[
  {
    junction_id:"J1",
    intensities: [
      {
        lane_id: "L1",
        data: [
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:15:55.364Z")
          },
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:10:55.364Z")
          },
          {
            intensity: 0.9,
            last_updated: ISODate("2020-02-26T22:05:55.364Z")
          }
        ]
      },
      {
        lane_id: "L2",
        data: [
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:15:55.364Z")
          },
          {
            intensity: 2.1,
            last_updated: ISODate("2020-02-26T22:10:55.364Z")
          },
          {
            intensity: 1.1,
            last_updated: ISODate("2020-02-26T22:05:55.364Z")
          }
        ]
      }
    ]
  },
  {
    junction_id:"J2",
    intensities: [
      {
        lane_id: "L1",
        data: [
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:15:55.364Z")
          },
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:10:55.364Z")
          },
          {
            intensity: 0.9,
            last_updated: ISODate("2020-02-26T22:05:55.364Z")
          }
        ]
      },
      {
        lane_id: "L2",
        data: [
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:15:55.364Z")
          },
          {
            intensity: 2.1,
            last_updated: ISODate("2020-02-26T22:10:55.364Z")
          },
          {
            intensity: 1.1,
            last_updated: ISODate("2020-02-26T22:05:55.364Z")
          }
        ]
      }
    ]
  }
]

Expected Output :

[
    {
        junction_id: "J1",
        data: [
            {
                lane_id: "L1",
                sum: 2,
                count: 2,
                avg: 1
            },
            {
                lane_id: "L2",
                sum: 3.1,
                count: 2,
                avg: 1.55
            }
        ]
    },
    {
        junction_id: "J2",
        data: [
            {
                lane_id: "L1",
                sum: 2,
                count: 2,
                avg: 1
            },
            {
                lane_id: "L2",
                sum: 3.1,
                count: 2,
                avg: 1.55
            }
        ]
    }
]

1 Answer


You can try the below query:

db.intensity_log.aggregate([
    /** match only docs where there is last_updated > given time, which reduces data size */
    { $match: { 'intensities.data.last_updated': { $gte: ISODate("2020-02-26T22:10:55.364Z") } } },
    /** unwinding array to access objects in it */
    { $unwind: '$intensities' },
    /** filtering objects in data array which matches required criteria */
    { $addFields: { 'intensities.data': { $filter: { input: '$intensities.data', cond: { $gte: ['$$this.last_updated', ISODate("2020-02-26T22:10:55.364Z")] } } } } },
    /** adding required fields into an object named data */
    {
        $addFields: {
            'data.count': { $size: '$intensities.data' },
            'data.sum': {
                $reduce: {
                    input: '$intensities.data',
                    initialValue: 0,
                    in: {
                        $add: ["$$value", "$$this.intensity"]
                    }
                }
            }
        }
    },
    /** adding avg field & extracting lane_id from intensities to data */
    { $addFields: { 'data.avg': { $divide: ["$data.sum", '$data.count'] }, 'data.lane_id': '$intensities.lane_id' } },
    /** Grouping on junction_id & pushing data field created on above stages */
    { $group: { _id: '$junction_id', data: { $push: '$data' } } },
    /** converting _id field name to junction_id & removing _id field from output */
    { $project: { _id: 0, junction_id: '$_id', data: 1 } }
])
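One edge case not covered above (my addition, not part of the original answer): if a document matches because of one lane but another of its lanes has no entries past the cutoff, the filtered array is empty and $divide would fail with a division by zero. A defensive variant of the avg stage could look like this:

    /** guard against lanes whose filtered data array is empty */
    {
        $addFields: {
            'data.avg': {
                $cond: [
                    { $eq: ['$data.count', 0] },
                    null,
                    { $divide: ['$data.sum', '$data.count'] }
                ]
            },
            'data.lane_id': '$intensities.lane_id'
        }
    }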

Note: You can do the same by double-unwinding the array fields, but that can explode the number of documents and become an issue over huge datasets. The query above is better because each stage operates on the same number of documents from the collection, or even fewer.
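For comparison, a minimal sketch of that double-unwind variant (assuming the same cutoff date; untested against your data) could look like:

db.intensity_log.aggregate([
    /** match docs having at least one entry past the cutoff */
    { $match: { 'intensities.data.last_updated': { $gte: ISODate("2020-02-26T22:10:55.364Z") } } },
    /** flatten to one doc per lane, then one doc per data entry */
    { $unwind: '$intensities' },
    { $unwind: '$intensities.data' },
    /** keep only the entries past the cutoff */
    { $match: { 'intensities.data.last_updated': { $gte: ISODate("2020-02-26T22:10:55.364Z") } } },
    /** aggregate per junction + lane */
    {
        $group: {
            _id: { junction_id: '$junction_id', lane_id: '$intensities.lane_id' },
            sum: { $sum: '$intensities.data.intensity' },
            count: { $sum: 1 },
            avg: { $avg: '$intensities.data.intensity' }
        }
    },
    /** regroup lanes under their junction */
    {
        $group: {
            _id: '$_id.junction_id',
            data: { $push: { lane_id: '$_id.lane_id', sum: '$sum', count: '$count', avg: '$avg' } }
        }
    },
    { $project: { _id: 0, junction_id: '$_id', data: 1 } }
])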

Test : MongoDB-Playground


2 Comments

This worked correctly. Thanks. Will this query be able to handle a very large number of docs, as they will keep increasing in the future?
@NitishGoyal: Yes, it should scale up to a certain extent. As with any other query, we can't say when it will fail; there can be any number of issues that make a query less efficient. Whenever you feel a query is running slow, try to use explain to see what can be done to improve performance.
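For example, explain output for this aggregation can be requested in the mongo shell like so (a sketch; the comment stands in for the full pipeline from the answer above):

db.intensity_log.explain().aggregate([ /* pipeline from the answer above */ ])

// or, equivalently, via the aggregate option:
db.intensity_log.aggregate([ /* pipeline from the answer above */ ], { explain: true })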
