I am trying to write an aggregation pipeline query on the following documents:

{
    "_id" : 1,
    "createdDate" : "2018-01-01 00:00:00",
    "visits" : [
        {
            "date" : "2018-02-01 00:00:00",
            "type" : "A"
        },
        {
            "date" : "2018-03-01 00:00:00",
            "type" : "B"
        }
    ],
    "user" : "Alpha"
},

{
    "_id" : 2,
    "createdDate" : "2018-01-15 00:00:00",
    "visits" : [
        {
            "date" : "2018-02-01 00:00:00",
            "type" : "B"
        },
        {
            "date" : "2018-04-08 00:00:00",
            "type" : "A"
        }
    ],
    "user" : "Alpha"
}

I want to:

  • Group by user
  • Get the number of records per user
  • Restrict both to a date-range filter
  • Count, per type, only the latest visit of each record that falls within the same date range

Example:

date between 2018-01-01 and 2018-02-02

Desired Output

{_id:"Alpha",
type: {"A":1, "B":1},
count : 2}

date between 2018-01-01 and 2018-03-02

Desired Output

{_id:"Alpha",
type: {"A":0, "B":2},
count : 2}

Here is what I have so far:

{ $match: { "createdDate": { "$gte": "2018-01-01 00:00:00",
                             "$lte": "2018-02-02 23:59:59" } } },
{ $unwind: "$visits" },
{ $match: { "visits.date": { "$gte": "2018-01-01 00:00:00",
                             "$lte": "2018-02-02 23:59:59" } } },
{ $project: { _id: 1, "visit_type": "$visits.type", "visit_date": "$visits.date" } },
{ $group: { _id: "$_id",
            "visits": { "$push": { "date": "$visit_date",
                                   "type": "$visit_type" } } } }

1 Answer

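You can try the aggregation below.

After the date filters, it sorts the unwound visits by date in descending order and keeps the first (latest) in-range visit per document. Two more groups follow: one counts the latest visits by user and type, and the other collects those counts per user and sums them.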

[
  {"$match":{"createdDate":{"$gte":"2018-01-01 00:00:00","$lte":"2018-03-02 23:59:59"}}},
  {"$unwind":"$visits"},
  {"$match":{"visits.date":{"$gte":"2018-01-01 00:00:00","$lte":"2018-03-02 23:59:59"}}},
  {"$sort":{"visits.date":-1}},
  {"$group":{"_id":{"_id":"$_id","user":"$user"},"latestvisit":{"$first":"$visits"}}},
  {"$group":{"_id":{"user":"$_id.user","type":"$latestvisit.type"},"visits":{"$sum":1}}},
  {"$group":{
    "_id":"$_id.user",
    "type":{"$push":{"type":"$_id.type","visits":"$visits"}},
    "count":{"$sum":"$visits"}
  }}
]
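To check the stages against the sample data without a database, here is a minimal sketch in plain JavaScript that mirrors the pipeline in memory. It is not MongoDB code: `latestTypeCounts` is a hypothetical helper, the second document is assumed to have `_id` 2, and the per-type counts are written as an object keyed by type (the shape asked for in the question) rather than the array the pipeline above produces.

```javascript
// Sample documents from the question (assumption: the second _id is 2).
const docs = [
  { _id: 1, createdDate: "2018-01-01 00:00:00", user: "Alpha",
    visits: [ { date: "2018-02-01 00:00:00", type: "A" },
              { date: "2018-03-01 00:00:00", type: "B" } ] },
  { _id: 2, createdDate: "2018-01-15 00:00:00", user: "Alpha",
    visits: [ { date: "2018-02-01 00:00:00", type: "B" },
              { date: "2018-04-08 00:00:00", type: "A" } ] }
];

// Hypothetical helper that mirrors the pipeline stages in memory.
function latestTypeCounts(docs, from, to) {
  // "YYYY-MM-DD HH:MM:SS" strings sort chronologically, so plain
  // string comparison is enough here, as in the $match stages.
  const inRange = (d) => d >= from && d <= to;
  const byUser = new Map();

  for (const doc of docs) {
    if (!inRange(doc.createdDate)) continue;                  // first $match
    const visits = doc.visits.filter((v) => inRange(v.date)); // $unwind + second $match
    if (visits.length === 0) continue;                        // nothing left to group

    // $sort by visits.date descending + $first per document.
    const latest = visits.reduce((a, b) => (b.date > a.date ? b : a));

    // The last two $group stages: count per (user, type), then total per user.
    const entry = byUser.get(doc.user) || { _id: doc.user, type: {}, count: 0 };
    entry.type[latest.type] = (entry.type[latest.type] || 0) + 1;
    entry.count += 1;
    byUser.set(doc.user, entry);
  }
  return [...byUser.values()];
}

const narrow = latestTypeCounts(docs, "2018-01-01 00:00:00", "2018-02-02 23:59:59");
const wide = latestTypeCounts(docs, "2018-01-01 00:00:00", "2018-03-02 23:59:59");
console.log(JSON.stringify(narrow));
console.log(JSON.stringify(wide));
```

As in the pipeline, a type with no latest visit in the range is simply absent rather than reported as 0, and a document with no visit in the range is not counted at all.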

8 Comments

One thing missing here is that we need to consider the latest visit from the array within the date range.
Can you update the expected JSON to show what you mean by latest visit?
As I indicated in my question, the result changes between the two date queries: type: {"A":1, "B":1} vs. type: {"A":0, "B":2}.
Why is it {"A":0, "B":2} and not {"A":1, "B":2} for the date range 2018-01-01 to 2018-03-02?
The latest visit in the first document for the date range 2018-01-01 to 2018-02-02 is "type" : "A", because the second visit in the array is outside the range. If the date range is changed to 2018-01-01 to 2018-03-02, the latest becomes "type" : "B". This is how the count can change.