
I have a collection whose documents have the following structure:

{
   "_id": {
     "d_timestamp": NumberLong(1429949699),
     "d_isostamp": ISODate("2015-04-25T08:14:59.0Z")
   },
   "XBT-USD-cpx-okc": [
   {
       "buySpread": -1.80081
   }

I run the following aggregation:

$spreadName ='XBT-USD-stp-nex';
$pipe = array(
    array(
        '$match' => array(
            '_id.d_isostamp' => array(
                '$gt' => $start, '$lt' => $end
            )
        )
    ),
    array(
        '$project' => array(
            'sellSpread' =>'$'.$spreadName.'.sellSpread',
        )
    ),
    array(
        '$group' => array(
            '_id' => array(
                'isodate' => array(
                    '$minute' => '$_id.d_isostamp'
                )
            ),
            'rsell_spread' => array(
                '$avg' => '$sellSpread'
            ),
        )
    ),
);

$out = $collection->aggregate($pipe ,$options);

As a result I get the value 0 for rsell_spread. If I use $max instead of $avg in the $group stage, I get an accurate value for rsell_spread, with the following structure:

{
  "_id": {
     "isodate": ISODate("2015-04-25T08:00:58.0Z")
  },
  "rsell_spread": [
     -4.49996
  ]
}

So I have two questions :

1/ Why does the $avg operator not work?

2/ How can I get a result that is not in an array (just a regular number) when I use $max, for example?

1 Answer
  1. The $avg group accumulator operator does work; in your case it is being applied to an array rather than to a number. $avg ignores non-numeric values, which is why you get the "incorrect" result of 0.

  2. The $max group accumulator operator returns the highest value that results from applying an expression to each document in a group of documents. In your example each value is an array, so it returned the maximum array.

To demonstrate this, consider adding a few sample documents to a test collection in the mongo shell:

db.test.insert([
{
    "_id" : {
        "d_timestamp" : NumberLong(1429949699),
        "d_isostamp" : ISODate("2015-04-25T08:14:59.000Z")
    },
    "XBT-USD-stp-nex" : [ 
        {
            "sellSpread" : -1.80081
        }
    ]
},
{
    "_id" : {
        "d_timestamp" : NumberLong(1429949710),
        "d_isostamp" : ISODate("2015-04-25T08:15:10.000Z")
    },
    "XBT-USD-stp-nex" : [ 
        {
            "sellSpread" : -1.80079
        }
    ]
},
{
    "_id" : {
        "d_timestamp" : NumberLong(1429949720),
        "d_isostamp" : ISODate("2015-04-25T08:15:20.000Z")
    },
    "XBT-USD-stp-nex" : [ 
        {
            "sellSpread" : -1.80083
        }
    ]
},
{
    "_id" : {
        "d_timestamp" : NumberLong(1429949730),
        "d_isostamp" : ISODate("2015-04-25T08:15:30.000Z")
    },
    "XBT-USD-stp-nex" : [ 
        {
            "sellSpread" : -1.80087
        }
    ]
}
])

Now, replicating the same operation as above in the mongo shell:

var spreadName = "XBT-USD-stp-nex",
    start = new Date(2015, 3, 25),
    end = new Date(2015, 3, 26);
db.test.aggregate([
    {
        "$match": {
            "_id.d_isostamp": { "$gte": start, "$lte": end }
        }
    },
    {
        "$project": {
            "sellSpread": "$"+spreadName+".sellSpread"
        }
    }/*,<--- deliberately omitted the $unwind stage from the pipeline to replicate the current pipeline
    {
        "$unwind": "$sellSpread"
    }*/,
    {
        "$group": {
            "_id": {
                "isodate": { "$minute": "$_id.d_isostamp"}
            },
            "rsell_spread": {
                "$avg": "$sellSpread"
            }
        }
    }
])

Output:

/* 0 */
{
    "result" : [ 
        {
            "_id" : {
                "isodate" : 15
            },
            "rsell_spread" : 0
        }, 
        {
            "_id" : {
                "isodate" : 14
            },
            "rsell_spread" : 0
        }
    ],
    "ok" : 1
}
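Why the result is 0 can be sketched in plain JavaScript. This is only a rough model of an averaging accumulator that skips non-numeric inputs, not MongoDB's implementation; the function name avgIgnoringNonNumbers is made up for the illustration, and the 0 returned for "no numeric input" simply mirrors the output above (newer server versions may return null instead).

```javascript
// Hypothetical model of an averaging accumulator that skips non-numeric inputs.
// Illustration only; this is not how the server is implemented.
function avgIgnoringNonNumbers(values) {
  const numbers = values.filter((v) => typeof v === "number");
  if (numbers.length === 0) {
    // Mirrors the 0 shown in the output above; an assumption for other versions.
    return 0;
  }
  return numbers.reduce((sum, v) => sum + v, 0) / numbers.length;
}

// What $group receives without $unwind: each sellSpread is a one-element array.
const projectedValues = [[-1.80079], [-1.80083], [-1.80087]];
const withoutUnwind = avgIgnoringNonNumbers(projectedValues);

// What $group receives once the arrays are flattened: plain numbers.
const withUnwind = avgIgnoringNonNumbers(projectedValues.flat());

console.log(withoutUnwind); // 0, because every input is an array and gets skipped
console.log(withUnwind);    // the average of the three numbers
```

Every input in the first call is an array, so nothing is averaged; once the arrays are flattened, the same function sees three numbers.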

The solution is to include an $unwind pipeline stage after the $project step. It deconstructs the projected sellSpread array (built from the XBT-USD-stp-nex field) and outputs one document per element, with the array replaced by that element's value. The $avg group accumulator then receives plain numbers and can compute the average. The same stage also answers your second question: with unwound values, $max returns a regular number instead of an array.

With the $unwind stage included, the aggregation returns:

/* 0 */
{
    "result" : [ 
        {
            "_id" : {
                "isodate" : 15
            },
            "rsell_spread" : -1.80083
        }, 
        {
            "_id" : {
                "isodate" : 14
            },
            "rsell_spread" : -1.80081
        }
    ],
    "ok" : 1
}
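To make the effect of each stage concrete, here is a minimal plain-JavaScript sketch that mimics the $project, $unwind and $group steps on the four sample documents. It runs without a database; the helper name summarizeByMinute is hypothetical, d_timestamp is left out for brevity, and the code only approximates what the server does.

```javascript
// Sample documents copied from the test collection (d_timestamp omitted).
const docs = [
  { _id: { d_isostamp: new Date("2015-04-25T08:14:59.000Z") }, "XBT-USD-stp-nex": [{ sellSpread: -1.80081 }] },
  { _id: { d_isostamp: new Date("2015-04-25T08:15:10.000Z") }, "XBT-USD-stp-nex": [{ sellSpread: -1.80079 }] },
  { _id: { d_isostamp: new Date("2015-04-25T08:15:20.000Z") }, "XBT-USD-stp-nex": [{ sellSpread: -1.80083 }] },
  { _id: { d_isostamp: new Date("2015-04-25T08:15:30.000Z") }, "XBT-USD-stp-nex": [{ sellSpread: -1.80087 }] },
];

// Hypothetical helper that imitates the three pipeline stages in memory.
function summarizeByMinute(documents, spreadName) {
  // $project: "$<spreadName>.sellSpread" yields an array of numbers per document.
  const projected = documents.map((d) => ({
    _id: d._id,
    sellSpread: d[spreadName].map((entry) => entry.sellSpread),
  }));

  // $unwind: one row per array element, with the array replaced by the element.
  const unwound = projected.flatMap((d) =>
    d.sellSpread.map((value) => ({ _id: d._id, sellSpread: value }))
  );

  // $group: bucket by UTC minute, then compute the average and the maximum.
  const buckets = new Map();
  for (const row of unwound) {
    const minute = row._id.d_isostamp.getUTCMinutes();
    if (!buckets.has(minute)) buckets.set(minute, []);
    buckets.get(minute).push(row.sellSpread);
  }
  return [...buckets.entries()].map(([minute, values]) => ({
    minute,
    avg: values.reduce((sum, v) => sum + v, 0) / values.length,
    max: Math.max(...values),
  }));
}

const result = summarizeByMinute(docs, "XBT-USD-stp-nex");
console.log(result);
```

Because the values are unwound before grouping, both avg and max come out as plain numbers rather than arrays.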

So your final working aggregation in PHP should be:

$spreadName ='XBT-USD-stp-nex';
$pipe = array(
    array(
        '$match' => array(
            '_id.d_isostamp' => array(
                '$gt' => $start, '$lt' => $end
            )
        )
    ),    
    array(
        '$project' => array(
            'sellSpread' =>'$'.$spreadName.'.sellSpread',
        )
    ),
    array('$unwind' => '$sellSpread'),
    array(
        '$group' => array(
            '_id' => array(
                'isodate' => array(
                    '$minute' => '$_id.d_isostamp'
                )
            ),
            'rsell_spread' => array(
                '$avg' => '$sellSpread'
            ),
        )
    ),
);

$out = $collection->aggregate($pipe ,$options);