I have the following MongoDB aggregation query that finds all records within a specified month, groups the records by day with $group, and returns an average price for each day. I would also like to return the average price for the entire month. Can I do this with multiple $group stages, and if so, how?

        PriceHourly.aggregate([
                { $match: { date: { $gt: start, $lt: end } } },
                { $group: { 
                    _id: "$day", 
                    price: { $avg: '$price' },
                    system_demand: { $avg: '$system_demand'}
                }}
        ], function(err, results){

                results.forEach(function(r) {
                    r.price          = Helpers.round_price(r.price);
                    r.system_demand  = Helpers.round_price(r.system_demand);
                });
                console.log("Results Length: "+results.length, results);
                res.jsonp(results);
        }); // PriceHourly();  

Here is my model:

// Model
var PriceHourlySchema = new Schema({
    created: {
        type: Date,
        default: Date.now
    },
    day: {
        type: String,
        required: true,
        trim: true
    },
    hour: {
        type: String,
        required: true,
        trim: true
    },
    price: {
        type: Number,
        required: true
    },
    date: {
        type: Date,
        required: true
    }
}, 
{ 
    autoIndex: true 
});
  • You cannot combine results in the way you seem to be describing, nor does it make much sense to return a month average alongside the daily results unless you mean a "running average" over each cumulative day. Otherwise, two sets of data are best suited to two queries. In either case, your question would benefit from showing the data you expect as a result. Commented Apr 1, 2014 at 1:32

1 Answer

The short answer: just make sure the date range in your $match covers all the days in the month. If the range is already correct, that part of the query does not need to change.

As for whether you can "nest" grouping stages: yes, you can add further stages to the pipeline, which is what the pipeline is for. So if you first want the average per day and then the average over all the days of the month, you can write it like this:

PriceHourly.aggregate([
   { "$match": { 
       "date": { 
           "$gte": new Date("2014-03-01"), "$lt": new Date("2014-04-01")
       }  
   }},
   { "$group": { 
       "_id": "$day", 
       "price": { "$avg": "$price" },
       "system_demand": { "$avg": "$system_demand" }
   }},
   { "$group": { 
       "_id": null, 
       "price": { "$avg": "$price" },
       "system_demand": { "$avg": "$system_demand" }
   }}
])

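That said, the two stages are arguably redundant, since a single $group with "_id": null over the matched documents gives a month average directly. Note that the two forms only agree when every day has the same number of records, because the two-stage form averages the daily averages rather than the individual prices.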
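To make the relationship between the two-stage pipeline and a single overall average concrete, here is a small plain-JavaScript sketch. It does not touch MongoDB; the sample readings and the helper names (average, dailyAverages) are made up for illustration and only imitate what the $group stages compute.

```javascript
// Hypothetical sample: two readings on day "1", one reading on day "2".
const readings = [
    { day: "1", price: 100 },
    { day: "1", price: 110 },
    { day: "2", price: 120 }
];

// Arithmetic mean of an array of numbers.
function average(values) {
    return values.reduce(function (sum, v) { return sum + v; }, 0) / values.length;
}

// Imitates the first $group stage: one averaged price per day.
function dailyAverages(docs) {
    const byDay = {};
    docs.forEach(function (doc) {
        (byDay[doc.day] = byDay[doc.day] || []).push(doc.price);
    });
    return Object.keys(byDay).map(function (day) {
        return { _id: day, price: average(byDay[day]) };
    });
}

const perDay = dailyAverages(readings);

// Imitates the second $group stage: the average of the daily averages.
const averageOfDailyAverages = average(perDay.map(function (d) { return d.price; }));

// Imitates a single $group with _id: null over the raw documents.
const overallAverage = average(readings.map(function (r) { return r.price; }));

console.log(perDay, averageOfDailyAverages, overallAverage);
```

With this sample the daily averages are 105 and 120, so the average of daily averages is 112.5, while the average over all three readings is 110. Which figure counts as "the month average" depends on what you want to report.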

But there is a longer commentary on this schema. You do not actually state much of the purpose of what you are doing other than obtaining an average, or what the schema is meant to contain. So I want to describe something that is maybe a bit different.

Suppose you have a collection that stores the "product", the "type", the current "price", and a "timestamp" recording the date when that price was changed. Let us call the collection "PriceChange". Every time a price changes, a new document is created.

{
    "product": "ABC",
    "type": 2,
    "price": 110,
    "timestamp": ISODate("2014-04-01T00:08:38.360Z")
}

This could change many times in an hour, a day or whatever the case.

So if you were interested in the "average" price per product over the month you could do this:

PriceChange.aggregate([
    { "$match": {
        "timestamp": { 
            "$gte": new Date("2014-03-01"), "$lt": new Date("2014-04-01")
        }
    }},
    { "$group": {
        "_id": "$product",
        "price_avg": { "$avg": "$price" }
    }}
])

Also, without any additional fields you can get the average price per product for each day of the month:

PriceChange.aggregate([
    { "$match": {
        "timestamp": { 
            "$gte": new Date("2014-03-01"), "$lt": new Date("2014-04-01")
        }
    }},
    { "$group": {
        "_id": {
            "day": { "$dayOfMonth": "$timestamp" },
            "product": "$product"
        },
        "price_avg": { "$avg": "$price" }
    }}
])

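Or you can even get the last price for each month over a whole year. Since $last depends on the order of the incoming documents, sort by timestamp first:

PriceChange.aggregate([
    { "$match": {
        "timestamp": { 
            "$gte": new Date("2013-01-01"), "$lt": new Date("2014-01-01")
        }
    }},
    // $last only means "most recent" when documents arrive in timestamp order
    { "$sort": { "timestamp": 1 } },
    { "$group": {
        "_id": {
            "date": {
                "year": { "$year" : "$timestamp" },
                "month": { "$month": "$timestamp" }
            },
            "product": "$product"
        },
        "price_last": { "$last": "$price" }
    }}
])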

So those are some of the things you can do with the built-in Date Aggregation Operators. They can also help when collecting this information into new "pre-aggregated" collections for faster analysis.

There is one way to combine a "running" average with the daily prices, using mapReduce. Again using my sample collection:

PriceChange.mapReduce(
    function () {
        emit( this.timestamp.getDate(), this.price )
    },
    function (key, values) {
        var sum = 0;
        values.forEach(function(value) {
            sum += value;
        });
        return ( sum / values.length );
    },
    { 
        "query": {
            "timestamp": {
                "$gte": new Date("2014-03-01"), "$lt": new Date("2014-04-01")
            }
        },
        "out": { "inline": 1 },
        "scope": { "running": 0, "counter": 0 },
        "finalize": function(key,value) {
            running += value;
            counter++;
            return { "dayAvg": value, "monthAvg": running / counter };
        }
    }
)

And that would return something like this:

{
    "results" : [
        {
            "_id" : 1,
            "value" : {
                "dayAvg" : 105,
                "monthAvg" : 105
            }
        },
        {
            "_id" : 2,
            "value" : {
                "dayAvg" : 110,
                "monthAvg" : 107.5
           }
        }
    ]
}
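The running figure in that output can be reproduced outside MongoDB. The following plain-JavaScript sketch only imitates the finalize step, under the assumption that the per-day values arrive in day order; the function name withRunningAverage is hypothetical and is not part of any MongoDB API.

```javascript
// Imitates the finalize function: keeps a running total and a counter across
// the per-day averages, the way the "scope" variables do in the mapReduce call.
function withRunningAverage(dailyValues) {
    let running = 0;   // sum of the daily averages seen so far
    let counter = 0;   // number of days seen so far
    return dailyValues.map(function (entry) {
        running += entry.value;
        counter++;
        return {
            _id: entry._id,
            value: { dayAvg: entry.value, monthAvg: running / counter }
        };
    });
}

// Hypothetical per-day averages matching the sample output.
const runningResults = withRunningAverage([
    { _id: 1, value: 105 },
    { _id: 2, value: 110 }
]);

console.log(JSON.stringify(runningResults, null, 2));
```

Here monthAvg is a cumulative average of the daily averages up to that day, so only the last entry reflects the whole month, and the result depends on the order in which the days are processed.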

But if you are otherwise expecting to see discrete values for both the day and the month, then that would not be possible without running separate queries.
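If a second query is undesirable, one possible workaround is to derive the month figure in application code from the daily results. The sketch below assumes the daily $group stage is extended to also return a "total" (for example { "$sum": "$price" }) and a "count" (for example { "$sum": 1 }); those two fields and the summarizeMonth helper are assumptions for illustration and do not appear in the original query.

```javascript
// Builds both views from one set of daily rows: a per-day average and a
// month average weighted by the number of records in each day.
function summarizeMonth(dailyRows) {
    let monthTotal = 0;
    let monthCount = 0;
    const days = dailyRows.map(function (row) {
        monthTotal += row.total;
        monthCount += row.count;
        return { day: row._id, price: row.total / row.count };
    });
    return {
        days: days,
        // null when there were no records at all, to avoid dividing by zero
        monthAvg: monthCount > 0 ? monthTotal / monthCount : null
    };
}

// Hypothetical rows, shaped like the output of the extended daily $group.
const summary = summarizeMonth([
    { _id: "1", total: 210, count: 2 },
    { _id: "2", total: 120, count: 1 }
]);

console.log(summary);
```

Carrying the sum and count forward, rather than only the daily average, lets the month average weight each record equally.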

4 Comments

Neil, thanks for including all of this information, it's very helpful. However, I am unable to get either method to work for returning both a daily average and a monthly average. The first approach only returns the monthly average; the second $group appears to overwrite the daily averages. If I remove the second $group, the daily averages are returned as expected.
The second approach results in an error stating: { [MongoError: exception: invalid operator '$avg'] name: 'MongoError', errmsg: 'exception: invalid operator \'$avg\'', code: 15999, ok: 0 }
@ac360 Sorry, what is it you want to achieve? Do you want both the day and month averages in the same result set? I am not sure what that should look like; it certainly is not clear from your question. I am also not sure which listing produces the error you mention. There may be typing mistakes (I spotted one), as these were simply typed in as examples.
@ac360 Really not sure what you want to achieve here. I added a sample with a running average. Is there still something wrong with the result?
