I have data stored in a MongoDB collection as documents like the ones below.

{"_id":1536921044022.3953,
"flow":[
    {"_id":1536921044279.358,"y":0.1,"i":375,"t":33.1},
    {"_id":1536921044914.2346,"y":0.2,"i":310,"t":40.9},
    {"_id":1536921045548.5076,"y":0.3,"i":408,"t":32.9}],
"__v":0}
{"_id":1536921044053.3254,
"flow":[
    {"_id":1536921044229.358,"y":0.4,"i":375,"t":33.1},
    {"_id":1536921044954.2346,"y":0.5,"i":310,"t":40.9},
    {"_id":1536921045514.506,"y":0.6,"i":408,"t":32.9},
    {"_id":1536921045245.5056,"y":0.7,"i":408,"t":32.9},
    {"_id":1536921045549.3076,"y":0.8,"i":408,"t":32.9}],
"__v":0}

I want to aggregate the flow field so that I get an array holding the average of $flow.y at each array position, taken across all documents. Given the data above, the result should be [0.25, 0.35, 0.45, 0.7, 0.8]. The last two elements of the second document come back as 0.7 and 0.8 because the first document has no elements at those positions, so the average covers only the entries that exist. They should not be 0.35 and 0.4, which is what you would get by dividing by the total number of documents. If a third document had y values of 0.12 and 0.13 at those two positions, the last two results would be 0.41 and 0.465.

I have been trying combinations of $arrayElemAt, $elemMatch, $avg as part of an aggregate pipeline but I can't seem to hit the correct syntax.

Here's my progress so far (Node.js):

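const running = []; // collects the average for each array position
for (let i = 0; i < 10; i++) {
  ModelName.aggregate([
    // Turn the i-th flow element into an array of { k, v } pairs
    { $project: { pulse: { $objectToArray: { $arrayElemAt: ["$flow", i] } } } },
    { $unwind: "$pulse" },
    // Keep only the "y" key
    { $match: { "pulse.k": "y" } },
    { $group: { _id: "$pulse.k",
                count: { $sum: 1 },
                average: { $avg: "$pulse.v" },
                total: { $sum: "$pulse.v" } } }
  ], function (err, result) {
    if (err) return console.error(err);
    console.log(result);
    // Callbacks may finish out of order, and no document may have an element at i
    if (result.length > 0) running[i] = result[0].average;
  });
}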

This returns the average of the y field at position i across all documents, so it's getting there. The remaining hurdles are removing the loop and handling documents whose arrays have no element at a given position. For the latter, I imagine I'll have to keep a running count of the elements that exist at each position and divide by that count.
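For reference, the running-count idea can be sketched in plain JavaScript, outside MongoDB, to pin down the expected result. This is only an illustration: averageByPosition and sampleDocs are hypothetical names, and the sample keeps just the y values from the documents above.

```javascript
// Hypothetical helper: average flow[i].y across documents, dividing each
// position by the number of documents that actually have an element there.
function averageByPosition(docs) {
  const sums = [];
  const counts = [];
  for (const doc of docs) {
    (doc.flow || []).forEach((point, index) => {
      if (typeof point.y !== "number") return; // skip entries without a numeric y
      sums[index] = (sums[index] || 0) + point.y;
      counts[index] = (counts[index] || 0) + 1;
    });
  }
  // Positions that never received a value become null rather than NaN
  return Array.from(sums, (sum, index) =>
    counts[index] ? sum / counts[index] : null
  );
}

// Reduced copy of the two sample documents (y values only)
const sampleDocs = [
  { flow: [{ y: 0.1 }, { y: 0.2 }, { y: 0.3 }] },
  { flow: [{ y: 0.4 }, { y: 0.5 }, { y: 0.6 }, { y: 0.7 }, { y: 0.8 }] },
];

console.log(averageByPosition(sampleDocs));
```

The result matches [0.25, 0.35, 0.45, 0.7, 0.8] up to floating-point error. It loads every document into memory, so it is a way to check the semantics rather than a replacement for doing the work in the aggregation pipeline.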

1 Answer

You can use $unwind with the includeArrayIndex option, which records each element's position in the original array, and then $group by that value. Try:

db.model.aggregate([
    {
        $unwind: {
            path: "$flow",
            includeArrayIndex: "index"
        }
    },
    {
        $group: {
            _id: "$index",
            value: { $avg: "$flow.y" }
        }
    },
    {
        $sort: { _id: 1 }
    },
    {
        $group: {
            _id: null,
            values: { $push: "$value" }
        }
    }
])

Outputs: { "_id" : null, "values" : [ 0.25, 0.35, 0.44999999999999996, 0.7, 0.8 ] }
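The 0.44999999999999996 in that output is ordinary floating-point error from averaging 0.3 and 0.6. If tidy values are needed, one option is to round the array in application code after the query returns. A minimal sketch, assuming two decimal places are enough; roundValues is a hypothetical helper, not part of MongoDB or Mongoose:

```javascript
// Hypothetical helper: round each average to a fixed number of decimal places.
function roundValues(values, places = 2) {
  const factor = 10 ** places;
  return values.map((v) => Math.round(v * factor) / factor);
}

// The "values" array copied from the pipeline output above
const values = [0.25, 0.35, 0.44999999999999996, 0.7, 0.8];

console.log(roundValues(values)); // [ 0.25, 0.35, 0.45, 0.7, 0.8 ]
```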

1 Comment

That's incredibly useful stuff mickl. Thanks.
