
Say I have many documents like the following:

{
  _id: "abc",
  values: {
    0: { 0: 999999, 1: 999999, …, 59: 1000000 },
    1: { 0: 2000000, 1: 2000000, …, 59: 1000000 },
    …,
    58: { 0: 1600000, 1: 1200000, …, 59: 1100000 },
    59: { 0: 1300000, 1: 1400000, …, 59: 1500000 }
  }
}
{
  _id: "def",
  values: {
    0: { 0: 999999, 1: 999999, …, 59: 1000000 },
    1: { 0: 2000000, 1: 2000000, …, 59: 1000000 },
    …,
    58: { 0: 1600000, 1: 1200000, …, 59: 1100000 },
    59: { 0: 1300000, 1: 1400000, …, 59: 1500000 }
  }
}

Each one is basically a 60x60 multidimensional array. Can aggregation (or any other MongoDB construct) be used to easily sum two (or more) such matrices, i.e. values[x][y] of abc and def are added together, and likewise for every other element? Ideally the output would be a similar multidimensional array.
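
To make the desired result concrete, here is a minimal client-side sketch in plain JavaScript. It is not a MongoDB solution: sumNestedValues is a hypothetical helper written for this illustration, and the two documents are a two-by-two excerpt of the values above rather than the full 60x60 data.

```javascript
// Hypothetical helper (not a MongoDB API): adds values[x][y] across all documents.
function sumNestedValues(docs) {
  const total = {};
  for (const doc of docs) {
    for (const [x, row] of Object.entries(doc.values)) {
      total[x] = total[x] || {};
      for (const [y, value] of Object.entries(row)) {
        // Missing cells are treated as 0 (an assumption of this sketch).
        total[x][y] = (total[x][y] || 0) + value;
      }
    }
  }
  return total;
}

// Two-by-two excerpt of the documents shown above.
const docs = [
  { _id: "abc", values: { 0: { 0: 999999, 1: 999999 }, 1: { 0: 2000000, 1: 2000000 } } },
  { _id: "def", values: { 0: { 0: 999999, 1: 999999 }, 1: { 0: 2000000, 1: 2000000 } } },
];

const summed = sumNestedValues(docs);
console.log(JSON.stringify(summed));
```

The question is whether the same element-wise sum can be computed on the server instead of in application code.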

This answer seems to suggest this is possible with a one-dimensional array, but I am not sure whether it extends to multidimensional ones.

EDIT:

Here is an example with real data in a slightly different format:

db.col.find({}, { _id: 0, hit: 1 })
{ "hit" : [ [ 570, 0, 630, 630, 636, 735, 672, 615, 648, 648, 618, 0 ], 
[ 492, 0, 471, 471, 570, 564, 0, 590, 513, 432, 471, 477 ], 
[ 387, 0, 0, 0, 0, 0, 0, 456, 0, 480, 351, 415 ], 
[ 432, 528, 0, 0, 495, 509, 0, 579, 0, 552, 0, 594 ], 
[ 558, 603, 594, 624, 672, 0, 0, 705, 783, 0, 756, 816 ], 
[ 0, 858, 951, 1027, 0, 0, 1058, 1131, 0, 0, 1260, 1260 ], 
[ 1269, 0, 1287, 0, 1326, 0, 1386, 1386, 1470, 0, 0, 0 ], 
[ 1623, 0, 1695, 1764, 1671, 1671, 0, 1824, 1872, 0, 0, 0 ], 
[ 1950, 1894, 2034, 2034, 0, 0, 1941, 0, 2070, 1911, 2049, 2055 ], 
[ 2052, 2052, 0, 0, 0, 2085, 2007, 2073, 0, 0, 0, 1941 ], 
[ 1878, 1896, 0, 1875, 0, 0, 1677, 0, 1722, 0, 1545, 0 ], 
[ 0, 0, 1317, 1469, 1501, 1634, 1494, 0, 0, 1290, 0, 0 ], 
[ 0, 1485, 1375, 1491, 1530, 1407, 0, 0, 0, 1611, 0, 0 ], 
[ 1652, 1800, 1686, 1643, 1923, 0, 0, 0, 1737, 1604, 1797, 0 ], 
[ 1842, 1806, 0, 1830, 1896, 1947, 0, 1710, 1734, 1725, 0, 0 ], 
[ 0, 0, 1932, 0, 1908, 1878, 1941, 1931, 2007, 2013, 1995, 1995 ], 
[ 0, 2025, 2004, 1927, 0, 0, 1939, 1835, 1962, 1863, 0, 1815 ], 
[ 0, 0, 1839, 1755, 1821, 1821, 1751, 1656, 0, 0, 1467, 0 ], 
[ 0, 1632, 1546, 1449, 0, 1551, 1449, 0, 0, 1554, 0, 1491 ], 
[ 1463, 1411, 0, 1491, 0, 0, 1551, 1467, 0, 0, 0, 1464 ], 
[ 0, 0, 1311, 0, 0, 1471, 0, 0, 1581, 0, 1368, 1368 ], 
[ 1296, 0, 0, 0, 1176, 1381, 0, 1170, 1194, 1194, 1193, 1137 ], 
[ 0, 1244, 1221, 1039, 0, 1041, 930, 921, 1033, 813, 0, 0 ], 
[ 0, 0, 0, 1010, 0, 0, 918, 783, 0, 609, 693, 645 ] ] }

And this is the appropriate query (thanks to Veeram in the comments for fixing my code):

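db.col.aggregate([
  // Keep only the matrix.
  { $project: { _id: 0, hit: 1 } },
  // Flatten rows, then cells, recording the row index as x and the column index as y.
  { $unwind: { path: "$hit", includeArrayIndex: "x" } },
  { $unwind: { path: "$hit", includeArrayIndex: "y" } },
  // Sum each (x, y) cell across all documents.
  { $group: { _id: { x: "$x", y: "$y" }, hit: { $sum: "$hit" } } },
  // Rebuild each row in column order, then the matrix in row order.
  { $sort: { "_id.x": 1, "_id.y": 1 } },
  { $group: { _id: "$_id.x", hit: { $push: "$hit" } } },
  { $sort: { "_id": 1 } },
  { $group: { _id: null, hit: { $push: "$hit" } } }
])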
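
One way to sanity-check the pipeline's output is to compute the same sum client-side on a small sample. The sketch below is plain JavaScript, not MongoDB: sumHitMatrices is a hypothetical helper, the sample matrices are made up for the illustration, and it assumes every document's hit matrix has the same dimensions.

```javascript
// Hypothetical client-side check (not a MongoDB API).
// Assumes every doc.hit is a 2D array with the same dimensions.
function sumHitMatrices(docs) {
  const result = [];
  for (const doc of docs) {
    // Outer loop plays the role of the first $unwind (row index x).
    doc.hit.forEach((row, x) => {
      if (!result[x]) result[x] = [];
      // Inner loop plays the role of the second $unwind (column index y).
      row.forEach((value, y) => {
        // Accumulating per (x, y) corresponds to the first $group with $sum.
        result[x][y] = (result[x][y] || 0) + value;
      });
    });
  }
  return result;
}

// Made-up 2x3 sample data, not taken from the collection above.
const sample = [
  { hit: [[1, 2, 3], [4, 5, 6]] },
  { hit: [[10, 20, 30], [40, 50, 60]] },
];

const summedHits = sumHitMatrices(sample);
console.log(JSON.stringify(summedHits));
```

Writing into result[x][y] directly keeps rows and columns in index order, which is what the two $sort stages restore in the pipeline.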
  • You are almost there. Try db.col.aggregate([ { $project: { _id: 0, hit: 1 } }, { $unwind: { path: "$hit", includeArrayIndex: "x" } }, { $unwind: { path: "$hit", includeArrayIndex: "y" } }, { $group: { _id: { x: "$x", y: "$y" }, hit: { $sum: "$hit" } } }, { $sort: { "_id.x": 1, "_id.y": 1 } }, { $group: { _id: "$_id.x", hit: { $push: "$hit" } } }, { $sort: { "_id": 1 } }, { $group: { _id: null, hit: { $push: "$hit" } } } ]) Commented Aug 10, 2018 at 12:47
  • Perfect, thanks!!! Commented Aug 10, 2018 at 13:28

1 Answer


You need two operators to deal with dynamic property names: $objectToArray and $arrayToObject. To sum the values across all documents, you can represent each (x, y) pair as a single document (using $unwind) and then use several $group stages to collapse everything back into a single document. To restore the original order of your rows and columns, apply $sort twice:

db.col.aggregate([
    {
        $project: {
            values: {
                $map: {
                    input: { $objectToArray: "$values" },
                    as: "obj",
                    in: { k: "$$obj.k", v: { $objectToArray: "$$obj.v" } }
                }
            }
        }
    },
    {
        $unwind: "$values"
    },
    {
        $unwind: "$values.v"
    },
    {
        $project: {
            x: "$values.k",
            y: "$values.v.k",
            value: "$values.v.v"
        }
    },
    {
        $group: {
            _id: { x: "$x", y: "$y" },
            value: { $sum: "$value" }
        }
    },
    {
        $sort: {
            "_id.y": 1
        }
    },
    {
        $group: {
            _id: "$_id.x",
            v: { $push: { k: "$_id.y", v: "$value" } }
        }
    },
    {
        $sort: {
            "_id": 1
        }
    },
    {
        $group: {
            _id: null,
            values: { $push: { k: "$_id", v: "$v" } }
        }
    },
    {
        $project: {
            values: {
                $arrayToObject: {
                    $map: {
                        input: "$values",
                        as: "obj",
                        in: {
                            k: "$$obj.k",
                            v: { $arrayToObject: "$$obj.v" }
                        }
                    }
                }
            }
        }
    }
])

For your sample data it outputs:

{
    "_id" : null,
    "values" : {
            "0" : {
                    "0" : 1999998,
                    "1" : 1999998,
                    "59" : 2000000
            },
            "1" : {
                    "0" : 4000000,
                    "1" : 4000000,
                    "59" : 2000000
            },
            "58" : {
                    "0" : 3200000,
                    "1" : 2400000,
                    "59" : 2200000
            },
            "59" : {
                    "0" : 2600000,
                    "1" : 2800000,
                    "59" : 3000000
            }
    }

}
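
As a rough illustration of what the two operators do in the pipeline above, here is a plain-JavaScript sketch. objectToArray and arrayToObject below are hypothetical stand-ins written for this example, not MongoDB functions. It also shows a caveat worth checking: $objectToArray yields the keys as strings, so sorting them compares "10" before "2". With the full 60x60 data, the rows and columns would presumably come out in string order unless the keys are converted to numbers before the $sort stages (for example with $toInt, which is untested against this pipeline).

```javascript
// Hypothetical stand-ins for $objectToArray and $arrayToObject (plain JavaScript).
const objectToArray = (obj) => Object.entries(obj).map(([k, v]) => ({ k, v }));
const arrayToObject = (pairs) => Object.fromEntries(pairs.map(({ k, v }) => [k, v]));

// One row from the sample data, with the elided keys left out.
const row = { 0: 999999, 1: 999999, 59: 1000000 };

// Each property becomes a { k, v } pair; note that k is a string.
const pairs = objectToArray(row);
console.log(pairs);

// Converting back gives an object equivalent to the original row.
const roundTrip = arrayToObject(pairs);

// Sorting keys as strings versus as numbers gives different orders.
const keys = ["0", "1", "2", "10", "59"];
const asStrings = [...keys].sort();
const asNumbers = [...keys].sort((a, b) => Number(a) - Number(b));
console.log(asStrings.join(","), "|", asNumbers.join(","));
```

The summed values are the same either way; only the order of the keys in the output differs.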


1 Comment

Thanks, mickl. I've added another solution for a different data set to the original question. I'm sure your solution holds, though, and it was instrumental in finding my own answer. Thanks.
