1

I am trying to finish up a data aggregation pipeline and having issues getting the data into the correct format. I'm not even sure if this is possible to do in one pipeline.

The original data looks like this:

[
  {
    answers: {
      'question1': 'a',
      'question2': 'c',
      'question3': ['a','b'],
      'question4': 1
    },
    createdAt: 2022-03-04T07:30:40.517Z,
  },
  {
    answers: {
      'question1': 'b',
      'question2': 'c',
      'question3': ['a','c']
      'question4': 2
    },
    createdAt: 2022-03-04T07:30:40.518Z,
  }
]

I've got my pipeline so far with this:

{ $project: { 
    "answers": { $objectToArray: "$answers" },
    "date": { $dateToString: { format: "%Y-%m-%d", date: "$createdAt" }}
}},
{ $unwind: "$answers" },
{ $unwind: "$answers.v" },
{
  $group: {
     _id: { answers : "$answers", date: "$date"},
     c: { $sum: 1 }}
 },

and the data now looks like this:

{
    _id: {
      answers: { k: 'q3', v: 'b' },
      date: '2022-03-04'
    },
    count: 1
  },
  {
    _id: {
      answers: { k: 'q3', v: 'a' },
      date: '2022-03-04'
    },
    count: 2
  },
  {
    _id: {
      answers: { k: 'q4', v: 1 },
      date: '2022-03-04'
    },
    count: 1
  },
  {
    _id: {
      answers: { k: 'q1', v: 'b' },
      date: '2022-03-04'
    },
    count: 1
  },
  {
    _id: {
      answers: { k: 'q4', v: 2 },
      date: '2022-03-04'
    },
    count: 1
  },
  {
    _id: {
      answers: { k: 'q2', v: 'c' },
      date: '2022-03-04'
    },
    count: 2
  },
  {
    _id: {
      answers: { k: 'q3', v: 'c' },
      date: '2022-03-04'
    },
    count: 1
  },
  {
    _id: {
      answers: { k: 'q1', v: 'a' },
      date: '2022-03-04'
    },
    count: 1
  }

I would like to get a result that looks something like this:

{
    'dates': [
        {
            'date': '2022-03-04',
            'q1': { 'a': 1, 'b': 1 }
            'q2': { 'c': 2 },
            'q3': { 'a': 2, 'b': 1, 'c': 1 },
            'q4': { '1': 1, '2': 1 }
        }
    ]
    'totals': { // this would be the totals across all the dates
        'q1': { 'a': 1, 'b': 1 }
        'q2': { 'c': 2 },
        'q3': { 'a': 2, 'b': 1, 'c': 1 },
        'q4': { '1': 1, '2': 1 }
    }
}

any help would be greatly appreciated, even if I can't get both the totals and breakdown in 1 query.

here is the mongoplaygroud I've been working on

2
  • Do you run already MongoDB version 5.0? Commented Mar 4, 2022 at 19:26
  • Yes. Using MongoDB 5 Commented Mar 5, 2022 at 23:04

1 Answer 1

3

Not that simple. An important stage you have to use is $facet in order to get totals and dates

Maybe with $setWindowFields the aggregation pipeline could be a little simpler, but that a quick guess.

db.collection.aggregate([
  {
    $project: {
      _id: 0,
      answers: { $objectToArray: "$answers" },
      date: { $dateToString: { format: "%Y-%m-%d", date: "$createdAt" } }
    }
  },
  { $unwind: "$answers" },
  { $unwind: "$answers.v" },
  {
    $group: {
      _id: {
        answer: "$answers.v",
        question: "$answers.k",
        date: "$date"
      },
      count: { $sum: 1 }
    }
  },
  {
    $facet: {
      dates: [
        {
          $group: {
            _id: { question: "$_id.question", date: "$_id.date" },
            count: {
              $push: {
                k: { $toString: "$_id.answer" },
                v: "$count"
              }
            }
          }
        },
        {
          $group: {
            _id: "$_id.date",
            count: {
              $push: {
                k: "$_id.question",
                v: { $arrayToObject: "$count" }
              }
            }
          }
        },
        {
          $replaceWith: {
            $mergeObjects: [
              { date: "$_id" },
              { $arrayToObject: "$count" }
            ]
          }
        }
      ],
      totals: [
        {
          $group: {
            _id: { answer: "$_id.answer", question: "$_id.question" },
            v: { $push: "$count" }
          }
        },
        {
          $group: {
            _id: "$_id.question",
            count: {
              $push: {
                k: { $toString: "$_id.answer" },
                v: { $sum: "$v" }
              }
            }
          }
        },
        {
          $project: {
            _id: 0,
            k: "$_id",
            v: { $arrayToObject: "$count" }
          }
        }
      ]
    }
  },
  { $set: { totals: { $arrayToObject: "$totals" } } }
])

Mongo Playground

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.