2

I have a product collection like this, simplified:

[
  {
    "_id": 1,
    "ref": "product 1",
    "variants": [
      {
        "ref": "variant 1.1",
        "categories": ["category a"]
      },
      {
        "ref": "variant 1.1",
        "categories": ["category a","category b"]
      }
    ]
  },
  {
    "_id": 2,
    "ref": "product 2",
    "variants": [
      {
        "ref": "variant 2.1",
        "categories": ["category c"]
      },
      {
        "ref": "variant 2.1",
        "categories": ["category a","category c"]
      }
    ]
  }
]

I want to query for categories (distinct), with their number of containing products (not variants).

For example some result like this:

[
  "category a": 2,
  "category b": 1,
  "category c": 1
]

I tried some queries with aggregate and unwind, but I cannot figure it out. All help appreciated!

This is what I have so far:

[
  {$match: ... }, // optional filtering
  {$unwind: '$variants'},
  {$unwind: '$variants.categories'},
]

But cannot figure out now, howto group by category, with a aggregate count of all products (not variants) inside that category.

0

1 Answer 1

2
db.products.aggregate([
    {$unwind: "$variants"},
    {$unwind: "$variants.categories"},
    {$group: {_id:"$_id", categories: {$addToSet:"$variants.categories"}}},
    {$unwind: "$categories"},
    {$group: {_id: "$categories", count: {$sum:1}}}
])

Output:

{ "_id" : "category b",  "count" : 1 }
{ "_id" : "category c",  "count" : 1 }
{ "_id" : "category a",  "count" : 2 }

Explanation. First two unwind operators will bring categories out of nested arrays and you will have documents like this

{
    "_id" : 1,
    "ref" : "product 1",
    "variants" : {
        "ref" : "variant 1.1",
        "categories" : "category a"
    }
},
{
    "_id" : 1,
    "ref" : "product 1",
    "variants" : {
        "ref" : "variant 1.1",
        "categories" : "category a"
    }
},
{
    "_id" : 1,
    "ref" : "product 1",
    "variants" : {
        "ref" : "variant 1.1",
        "categories" : "category b"
    }
},
...

Next I do grouping to eliminate duplicated categories from each product variants. Result:

{
    "_id" : 1,
    "categories" : [ 
        "category b", 
        "category a"
    ]
},
...

One more unwind to get rid of categories array.

{
    "_id" : 1,
    "categories" : "category b"
},
{
    "_id" : 1,
    "categories" : "category a"
},
{
    "_id" : 2,
    "categories" : "category a"
},
{
    "_id" : 2,
    "categories" : "category c"
}

And then grouping to calculate count of distinct categories in each product. You will have output as specified above.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.