I am trying to run an aggregation over documents of the following form:

{
  "pid": 900000,
  "mid": 9000,
  "cid": 90,
  "bid": 1000,
  "gmv": 1000000,
  "vol": 200,
  "data": [
    {
      "date": "25-11-2018",
      "gmv": 100000,
      "vol": 20
    },
    {
      "date": "24-11-2018",
      "gmv": 100000,
      "vol": 20
    },
    {
      "date": "23-11-2018",
      "gmv": 100000,
      "vol": 20
    }
  ]
}

The analysis that needs to be done is:

  1. Filter all documents on mid and/or cid.
  2. Restrict data.date to the last 7 days and sum data.vol over that range for each pid.
  3. Sort the documents by the sum obtained in the previous step, in descending order.
  4. Group these results by pid.

In other words, we want the top products by total volume (quantity sold) within a date range for a specific cid/mid.

Here pid is the product ID, mid is the merchant ID, and cid is the category ID.

  • Can you also show your mapping (i.e. is data of nested type)? Commented Nov 28, 2018 at 5:51
  • The data field should be mapped as type nested. Then you can create a bool query to filter on mid and cid, with a nested query to filter on the data.date field. Finally, you need a terms aggregation on pid. Commented Nov 28, 2018 at 5:51
  • { "mappings": {"_doc": {"properties": {"pid": {"type": "integer"}, "mid": {"type": "integer"}, "cid": {"type": "integer"}, "bid": {"type": "integer"}, "gmv": {"type": "integer"}, "vol": {"type": "integer"}, "data": {"properties": {"date": {"type": "date", "format": "yyyy-MM-dd"}, "gmv": {"type": "integer"}, "vol": {"type": "integer"} } } } } } } Commented Nov 28, 2018 at 7:24

2 Answers

2

First, you need to change your mapping so that queries can run on the nested fields: change the type of the 'data' field to 'nested'.
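As a rough sketch of that change, the mapping posted in the comments under the question could be adjusted as follows. The only difference is the added "type": "nested" on data. The `_doc` type level and the "yyyy-MM-dd" date format are taken as-is from that comment (the sample document shows dates as dd-MM-yyyy, so one of the two would need to be aligned). Since the type of an existing field cannot be changed in place, this would normally go into a new index that the data is then reindexed into.

```json
{
  "mappings": {
    "_doc": {
      "properties": {
        "pid": { "type": "integer" },
        "mid": { "type": "integer" },
        "cid": { "type": "integer" },
        "bid": { "type": "integer" },
        "gmv": { "type": "integer" },
        "vol": { "type": "integer" },
        "data": {
          "type": "nested",
          "properties": {
            "date": { "type": "date", "format": "yyyy-MM-dd" },
            "gmv": { "type": "integer" },
            "vol": { "type": "integer" }
          }
        }
      }
    }
  }
}
```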

Then you can use a range query in a filter, together with a terms filter on mid/cid, to select the data. Once you have the correct data set, you can aggregate on pid with a sub-aggregation that sums vol.

Here is the query:

{
    "query": {
        "bool": {
            "filter": [
                {
                    "nested": {
                        "path": "data",
                        "query": {
                            "range": {
                                "data.date": {
                                    "gte": "2018-11-25",
                                    "lte": "2018-11-28"
                                }
                            }
                        }
                    }
                },
                {
                    "terms": {
                        "mid": [
                            "9000"
                        ]
                    }
                }
            ]
        }
    },
    "aggs": {
        "AGG_PID": {
            "terms": {
                "field": "pid",
                "size": 10,
                "order": {
                    "DATA>TOTAL_SUM": "desc"
                },
                "min_doc_count": 1
            },
            "aggs": {
                "DATA": {
                    "nested": {
                        "path": "data"
                    },
                    "aggs": {
                        "TOTAL_SUM": {
                            "sum": {
                                "field": "data.vol"
                            }
                        }
                    }
                }
            }
        }
    }
}

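You can modify the query as needed. Note that the query part only selects which documents match, so TOTAL_SUM adds up every entry in data for each matching pid; to sum only the entries inside the date range, the range also has to be applied inside the aggregation (a filter aggregation under a nested aggregation). Hope this helps.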

0

Below is a nested aggregation query that sorts the "pid" buckets by the summed "vol" of each bucket. You can add any number of filters in the query part.

{
    "size": 0,
    "query": {
        "bool": {
            "must": [
                {
                    "term": {
                        "mid": "2"
                    }
                }
            ]
        }
    },
    "aggs": {
        "top_products_sorted_by_order_volume": {
            "terms": {
                "field": "pid",
                "order": {
                    "nested_data_object>order_volume_by_range>order_volume_sum": "desc"
                }
            },
            "aggs": {
                "nested_data_object": {
                    "nested": {
                        "path": "data"
                    },
                    "aggs": {
                        "order_volume_by_range": {
                            "filter": {
                                "range": {
                                    "data.date": {
                                        "gte": "2018-11-26",
                                        "lte": "2018-11-27"
                                    }
                                }
                            },
                            "aggs": {
                                "order_volume_sum": {
                                    "sum": {
                                        "field": "data.vol"
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
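The range above uses fixed dates, while the question asks for the last 7 days. One possible variant, sketched here as a drop-in replacement for the "range" clause only, is to use Elasticsearch date math relative to the current time. Whether "last 7 days" should start at now-7d or now-6d, and whether rounding to whole days (the /d suffix) is wanted, are assumptions to adjust.

```json
{
  "range": {
    "data.date": {
      "gte": "now-7d/d",
      "lte": "now/d"
    }
  }
}
```

With this form the query does not need to be regenerated each day, at the cost of results that depend on when the request is sent.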
