How do I apply a computation to bucket fields via bucket_script? More specifically, I would like to understand how to aggregate over the distinct-count results of each bucket.

For example, below are a sample query and its response.

What I am looking for is to aggregate the following into two fields:

  1. the sum of dist.value across all buckets; for the example response, 1+2=3
  2. the sum of (dist.value x key) across all buckets; for the example response, (1x10)+(2x20)=50

Query

{
    "size": 0,
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "field": "value"
                    }
                }
            ]
        }
    },
    "aggs":{
        "sales_summary":{
            "terms":{
                "field":"qty",
                "size":"100"
            },
            "aggs":{
                "dist":{
                    "cardinality":{
                        "field":"somekey.keyword"
                    }
                }
            }
        }
    }
}

Query Result:

{
    "aggregations": {
        "sales_summary": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": 10,
                    "doc_count": 100,
                    "dist": {
                        "value": 1
                    }
                },
                {
                    "key": 20,
                    "doc_count": 200,
                    "dist": {
                        "value": 2
                    }
                }
            ]
        }
    }
}
1 Answer

You need a sum bucket aggregation, which is a pipeline aggregation, to sum the results of the cardinality aggregation across all the buckets.

Search query for the sum of dist.value across all buckets (the 1+2=3 case in the question):

POST idxtest1/_search
{
  "size": 0,
  "aggs": {
    "sales_summary": {
      "terms": {
        "field": "qty",
        "size": "100"
      },
      "aggs": {
        "dist": {
          "cardinality": {
            "field": "pageview"
          }
        }
      }
    },
    "sum_buckets": {
      "sum_bucket": {
        "buckets_path": "sales_summary>dist"
      }
    }
  }
}

Search response:

"aggregations" : {
    "sales_summary" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 10,
          "doc_count" : 3,
          "dist" : {
            "value" : 2
          }
        },
        {
          "key" : 20,
          "doc_count" : 3,
          "dist" : {
            "value" : 3
          }
        }
      ]
    },
    "sum_buckets" : {
      "value" : 5.0
    }
  }

For the second requirement, first compute a new per-bucket value with a bucket script aggregation, then run the sum bucket aggregation over that computed value.

Search query for the second requirement. Note that the script in this example multiplies dist.value by the constant 10, not by each bucket's key:

POST idxtest1/_search
{
  "size": 0,
  "aggs": {
    "sales_summary": {
      "terms": {
        "field": "qty",
        "size": "100"
      },
      "aggs": {
        "dist": {
          "cardinality": {
            "field": "pageview"
          }
        },
        "format-value-agg": {
          "bucket_script": {
            "buckets_path": {
              "newValue": "dist"
            },
            "script": "params.newValue * 10"
          }
        }
      }
    },
    "sum_buckets": {
      "sum_bucket": {
        "buckets_path": "sales_summary>format-value-agg"
      }
    }
  }
}

Search response:

"aggregations" : {
    "sales_summary" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 10,
          "doc_count" : 3,
          "dist" : {
            "value" : 2
          },
          "format-value-agg" : {
            "value" : 20.0
          }
        },
        {
          "key" : 20,
          "doc_count" : 3,
          "dist" : {
            "value" : 3
          },
          "format-value-agg" : {
            "value" : 30.0
          }
        }
      ]
    },
    "sum_buckets" : {
      "value" : 50.0
    }
  }

2 Comments

Based on your response for option 2, I realized that I need to clarify that item. I was expecting the sum_buckets value to be 80, i.e. (10x2)+(20x3)=80, because the computation I was looking for is (dist.value x key).
I passed down a field holding the value from _key and used it.
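
One way to realize the workaround described in the comment above is to expose each bucket's key as a metric, so that bucket_script can read it. The sketch below is a request body for POST idxtest1/_search. It assumes qty is a single-valued numeric field, in which case the max of qty inside a terms bucket equals that bucket's key. The names key_value, dist_times_key and sum_dist_times_key are illustrative and do not come from the original answer; the meta entry is only there to record the assumption.

```json
{
  "size": 0,
  "aggs": {
    "sales_summary": {
      "terms": {
        "field": "qty",
        "size": 100
      },
      "aggs": {
        "dist": {
          "cardinality": {
            "field": "pageview"
          }
        },
        "key_value": {
          "max": {
            "field": "qty"
          },
          "meta": {
            "note": "assumes qty is single-valued, so max(qty) equals the bucket key"
          }
        },
        "dist_times_key": {
          "bucket_script": {
            "buckets_path": {
              "dist": "dist",
              "key": "key_value"
            },
            "script": "params.dist * params.key"
          }
        }
      }
    },
    "sum_dist_times_key": {
      "sum_bucket": {
        "buckets_path": "sales_summary>dist_times_key"
      }
    }
  }
}
```

With the sample buckets from the answer (keys 10 and 20, dist values 2 and 3), dist_times_key would be 20 and 60 per bucket, so sum_dist_times_key should come out as the 80 expected in the comment.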
