The cardinality aggregation calculates an approximate count of distinct values. How can we calculate the cardinality distribution of documents, i.e. how many distinct values occur a given number of times?

For example, suppose we have:

a,a,a,b,b,b,c,c,d,d,e

and the distinct count distribution is:

3: 2 # number of distinct elements that have 3 occurrences (a, b)
2: 2 # c, d
1: 1 # e

1 Answer

As far as I know, you cannot do this with a single aggregation query.

However, using the transform API (https://www.elastic.co/guide/en/elasticsearch/reference/current/transform-examples.html), you can create a new index that stores the number of occurrences of each value, and then run a simple terms aggregation on it:

PUT _transform/so
{
  "dest" : {
   "index" : "my-so"
  },
  "source": {
    "index": "my-index"
  },
  "pivot": {
    "group_by": { 
      "country": {
        "terms": {
          "field": "letter"
        }
      }
    },
    "aggregations": {
      "cardinality": {
        "value_count": { 
          "field" : "letter"
        }
      }
    }
  }
}

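Once the transform has been started (POST _transform/so/_start) and has processed the source index, the destination index will contain documents like: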

[
    {
      "country" : "a",
      "cardinality" : 22
    },
    {
      "country" : "b",
      "cardinality" : 4
    },
    {
      "country" : "c",
      "cardinality" : 5049
    }...

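Then you can run a simple terms or histogram aggregation on the new index. Each bucket key is an occurrence count, and its doc_count is the number of distinct values with that count: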

GET /my-so/_search
{
  "size" : 0,
  "aggs": {
    "cc": {
      "terms": {
        "field": "cardinality"
      }
    }
  }
}
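
To sanity-check the two-step idea on the sample data from the question, the same logic can be sketched in plain Python. This is only an illustration of the "count of counts" computation, not Elasticsearch code: the function name `count_distribution` is made up for this example, step 1 plays the role of the transform (occurrences per value), and step 2 plays the role of the terms aggregation on the new index.

```python
from collections import Counter


def count_distribution(values):
    """Return {occurrence_count: number_of_distinct_values_with_that_count}."""
    # Step 1: occurrences per distinct value
    # (what the transform stores per letter in the destination index).
    per_value = Counter(values)
    # Step 2: how many distinct values share each occurrence count
    # (what the terms aggregation on the destination index returns).
    return Counter(per_value.values())


letters = "a,a,a,b,b,b,c,c,d,d,e".split(",")
distribution = count_distribution(letters)

# Print the buckets from the highest occurrence count down.
for occurrences, distinct_values in sorted(distribution.items(), reverse=True):
    print(f"{occurrences}: {distinct_values}")
```

For the sample input this prints `3: 2`, `2: 2` and `1: 1`, which matches the distribution asked for in the question.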